Efficient Query Processing with Optimistically Compressed Hash Tables & Strings in the USSR
Presented by Huaiyu Xu
PingCAP.com
Motivation
● Hash tables frequently used in analytical queries
● Crucial for overall performance
● But (large) HTs bottlenecked by main memory bandwidth
What can we do about it?
Motivation
Orthogonal approaches:
● Optimize access
○ partitioning
● Increase fill-rate
○ Cuckoo, Robin Hood hashing
● Shrink the table itself
○ reduce the bucket/row size and, consequently, increase cache efficiency (has not received as much attention)
Shrinking Hash Tables
Take a 100 MiB hash table and magically shrink it by 10x:
● Increase query throughput
● Downsize your computer
Bonus:
● A 10 MiB HT fits into the L3/LLC cache
● Improved runtime
Better Latency & Throughput
Shrinking Hash Tables
● Domain-guided prefix suppression
● Optimistic splitting
● Unique Strings self-aligned Region (USSR)
Domain-guided prefix suppression
● Domain-guided:
○ per-column min/max information from metadata
● Prefix Suppression
○ subtract the domain minimum from each value
○ pack multiple columns together
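
The two steps above (subtract the domain minimum, then pack several columns into one word) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the `Domain` class, the `pack`/`unpack` helpers and the example bounds are all hypothetical names and values chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Domain:
    """Per-column [lo, hi] bounds, e.g. taken from table metadata."""
    lo: int
    hi: int

    @property
    def bits(self) -> int:
        # Bits needed for the largest offset (hi - lo); 0 for a constant column.
        return (self.hi - self.lo).bit_length()


def pack(values, domains):
    """Subtract each column's minimum and concatenate the offsets bitwise."""
    word = 0
    for value, dom in zip(values, domains):
        assert dom.lo <= value <= dom.hi, "value outside its declared domain"
        word = (word << dom.bits) | (value - dom.lo)
    return word


def unpack(word, domains):
    """Inverse of pack: peel columns off from the low bits upwards."""
    values = []
    for dom in reversed(domains):
        mask = (1 << dom.bits) - 1
        values.append((word & mask) + dom.lo)
        word >>= dom.bits
    return tuple(reversed(values))


# Hypothetical three-column key: 10 + 4 + 1 = 15 bits instead of three full integers.
domains = [Domain(1000, 2000), Domain(-5, 5), Domain(0, 1)]
row = (1500, -3, 1)
packed = pack(row, domains)
total_bits = sum(d.bits for d in domains)
```

Because packing is injective within the declared domains, two rows are equal exactly when their packed words are equal, so a key comparison becomes a single integer comparison.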
Domain-guided prefix suppression
● Compression and Decompression
○ lightweight: a handful of bitwise operations
○ fast equality comparisons on compressed data
● Generating Pre-Compiled Kernels
○ restrict the number of inputs to 4
○ restrict the types we pack into to 32-, 64- and 128-bit unsigned integers
○ impose an order on the inputs
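
One way to read the three restrictions above: they keep the set of pre-compiled kernels small, because each kernel is identified only by a target word width and an ordered list of input widths. The sketch below is an assumption-laden illustration; `kernel_key` is a hypothetical helper, and sorting inputs by descending bit width is an assumed canonical order (the slide only states that an order is imposed).

```python
ALLOWED_WIDTHS = (32, 64, 128)   # packed word types named on the slide
MAX_INPUTS = 4                   # input limit named on the slide


def kernel_key(input_bits):
    """Return (word_width, ordered_input_bits) identifying a pre-compiled kernel.

    Sorting by descending width is an assumed canonical order; the slide only
    says that some order is imposed on the inputs.
    """
    if not 1 <= len(input_bits) <= MAX_INPUTS:
        raise ValueError("a kernel takes between 1 and 4 inputs")
    total = sum(input_bits)
    for width in ALLOWED_WIDTHS:
        if total <= width:
            # Smallest allowed unsigned type that holds all packed inputs.
            return width, tuple(sorted(input_bits, reverse=True))
    raise ValueError("inputs do not fit into a single 128-bit word")


# Permutations of the same inputs map to the same kernel.
same_kernel = kernel_key([10, 4, 1]) == kernel_key([4, 1, 10])
```

With a canonical order, column permutations share one kernel, which is what makes ahead-of-time generation tractable in this reading.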
Optimistic Splitting
● Decrease the effective memory footprint
● Example: SELECT SUM(a) FROM t; a is int64 (8 bytes) → sum decimal + a + a (40 bytes)
● int64 / decimal
● Decompose HT into:
Hot HT:
● Frequently accessed
● Cache-resident
● Aggregates:
○ SUM: partial sums fit into a smaller data type
Cold HT:
● Rarely accessed
● Main memory
● Aggregates:
○ SUM: store full SUM or overflow counter
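
The hot/cold split for SUM can be sketched as below. This is a simplified variant written for illustration, not the paper's code: the `SplitSum` class is hypothetical, the 32-bit hot width is an assumption, and the cold side here stores a spilled running total (the slide mentions either a full SUM or an overflow counter).

```python
HOT_BITS = 32                       # assumed width of the hot partial sum
HOT_MAX = (1 << (HOT_BITS - 1)) - 1
HOT_MIN = -(1 << (HOT_BITS - 1))


class SplitSum:
    """SUM aggregate split into a small hot part and a rarely touched cold part."""

    def __init__(self, n_groups):
        self.hot = [0] * n_groups   # stands in for a compact, cache-resident array
        self.cold = {}              # group -> spilled total, created only on overflow
        self.cold_accesses = 0      # counts how often the cold side is touched

    def add(self, group, value):
        s = self.hot[group] + value
        if HOT_MIN <= s <= HOT_MAX:
            self.hot[group] = s     # common case: only the hot table is touched
            return
        # Rare case: spill the partial sum to the cold table and restart it.
        self.cold[group] = self.cold.get(group, 0) + s
        self.hot[group] = 0
        self.cold_accesses += 1

    def result(self, group):
        # The final answer combines both parts.
        return self.cold.get(group, 0) + self.hot[group]


agg = SplitSum(n_groups=2)
for v in (1, 2, 3):
    agg.add(0, v)                   # stays entirely in the hot part
agg.add(1, HOT_MAX)
agg.add(1, 5)                       # would overflow 32 bits, so it spills
agg.add(1, 7)
```

The sketch also shows why total memory can go up while the working set goes down: the cold table is extra state, but it is touched only on the rare overflow path.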
Unique Strings self-aligned Region (USSR)
● Assumption: Many strings repeat
● USSR
○ Query-wide dictionary
○ Limited size (cache resident)
○ Built during scan
● 768 KiB in total (hash table 256 KiB, data region 512 KiB)
● data region
○ 2^16 slots ( 8-byte / slot )
○ each string takes at least two slots (one for the hash and one for the string)
○ all pointers inside a data region start with the same 45-bit prefix
● hash table region
○ 2^16 buckets ( 4-byte / bucket )
○ each bucket consists of a 16-bit hash extract and a 16-bit slot number
○ load factor < 50% (2^16 buckets for at most 2^15 strings)
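
The layout numbers above fit together as follows: 2^16 slots of 8 bytes give a 2^19-byte data region, so a region aligned to its own size fixes the upper 64 - 19 = 45 bits of every pointer into it. The sketch below simulates this in Python under several assumptions: the `USSR` class and `in_region` helper are hypothetical, `zlib.crc32` is only a stand-in hash, linear probing is an assumed collision strategy, and returning `None` when the region is full is this sketch's way of modelling the limited size.

```python
import zlib

SLOT_BYTES = 8
N_SLOTS = 1 << 16                    # data region: 2^16 slots of 8 bytes
N_BUCKETS = 1 << 16                  # hash table: 2^16 buckets of 4 bytes
REGION_BYTES = N_SLOTS * SLOT_BYTES  # 512 KiB
OFFSET_BITS = REGION_BYTES.bit_length() - 1   # 19 offset bits, 45 prefix bits


def in_region(ptr, base):
    """A 64-bit pointer lies in the region iff its upper 45 bits match the base's."""
    return (ptr >> OFFSET_BITS) == (base >> OFFSET_BITS)


class USSR:
    def __init__(self):
        self.buckets = [None] * N_BUCKETS   # each: (16-bit hash extract, 16-bit slot)
        self.slots = [None] * N_SLOTS       # slot i: hash; slot i+1 onwards: string
        self.next_slot = 1                  # slot 0 is left unused in this sketch

    def intern(self, s: bytes):
        """Return the slot number of s, inserting it if there is room, else None."""
        h = zlib.crc32(s)                   # stand-in 32-bit hash
        extract = h >> 16                   # upper 16 bits kept in the bucket
        b = h & (N_BUCKETS - 1)             # lower 16 bits pick the bucket
        while self.buckets[b] is not None:
            ext, slot = self.buckets[b]
            if ext == extract and self.slots[slot + 1] == s:
                return slot                 # already present: reuse the entry
            b = (b + 1) & (N_BUCKETS - 1)   # linear probing (assumed)
        need = 1 + max(1, -(-len(s) // SLOT_BYTES))   # hash slot + string slots
        if self.next_slot + need > N_SLOTS:
            return None                     # region full: caller falls back
        slot = self.next_slot
        self.slots[slot] = h                # hash stored right before the string
        self.slots[slot + 1] = s            # simulation: whole string in one entry
        self.next_slot += need
        self.buckets[b] = (extract, slot)
        return slot


ussr = USSR()
a = ussr.intern(b"hello")
b = ussr.intern(b"world")
again = ussr.intern(b"hello")               # same slot as the first insert
base = 7 << OFFSET_BITS                     # hypothetical self-aligned base address
```

Since every string needs at least two slots, at most 2^15 strings fit, which is why the 2^16-bucket table never exceeds a 50% load factor and the probe loop always terminates.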
Experiments
Micro-bench: Faster HashJoin Probe
● Micro-benchmark of Domain-Guided Prefix Suppression
● 4 keys [0...1000], 4 payloads [0...10]
● 2.5x faster hash probe, including the tuple reconstruction cost
● > 10^6 rows: the speedups were caused by the more cache-resident hash table
● < 10^6 rows: mostly affected by the more efficient comparisons directly on compressed data
Micro-bench: USSR and Group-By
● SELECT COUNT(*) FROM T GROUP BY s
● 10 unique strings, all strings had the same length
● the time spent on string comparisons when checking the keys inside the GROUP BY hash table
● the time spent on computing the hash of the string keys
TPC-H (sf = 100): memory footprint
● Across TPC-H we measured up to 2.1x lower memory consumption
● However, Optimistic Splitting in fact increases (rather than reduces) the overall memory consumption, as it introduces additional data
● The main idea behind Optimistic Splitting is to reduce memory pressure rather than overall memory consumption
TPC-H (sf = 100): query performance
● USSR alone: Q4, Q12, Q16 benefit from faster string hashing and equality comparisons
● CHT alone: improvement of at least 10%, provided by a) more efficient expression evaluation on smaller data types and b) more cache-efficient hash table operations on compressed keys
● CHT + OPTIMISTIC + USSR: Q1, Q15 benefited from the Optimistic SUM aggregate, which boosted the aggregate computation
● Q2: the regression was caused by type casting overhead, which occurred when operating on compact data types
Faster Real-World Workload (Public BI)
● string heavy
● “CommonGovernment” workbook:
Thank You !
