Jon Haddad, Rustyrazorblade Consulting
rustyrazorblade.com
Cassandra Performance Tuning Like You've Been Doing It for Ten Years
• DataStax
• The Last Pickle
• Committer & PMC
• Apple
• Netflix
• Out on my own!
Last 10 Years
Why do we care about performance?
Expert Mindset
• Methodology
• Observability
• Practice
Methodology
OODA Loop
USE Method
For every resource, check
• Utilization
• Saturation
• Errors
utilization: the average time that the resource was busy servicing work
example: cpu at 90% utilization
saturation: the degree to which the resource has extra work which it can't service, often queued
example: i/o device queue of 100
errors: the count of error events
example: tcp retransmit
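The USE checklist can be sketched as a small data structure: one reading per resource, checked for utilization, saturation, and errors. This is a minimal illustration, not a monitoring tool. The `UseReading` class, the `use_findings` function, and every limit in it (80% utilization, zero queued work, zero errors) are assumptions made for the example; the sample readings reuse the slide examples (cpu at 90%, an i/o queue of 100, tcp retransmits).

```python
from dataclasses import dataclass


@dataclass
class UseReading:
    """One USE snapshot for a single resource (hypothetical structure)."""
    resource: str
    utilization_pct: float  # share of time the resource was busy
    saturation: int         # queued work the resource could not service yet
    errors: int             # count of error events


def use_findings(r, util_limit=80.0, sat_limit=0, err_limit=0):
    """Return a list of findings for one resource. Limits are illustrative."""
    findings = []
    if r.utilization_pct > util_limit:
        findings.append(f"{r.resource}: utilization {r.utilization_pct:.0f}%")
    if r.saturation > sat_limit:
        findings.append(f"{r.resource}: saturation, {r.saturation} queued")
    if r.errors > err_limit:
        findings.append(f"{r.resource}: {r.errors} errors")
    return findings


# Sample readings modeled on the examples above (values are made up).
readings = [
    UseReading("cpu", 90.0, 0, 0),
    UseReading("disk", 40.0, 100, 0),
    UseReading("network", 10.0, 0, 3),
]

for reading in readings:
    for finding in use_findings(reading):
        print(finding)
```

Walking every resource through the same three questions is the point; the thresholds should come from your own baseline.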
Observability
Anti-Patterns
• Using averages (latency) or anything under p99
• Hiding outliers
• Averaging Summary Data
• Not understanding your tool's output
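Two of these anti-patterns are easy to show with numbers. The sketch below uses made-up latency samples (in milliseconds) for two hypothetical nodes and a nearest-rank `percentile` helper written for this example. It shows the mean hiding a slow tail, and the average of per-node p99 values disagreeing with the p99 of the combined samples.

```python
import math
import statistics


def percentile(samples, pct):
    """Nearest-rank percentile: the value at rank ceil(pct% of n) in sorted order."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(len(ordered) * pct / 100))
    return ordered[rank - 1]


# Made-up latencies in ms: node_a has a slow tail, node_b does not.
node_a = [1] * 90 + [500] * 10
node_b = [1] * 100

# Anti-pattern 1: the average hides the tail that users actually feel.
mean_a = statistics.mean(node_a)
p99_a = percentile(node_a, 99)

# Anti-pattern 2: averaging summary data. The mean of two p99 values
# is not the p99 of the combined population.
avg_of_p99 = statistics.mean([percentile(node_a, 99), percentile(node_b, 99)])
true_p99 = percentile(node_a + node_b, 99)

print(f"node_a mean={mean_a:.1f} ms, p99={p99_a} ms")
print(f"average of per-node p99={avg_of_p99} ms, combined p99={true_p99} ms")
```

To get a cluster-wide percentile, combine the raw samples or the histograms, then compute the percentile once.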
System Tools
high level: sysstat
# mpstat -P ALL 1 10
Linux 5.15.0-89-generic (ubuntu-vm) 12/08/2023 _aarch64_ (2 CPU)
12:21:55 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:21:56 AM all 2.16 0.00 23.24 74.59 0.00 0.00 0.00 0.00 0.00 0.00
12:21:56 AM 0 2.17 0.00 22.83 75.00 0.00 0.00 0.00 0.00 0.00 0.00
12:21:56 AM 1 2.15 0.00 23.66 74.19 0.00 0.00 0.00 0.00 0.00 0.00
12:21:56 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle
12:21:57 AM all 3.16 0.00 25.79 71.05 0.00 0.00 0.00 0.00 0.00 0.00
12:21:57 AM 0 3.19 0.00 25.53 71.28 0.00 0.00 0.00 0.00 0.00 0.00
12:21:57 AM 1 3.12 0.00 26.04 70.83 0.00 0.00 0.00 0.00 0.00 0.00
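Reading this output correctly matters: %idle is 0, but most of the time is %iowait, so the CPUs are waiting on I/O rather than running user code. A small sketch of pulling those columns out of one sample follows. The `parse_mpstat` helper is hypothetical, written for this example, and it assumes the header and data row have the same layout as the output above.

```python
HEADER = "12:21:55 AM CPU %usr %nice %sys %iowait %irq %soft %steal %guest %gnice %idle"
ROW = "12:21:56 AM all 2.16 0.00 23.24 74.59 0.00 0.00 0.00 0.00 0.00 0.00"


def parse_mpstat(header, row):
    """Map column names to values for one mpstat row.

    Locates the CPU column by name so the timestamp format
    (with or without AM/PM) does not matter.
    """
    cols = header.split()
    vals = row.split()
    start = cols.index("CPU")
    stats = {name: float(v) for name, v in zip(cols[start + 1:], vals[start + 1:])}
    return vals[start], stats


cpu, stats = parse_mpstat(HEADER, ROW)
print(f"cpu={cpu} usr={stats['%usr']} sys={stats['%sys']} "
      f"iowait={stats['%iowait']} idle={stats['%idle']}")
```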
ebpf / bcc-tools
Q: What happens if we give Cassandra a bigger heap?
# cachestat-bpfcc 1 10
HITS MISSES DIRTIES HITRATIO BUFFERS_MB CACHED_MB
760 20159 0 3.63% 23 454
828 21858 0 3.65% 23 381
202 16282 0 1.23% 23 414
995 15487 5 6.04% 23 475
1433 12313 0 10.42% 23 523
2966 16771 0 15.03% 23 589
5002 19346 0 20.54% 23 664
6451 18185 0 26.19% 23 735
8391 18053 0 31.73% 23 806
11036 17924 0 38.11% 23 876
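The HITRATIO column can be reproduced from the HITS and MISSES columns as hits / (hits + misses). A short sketch using three rows copied from the output above; the `hit_ratio_pct` function name is just for this example.

```python
def hit_ratio_pct(hits, misses):
    """Page cache hit ratio as a percentage of all lookups."""
    total = hits + misses
    return 100.0 * hits / total if total else 0.0


# (HITS, MISSES) pairs copied from the cachestat output above.
rows = [(760, 20159), (202, 16282), (11036, 17924)]

for hits, misses in rows:
    print(f"hits={hits:>6} misses={misses:>6} ratio={hit_ratio_pct(hits, misses):.2f}%")
```

In this capture CACHED_MB and the hit ratio climb together over the ten seconds, which is the relationship to watch when memory available to the page cache changes.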
# ext4dist-bpfcc -m 1 20
Tracing ext4 operation latency... Hit Ctrl-C to end.
23:28:18:
operation = read
msecs : count distribution
0 -> 1 : 18781 |****************************************|
2 -> 3 : 15 | |
4 -> 7 : 5 | |
operation = write
msecs : count distribution
0 -> 1 : 38182 |****************************************|
2 -> 3 : 2 | |
4 -> 7 : 2 | |
# biolatency-bpfcc -QDmT 1 10
Tracing block device I/O... Hit Ctrl-C to end.
23:13:48
disk = b'vda'
msecs : count distribution
0 -> 1 : 10960 |****************************************|
2 -> 3 : 53 | |
4 -> 7 : 8 | |
8 -> 15 : 1 | |
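A bucketed histogram like this already contains the tail. As a sketch, the bucket holding a given percentile can be found by walking the cumulative counts. The bucket data below is copied from the vda output above; the `bucket_for_percentile` helper is hypothetical and uses a nearest-rank definition.

```python
import math

# (low_ms, high_ms, count) copied from the biolatency output for vda above.
VDA = [(0, 1, 10960), (2, 3, 53), (4, 7, 8), (8, 15, 1)]


def bucket_for_percentile(buckets, pct):
    """Return the (low, high) bucket that contains the pct-th percentile."""
    total = sum(count for _, _, count in buckets)
    if total == 0:
        raise ValueError("empty histogram")
    rank = math.ceil(total * pct / 100)  # nearest-rank position, 1-based
    seen = 0
    for low, high, count in buckets:
        seen += count
        if seen >= rank:
            return (low, high)
    return buckets[-1][:2]


for pct in (50, 99, 99.9, 100):
    low, high = bucket_for_percentile(VDA, pct)
    print(f"p{pct}: {low} -> {high} ms")
```

Here p99 still falls in the 0 -> 1 ms bucket, while the single slowest request sits in 8 -> 15 ms, which an average would never reveal.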
Flame Graphs
compression =
{'sstable_compression': 'LZ4Compressor',
'chunk_length_kb': ?};
[Diagram: blocks of binary data; write to sstable, then read from sstable]
$ bin/nodetool tablehistograms test foo
test/foo histograms
Percentile  Read Latency  Write Latency  SSTables  Partition Size  Cell Count
            (micros)      (micros)                 (bytes)
50%         785.94        379.02         1.00      35              1
75%         4055.27       454.83         1.00      35              1
95%         4055.27       454.83         1.00      35              1
98%         4055.27       454.83         1.00      35              1
99%         4055.27       454.83         1.00      35              1
Min         654.95        315.85         1.00      30              0
Max         4055.27       454.83         1.00      35              1
ALTER TABLE test.foo WITH
compression =
{'sstable_compression': 'LZ4Compressor',
'chunk_length_kb': 4};
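Why a smaller chunk helps small reads can be sketched with arithmetic. The model below assumes a read must fetch and decompress every whole chunk that the requested bytes touch, and it ignores compression ratio, caching, and index lookups. The 35-byte partition size comes from the tablehistograms output above; the function names and the chunk sizes other than 4 are illustrative choices, not recommendations.

```python
def chunks_touched(offset, length, chunk_bytes):
    """Number of fixed-size chunks overlapped by bytes [offset, offset + length)."""
    first = offset // chunk_bytes
    last = (offset + length - 1) // chunk_bytes
    return last - first + 1


def read_amplification(length, chunk_kb, offset=0):
    """Uncompressed bytes processed per byte requested, under the model above."""
    chunk_bytes = chunk_kb * 1024
    return chunks_touched(offset, length, chunk_bytes) * chunk_bytes / length


PARTITION_BYTES = 35  # partition size reported by tablehistograms above

for chunk_kb in (64, 16, 4):  # illustrative chunk lengths
    amp = read_amplification(PARTITION_BYTES, chunk_kb)
    print(f"chunk_length_kb={chunk_kb:>2}: ~{amp:,.0f}x bytes processed per byte requested")
```

Smaller chunks generally compress less well, so this is a trade-off to measure against your own read path, not a universal setting.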
Read Before Write
• Lightweight Transactions
• Counters
Read Ahead
More read amplification
Turn it down!
# blockdev --report
RO RA SSZ BSZ StartSec Size Device
ro 256 512 1024 0 62124032 /dev/loop0
ro 256 512 1024 0 62140416 /dev/loop1
ro 256 512 1024 0 48668672 /dev/loop2
ro 256 512 1024 0 114929664 /dev/loop3
ro 256 512 1024 0 37240832 /dev/loop4
rw 256 512 4096 0 68719476736 /dev/vda
# blockdev --setra 8 /dev/vda
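The RA column and the --setra argument are counted in 512-byte sectors, so the numbers above convert to sizes as follows. A tiny sketch; the `readahead_kib` helper is a name made up for this example.

```python
SECTOR_BYTES = 512  # blockdev reports and sets readahead in 512-byte sectors


def readahead_kib(ra_sectors):
    """Convert a blockdev readahead value (sectors) to KiB."""
    return ra_sectors * SECTOR_BYTES / 1024


for ra in (256, 8):
    print(f"RA={ra:>3} sectors -> {readahead_kib(ra):.0f} KiB read ahead")
```

An RA of 256 means up to 128 KiB may be pulled in around a small read; 8 brings that down to 4 KiB, the same size as the chunk_length_kb of 4 used earlier.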
Practice
Set Up A Lab Environment
• tlp-cluster
• AxonOps
Benchmark
• tlp-stress
• nosqlbench
• ndbench
• Methodology
• Observability
• Practice
Announcing Training!
Thank You!!
