Cassandra troubleshooting:
    out of the shadows

      Benjamin Black, b@b3k.us
Introducing: This Guy
The Allegory of the Cave
Most people start
troubleshooting problems
by interpreting shadows on the
wall.
Common shadows.
Paths out of the cave.
Combination of
basic system tools
&
nodetool/JMX
I’m using RP (RandomPartitioner).
My ring is very unbalanced.
                   WTF?
nodetool ring
Address         Status  Load      Range                                     Ring
                                  148873535527910577765226390751398592511
10.248.54.192   Up      5.59 GB   0                                         |<--|
10.248.254.15   Up      10.58 GB  42535295865117307932921825928971026431    |   ^
10.248.135.239  Up      11.01 GB  85070591730234615865843651857942052863    v   |
10.248.223.191  Up      5.42 GB   106338239662793269832304564822427566079   |   ^
10.248.122.240  Up      5.51 GB   127605887595351923798765477786913079295   v   |
10.248.34.80    Up      5.45 GB   148873535527910577765226390751398592511   |-->|
Autobootstrap
+
Automatic token assignment
Automatic token algorithm:

Assign a token that will give me
half the range of
the most loaded node.
32
16 16
8 8 16
8 8 8 8
4 4 8 8 8
4 4 4 4 8 8
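The sequence above can be reproduced with a small simulation. This is only a sketch of the rule as the slide states it, not Cassandra's actual bootstrap code. It assumes a node's load is proportional to the size of its range, and the name `bootstrap_history` is made up for illustration.

```ruby
# Simplified model of the automatic token algorithm described above.
# Assumption: a node's load is proportional to the size of its range.
def bootstrap_history(total_range, node_count)
  ranges = [total_range]
  history = [ranges.dup]
  (node_count - 1).times do
    busiest = ranges.index(ranges.max) # first node with the largest range
    half = ranges[busiest] / 2
    # The new node takes half of that range; the old node keeps the rest.
    ranges[busiest, 1] = [half, ranges[busiest] - half]
    history << ranges.dup
  end
  history
end

# One line per ring size, from 1 node up to 6 nodes.
bootstrap_history(32, 6).each { |ranges| puts ranges.join(" ") }
```

The last line printed is `4 4 4 4 8 8`: two nodes still own twice the range of the others, which matches the two nodes carrying roughly twice the load in the ring output.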
nodetool move
+
Manual token assignment
Token range: 0 to 2**127 - 1

# Print evenly spaced tokens for a ring of `nodes` nodes.
def tokens(nodes)
  0.upto(nodes - 1) do |n|
    # Integer arithmetic keeps the 127-bit values exact.
    puts n * (2**127 - 1) / nodes
  end
end

tokens(6)
# prints:
0
28356863910078205288614550619314017621
56713727820156410577229101238628035242
85070591730234615865843651857942052863
113427455640312821154458202477256070484
141784319550391026443072753096570088105
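As a sketch of how those tokens might be applied to the unbalanced ring shown earlier, the snippet below pairs each node, in ring order, with its balanced token and prints a move command for every node whose token has to change. The addresses and current tokens are copied from the ring output. The `nodetool -h <host> move <token>` form and the helper names are assumptions, so check them against your Cassandra version before running anything.

```ruby
# Current ring, copied from the `nodetool ring` output: address => token.
CURRENT_RING = {
  "10.248.54.192"  => 0,
  "10.248.254.15"  => 42535295865117307932921825928971026431,
  "10.248.135.239" => 85070591730234615865843651857942052863,
  "10.248.223.191" => 106338239662793269832304564822427566079,
  "10.248.122.240" => 127605887595351923798765477786913079295,
  "10.248.34.80"   => 148873535527910577765226390751398592511
}.freeze

# Evenly spaced tokens, same formula as tokens(nodes).
def balanced_tokens(count)
  (0...count).map { |n| n * (2**127 - 1) / count }
end

# One command per node whose token differs from its balanced target.
def move_commands(ring)
  targets = balanced_tokens(ring.size)
  ring.sort_by { |_host, token| token }
      .zip(targets)
      .reject { |(_host, token), target| token == target }
      .map { |(host, _token), target| "nodetool -h #{host} move #{target}" }
end

puts move_commands(CURRENT_RING)
```

Only the node at token 0 stays put, so five of the six nodes get a command. In this example every node moves to a lower token, so running the moves one at a time in the printed order should avoid two nodes ever claiming the same token.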
YES:

This means you need to change tokens on
most of the nodes in your cluster whenever
you add a node.
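To make that cost concrete, here is a rough count of how many existing nodes would have to change tokens when a balanced ring grows. It is an idealized sketch: it assumes every node sits exactly on an evenly spaced token, and `ideal_tokens` and `nodes_to_move` are hypothetical helper names.

```ruby
RING_MAX = 2**127 - 1 # upper end of the token range used above

# Evenly spaced tokens for a ring of `count` nodes.
def ideal_tokens(count)
  (0...count).map { |n| n * RING_MAX / count }
end

# Existing nodes whose token does not appear in the new balanced layout.
def nodes_to_move(old_count, new_count)
  kept = ideal_tokens(old_count) & ideal_tokens(new_count)
  old_count - kept.size
end

puts nodes_to_move(6, 7)  # => 5
puts nodes_to_move(6, 12) # => 0
```

Growing from 6 to 7 nodes moves every existing node except the one at token 0. Doubling the ring is the special case in this model: every old token is still a valid balanced token, so the new nodes can simply be placed in between.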
Writes are fast.
Reads keep getting slower.
                   WTF?
iostat -x
look at %util
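A minimal sketch of pulling `%util` out of `iostat -x` output, assuming the text has already been captured into a string. The sample text and its numbers are invented for illustration, the 90% cutoff is an arbitrary choice, and `util_by_device` is a hypothetical helper. The column is located by its header name because the column layout differs between sysstat versions.

```ruby
# Sample `iostat -x` text. The numbers are invented for illustration only.
SAMPLE_IOSTAT = <<~OUT
  Device:         rrqm/s   wrqm/s     r/s     w/s   rsec/s   wsec/s avgrq-sz avgqu-sz   await  svctm  %util
  sda               0.00    12.00  310.00    4.00 24800.00   128.00    79.39     9.80   31.20   3.15  99.10
  sdb               0.00     0.50    2.00    1.00    40.00    12.00    17.33     0.01    2.10   1.00   0.30
OUT

# Map each device to its %util, locating the column by its header name.
def util_by_device(text)
  rows = text.lines.map(&:split).reject(&:empty?)
  header_at = rows.index { |fields| fields.include?("%util") }
  return {} if header_at.nil?

  col = rows[header_at].index("%util")
  rows.drop(header_at + 1)
      .select { |fields| fields.size > col }
      .map { |fields| [fields[0], fields[col].to_f] }
      .to_h
end

util_by_device(SAMPLE_IOSTAT).each do |device, util|
  flag = util >= 90.0 ? "  <-- saturated?" : "" # arbitrary cutoff
  puts format("%-6s %5.1f%%%s", device, util, flag)
end
```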
nodetool tpstats
Pool Name                    Active   Pending      Completed
STREAM-STAGE                      0         0              0
RESPONSE-STAGE                    0         0         516280
ROW-READ-STAGE                    8      4096        1164326
LB-OPERATIONS                     0         0              0
MESSAGE-DESERIALIZER-POOL         1    682008        1818682
GMFD                              0         0           6467
LB-TARGET                         0         0              0
CONSISTENCY-MANAGER               0         0         661477
ROW-MUTATION-STAGE                0         0         998780
MESSAGE-STREAMING-POOL            0         0              0
LOAD-BALANCER-STAGE               0         0              0
FLUSH-SORTER-POOL                 0         0              0
MEMTABLE-POST-FLUSHER             0         0              4
FLUSH-WRITER-POOL                 0         0              4
AE-SERVICE-STAGE                  0         0              0
HINTED-HANDOFF-POOL               0         0              3
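One way to read this table is to pick out the pools with a non-zero Pending count. The sketch below does that for a few rows copied from the output above; `backed_up_pools` is a hypothetical helper, and it assumes the four-column layout shown here.

```ruby
# A few rows copied from the tpstats output above.
TPSTATS = <<~OUT
  Pool Name                    Active   Pending      Completed
  RESPONSE-STAGE                    0         0         516280
  ROW-READ-STAGE                    8      4096        1164326
  MESSAGE-DESERIALIZER-POOL         1    682008        1818682
  ROW-MUTATION-STAGE                0         0         998780
OUT

# Return [pool, pending] pairs for every pool with queued work.
def backed_up_pools(text)
  text.lines.drop(1) # skip the header row
      .map(&:split)
      .select { |fields| fields.size == 4 }
      .map { |name, _active, pending, _completed| [name, pending.to_i] }
      .select { |_name, pending| pending > 0 }
end

backed_up_pools(TPSTATS).each { |name, pending| puts "#{name}: #{pending} pending" }
```

For this output it reports ROW-READ-STAGE and MESSAGE-DESERIALIZER-POOL, while ROW-MUTATION-STAGE has nothing queued: reads are backing up and writes are not.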
YOU ARE OUT OF
DISK BANDWIDTH
You can:
Throttle reads at clients
Adjust memtable settings
    (size/ops/time)
Less frequent memtable flush
-> less frequent compaction
-> less disk bandwidth demand
Add more nodes
Add more spindles per node
Switch to SSDs
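For the memtable option above, a back-of-the-envelope model shows why a larger size threshold means fewer flushes. It is a rough sketch only: it assumes a single column family, a steady write rate, and that only the size threshold triggers a flush (the ops and time thresholds are ignored). The write rate and sizes are made-up numbers.

```ruby
# Rough model: a memtable is flushed each time it reaches its size threshold.
# It ignores the ops and time thresholds. All numbers below are made up.
def flushes_per_hour(write_mb_per_sec, memtable_mb)
  write_mb_per_sec * 3600.0 / memtable_mb
end

[64, 128, 256].each do |memtable_mb|
  rate = flushes_per_hour(2, memtable_mb)
  puts format("%4d MB memtable: %6.1f flushes/hour", memtable_mb, rate)
end
```

Each doubling of the threshold halves the number of SSTables written per hour, which in turn gives compaction less to do. The trade-off is more memory held per memtable.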
I inserted a bunch of data.
Now my nodes are flapping.
                  WTF?
iostat -x
look at %util
vmstat
look at swap
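A small sketch of what "look at swap" means in practice: the `si` and `so` columns of `vmstat` (swap-in and swap-out) should stay at zero on a healthy node. The sample text and its numbers are invented for illustration, and `swap_activity` is a hypothetical helper that finds the columns by header name.

```ruby
# Sample `vmstat` text. The numbers are invented for illustration only.
SAMPLE_VMSTAT = <<~OUT
  procs -----------memory---------- ---swap-- -----io---- -system-- ----cpu----
   r  b   swpd   free   buff  cache   si   so    bi    bo   in   cs us sy id wa
   2  5 812340  51200   1024 903412  412  630  9120   310 1502 2804 11  6 21 62
OUT

# Return the swap-in / swap-out values for every sample row.
def swap_activity(text)
  rows = text.lines.map(&:split)
  header_at = rows.index { |fields| fields.include?("si") && fields.include?("so") }
  return [] if header_at.nil?

  si = rows[header_at].index("si")
  so = rows[header_at].index("so")
  rows.drop(header_at + 1)
      .select { |fields| fields.size > [si, so].max }
      .map { |fields| { si: fields[si].to_i, so: fields[so].to_i } }
end

swap_activity(SAMPLE_VMSTAT).each do |sample|
  swapping = sample[:si] > 0 || sample[:so] > 0
  puts "si=#{sample[:si]} so=#{sample[:so]} #{swapping ? 'SWAPPING' : 'ok'}"
end
```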
INFO 13:27:35,309
DiskAccessMode 'auto' determined to be mmap,
indexAccessMode is mmap
mmap() in Cassandra
consumes up to 2GB.
Per segment.
NOT tracked as JVM heap.
JVM heap not locked in
memory.

           *See: https://issues.apache.org/jira/browse/CASSANDRA-1214
When your data set exceeds
memory,
swapping is likely.
Swapping can delay gossip
long enough to cause a
node to be marked down.
In storage-conf.xml (0.6):
<DiskAccessMode>mmap_index_only</DiskAccessMode>
or, in cassandra.yaml (0.7 and later):
disk_access_mode: mmap_index_only
On Linux: swappiness=0
INFO 13:27:35,309
DiskAccessMode is standard,
indexAccessMode is mmap
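To confirm the change took effect on every node, the startup log line can be checked mechanically. The sketch below matches the two message shapes shown in this deck, each joined onto a single line. The pattern and the `access_modes` helper are assumptions based only on those two examples, so other Cassandra versions may word the message differently.

```ruby
# Matches both message shapes shown above:
#   DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap
#   DiskAccessMode is standard, indexAccessMode is mmap
LOG_PATTERN = /DiskAccessMode (?:'\w+' determined to be |is\s*)(\w+), indexAccessMode is (\w+)/

# Return the data and index access modes, or nil if the line does not match.
def access_modes(line)
  match = LOG_PATTERN.match(line)
  match && { data: match[1], index: match[2] }
end

before = "INFO 13:27:35,309 DiskAccessMode 'auto' determined to be mmap, indexAccessMode is mmap"
after  = "INFO 13:27:35,309 DiskAccessMode is standard, indexAccessMode is mmap"

[before, after].each do |line|
  modes = access_modes(line)
  puts "data files: #{modes[:data]}, index files: #{modes[:index]}"
end
```

With `mmap_index_only`, the data files should report `standard` while the index files stay on `mmap`.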
Most people start
troubleshooting problems
by interpreting shadows on the
wall.
You can now see the path
and the sunlight outside.
YOU CAN HELP!
What things have confused
you?
What problems have you
solved?
What tools have you used to
solve them?
GET INVOLVED!
http://wiki.apache.org/cassandra
#cassandra on freenode

Cassandra Summit 2010 - Operations & Troubleshooting Intro
