Apache Cassandra in the Real World
Jeremy Hanna
Support Engineer

©2013 DataStax Confidential. Do not distribute without consent.
Cassandra Design
•Massive scalability
•Multi-datacenter

•High Performance
•Reliability/Availability
•no SPOF, no special roles
Multi-DC Replication
Ops Friendly
•Simple design
•no special role, no single point of failure

•Lots of exposed metrics via JMX
•Nodes and entire datacenters can go down with no loss of service

•DataStax OpsCenter
•Visual monitoring tool
•REST interface to metric data
•Free version
•Hands-off services
Developer friendly
•CQL3
•Collections (Set, Map, List)
•Cassandra native drivers
•Native paging
•Tracing
•DataStax DevCenter tool
•Atomic batches
•Lightweight transactions
•Triggers
CQL3 examples
CREATE USER bombadil WITH PASSWORD 'goldberry4ever' SUPERUSER;
CREATE KEYSPACE shire WITH
  REPLICATION = {'class': 'NetworkTopologyStrategy', 'eu' : 3, 'us-east' : 2};
GRANT ALTER ON KEYSPACE shire TO gandalf;
SELECT * FROM emp WHERE empID IN (130,104) ORDER BY deptID DESC;
INSERT INTO excelsior.clicks (userid, url, date, name)
VALUES (
  3715e600-2eb0-11e2-81c1-0800200c9a66,
  'http://cassandra.apache.org',
  '2013-10-09',
  'Mary')
USING TTL 86400;
UPDATE users
SET email = 'charlie@wonka.com'
WHERE login = 'cbucket64'
IF email = 'cbucket@wonka.com';
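The last statement above is a lightweight transaction: the UPDATE only takes effect if the stored email still matches the value in the IF clause. As a rough illustration of that compare-and-set behaviour, here is a minimal in-memory Python sketch. The `users` dictionary and the `update_email_if` helper are hypothetical names invented for this example and are not part of any Cassandra driver; the sketch also ignores the cross-replica coordination that Cassandra performs for conditional writes.

```python
# In-memory stand-in for the users table, keyed by login (hypothetical data).
users = {"cbucket64": {"email": "cbucket@wonka.com"}}


def update_email_if(login: str, expected: str, new: str) -> bool:
    """Set the email only if the stored value equals `expected`.

    Returns True when the update was applied and False otherwise,
    loosely mirroring the applied / not-applied result of a conditional UPDATE.
    """
    row = users.get(login)
    if row is None or row["email"] != expected:
        return False
    row["email"] = new
    return True


first = update_email_if("cbucket64", "cbucket@wonka.com", "charlie@wonka.com")
# A second attempt with the now-stale expected value is rejected.
second = update_email_if("cbucket64", "cbucket@wonka.com", "other@wonka.com")
print(first, second, users["cbucket64"]["email"])
```

The second call fails because the condition is checked against the current value, which is what lets two writers race without silently overwriting each other.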
Some C* Users
Netflix
•50 clusters, 750 nodes
•Nearly all data in Cassandra
•film metadata
•user ratings
•recommendations

•Interesting use case because:
•Sheer size and how much they depend on it
•Multi-region (effectively multi-datacenter) within AWS
•Highly available (through various AWS outages)

See also: http://planetcassandra.org/blog/post/case-study-netflix
La Poste
•Use case: parcel distribution metadata
•From MySQL to Cassandra
•Holiday load doubles
•4 million parcels/day
•Average day for one of 70,000 postmen
•Scan parcels
•Print parcel list
•Deliver parcels
•Scans remaining, held up to 15 days (TTL)
See also: http://www.slideshare.net/planetcassandra/c-summit-eu-2013-delivering-christmas-gifts-in-france-since-2012
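CQL expresses a TTL in seconds, so the 15-day retention on this slide has to be converted before it can be used in a `USING TTL` clause. The Python sketch below does that arithmetic and models when a row would count as expired. The scan timestamp and the `is_expired` helper are made up for illustration and say nothing about how La Poste actually implemented this.

```python
from datetime import datetime, timedelta

# Retention period taken from the slide; TTLs in CQL are given in seconds.
RETENTION = timedelta(days=15)
ttl_seconds = int(RETENTION.total_seconds())


def is_expired(written_at: datetime, now: datetime, ttl: int = ttl_seconds) -> bool:
    """Return True once `ttl` seconds have elapsed since the write."""
    return now >= written_at + timedelta(seconds=ttl)


scanned = datetime(2013, 12, 10, 8, 0)  # hypothetical scan time
print(ttl_seconds)                                        # 1296000
print(is_expired(scanned, datetime(2013, 12, 24)))        # False
print(is_expired(scanned, datetime(2013, 12, 25, 8, 0)))  # True
```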
Rackspace
•Use case: multi-tenant cloud monitoring services
•Common time series use case
•raw metric data at varying intervals
•raw data expires using TTLs

•Supports
•Ingestion through modular sources
•Rollups
•Servicing queries at various resolutions

•Currently ingests 120 million metrics/hour
•See Blueflood.io for project details
See also: http://www.slideshare.net/gdusbabek/blueflood-open-source-metrics-processing-at-cassandraeu-2013
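To make the rollup idea concrete, the following Python sketch averages raw samples into fixed-width time buckets so that the same metric can be served at a coarser resolution. It is only an illustration of the general technique: the sample data, the `rollup` function, and the choice of a plain average are assumptions for this example, not a description of how Blueflood computes its rollups.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical raw samples for one metric: (unix_timestamp_seconds, value).
raw = [(0, 1.0), (30, 3.0), (290, 5.0), (300, 10.0), (450, 20.0)]


def rollup(points, bucket_seconds):
    """Average the points that fall in each fixed-width time bucket."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Align each timestamp to the start of its bucket.
        buckets[ts - ts % bucket_seconds].append(value)
    return {start: mean(values) for start, values in sorted(buckets.items())}


five_minute = rollup(raw, 300)
ten_minute = rollup(raw, 600)
print(five_minute)  # {0: 3.0, 300: 15.0}
print(ten_minute)
```

Queries at a coarse resolution can then read the small rolled-up series, while the raw points are left to expire through their TTL.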
Spotify
•Use case began with playlist storage
•Grew significantly beyond that
•Some playlist details
•Essentially a version control system
•More than 1 billion playlists
•>40,000 requests/second at peak
•Off-line mode (both access and changes)
•Concurrent changes

See also: http://www.slideshare.net/planetcassandra/c-summit-eu-2013-playlists-at-spotify-using-cassandra-to-store-version-controlled-objects
Questions?
•@jeromatron on Twitter and in #cassandra on IRC
•More real world cases
•http://planetcassandra.org/FiveMinuteInterviews

•DataStax
•Free online training
•Free developer tools
