Apache Cassandra
    Present and Future
    Jonathan Ellis




Monday, October 24, 2011
History


    ✤    Bigtable, 2006
    ✤    Dynamo, 2007
    ✤    OSS, 2008
    ✤    Incubator, 2009
    ✤    TLP, 2010
    ✤    1.0, October 2011

Why people choose Cassandra

    ✤    Multi-master, multi-DC
    ✤    Linearly scalable
    ✤    Larger-than-memory datasets
    ✤    High performance
    ✤    Full durability
    ✤    Integrated caching
    ✤    Tuneable consistency
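
Tuneable consistency is often explained with an overlap rule: with N replicas, a read that contacts R replicas is guaranteed to see the latest write acknowledged by W replicas whenever R + W > N. The sketch below only illustrates that arithmetic; the function names are hypothetical and are not Cassandra APIs.

```python
def quorum(replication_factor: int) -> int:
    """Smallest majority of the replicas."""
    return replication_factor // 2 + 1


def read_sees_latest_write(n: int, r: int, w: int) -> bool:
    """True when every read set must overlap every write set."""
    return r + w > n


n = 3
q = quorum(n)
# Quorum reads combined with quorum writes always overlap on one replica.
strong = read_sees_latest_write(n, r=q, w=q)
# Reading one replica and writing one replica may miss each other.
weak = read_sees_latest_write(n, r=1, w=1)
```

Lower R and W trade that guarantee for latency and availability, which is the "tuneable" part.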



Cassandra users

    ✤    Financial
    ✤    Social Media
    ✤    Advertising
    ✤    Entertainment
    ✤    Energy
    ✤    E-tail
    ✤    Health care
    ✤    Government

Road to 1.0

    ✤    Storage engine improvements
    ✤    Specialized replication modes
    ✤    Performance and other real-world considerations
    ✤    Building an ecosystem




Compaction

                           Size-Tiered




                           Leveled
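
As a rough illustration of the two strategies: size-tiered compaction merges sstables once enough of similar size accumulate, while leveled compaction keeps fixed-size sstables in levels that each hold a multiple of the previous one. The bucket bounds, the threshold of four, the fanout of ten, and the function names below are assumptions made for this sketch, not values taken from the slides or from Cassandra's code.

```python
def size_tiered_buckets(sizes_mb, low=0.5, high=1.5):
    """Group sstable sizes into buckets of roughly similar size."""
    buckets = []  # each bucket is a list of sizes
    for size in sorted(sizes_mb):
        for bucket in buckets:
            avg = sum(bucket) / len(bucket)
            if low * avg <= size <= high * avg:
                bucket.append(size)
                break
        else:
            # No existing bucket is close enough in size: start a new one.
            buckets.append([size])
    return buckets


def compaction_candidates(sizes_mb, min_threshold=4):
    """Buckets holding enough similar-sized sstables to be merged."""
    return [b for b in size_tiered_buckets(sizes_mb) if len(b) >= min_threshold]


def leveled_capacity_mb(level, sstable_mb=5, fanout=10):
    """Leveled: each level holds `fanout` times more data than the one before."""
    return sstable_mb * fanout ** level


candidates = compaction_candidates([10, 11, 9, 12, 100, 400])
```

Here the four small sstables form one bucket and become a compaction candidate, while the 100 MB and 400 MB sstables wait for peers of similar size.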




Differences in Cassandra’s leveling

    ✤    ColumnFamily instead of key/value
    ✤    Multithreading (experimental)
    ✤    Optional throttling (16MB/s by default)
    ✤    Per-sstable bloom filter for row keys
    ✤    Larger data files (5MB by default)
    ✤    Does not block writes if compaction falls behind
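
The per-sstable bloom filter lets a read skip any sstable that definitely does not contain the requested row key. Below is a minimal sketch of the idea, assuming a toy filter built on SHA-256; the class, its sizes, and `sstables_to_read` are hypothetical and far simpler than a production filter.

```python
import hashlib


class BloomFilter:
    def __init__(self, num_bits=1024, num_hashes=3):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # a Python int used as a bit set

    def _positions(self, key: str):
        # Derive several bit positions by salting the key with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key: str) -> None:
        for pos in self._positions(key):
            self.bits |= 1 << pos

    def might_contain(self, key: str) -> bool:
        # False means "definitely absent"; True means "possibly present".
        return all((self.bits >> pos) & 1 for pos in self._positions(key))


def sstables_to_read(row_key, sstables):
    """Skip sstables whose filter says the row key is definitely absent."""
    return [name for name, bloom in sstables if bloom.might_contain(row_key)]


with_row = BloomFilter()
with_row.add("bsanderson")
empty = BloomFilter()
to_read = sstables_to_read("bsanderson", [("sstable-1", with_row), ("sstable-2", empty)])
```

A filter can return false positives but never false negatives, so a skipped sstable is always safe to skip.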




Column (“secondary”) indexing

cqlsh> CREATE INDEX state_idx ON users(state);
cqlsh> INSERT INTO users (uname, state, birth_date)
                VALUES ('bsanderson', 'UT', 1975);


    users                                        state_idx
                   state   birth_date
    bsanderson     UT      1975                  UT     bsanderson, htayler
    prothfuss      WI      1973                  WI     prothfuss
    htayler        UT      1968
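
The tables above can be read as a map from row key to columns plus an inverted map from each indexed value to the row keys that hold it. The sketch below keeps the two in step on insert, using plain dictionaries as a stand-in; `insert_user` is a hypothetical helper, and this is not how Cassandra stores its index on disk.

```python
from collections import defaultdict

users = {}                     # row key -> column values
state_idx = defaultdict(set)   # indexed value -> row keys


def insert_user(uname, state, birth_date):
    old = users.get(uname)
    if old is not None and old["state"] != state:
        # The indexed value changed: drop the stale index entry.
        state_idx[old["state"]].discard(uname)
    users[uname] = {"state": state, "birth_date": birth_date}
    state_idx[state].add(uname)


insert_user("bsanderson", "UT", 1975)
insert_user("prothfuss", "WI", 1973)
insert_user("htayler", "UT", 1968)
```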




Querying


cqlsh> SELECT * FROM users
                WHERE state='UT' AND birth_date > 1970
                ORDER BY tokenize(key);
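
One way to picture how such a query could be answered: the indexed equality predicate narrows the candidates, the remaining predicate is applied as a filter, and results come back in token order rather than key order. This is a sketch under those assumptions; `token` uses an MD5 hash as a stand-in for the partitioner, and `query` is a hypothetical name.

```python
import hashlib

users = {
    "bsanderson": {"state": "UT", "birth_date": 1975},
    "prothfuss": {"state": "WI", "birth_date": 1973},
    "htayler": {"state": "UT", "birth_date": 1968},
}
state_idx = {"UT": {"bsanderson", "htayler"}, "WI": {"prothfuss"}}


def token(key: str) -> int:
    """Stand-in for the partitioner's token: an MD5 hash of the row key."""
    return int.from_bytes(hashlib.md5(key.encode()).digest(), "big")


def query(state, born_after):
    # The indexed equality predicate selects the candidate rows.
    candidates = state_idx.get(state, set())
    # The remaining predicate is checked row by row.
    matches = [k for k in candidates if users[k]["birth_date"] > born_after]
    # Rows are returned in token order, not key order.
    return sorted(matches, key=token)


result = query("UT", 1970)
```

Only `bsanderson` is both in UT and born after 1970, so `result` holds that single key.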




More sophisticated indexing?

    ✤    Want to:
          ✤   Support more operators
          ✤   Support user-defined ordering
          ✤   Support high-cardinality values
    ✤    But:
          ✤   Scatter/gather scales poorly
          ✤   If it’s not node local, we can’t guarantee atomicity
          ✤   If we can’t guarantee atomicity, double-checking across nodes is a
              huge penalty


Other storage engine improvements

        ✤    Compression
        ✤    Expiring columns
        ✤    Bounded worst-case reads by re-writing fragmented
             rows




Eventually-consistent counters

    ✤    Counter is partitioned by replica; each replica is “master”
         of its own partition
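
A minimal sketch of the partitioning idea, assuming each replica keeps a (clock, count) shard that only it increments, and that merging keeps the shard with the higher clock for each replica. The class and field names are hypothetical; this is not Cassandra's actual counter format.

```python
class PartitionedCounter:
    """One shard per replica; only the owning replica increments its shard."""

    def __init__(self):
        self.shards = {}  # replica id -> (clock, count)

    def increment(self, replica_id, delta=1):
        clock, count = self.shards.get(replica_id, (0, 0))
        self.shards[replica_id] = (clock + 1, count + delta)

    def merge(self, other):
        """Keep, per replica, the shard with the higher clock."""
        for replica_id, shard in other.shards.items():
            mine = self.shards.get(replica_id, (0, 0))
            self.shards[replica_id] = max(mine, shard, key=lambda s: s[0])

    def value(self):
        return sum(count for _, count in self.shards.values())


a, b = PartitionedCounter(), PartitionedCounter()
for _ in range(3):
    a.increment("A")        # replica A increments only its own shard
b.increment("B", 5)
b.increment("B", 5)         # replica B does the same, concurrently
a.merge(b)
b.merge(a)
```

Because no two replicas ever write the same shard, merging is a per-shard "newest wins" and can be repeated safely; both copies converge on 13.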




The interesting parts

    ✤    Tuneable consistency
    ✤    Avoiding contention on local increments
          ✤   Store increments, not full value; merge on read + compaction
    ✤    Renewing counter id to deal with data loss
    ✤    Interaction with tombstones
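
The "store increments, not full value" point can be sketched as an append-only list of deltas that is summed on read and folded into a single total by compaction, so an increment never has to read and rewrite the current value. The class below is a hypothetical illustration of that idea only.

```python
class LocalCounterShard:
    def __init__(self):
        self.pending = []   # appended deltas; no read-before-write
        self.merged = 0     # total already folded in by compaction

    def increment(self, delta):
        self.pending.append(delta)

    def read(self):
        # Merge on read: combine the compacted total with outstanding deltas.
        return self.merged + sum(self.pending)

    def compact(self):
        # Merge on compaction: fold the deltas into one value.
        self.merged += sum(self.pending)
        self.pending.clear()


shard = LocalCounterShard()
for delta in (1, 2, 3):
    shard.increment(delta)
before = shard.read()
shard.compact()
after = shard.read()
```

Compaction changes how the value is stored but not what a read returns.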




(What about version vectors?)

    ✤    Version vectors allow detecting conflict, but do not give
         enough information to resolve it except in special cases
          ✤   In the counters case, we’d need to hold onto all previous versions
              until we can be sure no new conflict with them can occur
          ✤   Jeff Darcy has a longer discussion at
              http://pl.atyp.us/wordpress/?p=2601
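
To make the limitation concrete, here is a small sketch of version vector comparison, with vectors represented as plain dictionaries of per-replica counters (an assumption made for illustration). It can report that two versions conflict, but nothing in the vectors says how to combine the counter values.

```python
def compare(a: dict, b: dict) -> str:
    """Compare two version vectors (replica id -> update count)."""
    replicas = a.keys() | b.keys()
    a_ahead = any(a.get(r, 0) > b.get(r, 0) for r in replicas)
    b_ahead = any(b.get(r, 0) > a.get(r, 0) for r in replicas)
    if a_ahead and b_ahead:
        return "conflict"
    if a_ahead:
        return "a is newer"
    if b_ahead:
        return "b is newer"
    return "equal"


# Two replicas each add 1 to a counter that was 10, starting from {"A": 1, "B": 1}.
version_a = ({"A": 2, "B": 1}, 11)
version_b = ({"A": 1, "B": 2}, 11)
outcome = compare(version_a[0], version_b[0])
```

The comparison reports a conflict, yet both versions carry the value 11 and the correct result is 12; recovering it needs the common ancestor or the individual increments, which the vectors do not hold.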




Performance




     A single four-core machine; one million inserts + one million updates



Dealing with the JVM

    ✤    JNA
          ✤   mlockall()
          ✤   posix_fadvise()
          ✤   link()
    ✤    Memory
          ✤   Move cache off-heap
          ✤   In-heap arena allocation for memtables, bloom filters
          ✤   Move compaction to a separate process?



The Cassandra ecosystem

    ✤    Replication into Cassandra
          ✤   Gigaspaces
          ✤   Drizzle
    ✤    Solandra: Cassandra + Solr search
    ✤    DataStax Enterprise: Cassandra + Hadoop analytics




DataStax Enterprise




Operations

    ✤    “Vanilla” Hadoop
          ✤   8+ services to set up, monitor, back up, and recover
              (NameNode, SecondaryNameNode, DataNode, JobTracker, TaskTracker, Zookeeper,
              Metastore, ...)
          ✤   Single points of failure
          ✤   Can't separate online and offline processing

    ✤    DataStax Enterprise
          ✤   Single, simplified component
          ✤   Peer to peer
          ✤   JobTracker failover
          ✤   No additional Cassandra config



What’s next

    ✤    Ease of use
    ✤    CQL: “Native” transport, prepared statements
    ✤    Triggers
    ✤    Entity groups
    ✤    Smarter range queries enabling Hive predicate push-down
    ✤    Blue sky: streaming / CEP




Questions?

    ✤    jbellis@datastax.com




    ✤    DataStax is hiring!
          ✤   ~15 engineers (35 employees), want to double in 2012
               ✤    Austin, Burlingame, NYC, France, Japan, Belarus
          ✤   100+ customers
          ✤   $11M Series B funding


Cassandra at High Performance Transaction Systems 2011
