NoSQL with Cassandra [email_address]
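Agenda
  • Introduction
  • How it works
  • Data Model
  • Roadmap
Cassandra?
  • A highly scalable, distributed, structured key-value database
  • Apache Top-Level Project
  • Open sourced by Facebook in 2008
  • BigTable + Dynamo (data model in the style of Google's Bigtable, distribution in the style of Amazon's Dynamo)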
 
How does it work?
  • Decentralized (no single point of failure)
  • Fault tolerant
  • Eventually consistent
 
 
Partitioner
  • RandomPartitioner
  • OrderPreservingPartitioner
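A minimal sketch of how the two partitioners differ in placing a row key on the ring. It assumes a ring stored as a sorted list of (token, node) pairs; the helper names `random_token`, `order_preserving_token`, and `owner` are hypothetical and only illustrate the idea, not Cassandra's actual code.

```python
import bisect
import hashlib


def random_token(key: bytes) -> int:
    """RandomPartitioner idea: hash the key with MD5 and use the digest as the token."""
    return int.from_bytes(hashlib.md5(key).digest(), "big")


def order_preserving_token(key: bytes) -> bytes:
    """OrderPreservingPartitioner idea: the key itself is the token, so key order is kept."""
    return key


def owner(ring, token):
    """Return the node owning `token`.

    `ring` is a list of (token, node) pairs sorted by token (an assumption of
    this sketch). The owner is the first node whose token is >= the given
    token, wrapping around to the first node at the end of the ring.
    """
    tokens = [t for t, _ in ring]
    i = bisect.bisect_left(tokens, token)
    return ring[i % len(ring)][1]
```

Hashing spreads keys evenly across nodes but loses key order; keeping the key as the token allows range scans over keys at the cost of possible hot spots.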
Read/Write
When writing
  • Write to a disk commit log (sequential)
  • Replicate
  • Memtable
  • SSTable (stands for Sorted Strings Table)
  • Compaction
  • Tombstone
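The write path above can be sketched as a toy in-memory store. Everything here (the `ToyStore` class, `memtable_limit`, the `TOMBSTONE` marker) is a hypothetical simplification: the commit log is a list rather than a file, replication is left out, and compaction drops tombstones immediately instead of waiting for a grace period.

```python
TOMBSTONE = object()  # marker written in place of a value when a key is deleted


class ToyStore:
    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only; stands in for the sequential disk log
        self.memtable = {}     # in-memory: key -> (value, timestamp)
        self.sstables = []     # immutable tables, each a list sorted by key
        self.memtable_limit = memtable_limit

    def write(self, key, value, timestamp):
        # 1. Append to the commit log first, 2. then update the memtable.
        self.commit_log.append((key, value, timestamp))
        current = self.memtable.get(key)
        if current is None or timestamp >= current[1]:
            self.memtable[key] = (value, timestamp)
        # 3. Flush the memtable to a new SSTable once it is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def delete(self, key, timestamp):
        # A delete is just a write of a tombstone.
        self.write(key, TOMBSTONE, timestamp)

    def flush(self):
        self.sstables.append(sorted(self.memtable.items(), key=lambda kv: kv[0]))
        self.memtable = {}

    def compact(self):
        # Merge all SSTables, keep the newest version of each key,
        # and drop keys whose newest version is a tombstone.
        merged = {}
        for table in self.sstables:
            for key, (value, ts) in table:
                if key not in merged or ts > merged[key][1]:
                    merged[key] = (value, ts)
        live = {k: v for k, v in merged.items() if v[0] is not TOMBSTONE}
        self.sstables = [sorted(live.items(), key=lambda kv: kv[0])]
```

Writes never modify an existing SSTable, which is why they stay sequential and fast; the cleanup cost is paid later during compaction.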
 
When reading
  • Any node
  • Wait for R responses
  • Read repair
  • Hinted handoff
  • Slower than writes (but still fast)
  • Row cache, key cache
  • Scales to billions of rows
CAP theorem
  • Consistency - all nodes see the same data at the same time
  • Availability - node failures do not prevent survivors from continuing to operate
  • Partition tolerance - the system continues to operate despite arbitrary message loss
(from Wikipedia)
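A rough sketch of "wait for R responses" plus read repair. Each replica is modelled as a plain dict mapping a key to a (value, timestamp) pair, and the coordinator simply takes the first `r` replicas in the list; both are assumptions of this sketch, and the function name `read` is hypothetical.

```python
def read(replicas, key, r):
    """Read `key` from the first `r` replicas and return the newest value.

    Replicas that answered with an older (or missing) version are updated
    with the newest one, which is the read-repair step.
    """
    contacted = replicas[:r]
    answers = [rep[key] for rep in contacted if key in rep]
    if not answers:
        return None
    # The version with the highest timestamp wins.
    newest = max(answers, key=lambda col: col[1])
    for rep in contacted:
        if rep.get(key) != newest:
            rep[key] = newest  # repair the stale replica
    return newest[0]
```

For example, with replicas holding `("old", 1)` and `("new", 2)` for the same key and `r=2`, the call returns `"new"` and overwrites the stale copy.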
Consistency
Write
  • ZERO - asynchronously
  • ANY
  • ONE
  • QUORUM - N / 2 + 1
  • ALL
Read
  • ONE - first node
  • QUORUM - most recent timestamp
If W + R > N, you will have consistency:
  • W=1, R=N
  • W=N, R=1
  • W=Q, R=Q where Q = N / 2 + 1 (integer division)
Data Model
  • Column
  • SuperColumn
  • Row
  • ColumnFamily
  • Keyspace
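The W + R > N rule can be checked with a few lines of arithmetic. The function names `quorum` and `overlaps` are hypothetical; N is the replication factor, W the number of replicas that must acknowledge a write, and R the number that must answer a read.

```python
def quorum(n: int) -> int:
    """Q = N / 2 + 1, using integer division."""
    return n // 2 + 1


def overlaps(w: int, r: int, n: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return w + r > n


n = 3
q = quorum(n)                  # 2 replicas out of 3
assert overlaps(1, n, n)       # W=1, R=N
assert overlaps(n, 1, n)       # W=N, R=1
assert overlaps(q, q, n)       # W=Q, R=Q
assert not overlaps(1, 1, n)   # W=1, R=1 may read stale data
```

When the read and write sets overlap, at least one contacted replica holds the latest write, and the newest timestamp identifies it.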
Column { name: "mail", value: "ieon@pixnet.tw", timestamp: 123456789 }
ColumnFamily
User { // Standard CF
    ma19: { // row key
        name: "馬一九", // columns
        phone: "1919119",
        mail: "ma@foo"
    },
    small_ben: {
        name: "陳小扁",
        phone: "4848448",
        mail: "chen@bar",
        is_jailed: "true"
    }
}
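One way to picture a standard column family is as a dict of rows, each row being a dict of columns. This is only a mental model written in Python, with timestamps left out; the helper `get_column` is hypothetical and not part of any client library.

```python
# Row key -> {column name -> value}; rows need not share the same columns.
user = {
    "ma19": {
        "name": "馬一九",
        "phone": "1919119",
        "mail": "ma@foo",
    },
    "small_ben": {
        "name": "陳小扁",
        "phone": "4848448",
        "mail": "chen@bar",
        "is_jailed": "true",
    },
}


def get_column(column_family, row_key, column_name, default=None):
    """Look up a single column; a missing row or column yields `default`."""
    return column_family.get(row_key, {}).get(column_name, default)
```

Here `get_column(user, "small_ben", "is_jailed")` finds a value, while the same lookup on `"ma19"` falls back to the default because that row simply has no such column.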
Traditional RDBMS
#   name     phone      address
1   王小明   40666888   台北市
2   王中明   28825252   台中市
3   王大明   4129889    台南市
Flexible Schema
1   name: 王小明   phone: 40666888   address: 台北市
2   name: 王中明   phone: 28825252   address: 台中市   msn: [email_address]
3   name: 王大明   mail: [email_address]   address: 台南市
Super Column
Contact { // Super CF
    gasol: { // row key
        __all__: { // super column
            dad: "", // columns
            beer: "",
            ronny: ""
        },
        pixnet: { // super column
            beer: "",
            ronny: ""
        },
        family: { // super column
            dad: ""
        }
    }
}
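A super column family adds one more level of nesting: row key, then super column, then columns. The sketch below mirrors the Contact example as nested dicts; the helper `groups_of` is hypothetical and only shows how the extra level might be queried.

```python
# Row key -> {super column -> {column name -> value}}
contact = {
    "gasol": {
        "__all__": {"dad": "", "beer": "", "ronny": ""},
        "pixnet": {"beer": "", "ronny": ""},
        "family": {"dad": ""},
    },
}


def groups_of(super_cf, row_key, column_name):
    """Return the sorted names of the super columns in a row that contain `column_name`."""
    row = super_cf.get(row_key, {})
    return sorted(name for name, columns in row.items() if column_name in columns)
```

Here the column names carry the information (who is in which contact group) and the values are left empty, a common trick with this data model.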
Sorting - Comparator
  • BytesType - no validation
  • AsciiType - like BytesType, but validates as ASCII
  • LongType - 64-bit long
  • UTF8Type - a string encoded as UTF-8
  • LexicalUUIDType - a 128-bit UUID, usually version 4
  • TimeUUIDType - a 128-bit version 1 UUID, compared by timestamp
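The comparator decides the order in which columns are stored within a row. As an illustration, the sketch below sorts the same column names with three different sort keys; the function names are hypothetical stand-ins for the comparator types, and the big-endian encoding of longs is an assumption of this sketch.

```python
import struct
import uuid


def bytes_type(name: bytes) -> bytes:
    """BytesType idea: compare the raw bytes."""
    return name


def long_type(name: bytes) -> int:
    """LongType idea: decode 8 bytes as a signed 64-bit integer, then compare numbers."""
    return struct.unpack(">q", name)[0]


def time_uuid_type(name: bytes) -> int:
    """TimeUUIDType idea: compare version 1 UUIDs by their embedded timestamp."""
    return uuid.UUID(bytes=name).time


# The same three column names, sorted two ways.
names = [struct.pack(">q", n) for n in (2, -1, 10)]
as_longs = [long_type(n) for n in sorted(names, key=long_type)]
as_bytes = [long_type(n) for n in sorted(names, key=bytes_type)]
assert as_longs == [-1, 2, 10]
assert as_bytes == [2, 10, -1]  # -1 encodes as 0xff..ff, so it sorts last as raw bytes
```

The choice matters because slices return columns in comparator order, so a time series usually wants TimeUUIDType or LongType rather than raw bytes.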
Client API
  • THRIFT-601: sending random data crashed the Thrift service
  • THRIFT-347: PHP TSocket timeout issues
  • Thrift sucks and is ugly
  • Apache Avro in trunk

struct SliceRange {
    1: required binary start,
    2: required binary finish,
    3: required bool reversed=0,
    4: required i32 count=100,
}

struct SlicePredicate {
    1: optional list<binary> column_names,
    2: optional SliceRange   slice_range,
}
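To make the two structs concrete, here is a sketch that mirrors them as Python dataclasses and applies a predicate to one row held as a dict. The `select` function is hypothetical, and its semantics are assumptions of this sketch: bounds are inclusive, an empty `start` or `finish` means unbounded, and with `reversed` set the range runs from `start` down to `finish`.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class SliceRange:
    start: bytes = b""
    finish: bytes = b""
    reversed: bool = False
    count: int = 100


@dataclass
class SlicePredicate:
    column_names: Optional[List[bytes]] = None
    slice_range: Optional[SliceRange] = None


def select(row: Dict[bytes, object], predicate: SlicePredicate) -> List[Tuple[bytes, object]]:
    """Return the (name, value) pairs of `row` matched by `predicate`."""
    if predicate.column_names is not None:
        # Explicit list of column names: return the ones that exist.
        return [(n, row[n]) for n in predicate.column_names if n in row]
    if predicate.slice_range is None:
        raise ValueError("predicate needs column_names or slice_range")
    rng = predicate.slice_range
    # Columns are walked in sorted order (byte order here), optionally reversed.
    names = sorted(row, reverse=rng.reversed)
    low, high = (rng.finish, rng.start) if rng.reversed else (rng.start, rng.finish)
    picked = [n for n in names if (not low or n >= low) and (not high or n <= high)]
    return [(n, row[n]) for n in picked[:rng.count]]
```

For example, `select(row, SlicePredicate(slice_range=SliceRange(reversed=True, count=2)))` would return the last two columns of the row, which is the usual way to fetch "the latest N" entries.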
  • get(keyspace, key, ColumnPath)
  • get_slice(keyspace, key, ColumnParent, SlicePredicate)
  • multiget() *
  • multiget_slice(keyspace, keys, ColumnParent, SlicePredicate)
  • get_count() !
  • get_range_slice() *
  • get_range_slices(keyspace, ColumnParent, SlicePredicate, KeyRange)
  • insert(keyspace, key, ColumnPath, value, timestamp)
  • batch_insert() *
  • remove(keyspace, key, ColumnPath, timestamp)
  • batch_mutate(keyspace, map<CF, list<Mutation>>)
(consistency_level argument omitted)
* deprecated
! slow, deserializes all columns
Roadmap
  • SSTable compression
  • Dynamic column family changes
  • Vector clock support
  • Truncate support
  • Memory-efficient compactions
  • Avro 0.7
Thank you  
