NoSQL Cassandra
July 3, 2014
Prashanth M S
NoSQL
Why NoSQL?
Growing data volumes led to the use of clusters of small machines
(scale out), but RDBMSs were not designed to run on clusters
Bigtable from Google and Dynamo from Amazon emerged in the mid-2000s
as alternatives for data storage
Common characteristics of NoSQL DBs:
◦ Not using the relational model
◦ Running well on clusters
◦ Schemaless, open source, and built for 21st-century web estates
Types of NoSQL DBs
Aggregate Oriented DBs
◦ Key Value Data Model: Amazon DynamoDB
◦ Document Model: MongoDB, CouchDB
◦ Column Family Model: Cassandra, HBase
Graph DBs
◦ Neo4j, InfiniteGraph
Cassandra Data Model
The table below maps Cassandra terms to their relational model analogues
A Cassandra column family can be thought of as a map of maps
◦ Map<RowKey, SortedMap<ColumnKey, ColumnValue>>
Relational Model    Cassandra Model
Database            Keyspace
Table               Column Family
Primary Key         Row Key
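The map-of-maps view can be sketched in a few lines of Python. This is only a toy model of the idea, not how Cassandra stores data: the `ColumnFamily` class and its `put` and `get_row` methods are hypothetical names made up for this illustration, and the sample values are illustrative.

```python
from bisect import insort


class ColumnFamily:
    """Toy model of Map<RowKey, SortedMap<ColumnKey, ColumnValue>>."""

    def __init__(self):
        # row key -> (sorted list of column keys, dict of column values)
        self._rows = {}

    def put(self, row_key, column_key, value):
        keys, values = self._rows.setdefault(row_key, ([], {}))
        if column_key not in values:
            insort(keys, column_key)  # keep column keys in sorted order
        values[column_key] = value

    def get_row(self, row_key):
        """Return the row's (column key, value) pairs, sorted by column key."""
        keys, values = self._rows.get(row_key, ([], {}))
        return [(k, values[k]) for k in keys]


users = ColumnFamily()
users.put("jim", "gender", "M")
users.put("jim", "age", 36)
users.put("jim", "car", "camaro")
users.put("suzy", "gender", "F")
users.put("suzy", "age", 10)   # rows need not share the same columns

print(users.get_row("jim"))
print(users.get_row("suzy"))
```

Columns come back ordered by column key regardless of insertion order, and each row carries only the columns that were written to it.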
Cassandra Key Components
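Gossip
◦ Peer-to-peer communication protocol through which the nodes of a cluster exchange state information
Partitioner
◦ Determines how data is distributed across the nodes of the cluster
Replication Strategy
◦ Determines which nodes store the replicas of each row
Snitch
◦ Describes the network topology so that requests are routed efficiently
cassandra.yaml
◦ Main configuration file: timeout settings, tuning properties, etc.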
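To make the replication strategy component more concrete, here is a minimal Python sketch. The slides do not say which strategy is used, so this assumes the simplest placement rule, the one Cassandra's SimpleStrategy follows: the first replica goes to the node chosen by the partitioner and further replicas go to the next nodes clockwise on the ring, ignoring racks and data centers. The function name `replicas_for` and the four-node ring are hypothetical.

```python
def replicas_for(primary_index, nodes, replication_factor):
    """Return the primary node plus the next nodes clockwise on the ring.

    Assumption: `nodes` is listed in ring (token) order and topology is ignored.
    """
    count = min(replication_factor, len(nodes))  # cannot exceed cluster size
    return [nodes[(primary_index + i) % len(nodes)] for i in range(count)]


ring = ["A", "B", "C", "D"]  # hypothetical nodes in ring order

# A row whose partitioner-chosen node is C, with replication factor 3.
print(replicas_for(ring.index("C"), ring, 3))
```

The modulo wraps the walk around the end of the ring, so a row owned by C with a replication factor of 3 also lands on D and A.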
Cassandra Storage
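A write is first appended to the commit log on disk and then applied to
the in-memory memtable. When the memtable fills up, its data is flushed
to an SSTable on disk. Data in the commit log is purged after its
corresponding data in the memtable has been flushed to the SSTable.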
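The write path can be sketched as a small in-memory model. This is a simplified illustration under stated assumptions, not Cassandra's implementation: Python lists stand in for files on disk, the flush threshold is a row count rather than a size in bytes, and the whole commit log is cleared on flush. The `StorageEngine` class and its method names are hypothetical.

```python
class StorageEngine:
    """Toy write path: commit log -> memtable -> SSTable."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # stands in for the append-only log on disk
        self.memtable = {}     # in-memory writes that are not yet flushed
        self.sstables = []     # immutable sorted tables, oldest first
        self.memtable_limit = memtable_limit  # assumed threshold (row count)

    def write(self, key, value):
        self.commit_log.append((key, value))  # record the write durably first
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as a table sorted by key, then purge the
        # commit log entries whose data is now safely in the SSTable.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable.clear()
        self.commit_log.clear()

    def read(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest SSTable first
            for k, v in table:
                if k == key:
                    return v
        return None


engine = StorageEngine(memtable_limit=2)
engine.write("jim", 36)
engine.write("carol", 37)   # memtable is full, so it is flushed
engine.write("suzy", 10)    # stays in the memtable and the commit log

print(engine.sstables)
print(engine.memtable, engine.commit_log)
```

After the three writes, the first two rows live in one SSTable and only the unflushed third write remains in the commit log, which is what would be replayed after a crash.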
Cassandra Data Partitioning
Let's say we have the following data:

jim      age: 36   car: camaro   gender: M
carol    age: 37   car: bmw      gender: F
johnny   age: 12                 gender: M
suzy     age: 10                 gender: F

Data is placed on each node based on the hash of the Partition Key and
the token range the node is responsible for

Node   Start Range       End Range         Partition Key   Hash Value
A      -9223372036854    -4611686018427    johnny          -6723372854875
B      -4611686018427    -1                jim             -2245462676723
C      0                 4611686018427     suzy            1168604627387
D      4611686018427     9223372036854     carol           7723358927203
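The lookup in the table above can be sketched in Python. The range ends and hash values are copied from the slide as shown; a real cluster would compute the hash with the partitioner's hash function, which this sketch does not attempt. The names `RING`, `HASHES`, and `node_for` are hypothetical.

```python
from bisect import bisect_left

# (node, end of range) pairs copied from the table above, in ring order.
RING = [
    ("A", -4611686018427),
    ("B", -1),
    ("C", 4611686018427),
    ("D", 9223372036854),
]
END_TOKENS = [end for _, end in RING]

# Hash values copied from the table (not computed here).
HASHES = {
    "johnny": -6723372854875,
    "jim": -2245462676723,
    "suzy": 1168604627387,
    "carol": 7723358927203,
}


def node_for(token):
    """Return the node whose range contains the token.

    The first node whose end-of-range is >= token owns it; the modulo
    wraps tokens past the last range back to the first node.
    """
    index = bisect_left(END_TOKENS, token) % len(RING)
    return RING[index][0]


placement = {key: node_for(token) for key, token in HASHES.items()}
print(placement)
```

Because the range ends are kept sorted, finding the owning node is a binary search rather than a scan of every range.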
Cassandra Data Distribution using Vnodes
Vnodes (virtual nodes) allow each node to own a large number of small
partition ranges distributed throughout the cluster, rather than one
large contiguous range
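A rough Python sketch of the idea: instead of one token per node, each node is given several tokens scattered around the ring. The token bounds assume a signed 64-bit token space, the count of 8 tokens per node is chosen only for readability, and tokens are drawn at random with a fixed seed so the result is repeatable. The names `build_ring` and `owner` are hypothetical.

```python
import random
from bisect import bisect_left
from collections import Counter

# Assumption: tokens span the signed 64-bit range.
MIN_TOKEN, MAX_TOKEN = -2**63, 2**63 - 1


def build_ring(nodes, tokens_per_node, seed=0):
    """Give every node several random tokens; return (token, node) sorted by token."""
    rng = random.Random(seed)
    ring = [
        (rng.randint(MIN_TOKEN, MAX_TOKEN), node)
        for node in nodes
        for _ in range(tokens_per_node)
    ]
    return sorted(ring)


def owner(ring, token):
    """Return the node owning the first ring token >= the given token (wrapping)."""
    tokens = [t for t, _ in ring]
    return ring[bisect_left(tokens, token) % len(ring)][1]


ring = build_ring(["A", "B", "C", "D"], tokens_per_node=8)

# Each node owns 8 small ranges interleaved with the other nodes' ranges.
print(Counter(node for _, node in ring))
print([node for _, node in ring])
```

Printing the node sequence shows the ranges of the four nodes interleaved around the ring, so when a node joins or leaves, the affected ranges are spread across many peers instead of a single neighbor.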
Q & A

