Cassandra Data Access in Java
eBuddy use of Cassandra
XMS
Cassandra in eBuddy Messaging Platform
● User Data Service
● User Discovery Service
● Persistent Session Store
● Message History
● Location-based Discovery
Some Statistics
● Current size of data
  ● 1.4 TiB total (3x replication); 467 GiB actual data
  ● 12 million sessions (11 million users plus groups)
  ● Almost a billion rows in one column family (inverse social graph)
Data Access - Overview
Design Objectives
● Data Source Agnostic
● Testable
● Thread Safe
● Strong Typing
● Supports “transactions”, i.e. units of work in batch
● Efficient Mapping to Application Domain Model
● Follows Familiar Patterns (e.g. Spring JDBC Template)
Data Access in Layers
“Operations” Layer
Writing
● Use Generic Typing
● Has Interface (for testability, etc.)
● Handles Exceptions
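The three points above can be sketched as follows. This is a minimal illustration, not eBuddy's code: every type name here (ColumnFamilyOperations, InMemoryOperations, DataAccessException) is hypothetical, and an in-memory map stands in for a real Cassandra client so that the example runs without a cluster. The begin/commit pair is one assumed way to express the "units of work in batch" objective.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical unchecked exception that hides the data source's own exception types. */
class DataAccessException extends RuntimeException {
    DataAccessException(String message, Throwable cause) { super(message, cause); }
}

/** Hypothetical write interface, typed by row key (K), column name (N) and value (V). */
interface ColumnFamilyOperations<K, N, V> {
    void begin();                                 // start a unit of work on this thread
    void write(K rowKey, N columnName, V value);  // queued if a unit of work is open
    void commit();                                // apply all queued writes together
}

/** In-memory stand-in for a real client (an assumption made to keep the sketch runnable). */
class InMemoryOperations<K extends Comparable<K>, N extends Comparable<N>, V>
        implements ColumnFamilyOperations<K, N, V> {

    private final Map<K, Map<N, V>> rows = new TreeMap<>();
    // One pending batch per thread, so concurrent callers do not share a unit of work.
    private final ThreadLocal<List<Runnable>> batch = new ThreadLocal<>();

    @Override
    public void begin() {
        batch.set(new ArrayList<>());
    }

    @Override
    public void write(K rowKey, N columnName, V value) {
        Runnable mutation = () -> apply(rowKey, columnName, value);
        List<Runnable> pending = batch.get();
        if (pending == null) {
            execute(Collections.singletonList(mutation)); // no open batch: write through
        } else {
            pending.add(mutation);
        }
    }

    @Override
    public void commit() {
        List<Runnable> pending = batch.get();
        batch.remove();
        if (pending != null) {
            execute(pending);
        }
    }

    /** Runs mutations under one lock and translates failures into DataAccessException. */
    private synchronized void execute(List<Runnable> mutations) {
        try {
            mutations.forEach(Runnable::run);
        } catch (RuntimeException e) {
            throw new DataAccessException("write failed", e);
        }
    }

    private void apply(K rowKey, N columnName, V value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>()).put(columnName, value);
    }

    /** Read-back helper for tests; not part of the write interface. */
    synchronized V read(K rowKey, N columnName) {
        Map<N, V> row = rows.get(rowKey);
        return row == null ? null : row.get(columnName);
    }
}
```

Because callers depend only on the interface, a test can substitute the in-memory class for the Cassandra-backed one, which is the testability point made on the slide.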
Reading
● Use Mappers
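A mapper in this sense is a callback that turns stored columns into domain objects, much like a RowMapper in Spring JDBC. The sketch below assumes a row is available as a map of column names to values; ColumnMapper, RowReader and Contact are hypothetical names, not types from the slides or from Hector.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical callback: converts one stored column into a domain object. */
interface ColumnMapper<T, N, V> {
    T mapColumn(N columnName, V columnValue);
}

/** Hypothetical domain object built from one column of a "contacts" row. */
final class Contact {
    final String userId;
    final long addedAtMillis;

    Contact(String userId, long addedAtMillis) {
        this.userId = userId;
        this.addedAtMillis = addedAtMillis;
    }
}

/** Read helper that applies a mapper to every column of a row. */
final class RowReader {
    private RowReader() { }

    /** Returns one mapped object per column, in the iteration order of the row. */
    static <T, N, V> List<T> read(Map<N, V> row, ColumnMapper<T, N, V> mapper) {
        List<T> result = new ArrayList<>(row.size());
        for (Map.Entry<N, V> column : row.entrySet()) {
            result.add(mapper.mapColumn(column.getKey(), column.getValue()));
        }
        return result;
    }
}
```

With these definitions, a call such as RowReader.read(row, Contact::new) maps a row of (user id, timestamp) columns to a list of Contact objects without the caller touching the storage types.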
Serializers
● Constructed with serializers that convert to types needed by data access layer
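As a rough illustration of an operations object "constructed with serializers", the sketch below pairs a small serializer interface with an object that receives one serializer for column names and one for values. The interface and class names are hypothetical stand-ins written for this example; they are not the Hector serializer classes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical serializer contract: typed value to raw bytes and back. */
interface Serializer<T> {
    ByteBuffer toBytes(T value);
    T fromBytes(ByteBuffer bytes);
}

/** Encodes strings as UTF-8. */
final class Utf8Serializer implements Serializer<String> {
    @Override
    public ByteBuffer toBytes(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    @Override
    public String fromBytes(ByteBuffer bytes) {
        // duplicate() leaves the caller's buffer position untouched
        return StandardCharsets.UTF_8.decode(bytes.duplicate()).toString();
    }
}

/** Encodes longs as 8 big-endian bytes (ByteBuffer's default byte order). */
final class BigEndianLongSerializer implements Serializer<Long> {
    @Override
    public ByteBuffer toBytes(Long value) {
        ByteBuffer buffer = ByteBuffer.allocate(Long.BYTES);
        buffer.putLong(0, value); // absolute put: position stays at 0
        return buffer;
    }

    @Override
    public Long fromBytes(ByteBuffer bytes) {
        return bytes.getLong(bytes.position()); // absolute get: buffer is not consumed
    }
}

/** Operations-style object that is handed the serializers it needs at construction. */
final class TypedColumn<N, V> {
    private final Serializer<N> nameSerializer;
    private final Serializer<V> valueSerializer;

    TypedColumn(Serializer<N> nameSerializer, Serializer<V> valueSerializer) {
        this.nameSerializer = nameSerializer;
        this.valueSerializer = valueSerializer;
    }

    ByteBuffer encodeName(N name) { return nameSerializer.toBytes(name); }
    N decodeName(ByteBuffer raw) { return nameSerializer.fromBytes(raw); }
    ByteBuffer encodeValue(V value) { return valueSerializer.toBytes(value); }
    V decodeValue(ByteBuffer raw) { return valueSerializer.fromBytes(raw); }
}
```

Fixing the serializers at construction time is what gives the strong typing listed in the design objectives: a TypedColumn<String, Long> cannot be handed a value of the wrong type.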
Reading
Data Access Layer
Data Access Object
● Data Access Object (DAO) is singleton
● Transforms from data model to domain model
● Operations object configured with serializers to convert from data model to domain model
● Defines the mappers for read operations
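A DAO along these lines might look like the sketch below. It is an assumption-laden illustration: the slides do not say whether the singleton is hand-written or managed by a container such as Spring, the UserProfile type and the "display_name" column are invented for the example, and nested concurrent maps stand in for the configured operations object.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical domain model object. */
final class UserProfile {
    final String userId;
    final String displayName;

    UserProfile(String userId, String displayName) {
        this.userId = userId;
        this.displayName = displayName;
    }
}

/** Singleton DAO that translates between the column data model and the domain model. */
final class UserProfileDao {
    private static final String DISPLAY_NAME = "display_name"; // hypothetical column name
    private static final UserProfileDao INSTANCE = new UserProfileDao();

    // Data model: row key -> (column name -> column value).
    // Concurrent maps keep the shared instance safe to call from several threads.
    private final Map<String, Map<String, String>> rows = new ConcurrentHashMap<>();

    private UserProfileDao() { }

    static UserProfileDao getInstance() {
        return INSTANCE;
    }

    /** Domain model to data model. */
    void save(UserProfile profile) {
        rows.computeIfAbsent(profile.userId, k -> new ConcurrentHashMap<>())
            .put(DISPLAY_NAME, profile.displayName);
    }

    /** Data model to domain model; empty if the row does not exist. */
    Optional<UserProfile> findById(String userId) {
        Map<String, String> columns = rows.get(userId);
        return columns == null ? Optional.empty() : Optional.of(toDomain(userId, columns));
    }

    /** The mapper this DAO defines for read operations. */
    private static UserProfile toDomain(String rowKey, Map<String, String> columns) {
        return new UserProfile(rowKey, columns.getOrDefault(DISPLAY_NAME, ""));
    }
}
```

Callers see only UserProfile objects; column names and storage types stay inside the DAO.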
Next Steps
CQL3
DataStax: "We believe that CQL3 is a simpler and overall better API for Cassandra than the thrift API is. Therefore, new projects/applications are encouraged to use CQL3"
At eBuddy, we still use the Thrift API through the Java Hector library. We are currently evaluating CQL3: whether to adopt it going forward, and whether to "upgrade" existing code.
Structured Data
● Object Mapping Frameworks
● Mapped vs. Embedded Objects
● Nested Properties ("path" access)
Object Mapping Frameworks
● Simple mapper frameworks with (some) JPA support
● Hector Object Mapper
● Kundera
● Firebrand (not JPA)
● has most features, e.g. supports both embedded and mapped object graphs
https://github.com/impetus-opensource/Kundera
http://github.com/hector-client/hector
http://firebrandocm.org
Hierarchical Properties
● Use DynamicComposites to model keys that have a variable number of components
put("accounts|msn|x.y.z|sign_in", "0");
put("accounts|msn|x.y.z|key", "value");
get("accounts") --> retrieved as a map:
{"accounts":
  { "msn":
    { "x.y.z":
      { "sign_in": "0",
        "key": "value" } } } }
● Use a slice query to retrieve properties using a partial path:
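The idea can be sketched without Cassandra by keeping the paths in a sorted map, which mimics how columns are stored in comparator order. This is only an approximation: plain strings joined with "|" stand in for DynamicComposite column names, the HierarchicalProperties class is hypothetical, and the range lookup on the map plays the role of the slice query.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.NavigableMap;
import java.util.SortedMap;
import java.util.TreeMap;

/** Hypothetical property store: '|'-separated paths kept in sorted order. */
final class HierarchicalProperties {
    private static final char SEP = '|';

    // Sorted like columns in a row; each key is a full path, each value a leaf.
    private final NavigableMap<String, String> columns = new TreeMap<>();

    void put(String path, String value) {
        columns.put(path, value);
    }

    /**
     * Slice by partial path: every column whose path starts with prefix + '|'.
     * The upper bound swaps '|' for the next character ('}'), so the half-open
     * range covers exactly the keys below the prefix.
     */
    SortedMap<String, String> slice(String prefix) {
        String from = prefix + SEP;
        String to = prefix + (char) (SEP + 1);
        return columns.subMap(from, true, to, false);
    }

    /**
     * Rebuilds the sliced columns as nested maps, rooted at the first path
     * component. Assumes no path is both a leaf and a branch; such data would
     * fail with a ClassCastException.
     */
    @SuppressWarnings("unchecked")
    Map<String, Object> get(String prefix) {
        Map<String, Object> root = new LinkedHashMap<>();
        for (Map.Entry<String, String> column : slice(prefix).entrySet()) {
            String[] parts = column.getKey().split("\\|");
            Map<String, Object> node = root;
            for (int i = 0; i < parts.length - 1; i++) {
                node = (Map<String, Object>) node.computeIfAbsent(
                        parts[i], k -> new LinkedHashMap<String, Object>());
            }
            node.put(parts[parts.length - 1], column.getValue());
        }
        return root;
    }
}
```

After the two put calls shown above, get("accounts") yields the nested map from the slide, while slice("accounts|msn") returns just the two flat columns under that partial path.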
Questions?
XMS
Unlimited messaging.
Better. Free.
We're Hiring!
Download XMS now:
