CQRS and Event Sourcing Applications with Cassandra
Matthias Niehoff
#CassandraSummit 2015
1
•The Use Case
•Event Sourcing
•CQRS
•Cassandra for Storage
•Spark for Processing
•Benefits & Pitfalls
•Q&A
Agenda_
2
The Use Case
3
24x7 Proxy_
4
[Diagram: the 24x7 Proxy sits between the legacy systems (not 24x7) and the “Internet-ready” applications (24x7 available)]
24x7 Proxy
•Caches data
•Provides data
•Stores changes
•Provides changes
•No business logic/validation
•Solution needs to be highly scalable (up to 100,000 reads/s, 10,000 writes/s)
•Read and write access needs to be low latency
•Read/write ratio is 10:1 or higher
•Solution needs to deal with up to 500,000,000 customers
Assumptions_
5
Event Sourcing
6
Traditional Pattern: Saving Application State_
7
[Class diagram: Store (ID, Address) sells Article (Name, StockSize; updateInventory(), getInventory())]
A series of sales and replenishments for
• a tablet
• Starting with 60, sell 20, replenish 10
• a stove
• Starting with 25, sell 5, no replenishments
What is different with Event Sourcing?_
8
Saving only application state
What is the Difference?_
9
:ArticleInventory | Fancy Tablet | 50
:ArticleInventory | Gas Stove | 20
Saving events instead of state
What is the Difference?_
10
:ArticleInventory | Fancy Tablet | 39 | 15-08-14T19:..
:ArticleInventory | Gas Stove | 20 | 15-08-14T19:..
:ArticleInventory | Fancy Tablet | 45 | 15-08-14T19:..
:ArticleInventory | Gas Stove | 20 | 15-08-14T19:..
:ArticleInventory | Fancy Tablet | 50 | 15-08-14T19:..
:ArticleInventory | Gas Stove | 20 | 15-08-14T19:..
•Log of all stock changes
•Complete rebuild of the state
•Temporal query
•Event replay and rollback
Benefits of Storing Events_
11
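The benefits listed above can be sketched in a few lines of Python. This is only an illustration, not code from the talk: the `StockChanged` event, the `replay` function, the hours of the day and the idea of recording the starting stock as a first event are all assumptions. The quantities follow the tablet and stove example from the earlier slide.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class StockChanged:
    """Hypothetical event: one change of an article's stock."""
    article: str
    delta: int       # positive = replenishment, negative = sale
    at: datetime


def replay(events, until=None):
    """Rebuild the stock per article by folding over the event log.

    Passing `until` ignores later events, which gives a temporal query.
    """
    stock = {}
    for event in sorted(events, key=lambda e: e.at):
        if until is not None and event.at > until:
            break
        stock[event.article] = stock.get(event.article, 0) + event.delta
    return stock


def day_hour(hour):
    # Assumed timestamps on the day shown on the slides.
    return datetime(2015, 8, 14, hour)


log = [
    StockChanged("Fancy Tablet", +60, day_hour(9)),   # starting stock
    StockChanged("Gas Stove", +25, day_hour(9)),      # starting stock
    StockChanged("Fancy Tablet", -20, day_hour(12)),  # sell 20
    StockChanged("Gas Stove", -5, day_hour(13)),      # sell 5
    StockChanged("Fancy Tablet", +10, day_hour(19)),  # replenish 10
]

print(replay(log))                      # {'Fancy Tablet': 50, 'Gas Stove': 20}
print(replay(log, until=day_hour(12)))  # {'Fancy Tablet': 40, 'Gas Stove': 25}
```

The current state is never stored; it is derived on demand, so a rollback amounts to replaying up to an earlier point in time.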
CQRS
12
Default Application Architecture_
13
[Diagram: User Interface, Application Services, Domain Model, DB]
CQRS Application Architecture_
14
[Diagram: User Interface, Query Services, Command Services, Domain Model, DB]
•The pattern is simple
•Going further
• Split up the domain model
• Independent scaling of models
• Not using a query model at all
• Different databases for models
A Pattern Changing Your Mindset_
15
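A minimal sketch of the split, assuming the "going further" variant in which the read side keeps its own model. The class and method names (`CommandService`, `QueryService`, `change_stock`, `stock_of`) are hypothetical, and the view is updated synchronously and in-process here only to keep the example short.

```python
class CommandService:
    """Write side: accepts commands and appends events; it answers no queries."""

    def __init__(self, event_log, subscribers):
        self._log = event_log
        self._subscribers = subscribers

    def change_stock(self, article, delta):
        event = {"article": article, "delta": delta}
        self._log.append(event)
        for notify in self._subscribers:
            notify(event)


class QueryService:
    """Read side: maintains its own denormalised view and never writes events."""

    def __init__(self):
        self._stock = {}

    def apply(self, event):
        article = event["article"]
        self._stock[article] = self._stock.get(article, 0) + event["delta"]

    def stock_of(self, article):
        return self._stock.get(article, 0)


event_log = []
queries = QueryService()
commands = CommandService(event_log, [queries.apply])

commands.change_stock("Fancy Tablet", 60)
commands.change_stock("Fancy Tablet", -20)
print(queries.stock_of("Fancy Tablet"))  # 40
```

Because the two sides only share events, each can be scaled or given a different database independently.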
Event Sourcing & CQRS_
16
[Diagram: Command Services write through the Command Model to the Event Store (DB); Event Processors asynchronously update the Query Stores (DBs); the Query Services of the read layer read from the Query Stores]
Storage with Cassandra
17
•Not only an event sink
• Compaction
• Selective replay
•No single point of failure
•Horizontal scale & Geo Replication
•Write ahead of unmodified data
•Plays well with further processing
•Open source & a huge community
•Easy operations
Why Cassandra…
18
For accessing all entities of a given type
Event Store_
19
CREATE TABLE event_source_by_type (
  entity_type TEXT,
  bucket INT,               -- prevents huge partitions
  entity_key TEXT,
  insert_time TIMESTAMP,
  update_time TIMESTAMP,
  payload TEXT,             -- e.g. as JSON, XML, protobuf, Avro
  PRIMARY KEY ((entity_type, bucket), insert_time, entity_key)
) WITH CLUSTERING ORDER BY (insert_time DESC, entity_key ASC);
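The slide does not say how the bucket value is chosen. One possible scheme, shown here purely as an assumption, derives it from a stable hash of the entity key; a time-based bucket (for example one per day) would be another option. `NUM_BUCKETS`, `bucket_for` and `partitions_for_type` are hypothetical names.

```python
import zlib

NUM_BUCKETS = 16  # assumption: tune to the expected number of events per type


def bucket_for(entity_key: str, num_buckets: int = NUM_BUCKETS) -> int:
    """Return a stable bucket in [0, num_buckets) for an entity key.

    zlib.crc32 is used instead of hash() because it gives the same
    result in every process and on every run.
    """
    return zlib.crc32(entity_key.encode("utf-8")) % num_buckets


def partitions_for_type(entity_type: str, num_buckets: int = NUM_BUCKETS):
    """List every (entity_type, bucket) partition a full scan of one type reads."""
    return [(entity_type, bucket) for bucket in range(num_buckets)]


print(bucket_for("article-42") == bucket_for("article-42"))  # True
print(len(partitions_for_type("article")))                   # 16
```

Writers compute the bucket when inserting an event; a reader that wants all entities of a type queries each of the partitions in turn.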
CREATE TABLE event_source_by_key (
  entity_type TEXT,
  entity_key TEXT,
  insert_time TIMESTAMP,
  update_time TIMESTAMP,
  payload TEXT,             -- e.g. as JSON, XML or protobuf
  PRIMARY KEY ((entity_type, entity_key), insert_time)
) WITH CLUSTERING ORDER BY (insert_time DESC);
For accessing an entity directly
Optional: Second Table_
20
•Create tables that fit your queries!
•E.g. “Get articles in category ‘computer’”
Query Stores_
21
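CREATE TABLE articles_by_category (
  category TEXT,            -- partition key; may need bucketing
  article_id UUID,          -- clustering column, so one category can hold many articles
  article_info TEXT,        -- could also be a JSON document
  PRIMARY KEY (category, article_id)
);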
Query Stores_
22
“I need ad-hoc queries”
“I need specific queries with a lot of different filters”
Query Stores_
23
Processing with Spark
24
•Command model triggers event processor
•Event processor updates query views
From Event Store to Query Store_
25
[Diagram: the Command Model triggers several Event Processors, each of which updates a query DB]
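As a rough sketch of what "event processor updates query views" can mean, the following plain-Python example projects stored events into an in-memory stand-in for the articles_by_category table. The payload fields (`category`, `name`), the article keys and the "kitchen" category are made up for illustration.

```python
from collections import defaultdict


class ArticlesByCategoryView:
    """In-memory stand-in for the articles_by_category query table."""

    def __init__(self):
        self.rows = defaultdict(dict)  # category -> {article_id: article_info}

    def upsert(self, category, article_id, info):
        self.rows[category][article_id] = info


def process_event(event, view):
    """Project one stored event into the query view.

    The write is an upsert, so processing the same event twice
    leaves the view unchanged.
    """
    payload = event["payload"]
    view.upsert(payload["category"], event["entity_key"], payload["name"])


view = ArticlesByCategoryView()
events = [
    {"entity_type": "article", "entity_key": "a1",
     "payload": {"category": "computer", "name": "Fancy Tablet"}},
    {"entity_type": "article", "entity_key": "a2",
     "payload": {"category": "kitchen", "name": "Gas Stove"}},
]
for event in events:
    process_event(event, view)

print(view.rows["computer"])  # {'a1': 'Fancy Tablet'}
```

Keeping the projection idempotent is what makes it safe to replay events into the same view later.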
Event Processing in Detail_
26
[Diagram: Command Model and three DBs]
•Easy scale out
•Easy deployment
•Intuitive Scala & Java API
•Fault tolerant
•Out-of-the-box Kafka adapter
•Integrates well with Cassandra
Why Spark?
27
•Spark Streaming application
•Consumes only topics of interest
•Joins the stream of events with the current view
• Use primary key of entity for correlation
• Use joinWithCassandraTable
Spark Job in Detail_
28
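The join step can be illustrated without Spark. In this sketch a dictionary stands in for the query table, and `join_with_view` plays the role that the connector's joinWithCassandraTable has on the slide: it fetches the current row by the entity's primary key. The "newer insert_time wins" rule in `merge` is an assumption, not something the talk specifies.

```python
def join_with_view(batch, lookup):
    """Lazily pair each event with the row currently stored under its key."""
    for event in batch:
        yield event, lookup(event["entity_key"])


def merge(event, row):
    """Assumed merge rule: the newer entry wins, otherwise keep the stored row."""
    if row is None or event["insert_time"] > row["insert_time"]:
        return {
            "entity_key": event["entity_key"],
            "insert_time": event["insert_time"],
            "payload": event["payload"],
        }
    return row


# Current query view, keyed by the entity's primary key.
view = {"a1": {"entity_key": "a1", "insert_time": 5, "payload": "stock=40"}}

# One micro-batch of events from the stream.
batch = [
    {"entity_key": "a1", "insert_time": 7, "payload": "stock=50"},
    {"entity_key": "a2", "insert_time": 6, "payload": "stock=20"},
]

for event, row in join_with_view(batch, view.get):
    view[event["entity_key"]] = merge(event, row)

print(view["a1"]["payload"])  # stock=50
print(view["a2"]["payload"])  # stock=20
```

Because the lookup happens per event, a later event for the same key sees the row written by an earlier one in the same batch.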
1. Create a table for the query view
2. Create a Spark job filling your table
3. Deploy the Spark job
4. Init reprocess of the event DB
• same transformation logic as in normal processing
• source can be different
5. Mark view as initialized
If you need a new query view_
29
[Diagram: Event DB, Query DB]
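Steps 4 and 5 above can be sketched as follows. The point of the example is that initialization and normal processing share one transformation and differ only in their source. `transform`, `QueryView` and `initialize` are hypothetical names, and the payload shape is assumed.

```python
def transform(event):
    """Shared transformation: turn an event into (key, row) for the new view."""
    return event["entity_key"], {"stock": event["payload"]["stock"]}


class QueryView:
    """In-memory stand-in for the table created in step 1."""

    def __init__(self):
        self.rows = {}
        self.initialized = False  # flipped in step 5

    def apply(self, event):
        key, row = transform(event)
        self.rows[key] = row


def initialize(view, event_db):
    """Step 4: reprocess the stored events in insert order, then mark the view."""
    for event in sorted(event_db, key=lambda e: e["insert_time"]):
        view.apply(event)
    view.initialized = True  # step 5


event_db = [
    {"entity_key": "a1", "insert_time": 2, "payload": {"stock": 40}},
    {"entity_key": "a1", "insert_time": 1, "payload": {"stock": 60}},
]

view = QueryView()
initialize(view, event_db)

# A live event afterwards goes through exactly the same transformation.
view.apply({"entity_key": "a1", "insert_time": 3, "payload": {"stock": 50}})

print(view.initialized, view.rows)  # True {'a1': {'stock': 50}}
```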
Benefits & Pitfalls
30
•Scalability
• On storage & processing: just add nodes
• Efficient queries due to separation
•Collaboration
• Every client gets its own data access
• Easy to support new queries
Benefits_
31
•More complexity than simple CRUD
•Side effects on event replay
•Eventual consistency in query views
•Concurrent writes
•Performance of replay
Pitfalls_
32
Lost Updates
•Due to parallel processing
• Two events A and B as sequential input
• A is processed after B
•Solution
• Partition Spark RDD by entity key
• Use a lambda architecture
Pitfalls_
33
[Diagram: lambda architecture: the data stream feeds a speed layer and a batch layer, and both feed the serving layer]
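The first solution, partitioning by entity key, can be sketched in plain Python (this is not Spark code). All events of one entity land in the same partition and are applied there in input order, so B cannot be overtaken by A. The hash-based routing and the names `partition_by_key` and `apply_partition` are assumptions.

```python
import zlib
from collections import defaultdict


def partition_by_key(events, num_partitions):
    """Route every event of one entity to the same partition, keeping input order."""
    partitions = defaultdict(list)
    for event in events:
        index = zlib.crc32(event["entity_key"].encode("utf-8")) % num_partitions
        partitions[index].append(event)
    return partitions


def apply_partition(events, view):
    """Apply one partition sequentially; later events overwrite earlier ones."""
    for event in events:
        view[event["entity_key"]] = event["value"]


events = [
    {"entity_key": "a1", "value": "A"},
    {"entity_key": "a2", "value": "X"},
    {"entity_key": "a1", "value": "B"},  # must end up as the final value of a1
]

view = {}
for partition in partition_by_key(events, num_partitions=4).values():
    apply_partition(partition, view)  # partitions could run in parallel

print(view["a1"], view["a2"])  # B X
```

Ordering is only guaranteed within a key, which is all the lost-update problem requires.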
•Event Store Compaction
• Compact store to improve processing time
• Only store latest entry of an entity key
• e.g. a Spark batch job / Cassandra TTL
•Snapshot / Master State
• Constantly build a complete state of all data
• Can be used
• To speed up initialization
• As a store for a search engine
Pitfalls_
34
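A compaction pass of the kind described above might look like this in plain Python. It assumes each event carries the full state of its entity rather than a delta; otherwise dropping older entries would lose information. The function name `compact` and the sample data are hypothetical.

```python
def compact(events):
    """Keep only the most recent event per (entity_type, entity_key)."""
    latest = {}
    for event in events:
        key = (event["entity_type"], event["entity_key"])
        if key not in latest or event["insert_time"] > latest[key]["insert_time"]:
            latest[key] = event
    return sorted(latest.values(), key=lambda e: e["insert_time"])


store = [
    {"entity_type": "article", "entity_key": "a1", "insert_time": 1, "payload": "stock=60"},
    {"entity_type": "article", "entity_key": "a2", "insert_time": 2, "payload": "stock=25"},
    {"entity_type": "article", "entity_key": "a1", "insert_time": 3, "payload": "stock=40"},
    {"entity_type": "article", "entity_key": "a1", "insert_time": 4, "payload": "stock=50"},
]

compacted = compact(store)
print([(e["entity_key"], e["payload"]) for e in compacted])
# [('a2', 'stock=25'), ('a1', 'stock=50')]
```

The trade-off is that a compacted store replays faster but can no longer answer temporal queries for the removed entries.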
The Use Case
Solved with ES & CQRS
35
24x7 Proxy
24x7 Proxy_
36
[Diagram: the 24x7 Proxy between the legacy core systems (not 24x7) and the “Internet-ready” applications (24x7 available)]
37
Questions?
Thank You!
Matthias Niehoff,
IT-Consultant
codecentric AG
Zeppelinstraße 2
76185 Karlsruhe, Germany
www.codecentric.de
blog.codecentric.de
matthiasniehoff
38
