Anton Kropp
CassieQ: The Distributed Queue Built On Cassandra
© DataStax, All Rights Reserved.
Why use queues?
• Distribution of work
• Decoupling producers/consumers
• Reliability
Existing Queues
• ActiveMQ
• RabbitMQ
• MSMQ
• Kafka
• SQS
• Azure Queue
• others
Advantages of a queue on C*
• Highly available
• Highly distributed
• Massive intake
• Masterless
• Re-use existing data store/operational knowledge
But aren’t queues antipatterns?
Issues with queues in C*
• Modeling off deletes
• Tombstones
• Evenly distributing messages?
• What is the partition key?
• How to synchronize consumers?
Existing C* queues
• Netflix Astyanax recipe
  • Cycled time-based partitioning
  • Row-based reader lock
  • Messages put into time shard ordered by insert time
  • Relies on deletes
  • Requires low gc_grace_seconds for fast compaction
Existing C* queues
• Comcast CMB
  • Uses Redis as the actual queue (cheating)
  • Queues are hashed for affinity to the same Redis server
  • Cassandra is the cold-storage backing store
  • Random partitioning between 0 and 100
Missing features
• Authentication
• Authorization
• Statistics
• Simple deployment
• Requirement on external infrastructure
CassieQ
• HTTP(S)-based API
• No locking
• Fixed-size bucket partitioning
• Leverages pointers (Kafka-esque)
• Message invisibility
• Azure Queue/SQS inspired
• Docker deployment
• Authentication/authorization
• Ideally once delivery
• Best attempt at FIFO (not guaranteed)
docker run -it -p 8080:8080 -p 8081:8081 paradoxical/cassieq dev
CassieQ Queue API
CassieQ Admin API
CassieQ workflow
• Client is authorized on an account
  • Granular client authorization up to queue level
• Client consumes message from queue with message lease (invisibility)
  • Gets pop receipt
• Client acks message with pop receipt
  • If pop receipt not valid, lease expired
• Client can update messages
  • Update message contents
  • Renew lease
CassieQ: The Distributed Message Queue Built on Cassandra (Anton Kropp, Curalate) | C* Summit 2016
Let's dig inside
CassieQ internals
TLDR
• Messages partitioned into fixed-size buckets
• Pointers to buckets/messages used to track current state
• Use of lightweight transactions for atomic actions to avoid locking
• Bucketing + pointers eliminate modeling off deletes
CassieQ Buckets
• Messages stored in fixed-size buckets
  • Deterministic when full
  • Easy to reason about
• Why not time buckets?
  • Time bugs suck
  • Non-deterministic
  • Can miss data due to time overlaps
• Messages given monotonic ID
  • CAS “id” table
  • Bucket # = monotonicId / bucketSize
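The bucket formula above can be sketched in a few lines. This assumes the division is integer (floor) division, which the slide does not state explicitly; `bucket_for` and `is_last_slot` are hypothetical helper names, not part of CassieQ.

```python
def bucket_for(monotonic_id: int, bucket_size: int) -> int:
    """Bucket # = monotonicId / bucketSize, using floor division (assumption)."""
    if bucket_size <= 0:
        raise ValueError("bucket_size must be positive")
    return monotonic_id // bucket_size


def is_last_slot(monotonic_id: int, bucket_size: int) -> bool:
    """True when this ID occupies the final slot of its bucket."""
    return monotonic_id % bucket_size == bucket_size - 1
```

Because the mapping depends only on the ID and the bucket size, any node can tell which bucket a message belongs to, and when that bucket is full, without consulting a clock.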
Pointers to Buckets/Messages
• Reader pointer
  • Tracks which bucket a consumer is on
• Repair pointer
  • Tracks first non-finalized bucket
• Invisibility pointer
  • Tracks first unacked message
Pointers to Buckets
All 3 pointers point to a monotonic ID value, potentially in different buckets.
[Diagram: messages 1 2 3 4 5 with InvisPointer, ReaderPointer, and RepairPointer marked]
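One way to picture the three pointers is as plain monotonic ID values from which a bucket can be derived. The `QueuePointers` class below is a hypothetical illustration, assuming floor division for the bucket number; it is not CassieQ's actual data model.

```python
from dataclasses import dataclass


@dataclass
class QueuePointers:
    """Three monotonic ID values, one per pointer (illustrative only)."""
    reader: int
    repair: int
    invisibility: int

    def buckets(self, bucket_size: int) -> dict:
        # Each pointer may resolve to a different bucket.
        return {name: value // bucket_size for name, value in vars(self).items()}
```

For example, `QueuePointers(reader=45, repair=21, invisibility=7).buckets(20)` places the three pointers in buckets 2, 1, and 0.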
Schema
CREATE TABLE queue (
    account_name text,
    queuename text,
    bucket_size int,
    version int,
    ...
    PRIMARY KEY (account_name, queuename)
);

CREATE TABLE message (
    queueid text,
    bucket_num bigint,
    monoton bigint,
    message text,
    version int,
    acked boolean,
    next_visible_on timestamp,
    delivery_count int,
    tag text,
    created_date timestamp,
    updated_date timestamp,
    PRIMARY KEY ((queueid, bucket_num), monoton)
);

* queueid = accountName:queueName:version
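To make the primary key concrete, here is a small sketch of how a message's key could be derived from the schema above. `MessageKey`, `make_queue_id`, and `message_key` are hypothetical names, and the floor division is an assumption carried over from the bucket formula.

```python
from typing import NamedTuple


class MessageKey(NamedTuple):
    queueid: str      # partition key, part 1
    bucket_num: int   # partition key, part 2
    monoton: int      # clustering column


def make_queue_id(account_name: str, queue_name: str, version: int) -> str:
    """queueid = accountName:queueName:version, as noted under the schema."""
    return f"{account_name}:{queue_name}:{version}"


def message_key(account_name: str, queue_name: str, version: int,
                monoton: int, bucket_size: int) -> MessageKey:
    """Derive the full primary key for one message."""
    queueid = make_queue_id(account_name, queue_name, version)
    return MessageKey(queueid, monoton // bucket_size, monoton)
```

With this layout every bucket of a queue is its own partition, and messages inside it are ordered by `monoton`.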
Reading messages
Reading from a bucket
• Read any unacked message in bucket (either FIFO or random)
• Consume message (update its internal version + set its invisibility timeout)
• Return to consumer
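The read steps above can be sketched against an in-memory bucket. This is only an illustration: the field names mirror the message table, FIFO order is chosen over random, and the real system would presumably perform the version bump as a lightweight transaction rather than a local mutation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Message:
    monoton: int
    body: str
    version: int = 0
    acked: bool = False
    next_visible_on: float = 0.0
    delivery_count: int = 0


def read_from_bucket(bucket: List[Message], now: float,
                     invisibility_seconds: float) -> Optional[Message]:
    """Return the first visible, unacked message and lease it to the caller."""
    for msg in sorted(bucket, key=lambda m: m.monoton):  # FIFO variant
        if not msg.acked and msg.next_visible_on <= now:
            msg.version += 1                                   # consume: bump version
            msg.next_visible_on = now + invisibility_seconds   # hide until lease ends
            msg.delivery_count += 1
            return msg
    return None
```

A message that is never acked becomes visible again once `next_visible_on` passes, which is what later drives redelivery.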
[Diagram: Bucket 1 with undelivered messages 1 2 3 4; the reader pointer start is marked]
Buckets… complications
[Diagram: messages 1 2 ? 4 5]
• Once a monoton is generated, it is taken
• Even if a message fails to insert, the monoton is taken
• Buckets are now partially filled!
• How to resolve?
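A toy model shows how the gap arises. The `IdTable` class below is an in-memory stand-in for the CAS "id" table (a lock simulates the atomic compare-and-set), and `publish` with its `insert_ok` flag is a hypothetical helper used only to simulate a failed insert.

```python
import threading


class IdTable:
    """In-memory stand-in for a compare-and-set counter (illustrative only)."""

    def __init__(self) -> None:
        self._value = 0
        self._lock = threading.Lock()

    def read(self) -> int:
        return self._value

    def compare_and_set(self, expected: int, new: int) -> bool:
        with self._lock:
            if self._value != expected:
                return False
            self._value = new
            return True


def next_monoton(table: IdTable) -> int:
    """Retry the CAS until this caller wins the next ID."""
    while True:
        current = table.read()
        if table.compare_and_set(current, current + 1):
            return current + 1


def publish(table: IdTable, store: dict, body: str, insert_ok: bool = True) -> int:
    monoton = next_monoton(table)   # the ID is claimed first...
    if insert_ok:
        store[monoton] = body       # ...and only then is the message written
    return monoton                  # a failed insert leaves a permanent gap
```

Publishing five messages where the third insert fails leaves IDs 1, 2, 4, 5 in the store while the counter sits at 5, matching the `1 2 ? 4 5` picture.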
When to move off a bucket?
[Diagram: messages 1 2 ? 4 5 6 across Bucket 1 and Bucket 2; message 3 missing; reader marked]
1. All known messages in the bucket have been delivered at least once
2. All new messages are being written in future buckets
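The two conditions can be expressed as a single predicate. This is a sketch under assumptions: `can_leave_bucket` is a hypothetical name, the caller is assumed to know which IDs are present and which have been delivered, and "new messages in future buckets" is approximated by checking the bucket of the latest allocated ID.

```python
def can_leave_bucket(bucket_num: int, bucket_size: int,
                     known_ids: set, delivered_ids: set,
                     latest_monoton: int) -> bool:
    """True when the reader may advance past bucket_num."""
    # Condition 1: every message known to exist here was delivered at least once.
    all_delivered = known_ids <= delivered_ids
    # Condition 2: the ID counter has moved on, so new writes land in later buckets.
    writes_moved_on = latest_monoton // bucket_size > bucket_num
    return all_delivered and writes_moved_on
```

Note that a missing ID (a claimed monoton that never arrived) does not block the reader, since only known messages count toward condition 1.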
When to move off a bucket?
[Diagram: messages 1 2 ? 4 5 6 7 … across Bucket 1 and Bucket 2; message 3 missing; tombstone marked; reader @ bucket 2]
• Tombstoning (not Cassandra tombstoning, naming is hard!)
  • Bucket is sealed, no more writes
  • Reader tombstones bucket after it's reached
Tombstoning enables us to detect delayed writes
[Diagram: messages 1 2 ? 4 5 6 7 … across Bucket 1 and Bucket 2; message 3 missing; tombstone marked; reader @ bucket 2]
Repairing delayed messages
Repairing delayed writes
• Scenarios:
  • Message taking its time writing (still alive, but slow)
  • Message claimed monoton but is dead
• Resolution:
  • Watch for tombstone in bucket
  • Wait for repair timeout (30 seconds)
  • If message shows up, republish
  • If not, finalize bucket and move to next bucket (message is dead)
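The resolution steps can be sketched as one repair pass that a worker would call repeatedly. `BucketState` and `repair_step` are hypothetical names, and finalizing early once every expected ID has arrived is an assumption; the slides only describe waiting out the 30-second timeout.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set

REPAIR_TIMEOUT_SECONDS = 30.0  # value quoted on the slide


@dataclass
class BucketState:
    expected_ids: Set[int]                 # every monoton that maps to this bucket
    present_ids: Set[int]                  # IDs seen when the bucket was tombstoned
    tombstoned_at: Optional[float] = None
    finalized: bool = False


def repair_step(bucket: BucketState, visible_ids: Set[int], now: float,
                republish: Callable[[int], None]) -> str:
    """Run one repair pass over a bucket and report what happened."""
    if bucket.tombstoned_at is None:
        return "waiting-for-tombstone"
    # Anything that appeared after the tombstone is a delayed write.
    late = (visible_ids & bucket.expected_ids) - bucket.present_ids
    for monoton in sorted(late):
        republish(monoton)
        bucket.present_ids.add(monoton)
    timed_out = now - bucket.tombstoned_at >= REPAIR_TIMEOUT_SECONDS
    if timed_out or bucket.present_ids >= bucket.expected_ids:
        bucket.finalized = True            # missing IDs are treated as dead
        return "finalized"
    return "waiting-for-late-writes"
```

Once a bucket is finalized, the repair pointer can move to the next bucket.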
Repairing delayed writes
[Diagram: messages 1 2 ? 4 5 6 7 … across Bucket 1 and Bucket 2; message 3 missing; tombstone marked; reader @ bucket 2; repair pointer @ bucket 1; wait 30 seconds…]
[Diagram: messages 1 2 3 4 5 6 7 …; message 3 showed up and is republished to the end; tombstone marked; reader @ bucket 2+; repair pointer @ bucket 1]
[Diagram: messages 1 2 3 4 5 6 7 … 3; tombstone marked; reader @ bucket 2+; repair pointer @ bucket 2]
Invisibility and the unhappy path ☹
What is invisibility?
A mechanism for message re-delivery (in a stateless system)
The happy path
• Client consumes message
• Message is marked as “invisible” with a “re-visibility” timestamp
• Client gets pop receipt encapsulating metadata (including version)
• Client acks within timeframe
• Message marked as consumed if version is the same
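The version check at ack time can be sketched as follows. `StoredMessage`, `PopReceipt`, and `ack` are hypothetical names; in Cassandra this kind of check would typically map to a conditional update, but that mapping is an assumption rather than something the slides show.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class StoredMessage:
    monoton: int
    version: int
    acked: bool = False


@dataclass(frozen=True)
class PopReceipt:
    """Metadata handed to the client on consume (fields are illustrative)."""
    monoton: int
    version: int


def ack(store: Dict[int, StoredMessage], receipt: PopReceipt) -> bool:
    """Mark the message consumed only if the receipt's version is still current."""
    msg = store.get(receipt.monoton)
    if msg is None or msg.version != receipt.version:
        return False   # lease expired and the message was redelivered, or unknown ID
    msg.acked = True
    return True
```

A receipt goes stale as soon as another read bumps the message's version, so a slow client cannot ack a message that has since been handed to someone else.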
The unhappy path :(
• Client doesn’t ack within timeframe
• Message needs to be redelivered
• Subsequent reads check the invis pointer for visibility
• If max deliveries exceeded, push to optional DLQ
• Else redeliver!
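The redeliver-or-DLQ decision can be sketched as a small function. `handle_expired_lease` is a hypothetical name; treating "exceeded" as `delivery_count >= max_deliveries` and marking a dead-lettered message as acked are both assumptions made for this illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Message:
    monoton: int
    delivery_count: int
    next_visible_on: float
    acked: bool = False


def handle_expired_lease(msg: Message, now: float, max_deliveries: int,
                         dlq: List[Message]) -> str:
    """Decide what to do with a message found at the invisibility pointer."""
    if msg.acked or msg.next_visible_on > now:
        return "skip"                      # acked, or its lease is still active
    if msg.delivery_count >= max_deliveries:
        dlq.append(msg)                    # give up: move to the dead-letter queue
        msg.acked = True                   # assumption: frees the pointer to move on
        return "dead-lettered"
    return "redeliver"
```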
The unhappy path :(
[Diagram: Bucket 1 with messages 1 2 3 4 5 6 7 …; the invisibility pointer and the reader bucket pointer are marked]
[Diagram: Bucket 1 with messages 1 2 3 4 5* 6 7 …; labels read ack, ack, ack, out, expired; the invisibility pointer and the reader bucket pointer are marked]
Long term invisibility is bad
• InvisPointer WILL NOT move past an unacked message
• Invisible messages can block other invisible messages
• Possible to starve future messages
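The blocking behaviour is easy to see in a sketch of how the pointer advances. `advance_invis_pointer` is a hypothetical helper; it treats an ID with no ack record as unacked, and leaves the handling of never-written IDs to the repair process.

```python
from typing import Dict


def advance_invis_pointer(pointer: int, acked_by_id: Dict[int, bool],
                          highest_delivered: int) -> int:
    """Move the pointer forward over acked messages; stop at the first unacked one."""
    while pointer <= highest_delivered and acked_by_id.get(pointer, False):
        pointer += 1
    return pointer
```

With messages 1 to 3 acked and message 4 still out, the pointer stops at 4. An expired message 5 behind it is not re-examined until 4 is acked or dead-lettered, which is how one long-invisible message can hold up the ones after it.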
The unhappy path :(
[Diagram: Bucket 1 with messages 1 2 3 4 5 6 7 …; labels read ack, ack, ack, DLQ, out; the invisibility pointer and the reader bucket pointer are marked]
Conclusion
• Building a queue on C* is hard
• Limited by the performance of lightweight transactions and underlying C* choices
  • Compaction strategies, cluster usage, etc.
• Need to make design trade-offs
• CassieQ is used in production but is not stressed under highly contentious scenarios
Questions?
or feedback/thoughts/visceral reactions
Contribute to the antipattern @ paradoxical.io
https://github.com/paradoxical-io/cassieq
