ScyllaDB 5.2 and
Beyond
Fresh from the ScyllaDB Oven
Avi Kivity, CTO and Co-Founder
Agenda
■ Increasing Streaming Robustness
■ Autoparallel Queries
■ WebAssembly User Defined Functions
■ Per-partition Throttling
■ Alternator Updates
■ Consistent Schema and Topology
■ New SSD Disk Modeling
■ Taming Corner Cases
■ What’s Cooking Now
Repair-Based Node Operations
■ Resumable bootstrap/decommission
■ Stream from primary replica
■ Or a quorum if primary is missing
■ Increases resilience and improves correctness
Autoparallel Queries
■ Aggregations previously done via Spark or custom code
■ Instead, recognize certain CQL patterns
■ Dispatch to all nodes, all vcpus within nodes
[Diagram: Node 1 through Node 5]
SELECT COUNT(*)
FROM t
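The dispatch idea above can be illustrated with a small scatter-gather simulation. This is only a sketch under stated assumptions, not ScyllaDB's implementation: `split_token_range`, `count_range`, and `parallel_count` are hypothetical names, the "table" is a plain list of partition tokens, and threads stand in for nodes and vCPUs.

```python
from concurrent.futures import ThreadPoolExecutor

# Token bounds of the Murmur3 partitioner (signed 64-bit range).
MIN_TOKEN = -(2**63)
MAX_TOKEN = 2**63 - 1


def split_token_range(parts):
    """Split the token ring into `parts` contiguous (start, end] sub-ranges."""
    span = MAX_TOKEN - MIN_TOKEN
    bounds = [MIN_TOKEN + span * i // parts for i in range(parts + 1)]
    return list(zip(bounds[:-1], bounds[1:]))


def count_range(tokens, start, end):
    """Stand-in for one worker counting the rows whose token is in (start, end]."""
    return sum(1 for token in tokens if start < token <= end)


def parallel_count(tokens, parts=8):
    """Scatter the count over sub-ranges, then gather and sum the partial counts."""
    ranges = split_token_range(parts)
    with ThreadPoolExecutor(max_workers=parts) as pool:
        partials = pool.map(lambda r: count_range(tokens, *r), ranges)
        return sum(partials)
```

Each sub-range is counted independently, so the final sum is the only step that needs coordination; that is what makes a COUNT-style aggregation easy to parallelize.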
WebAssembly UDF/UDA
■ Push compute into database
■ Use any language*
■ Computations run in a WASM sandbox
■ Use case: analytics
*as long as it’s Rust
Per Partition Rate Limit
■ New CQL table attribute to limit access rate to partition
■ Works for reads and writes
■ Prevent bot accounts from spamming database
■ “Hot partition”
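To make the behavior concrete, here is a minimal sketch of a per-partition limiter. It assumes a fixed one-second window per partition key; the class name, the method, and the windowing scheme are illustrative choices and are not taken from ScyllaDB, which may track rates differently.

```python
from collections import defaultdict


class PerPartitionRateLimiter:
    """Fixed one-second window counter per partition key (illustrative only)."""

    def __init__(self, max_ops_per_second):
        self.max_ops = max_ops_per_second
        # partition key -> (window start in whole seconds, ops admitted in that window)
        self.windows = defaultdict(lambda: (0, 0))

    def allow(self, partition_key, now):
        """Return True if the operation is admitted, False if it is rejected."""
        window = int(now)
        start, count = self.windows[partition_key]
        if start != window:
            start, count = window, 0  # a new second begins, so reset the counter
        if count >= self.max_ops:
            self.windows[partition_key] = (start, count)
            return False
        self.windows[partition_key] = (start, count + 1)
        return True
```

Because the counters are keyed by partition, a single hot partition is throttled without affecting requests to any other partition.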
Alternator Updates
■ Time-to-live Expiration
■ Improved performance
Consistent Schema and Topology
■ Eliminate classes of operator errors
  ■ Concurrent schema changes
  ■ Concurrent topology operations
■ Lay groundwork for more advanced features
  ■ Concurrent node bootstrap/decommission
  ■ Tablets
  ■ Strong consistency
New SSD Disk Modeling
ScyllaDB knows more about the disk operating envelope
Taming Corner Cases
Reverse Queries
■ 4.5 and older slow for large partitions
■ 4.6 fast, but skipped cache
■ 5.0+ fast, supports cache
■ Works well with paging
SELECT *
FROM tab
WHERE …
ORDER BY clustering_key DESC
Better Handling of Tombstones
■ Queries that encounter large consecutive tombstone runs are now well supported
■ Partitions with many range tombstones work well
Improved Out-of-Memory Handling
■ Escalating countermeasures as memory usage increases
  ■ Prevent new queries from starting
  ■ Allow only one query to make progress
  ■ Kill all but one query
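The escalation on this slide can be pictured as a threshold ladder. The sketch below is an assumption-laden illustration: the threshold fractions (0.70, 0.85, 0.95), the `Action` enum, and the `countermeasure` function are all hypothetical and do not reflect the values or names ScyllaDB actually uses.

```python
from enum import Enum


class Action(Enum):
    NORMAL = "admit queries normally"
    BLOCK_NEW = "prevent new queries from starting"
    SERIALIZE = "allow only one query to make progress"
    KILL = "kill all but one query"


# Hypothetical thresholds, as fractions of the memory budget, most severe first.
THRESHOLDS = [
    (0.95, Action.KILL),
    (0.85, Action.SERIALIZE),
    (0.70, Action.BLOCK_NEW),
]


def countermeasure(used_bytes, budget_bytes):
    """Pick the most severe countermeasure whose threshold has been reached."""
    usage = used_bytes / budget_bytes
    for threshold, action in THRESHOLDS:
        if usage >= threshold:
            return action
    return Action.NORMAL
```

Ordering the thresholds from most to least severe means the first match is always the strongest response that current memory pressure justifies.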
Repair-Based Tombstone Garbage Collection
■ Eliminate gc_grace_seconds
■ Tie tombstone garbage collection to last repair
■ Improves performance for clusters that have frequent repair
■ Improves correctness for clusters that missed repair
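The difference between the two purge rules can be shown with a pair of predicates. This is a simplified sketch: the function names are hypothetical, timestamps are plain seconds, and a single `last_repair_time` stands in for whatever finer-grained repair history the real system keeps.

```python
def purgeable_by_timeout(deletion_time, now, gc_grace_seconds):
    """Timeout rule: purge once gc_grace_seconds have elapsed since the delete."""
    return now - deletion_time >= gc_grace_seconds


def purgeable_by_repair(deletion_time, last_repair_time):
    """Repair-based rule: purge only tombstones written before the last completed repair."""
    return last_repair_time is not None and deletion_time < last_repair_time


GC_GRACE = 864_000  # ten days, in seconds

# Frequent repair: the tombstone is already covered by a repair, so it can go early.
early_timeout = purgeable_by_timeout(1_000, now=6_000, gc_grace_seconds=GC_GRACE)
early_repair = purgeable_by_repair(1_000, last_repair_time=5_000)

# Missed repair: the timeout expires anyway, but the repair-based rule keeps the tombstone.
late_timeout = purgeable_by_timeout(1_000, now=901_000, gc_grace_seconds=GC_GRACE)
late_repair = purgeable_by_repair(1_000, last_repair_time=None)
```

In the first scenario only the repair-based rule allows the purge (less tombstone overhead); in the second only the timeout rule does, which is the case where deleted data could reappear on a replica that never saw the delete.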
Cooking Now
Nudging the CQL Grammar Towards SQL
■ Relaxing constraints
■ Reconciling semantic oddities
■ Increasing the scope of autoparallel queries
Object Storage
■ A spectrum of cost/performance tradeoffs
  ■ RAM: Extremely fast (100ns), very expensive
  ■ NVMe: Very fast (100µs), expensive
  ■ HDD: Slow (10ms), cheap
■ Cloud Object storage (S3 and similar)
  ■ Slow (40ms), cheap
  ■ Infinitely expandable
  ■ Easy to manipulate
  ■ Shared access
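As a quick piece of arithmetic on the latency figures quoted on this slide, the ratios between tiers can be computed directly. The dictionary and the `slowdown` helper are illustrative names; the numbers are the slide's round figures, not measurements.

```python
# Latencies from the slide, in nanoseconds, so the ratios stay exact.
LATENCY_NS = {
    "RAM": 100,
    "NVMe": 100_000,
    "HDD": 10_000_000,
    "Object storage": 40_000_000,
}


def slowdown(tier, baseline="NVMe"):
    """How many times slower `tier` is than `baseline`."""
    return LATENCY_NS[tier] / LATENCY_NS[baseline]


object_vs_nvme = slowdown("Object storage")        # 400.0
hdd_vs_nvme = slowdown("HDD")                      # 100.0
object_vs_hdd = slowdown("Object storage", "HDD")  # 4.0
```

On these figures object storage is in the same latency class as HDD and far from NVMe, which is consistent with positioning it for dense or tiered data where latency is not critical.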
Use Cases for Object Storage
■ Very dense databases
  ■ Where latency is not critical
■ Tiered storage
  ■ Mix service levels and cost
  ■ Optimize both cost and latency
Thank You
Stay in Touch
Avi Kivity
avi@scylladb.com
@AviKivity
@avikivity

The Path to ScyllaDB 5.2
