Tomasz Grabiec, Distinguished Engineer at ScyllaDB
Felipe Mendes, Solution Architect at ScyllaDB
Replacing Your Cache
with ScyllaDB
Poll
Are you using cache in front of your DB?
Tomasz Grabiec (Tomek), Distinguished Engineer at ScyllaDB
Felipe Mendes, Solution Architect at ScyllaDB
Replacing Your Cache
with ScyllaDB
+ For data-intensive applications that require high throughput and predictable low latencies
+ Close-to-the-metal design takes full advantage of modern infrastructure
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ Compatible with Apache Cassandra and Amazon DynamoDB
+ DBaaS/Cloud, Enterprise and Open Source solutions
The Database for Gamechangers
“ScyllaDB stands apart...It’s the rare product that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+400 Gamechangers Leverage ScyllaDB
Seamless experiences across content + devices
Digital experiences at massive scale
Corporate fleet management
Real-time analytics
2,000,000 SKU e-commerce management
Video recommendation management
Threat intelligence service using JanusGraph
Real-time fraud detection across 6M transactions/day
Uber scale, mission critical chat & messaging app
Network security threat detection
Power ~50M X1 DVRs with billions of reqs/day
Precision healthcare via Edison AI
Inventory hub for retail operations
Property listings and updates
Unified ML feature store across the business
Cryptocurrency exchange app
Geography-based recommendations
Global operations: Avon, Body Shop + more
Predictable performance for on-sale surges
GPS-based exercise tracking
Serving dynamic live streams at scale
Powering India's top social media platform
Personalized advertising to players
Distribution of game assets in Unreal Engine
Introductions
Tomasz Grabiec, Distinguished Engineer at ScyllaDB
+ Core engineer and maintainer at ScyllaDB since its inception
+ Started coding when Commodore 64 was still a thing
+ Lives in Cracow, Poland
Felipe Mendes, Solution Architect at ScyllaDB
+ Published Author on Linux and Databases
+ Helps teams solve their most challenging problems
+ Years of experience with Linux and distributed systems
Agenda
+ Why Cache?
+ How can ScyllaDB help?
+ Caching Strategies
+ ScyllaDB Cache design
+ External Cache Hiccups
+ ScyllaDB as a Cache Replacement
Why Cache?
How ScyllaDB focuses on high throughput and low tail latency
Our technology
+ Horizontal & Vertical Scaling
+ Unique Close-to-Metal Architecture
+ Built in C++ (no Java overhead)
+ Everything Asynchronous
+ Shared Nothing
+ Shard per Core
+ Specialized Cache
+ Network, Processor, NUMA, Storage
Lower Consistent Latency -> Higher Revenue
Reducing load times on the insideline.com site from nine seconds to 1.4 seconds increased ad revenue by three percent and page views per session by 17 percent.
https://www.thinkwithgoogle.com/future-of-marketing/digital-transformation/the-google-gospel-of-speed-urs-hoelzle/
https://www.globaldots.com/resources/blog/latency-is-having-a-huge-negative-impact-on-ecommerce-companies
https://www.fastcompany.com/1825005/how-one-second-could-cost-amazon-16-billion-sales
Tail latency problem
Refresh
User App Business Logic Database
API Calls
DB Calls
Slowest 1% dominates latency
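Why the slowest 1% dominates can be shown with a little arithmetic. The sketch below assumes each DB call independently has a 1% chance of landing in the P99 tail; the independence assumption and the fan-out sizes (1, 10, 100) are illustrative and not taken from the talk.

```python
def p_any_slow(fanout: int, tail_fraction: float = 0.01) -> float:
    """Probability that at least one of `fanout` independent calls hits the slow tail."""
    return 1.0 - (1.0 - tail_fraction) ** fanout


# One user refresh may fan out into many DB calls; the refresh is as slow as its slowest call.
for n in (1, 10, 100):
    print(f"{n:>3} DB calls -> {p_any_slow(n):.1%} of refreshes include a tail-latency call")
```

Under these assumptions, a refresh that needs 100 DB calls includes at least one tail-latency call roughly 63% of the time, so the "rare" slow path becomes the common user experience.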
What most people do
Refresh
User App Business Logic
Database
API Calls
Problem solved?
Cache
How Can ScyllaDB Help?
Real-life Testimonials Proven at Scale
962 C* nodes reduced to 78 ScyllaDB nodes
>60% lower TCO
>95% better P99 latency
“By moving to ScyllaDB Enterprise software running on AWS EC2 infrastructure and on-premises, Comcast improved P99 latency by more than 95% and was able to rip out a UI cache layer.”
From Redis + Elasticsearch to ScyllaDB
<1ms P99
Zero downtime
TCO
TCO
Speed of Redis
From Redis to ScyllaDB for
Data Stores, Fraud Detection, Ad Targeting
Scalability
<1ms avg Latency
From Redis to Cassandra to ScyllaDB Cloud
4-8 ms P99
Fault Tolerance
Caching Strategies
Choose your destiny
Top caching strategies
Alex Xu @ ByteByteGo – https://blog.bytebytego.com/p/top-caching-strategies
Types of caches
+ Cache Aside
+ External Write Through (diagram nodes labeled DAX)
+ Write Around / Write Back
+ Embedded Read Through
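The write-side strategies named on these slides differ mainly in when the database sees a write. A minimal sketch, assuming plain Python dicts stand in for both the cache and the database; the function names are hypothetical, and the invalidation step in `write_around` is one common variant rather than something the slides specify.

```python
def write_through(cache: dict, db: dict, key, value) -> None:
    """Update the cache and the database together, before acknowledging."""
    cache[key] = value
    db[key] = value


def write_around(cache: dict, db: dict, key, value) -> None:
    """Write only to the database; drop any cached copy so it cannot go stale."""
    db[key] = value
    cache.pop(key, None)


def write_back(cache: dict, dirty: set, key, value) -> None:
    """Acknowledge from the cache; the database catches up on the next flush."""
    cache[key] = value
    dirty.add(key)


def flush(cache: dict, db: dict, dirty: set) -> None:
    """Persist every write-back entry that has not reached the database yet."""
    for key in dirty:
        db[key] = cache[key]
    dirty.clear()


cache, db, dirty = {}, {}, set()
write_through(cache, db, "a", 1)   # cache and db both hold a=1
write_around(cache, db, "a", 2)    # db holds a=2, cached copy dropped
write_back(cache, dirty, "b", 3)   # only the cache holds b=3 so far
flush(cache, db, dirty)            # db now holds b=3 as well
```

Note the window in `write_back` between the acknowledgement and `flush`: data that exists only in the cache is lost if the cache fails, which is one reason the choice of strategy matters.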
ScyllaDB Cache Design
Data flow
+ Write: lands in the memtable (RAM) and the commitlog (Disk)
+ Flush: the memtable is written to disk as an sstable while a new memtable accepts writes
+ Read: combines the memtable (RAM) with the sstables (Disk)
Reading without a cache:
+ Read consistency easy
+ Pin sstables and memtable
+ Thanks to collocation
+ ...but slow
Data flow with cache
+ Read: served from the cache and memtable (RAM); the cache is filled from the sstables (Disk)
Buffer cache?
RAM
Disk
sstable
4K
Inefficient use of memory:
+ Need to cache whole buffers to cache a single row
+ Access locality not likely if data set >> RAM
Why not buffer cache?
SSTable page (4K)
Row (300B)
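The memory cost of caching a whole page to keep one row can be worked out from the two sizes on this slide. A rough sketch: the 1 GiB cache size is a hypothetical figure, the buffer-cache case assumes the worst case of one hot row per cached page, and per-row bookkeeping overhead is ignored.

```python
PAGE_BYTES = 4 * 1024       # SSTable page size from the slide (4K)
ROW_BYTES = 300             # row size from the slide (300B)
RAM_BYTES = 1 * 1024**3     # hypothetical: 1 GiB available for caching

useful_fraction = ROW_BYTES / PAGE_BYTES             # share of each cached page that is the wanted row
rows_with_buffer_cache = RAM_BYTES // PAGE_BYTES     # worst case: one hot row per cached page
rows_with_row_cache = RAM_BYTES // ROW_BYTES         # row-granularity caching, overhead ignored

print(f"useful bytes per cached page: {useful_fraction:.1%}")
print(f"hot rows held by a buffer cache: {rows_with_buffer_cache:,}")
print(f"hot rows held by a row cache:    {rows_with_row_cache:,}")
```

With these numbers only about 7% of each cached page is the row that was asked for, so the same memory holds more than ten times as many hot rows when cached at row granularity.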
Poor negative caching:
+ Need to cache whole data buffer to indicate absent data
Why not buffer cache?
SSTable page (4K)
?
Inefficient use of memory:
+ Redundant buffers due to LSM
+ Read may touch multiple SSTables
+ Memory waste becomes more pronounced
Why not buffer cache?
sstable sstable
sstable
Read
High CPU overhead for reads:
+ Reads need to merge data from multiple sstables
Why not buffer cache?
sstable sstable
sstable
Read
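The merge step that every uncached read has to pay for can be sketched as a k-way merge. This is a simplified model, not ScyllaDB's actual read path: each sstable is assumed to be a list of hypothetical `(clustering_key, write_timestamp, value)` tuples sorted by key, and tombstones, TTLs and per-cell timestamps are left out.

```python
import heapq


def merge_read(*sstables):
    """Merge sorted sstables; for duplicate keys the highest write timestamp wins."""
    latest = {}
    # heapq.merge yields tuples in (key, timestamp) order, so for a given key
    # the newest version arrives last and overwrites the older ones.
    for key, _timestamp, value in heapq.merge(*sstables):
        latest[key] = value
    return list(latest.items())


older = [(1, 10, "a-old"), (3, 10, "c")]
newer = [(1, 20, "a-new"), (2, 15, "b")]
result = merge_read(older, newer)
print(result)
```

Every row returned costs comparisons across all sstables that may contain it, which is the CPU overhead a cache of already-merged rows avoids.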
High CPU overhead for reads:
+ SSTable format optimized for compact storage, not read speed
+ Parsing overhead:
+ Need to parse index buffers sequentially
+ Need to parse the data file
Why not buffer cache?
Premature cache eviction due to SSTable compaction:
+ SSTable compaction removes old files => buffer invalidation
+ Hurts read performance by incurring misses
Why not buffer cache?
sstable
sstable
sstable
sstable
+ Object cache
+ Like memtable
+ Optimized for low CPU overhead
+ Fast reads
+ Row-granularity caching
+ Reflects data in all relevant SSTables for a given object (e.g. row)
Cache structure
+ ScyllaDB reserves and manages most of the memory on a node
+ Small reserve for the OS
+ No use of Linux page cache (only direct I/O)
+ Cache uses all available free memory
+ Shrunk on pressure from memtable and other allocations
Memory management
memtable
cache other
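A cache that takes all free memory and gives it back under pressure can be sketched as follows. The class name, the `reclaim` method and the least-recently-used eviction order are assumptions made for illustration; the slide only states that the cache is shrunk when memtables or other allocations need memory.

```python
from collections import OrderedDict


class ShrinkableCache:
    """Toy row cache that can hand memory back on request."""

    def __init__(self):
        self._rows = OrderedDict()   # key -> bytes, least recently used first
        self.used_bytes = 0

    def put(self, key, value: bytes) -> None:
        if key in self._rows:
            self.used_bytes -= len(self._rows.pop(key))
        self._rows[key] = value
        self.used_bytes += len(value)

    def get(self, key):
        value = self._rows.get(key)
        if value is not None:
            self._rows.move_to_end(key)   # mark as recently used
        return value

    def reclaim(self, nbytes: int) -> int:
        """Evict least recently used rows until at least nbytes are freed."""
        freed = 0
        while freed < nbytes and self._rows:
            _key, value = self._rows.popitem(last=False)
            freed += len(value)
        self.used_bytes -= freed
        return freed


cache = ShrinkableCache()
cache.put("a", b"x" * 100)
cache.put("b", b"y" * 100)
cache.get("a")                 # "a" is now the most recently used row
freed = cache.reclaim(50)      # a memtable needs 50 bytes: "b" is evicted
```

Because eviction is driven by the allocator's demand rather than a fixed size limit, the cache never has to be sized by hand in this model.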
CPU sharding
CPU 0
CPU 1
CPU 2
CPU 3
Thread-per-core architecture
task task task task task task task
+ All processing in a single thread per CPU
+ Short tasks executed serially
+ Cooperative preemption
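The three bullets above can be mimicked in a few lines. This is only an analogy in Python, assuming generators stand in for tasks and `yield` stands in for a voluntary preemption point; ScyllaDB itself is written in C++, and `run_shard` and `worker` are hypothetical names.

```python
from collections import deque


def run_shard(tasks) -> None:
    """Run tasks on one 'core': one at a time, each until it yields or finishes."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)           # run until the task voluntarily yields
            queue.append(task)   # not finished: requeue behind the others
        except StopIteration:
            pass                 # task completed


def worker(name: str, steps: int, log: list):
    for i in range(steps):
        log.append(f"{name}:{i}")   # shared state touched without any lock
        yield                       # cooperative preemption point


log = []
run_shard([worker("a", 2, log), worker("b", 1, log)])
print(log)
```

Since only one task runs at a time on a shard and tasks give up the CPU only at points they choose, shared structures such as `log` need no locking, which is the property the cache relies on.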
Cache coherency
memtable
Read
cache
task
task
+ Complex operations on data without dealing with concurrency
+ No locking or complex lock-free algorithms
+ Data structures and algorithms simple
memtable
cache
Complex DQL/DML
SELECT * FROM table WHERE pk = 0 and ck >= 2;
DELETE FROM table WHERE pk = 0 and ck >= 2;
Range queries
2 5
SELECT * FROM table WHERE ... and ck >= 2;
?
Range queries
2 5
SELECT * FROM table WHERE ... and ck >= 2;
range continuity
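Range continuity answers the question mark on the previous slide: if rows 2 and 5 are cached, does anything lie between them? A toy model of the idea follows, not ScyllaDB's actual data structure. The class and method names are hypothetical, and the sketch only handles ranges bounded by cached keys; an open-ended bound such as `ck >= 2` would need an extra marker for the end of the partition, which is left out.

```python
import bisect


class CachedPartition:
    """Toy model of one cached partition with per-row continuity flags."""

    def __init__(self):
        self.keys = []            # cached clustering keys, kept sorted
        self.rows = {}            # clustering key -> row value
        self.continuous = set()   # keys whose gap to the previous cached key is known to be empty

    def add_row(self, ck, value, continuous=False):
        if ck not in self.rows:
            i = bisect.bisect_left(self.keys, ck)
            # Splitting an interval already known to be complete keeps both halves complete.
            if i < len(self.keys) and self.keys[i] in self.continuous:
                continuous = True
            self.keys.insert(i, ck)
        self.rows[ck] = value
        if continuous:
            self.continuous.add(ck)

    def populate_from_range_read(self, rows):
        """rows: sorted (ck, value) pairs from one contiguous read of the sstables."""
        for n, (ck, value) in enumerate(rows):
            self.add_row(ck, value, continuous=(n > 0))

    def can_serve(self, lo, hi):
        """True if lo <= ck <= hi can be answered from cache alone (lo and hi must be cached keys)."""
        i = bisect.bisect_left(self.keys, lo)
        j = bisect.bisect_right(self.keys, hi)
        in_range = self.keys[i:j]
        if not in_range or in_range[0] != lo or in_range[-1] != hi:
            return False
        return all(ck in self.continuous for ck in in_range[1:])


p = CachedPartition()
p.add_row(2, "row2")                 # cached by two unrelated point reads
p.add_row(5, "row5")
unknown = p.can_serve(2, 5)          # gap between 2 and 5 is unknown: read the sstables
p.populate_from_range_read([(2, "row2"), (5, "row5")])
known = p.can_serve(2, 5)            # gap is now known to be empty: serve from cache
```

A key-value cache has no place to record this kind of information, so it cannot answer range queries without going back to the database.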
Range deletions
2
DELETE FROM table WHERE ... and ck >= 2;
range continuity
+ tombstone
ScyllaDB cache highlights
+ ScyllaDB has a fast cache
+ Efficient access & maintenance
+ Thanks to collocation with replica and design
+ Takes care of consistency guarantees
+ Handles complexities of data and query model
External Cache Hiccups
+ Increased latency
+ Elevated costs
+ Decreased availability
+ Increased complexity
+ Ruins the DB's caching
+ Ignores the DB's own cache
+ Reduced security
Increased latency
External
Embedded in DB
<5 ms
<1 ms
<1 ms
Elevated costs
External
Embedded in DB
<5 ms
<1 ms
<1 ms
Decreased availability
External
HWLB
Application complexity
GET
Value
SELECT
Value
Update
Res
Is Nil?
ACK/NAK
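The flow in this diagram (GET, Is Nil?, SELECT, Update) is logic the application has to own once an external cache is added. A minimal sketch, assuming plain dicts stand in for the external cache and the database; the function names are hypothetical.

```python
def read_via_external_cache(key, cache: dict, db: dict):
    value = cache.get(key)        # GET
    if value is None:             # Is Nil?
        value = db.get(key)       # SELECT
        if value is not None:
            cache[key] = value    # Update the cache for the next reader
    return value


def read_via_database(key, db: dict):
    return db.get(key)            # single SELECT, no second system to keep in sync


db = {"user:1": "alice"}
cache = {}
first = read_via_external_cache("user:1", cache, db)    # miss, then fill
second = read_via_external_cache("user:1", cache, db)   # hit
direct = read_via_database("user:1", db)
```

The first function also leaves open questions the second does not have, such as what happens when the cache update fails or when the row changes in the database after it was cached.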
Databases hold a lot of context about the data:
+ ScyllaDB is wide-column (Key-Key-Value), while a cache might be Key-Value only.
+ Structured data: Tables, User Defined Types…
+ Cache settings and hit rates per table
+ Time To Live (TTL)
+ Materialized View and Secondary Indexes
+ Much more…
Ignores the database's knowledge
An external caching layer introduces noise:
+ Ignores built-in RBAC
+ Ineffective caching
+ Data consistency concerns
+ Data availability concerns
+ Scan-resistant caching
Ruins the database's own cache
ScyllaDB as a Cache Replacement
The features you are already familiar with, embedded in your database
Cache Observability
SELECT * FROM users BYPASS CACHE;
SELECT name, occupation FROM users WHERE userid IN
(199, 200, 207) BYPASS CACHE;
SELECT * FROM users WHERE birth_year = 1981 AND
country = 'FR' ALLOW FILTERING BYPASS CACHE;
CQL Extension – BYPASS CACHE
SSTable index caching
■ The whole index can now be cached in memory
■ Populated on access (read-through)
■ Evicted on memory pressure
■ Partition index summary is still non-evictable and always resident
RAM
Disk
SSTable indexing - large partition example
Partition size: 10 GB, Rows: 10 M, Index file size: 5 MB
scylla-5.0 -c1 -m4G
scylla-bench -workload uniform -mode read -limit 1 -concurrency 100 -partition-count 1 -clustering-row-count 10000000 -duration 60m
Before: 2,011 rows/s
After: 6,191 rows/s
(the node was bound by disk bandwidth, ~530 MB/s)
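The improvement from index caching in this example works out as follows, using only the two throughput figures reported on the slide.

```python
before_rows_per_s = 2011   # throughput before index caching, from the slide
after_rows_per_s = 6191    # throughput after index caching, from the slide

speedup = after_rows_per_s / before_rows_per_s
print(f"speedup: {speedup:.2f}x")
```

That is roughly a threefold gain in single-row reads from a large partition on the same hardware.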
Summary
+ Placing a cache in front of your database can backfire
+ A cache lacks the context the DB has about the workload
+ ScyllaDB's cache is optimized to work with zero overhead
+ Multiple users have replaced their cache with ScyllaDB
+ ScyllaDB includes several optimizations in its implementation
Q&A
ScyllaDB Cloud
Start free trial
scylladb.com/cloud
December 5, 2023
scylladb.com/events
Thank you
for joining us today.
@scylladb scylladb/
slack.scylladb.com
@scylladb company/scylladb/
scylladb/


Editor's Notes

  • #2: PRESENTER - Felipe 9:59:45 AM PT – Marisa, Cynthia, Julia mute themselves. Then Marisa to START WEBINAR IN ZOOM. Felipe starts talking at 10:00AM PT Good morning everyone and welcome to our webinar. We are going to give people a few more seconds as they funnel in and we will begin shortly. Felipe to wait 30 seconds as people join the webinar. Felipe to start talking again at 10:00:30 AM PT Hi everyone and welcome! Before we get started, I’d like to quickly review a couple of housekeeping items. We welcome your questions. Please use the Q&A button, located at the bottom of your screen to ask your questions. Remember, you can enter them any time during the webinar -- you don’t have to wait till the end. We will answer as many questions as we can get to at the end of the presentation. Also, please note that today’s webinar is being recorded. We will email you a link to the recording and the slides following the event.
  • #3: PRESENTER - Felipe Before we begin we are pushing a quick poll question.
  • #5: PRESENTER - Felipe For those of you who are not familiar with ScyllaDB yet, it is the database behind gamechangers - organizations whose success depends upon delivering engaging experiences with impressive speed. ScyllaDB was built with a close-to-the-metal design that squeezes every possible ounce of performance out of modern infrastructure. This translates to predictable low latency even at high throughputs. With such consistent innovation, the adoption of our database technology has grown to over 400 key players worldwide.
  • #6: PRESENTER - Felipe Many of you will recognize some of the companies among the selection pictured here, such as Starbucks, who leverage ScyllaDB for inventory management, Zillow for real-time property listings and updates, and Comcast Xfinity, who power all DVR scheduling with ScyllaDB. As you can see, ScyllaDB is used across many different industries and for entirely different types of use cases. More often than not, your company has a use case that is a perfect fit for ScyllaDB, and it may be that you don’t know it yet!
  • #10: SHARE LINKS IN CHAT (Marisa) Learn more about ScyllaDB Architecture at https://www.scylladb.com/product/technology/
  • #15: Purpose: Customer case study (Recommendation/Personalization - Media Streaming; Media & Entertainment) Audience: Mixed “Comcast Cable Communications, which many know as Xfinity, is a telecommunications giant headquartered in the US that provides cable TV, internet, telephone, and wireless services. “The Comcast X1 platform is a cable TV and streaming video service that incorporates a cloud DVR scheduling system for 15 million households, with 2B+ RESTful calls (reads/writes) and 200+M new objects per day. “Beginning first with Oracle and later moving to Cassandra, the scheduler engineering team struggled with database latency at scale. (click) “By moving to ScyllaDB Enterprise software running on AWS EC2 infrastructure and on-premises, Comcast improved P99, P999, and P9999 latency by more than 95% and was able to rip out a UI cache layer. (click) “They dramatically reduced their total database infrastructure from 962 Cassandra nodes (across multiple clusters) to 78 ScyllaDB nodes. (click) “And they reduced total costs by more than 60%, saving Comcast over $2.5M annually in infrastructure costs and staff overhead. Note: Philip Zimich, featured in the blog and recorded Summit presentation, leads the architecture, development and operations of Comcast’s X1 Scheduler system that powers the DVR and program reminder experience for the X1 platform. Blog/recorded presentation: 78 nodes is the total for 6 clusters across 3 data centers using Enterprise subscriptions with AWS infrastructure and on-premises. Salesforce: today 5 clusters, 4 in production (2 on EC2, 2 on premises) and totaling 100+ nodes.
  • #16: Purpose: Customer case study - (Recommendation/Personalization - Media Streaming; Media & Entertainment) Audience: Mixed “Based in India, Disney+ Hotstar provides on-demand streaming services to more than 18 million paid subscribers and 300 million monthly active users. “Disney+ Hotstar’s “Continue Watching” feature tracks every show for every user, capturing timestamps when last watched so users can pick up where they left off on any device, to prompt users to watch next episodes, and alert users to new episodes of favorite shows. “Using Kafka for streaming data and Redis (500GB) coupled with Elasticsearch (20TB) for their 20+TB data environment, the engineering team was running into scaling, data complexity, and cost issues. They considered a number of alternatives, from Cassandra and Apache HBase to DynamoDB, ultimately selecting our database-as-a-service, ScyllaDB Cloud. The gains were compelling with Disney+ Hotstar…, (click) “achieving sub-millisecond p99 latency at scale (click) “a simplified data architecture with significantly lower TCO. Note: Blog calls out 20TB, sub-millisecond P99.
  • #17: Purpose: Customer case study (Recommendation/Personalization - Media Streaming; Media & Entertainment) Audience: Mixed “HQ’d in Singapore, Grab is an on-demand transportation company - whether for personal rides or food or package delivery - and one of the most used mobile apps in Southeast Asia. Grab relies on Kafka to stream data for a variety of business use cases. To read the streams they needed a powerful, low-latency metadata store to aggregate the streams and initially used Redis - but it couldn’t keep up with the load. So Grab looked at Cassandra, ScyllaDB, and other NoSQL solutions, and after extensive testing, selected ScyllaDB. (click) ScyllaDB performance was on par with Redis… (click) …but without the scalability and related cost challenges. It also proved much easier than managing Cassandra. Grab now uses ScyllaDB for a variety of use cases including fraud detection, ad targeting, and data store for their front end UI.
  • #18: Purpose: Customer case study (Recommendation/Personalization - Media Streaming; Media & Entertainment) Audience: Mixed “Now part of Fox, Tubi is an ad-supported media streaming service with over 50 million active users. “Tubi uses ML and an innovative experimentation process to personalize movie recommendations. “Tubi initially used Redis for the recommendation database, but later moved to Cassandra. As their environment grew, so did the need for better latency, throughput, fault tolerance, and maintainability. “So they moved to ScyllaDB Cloud running on AWS. In addition to eliminating JVM tuning, (click) average read latency during peak times was reduced to sub-millisecond (click) and P99 was reduced to 4-8ms.
  • #27: And yes, we write to the commitlog for crash recovery
  • #32: Cache is inserted like this. It represents a subset of the data in SSTables.
  • #34: An improvement would be to manage most of the memory inside Scylla. Still...
  • #35: An improvement would be to manage most of the memory inside Scylla. Still...
  • #38: An improvement would be to manage most of the memory inside Scylla. Still...
  • #40: … and this is enabled by the fact that cache is collocated with the replica
  • #41: An improvement would be to manage most of the memory inside Scylla. Still...
  • #42: Cache is inserted like this. It represents a subset of the data in SSTables.
  • #43: Cache is inserted like this. It represents a subset of the data in SSTables.
  • #44: Cache is inserted like this. It represents a subset of the data in SSTables.
  • #47: Repeated scans never go to disk
  • #53: Mention HWLB (heat-weighted load balancing). When a cache node fails, latency jumps because the DB cache is cold, which ruins the database caching! This is not the case for ScyllaDB! Since each info element is replicated (usually 3 times), there are at least 2 nodes with a hot cache. ScyllaDB has an HWLB feature which allows it to gradually warm the node.
  • #54: There are only two hard things in Computer Science: cache invalidation and naming things. — Phil Karlton
  • #63: URL ScyllaDB Cloud: https://www.scylladb.com/product/scylla-cloud/ Database Performance at Scale Masterclass: https://lp.scylladb.com/database-performance-scale-masterclass-register ScyllaDB University Live: https://lp.scylladb.com/university-live-2023-12-registration
  • #64: Contact Us: Tomasz Grabiec: tgrabiec@scylladb.com, Tzach Livyatan: tzach@scylladb.com. Join our Slack channel: ScyllaDB Slack. Ask your questions on our user forum: ScyllaDB Community NoSQL Forum.
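
The note for slide 53 mentions HWLB (heat-weighted load balancing): a replica that comes back with a cold cache is warmed gradually instead of taking a full share of reads at once. The Python sketch below illustrates only that general idea and is not ScyllaDB's actual algorithm. The function names, the use of cache hit rate as the weight, and the `floor` parameter are all assumptions made for illustration.

```python
import random


def read_shares(hit_rates, floor=0.05):
    """Return the fraction of reads each replica would receive.

    hit_rates maps a replica name to its cache hit rate in [0, 1].
    The floor (a hypothetical parameter) keeps a cold replica receiving
    a trickle of reads so its cache can warm up over time.
    """
    weights = {name: max(rate, floor) for name, rate in hit_rates.items()}
    total = sum(weights.values())
    return {name: weight / total for name, weight in weights.items()}


def pick_replica(hit_rates, rng=random, floor=0.05):
    """Pick one replica at random, weighted by its (floored) hit rate."""
    names = list(hit_rates)
    weights = [max(hit_rates[name], floor) for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


if __name__ == "__main__":
    # Two warm replicas and one that has just restarted with a cold cache.
    rates = {"node-a": 0.9, "node-b": 0.9, "node-c": 0.0}
    for name, share in read_shares(rates).items():
        print(f"{name}: {share:.1%} of reads")
```

As the cold replica's hit rate rises, its weight and therefore its share of reads rise with it, which is the gradual warm-up the note describes.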