Exploring Phantom Traffic Jams in Your Data Flows
Pavel Emelyanov, Principal Engineer @ScyllaDB
Poll
Where are you in your NoSQL adoption?
About myself
Pavel Emelyanov
● Co-maintainer of Seastar & ScyllaDB
● Ex Linux kernel hacker
The Database Built for Gamechangers
“ScyllaDB stands apart... It’s the rare product that exceeds my expectations.”
– Martin Heller, InfoWorld contributing editor and reviewer
“For 99.9% of applications, ScyllaDB delivers all the power a customer will ever need, on workloads that other databases can’t touch – and at a fraction of the cost of an in-memory solution.”
– Adrian Bridgewater, Forbes senior contributor
+ InfoWorld 2020 Technology of the Year
+ Founded by designers of KVM Hypervisor
+ Resolves challenges of legacy NoSQL databases
+ >5x higher throughput
+ >20x lower latency
+ >75% TCO savings
+ DBaaS/Cloud, Enterprise and Open Source solutions
+ Proven globally at scale
400+ Gamechangers Leverage ScyllaDB
+ Seamless experiences across content + devices
+ Fast computation of flight pricing
+ Corporate fleet management
+ Real-time analytics
+ 2,000,000 SKU e-commerce management
+ Real-time location tracking for friends/family
+ Video recommendation management
+ IoT for industrial machines
+ Synchronize browser properties for millions
+ Threat intelligence service using JanusGraph
+ Real-time fraud detection across 6M transactions/day
+ Uber scale, mission-critical chat & messaging app
+ Network security threat detection
+ Power ~50M X1 DVRs with billions of reqs/day
+ Precision healthcare via Edison AI
+ Inventory hub for retail operations
+ Property listings and updates
+ Unified ML feature store across the business
+ Cryptocurrency exchange app
+ Geography-based recommendations
+ Distributed storage for distributed ledger tech
+ Global operations: Avon, Body Shop + more
+ Predictable performance for on-sale surges
+ GPS-based exercise tracking
About Seastar
+ Written in C++, built on top of the Seastar library
+ Shared-nothing
+ All asynchronous
What Kind of Jams?
Why Is … So Slow?
+ Hardware bottlenecks
+ Self-limiting
+ Phantom jams
Hardware Bottlenecks
+ IO-bound
+ CPU-bound
+ Memory-bound
+ Network-bound
Self-Limiting
+ Throttling and Rate-limiting
  + https://linuxfoundation.org/webinars/understanding-storage-i-o-under-load/
  + https://www.scylladb.com/2022/08/03/implementing-a-new-io-scheduler-algorithm-for-mixed-read-write-workloads/
+ Throttle for guarantees
  + https://wiki.openvz.org/Containers/Guarantees_for_resources
  + Linux cgroup controllers
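One common way to throttle on purpose is a token bucket. The sketch below is only an illustration of rate-limiting in general, not the algorithm used by the IO scheduler or the cgroup controllers linked above; the class name, the parameters and the virtual `now` clock are all assumptions made for the example.

```python
class TokenBucket:
    """Rate limiter: admits `rate` operations per second, with bursts up to `burst`."""

    def __init__(self, rate: float, burst: float) -> None:
        self.rate = rate      # tokens replenished per second
        self.burst = burst    # bucket capacity
        self.tokens = burst   # the bucket starts full
        self.last = 0.0       # time of the last refill, in seconds

    def try_acquire(self, now: float, cost: float = 1.0) -> bool:
        # Refill in proportion to the elapsed time, capped at the capacity.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


def admitted_in_one_second(offered: int, rate: float, burst: float) -> int:
    """Offer `offered` evenly spaced requests over one second and count the admitted ones."""
    bucket = TokenBucket(rate, burst)
    return sum(bucket.try_acquire(now=i / offered) for i in range(offered))
```

When the offered load stays below `rate`, every request is admitted; above it, the admitted count stays close to `rate` plus the initial burst, which is the self-imposed limit.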
Phantom Jams
Producer-Consumer
[Diagram: Producer (P messages / sec) → Message → Consumer (C messages / sec)]
Producer-Consumer (Cont.)
+ P ≤ C
+ Producer is a process / thread / fiber (or a pool of them) doing IO or talking over the network
+ Consumer is a disk / NIC / server
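The P ≤ C condition can be illustrated with a jitter-free "fluid" model: any excess of P over C accumulates as a backlog. This is a minimal sketch; the function name and the example rates are assumptions, not numbers from the talk.

```python
def backlog(p: float, c: float, seconds: float) -> float:
    """Messages left waiting after `seconds` in a jitter-free (fluid) model."""
    return max(0.0, (p - c) * seconds)


# Assumed example rates, in messages / sec.
print(backlog(p=180_000, c=200_000, seconds=10))  # P <= C: nothing piles up
print(backlog(p=220_000, c=200_000, seconds=10))  # P > C: the queue keeps growing
```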
And Dispatcher
[Diagram: Producer (P messages / sec) → Message → Dispatcher (D wake-ups / sec, d messages / wake-up) → Consumer (C messages / sec)]
And Dispatcher (Cont.)
+ Dispatcher examples
  + IO scheduler
  + Traffic shaper
  + API gateway
+ Dispatcher can impose
  + Fair-scheduling
  + Access policy
  + Buffering
  + Routing
+ d ≥ C / D
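The d ≥ C / D bound is easy to check with numbers. The sketch below plugs in the values the deck uses for its experiment (a 0.5 ms dispatcher period and a 200k messages / sec consumer); the function and variable names are made up for the example.

```python
import math


def min_batch_per_wakeup(consumer_rate: float, wakeups_per_sec: float) -> int:
    """Smallest whole d with d * D >= C, so the dispatcher can keep the consumer busy."""
    return math.ceil(consumer_rate / wakeups_per_sec)


C = 200_000   # consumer rate, messages / sec
D = 2_000     # dispatcher wake-ups / sec, i.e. one wake-up every 0.5 ms

d_min = min_batch_per_wakeup(C, D)   # lower bound on messages per wake-up
d_experiment = 1.5 * C / D           # the experiment submits 1.5x that bound

print(d_min, d_experiment)
```

With a smaller d the dispatcher could not hand over C messages per second even if it never woke up late.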
The Experiment
+ https://github.com/xemul/queue-dispatch
  + Seastar-centric (simplified Seastar)
  + Virtual time-axis
+ Producer
  + Generates empty messages at a given rate
+ Dispatcher
  + Wakes up every 0.5 ms
  + Submits 1.5 × C / D messages
+ Consumer
  + 200k messages / sec by default
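A much smaller stand-in for such a simulator can be sketched on a virtual time axis. This is not the queue-dispatch code: the model (evenly spaced producer, periodic dispatcher with a per-wake-up cap, consumer serving one message at a time) and every name below are assumptions chosen to mirror the bullets above.

```python
from collections import deque


def simulate(p_rate, c_rate=200_000, period_us=500, overcommit=1.5, duration_us=100_000):
    """Return per-message latencies (in µs) for a jitter-free producer/dispatcher/consumer chain."""
    batch = int(overcommit * c_rate * period_us / 1_000_000)  # messages per wake-up
    gap_us = 1_000_000 / p_rate       # producer inter-message gap
    service_us = 1_000_000 / c_rate   # consumer time per message
    queue = deque()                   # arrival timestamps waiting in the dispatcher
    latencies = []
    produced = 0
    consumer_free_at = 0.0
    for now in range(period_us, duration_us + 1, period_us):  # dispatcher wake-ups
        # Enqueue everything the producer generated up to this wake-up.
        while produced * gap_us <= now:
            queue.append(produced * gap_us)
            produced += 1
        # Submit at most `batch` messages; the consumer serves them one by one.
        for _ in range(min(batch, len(queue))):
            arrived = queue.popleft()
            consumer_free_at = max(consumer_free_at, now) + service_us
            latencies.append(consumer_free_at - arrived)
    return latencies


latencies = simulate(p_rate=100_000)
print(len(latencies), max(latencies))
```

In this idealized model a message waits at most about one dispatcher period before it is served, so the worst-case latency stays near 0.5 ms as long as P is comfortably below C.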
Make Them “Real”
+ Ideal simulator shows 0.5 ms latency (expected)
+ Real “components” should show P, D and C on average
  + Poisson point process
  + Pauses between events are exponentially distributed
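Jitter of this kind can be generated by drawing exponentially distributed gaps between events, which yields a Poisson point process with the requested average rate. The sketch below uses Python's standard `random.expovariate`; the function name, the seed and the example rate are assumptions.

```python
import random


def poisson_event_times(rate: float, duration: float, seed: int = 0) -> list[float]:
    """Event timestamps of a Poisson process with `rate` events / sec over `duration` sec."""
    rng = random.Random(seed)
    times, now = [], 0.0
    while True:
        now += rng.expovariate(rate)  # exponentially distributed gap, mean 1 / rate
        if now >= duration:
            return times
        times.append(now)


# Example: dispatcher wake-ups that average one per 0.5 ms but are no longer periodic.
wakeups = poisson_event_times(rate=2_000, duration=10.0)
print(len(wakeups) / 10.0)  # close to 2000 wake-ups / sec on average
```

Individual gaps can be much shorter or longer than 0.5 ms even though the long-run rate matches the ideal one.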
Producer Jitter
Consumer Jitter
Dispatcher Jitter
Dispatcher Jitter (Cont.)
Effective Dispatch Rate
[Diagram: Producer → Dispatcher → Consumer, annotated with X messages / sec]
Effective Dispatch Rate (Cont.)
In the Real Life of <your system>
+ Interposer between producer and consumer
+ Queue in interposer
+ Possible jitter (a.k.a. cooperative preemption)
  + Seastar
  + Coroutines / Goroutines
  + Unikernels
Conclusions
+ The bottleneck might not be in the “hardware”
+ Good metrics are a must
+ https://www.scylladb.com/2022/04/19/exploring-phantom-jams-in-your-data-flow/
Poll
How much data do you have under management in your transactional database?
Q&A
WANT TO KEEP LEARNING?
Join ScyllaDB University for Free:
university.scylladb.com
SCYLLADB VIRTUAL WORKSHOP
Getting Started with ScyllaDB
19 January, 2023, 1PM GMT | 8 AM ET | 6:30 PM IST
Thank you
for joining us today.
