Running a Massively Parallel
Self-serve Distributed Data System At Scale
Zhenzhong Xu | @ZhenzhongXu
Real-time Data Infrastructure
Running a Massively Parallel
Self-serve Distributed Data System at Scale
● What is Netflix Real-time Data Infrastructure
● Challenges
● Solutions and Principles
What is ...
… Real-time Data Infrastructure at Netflix
Business/Product Driven Analytics
● Recommendations / Personalization Algorithms
● Customer Experience
● Content Operation
● A/B Testing
● Marketing
● etc
Rise of Event Driven Architecture
● Notification
● Event Sourcing
● CQRS
● etc
We need a data platform that’s both scalable and real-time.
Data-driven Culture
So what exactly is Keystone Streaming Platform?
Publish, Collect, Move & Compute
event data in near real time @ Cloud Scale
Keystone is ...
… a collection of microservices & components
[Diagram: Stream Processing Service, Pub/Sub Queuing Service, Producer API, Consumer API, Control Plane, Self Service UI]
Keystone is ...
… a single self-contained logical PaaS
Event Processing
Pipeline
Keystone is ...
… a multi-tenant, self-service tool
Keystone is ...
… a self-healing, cloud-failure-tolerant service that
guarantees at-least-once delivery semantics
… adaptable to changing environments
Putting together ...
[Diagram: Event Producer, Fronting Kafka, Consumer Kafka, Stream Consumers, Batch System, Mantis, Control Plane, Self Service UI]
… the decisions we made to build scalable, reliable and
maintainable systems.
Challenges
Solutions & Principles
Challenge 1:
Scale.
A Single Stream
Anatomy of a Single Stream
[Diagram: Event Producer → Queuing Service → Stream Processing Job → Sink]
Separation of Concerns
● Separation of Messaging and Stream Processing services.
● Each service scales individually.
● Each service manages its own states.
● Independently manage service dependencies.
○ Kafka brokers on EC2
○ Streaming Service job on Titus Container Runtime
Titus Container Runtime
● Resource Provisioning
● Scheduling and bin-packing
● Capacity Guarantees
● Resource isolation
● Per-container IP address (underlay via VPC)
Delivery/Processing Semantics
● At-most once
● At-least once
● Exactly once
At-least Once Processing Checkpointing
● Synchronous checkpointing through the event loop
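A minimal sketch of why checkpointing only after processing gives at-least-once semantics. This is not Keystone's code: the `AtLeastOnceConsumer` class, the in-memory `log` standing in for a partition, and the `crash_at` switch are all hypothetical names chosen for illustration.

```python
class AtLeastOnceConsumer:
    """Toy consumer: the offset is checkpointed only after processing succeeds."""

    def __init__(self, log):
        self.log = log          # list of events (hypothetical stand-in for a partition)
        self.committed = 0      # last checkpointed offset (hypothetical store)

    def run(self, process, crash_at=None):
        """Process events from the last checkpoint; optionally crash before a commit."""
        offset = self.committed
        while offset < len(self.log):
            process(self.log[offset])        # the side effect happens first
            if crash_at == offset:
                raise RuntimeError("crashed before checkpoint")
            offset += 1
            self.committed = offset          # synchronous checkpoint after processing


def demo():
    delivered = []
    consumer = AtLeastOnceConsumer(["a", "b", "c"])
    try:
        consumer.run(delivered.append, crash_at=1)   # crash right after processing "b"
    except RuntimeError:
        pass
    consumer.run(delivered.append)                   # restart from the last checkpoint
    return delivered
```

Calling `demo()` returns `["a", "b", "b", "c"]`: nothing is lost, but "b" is delivered twice because the crash landed between processing and checkpointing. Checkpointing before processing would flip the trade-off to at-most-once.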
Exactly Once* Processing Semantics
● Lightweight Asynchronous Snapshot (Async Barrier Checkpointing)
Per Stream
Monitoring & Alerting
Streams with Fanout
Logical Isolation
Logical Isolation
● Streams Level
● Deployments Level
● What about regional island isolation?
Total Bytes Out = (num Of Consumers + replication factor - 1) * Bytes In
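A worked instance of the formula above, as a sketch. The function name `total_bytes_out` and the numbers (100 MB/s in, 5 consumers, replication factor 3) are illustrative assumptions, not figures from the talk.

```python
def total_bytes_out(bytes_in, num_consumers, replication_factor):
    """Outbound traffic for one cluster: every consumer reads the full stream,
    and each of the (replication_factor - 1) follower replicas pulls one copy."""
    return (num_consumers + replication_factor - 1) * bytes_in


# Five consumers reading directly from one cluster (illustrative numbers).
direct = total_bytes_out(bytes_in=100, num_consumers=5, replication_factor=3)

# The same cluster when only a single downstream reader (e.g. a router) consumes from it.
single_reader = total_bytes_out(bytes_in=100, num_consumers=1, replication_factor=3)
```

With these assumed inputs, `direct` is 700 and `single_reader` is 300. Outbound load grows linearly with the number of consumers, which is why fanout is worth moving off the cluster that producers write to.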
Solve Consumer Fanout with Hierarchies
Total Infrastructure Scale
● 500+ Billion events generated per day
● 1+ Trillion events processed per day
● ~800 Topics
● ~1,800 Streams
● 4000+ Kafka Instances
● ~9,000 Stream Processing Containers
Kafka Cluster Failover
[Diagram: Producer, Kafka Cluster, Router]
Failover
[Diagram: Producer, Kafka Cluster, Router, Failover Cluster]
Failback
[Diagram: Producer, Kafka Cluster, Router, Decommissioned Failover Cluster]
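The failover and failback sequence can be sketched as a small routing state holder. This is a hypothetical illustration only: the `ClusterRouter` class, its methods, and the cluster names are assumptions, and the real router's behavior is not described in the slides.

```python
class ClusterRouter:
    """Toy router: decides which Kafka cluster producers should write to."""

    def __init__(self, primary):
        self.primary = primary
        self.failover_cluster = None
        self.active = primary
        self.decommissioned = []

    def failover(self, new_cluster):
        """Point traffic at a freshly provisioned failover cluster."""
        self.failover_cluster = new_cluster
        self.active = new_cluster

    def failback(self):
        """Return traffic to the primary and retire the failover cluster."""
        if self.failover_cluster is None:
            raise RuntimeError("no failover in progress")
        self.decommissioned.append(self.failover_cluster)
        self.failover_cluster = None
        self.active = self.primary

    def route(self, event):
        """Return the (cluster, event) pair a producer would send."""
        return (self.active, event)


router = ClusterRouter("kafka-a")          # hypothetical cluster names
before = router.route("e1")
router.failover("kafka-b")
during = router.route("e2")
router.failback()
after = router.route("e3")
```

Here `before`, `during` and `after` target `kafka-a`, `kafka-b` and `kafka-a` respectively, and `kafka-b` ends up in `router.decommissioned`. Treating the failover cluster as disposable is the "cattle" side of the pet-vs-cattle distinction.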
Pet vs Cattle
Immutable Kafka Clusters?
Principles:
Failure as a First Class Citizen
Separation of Concerns
Embrace Immutability
Challenge 2:
Self-serve & Multi-tenant
Diverse Customer Requirements
● Diverse combination of features
● Diverse platform tradeoffs
“Change is the Only Constant”
Change Is the Only Constant
● New streams deployed in a few mins
● Changing customer needs
● Scaling activity
● Infrastructure Upgrades
Failure Modes
● Infrastructure Disaster
● Any component can become temporarily unavailable
Provide Building
Blocks
Declarative
Reconciliation
Declarative Reconciliation
● “Declarative” is a communication pattern
● “Reconciliation” drives the entire system toward its goal state
Declarative Reconciliation
● Goal States
● Current States
Layered
Reconciliation
Declarative Reconciliation
● Goal States
● Current States
● State Machine Driven Reconciliation
State Machine
Goal State Driver
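A minimal sketch of goal-state reconciliation, assuming state can be modeled as a dict from stream name to desired container count. The functions `reconcile` and `apply_actions` and the action names are hypothetical, not Keystone's control-plane API.

```python
def reconcile(goal, current):
    """Compare goal state with current state and return the actions needed
    to converge. Both arguments map stream name -> container count."""
    actions = []
    for name, want in sorted(goal.items()):
        have = current.get(name)
        if have is None:
            actions.append(("create", name, want))
        elif have != want:
            actions.append(("scale", name, want))
    for name in sorted(current.keys() - goal.keys()):
        actions.append(("delete", name, None))
    return actions


def apply_actions(actions, current):
    """Return the new current state after carrying out the actions."""
    new_state = dict(current)
    for op, name, want in actions:
        if op == "delete":
            new_state.pop(name)
        else:
            new_state[name] = want
    return new_state


goal = {"a": 3, "b": 2}
current = {"a": 1, "c": 4}
actions = reconcile(goal, current)
converged = apply_actions(actions, current)
```

Running `reconcile` again on `converged` yields no actions, so the loop is safe to repeat at any time. Because the caller declares only the goal and never the steps, a reconciler that crashes midway can simply recompute the diff on restart.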
Single
Source of Truth
Allows Eventual Consistency
Convergence on Goal State
Principle:
Leverage Reusable Building Blocks
Declarative Reconciliation
Single Source of Truth
Thought Experiment:
Kitchen Management vs Distributed
Architecture?
Thank you.
References:
https://medium.com/netflix-techblog
https://www.confluent.io/kafka-summit-sf17/multitenant-multicluster-and-hieracrchical-kafka-messaging-service
We’re hiring - http://bit.ly/NetflixSPaaS
@ZhenzhongXu (tweet me questions!)

Editor's Notes

  • #3: Hook: kitchen management vs distributed system architecture
  • #5: Microservice - separation of concerns. Communication style - declarative reconciliation.
  • #8: A large-scale microservice architecture supports more than 100MM subscribers worldwide. Almost any internet-connected screen is capable of generating events. This enormous volume of events is used to ...
  • #11: Microservices empower local decisions without having to consult a centralized decision-making process. Connect silos; break technical and organizational barriers.
  • #13: Data backbone for the company to ensure event data collection, movement and computation. Describe what is event data.
  • #14: Control plane vs data plane. Chef example: management/orchestration vs heavy lifting (making the work happen).
  • #18: Describe router and stream processing platform. High-level, non-engine-specific API vs implementing jobs on APIs. Talk about two different product offerings, and why fronting Kafka needs to be protected. Typical latencies.
  • #20: Cloud abstraction. Network partition.
  • #21: Data pipeline routing. High level DSL. Define source, transformation (projection, filtering) and sink location. Briefly mention embarrassingly parallel processing vs more complex processing with shuffling. (locality, operator chaining vs large local states, managing large checkpoints). And we ARE only focusing on embarrassingly parallel case.
  • #22: Implemented on Apache Flink. Scaling. At least once delivery. Tolerate failure.
  • #23: Parallelism/Partitioning isolated across components. Back pressure handling in parallel job. Checkpoint, delivery semantics. Fine-grained recovery.
  • #24: One service does one thing only. Do it really well. Expand on states. Bug fixes Platform Upgrades Kafka Stream Processing Engine Runtime Provisioning & Deployment Failover Scaling
  • #25: Immutable infrastructure on container. Talk about actual scheduling, bin-packing.
  • #28: The side effects of processing are identical to exactly-once. Does not apply to external systems.
  • #29: Monitoring, mention external monitor of committed offsets.
  • #30: Fanout sinks. Support both batch and custom stream processing jobs.
  • #32: Each stream is deployed as separate clusters. Trade offs.
  • #36: Clusters differ in size, names, and recoverability. Pet vs Cattle.
  • #37: From Failover to failback to failover only.
  • #38: From Failover to failback to failover only.
  • #39: From Failover to failback to failover only.
  • #40: Clusters differ in size, names, and recoverability. Pet vs Cattle.
  • #41: From Failover to failback to failover only.
  • #43: Cloud abstraction. Network partition.
  • #46: Constant changes => infrastructure constantly out of sync with requirements. Requirements differ => give customers options for different tradeoffs. Topic move.
  • #47: New customer streams - deployed in a few mins. Customer-requested scaling activity - involves topic move. Infrastructure failure. Detected scale change. Infrastructure upgrades. Constant changes => infrastructure constantly out of sync with requirements. Requirements differ => give customers options for different tradeoffs.
  • #49: Why are kitchen orders immutable/append-only? Talk about benefits: scale, eventual consistency, less human interaction with the declarative reconciliation approach.
  • #52: What customers need to know and what they don’t (topic move example). Separation of concerns: break the source of truth into lower-level goal states that each individual component is authoritative for. Goal state / current state exchange. Goal states are typically expressed in a way that’s “idempotent”.
  • #53: Why are orders and contracts append-only (immutable)?
  • #58: Building blocks - scale your operation. Declarative reconciliation allows eventual convergence on customer goals in a fast-changing environment. A single source of truth drives the reconciliation flow and tolerates total infrastructure failures by persisting only the essential information.