Time and Ordering in
Streaming Distributed Systems
Zhenzhong Xu
Real-time Data Infrastructure
Netflix
@ZhenzhongXu
Software engineers think of time as:
● Uniformly measurable
● One directional
● Infinitely precise
● Manifesting the ordering of events
“Time no longer appears to us as a
gigantic, world-dominating chronos,
nor as a primitive entity, but as
something derived from phenomena
themselves. It is a figment of my
thinking.”
— Schrödinger, Erwin.
“Time is an illusion.”
— Einstein, Albert.
Distributed System
No shared memory, only message
passing over an unreliable network with
variable delays; the system may
suffer from partial failures, unreliable
clocks, and processing pauses.
Stream processing connects
distributed systems together, over
space and time, and is designed with
unbounded data sets in mind.
Stream Processing at Netflix
● Keystone Data Pipeline
● Operational insights
● Business analytics
● Event sourcing pattern
Categories of streaming
● Time agnostic
    ● Transformation
    ● Filtering
    ● Projection
    ● Enrichment
    ● Inner joins
● Approximation
    ● Approximate top-n
    ● Streaming k-means
    ● etc
● Windowing
    ● Fixed / Tumbling
    ● Sliding / Hopping
    ● Session / Dynamic
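The three windowing styles differ only in how an event timestamp is mapped to one or more windows. A minimal sketch in Python, assuming integer timestamps and half-open [start, end) windows; the function names and parameters are illustrative and not taken from any particular streaming engine:

```python
def tumbling_window(ts, size):
    """Fixed/tumbling: each timestamp belongs to exactly one window."""
    start = ts - ts % size
    return (start, start + size)


def sliding_windows(ts, size, slide):
    """Sliding/hopping: windows of length `size` start every `slide`,
    so one timestamp can belong to several overlapping windows."""
    windows = []
    start = ts - ts % slide          # latest window start at or before ts
    while start > ts - size:         # keep only windows that still contain ts
        windows.append((start, start + size))
        start -= slide
    return sorted(windows)


def session_windows(timestamps, gap):
    """Session/dynamic: a window grows while events keep arriving less
    than `gap` apart, and closes after `gap` of inactivity."""
    sessions = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] < gap:
            sessions[-1].append(ts)  # still inside the current session
        else:
            sessions.append([ts])    # inactivity gap exceeded: new session
    return [(s[0], s[-1] + gap) for s in sessions]


print(tumbling_window(7, size=5))
print(sliding_windows(7, size=10, slide=5))
print(session_windows([1, 2, 10, 11, 30], gap=5))
```

Note that session windows, unlike the other two, cannot be computed from a single timestamp: their boundaries depend on the data itself.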
Project Delta
Eventually consistent, event-driven
data synchronization platform
● Event sourcing
● Windowing
Challenges:
● Semantics of ordering
● Latency vs. durability
● CDC (change data capture)
● etc
… via the three lenses of time
● Uniformity of time
● Arrow of time
● Perception of time
#1
Uniformity of time
Time is a tool ...
Need for synchronization?
= uniform time
Standing on the shoulders of giants
Time flows more slowly closer to a black hole
Scene from the movie Interstellar, depicting time flowing more slowly closer to the supermassive black hole “Gargantua”
Clock
synchronization
over network
“NTP can usually maintain time to
within tens of milliseconds over the
public Internet, and can achieve
better than one millisecond accuracy
in local area networks under ideal
conditions. Asymmetric routes and
network congestion can cause errors
of 100 ms or more.”
https://en.wikipedia.org/wiki/Network_Time_Protocol
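Why asymmetric routes hurt: the standard NTP offset estimate assumes the request and the reply take equally long. A minimal sketch of that arithmetic in Python, with times in milliseconds; the `simulate` helper and its parameter names are assumptions made for illustration, not part of any NTP implementation:

```python
def ntp_offset_and_delay(t0, t1, t2, t3):
    """t0: client send, t3: client receive (client clock);
    t1: server receive, t2: server send (server clock)."""
    offset = ((t1 - t0) + (t2 - t3)) / 2   # estimated server clock minus client clock
    delay = (t3 - t0) - (t2 - t1)          # round trip minus server processing time
    return offset, delay


def simulate(true_offset, up, down, t0=0):
    """Hypothetical single exchange with one-way latencies `up` and `down`."""
    t1 = t0 + up + true_offset             # arrival, read on the server clock
    t2 = t1                                # assume zero server processing time
    t3 = (t2 - true_offset) + down         # arrival, read on the client clock
    return ntp_offset_and_delay(t0, t1, t2, t3)


# Symmetric path: the 50 ms clock offset is recovered exactly.
print(simulate(true_offset=50, up=10, down=10))
# Asymmetric path: the estimate is wrong by (up - down) / 2 = -50 ms.
print(simulate(true_offset=50, up=10, down=110))
```

The client has no way to observe `up` and `down` separately, so this error cannot be corrected from a single exchange.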
Figure referenced from Designing Data-Intensive Applications by Martin Kleppmann, Chapter 8, The Trouble with Distributed Systems
Relying on synchronized wall clock timestamps?
Time in
Stream
Processing
https://ci.apache.org/projects/flink/flink-docs-stable/concepts/programming-model.html
Why time skews
● Information travel takes time
● Low-power devices
● Process failure
● Unpredictable network congestion
● Timeouts and unbounded delays
● Unreliable clocks
● Process pauses
● etc
Figure referenced from Streaming Systems by Tyler Akidau et al., Chapter 1, Streaming 101
Watermark
in action
Animation referenced from Streaming Systems by Tyler Akidau et al., http://streamingbook.net/fig/2-11
Use watermark to bound the
uncertainties of time
● Allowed lateness
Figure referenced from Streaming Systems by Tyler Akidau et al., Chapter 2, Going Streaming: When and How
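How a watermark plus allowed lateness sorts events into three outcomes can be sketched in a few lines of Python. This assumes a simple heuristic watermark (largest event time seen so far minus a fixed bound) and fixed windows; the function and parameter names are illustrative, and real engines differ in detail:

```python
def classify_events(events, max_out_of_orderness, allowed_lateness, window_size):
    """`events` are event-time timestamps listed in arrival order.
    Returns (timestamp, window, status) for each event."""
    watermark = float("-inf")
    results = []
    for ts in events:
        start = ts - ts % window_size
        end = start + window_size
        if watermark < end:
            status = "on_time"          # window has not been declared complete yet
        elif watermark < end + allowed_lateness:
            status = "late_accepted"    # window fired, but is kept open for updates
        else:
            status = "dropped"          # outside the allowed lateness boundary
        results.append((ts, (start, end), status))
        # Heuristic watermark: we assume no event is more than
        # `max_out_of_orderness` behind the newest one seen so far.
        watermark = max(watermark, ts - max_out_of_orderness)
    return results


for row in classify_events([1, 4, 12, 8, 19, 7],
                           max_out_of_orderness=2,
                           allowed_lateness=5,
                           window_size=10):
    print(row)
```

Here the event at 8 arrives after the watermark has passed the end of window (0, 10) but within the allowed lateness, so it is accepted late; the event at 7 arrives after that boundary and is dropped.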
Project Delta
Watermark and
allowed lateness
POST arrives
outside allowed
lateness boundary
PRE event
duplicated
#2
Arrow of time
Boltzmann’s entropy formula
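For reference, the formula in its standard form, where S is the entropy of a macrostate, k_B is the Boltzmann constant, and W is the number of microstates consistent with that macrostate:

```latex
S = k_B \ln W
```

Since the second law of thermodynamics says the entropy of an isolated system does not decrease, increasing S is one common way to define the direction (arrow) of time.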
PRE ts: GMT: Saturday, November 17,
2018 7:10:00 AM
POST ts: GMT: Saturday, November
17, 2018 7:09:00 AM
Stream Processing
Custom windowing
Embed our blurred vision to
represent the arrow of time in
custom trigger logic
Out of order event timestamp
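One way such a custom trigger could look, as a rough sketch only: pair the PRE and POST events by key instead of trusting their timestamps, and treat timestamps closer together than a tolerance as indistinguishable. The class name `PairingTrigger`, the `tolerance` parameter, and the key are hypothetical and are not taken from Project Delta:

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta


@dataclass
class PairingTrigger:
    """Hypothetical trigger: fires once both PRE and POST for a key arrive."""
    tolerance: timedelta
    pending: dict = field(default_factory=dict)

    def on_event(self, key, kind, ts):
        """Return (pre_ts, post_ts, plausibly_ordered) when the pair
        completes, otherwise None."""
        slot = self.pending.setdefault(key, {})
        slot[kind] = ts
        if "PRE" in slot and "POST" in slot:
            del self.pending[key]
            pre, post = slot["PRE"], slot["POST"]
            # POST may carry an earlier timestamp than PRE; accept the pair
            # as long as the inversion stays within the tolerance.
            plausibly_ordered = post - pre >= -self.tolerance
            return pre, post, plausibly_ordered
        return None


# The timestamps from the slide: POST is stamped one minute before PRE.
trigger = PairingTrigger(tolerance=timedelta(minutes=2))
print(trigger.on_event("row-1", "POST", datetime(2018, 11, 17, 7, 9)))
print(trigger.on_event("row-1", "PRE", datetime(2018, 11, 17, 7, 10)))
```

The arrival order of the two events does not matter here; only the size of the timestamp inversion relative to the tolerance does.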
#3
Perception of time
This is a story
about me and my
uncle ...
Me,
4 years old
My uncle, 2
years old
Imagine an ancestry tree that includes all modern human
beings ...
When forcing a global generation order...
Can Bob and Dave be logically the
same generation?
What’s the meaning of “now”?
Light travels in a cone shape
over time ...
The light cone
representing the
past, present, and
future ...
https://en.wikipedia.org/wiki/Light_cone
Light cone
spacetime diagram
Revisit the ancestry
tree
The cone shape shows the
causal/partial ordering from
Dave’s frame of reference.
Lorentz transformation
Observers in different frames of
reference perceive different
orderings of events
Relativity of Simultaneity
Time and ordering depend
on the frame of reference (space
and time!)
There is no deterministic global
ordering.
In a distributed system, it is
sometimes impossible to say
that one of two events
occurred first. The relation
“happened before” is
therefore only a partial
ordering of the events in the
system.
Figure referenced from Wikipedia: https://en.wikipedia.org/wiki/Vector_clock
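Vector clocks make the "happened before" relation checkable in code. A minimal sketch in Python, assuming clocks are plain dicts from node name to counter (missing entries count as 0); the function names are illustrative:

```python
def vc_increment(clock, node):
    """Local event on `node`: bump that node's own counter."""
    new = dict(clock)
    new[node] = new.get(node, 0) + 1
    return new


def vc_merge(local, received, node):
    """Message receipt on `node`: element-wise max, then a local increment."""
    merged = {n: max(local.get(n, 0), received.get(n, 0))
              for n in set(local) | set(received)}
    return vc_increment(merged, node)


def compare(a, b):
    """Return 'before', 'after', 'equal', or 'concurrent'."""
    nodes = set(a) | set(b)
    a_le_b = all(a.get(n, 0) <= b.get(n, 0) for n in nodes)
    b_le_a = all(b.get(n, 0) <= a.get(n, 0) for n in nodes)
    if a_le_b and b_le_a:
        return "equal"
    if a_le_b:
        return "before"      # a happened before b
    if b_le_a:
        return "after"       # b happened before a
    return "concurrent"      # neither ordering can be established


a1 = vc_increment({}, "A")       # event on node A
b1 = vc_increment({}, "B")       # independent event on node B
b2 = vc_merge(b1, a1, "B")       # B receives a message A sent at a1
a2 = vc_increment(a1, "A")       # A continues on its own

print(compare(a1, b1))
print(compare(a1, b2))
print(compare(a2, b2))
```

The "concurrent" result is the partial ordering at work: for some pairs of events the system simply cannot say which came first.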
Partial/Causal ordering
An irreflexive partial ordering on a set A is
a relation on A that satisfies three
properties.
1. irreflexivity: a ⊀ a
2. antisymmetry: if a < b then b ⊀ a
3. transitivity: if a < b and b < c then a < c
Total ordering
An irreflexive total ordering is an irreflexive
partial ordering that satisfies another
condition.
4. totality: if a ≠ b then a < b or b < a.
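The four properties can be checked by brute force on a small finite set. A minimal sketch in Python; the two example relations (strict divisibility and numeric less-than) are chosen here for illustration and are not from the talk:

```python
from itertools import product


def is_strict_partial_order(elements, lt):
    """Check irreflexivity, antisymmetry, and transitivity of `lt`."""
    irreflexive = all(not lt(a, a) for a in elements)
    antisymmetric = all(not (lt(a, b) and lt(b, a))
                        for a, b in product(elements, repeat=2))
    transitive = all(lt(a, c)
                     for a, b, c in product(elements, repeat=3)
                     if lt(a, b) and lt(b, c))
    return irreflexive and antisymmetric and transitive


def is_strict_total_order(elements, lt):
    """A strict partial order in which every distinct pair is comparable."""
    total = all(lt(a, b) or lt(b, a)
                for a, b in product(elements, repeat=2) if a != b)
    return is_strict_partial_order(elements, lt) and total


numbers = [1, 2, 3, 4, 6, 12]


def strictly_divides(a, b):
    return a != b and b % a == 0


def less_than(a, b):
    return a < b


print(is_strict_partial_order(numbers, strictly_divides))
print(is_strict_total_order(numbers, strictly_divides))
print(is_strict_total_order(numbers, less_than))
```

Strict divisibility is a partial but not a total order on this set because 4 and 6 are incomparable, much like two concurrent events.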
Causal/Partial vs
Total ordering
Distributed consensus and atomic
broadcast are equivalent problems!
Both require total order broadcast.
Linearizability
… to make a system appear as if there is
only a single copy of the data.
Linearizability is the C in the CAP theorem.
(In practice there are no CA systems, only CP.)
Linearizability requires total ordering…
Consensus in a synchronous
environment can be resilient to faults.
The FLP result shows that in an asynchronous
setting, even if only one process
might crash, no deterministic distributed
algorithm is guaranteed to solve the consensus
problem.
What happens when
an event gets close to a
black hole’s event
horizon
This is very similar to how a
process fails in a distributed
system: an observer will never be
able to tell whether the process
crashed or will simply take a long
time to respond
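The ambiguity can be shown with a toy timeout-based failure detector. A minimal sketch in Python; `observe` and its parameters are hypothetical names, and a `response_delay` of `None` stands for a crashed process:

```python
def observe(response_delay, timeout):
    """What an observer sees after waiting `timeout` seconds.
    `response_delay` is None for a crashed process."""
    if response_delay is not None and response_delay <= timeout:
        return "reply"
    return "no reply"


crashed = observe(response_delay=None, timeout=5)   # process is gone
slow = observe(response_delay=60, timeout=5)        # process is alive but slow

# Both cases look identical from the outside.
print(crashed, "|", slow, "|", crashed == slow)
```

Raising the timeout only moves the boundary: for any finite timeout there is a slow-but-alive process that the observer will wrongly suspect.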
The 3 lenses of time
● No uniformity of time
● Blurred direction of time
● Limited perception of time
Thank you.