Unbundling the Modern
Streaming Stack
Dunith Dhanushka - 05/10/2022
Navigating the Real-time Analytics Landscape
About Me
twitter.com/dunithd medium.com/event-driven-utopia linkedin.com/in/dunithd/
• I’m Dunith Dhanushka
• Big data solution architect -> DevRel
• Blogs at eventdrivenutopia.com
Background
• This talk is based on a blog post
I published in April 2022.
• It has been updated with a few
new things since then.
• Enjoy!
Goal of the Talk
What Are We Going To Talk About Today?
Introduce you to the things
required to build real-time applications
that harness value from streaming data
The Plan
The Order of Things
1. A refresher on streaming data
2. The classic streaming stack
3. The modern streaming stack
4. Current trends and the future outlook
What is a Streaming Stack?
Streaming Data
What Is a Stream?
A stream is a continuous, never-ending data flow with no beginning or
end. The data is made available incrementally over time, so you can
act upon it without needing to download it first.
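As a rough illustration of incremental availability, the sketch below models a stream as a Python generator. The names `sensor_stream` and `running_total` are hypothetical, and the output is capped at five values only so that the example terminates; a real stream has no such cap.

```python
import itertools
from typing import Iterator


def sensor_stream() -> Iterator[int]:
    """Hypothetical unbounded source: yields one reading at a time, forever."""
    for n in itertools.count(start=1):
        yield n  # each value is produced only when the consumer asks for it


def running_total(stream: Iterator[int]) -> Iterator[int]:
    """Act on each reading as it arrives instead of waiting for the whole stream."""
    total = 0
    for reading in stream:
        total += reading
        yield total


# Take only the first five results so the example terminates.
first_five = list(itertools.islice(running_total(sensor_stream()), 5))
```

The consumer never holds the whole stream in memory; it sees one reading at a time and updates its result as it goes.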
Events
Streams are made of events
A data stream consists of a series of data points ordered in time.
Each data point represents an “event” or a change in the state of the
business.
[Figure: an event source emits events T0 to T4 onto an event stream, ordered along a time axis]
Event First Thinking
Modelling State Changes in Systems
A user with ID 1234 purchased item 567 for $3.99 on 2022/06/12 at Austin, TX
Fact       | Value
User ID    | 1234
Item ID    | 567
Price Paid | $3.99
Date       | 2022/06/12
Place      | Austin, TX
• Events represent facts.
• Events are immutable.
• Events belong to the past.
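One way to picture an event as an immutable fact is a frozen dataclass. This is a minimal sketch built from the purchase example above; the class name `PurchaseEvent` and its field names are assumptions chosen for illustration, not something the talk prescribes.

```python
from dataclasses import dataclass, FrozenInstanceError
from datetime import date
from decimal import Decimal


@dataclass(frozen=True)  # frozen=True makes instances read-only after creation
class PurchaseEvent:
    user_id: int
    item_id: int
    price_paid: Decimal
    purchased_on: date
    place: str


event = PurchaseEvent(
    user_id=1234,
    item_id=567,
    price_paid=Decimal("3.99"),
    purchased_on=date(2022, 6, 12),
    place="Austin, TX",
)

try:
    event.price_paid = Decimal("0.99")  # attempting to rewrite the past
    mutated = True
except FrozenInstanceError:
    mutated = False  # the fact stays as recorded
```

If the price was wrong, the usual event-first answer is to record a new correcting event rather than edit the old one.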
Making Sense of Streaming Data
Events Have A Shelf Life
Act Fast Before You Lose Their Value
Image credit - https://d3i71xaburhd42.cloudfront.net/8cb6c2711afd3e504400ee12d3b582cc06348b08/7-Figure2-1.png
Real-time Analytics
Extracting Value From Events As Soon as They Are Made Available
[Figure: streams of events feed real-time analytics, which yields insights to react on]
What is a Streaming Stack?
A streaming stack is the processes, tools, and technologies
you use to derive insights from unbounded data.
The Classic Streaming Stack
The Beginning
• Real-time analytics dates back decades, in the form of
Complex Event Processing (CEP) and Event Stream Processing (ESP).
• Most of the work was academic, but a few vendors such as Progress
Apama, Esper, Tibco, and Streambase tried to bring it to the mass
market.
Then Came Big Data…
Unbundling the Modern Streaming Stack With Dunith Dhanushka | Current 2022
Lambda Architecture
Promotes A Unified Serving Layer
Image credit - https://www.databricks.com/glossary/lambda-architecture
Why Didn’t It Pick Up?
• Overly complicated technology: Required a specialised skill set in
distributed systems and performance engineering.
• Limited to the JVM: Non-JVM developers had no option other than
to adapt.
• Higher infrastructure footprint: Stream processors tax CPU and RAM
heavily.
• Maintenance overhead: Both the speed and batch layers had to be
maintained.
The Modern Streaming Stack
Modern Streaming Stack
Modern Cloud-native tools
Managed and Serverless platforms
Rich tooling and developer experience
Expressive programming model
The modern streaming stack (MSS) is the classic streaming stack reimagined with
self-service cloud-native tools
providing a simplified yet powerful developer experience
to build real-time analytics applications.
Modern Streaming Stack
[Figure: the modern streaming stack. Components: event producers; streaming data platform; stream processing; serving layer; tiered storage; data API, metadata & governance. Also labelled: data-driven applications, operational systems, real-time analytics.]
The Unbundling
Event Production/Enablement
The Origins of Events
[Figure: language-specific SDK clients producing events into the streaming data platform]
Streaming Data Platform
• Ingest events from sources in a
scalable manner, and store
them durably until they are
processed.
• Based on an immutable,
distributed log file. Events are
appended to the log and
partitioned across multiple
servers for durability and
scalability.
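The append-only, partitioned log described above can be sketched in a few lines of Python. This is a toy in-memory model, not how any particular platform implements it: the class `PartitionedLog`, its methods, and the CRC32-based partitioner are all assumptions made for illustration, and durability concerns such as replication and disk persistence are left out.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


class PartitionedLog:
    """Toy in-memory stand-in for a topic: append-only, split into partitions."""

    def __init__(self, num_partitions: int) -> None:
        self.num_partitions = num_partitions
        self._partitions: Dict[int, List[bytes]] = defaultdict(list)

    def append(self, key: str, value: bytes) -> Tuple[int, int]:
        """Append a record and return its (partition, offset)."""
        # A stable hash keeps every record for a key in the same partition,
        # which preserves per-key ordering.
        partition = zlib.crc32(key.encode("utf-8")) % self.num_partitions
        log = self._partitions[partition]
        log.append(value)  # records are only ever appended, never updated
        return partition, len(log) - 1

    def read(self, partition: int, offset: int) -> List[bytes]:
        """Read every record in a partition from the given offset onwards."""
        return list(self._partitions[partition][offset:])


topic = PartitionedLog(num_partitions=3)
p1, o1 = topic.append("user-1234", b"purchased item 567")
p2, o2 = topic.append("user-1234", b"purchased item 890")
```

Because consumers read by offset and nothing is overwritten, the same records can be re-read later by a different consumer.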
[Figure: event producers writing to multiple topics on the streaming data platform]
Technology Choices
Stream Processors
STREAM PROCESSING
Streaming ETL
• Stream joins for enrichment
• Filtering/routing/transforming streams
• Data integration
• Repartitioning streams (re-keying)
Streaming Analytics
• Stateful aggregations
• Window operations
• Materialising streams, stream-table duality
Event-driven Microservices
• Actors
• Reactive logic execution
• Event-by-event processing, triggering side effects
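To make a few of the operations listed above concrete, here is a minimal sketch that filters a stream, re-keys it, and runs a stateful count over tumbling windows. It is plain Python over a finite list, so it only approximates what a stream processor does; the `Purchase` type, the function name, and the use of seconds for timestamps are assumptions for illustration. A real processor would also emit results incrementally and deal with late or out-of-order events.

```python
from collections import Counter
from typing import Dict, Iterable, NamedTuple, Tuple


class Purchase(NamedTuple):
    timestamp: int  # event time in seconds (hypothetical unit)
    user_id: int
    item_id: int
    price: float


def count_per_item_window(
    events: Iterable[Purchase], window_seconds: int, min_price: float
) -> Dict[Tuple[int, int], int]:
    """Filter, re-key by item, then count purchases per tumbling window."""
    counts: Counter = Counter()
    for event in events:
        if event.price < min_price:
            continue  # filtering: drop events below the threshold
        # Tumbling windows are fixed-size and non-overlapping, so each event
        # falls into exactly one window, identified here by its start time.
        window_start = (event.timestamp // window_seconds) * window_seconds
        # Re-keying: group by item_id rather than the original user_id.
        counts[(event.item_id, window_start)] += 1
    return dict(counts)


purchases = [
    Purchase(timestamp=0, user_id=1, item_id=567, price=3.99),
    Purchase(timestamp=30, user_id=2, item_id=567, price=3.99),
    Purchase(timestamp=59, user_id=3, item_id=890, price=0.50),
    Purchase(timestamp=61, user_id=1, item_id=567, price=3.99),
]
result = count_per_item_window(purchases, window_seconds=60, min_price=1.00)
```

The result maps `(item_id, window_start)` to a count, which is a small materialised table derived from the stream.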
Technology Choices
Serving Layer
[Figure: events flow from an input topic on the event streaming platform through stream processing to an output topic, then into the serving layer via streaming ingestion. Real-time insights are consumed by internal and user-facing analytics, data applications, recommendation, and ad-hoc exploration.]
Serving Layer
Expectations
• Serve queries with sub-second latency to provide a better user experience.
• Support a throughput of hundreds of thousands of queries per second to
serve an Internet-scale user base.
• Ensure data freshness — serve analytics from data ingested a few seconds
ago.
• Run complex OLAP queries, supporting joins, aggregations, and filtering on
large data sets.
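The shape of such a query, and a simple freshness check, can be sketched with Python's built-in `sqlite3` module. SQLite is used here purely as a stand-in so the example runs anywhere; it is not a real-time OLAP database and says nothing about latency or throughput at scale. The `purchases` table, its columns, and the timestamps are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE purchases (user_id INTEGER, item_id INTEGER, "
    "price REAL, place TEXT, ingested_at INTEGER)"
)
conn.executemany(
    "INSERT INTO purchases VALUES (?, ?, ?, ?, ?)",
    [
        (1234, 567, 3.99, "Austin, TX", 100),
        (1235, 567, 3.99, "Austin, TX", 101),
        (1236, 890, 9.50, "Dallas, TX", 102),
    ],
)

# Filtering plus aggregation: purchase count and revenue per item for one city.
revenue_by_item = conn.execute(
    "SELECT item_id, COUNT(*), ROUND(SUM(price), 2) "
    "FROM purchases WHERE place = ? GROUP BY item_id ORDER BY item_id",
    ("Austin, TX",),
).fetchall()

# Freshness: how far the newest ingested row lags behind "now".
now = 105  # hypothetical current time, same unit as ingested_at
(latest,) = conn.execute("SELECT MAX(ingested_at) FROM purchases").fetchone()
freshness_lag = now - latest
```

In a real serving layer the same two questions apply: how quickly the query returns, and how old the newest row it can see is.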
Serving Layer
Technology Choices
Key-value stores,
NoSQL databases Real-time OLAP Databases
Tiered Storage
[Figure: serving layer, streaming data platform, and tiered storage, with labels for new events and older events]
Tiered Storage
Offline Use Cases
• Backfilling
• Hydrating new applications
• Experimentation (ad-hoc querying)
• Archival/regulatory compliance
• Training ML models
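A rough way to picture tiered storage is a log that keeps recent events in a small "hot" tier and moves older ones to a "cold" tier, while readers still see a single continuous log. The sketch below is a toy in-memory model under that assumption; `TieredLog`, `hot_capacity`, and the two Python lists standing in for local disk and object storage are all hypothetical.

```python
from typing import List


class TieredLog:
    """Toy log that keeps recent events 'hot' and moves older ones to a cold tier."""

    def __init__(self, hot_capacity: int) -> None:
        self.hot_capacity = hot_capacity
        self._cold: List[str] = []  # stand-in for cheap object storage
        self._hot: List[str] = []   # stand-in for fast local disk

    def append(self, event: str) -> int:
        """Append an event and return its offset in the logical log."""
        self._hot.append(event)
        # Once the hot tier is full, offload its oldest events to the cold tier.
        while len(self._hot) > self.hot_capacity:
            self._cold.append(self._hot.pop(0))
        return len(self._cold) + len(self._hot) - 1

    def read_from(self, offset: int) -> List[str]:
        """Read from any offset; the caller does not need to know the tier."""
        return (self._cold + self._hot)[offset:]

    def cold_count(self) -> int:
        """Number of events currently held in the cold tier."""
        return len(self._cold)


log = TieredLog(hot_capacity=2)
for i in range(5):
    log.append(f"event-{i}")

# A new application "hydrates" itself by replaying from the very beginning.
replayed = log.read_from(0)
```

Backfilling and hydrating a new application both reduce to reading from an old offset, which is why keeping older events cheaply matters.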
Data APIs, Metadata, and Governance
Analytics must be democratised
and accessible across the board…
Image credits - https://www.datanami.com/2022/01/21/data-meshes-set-to-spread-in-2022/, https://www.confluent.io/blog/how-to-build-a-data-mesh-using-event-streams/
Event Mesh
[Figure: an event mesh. An event catalog, schema registry, streaming API, and GraphQL API sit over the serving layer, stream processor, and event streaming platform, delivering real-time insights to decision makers, data applications, regulatory bodies, and business partners.]
Technology Choices
Standards Schema Registries
Observations &
Future Outlook
Convergence of Stream Processing and Serving Layer
Streaming databases take stateful stream processing to the next level.
• SaaS offerings
• Integrated serving layer
• Write logic with SQL
• Pluggable integrations
• Affordable
• Developer friendly
• Pay-as-you-go
• Fewer components to manage
• Integrated tooling
• Caters to non-JVM developers
• Self-serve
Rise of The Lakehouse Architecture
A lakehouse combines a data warehouse, a data lake, and an event streaming
platform.
High-throughput streaming ingestion
Change Data Capture
Upserts
Transactions
Table formats
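Change data capture and upserts fit together naturally: a stream of change events is applied, in order, to a keyed table. The sketch below shows that idea only; the `(operation, key, row)` event shape, the function `apply_changes`, and the sample rows are made up for illustration and do not correspond to any specific table format or CDC tool.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical change-event shape: (operation, primary key, new row or None).
ChangeEvent = Tuple[str, int, Optional[dict]]


def apply_changes(
    table: Dict[int, dict], changes: Iterable[ChangeEvent]
) -> Dict[int, dict]:
    """Apply change events in order, returning the resulting table state."""
    result = dict(table)  # leave the caller's table untouched
    for op, key, row in changes:
        if op == "upsert":
            if row is None:
                raise ValueError("upsert requires a row")
            result[key] = row  # insert if absent, overwrite if present
        elif op == "delete":
            result.pop(key, None)  # deleting a missing key is a no-op
        else:
            raise ValueError(f"unknown operation: {op}")
    return result


changes = [
    ("upsert", 1, {"name": "Ada", "city": "Austin"}),
    ("upsert", 2, {"name": "Lin", "city": "Dallas"}),
    ("upsert", 1, {"name": "Ada", "city": "Houston"}),  # update to key 1
    ("delete", 2, None),
]
table = apply_changes({}, changes)
```

Replaying the same changes from the start always rebuilds the same table, which is the stream-table relationship in miniature.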
Takeaways
Takeaways
There’s No Silver Bullet
• Start small, build the critical path, and iterate.
• Pick components based on the need and know their limitations.
• Experiment, fail fast, and fail cheap.
• Go for managed services if the team is small and new to streaming
technologies.
• Learn from mistakes, establish processes, and share wisdom!
Book Announcement!
Thank you!
Find me at:
twitter.com/dunithd medium.com/event-driven-utopia linkedin.com/in/dunithd/

