“The State of Streaming”
Presented at: Bengaluru Streams Meetup - 17
June, 2023
A practitioner’s guide to modern data architecture
whoami
● Bengaluru boy.
● Cofounder, handyman @
platformatory.io
● OSS → ArchLinux, Envoy, Apache
Kafka, Kong (amongst others)
● Functional Programming,
Distributed systems, Himalayas,
Karnataka Music
- https://in.linkedin.com/in/
pavankmurthy
- https://grahana.net/
- https://twitter.com/p6
TOC
- Fast data beats slow data
- Some fundamental shifts in data engineering
- The modern data stack
- Hint, it has streaming in between
- A tale of two architectures
- Lambda
- Kappa
- A view of the streaming ecosystem
- Kafka is the CNS
- Data Movement
- Stream processing will converge the
operational and analytical planes
- Streaming databases are where a lot of analytical
and BI loads will move to
- Data Mesh is the new architecture paradigm for a
modern data estate
- The greatest beneficiary will be AI/ML
328.77 M TB/d
120 ZB/y
*Protip: Big data is getting bigger and
faster.
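The two figures on this slide appear to be the same estimate in different units. A quick check, assuming decimal SI units (1 ZB = 10^9 TB) and a 365-day year (both assumptions, not stated on the slide):

```python
# Assumption: decimal SI units, so 1 ZB = 1e9 TB, and a 365-day year.
ZB_PER_YEAR = 120
TB_PER_ZB = 10**9
DAYS_PER_YEAR = 365

# Convert the yearly volume to terabytes, then spread it over the year.
tb_per_day = ZB_PER_YEAR * TB_PER_ZB / DAYS_PER_YEAR
million_tb_per_day = tb_per_day / 1e6

print(f"{million_tb_per_day:.2f} M TB/d")  # prints "328.77 M TB/d"
```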
Fast Data > Slow Data
- MTTI = Mean Time To Insight
- MTTA = Mean Time To (Insight Driven, hopefully useful) Action
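One way to make the two metrics concrete is the minimal sketch below. The `Record` fields and the choice to measure both metrics from the moment the event occurred are illustrative assumptions; the slide does not define a measurement method.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class Record:
    # Hypothetical timestamps, in seconds, for one business event.
    event_ts: float    # when the event happened
    insight_ts: float  # when an insight derived from it became available
    action_ts: float   # when that insight was acted upon


def mtti(records):
    """Mean Time To Insight: average delay from event to insight."""
    return mean(r.insight_ts - r.event_ts for r in records)


def mtta(records):
    """Mean Time To Action: average delay from event to action
    (assumed here to be measured from the event, not from the insight)."""
    return mean(r.action_ts - r.event_ts for r in records)


# Made-up numbers: a nightly batch pipeline vs. a streaming pipeline.
batch = [Record(0.0, 3600.0, 7200.0), Record(0.0, 7200.0, 10800.0)]
streaming = [Record(0.0, 2.0, 5.0), Record(0.0, 4.0, 7.0)]

print(mtti(batch), mtta(batch))
print(mtti(streaming), mtta(streaming))
```

"Fast data > slow data" then reads as: lower MTTI and MTTA for the same events.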
Traditional Data
Architecture * just
can’t keep up with
the explosion of
data
* includes
- Warehouses
- Marts
- Lakes
- Swamps
A few foundational
shifts for the
modern
data-driven
enterprise
1. Absolutely everything leads to the cloud
2. Real-time processing will be relevant in almost all
mission-critical use-cases
3. Best-in-breed platforms beat packaged platforms
4. Data fan-out at scale over point-to-point
connectivity
5. Domain-based architecture is the only way to break
the data monolith
6. A product approach to data is not only useful but
also necessary
McKinsey: How to build a data architecture to drive innovation
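Shift 4 (fan-out over point-to-point) can be illustrated with a small sketch. The `Topic` class below is a toy in-memory stand-in for a broker topic, not a real Kafka API, and the link-counting functions are a simplification that assumes every producer would otherwise talk to every consumer.

```python
from collections import defaultdict


def point_to_point_links(producers, consumers):
    # Every producer integrates directly with every consumer.
    return producers * consumers


def fan_out_links(producers, consumers):
    # Every system integrates once, with the shared log.
    return producers + consumers


class Topic:
    """Toy append-only log with independent per-consumer offsets."""

    def __init__(self):
        self.log = []
        self.offsets = defaultdict(int)

    def publish(self, event):
        self.log.append(event)

    def poll(self, consumer):
        # Each consumer reads from its own offset, so all of them
        # see every event without the producer knowing about them.
        start = self.offsets[consumer]
        self.offsets[consumer] = len(self.log)
        return self.log[start:]


orders = Topic()
orders.publish({"order_id": 1})
orders.publish({"order_id": 2})

print(orders.poll("billing"))    # both events
print(orders.poll("analytics"))  # the same two events, read independently
print(point_to_point_links(5, 8), fan_out_links(5, 8))
```

Adding a new consumer in the fan-out model means one new reader of the log, with no change to any producer.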
Streaming is hard,
but it is worth it
1. Stream as a core primitive across operational
and analytical planes
2. Data Sourcing & Movement
3. Storage
4. Processing
5. Querying
6. Cross-cutting concerns (Security, Observability,
Governance, etc.)
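To give a feel for the "Processing" layer, here is a minimal sketch of a tumbling-window count over a stream. It assumes events arrive as `(timestamp_seconds, key)` pairs in timestamp order; real engines such as Flink also handle late and out-of-order events, which this toy deliberately ignores.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds):
    """Yield (window_start, Counter) for each non-empty, fixed-size window.

    Assumes `events` is an iterable of (timestamp, key) in timestamp order.
    """
    current_start = None
    counts = Counter()
    for ts, key in events:
        # Align the timestamp to the start of its window.
        window_start = ts - (ts % window_seconds)
        if current_start is None:
            current_start = window_start
        if window_start != current_start:
            # The previous window is complete: emit it and start a new one.
            yield current_start, counts
            current_start, counts = window_start, Counter()
        counts[key] += 1
    if current_start is not None:
        # Flush the last open window when the input ends.
        yield current_start, counts


clicks = [(0, "home"), (1, "cart"), (5, "home"), (10, "home"), (12, "home")]
for start, counts in tumbling_window_counts(clicks, window_seconds=10):
    print(start, dict(counts))
```

Because the function is a generator, results are produced as each window closes rather than after the whole input has been read.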
Some unified data infrastructure archetypes emerge: Courtesy A16z
Modern BI
Enterprise Multi-Modal processing
AI/ML
The Stories
20 trillion
events/day
400 billion
events/day
1 trillion+
events/day
- Streaming is hotter than ever
- Apache Kafka: The de facto protocol for
eventing
- Stream Processing Engines have finally come
of age: Apache Flink, Spark Streaming, KSQL,
Materialize, RisingWave and a whole host of
streaming SQL engines
- Lakehouse architectures are open: Apache
Hudi, Iceberg
- Real Time Analytics now comes with a modern
flavour: Apache Pinot, Druid, ClickHouse…
- AI/ML centric ops will increasingly converge
into streaming
A practitioner’s
view and closing
notes
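The idea behind streaming SQL and streaming databases can be sketched as an incrementally maintained view: the result is updated per event instead of being recomputed by a batch query. The class below is a toy illustration with hypothetical field names (`region`, `amount`); it is not how Materialize, RisingWave, or any other engine is implemented.

```python
from collections import defaultdict


class RevenueByRegionView:
    """Toy incrementally maintained view, roughly equivalent to
    SELECT region, SUM(amount), COUNT(*) FROM orders GROUP BY region."""

    def __init__(self):
        self._total = defaultdict(float)
        self._orders = defaultdict(int)

    def apply(self, event):
        # Update only the affected group; no full rescan of history.
        region = event["region"]
        self._total[region] += event["amount"]
        self._orders[region] += 1

    def query(self, region):
        # Reads are cheap lookups against state that is already up to date.
        return {
            "total": self._total.get(region, 0.0),
            "orders": self._orders.get(region, 0),
        }


view = RevenueByRegionView()
for order in [
    {"region": "south", "amount": 10.0},
    {"region": "north", "amount": 4.0},
    {"region": "south", "amount": 5.5},
]:
    view.apply(order)

print(view.query("south"))
```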
