Quix Streams — Kafka Summit 2023 | 1
Quix Streams
Building Real-Time Applications at Scale
Tomas Neubauer
Previously McLaren technical lead
CTO & Co-founder, Quix
Hello, nice to meet you! 👋
Racing background
Roots in real-time data processing in the most extreme, time-critical environment.
● 50,000 channels per car
● 1.5 kHz per channel
● 1,000s of real-time models and simulations
Now raise your hand if you are using…
● Kafka
● Streaming
● Python
Goal
Crash detection
[Diagram labels: Fitness app, phone-data, Crash detection, crashes]
Content
● Architecture
● ML deployment
● Streaming landscape
● How it works
● Demo - Let's build it
ML Deployment
REST API vs Streaming
Architecture
[Diagram labels: phone-data, Websocket gateway, Crash detection, alerts, Websocket gateway, ANALYSIS & TRAINING, Trained model, SageMaker]
ML Deployment with API
[Diagram: API REQUEST → WEB API SERVICE → API RESPONSE]

API REQUEST
gX    gY    gZ    gTotal
0.5   0.3   0.1   0.9

API RESPONSE
gX    gY    gZ    gTotal   Crash
0.5   0.3   0.1   0.9      1
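As a rough illustration of the request and response shapes on this slide, the sketch below scores one request in plain Python. The `score_request` function and the `CRASH_THRESHOLD` value are hypothetical stand-ins for the trained model, which the slides do not describe.

```python
import json

# Hypothetical stand-in for the trained model: flag a crash when the total
# g-force reaches a threshold. The real model is not given in the slides.
CRASH_THRESHOLD = 0.9


def score_request(body: str) -> str:
    """Take a JSON API request and return the JSON API response with a Crash field."""
    row = json.loads(body)
    row["Crash"] = 1 if row["gTotal"] >= CRASH_THRESHOLD else 0
    return json.dumps(row)


# Same values as the request table on the slide.
request = json.dumps({"gX": 0.5, "gY": 0.3, "gZ": 0.1, "gTotal": 0.9})
response = json.loads(score_request(request))
print(response)
```

Every caller has to build the request, wait for the response and handle failures itself, which is the overhead the next slides discuss.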
Issues with REST APIs
REST API vs Streaming
Problems with REST API
● CPU overhead
● Introduces delay
● Requests get lost in case of service downtime or slow performance
[Diagram: API REQUEST (gX, gY, gZ, gTotal) → WEB API SERVICE]
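The lost-requests point can be sketched with a toy service that is switched off while data arrives. This is a simplified model, assuming a client that does not retry; the `Service` class and the `deque` standing in for a topic are illustrative only.

```python
from collections import deque


class Service:
    """Toy scoring service that can be switched off to simulate downtime."""

    def __init__(self):
        self.up = True
        self.handled = []

    def handle(self, row):
        if not self.up:
            raise ConnectionError("service unavailable")
        self.handled.append(row)


def send_via_rest(service, rows):
    """Call the service once per row; a failed call means the row is gone."""
    lost = []
    for row in rows:
        try:
            service.handle(row)
        except ConnectionError:
            lost.append(row)
    return lost


rows = [{"gTotal": 0.2}, {"gTotal": 0.9}, {"gTotal": 0.4}]

# REST: the service is down while the rows arrive, so all three are lost.
rest_service = Service()
rest_service.up = False
lost = send_via_rest(rest_service, rows)

# Streaming: rows wait in a buffer (standing in for a topic) until the service is back.
topic = deque(rows)
stream_service = Service()
stream_service.up = False
# During downtime nothing is consumed and nothing is dropped.
stream_service.up = True
while topic:
    stream_service.handle(topic.popleft())

print(len(lost), len(stream_service.handled))
```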
[Diagram: two API REQUEST boxes (gX, gY, gZ, gTotal) and two WEB API SERVICE boxes]
Stream processing applications
An overview of stream processing approaches
Stream processing applications
When you build stream processing applications with Kafka, there are two options:
1. Build an application that uses the Kafka producer and consumer APIs directly
2. Adopt a full-fledged stream processing framework (Flink, Spark Streaming, Beam, etc.)
Kafka producer and consumer APIs
● Works for simple tasks like one-message-at-a-time processing
● No external dependencies like the JVM
● Gets very complicated when stateful processing is needed, such as calculating aggregations or joining multiple streams
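The gap between the first and last bullet can be sketched without a broker. Below, a list of `(key, value)` pairs stands in for consumed messages; the keys, the `HARD_IMPACT` cutoff and the window size are assumptions made for illustration.

```python
from collections import defaultdict, deque

# Stand-in for messages read from a consumer: (key, g-force value) pairs.
messages = [("bike-1", 0.5), ("bike-2", 1.0), ("bike-1", 1.5), ("bike-1", 2.5)]

# Stateless: each message is handled on its own, so a filter is one line.
HARD_IMPACT = 1.5  # hypothetical cutoff
stateless_out = [(key, value) for key, value in messages if value >= HARD_IMPACT]

# Stateful: a rolling mean over the last WINDOW values per key needs state that
# the application must create, update and (in production) persist itself.
WINDOW = 2
state = defaultdict(lambda: deque(maxlen=WINDOW))

stateful_out = []
for key, value in messages:
    window = state[key]
    window.append(value)
    stateful_out.append((key, sum(window) / len(window)))

print(stateless_out)
print(stateful_out)
```

The stateful half is still short here only because the state lives in memory and is never recovered after a restart.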
Stream processing frameworks
● Full-fledged stream processing frameworks solve stateful, more complex operations
● But this comes at the cost of increased complexity in many dimensions:
○ Java dependency
○ Deployment gets difficult because the code does not run on its own but in a server-side cluster (Flink cluster or Spark cluster)
○ Debugging is difficult
○ Performance optimization is difficult
○ Gets even worse when we combine synchronous and asynchronous architecture in one application
JAR files…
Connecting Flink to Kafka is difficult
SQL looks easy to use but…
UDFs are nasty
● Poor development experience
○ Logs only accessible from server, no debugging possible
● Performance hit caused by interface between JVM and Python
DEBUGGING!!! 🐛🐛🐛
Is there a third way?
● Combine the Kafka API approach with a stream processing library
● Abstraction from the key-value messages of the Kafka API to virtual tables
● Standalone library that runs:
○ Locally for development and debugging
○ In Docker or in Kubernetes for production deployments at scale
Stateful processing with Pub & Sub client libraries
1. Messages in topic
2. Split messages into individual streams
3. Messages converted to tables
4. Messages decomposed into rows
5. Memory state updated from incoming rows/series
6. State persistence
7. State and incoming data are combined into output that is sent to the output topic
Commit offsets
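The steps on this slide can be sketched as a single loop. This is a minimal in-memory model, not a client library: the list `topic`, the dict `state_store` and the `committed` offset are hypothetical stand-ins for a Kafka topic, a persisted state store and the consumer group's committed offset.

```python
import json


def process(topic, state_store, committed, output):
    """Consume from the committed offset, update per-stream state, emit output."""
    for offset in range(committed["offset"], len(topic)):
        key, payload = topic[offset]          # 1-2. message in topic, keyed by stream
        row = json.loads(payload)             # 3-4. decode the message into a row
        state = state_store.setdefault(key, {"count": 0, "max_g": 0.0})
        state["count"] += 1                   # 5. update in-memory state
        state["max_g"] = max(state["max_g"], row["gTotal"])
        # 6. persistence is simulated: state_store outlives this call
        output.append((key, {**row, "max_g": state["max_g"]}))  # 7. state + row -> output
        committed["offset"] = offset + 1      # commit the offset once output is written


topic = [
    ("stream-a", '{"gTotal": 0.2}'),
    ("stream-b", '{"gTotal": 0.4}'),
    ("stream-a", '{"gTotal": 0.9}'),
]
state_store, committed, output = {}, {"offset": 0}, []

process(topic[:2], state_store, committed, output)  # first run sees two messages, then stops
process(topic, state_store, committed, output)      # restart resumes at offset 2

print(committed["offset"], state_store["stream-a"])
```

In a real deployment the state write, the output write and the offset commit are separate operations that can fail independently, which is much of what makes the hand-rolled version hard.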
Quix Streams
1. Messages in topic
2. Messages decomposed as rows available via pandas API
3. Messages processed through a pipeline defined as pandas operations. Output streamed to output topic.
● Automatic state management
● Automatic checkpointing
● Automatic message serialization/deserialization
How it works
Kafka + Kubernetes + Python
Our approach to stream processing
Containers
Containers running in Kubernetes, scaling hand in hand with Kafka for compute scalability.
Kafka
Handle your data reliably and efficiently in memory with Kafka, using Kafka partitions, the replica system and persistence to deliver scalability and robustness.
Python
Python gives you flexibility. It lets you transform data, not just query it. From simple filtering to ML use cases like video processing.
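One way to picture containers scaling together with Kafka partitions is the sketch below. It is an illustration under stated assumptions: `zlib.crc32` stands in for whatever partitioner the client actually uses, the round-robin `assign` function is a simplification of consumer group assignment, and the partition count and rider keys are made up.

```python
import zlib

NUM_PARTITIONS = 4  # assumed partition count


def partition_for(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    """Map a message key to a partition with a stable hash (crc32 as a stand-in)."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def assign(num_partitions: int, num_replicas: int) -> dict:
    """Spread partitions round-robin over application replicas (simplified)."""
    assignment = {replica: [] for replica in range(num_replicas)}
    for partition in range(num_partitions):
        assignment[partition % num_replicas].append(partition)
    return assignment


keys = ["rider-1", "rider-2", "rider-3", "rider-1"]
partitions = [partition_for(key) for key in keys]

# The same key always lands on the same partition, so one replica sees all
# of a rider's data in order; adding replicas spreads partitions across them.
print(partitions)
print(assign(NUM_PARTITIONS, 2))
```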
Processing with streaming
[Diagram: INPUT TOPIC → SUB → APP → PUB → OUTPUT TOPIC]

INPUT TOPIC
gForce X   gForce Y   gForce Z
0.5        0.3        0.1

OUTPUT TOPIC
gForce X   gForce Y   gForce Z   gForce Total   Crash
0.5        0.3        0.1        0.9            1
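The app in the middle of this diagram can be sketched as a generator that reads input rows and yields enriched rows. Two things here are assumptions rather than facts from the slides: the total is computed as the sum of absolute values (which matches 0.5 + 0.3 + 0.1 = 0.9 in the example), and `CRASH_THRESHOLD` replaces the trained model.

```python
from typing import Dict, Iterable, Iterator

CRASH_THRESHOLD = 0.9  # hypothetical; stands in for the trained model


def app(input_topic: Iterable[Dict[str, float]]) -> Iterator[Dict[str, float]]:
    """Read input rows, add gForceTotal and Crash, and yield them downstream."""
    for row in input_topic:
        # Assumed formula: sum of absolute values, rounded to avoid float noise.
        total = round(abs(row["gForceX"]) + abs(row["gForceY"]) + abs(row["gForceZ"]), 3)
        yield {**row, "gForceTotal": total, "Crash": int(total >= CRASH_THRESHOLD)}


# A list stands in for the input topic; the first row matches the slide.
input_topic = [
    {"gForceX": 0.5, "gForceY": 0.3, "gForceZ": 0.1},
    {"gForceX": 0.1, "gForceY": 0.0, "gForceZ": 0.2},
]
output_topic = list(app(input_topic))
for row in output_topic:
    print(row)
```

Unlike the REST version, the app never answers a caller; it only reads from one topic and writes to another.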
Scale
[Diagram: INPUT TOPIC, SUB, PUB and OUTPUT TOPIC with the same input and output rows as the previous slide]
Fault tolerant
[Diagram: INPUT TOPIC, SUB, PUB and OUTPUT TOPIC with the same input and output rows as the previous slide]
Let’s build it!
Demo
GitHub
Try Quix Streams
info@quix.io | www.quix.io
Thank you

Case-Study: Building Real-Time Applications at Scale-Cyclist Crash Detection with Tomas Neubauer
