1C O N F I D E N T I A L
Building Consciousness
on Real Time Events: ksqlDB Recipes
2
Gnanaguru (Guru) Sattanathan | guru@confluent.io | @avoguru
3
காஃப்கா (KAFKA)
4
A company is built on
DATA FLOWS
but all we have is
DATASTORES
5
[Diagram: App (×4), search, Hadoop, DWH, monitoring, security, MQ (×2), cache (×2)]
A bit of a mess…
6
Kafka is a Streaming Platform
[Diagram: KAFKA connecting App (×8), DWH, and Hadoop; labels: request-response, messaging OR stream processing, streaming data pipelines, changelogs]
7
Event Streaming Platform
• Storage
• Pub / Sub
• Processing
8
The log is a simple idea
Messages are added at the end of the log
[Diagram: log ordered from Old to New]
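To make the log idea concrete, here is a minimal in-memory sketch. The `Log` class and its method names are hypothetical, chosen only for illustration; they are not Kafka's API, and a real broker persists and replicates the log rather than holding it in a list.

```python
class Log:
    """Append-only log: records are only ever added at the end."""

    def __init__(self):
        self._records = []

    def append(self, record):
        # New records always go to the end; the returned offset is the
        # record's position in the log and never changes afterwards.
        self._records.append(record)
        return len(self._records) - 1

    def read(self, offset):
        # Readers move forward from any offset, oldest to newest.
        return self._records[offset:]


log = Log()
first = log.append("payment-1")
second = log.append("payment-2")
```

Because existing records are never rewritten, two readers at different offsets can consume the same log independently.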
9
Shard data to get scalability
Messages are sent to different partitions
Partitions live on different machines
[Diagram: Producer (1), Producer (2), and Producer (3) writing to a cluster of machines]
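One common way to spread messages across partitions is to hash the message key. The sketch below assumes a partition count of 3 and uses CRC32 as a stand-in hash; it is not Kafka's own partitioner, and `choose_partition` is a hypothetical helper.

```python
import zlib

NUM_PARTITIONS = 3  # assumed partition count, for illustration only


def choose_partition(key: str, num_partitions: int = NUM_PARTITIONS) -> int:
    # Stand-in hash (CRC32). The same key always maps to the same
    # partition, so per-key ordering is preserved within that partition.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Each partition is its own append-only list here.
partitions = {p: [] for p in range(NUM_PARTITIONS)}
for key, value in [("card-1", "a"), ("card-2", "b"), ("card-1", "c")]:
    partitions[choose_partition(key)].append((key, value))
```

Since each partition can live on a different machine, adding partitions and machines spreads the load without reordering the events of any single key.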
10
Linearly Scalable Architecture
Single topic:
- Many producer machines
- Many consumer machines
- Many broker machines
No bottleneck!
[Diagram: Producers, Consumers]
12
Streaming is the toolset for dealing with events as they move!
13
KSQL
The streaming SQL engine for Apache Kafka®
to write real-time applications in SQL
14
Lowering the bar: KSQL vs. Kafka Streams
Lower the bar to enter the world of streaming
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8;
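For readers new to streaming, the `fraudulent_payments` query can be read as a filter that runs continuously over arriving events. The Python sketch below mimics that behaviour on a small in-memory list; the sample payments and the `id` field are made up for illustration, and this is not how KSQL executes the query internally.

```python
def fraudulent_payments(payments, threshold=0.8):
    # Emit each payment as soon as it is seen, if it passes the filter.
    # A generator is used because a stream has no natural end.
    for payment in payments:
        if payment["fraudProbability"] > threshold:
            yield payment


# Hypothetical sample events.
payments = [
    {"id": 1, "fraudProbability": 0.95},
    {"id": 2, "fraudProbability": 0.10},
    {"id": 3, "fraudProbability": 0.80},
]
flagged = list(fraudulent_payments(payments))
```

Only payment 1 is flagged: the comparison is strict, so a probability of exactly 0.8 does not pass.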
15
KSQL
● You write only SQL. No Java, Python, or other boilerplate to wrap around it!
● Create KSQL user-defined functions in Java when needed.
CREATE STREAM fraudulent_payments AS
SELECT * FROM payments
WHERE fraudProbability > 0.8;
16
All you need is Kafka and KSQL
Without KSQL:
1. Build & package
2. Submit job
[Diagram labels: processing; storage, required for fault-tolerance]
With KSQL:
ksql> SELECT * FROM myStream;
17
Something to remember!
KSQL is a process.*
*But what was announced at #kafkasummit is slightly different
18
KSQL example use cases
● Data exploration
● Data enrichment
● Streaming ETL
● Filter, cleanse, mask
● Real-time monitoring
● Anomaly detection
19
Example: CDC from DB via Kafka to Elastic
Kafka Connect streams data in
KSQL processes table changes in real-time
Kafka Connect streams data out
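Processing table changes amounts to replaying a changelog to rebuild the current state of the table. The sketch below assumes each change is a `(key, value)` pair and that a value of `None` marks a deleted row; both the record shape and the `apply_change` helper are assumptions for illustration, not the format of any particular CDC connector.

```python
def apply_change(table, key, value):
    # Assumed convention: None means the row was deleted.
    if value is None:
        table.pop(key, None)
    else:
        # Insert and update are the same operation: the latest value wins.
        table[key] = value


# Hypothetical change events, in arrival order.
changes = [
    ("u1", {"level": "Gold"}),
    ("u2", {"level": "Silver"}),
    ("u1", {"level": "Platinum"}),
    ("u2", None),
]

table = {}
for key, value in changes:
    apply_change(table, key, value)
```

After the replay the table holds only `u1` at its latest level, which is the state a downstream sink such as a search index would need to reflect.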
20
Example: Retail
Stream of purchases from online and physical stores
Stream of shipments that arrive
KSQL joins the two streams in real-time
21
Example: IoT, Automotive, Connected Cars
Cars send telemetry data via Kafka API
Kafka Connect streams data in
KSQL joins the two streams in real-time
Kafka Streams application to notify customers
22
KSQL for Real-Time Monitoring
● Log data monitoring
● Tracking and alerting
● Syslog data
● Sensor / IoT data
● Application metrics
CREATE STREAM syslog_invalid_users AS
SELECT host, message
FROM syslog
WHERE message LIKE '%Invalid user%';
http://cnfl.io/syslogs-filtering / http://cnfl.io/syslog-alerting
23
KSQL for Anomaly Detection
● Identify patterns or anomalies in real-time data, surfaced in milliseconds
CREATE TABLE possible_fraud AS
SELECT card_number, COUNT(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING COUNT(*) > 3;
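A tumbling window slices time into fixed, non-overlapping buckets, so every event belongs to exactly one window. The sketch below reproduces the counting logic of the `possible_fraud` query on an in-memory list; it assumes each event is a `(card_number, timestamp_ms)` pair, and the sample data is invented. Unlike KSQL, it processes a finished batch rather than updating results as events arrive.

```python
from collections import Counter

WINDOW_MS = 5_000  # 5-second tumbling window, as in the query


def possible_fraud(attempts, window_ms=WINDOW_MS, threshold=3):
    counts = Counter()
    for card_number, timestamp_ms in attempts:
        # Round the timestamp down to the start of its window.
        window_start = timestamp_ms - (timestamp_ms % window_ms)
        counts[(card_number, window_start)] += 1
    # Equivalent of HAVING COUNT(*) > 3.
    return {key: n for key, n in counts.items() if n > threshold}


# Hypothetical authorization attempts.
attempts = [
    ("4111", 1_000), ("4111", 2_000), ("4111", 3_000), ("4111", 4_000),
    ("4111", 6_000),  # falls into the next window
    ("5500", 1_500),
]
flagged = possible_fraud(attempts)
```

Card `4111` is flagged for the window starting at 0 ms with four attempts; its fifth attempt lands in the next window and is counted separately.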
24
KSQL for Streaming ETL
● Joining, filtering, and aggregating streams of event data
CREATE STREAM vip_actions AS
SELECT c.user_id, page, action
FROM clickstream c
LEFT JOIN users u
ON c.user_id = u.user_id
WHERE u.level = 'Platinum';
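The `vip_actions` query enriches each click with the matching row from the `users` table and then filters on it. A rough Python equivalent, assuming `users` is a dictionary keyed by `user_id` and using invented sample rows, looks like this:

```python
def vip_actions(clickstream, users):
    for click in clickstream:
        # Table lookup by key; None when there is no matching user,
        # which corresponds to the unmatched side of the LEFT JOIN.
        user = users.get(click["user_id"])
        # Unmatched clicks have no level, so the filter drops them.
        if user is not None and user["level"] == "Platinum":
            yield {
                "user_id": click["user_id"],
                "page": click["page"],
                "action": click["action"],
            }


# Hypothetical sample data.
users = {"u1": {"level": "Platinum"}, "u2": {"level": "Gold"}}
clickstream = [
    {"user_id": "u1", "page": "/cart", "action": "view"},
    {"user_id": "u2", "page": "/home", "action": "view"},
    {"user_id": "u9", "page": "/home", "action": "click"},
]
vip = list(vip_actions(clickstream, users))
```

Because the filter tests a column from the `users` side, clicks without a matching user never reach the output, so the LEFT JOIN behaves like an inner join in this query.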
25
KSQL for Data Transformation
● Easily make derivations of existing topics
CREATE STREAM pageviews_avro
WITH (PARTITIONS=6,
VALUE_FORMAT='AVRO') AS
SELECT * FROM pageviews_json
PARTITION BY user_id;
26
Updates from Kafka Summit San Francisco 2019
● Connectors to work closely with KSQL
● Lookups made simpler
● Overall, KSQL is a process and also a database
https://bit.ly/33V17X8
27
ksqldb.io
28
https://ksqldb.io/
29
Demo
https://bit.ly/32UsGQ4
https://bit.ly/32I0qAr
https://kafka-tutorials.confluent.io
