1
Now You See Me, Now You Compute
Building Event-Driven Architectures with Apache Kafka®
Michael G. Noll
Technologist, Office of the CTO, Confluent
@miguno
2
Why Event Streaming?
3
The world is changing.
4
The New Business Reality
Past
● Technology was a support function
● Innovation required for growth
● Running the business on yesterday’s data was “good enough”
Today
● Technology is the business
● Innovation required for survival
● Yesterday’s data = failure
Modern, real-time data infrastructure is required.
5
The Rise Of Event Streaming
60% of Fortune 100 Companies Using Apache Kafka
6
Taxis become Software
2 min
7
The world is changing.
Transportation
Then
● Hardware product
● Up-front purchase
● Opaque
● No data
Now
● Hardware, Software, and Global Internet Service
● On-demand
● Real-time visibility
● Built on a foundation of data
8
Transportation
9
This transformation is happening everywhere
10
Banking
11
Retail
12
What enables this transformation?
13
Cloud: Rethink Data Centers
Machine Learning: Rethink Decision Making
Mobile: Rethink User Experience
Event Streaming: Rethink Data
14
Do you see me?
Or: Would you blindly cross the street with traffic information that is 5 minutes old?
15
Transportation
● ETA
● Real-time sensor diagnostics
● Driver-rider match
Banking
● Fraud detection
● Trading and risk systems
● Mobile applications / customer experience
Retail
● Real-time inventory
● Real-time POS reporting
● Personalization
Entertainment
● Real-time recommendations
● Personalized news feed
● In-app purchases
16
This is a fundamental paradigm shift...
Cloud
● Future of the datacenter
● Infrastructure as code
Event Streaming
● Future of data
● Data as continuous stream of events
17
The Event Streaming Paradigm
18
Two Problems in Application Infrastructure
● What’s the state of the world? Solution: Databases
● What’s happening in the world? Solution: Messaging, RPC, ETL, etc.
19
ETL/Data Integration
● High Throughput
● Durable
● Persistent
● Maintains Order
● Batch
● Expensive
● Time Consuming
Messaging
● Fast (Low Latency)
● Difficult to Scale
● No Persistence
● Data Loss
● No Replay
22
ETL/Data Integration (Stored records)
● High Throughput
● Durable
● Persistent
● Maintains Order
● Batch
● Expensive
● Time Consuming
Messaging (Transient Messages)
● Fast (Low Latency)
● Difficult to Scale
● No Persistence
● Data Loss
● No Replay
Event Streaming Paradigm
● High Throughput
● Durable
● Persistent
● Maintains Order
● Fast (Low Latency)
● Replay
23
Event Streaming Paradigm
To rethink data as neither stored records nor transient messages, but instead as a continuously updating Stream of Events
24
An Event records the fact that something happened
● A good was sold
● An invoice was issued
● A payment was made
● A new customer registered
25
A Stream represents history as a sequence of Events
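The two definitions above can be sketched in a few lines of Python. This is only an illustration of the idea, not Kafka's data model: the `Event` fields, the event type names, and the `Stream` class are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Event:
    """An immutable fact: something that happened, and when."""
    kind: str         # hypothetical event type name, e.g. "payment_made"
    subject: str      # hypothetical identifier of what the event is about
    timestamp: float  # seconds since the epoch


class Stream:
    """An append-only, ordered history of events."""

    def __init__(self) -> None:
        self._events: List[Event] = []

    def append(self, event: Event) -> int:
        """Record a new event and return its offset (position in the history)."""
        self._events.append(event)
        return len(self._events) - 1

    def replay(self, from_offset: int = 0) -> List[Event]:
        """Return the history from a given offset; stored events never change."""
        return list(self._events[from_offset:])


stream = Stream()
stream.append(Event("good_sold", "order-1", 1.0))
stream.append(Event("invoice_issued", "order-1", 2.0))
stream.append(Event("payment_made", "order-1", 3.0))
```

Because events are only ever appended, any reader can call `replay()` later and see the same history in the same order.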
26
Events change the way we think
Monolithic Approach
● a database
● a variable
● a singleton
● an RPC
Event-First Approach
● an event
● a stream
● a ‘data’ flow
● a stream processor
[Diagram: Orders, Payments, Customers, Order Validation, Tax, and Email Notification services and a DB, connected by request/response calls versus event streams]
27
An Event Streaming Platform gives you three key functionalities
● Publish & Subscribe to Events
● Store Events
● Process & Analyze Events
29
Event Streaming Platform: Universal Event Pipeline
Connects: Data Stores, Logs, 3rd Party Apps, Custom Apps/Microservices
✓ Real-time but also persistent
✓ Elastic, scalable, reliable
✓ High throughput, low latency
✓ All apps and systems can now speak to each other for a complete view of data
30
Event Streaming Platform: Universal Event Pipeline
Connects: Data Stores, Logs, 3rd Party Apps, Custom Apps/Microservices
Event-Driven Apps, with Historical Context
● Real-Time Inventory
● Real-Time Fraud Detection
● Real-Time Customer 360
● Machine Learning Models
● Real-Time Data Transformation
● ...
✓ Real-time but also persistent
✓ Elastic, scalable, reliable
✓ High throughput, low latency
✓ All apps and systems can now speak to each other for a complete view of data
31
Why Combine Real-time With Historical Context?
Event-Driven App (Location Tracking): “Where is my driver?”
● Only Real-time Events
● Messaging Queues and Event Streaming Platforms can do this
Contextual Event-Driven App (ETA): “When will my driver get here?” (2 min)
● Real-time combined with stored data
● Only Event Streaming Platforms can do this
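A minimal sketch of the difference, assuming a made-up table of historical speeds per road segment: the location event alone says where the driver is, and only the combination with stored data yields an ETA. The segment names, field names, and numbers are invented for illustration.

```python
from statistics import mean
from typing import Dict, List

# Stored (historical) context: past travel speeds per road segment, in km/h.
# Segment names and values are made up for this example.
HISTORICAL_SPEEDS_KMH: Dict[str, List[float]] = {
    "main-street": [28.0, 32.0, 30.0],
    "harbor-bridge": [55.0, 65.0, 60.0],
}


def eta_minutes(location_event: dict, history: Dict[str, List[float]]) -> float:
    """Combine one real-time location event with historical speeds.

    The event alone answers "where is my driver?"; adding the stored
    average speed for the current segment answers "when will my driver
    get here?".
    """
    typical_speed = mean(history[location_event["segment"]])
    hours = location_event["remaining_km"] / typical_speed
    return hours * 60.0


# A real-time event as it might arrive from the driver's phone (hypothetical fields).
event = {"driver": "d-42", "segment": "main-street", "remaining_km": 1.0}
print(round(eta_minutes(event, HISTORICAL_SPEEDS_KMH), 1))  # 2.0
```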
32
The Event Streaming Platform is the Central Nervous System for today’s enterprises
33
How to Build Event Streaming Architectures With Kafka
34
Apache Kafka is a distributed event streaming platform
● Publish & Subscribe to Events
● Store Events
● Process & Analyze Events
35
01 Stream your data in real-time as Events
02 Store your Event Streams
03 Process & Analyze your Event Streams
36
01 Stream your data in real-time as Events
From apps, microservices
● Use a Kafka producer client from your favorite language … and many more
From/to other systems
● Use Kafka Connect plus a Connector for your system … and many more
37
From apps, microservices: producer example
[Diagram: a Python App writes events to Kafka over the network … and more]
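The slide's code did not survive extraction, so here is a hedged sketch of what a Python producer might look like. It assumes a producer object that behaves like `confluent_kafka.Producer` (its `produce` and `flush` methods); the topic name, event fields, and broker address are made up, and `RecordingProducer` is a stand-in so the sketch runs without a broker.

```python
import json
from typing import Any, Callable, Optional


def publish_event(producer: Any, topic: str, key: str, event: dict,
                  on_delivery: Optional[Callable] = None) -> None:
    """Serialize an event as JSON and hand it to a Kafka producer.

    `producer` is expected to behave like confluent_kafka.Producer, i.e. to
    offer produce(topic, key=..., value=..., on_delivery=...) and flush().
    """
    producer.produce(
        topic,
        key=key.encode("utf-8"),
        value=json.dumps(event).encode("utf-8"),
        on_delivery=on_delivery,
    )


class RecordingProducer:
    """Stand-in producer that only records what it was asked to send."""

    def __init__(self) -> None:
        self.sent = []

    def produce(self, topic, key=None, value=None, on_delivery=None):
        self.sent.append((topic, key, value))
        if on_delivery is not None:
            on_delivery(None, (topic, key, value))  # (error, message)

    def flush(self, timeout=None):
        return 0


# With a real cluster you would instead create (assumed broker address):
#   from confluent_kafka import Producer
#   producer = Producer({"bootstrap.servers": "localhost:9092"})
producer = RecordingProducer()
publish_event(producer, "payments", "ab123", {"user": "ab123", "amount": 9.99})
producer.flush()
```

Passing the producer in as a parameter keeps the serialization logic testable without a running cluster.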
38
From/to other systems: Kafka Connect
… and more
Tip: Great option to gradually move workloads to Kafka while keeping production running!
39
Kafka Connect
● Deployed standalone (development) or as a distributed cluster (production)
● Elastic service that works on bare-metal, VMs, containers, Kubernetes, ...
● The individual ‘Connector’ determines delivery guarantees, e.g., exactly-once
40
Single Message Transforms for real-time ETL
Ingress: modify an Event before storing
● Obfuscate sensitive information, e.g. PII
● Add origin of event for lineage tracking
● Remove unnecessary data fields
● … and more
Egress: modify an Event on its way out
● Route high-priority events to faster stores
● Direct events to different Elasticsearch indexes
● Cast data types to match destination
● … and more
Example:
Before: { user: ab123, gender: female, ip: 1.2.3.95 }
After: { user: ab123, ip: 1.2.3.XXX }
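The before/after example can be mimicked in plain Python. This only illustrates what such a transform does to one event; it is not the Kafka Connect SMT API, and the function names are hypothetical.

```python
from typing import Dict


def obfuscate_ip(ip: str) -> str:
    """Mask the last octet of an IPv4 address, e.g. 1.2.3.95 -> 1.2.3.XXX."""
    head, _, _last = ip.rpartition(".")
    return f"{head}.XXX"


def ingress_transform(event: Dict[str, str]) -> Dict[str, str]:
    """Drop an unnecessary field and obfuscate a sensitive one.

    Returns a new dict; the incoming event is left untouched.
    """
    cleaned = {k: v for k, v in event.items() if k != "gender"}
    cleaned["ip"] = obfuscate_ip(cleaned["ip"])
    return cleaned


before = {"user": "ab123", "gender": "female", "ip": "1.2.3.95"}
after = ingress_transform(before)
print(after)  # {'user': 'ab123', 'ip': '1.2.3.XXX'}
```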
41
Where SMTs live (ingress example)
Inside Kafka Connect, events from the Data Source pass through:
● Source Connector: generates events
● SMT1 ... SMTn: transform
● Converter: serializes (10101 01010)
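The ordering matters: transforms run one after another on the event, and serialization happens last. A rough Python sketch of that pipeline follows, with two invented transforms and JSON standing in for the converter; none of these names come from Kafka Connect.

```python
import json
from typing import Callable, Dict, Iterable, List

Transform = Callable[[Dict], Dict]


def add_origin(event: Dict) -> Dict:
    """Hypothetical SMT 1: tag the event with its source for lineage tracking."""
    return {**event, "origin": "orders-db"}


def drop_debug_fields(event: Dict) -> Dict:
    """Hypothetical SMT 2: remove a field the destination does not need."""
    return {k: v for k, v in event.items() if k != "debug"}


def json_converter(event: Dict) -> bytes:
    """Serialize the transformed event to bytes (JSON chosen for illustration)."""
    return json.dumps(event, sort_keys=True).encode("utf-8")


def run_ingress(events: Iterable[Dict], transforms: List[Transform],
                converter: Callable[[Dict], bytes]) -> List[bytes]:
    """Apply SMT1..SMTn in order to each event, then serialize the result."""
    serialized = []
    for event in events:
        for transform in transforms:
            event = transform(event)
        serialized.append(converter(event))
    return serialized


records = run_ingress(
    [{"order": 7, "debug": "trace-1"}],
    [add_origin, drop_debug_fields],
    json_converter,
)
print(records[0])  # b'{"order": 7, "origin": "orders-db"}'
```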
42
Confluent Hub
confluent.io/hub
● Discover Connectors, SMTs, and converters
● Easy installation
● Documentation, support, etc.
43
02 Store your Event Streams
Kafka Cluster: Storage is
● Distributed
● Scalable
● Reliable
● Durable
● Performant
44
Kafka scales from S to XXL
Messages / sec   Topics    Partitions   Brokers
10,000,000       25,000    1,000,000    1,500
250,000          500       25,000       25
1                5         300          3
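Partitions are what let a topic spread across brokers. A small sketch of the idea: events with the same key land in the same partition, so their relative order is preserved. CRC32 is used here only as a stable stand-in hash; Kafka's own default partitioner uses a different hash function, and the keys and values are invented.

```python
import zlib
from collections import defaultdict
from typing import Dict, List, Tuple


def partition_for(key: str, num_partitions: int) -> int:
    """Map a key to a partition.

    CRC32 is a stand-in chosen because it is stable across runs; Kafka's
    default partitioner uses a different hash function.
    """
    return zlib.crc32(key.encode("utf-8")) % num_partitions


def spread(events: List[Tuple[str, str]], num_partitions: int) -> Dict[int, List[str]]:
    """Append each (key, value) event to the partition chosen by its key."""
    partitions: Dict[int, List[str]] = defaultdict(list)
    for key, value in events:
        partitions[partition_for(key, num_partitions)].append(value)
    return dict(partitions)


events = [("alice", "a1"), ("bob", "b1"), ("alice", "a2"), ("alice", "a3")]
layout = spread(events, num_partitions=5)
```

Adding partitions raises parallelism, while ordering is only guaranteed within a single partition.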
45
Store your Events as long as you want
Event Streaming Paradigm
● Highly Scalable
● Durable
● Persistent
● Maintains Order
● Fast (Low Latency)
Example: Kafka = Source of Truth, stores every article since 1851
● Normalized assets (images, articles, bylines, etc.)
● Denormalized into “Content View”
https://www.confluent.io/blog/publishing-apache-kafka-new-york-times/
46
Secure your Event Streams
● Authentication
● Authorization
● Data Confidentiality
47
Achievement Data Unlocked:
All Your Data Now Available as Streams of Events
48
Independent access to Event Streams
Event stream (offsets): 1 2 3 4 5 6 7 8 9
● Producer Alice: Writes
● Consumer Bob: Reads, Offset = 3
● Consumer Dina: Reads, Offset = 7
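A toy model of this slide: one shared log, with each consumer keeping its own read position. This sketch uses zero-based Python list indexes as offsets, and the `EventLog` class is invented for illustration; it is not how Kafka clients are written.

```python
from typing import Dict, List


class EventLog:
    """Toy append-only log with one read position (offset) per consumer."""

    def __init__(self) -> None:
        self._records: List[str] = []
        self._offsets: Dict[str, int] = {}

    def write(self, record: str) -> None:
        """Producer side: append to the end of the log."""
        self._records.append(record)

    def seek(self, consumer: str, offset: int) -> None:
        """Place a consumer at an arbitrary offset, e.g. to replay history."""
        self._offsets[consumer] = offset

    def read(self, consumer: str, max_records: int = 1) -> List[str]:
        """Consumer side: read from this consumer's own offset and advance it.

        Reading never removes records, so other consumers are unaffected.
        """
        start = self._offsets.get(consumer, 0)
        batch = self._records[start:start + max_records]
        self._offsets[consumer] = start + len(batch)
        return batch


log = EventLog()
for n in range(1, 10):  # Alice writes events 1..9
    log.write(f"event-{n}")
log.seek("bob", 3)
log.seek("dina", 7)
print(log.read("bob"), log.read("dina"))  # ['event-4'] ['event-8']
```

Bob and Dina progress at their own pace, and either can seek back to an earlier offset to re-read history.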
49
03 Process & Analyze your Event Streams
● With apps, microservices: Kafka Streams or Kafka consumer clients … and more
● With Streaming SQL: KSQL
● With separate frameworks … and more
50
KSQL
CREATE STREAM fraudulent_payments AS
  SELECT * FROM payments
  WHERE fraudProbability > 0.8;
● You write only SQL
● No Java, Python, or other boilerplate to wrap around it!
● Create KSQL User Defined Functions in Java when needed
● All you need is Kafka
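For readers who think in code rather than SQL, the filter above corresponds roughly to the following Python generator. It is a conceptual equivalent only, not how KSQL executes the statement, and the sample payments are invented.

```python
from typing import Dict, Iterable, Iterator


def fraudulent_payments(payments: Iterable[Dict], threshold: float = 0.8) -> Iterator[Dict]:
    """Yield each payment whose fraudProbability exceeds the threshold.

    Works on any iterable, including an unbounded one, because events are
    filtered one at a time as they arrive.
    """
    for payment in payments:
        if payment["fraudProbability"] > threshold:
            yield payment


payments = [
    {"id": 1, "fraudProbability": 0.10},
    {"id": 2, "fraudProbability": 0.93},
    {"id": 3, "fraudProbability": 0.80},  # not strictly greater than 0.8
]
print([p["id"] for p in fraudulent_payments(payments)])  # [2]
```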
51
Stream Processing with KSQL
Pick your favorite interface
1 UI
2 CLI (ksql>)
3 API (POST /query)
4 Headless
52
Where KSQL lives
[Diagram: a KSQL Cluster, running on VMs, reads from and writes to Kafka over the network]
● Elastic & Scalable
● Fault-tolerant
● Exactly-once
● Kafka security
● Aggregations
● Windowing
● Streams & Tables
53
Stream Processing with KSQL
Process event streams to create new, continuously updated streams or tables
[Diagram: a Streaming Query reads Stream 01, Stream 02, and Stream 03 and maintains a Table]
CREATE TABLE OrderTotals AS SELECT * FROM ... EMIT CHANGES
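The statement on the slide is elided, so the following is only a guess at the kind of aggregation a table named OrderTotals might hold: a running total per region, with every update emitted as a change. The `region` and `amount` fields and the grouping are assumptions made for this sketch.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def order_totals(orders: Iterable[Dict]) -> Iterator[Tuple[str, float]]:
    """Maintain a running total per region and emit every change.

    The `table` dict plays the role of the continuously updated table;
    each yielded (region, total) pair is one change to it.
    """
    table: Dict[str, float] = defaultdict(float)
    for order in orders:
        table[order["region"]] += order["amount"]
        yield order["region"], table[order["region"]]


orders = [
    {"region": "Europe", "amount": 10.0},
    {"region": "Asia", "amount": 5.0},
    {"region": "Europe", "amount": 2.5},
]
print(list(order_totals(orders)))
# [('Europe', 10.0), ('Asia', 5.0), ('Europe', 12.5)]
```

The stream of changes and the table are two views of the same data: replaying the changes rebuilds the table.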
54
Stream Processing with KSQL
Query tables in Kafka from other apps, similar to a relational database
Pull Query:
SELECT * FROM OrderTotals WHERE region = 'Europe'
Upcoming feature (KLIP-8)
55
Stream Processing with KSQL
Query tables in Kafka from other apps, similar to a relational database
● Other Applications (Java, Go, Python, etc.) can directly query tables
● request-response via network (KSQL REST API)
SELECT * FROM OrderTotals WHERE region = 'Europe'
Upcoming feature (KLIP-8)
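From a Python application, such a request might be assembled as below. This sketch only builds the HTTP request and does not send it. The server address is an assumption, and since the slide marks pull queries as an upcoming feature, the exact payload and response format should be checked against the KSQL REST API documentation for your version.

```python
import json
import urllib.request


def build_pull_query_request(server: str, sql: str) -> urllib.request.Request:
    """Build (but do not send) a POST /query request carrying one SQL statement."""
    body = json.dumps({"ksql": sql, "streamsProperties": {}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{server}/query",
        data=body,
        headers={"Content-Type": "application/vnd.ksql.v1+json; charset=utf-8"},
        method="POST",
    )


request = build_pull_query_request(
    "http://localhost:8088",  # assumed server address
    "SELECT * FROM OrderTotals WHERE region = 'Europe';",
)
# To execute it against a running server:
#   with urllib.request.urlopen(request) as response:
#       raw = response.read()
```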
56
KSQL integrates with Kafka Connect
Simplifies event streaming between Kafka and other systems
CREATE SOURCE CONNECTOR my-postgres-jdbc WITH (
  connector.class = "io.confluent.connect.jdbc.JdbcSourceConnector",
  connection.url = "jdbc:postgresql://dbserver:5432/my-db", ...);
[Diagram: KSQL controls Kafka Connect]
Upcoming feature (KLIP-7)
57
KSQL example use case
Creating an event-driven dashboard from a customer database
● customers table: Kafka Connect is streaming change events
● Aggregations are computed in real-time
● Elasticsearch: Results are continuously updating
58
Kafka Streams
● You write standard Java or Scala applications to process your events
● The Kafka Streams library makes these applications: elastic, scalable, fault-tolerant, and more
● All you need is Kafka
59
Writing a Kafka Streams application
Add as dependency to your Java/Scala app
<dependency>
  <groupId>org.apache.kafka</groupId>
  <artifactId>kafka-streams</artifactId>
  <version>2.3.0</version>
</dependency>
60
Where your Kafka Streams apps live
[Diagram: a KStreams Application (App instance 1 ... App instance n), running on VMs, reads from and writes to Kafka over the network]
● Elastic & Scalable
● Fault-tolerant
● Exactly-once
● Kafka security
● Aggregations
● Windowing
● Streams & Tables
61
Stream Processing with Kafka Streams apps
Process event streams to create new, continuously updated streams or tables
● Event-driven apps and services communicate through Kafka: Frontend, Orders, Inventory, Shipping
● New apps can easily be added by tapping into existing event streams: Reporting
62
Stream Processing with Kafka Streams apps
Query your application’s tables and state from other apps
● Other Applications (Java, Go, Python, etc.) can directly query tables
● request-response via network (e.g. REST API)
[Diagram: a Reporting App queries the Table held by App instance 1 ... App instance n and receives a Result]
63
Apache Kafka is a distributed event streaming platform
● Publish & Subscribe to Events
● Store Events
● Process & Analyze Events
64
Where to go from here
for more details on event-driven architectures with Kafka
65
THANK YOU
@miguno
michael@confluent.io
cnfl.io/meetups cnfl.io/blog cnfl.io/slack

Now You See Me, Now You Compute: Building Event-Driven Architectures with Apache Kafka | Strata New York 2019
