1
Concepts and Patterns for
Streaming Services
Perry Krol, Head of Systems Engineering CEMEA
2
Confluent Community
Slack Channel
Over 10,000 Kafkateers are
collaborating every single day on the
Confluent Community Slack channel!
cnfl.io/community-slack
Subscribe to the
Confluent blog
Get frequent updates from key
names in Apache Kafka®
on best
practices, product updates & more!
cnfl.io/read
Welcome to the Mainz
Cloud Native Night Meetup
with Apache Kafka®!
Zoom open at 19:00
19:00 - 19:10
Virtual Cheers and Networking
19:10 - 19:15
Welcome and Intro
20:00 - 20:00
Concepts & Patterns for
Streaming Services with Kafka
Perry Krol
33
Introduction to Apache Kafka®
as
Event-Driven Streaming Platform
44
Apache Kafka® Fundamentals
55
The Truth is in the Log
[Diagram: a Kafka Topic as a Log of Events, each event a key/value (K, V) pair, at offsets 0, 1, 2, 3]
77
Partitions
[Diagram: a topic split into Partition 0, Partition 1, Partition 2 and Partition 3]
• Provides scalable:
- Writes
- Storage
- Consumption
• Ordering is within a partition only
88
Replicas
[Diagram: replicas of Topic A, Partition 0 to Partition 3, spread across Broker 1, Broker 2, Broker 3 and Broker 4; legend: Leader, Follower]
1010
Producers
[Diagram: a Producer writing to a Partitioned Topic with Partition 0, Partition 1, Partition 2 and Partition 3]
1111
Competitive Landscape
• Non event-streaming (TIBCO, IBM MQ, Solace, ...)
• Event-streaming
  - Non-Kafka (Kinesis, Pulsar)
  - Kafka
    - Open Source (Apache Kafka)
    - Commercial Offerings
      - Non-Confluent: AWS MSK, Cloudera, Red Hat, Aiven, ...
      - Confluent
Record Keys & Ordering
Record keys determine the partition when the default Kafka partitioner is used.
Records with the same key therefore land in the same partition, which preserves their relative order.
The default partitioning algorithm is:
partition = hash(key) % numPartitions
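The formula above can be sketched in a few lines of plain Java. This is only an illustration: the class and method names are made up for this example, and `Arrays.hashCode` stands in for the murmur2 hash that Kafka's default partitioner applies to the serialized key, so the partition numbers will not match a real cluster.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper, not part of the Kafka client API.
final class KeyPartitionerSketch {

    // partition = hash(key) % numPartitions, with the sign bit masked off
    // so that a negative hash can never produce a negative partition.
    static int partitionFor(String key, int numPartitions) {
        byte[] keyBytes = key.getBytes(StandardCharsets.UTF_8);
        int hash = Arrays.hashCode(keyBytes); // stand-in for murmur2 (assumption)
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"alice", "bob", "alice"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 4));
        }
    }
}
```

Both "alice" records map to the same partition, which is what keeps per-key ordering intact. The mapping depends on `numPartitions`, so changing the partition count of a topic can move a key to a different partition.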
1212
Consumers
[Diagram: Consumer A reads from a Partitioned Topic with Partition 0, Partition 1, Partition 2 and Partition 3]
1313
Consumers
[Diagram: Consumer A and Consumer B each read from a Partitioned Topic with Partition 0, Partition 1, Partition 2 and Partition 3]
1515
Consumers
[Diagram: Consumers A1, A2, A3 and A4, plus Consumer B, read from a Partitioned Topic with Partition 0, Partition 1, Partition 2 and Partition 3]
• A Client Application
• Reads Messages from Topics
• Horizontally, elastically scalable (if stateless)
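How a group such as A1 to A4 shares the partitions of a topic can be illustrated with a simplified round-robin assignment. This is a sketch under stated assumptions: the class name is hypothetical, and Kafka's real assignors (range, round-robin, sticky and others) are more involved than this loop.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of partition assignment within one consumer group.
final class GroupAssignmentSketch {

    // Hands out partitions 0..numPartitions-1 to the group members in turn.
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        Map<String, List<Integer>> assignment = new LinkedHashMap<>();
        for (String consumer : consumers) {
            assignment.put(consumer, new ArrayList<>());
        }
        for (int partition = 0; partition < numPartitions; partition++) {
            String owner = consumers.get(partition % consumers.size());
            assignment.get(owner).add(partition);
        }
        return assignment;
    }

    public static void main(String[] args) {
        System.out.println(assign(List.of("A1"), 4));
        System.out.println(assign(List.of("A1", "A2", "A3", "A4"), 4));
        System.out.println(assign(List.of("A1", "A2", "A3", "A4", "A5"), 4));
    }
}
```

Each partition has exactly one owner inside a group, so adding members spreads the load only up to the number of partitions; a fifth member on a four-partition topic receives nothing.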
1616
Apache Kafka® Connect
17
Streaming Integration with Kafka® Connect
Kafka® Connect
Kafka® Brokers
Sources Sinks
18
Kafka® Connect Data Pipeline
Source Kafka® Connect Kafka®
Connector Converter Transform(s)
19
Confluent Hub
Online library of pre-packaged and
ready-to-install extensions or add-ons
for Confluent Platform and Apache
Kafka®:
● Connectors
● Transforms
● Converters
Easily install the components that
suit your needs into your local
environment with the Confluent Hub
client command line tool. https://hub.confluent.io
2020
Apache Kafka® KStreams
2121
…
…
Producer
Stream
Processor
Consumer
∑
2222
STREAM
PROCESSING
Create and store
materialized views
Filter
and
join
Act and analyze
in-flight
23
Building blocks for Stream Processing
Core Kafka
Producer Topic Consumer
Kafka Streams
State Stores Change Logs
Processors Operators
Stream Table
Persistence
Compute
Declarative API
ksqlDB
Push Queries Pull Queries
Serverless
Topology
Durable Pub Sub
Transformers
24
Runs
everywhere
Clustering
done for you
Exactly once
processing
Event time
processing
Integrated
database
Joins, windowing,
aggregation
S/M/L/XL/XXL/XXXL
sizes
Things Kafka Streams Does
2525
Confluent KSQL
26
KSQL is the Streaming SQL Engine for Apache Kafka
27
CREATE STREAM vip_actions AS
SELECT userid,
page,
action
FROM clickstream c
LEFT JOIN users u
ON c.userid = u.user_id
WHERE u.level = 'Platinum'
EMIT CHANGES;
Simple SQL syntax for expressing reasoning along and
across data streams.
You can write user-defined functions in Java
Stream Processing with KSQL
28
$ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
Kafka Cluster
Connect API, Stream Processing, Connect API
Stream Processing Analogy
@rmoff / Apache Kafka and KSQL in Action : Let’s Build a Streaming Data Pipeline!
Do you think that’s a table
you are querying?
30
The Stream-Table Duality
Stream (payments)    Table (balance), after each event in time order
Alice + €50          Alice €50
Bob + €18            Alice €50, Bob €18
Alice + €25          Alice €75, Bob €18
Alice – €60          Alice €15, Bob €18
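The duality can be sketched without any Kafka dependency: replaying the payments stream in order and folding each event into a map yields the balance table at every point in time. The class and field names below are assumptions made for this illustration only.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration: a table is the running fold of a stream.
final class StreamTableDualitySketch {

    // One event in the payments stream: a signed amount for an account.
    static final class Payment {
        final String account;
        final int amountEur;

        Payment(String account, int amountEur) {
            this.account = account;
            this.amountEur = amountEur;
        }
    }

    // Applies the events in order and records the table after each one.
    static List<Map<String, Integer>> replay(List<Payment> stream) {
        Map<String, Integer> balance = new LinkedHashMap<>();
        List<Map<String, Integer>> snapshots = new ArrayList<>();
        for (Payment p : stream) {
            balance.merge(p.account, p.amountEur, Integer::sum);
            snapshots.add(new LinkedHashMap<>(balance));
        }
        return snapshots;
    }

    public static void main(String[] args) {
        List<Payment> payments = List.of(
            new Payment("Alice", 50),
            new Payment("Bob", 18),
            new Payment("Alice", 25),
            new Payment("Alice", -60));
        replay(payments).forEach(System.out::println);
    }
}
```

The last snapshot holds Alice at 15 and Bob at 18, matching the final state on the slide. Going the other way, every change to the table can be emitted as an event, which gives back a stream.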
The truth is the log.
The database is a cache
of a subset of the log.
—Pat Helland
Immutability Changes Everything
http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
32
KSQL for Real-Time Monitoring
• Log data monitoring, tracking and alerting
• syslog data
• Sensor / IoT data
CREATE TABLE error_counts AS
SELECT error_code, count(*)
FROM monitoring_stream
WINDOW TUMBLING (SIZE 1 MINUTE)
WHERE type = 'ERROR'
GROUP BY error_code;
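What the tumbling-window query computes can be approximated in plain Java: each ERROR event falls into exactly one one-minute window, determined by rounding its timestamp down to the window start, and counts are kept per window and error code. The class, the `Event` type and its fields are hypothetical; KSQL additionally handles late data, state and output, none of which is modelled here.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of WINDOW TUMBLING (SIZE 1 MINUTE) with a grouped count.
final class TumblingWindowCountSketch {
    static final long WINDOW_MS = 60_000L;

    static final class Event {
        final long timestampMs;
        final String type;
        final String errorCode;

        Event(long timestampMs, String type, String errorCode) {
            this.timestampMs = timestampMs;
            this.type = type;
            this.errorCode = errorCode;
        }
    }

    // Returns windowStart -> (errorCode -> count), considering ERROR events only.
    static Map<Long, Map<String, Long>> countErrors(List<Event> events) {
        Map<Long, Map<String, Long>> counts = new TreeMap<>();
        for (Event e : events) {
            if (!"ERROR".equals(e.type)) {
                continue; // WHERE type = 'ERROR'
            }
            long windowStart = (e.timestampMs / WINDOW_MS) * WINDOW_MS;
            counts.computeIfAbsent(windowStart, w -> new TreeMap<>())
                  .merge(e.errorCode, 1L, Long::sum); // GROUP BY error_code, count(*)
        }
        return counts;
    }

    public static void main(String[] args) {
        List<Event> events = List.of(
            new Event(1_000L, "ERROR", "E42"),
            new Event(59_999L, "ERROR", "E42"),
            new Event(60_000L, "ERROR", "E42"),
            new Event(61_000L, "INFO", "none"),
            new Event(62_000L, "ERROR", "E7"));
        System.out.println(countErrors(events));
    }
}
```

Tumbling windows do not overlap, so the event at 59,999 ms and the event at 60,000 ms are counted in different windows even though they are one millisecond apart.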
33
KSQL for Streaming ETL
CREATE STREAM engine_oil_pressure_readings AS
SELECT r.deviceid, r.reading, r.timestamp,
d.sensor_type, d.uom, d.component
FROM sensor_readings r
LEFT JOIN device_master d
ON r.deviceid = d.id
WHERE d.component = 'Engine'
AND d.sensor_type = 'Oil Pressure'
EMIT CHANGES;
Joining, filtering, and aggregating streams of event data
34
Kafka
Connect
Producer API
Elasticsearch
Kafka
Connect
Streaming ETL with Apache Kafka and KSQL
PostgreSQL
CDC Debezium
35
KSQL is a stream processing technology
As such it is not yet a great fit for:
Ad-hoc queries
● No indexes yet in KSQL
● Kafka often configured to retain
data for only a limited span of
time
BI reports (Tableau etc.)
● No indexes yet in KSQL
● No JDBC
● Most BI tools don’t understand
continuous, streaming results
3636
Event Driven Microservices with Apache Kafka®
Leveraging a Streaming Platform for Microservices
3737
What are microservices?
Microservices are a software development
technique - a variant of the service-oriented
architecture (SOA) architectural style that
structures an application as a collection of
loosely coupled services.
https://en.wikipedia.org/wiki/Microservices
3838
structures an application as a collection of
loosely coupled services.
this is new!
39
Making changes is risky
40
Handling state is hard
Cache?
Embedded?
Route to right instance?
4141
● Scaling is hard
● Handling state is hard
● Sharing, coordinating is hard
● Run a database in each microservice - is hard
What have we learned about microservices?
4242
We actually had some of it right
4343
Immutability
4444
What’s the big idea?
4545
Event Driven Architectures aren’t new…
…but…
the world has changed
4646
Events
Why do you care?
Loose coupling, autonomy, evolvability, scalability, resilience, traceability, replayability
...more importantly...
EVENT-FIRST CHANGES HOW YOU
THINK ABOUT WHAT YOU ARE BUILDING
4747
old world : event-driven architectures
new world: event-streaming
architectures
48
Stream processing
Kafka
Streams
processor
input events
output events
...temporal reasoning...
event-driven microservice
49
Scaling state & querying
Stream
processor
Stream
processor
Stream
processor
Topic: click-stream
Interactive query
CDC events from KTable
CDC Stream
partition
partition
partition
M. Kleppmann, 2015:
Turning the database inside out
50
Scaling state: behind the scenes
Stream
processor
topic: click-stream
microservice events
topic compaction
CDC events from KTable
(partition level)
CDC events
Rebuilding state
RocksDB
partition
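The idea behind this slide, that local state can be rebuilt by replaying a compacted changelog, can be sketched as follows. Everything here is a simplified assumption: an in-memory map stands in for the RocksDB store, a list stands in for the changelog partition, and `compact` keeps tombstones forever, whereas a real broker eventually drops them.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of changelog replay and log compaction.
final class ChangelogRestoreSketch {

    // One changelog record; a null value is a tombstone (delete the key).
    static final class Change {
        final String key;
        final Long value;

        Change(String key, Long value) {
            this.key = key;
            this.value = value;
        }
    }

    // Rebuilds the local store by applying every change in log order.
    static Map<String, Long> restore(List<Change> changelog) {
        Map<String, Long> store = new LinkedHashMap<>();
        for (Change c : changelog) {
            if (c.value == null) {
                store.remove(c.key);
            } else {
                store.put(c.key, c.value);
            }
        }
        return store;
    }

    // Keeps only the latest change per key, preserving the order of survivors.
    static List<Change> compact(List<Change> changelog) {
        Map<String, Integer> lastIndex = new HashMap<>();
        for (int i = 0; i < changelog.size(); i++) {
            lastIndex.put(changelog.get(i).key, i);
        }
        List<Change> compacted = new ArrayList<>();
        for (int i = 0; i < changelog.size(); i++) {
            if (lastIndex.get(changelog.get(i).key) == i) {
                compacted.add(changelog.get(i));
            }
        }
        return compacted;
    }

    public static void main(String[] args) {
        List<Change> log = List.of(
            new Change("alice", 50L), new Change("bob", 18L),
            new Change("alice", 75L), new Change("bob", null),
            new Change("alice", 15L));
        System.out.println(restore(log));
        System.out.println(restore(compact(log)));
    }
}
```

Restoring from the compacted log gives the same store as restoring from the full log, while reading fewer records. That property is what makes rebuilding state after a restart or a move to another instance practical.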
5151
Stream processors are uniquely convergent.
Data + Processing
(sorry dba’s)
5252
All of your data
is
a stream of events
5353
stop...where is my database?
(you said scaling data was hard)
5454
Streams are your persistence model
They are also
your local
database
5555
The atomic unit for tackling complexity
...or microservice or whatever...
Stream
processor
input events
output events
56
It’s pretty powerful
Stream
processor
Stream
processor
Stream
processor
Topic: click-stream
Interactive query
CDC events from KTable
CDC Stream
partition
partition
partition
CQRS
Elastic
5757
Stream processor == Single atomic unit
It does one thing
Like
5858
We think in terms of function
“Bounded Context”
(dataflow - choreography)
5959
Let’s build something….
A simple dataflow series of processors
“Payment processing”
6060
KPay looks like this
https://github.com/confluentinc/demo-scene/tree/master/scalable-payment-processing
6161
Bounded context
“Payments”
1. Payments inflight
2. Account processing [debit/credit]
3. Payments confirmed
62
Payments bounded context
choreography
63
Payments system: bounded context
[1] How much is being processed?
Expressed as:
- Count of payments inflight
- Total $ value processed
[2&3] Update the account balance
Expressed as:
- Debit
- Credit
[4] Confirm successful payment
Expressed as:
- Total volume today
- Total $ amount today
64
Payments system: AccountProcessor
// Aggregate every in-flight payment into a per-key account balance (a KTable backed by accountStore).
accountBalanceKTable = inflight.groupByKey()
    .aggregate(
        AccountBalance::new,
        (key, value, aggregate) -> aggregate.handle(key, value),
        accountStore);

// Advance each payment to its next state, re-key it by payment id, then split the stream by state.
KStream<String, Payment>[] branch = inflight
    .map((KeyValueMapper<String, Payment, KeyValue<String, Payment>>) (key, value) -> {
        if (value.getState() == Payment.State.debit) {
            value.setStateAndId(Payment.State.credit);
        } else if (value.getState() == Payment.State.credit) {
            value.setStateAndId(Payment.State.complete);
        }
        return new KeyValue<>(value.getId(), value);
    })
    .branch(isCreditRecord, isCompleteRecord);

branch[0].to(paymentsInflightTopic);  // credit records go back to the in-flight topic
branch[1].to(paymentsCompleteTopic);  // complete records move on to the complete topic
https://github.com/confluentinc/demo-scene/blob/master/scalable-payment-processing/.../AccountProcessor.java
KTable state
(Kafka Streams)
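The state transitions and routing in the AccountProcessor snippet can be isolated into a small, dependency-free sketch. The class name, the reduced `State` enum and the topic names are assumptions for illustration; the demo repository defines its own `Payment.State` and topic constants.

```java
import java.util.Optional;

// Hypothetical illustration of the map-then-branch step, without Kafka Streams.
final class PaymentRoutingSketch {

    enum State { debit, credit, complete }

    // Mirrors the map step: a processed debit becomes a credit,
    // a processed credit becomes complete, anything else is unchanged.
    static State advance(State state) {
        switch (state) {
            case debit:  return State.credit;
            case credit: return State.complete;
            default:     return state;
        }
    }

    // Mirrors the branch step: the first matching predicate decides the
    // destination; a record that matches no predicate is dropped.
    static Optional<String> route(State state) {
        switch (state) {
            case credit:   return Optional.of("payments.inflight"); // assumed topic name
            case complete: return Optional.of("payments.complete"); // assumed topic name
            default:       return Optional.empty();
        }
    }

    public static void main(String[] args) {
        State afterDebit = advance(State.debit);
        System.out.println(afterDebit + " -> " + route(afterDebit));
        State afterCredit = advance(State.credit);
        System.out.println(afterCredit + " -> " + route(afterCredit));
    }
}
```

Because the credit leg is written back to the in-flight topic, one payment passes through the same processor twice: once as a debit against the sender and once as a credit to the receiver, before it reaches the complete topic.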
65
Payments system: AccountBalance
public AccountBalance handle(String key, Payment value) {
    this.name = value.getId();
    if (value.getState() == Payment.State.debit) {
        this.amount = this.amount.subtract(value.getAmount());
    } else if (value.getState() == Payment.State.credit) {
        this.amount = this.amount.add(value.getAmount());
    } else {
        // report to dead letter queue via exception handler
        throw new RuntimeException("Invalid payment received:" + value);
    }
    this.lastPayment = value;
    return this;
}
https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../model/AccountBalance.java
66
Payments system: event model
https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../io/confluent/kpay/payments
6767
Bounded context
“Payments”
Is it enough?
no
6868
“It’s asynchronous, I don’t trust it”
(some developer, 2018)
6969
We only have one part of the picture
○ What about failures?
○ Upgrades?
○ How fast is it going?
○ What is happening - is it working?
7070
Event-streaming pillars:
1. Business function (payment)
2. Instrumentation plane (trust)
3. Control plane (coordinate)
4. Operational plane (run)
7171
Event-streaming provides
● Evolution
● Decoupling
● Bounded context modelling
● Composition
(because of SoC)
7272
Our mental model: Abstraction as an Art
Chained/Orchestrated
Bounded contexts
Stream processor
Stream
Event
Pillars
Business function Control plane Instrumentation Operations
Bounded context
7373
Key takeaway (state)
Event-streaming-driven microservices are the new atomic unit:
1. Provide simplicity (and time travel)
2. Handle state (via Kafka Streams)
3. Provide a new paradigm: convergent data and logic processing
Stream
processor
7474
Key takeaway (complexity)
● Event-Streaming apps: model as bounded-context dataflows, handle
state & scaling
● Patterns: Build reusable dataflow patterns (instrumentation)
● Composition: Bounded contexts chaining and layering
● Composition: Choreography and Orchestration
7575
Questions?
“Journey to event driven” blog
1. Event-first thinking
2. Programming models
3. Serverless
4. Pillars of event-streaming ms’s
7676
Learn more about
Apache Kafka®
and
Confluent Platform
77
Learn Kafka.
Start building with
Apache Kafka at
Confluent Developer.
developer.confluent.io
7878
https://www.confluent.io/apache-kafka-stream-processing-book-bundle/
79
Confluent are giving new users $50 of free usage per month for their first 3 months
Here’s advice on how to use this promotion to try Confluent Cloud for free!
Sign up for a Confluent Cloud account
Please bear in mind that you will be required to enter credit card information but will not be charged unless you go over the $50 usage in any of the first 3 months or do not cancel your subscription before the end of your promotion.
You won’t be charged if you don’t go over the limit!
Get the benefits of Confluent Cloud, but keep an eye on your account, making sure that you have enough remaining free credits available for the rest of your subscription month!
Cancel before the 3 months end if you don’t want to continue past the promotion
If you fail to cancel within your first three months you will start being charged full price. To cancel, immediately stop all streaming and storing data in Confluent Cloud and email cloud-support@confluent.io
bit.ly/TryConfluentCloud
80
A Confluent community catalyst is
a person who invests relentlessly in
the Apache Kafka® and/or
Confluent communities.
Massive
bragging rights
Access to the
private MVP
Slack channel
Special swag
The recognition
of your peers
Direct interaction
with Apache Kafka
contributors as well as
the Confluent founders
at special events
Free pass for
Kafka Summit SF
Nominate yourself or a peer at
CONFLUENT.IO/NOMINATE
8181
Want to host or speak at
one of our meetups?
Please contact community@confluent.io
and we will make it happen!
82

Concepts and Patterns for Streaming Services with Kafka

  • 1. 1 Concepts and Patterns for Streaming Services Perry Krol, Head of Systems Engineering CEMEA
  • 2. 2 Confluent Community Slack Channel Over 10,000 Kafkateers are collaborating every single day on the Confluent Community Slack channel! cnfl.io/community-slack Subscribe to the Confluent blog Get frequent updates from key names in Apache Kafka® on best practices, product updates & more! cnfl.io/read Welcome to the Mainz Cloud Native Night Meetup with Apache Kafka® ! Zoom open at 19:00 19:00PM - 19:10PM Virtual Cheers and Networking 19:10PM - 19:15PM Welcome and Intro 20:00PM - 20:00PM Concepts & Patterns for Streaming Services with Kafka Perry Krol
  • 3. 33 Introduction to Apache Kafka® as Event-Driven Streaming Platform
  • 5. 55 K V The Truth is in the Log K V K V K V Log of Events Kafka Topic 210 3
  • 7. 77 Partitions … … … … Partition 0 Partition 1 Partition 2 Partition 3 • Provides scalable: - Writes - Storage - Consumption • Ordering is within a partition only
  • 8. 88 Replicas Broker 1 Broker 2 Broker 3 Broker 4 Topic A Partition 0 Topic A Partition 0 Topic A Partition 1 Topic A Partition 0 Topic A Partition 1 Topic A Partition 2 Topic A Partition 3 Topic A Partition 1 Topic A Partition 2 Topic A Partition 2 Leader Follower
  • 9. 99 Replicas Broker 1 Broker 2 Broker 3 Broker 4 Topic A Partition 0 Topic A Partition 0 Topic A Partition 1 Topic A Partition 0 Topic A Partition 1 Topic A Partition 2 Topic A Partition 3 Topic A Partition 1 Topic A Partition 2 Topic A Partition 2
  • 10. 1010 Producers … … … … Partition 0 Partition 1 Partition 2 Partition 3 Partitioned Topic Producer
  • 11. 1111 Competitive Landscape Non event-streaming (TIBCO, IBM MQ, Solace, ...) Event-streaming Non-Kafka (Kinesis, Pulsar) Kafka Open Source (Apache Kafka) Commercial Offerings Non-Confluent AWS MSK Cloudera Red Hat Aiven ... Confluent DRAFT - For Internal Enablement Only 1414 Record Keys & Ordering Record keys determine the partition with the default kafka partitioner Keys are used in the default partitioning algorithm: partition = hash(key) % numPartitions
  • 12. 1212 Consumers … … … … Partition 0 Partition 1 Partition 2 Partition 3 Partitioned Topic Consumer A
  • 13. 1313 Consumers … … … … Partition 0 Partition 1 Partition 2 Partition 3 Partitioned Topic Consumer A Consumer B
  • 14. 1414 Consumers … … … … Partition 0 Partition 1 Partition 2 Partition 3 Partitioned Topic Consumer A1 Consumer B Consumer A2 Consumer A3 Consumer A4
  • 15. 1515 Consumers … … … … Partition 0 Partition 1 Partition 2 Partition 3 Partitioned Topic Consumer A1 Consumer B Consumer A2 Consumer A3 Consumer A4 • A Client Application • Reads Messages from Topics • Horizontally, elastically scalable (if stateless)
  • 17. 17 Streaming Integration with Kafka® Connect Kafka® Connect Kafka® Brokers Sources Sinks
  • 18. 18 Kafka® Connect Data Pipeline Source Kafka® Connect Kafka® Connector ConverterTransform(s)
  • 19. 19 Confluent Hub Online library of pre-packaged and ready-to-install extensions or add-ons for Confluent Platform and Apache Kafka®: ● Connectors ● Transforms ● Converters Easily install the components that suit your needs into your local environment with the Confluent Hub client command line tool . https://hub.confuent.io
  • 22. 2222 STREAM PROCESSING Create and store materialized views Filter and join Act and analyze in-flight
  • 23. 23 Building blocks for Stream Processing Core Kafka Producer Topic Consumer Kafka Streams State Stores Change Logs Processors Operators Stream Table Persistence Compute Declarative API ksqlDB Push Queries Pull Queries Serverless Topology Durable Pub Sub Transformers
  • 24. 24 Runs everywhere Clustering done for you Exactly once processing Event time processing Integrated database Joins, windowing, aggregation S/M/L/XL/XXL/XXXL sizes Things Kafka Streams Does
  • 27. 27 CREATE STREAM vip_actions AS SELECT userid, page, action FROM clickstream c LEFT JOIN users u ON c.userid = u.user_id WHERE u.level = 'Platinum’ EMIT CHANGES; Simple SQL syntax for expressing reasoning along and across data streams. You can write user-defined functions in Java Stream Processing with KSQL
  • 28. 28 $ cat < in.txt | grep “ksql” | tr a-z A-Z > out.txt Kafka Cluster Stream ProcessingConnect API Connect API Stream Processing Analogy
  • 29. @rmoff / Apache Kafka and KSQL in Action : Let’s Build a Streaming Data Pipeline! Do you think that’s a table you are querying?
  • 30. 30 Alice + €50 The Stream-Table Duality Stream (payments) Table (balance) time Alice €50 Bob + €18 Alice €50 Alice €50 Bob €18 Alice + €25 Alice €50 Bob €18 Alice €75 Bob €18 Alice – €60 Alice €75 Bob €18 Alice €15 Bob €18
  • 31. @rmoff / Apache Kafka and KSQL in Action : Let’s Build a Streaming Data Pipeline! The truth is the log. The database is a cache of a subset of the log. —Pat Helland Immutability Changes Everything http://cidrdb.org/cidr2015/Papers/CIDR15_Paper16.pdf
  • 32. 32 KSQL for Real-Time Monitoring 32 • Log data monitoring, tracking and alerting • syslog data • Sensor / IoT data CREATE TABLE error_counts AS SELECT error_code, count(*) FROM monitoring_stream WINDOW TUMBLING (SIZE 1 MINUTE) WHERE type = 'ERROR' GROUP BY error_code;
  • 33. 33 KSQL for Streaming ETL 33 CREATE STREAM engine_oil_pressure_readings AS SELECT r.deviceid, r.reading, r.timestamp d.sensor_type, d.uom, d.component FROM sensor_readings r LEFT JOIN device_master d ON r.deviceid = d.id WHERE d.component = ‘Engine’ AND d.sensor_type = ‘Oil Pressure’ EMIT CHANGES; Joining, filtering, and aggregating streams of event data
  • 34. 34 Kafka Connect Producer API Elasticsearch Kafka Connect Streaming ETL with Apache Kafka and KSQL er PostgreSQL CDC Debezium
  • 35. 35 KSQL is a stream processing technology As such it is not yet a great fit for: Ad-hoc queries ● No indexes yet in KSQL ● Kafka often configured to retain data for only a limited span of time BI reports (Tableau etc.) ● No indexes yet in KSQL ● No JDBC ● Most BI tools don’t understand continuous, streaming results
  • 36. 3636 Event Driven Microservices with Apache Kafka® Leveraging a Streaming Platform for Microservices
  • 37. 3737 What are microservices? Microservices are a software development technique - a variant of the service-oriented architecture (SOA) architectural style that structures an application as a collection of loosely coupled services. https://en.wikipedia.org/wiki/Microservices
  • 38. 3838 structures an application as a collection of loosely coupled services. this is new!
  • 40. 40 Handling state is hard Cache? Embedded? Route to right instance?
  • 41. 4141 ● Scaling is hard ● Handling state is hard ● Sharing, coordinating is hard ● Run a database in each microservice - is hard What have we learned about microservices?
  • 42. 4242 We actually had some of it right
  • 45. 4545 Event Driven Architectures aren’t new… the world has changed …but…
  • 46. 4646 Events Why do you care? Loose coupling, autonomy, evolvability, scalability, resilience, traceability, replayability EVENT-FIRST CHANGES HOW YOU THINK ABOUT WHAT YOU ARE BUILDING ...more importantly...
  • 47. 4747 old world : event-driven architectures new world: event-streaming architectures
  • 48. Stream processing [diagram: input events → Kafka Streams processor (event-driven microservice, ...temporal reasoning...) → output events]
  • 49. Scaling state & querying [diagram labels: topic click-stream; three partitions; three stream processors; interactive query; CDC events from KTable; CDC stream] (M. Kleppmann, 2015: “Turning the database inside out”)
  • 50. Scaling state: behind the scenes [diagram labels: stream processor; topic click-stream; partition; microservice events; RocksDB; CDC events from KTable (partition level); CDC events; topic compaction; rebuilding state]
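The "topic compaction" and "rebuilding state" labels on this slide can be illustrated without a Kafka cluster. The following is a minimal sketch in plain Java, assuming an in-memory list stands in for one changelog partition and a map stands in for the local RocksDB store. `ChangelogRecord` and `StateRebuildSketch` are hypothetical names for illustration, not Kafka Streams APIs, and a `null` value is used here to model a delete (tombstone).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for one record of a changelog partition.
// A null value models a tombstone (the key was deleted).
final class ChangelogRecord {
    final String key;
    final Long value;

    ChangelogRecord(String key, Long value) {
        this.key = key;
        this.value = value;
    }
}

class StateRebuildSketch {

    // Replay records in offset order; the last write per key wins.
    static Map<String, Long> rebuild(List<ChangelogRecord> changelog) {
        Map<String, Long> store = new LinkedHashMap<>();
        for (ChangelogRecord r : changelog) {
            if (r.value == null) {
                store.remove(r.key);
            } else {
                store.put(r.key, r.value);
            }
        }
        return store;
    }

    // Simplified model of compaction: keep only the latest record per key.
    static List<ChangelogRecord> compact(List<ChangelogRecord> changelog) {
        Map<String, ChangelogRecord> latest = new LinkedHashMap<>();
        for (ChangelogRecord r : changelog) {
            latest.remove(r.key); // re-insert so ordering follows the latest offset
            latest.put(r.key, r);
        }
        return new ArrayList<>(latest.values());
    }

    public static void main(String[] args) {
        List<ChangelogRecord> log = Arrays.asList(
                new ChangelogRecord("alice", 1L),
                new ChangelogRecord("bob", 5L),
                new ChangelogRecord("alice", 3L),
                new ChangelogRecord("bob", null),
                new ChangelogRecord("carol", 7L));
        // Both replays yield the same store contents.
        System.out.println(rebuild(log));
        System.out.println(rebuild(compact(log)));
    }
}
```

The point of the sketch is that replaying the compacted log gives the same store as replaying the full log, which is why a restarted processor only needs the compacted changelog to recover its partition of the state.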
  • 51. Stream processors are uniquely convergent: Data + Processing (sorry, DBAs)
  • 52. All of your data is a stream of events
  • 53. Stop... where is my database? (You said scaling data was hard)
  • 54. Streams are your persistence model. They are also your local database.
  • 55. The atomic unit for tackling complexity: the stream processor (...or microservice or whatever...) [diagram: input events → stream processor → output events]
  • 56. It’s pretty powerful [diagram labels: topic click-stream; three partitions; three stream processors; interactive query; CDC events from KTable; CDC stream; CQRS; Elastic]
  • 57. Stream processor == single atomic unit. It does one thing. Like
  • 58. We think in terms of function: “Bounded Context” (dataflow - choreography)
  • 59. Let’s build something… “Payment processing”: a simple dataflow series of processors
  • 60. KPay looks like this: https://github.com/confluentinc/demo-scene/tree/master/scalable-payment-processing
  • 61. Bounded context “Payments”: 1. Payments inflight 2. Account processing [debit/credit] 3. Payments confirmed
  • 63. Payments system: bounded context. [1] How much is being processed? Expressed as: count of payments inflight; total $ value processed. [2&3] Update the account balance. Expressed as: debit; credit. [4] Confirm successful payment. Expressed as: total volume today; total $ amount today.
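Step [1] on this slide ("count of payments inflight" and "total $ value processed") amounts to a small running aggregate. The sketch below shows one way such an aggregate could look in plain Java. It assumes a payment is counted when it enters the flow and released when it completes; `InflightStatsSketch` and its `State` values are hypothetical and are not taken from the KPay sources.

```java
import java.math.BigDecimal;

// Hypothetical running aggregate for "how much is being processed?".
class InflightStatsSketch {

    // Assumed states for illustration only.
    enum State { INCOMING, COMPLETE }

    private int count;
    private BigDecimal total = BigDecimal.ZERO;

    // Fold one payment event into the aggregate and return it,
    // mirroring the (key, value, aggregate) -> aggregate shape of an aggregator.
    InflightStatsSketch update(State state, BigDecimal amount) {
        if (state == State.INCOMING) {
            count++;
            total = total.add(amount);
        } else if (state == State.COMPLETE) {
            count--;
            total = total.subtract(amount);
        }
        return this;
    }

    int getCount() {
        return count;
    }

    BigDecimal getTotal() {
        return total;
    }

    public static void main(String[] args) {
        InflightStatsSketch stats = new InflightStatsSketch()
                .update(State.INCOMING, new BigDecimal("10.00"))
                .update(State.INCOMING, new BigDecimal("5.50"))
                .update(State.COMPLETE, new BigDecimal("10.00"));
        System.out.println(stats.getCount() + " inflight, total " + stats.getTotal());
    }
}
```

In a Kafka Streams topology an object like this would typically be the aggregate type of a grouped, possibly windowed, aggregation; the sketch only shows the arithmetic.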
  • 64. Payments system: AccountProcessor. KTable state (Kafka Streams):
      // Fold every inflight payment into a per-key AccountBalance, materialized in accountStore.
      accountBalanceKTable = inflight.groupByKey()
          .aggregate(
              AccountBalance::new,
              (key, value, aggregate) -> aggregate.handle(key, value),
              accountStore);

      // Advance each payment to its next state, then re-key it by value.getId().
      KStream<String, Payment>[] branch = inflight
          .map((KeyValueMapper<String, Payment, KeyValue<String, Payment>>) (key, value) -> {
              if (value.getState() == Payment.State.debit) {
                  value.setStateAndId(Payment.State.credit);
              } else if (value.getState() == Payment.State.credit) {
                  value.setStateAndId(Payment.State.complete);
              }
              return new KeyValue<>(value.getId(), value);
          })
          .branch(isCreditRecord, isCompleteRecord);

      // branch[0] matches isCreditRecord, branch[1] matches isCompleteRecord.
      branch[0].to(paymentsInflightTopic);
      branch[1].to(paymentsCompleteTopic);
    Source: https://github.com/confluentinc/demo-scene/blob/master/scalable-payment-processing/.../AccountProcessor.java
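The state progression and routing in the AccountProcessor excerpt can be read in isolation as a two-step state machine. The sketch below restates it in plain Java, with no Kafka dependency. It assumes that `isCreditRecord` and `isCompleteRecord` test the payment state; the topic names, the `INCOMING` state, and the class name `PaymentRoutingSketch` are hypothetical. Returning `null` from `route` models the behavior of `KStream#branch`, which drops records that match none of the predicates.

```java
// Hypothetical plain-Java restatement of the map + branch step.
class PaymentRoutingSketch {

    // DEBIT, CREDIT and COMPLETE mirror the states in the excerpt;
    // INCOMING is an assumed extra state to show the "no match" case.
    enum State { INCOMING, DEBIT, CREDIT, COMPLETE }

    // Placeholder topic names, not taken from the KPay sources.
    static final String INFLIGHT_TOPIC = "payments.inflight";
    static final String COMPLETE_TOPIC = "payments.complete";

    // The map step: debit -> credit, credit -> complete, anything else unchanged.
    static State next(State current) {
        switch (current) {
            case DEBIT:
                return State.CREDIT;
            case CREDIT:
                return State.COMPLETE;
            default:
                return current;
        }
    }

    // The branch step: pick the output topic by state, or null if no predicate matches.
    static String route(State state) {
        if (state == State.CREDIT) {
            return INFLIGHT_TOPIC;
        }
        if (state == State.COMPLETE) {
            return COMPLETE_TOPIC;
        }
        return null;
    }

    public static void main(String[] args) {
        for (State s : State.values()) {
            State advanced = next(s);
            System.out.println(s + " -> " + advanced + " -> " + route(advanced));
        }
    }
}
```

Under these assumptions, a debit is written back to the inflight topic as a credit, so the same processor sees it a second time and then emits it as complete.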
  • 65. Payments system: AccountBalance
      public AccountBalance handle(String key, Payment value) {
          this.name = value.getId();
          if (value.getState() == Payment.State.debit) {
              // A debit reduces the balance.
              this.amount = this.amount.subtract(value.getAmount());
          } else if (value.getState() == Payment.State.credit) {
              // A credit increases the balance.
              this.amount = this.amount.add(value.getAmount());
          } else {
              // report to dead letter queue via exception handler
              throw new RuntimeException("Invalid payment received:" + value);
          }
          this.lastPayment = value;
          return this;
      }
    Source: https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../model/AccountBalance.java
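The comment "report to dead letter queue via exception handler" in the `handle` method describes a pattern that is easy to try out in isolation. The following is a minimal sketch in plain Java, assuming a list stands in for the dead letter queue and a try/catch stands in for the exception handler. `DeadLetterSketch`, its nested types, and the `INCOMING` state are hypothetical; this is not how the KPay demo or Kafka Streams wires up its handlers.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-alone model of "apply payments, park the invalid ones".
class DeadLetterSketch {

    enum State { INCOMING, DEBIT, CREDIT }

    static final class Payment {
        final String id;
        final State state;
        final BigDecimal amount;

        Payment(String id, State state, BigDecimal amount) {
            this.id = id;
            this.state = state;
            this.amount = amount;
        }
    }

    static final class Balance {
        BigDecimal amount = BigDecimal.ZERO;

        // Same shape as the slide's handle(): debit subtracts, credit adds, anything else is rejected.
        Balance handle(Payment p) {
            if (p.state == State.DEBIT) {
                amount = amount.subtract(p.amount);
            } else if (p.state == State.CREDIT) {
                amount = amount.add(p.amount);
            } else {
                throw new IllegalStateException("Invalid payment received: " + p.id);
            }
            return this;
        }
    }

    // Apply every payment; rejected payments go to deadLetters instead of stopping the run.
    static Balance applyAll(List<Payment> payments, List<Payment> deadLetters) {
        Balance balance = new Balance();
        for (Payment p : payments) {
            try {
                balance.handle(p);
            } catch (IllegalStateException e) {
                deadLetters.add(p);
            }
        }
        return balance;
    }

    public static void main(String[] args) {
        List<Payment> deadLetters = new ArrayList<>();
        Balance balance = applyAll(Arrays.asList(
                new Payment("p1", State.CREDIT, new BigDecimal("100")),
                new Payment("p2", State.DEBIT, new BigDecimal("30")),
                new Payment("p3", State.INCOMING, new BigDecimal("5"))), deadLetters);
        System.out.println("balance=" + balance.amount + ", dead letters=" + deadLetters.size());
    }
}
```

The design choice illustrated is that a bad record is set aside for inspection while valid records keep flowing, so one malformed payment does not block the account's stream.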
  • 66. Payments system: event model. https://github.com/confluentinc/demo-scene/.../scalable-payment-processing/.../io/confluent/kpay/payments
  • 68. “It’s asynchronous, I don’t trust it” (some developer, 2018)
  • 69. We only have one part of the picture ○ What about failures? ○ Upgrades? ○ How fast is it going? ○ What is happening - is it working?
  • 70. Event-streaming pillars: 1. Business function (payment) 2. Instrumentation plane (trust) 3. Control plane (coordinate) 4. Operational plane (run)
  • 71. Event-streaming provides ● Evolution ● Decoupling ● Bounded context modelling ● Composition (because of separation of concerns, SoC)
  • 72. Our mental model: Abstraction as an Art [diagram labels: Stream; Stream processor; Bounded context; Chained/Orchestrated bounded contexts; Event pillars: Business function, Control plane, Instrumentation, Operations]
  • 73. Key takeaway (state): Event-streaming-driven microservices (stream processors) are the new atomic unit: 1. Provide simplicity (and time travel) 2. Handle state (via Kafka Streams) 3. Provide a new paradigm: convergent data and logic processing
  • 74. Key takeaway (complexity) ● Event-streaming apps: model as bounded-context dataflows, handle state & scaling ● Patterns: build reusable dataflow patterns (instrumentation) ● Composition: bounded contexts chaining and layering ● Composition: choreography and orchestration
  • 75. Questions? “Journey to event driven” blog: 1. Event-first thinking 2. Programming models 3. Serverless 4. Pillars of event-streaming microservices
  • 76. Learn more about Apache Kafka® and Confluent Platform
  • 77. Learn Kafka. Start building with Apache Kafka at Confluent Developer: developer.confluent.io
  • 79. Confluent is giving new users $50 of free usage per month for their first 3 months. Sign up for a Confluent Cloud account at bit.ly/TryConfluentCloud. Please bear in mind that you will be required to enter credit card information, but you will not be charged unless you go over the $50 usage in any of the first 3 months or you don’t cancel your subscription before the end of your promotion. Advice on how to use this promotion to try Confluent Cloud for free: (1) You won’t be charged if you don’t go over the limit: get the benefits of Confluent Cloud, but keep an eye on your account, making sure that you have enough free credits remaining for the rest of your subscription month. (2) Cancel before the 3 months end if you don’t want to continue past the promotion: if you fail to cancel within your first three months, you will start being charged full price. To cancel, immediately stop all streaming and storing of data in Confluent Cloud and email cloud-support@confluent.io.
  • 80. A Confluent Community Catalyst is a person who invests relentlessly in the Apache Kafka® and/or Confluent communities. Benefits: massive bragging rights; access to the private MVP Slack channel; special swag; the recognition of your peers; direct interaction with Apache Kafka contributors as well as the Confluent founders at special events; free pass for Kafka Summit SF. Nominate yourself or a peer at CONFLUENT.IO/NOMINATE
  • 81. Want to host or speak at one of our meetups? Please contact community@confluent.io and we will make it happen!