Kafka, Brokers, Streaming
IWOMM
London, 25/04/2018
@eschmiegelow
Agenda
I. Kafka - Brokers
II. Kafka Streaming
III. Demo
I.
Kafka Message Broker
I. Kafka – message broking at scale
Kafka – what is it?
A publish-subscribe messaging system built as a distributed
commit log
Some facts:
- http://kafka.apache.org
- Developed at LinkedIn and open sourced in 2011
- Top Level project since 2012 and backed by Confluent
- Users: LinkedIn (obviously), Yahoo, Netflix, Spotify and
many others
I. Kafka – overview
Kafka is persistent and distributed, offers replication and
runs on a cluster of brokers coordinated by ZooKeeper.
Publishers send messages to the brokers. The brokers
persist messages to disk in a log-based queue. Each
message is indexed by its offset.
Consumers request messages using an offset/length-based
API.
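To make the offset/length idea concrete, here is a minimal in-memory sketch of a log. The class `CommitLog` and its methods `append` and `read` are hypothetical names chosen for illustration. They are not the Kafka client API, and a real broker persists to disk rather than to a Python list.

```python
class CommitLog:
    """Toy stand-in for one broker-side log (not the Kafka API)."""

    def __init__(self):
        self._records = []  # position in this list doubles as the offset

    def append(self, message):
        """Append a message and return the offset it was stored at."""
        self._records.append(message)
        return len(self._records) - 1

    def read(self, offset, max_records):
        """Return up to max_records (offset, message) pairs from offset on."""
        chunk = self._records[offset:offset + max_records]
        return [(offset + i, msg) for i, msg in enumerate(chunk)]


log = CommitLog()
for text in ["a", "b", "c", "d"]:
    log.append(text)

# A consumer asks for "two records starting at offset 1".
batch = log.read(offset=1, max_records=2)
```

Reading never removes anything from the log, so any number of readers can ask for the same range.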
I. Kafka - Architecture
• Everything is distributed:
• Producers
• Consumers
• Brokers
• Queues
• Messages are persisted and expire after a configurable retention period (TTL)
• Queues are append-only
• Consumers maintain their own state
• Throughput through partitioning, resilience through replication
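"Consumers maintain their own state" can be sketched as follows. The `Consumer` class and its `poll` method are hypothetical and only mimic the idea; the shared log is a plain list standing in for a partition.

```python
class Consumer:
    """Hypothetical consumer that tracks its own read position."""

    def __init__(self, log):
        self.log = log    # shared, append-only list of messages
        self.offset = 0   # state owned by the consumer, not by the log

    def poll(self, max_records=2):
        """Return the next records and advance this consumer's offset."""
        records = self.log[self.offset:self.offset + max_records]
        self.offset += len(records)
        return records


shared_log = ["m0", "m1", "m2", "m3"]
fast = Consumer(shared_log)
slow = Consumer(shared_log)

fast.poll()
fast.poll()                               # fast has now read everything
first_for_slow = slow.poll(max_records=1)  # slow still starts at offset 0
```

Because the position lives with each consumer, a slow reader does not hold back a fast one, and the log itself is never modified by reads.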
I. Kafka Message Brokering
I. Kafka Topics
Topics are queues organized in partitions. A topic can
have multiple partitions, and each partition is replicated
to multiple brokers.
I. Kafka Message Routing
Messages are keyed; the hash of the key is used to route
each message to a partition
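A small sketch of key-based routing. Kafka's default partitioner hashes the key bytes with murmur2; the sketch below swaps in `zlib.crc32` purely as a stable stand-in, and `partition_for` is a hypothetical helper, not a client call.

```python
import zlib


def partition_for(key: bytes, num_partitions: int) -> int:
    """Map a key to a partition (crc32 is an assumption, not Kafka's hash)."""
    return zlib.crc32(key) % num_partitions


NUM_PARTITIONS = 3
partitions = {p: [] for p in range(NUM_PARTITIONS)}

events = [(b"user-1", "login"), (b"user-2", "login"), (b"user-1", "logout")]
for key, value in events:
    partitions[partition_for(key, NUM_PARTITIONS)].append((key, value))
```

All records with the same key land in the same partition, which is what keeps per-key ordering intact.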
I. Kafka Compacted Topics
Compacted topics are specialised topics that retain at least
the latest value for each key; older values are eventually removed
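The effect of compaction can be sketched like this. The function `compact` is a hypothetical illustration of the outcome only; it ignores delete markers and the fact that the real cleaner works in the background on log segments.

```python
def compact(log):
    """Keep only the most recent record per key, preserving log order."""
    last_offset = {}
    for offset, (key, _value) in enumerate(log):
        last_offset[key] = offset  # later records overwrite earlier ones
    return [rec for offset, rec in enumerate(log)
            if last_offset[rec[0]] == offset]


log = [("k1", "v1"), ("k2", "v1"), ("k1", "v2"), ("k3", "v1"), ("k2", "v2")]
compacted = compact(log)
# compacted == [("k1", "v2"), ("k3", "v1"), ("k2", "v2")]
```

A consumer that replays the compacted log still ends up with the current value for every key, just with less data to read.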
I. Kafka Consumers
Two different APIs:
• Low Level (Simple Consumer) with fine grained
control over message offset and commits
• Consumer Group API, which allows group
coordinated consumption
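Group-coordinated consumption boils down to splitting a topic's partitions among the members of a group. The round-robin function below is a simplified, hypothetical sketch of that idea and not one of Kafka's actual assignor implementations.

```python
def assign(partitions, members):
    """Spread partitions over group members in round-robin fashion."""
    ordered = sorted(members)
    assignment = {m: [] for m in ordered}
    for i, partition in enumerate(sorted(partitions)):
        assignment[ordered[i % len(ordered)]].append(partition)
    return assignment


# Two consumers share four partitions...
before = assign(range(4), ["c1", "c2"])
# ...and the work is redistributed when a third one joins the group.
after = assign(range(4), ["c1", "c2", "c3"])
```

Each partition is owned by exactly one member at a time, so adding members beyond the partition count would leave some of them idle.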
I. Use Cases
Some typical use cases:
• Metrics − Kafka is often used for operational monitoring data.
This involves aggregating statistics from distributed applications to
produce centralized feeds of operational data.
• Event Sourcing - Kafka is an excellent backend for events thanks
to its preservation of order within a partition, support for multiple
independent consumers and exactly-once semantics.
• Stream Processing − Popular frameworks such as Akka Streams
and Spark Streaming read data from a topic, process it, and write
the processed data to a new topic where it becomes available to users
and applications.
II.
Kafka Streams
II. Kafka Streams - Overview
Kafka Streams – what is it?
Kafka Streams is the lightweight stream processing library that
ships with Apache Kafka and is backed by Confluent. It:
● Is an add-on library for any application
● Is built on the simple concept of pipes and sinks with
functional transformations in between
● Runs within your application, e.g. inside a Docker container
● Is built for the JVM, i.e. Java, Scala and Kotlin
II. Kafka Streams - Benefits
Why another stream processing framework?
● Kafka Streams is not a replacement for Flink, Spark and
others
● It’s super lightweight and has almost no dependencies
other than Kafka
● Offers minimal latency, one record at a time
processing
➔ It’s the ideal conversion, enrichment, and
transformation toolkit for topic to topic processing
II. Kafka Streams - Apps
Kafka Streams components live in your apps
II. Kafka Streams - Apps
… but communicate with the Kafka cluster and coordinate
their work in managed threads.
Internally, a Streams app has a topology of processors,
each of which is responsible for a step in the flow, be it
consuming from a source, transforming a record or
writing to a sink.
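The source, processor and sink chain can be sketched in a few lines. The `Topology` class here is a hypothetical toy written for this illustration; it is not the Kafka Streams `Topology` API, and the topics are plain lists.

```python
class Topology:
    """Toy pipeline: source records flow through processors into a sink."""

    def __init__(self, source):
        self.source = source  # iterable of (key, value) records
        self.steps = []       # processors, applied in order
        self.sink = []        # stand-in for the output topic

    def add_processor(self, fn):
        self.steps.append(fn)
        return self           # allow chaining

    def run(self):
        for record in self.source:
            for step in self.steps:
                record = step(record)
                if record is None:   # a processor may drop a record
                    break
            else:
                self.sink.append(record)
        return self.sink


source_topic = [("a", " hello "), ("b", ""), ("c", "World")]
topology = (
    Topology(source_topic)
    .add_processor(lambda kv: (kv[0], kv[1].strip()))   # clean up
    .add_processor(lambda kv: kv if kv[1] else None)    # drop empty values
    .add_processor(lambda kv: (kv[0], kv[1].upper()))   # transform
)
sink_topic = topology.run()
```

Each record travels through the whole chain before the next one is read, which mirrors the one-record-at-a-time processing mentioned earlier.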
II. Kafka Streams - Apps
Distribution and parallelism: Consumers and threads
II. Kafka Streams - API Concepts
Kafka Streams has two core concepts: streams and tables
● Streams are simply a sequence of records and
correspond to a non-compacted topic in Kafka
● Tables represent a key-centric view on topics and are
usually backed by compacted topics
In Kafka Streams, these are interchangeable, i.e. a table can be
converted to a stream representation and vice versa
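The stream/table relationship can be illustrated with plain Python data structures. The helpers `stream_to_table` and `table_to_stream` are hypothetical names for this sketch and do not correspond to Kafka Streams methods.

```python
def stream_to_table(stream):
    """Replay a stream of (key, value) records; the last value per key wins."""
    table = {}
    for key, value in stream:
        table[key] = value
    return table


def table_to_stream(table):
    """Emit the current state of a table as a stream of records."""
    return list(table.items())


stream = [("alice", 1), ("bob", 5), ("alice", 3)]
table = stream_to_table(stream)      # {"alice": 3, "bob": 5}
changelog = table_to_stream(table)   # [("alice", 3), ("bob", 5)]
```

Replaying the changelog rebuilds the same table, which is why a compacted topic is a natural backing store for a table.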
II. Kafka Streams - API Concepts
In addition, Kafka Streams allows stateless and stateful
transformations:
● Stateless transformations work record by record
and cover the typical functions such as map, filter,
groupBy etc.
● Stateful transformations require a store and cover
functions over collections of records such as window
operations, reduce, count and aggregate
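The difference between the two kinds of transformation can be sketched without any Kafka dependency. The event tuples, the function names and the 10-second tumbling window below are assumptions made for this example; the dictionaries stand in for a state store.

```python
from collections import defaultdict

events = [  # (timestamp_seconds, key, value)
    (0, "page-a", 1), (4, "page-b", 1), (7, "page-a", 1), (12, "page-a", 1),
]

# Stateless: every record is handled in isolation (here: a filter).
page_a_only = [e for e in events if e[1] == "page-a"]


def count_by_key(records):
    """Stateful: a store carries running counts across records."""
    store = defaultdict(int)
    for _ts, key, _value in records:
        store[key] += 1
    return dict(store)


def windowed_count(records, window_size):
    """Stateful: count per key within fixed, non-overlapping windows."""
    store = defaultdict(int)
    for ts, key, _value in records:
        window_start = (ts // window_size) * window_size
        store[(window_start, key)] += 1
    return dict(store)


totals = count_by_key(events)
per_window = windowed_count(events, window_size=10)
```

The filter needs nothing but the current record, while both counts depend on what has been seen before, which is why they require a store.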
III.
Demo
Twitter:
@eschmiegelow
Github repo:
https://github.com/schmiegelow/iwomm-kafka

Kafka and Kafka Streams Intro at iwomm in London
