Kafka internals
David Gruzman, www.nestlogic.com
What is Kafka?
Why is it so interesting? Is it just "yet another
queue" with better performance?
It is not a queue, although it can be used as one.
Let's look at it as a database / storage
technology.
Data model
● Ordered, partitioned collection of key-values
● Key is optional
● Values are opaque
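As a concrete illustration, here is a minimal sketch of this data model through the modern Java producer client. The broker address and topic name are made up; the point is that a record is a key-value pair, the key may be null, and the value is opaque bytes to Kafka.

```java
import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class DataModelExample {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative address
        props.put("key.serializer",
                  "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer",
                  "org.apache.kafka.common.serialization.ByteArraySerializer");

        try (KafkaProducer<String, byte[]> producer = new KafkaProducer<>(props)) {
            // Keyed record: the key routes it to a partition; the value is opaque bytes.
            producer.send(new ProducerRecord<>("events", "user-42", "payload".getBytes()));
            // Key is optional: a null key lets the producer spread records across partitions.
            producer.send(new ProducerRecord<>("events", null, "payload".getBytes()));
        }
    }
}
```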
High level architecture
Role of the broker
The broker handles reads and writes.
It forwards messages for replication.
It performs compaction on its own log replica
(without affecting any other copies).
Role of the controller
Handles "cluster-wide events" signaled by Zookeeper (kept in
sync with the Zookeeper registries):
● Broker list changes (registration, failure)
● Leader election
● Topic changes (deleted, added, number of partitions
changed)
● Tracking partition replicas
Kafka controller
Zookeeper role
- Kafka controller registration
- List of topics and partitions
- Partition states
- Broker registration (id, host, port)
- Consumer registration & subscriptions
Partitions
- Each partition has its leader broker and N followers.
- Consumers and producers work with leaders only.
- Partitions are the main scaling mechanism within a
topic.
- The producer selects the target partition via a Partitioner
implementation (balancing across the available topic
partitions); see the sketch below.
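A hedged sketch of a custom Partitioner against the modern Java client interface (the interface differed in older releases). The policy here is made up for illustration: keyed records are hashed the same way the default partitioner does it, and un-keyed records all go to partition 0.

```java
import java.util.Map;
import org.apache.kafka.clients.producer.Partitioner;
import org.apache.kafka.common.Cluster;
import org.apache.kafka.common.utils.Utils;

// Hypothetical partitioner: keyed records are hashed; un-keyed records go to partition 0.
public class ExamplePartitioner implements Partitioner {
    @Override
    public int partition(String topic, Object key, byte[] keyBytes,
                         Object value, byte[] valueBytes, Cluster cluster) {
        int numPartitions = cluster.partitionsForTopic(topic).size();
        if (keyBytes == null) {
            return 0; // illustrative policy for records without a key
        }
        // murmur2 hash of the key bytes, mapped onto the available partitions
        return Utils.toPositive(Utils.murmur2(keyBytes)) % numPartitions;
    }

    @Override
    public void configure(Map<String, ?> configs) {}

    @Override
    public void close() {}
}
```

The producer picks it up via the partitioner.class property, e.g. props.put("partitioner.class", ExamplePartitioner.class.getName()).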
Access Pattern
● Writers write data in massive streams. The data
is already "ordered" (this ordering is re-used).
● Readers consume data sequentially, each one
from some position.
Write path
[Diagram: the producer writes to the elected leader broker;
follower brokers replicate the data to their own local storage.]
Read path
Reads always happen via the partition leader.
Kafka helps balance consumers within a
group: each topic partition is read by a
single consumer at a time, so the same
message is not read simultaneously by
several consumers in the same group.
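A minimal sketch of group consumption with the modern Java consumer (group id, topic, and broker address are made up). Consumers started with the same group.id split the topic's partitions between them; each partition is owned by one group member at a time.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class GroupConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("group.id", "analytics"); // consumers sharing this id split the partitions
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                // Each partition of "events" is assigned to exactly one consumer in the group.
                ConsumerRecords<String, String> records = consumer.poll(Duration.ofSeconds(1));
                for (ConsumerRecord<String, String> r : records) {
                    System.out.printf("partition=%d offset=%d value=%s%n",
                                      r.partition(), r.offset(), r.value());
                }
            }
        }
    }
}
```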
Data transfer efficiency
1. Sequential disk access - optimal disk
utilization
2. Zero copy - saves CPU cycles
3. Compression - saves network bandwidth
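A minimal illustration of item 2. FileChannel.transferTo lets the kernel move file bytes to a socket without copying them through user-space buffers (sendfile); this mirrors the mechanism Kafka relies on, not Kafka's actual code. File name, host, and port are made up.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.net.InetSocketAddress;
import java.nio.channels.FileChannel;
import java.nio.channels.SocketChannel;

public class ZeroCopySend {
    public static void main(String[] args) throws IOException {
        try (FileChannel file = new FileInputStream("segment.log").getChannel(); // hypothetical file
             SocketChannel socket = SocketChannel.open(new InetSocketAddress("localhost", 9000))) {
            long position = 0, remaining = file.size();
            while (remaining > 0) {
                // The kernel moves the bytes file -> socket; no user-space copy.
                long sent = file.transferTo(position, remaining, socket);
                position += sent;
                remaining -= sent;
            }
        }
    }
}
```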
Compression
Up to Kafka 0.7, compressed messages were a
special kind of message handled by the clients
(producer and consumer); the broker was not
involved.
Starting with 0.8.1, the Kafka broker repackages
messages in order to support logical offsets.
Indexing
- As data flows into the broker, it is
"indexed".
- Index files are stored alongside "segments".
- Segments are the files holding the data.
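A toy model of the idea, assuming a sparse index: only some offsets in a segment are mapped to byte positions, a lookup finds the nearest indexed offset at or below the target, and the broker scans forward from there. Kafka's real index is a memory-mapped file of fixed-size entries, but the lookup logic is the same.

```java
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

public class SparseOffsetIndex {
    private final NavigableMap<Long, Long> offsetToPosition = new TreeMap<>();

    public void append(long offset, long filePosition) {
        offsetToPosition.put(offset, filePosition); // only every Nth message is indexed
    }

    /** Returns the file position to start scanning from for the given offset. */
    public long lookup(long targetOffset) {
        Map.Entry<Long, Long> floor = offsetToPosition.floorEntry(targetOffset);
        return floor == null ? 0L : floor.getValue();
    }
}
```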
Consumer API Levels
● Low-level API: work with partitions and
offsets
● High-level API: work with topics; automatic
offset management and load balancing
This can be rephrased as
● Low-level API: database
● High-level API: queue
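A sketch of the "database-style" low-level access with the modern Java consumer: no group membership and no automatic offsets; the application picks a partition and a position itself, like reading a table from a known row. Topic, partition, and offset are made up.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;
import org.apache.kafka.common.TopicPartition;

public class LowLevelRead {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.ByteArrayDeserializer");

        try (KafkaConsumer<byte[], byte[]> consumer = new KafkaConsumer<>(props)) {
            TopicPartition tp = new TopicPartition("events", 0);
            consumer.assign(Collections.singletonList(tp)); // explicit partition, no rebalancing
            consumer.seek(tp, 12345L);                      // explicit starting offset
            consumer.poll(Duration.ofSeconds(1))
                    .forEach(r -> System.out.println(r.offset()));
        }
    }
}
```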
Layers of functionality
Levels of abstraction
Offset management
Prior to 0.8.1, holding offset metadata was purely
Zookeeper's responsibility.
Starting from 0.8.1, there is a special offset
manager service. It runs inside the broker, uses a
special topic to store offsets, and also does in-
memory caching as an optimization.
We can choose which mechanism to use.
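A sketch of explicit offset management with the modern consumer (group id and topic are made up): auto-commit is turned off, and the position is committed only after a batch has been fully processed. Committed offsets land in the broker-managed offsets topic.

```java
import java.time.Duration;
import java.util.Collections;
import java.util.Properties;
import org.apache.kafka.clients.consumer.KafkaConsumer;

public class ManualCommit {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        props.put("group.id", "analytics");
        props.put("enable.auto.commit", "false");         // we commit ourselves
        props.put("key.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");
        props.put("value.deserializer",
                  "org.apache.kafka.common.serialization.StringDeserializer");

        try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
            consumer.subscribe(Collections.singletonList("events"));
            while (true) {
                consumer.poll(Duration.ofSeconds(1)).forEach(r -> process(r.value()));
                consumer.commitSync(); // the offset is durable only after this returns
            }
        }
    }

    private static void process(String value) { /* application logic */ }
}
```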
Key Compaction
Kafka is capable of storing only the latest value per
key.
Then it is not a queue; it is a table.
This capability makes it possible to store the whole state
(the full historical data flow), not only the latest X days
(in contrast to the auto-deletion approach).
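A sketch of creating such a compacted topic with the admin client (which appeared after this deck was written; topic name, partition count, and replication factor are made up). With cleanup.policy=compact the broker keeps at least the latest value for every key instead of deleting data by age, giving the "table" behavior described above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Properties;
import org.apache.kafka.clients.admin.AdminClient;
import org.apache.kafka.clients.admin.NewTopic;

public class CreateCompactedTopic {
    public static void main(String[] args) throws Exception {
        Properties props = new Properties();
        props.put("bootstrap.servers", "localhost:9092"); // illustrative
        try (AdminClient admin = AdminClient.create(props)) {
            // 6 partitions, replication factor 3, compacted instead of time-deleted.
            NewTopic topic = new NewTopic("user-profiles", 6, (short) 3)
                    .configs(Map.of("cleanup.policy", "compact"));
            admin.createTopics(Collections.singletonList(topic)).all().get();
        }
    }
}
```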
Performance
Why is it so fast?
1. The network and on-disk message formats are
the same; all that has to be done is an append.
2. Local storage is used.
3. No attempt to build its own cache or
optimizer (the OS page cache is relied on).
Something big happens
We face a new world of real-time data
processing needs.
In many cases this means streams.
For many years I thought it was just counters to
be calculated before saving data into HDFS for
the "real work". Now I see it quite differently.
Naive use of Kafka
Possible simple solution...
Kafka as NoSQL
- Synchronous replication as the resilience model
- Single master per partition
- Opaque data
- Compactions
- Optimized for reads in the same order as the
writes were done
- Optimized for massive writes
Compute
Samza's and Kafka Streams' relation to Kafka is like
MapReduce's and Spark's relation to HDFS.
Kafka became the medium on top of which we build
computational layers.
It has to be said - there is no data locality.
Samza and Kafka Streams solve common
problems.
State, recovery approach
Both Samza and Kafka Streams took the approach
that has served RDBMSs for a long time: snapshot +
redo log.
They force stateful stream-processing
applications to follow this paradigm.
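A sketch of a stateful Kafka Streams job (application id, broker address, and topic name are made up). The count lives in a local state store - the "snapshot" - and every update is also written to a compacted changelog topic in Kafka - the "redo log" - from which the store is rebuilt after a failure.

```java
import java.util.Properties;
import org.apache.kafka.common.serialization.Serdes;
import org.apache.kafka.streams.KafkaStreams;
import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.StreamsConfig;

public class WordCountState {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(StreamsConfig.APPLICATION_ID_CONFIG, "wordcount");      // illustrative
        props.put(StreamsConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(StreamsConfig.DEFAULT_KEY_SERDE_CLASS_CONFIG, Serdes.String().getClass());
        props.put(StreamsConfig.DEFAULT_VALUE_SERDE_CLASS_CONFIG, Serdes.String().getClass());

        StreamsBuilder builder = new StreamsBuilder();
        builder.stream("words")  // hypothetical input topic keyed by word
               .groupByKey()
               .count();         // backed by a local store + changelog topic

        new KafkaStreams(builder.build(), props).start();
    }
}
```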
NestLogic case
What are we doing?
Why do we need Kafka / Spark?
How has Kafka helped us?
Statistical analysis of data segments
First shot - Spark
What was the problem
- All the data has to be processed. We might
not have enough resources to process a
particular - huge - segment.
- Spark shuffle when the data is bigger than RAM
is challenging.
- We are moving toward more "real-time" and
streaming.
Kafka as shuffle engine
What we learned - flexibility
● We can re-run the "reduce" stage several times.
● Kafka clients can wait for the connection to be
re-established with no timeouts, so we can repair a failed
Kafka resource leader, and the job will proceed.
● We can run the map and reduce clusters separately, with the
flexibility to select their sizes. This saves us some money.
● Now we can use different technologies for Map and
Reduce. We are about to replace the map stage
(transformations) with ImpalaToGo.
What we learned - cont.
More precise resource management.
We can look at the size of the shuffle data and the
number of groups (available from the Kafka cluster
metadata), and only then decide on the size of the
"reduce" cluster.
We can interleave the map and reduce stages,
because there is no sorting requirement.
Is it a universal solution?
● If you need dozens of different, concurrent
jobs:
Yarn + Spark is probably the best.
● If you need a single job to run smoothly and to be
flexible with it - our approach comes into
play.
So, what do we do?
We help to distinguish your data by its nature,
present it, and help to decide what should be
done with each part.
As data scientists...
We believe that checking the statistical
homogeneity of data is very important.
As business people...
- Do not count an attack as popularity
- Do not count fraud as profit
- Do not count a bug as lack of interest
And most important:
- Work hard to distinguish all of the above
As big data experts
It is not simple to achieve. It took a lot of effort
to get good results, orchestrate the operation, etc.
We believe you will get better utilization of your
big data, data science, and devops resources.
How it looks
NestLogic inc
We work hard to help you to
Know your data.
Thank you for your attention
Contact us
Contact us at info@nestlogic.com
or via our site, www.nestlogic.com
Helper slides
State is in RocksDB
RocksDB was selected. A few quick facts:
1. Developed by Facebook, based on LevelDB
2. Single node
3. C++ library
4. Borrows HBase's ideas of sorting, snapshots, and transaction logs
5. My speculation - the transaction log is what "glues" it to
Kafka Streams
Rebalancing - part 1
- One of the brokers is elected as the coordinator for a
subset of the consumer groups. It will be responsible for
triggering rebalancing attempts for certain consumer
groups on consumer group membership changes or
subscribed topic partition changes.
- It will also be responsible for communicating the
resulting partition-consumer ownership configuration to
all consumers of the group undergoing a rebalance
operation.
Rebalancing - part 2
- On startup or on co-ordinator failover, the consumer
sends a ClusterMetadataRequest to any of the brokers
in the "bootstrap.brokers" list. In the response, it receives
the location of the co-ordinator for its group.
- The consumer sends a RegisterConsumer request to
its co-ordinator broker. In the response, it receives the
list of topic partitions that it should own.
- At this point, group management is done and the
consumer starts fetching data and (optionally)
committing offsets.
Consumer balancing
This is the capability to balance load and fail over between
consumers in the same group.
The Kafka consumer communicates with the co-ordinator
broker for this. The co-ordinator broker's info is stored in ZK
and is available from any broker.
These mechanisms are reused in Kafka Streams.
Co-ordinator broker, part 1
1. Reads the list of groups it manages and their
membership information from ZK.
2. If the discovered membership is alive (per ZK), waits for the
consumers in each of the groups to re-register with it.
3. Does failure detection for all consumers in a group.
Consumers marked as dead by the co-ordinator's failure
detection protocol are removed from the group, and the
co-ordinator marks the rebalance for a group completed
by communicating the new partition ownership to the
remaining consumers in the group.
Co-ordinator broker, part 2
4. The co-ordinator tracks topic partition
changes for all topics that any consumer group has
registered interest in. If it detects a new partition for any
topic, it triggers a rebalance operation (by killing the consumers'
socket connections with itself). The creation of new topics
can also trigger a rebalance operation, as consumers can
register for topics before they are created.
Consumer and Co-ordinator
Log compaction
Any read from offset 0 to any offset Q, where Q > P, that
completes in less than a configurable SLA will see the final
state of all keys as of time Q. The log head is always a single
segment (1 GB by default).
Material used
About compression: http://www.confluent.io/blog/compression-in-apache-kafka-is-now-34-percent-faster
Change in message offsets: https://cwiki.apache.org/confluence/display/KAFKA/Keyed+Messages+Proposal