Beginner's guide to balancing your data across Apache Kafka partitions
Olena Kutsenko
Developer Advocate at Aiven 🦀
@OlenaKutsenko | @aiven_io
🥦
🧅
🫑🥕
🥖
🥔
🍎
🦞
🧀 🥫
🍅
Supermarket
🍅
🥦
🧄
🧅
🫑🥕
🥖
🥔
🍎
🦞
🧀 🥫
💁
🍲
Supermarket
🍩
consumer
producer
topic
🔍
partition 1
partition 3
partition 2
consumer
producer
Metrics
latency and throughput ⏳ ✉
Optimising producers
1. Number of confirmations
acks = all 🦥 vs acks = 1 🏎
🤔 Sync or Async
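A minimal sketch of the confirmation trade-off, written as two plain Python dictionaries. `acks` is a standard Kafka producer setting; the profile names, the `describe` helper and the broker address are hypothetical, and passing a dictionary like this to a particular client library is an assumption.

```python
# Two illustrative producer profiles (names are hypothetical).
# "acks" is a standard Kafka producer setting; the broker address is a placeholder.
BASE = {"bootstrap.servers": "localhost:9092"}

# acks=all: the leader waits for all in-sync replicas before confirming.
# Slower, but a confirmed record survives the loss of the leader.
DURABLE = {**BASE, "acks": "all"}

# acks=1: only the partition leader confirms the write.
# Faster, but a record can be lost if the leader fails before replication.
FAST = {**BASE, "acks": "1"}


def describe(profile: dict) -> str:
    """Return a one-line summary of the confirmation mode."""
    waits_for = "all in-sync replicas" if profile["acks"] == "all" else "the leader only"
    return f"acks={profile['acks']}: waits for {waits_for}"


print(describe(DURABLE))
print(describe(FAST))
```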
Optimising producers
1. Number of confirmations
2. Size of the batch
max batch size 📦: a high limit tends to give high throughput but also high latency; a low limit keeps latency low
Optimising producers
1. Number of confirmations
2. Size of the batch
3. Linger time
Data sent on max batch size 📦 or when reaching linger.ms ⏱
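The two triggers can be sketched with a toy batcher. This is a simplified model, not the real client: the `Batcher` class and its fields are hypothetical, `max_batch_bytes` plays the role of the batch size limit and `linger_ms` the role of linger.ms, and time is passed in by hand.

```python
from dataclasses import dataclass, field


@dataclass
class Batcher:
    """Toy model of a producer batch for one partition (not the real client)."""
    max_batch_bytes: int          # plays the role of the max batch size
    linger_ms: int                # plays the role of linger.ms
    _records: list = field(default_factory=list)
    _bytes: int = 0
    _opened_at: int = 0           # time (ms) the first record entered the batch
    sent: list = field(default_factory=list)   # (reason, records) per sent batch

    def _flush(self, reason: str) -> None:
        if self._records:
            self.sent.append((reason, self._records))
            self._records, self._bytes = [], 0

    def tick(self, now_ms: int) -> None:
        """Send the open batch if it has waited at least linger_ms."""
        if self._records and now_ms - self._opened_at >= self.linger_ms:
            self._flush("linger")

    def add(self, record: bytes, now_ms: int) -> None:
        self.tick(now_ms)
        if not self._records:
            self._opened_at = now_ms
        self._records.append(record)
        self._bytes += len(record)
        if self._bytes >= self.max_batch_bytes:
            self._flush("full")


b = Batcher(max_batch_bytes=30, linger_ms=10)
for t in (0, 1, 2):            # three 10-byte records arrive quickly
    b.add(b"x" * 10, now_ms=t)
b.add(b"y" * 10, now_ms=5)     # a lone record...
b.tick(now_ms=20)              # ...is sent once linger_ms has passed
print([(reason, len(records)) for reason, records in b.sent])
# → [('full', 3), ('linger', 1)]
```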
Optimising producers
1. Number of confirmations
2. Size of the batch
3. Linger time
4. Compression
📦📦📦📦 🗜 📦
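A small sketch of why compression pays off most on batches. The records are made-up data, and `zlib` from the Python standard library stands in for the codecs a Kafka producer can be configured with, so the exact numbers are only illustrative.

```python
import json
import zlib

# 200 similar JSON records, as a producer might see for one topic (made-up data).
records = [
    json.dumps({"customer": f"c{i % 7}", "item": "tomato", "qty": i % 3 + 1}).encode()
    for i in range(200)
]

# Compressing every record on its own: little repetition to exploit.
one_by_one = sum(len(zlib.compress(r)) for r in records)

# Compressing the whole batch at once: repeated field names compress well.
as_batch = len(zlib.compress(b"\n".join(records)))

raw = sum(len(r) for r in records)
print(f"raw={raw} one_by_one={one_by_one} as_batch={as_batch}")
```

The batch-level result should come out far smaller than the record-by-record total, which is one reason compression works best together with larger batches.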
Optimising producers
1. Number of confirmations
2. Size of the batch
3. Linger time
4. Compression
5. Record ordering
🐘
🌾
🌳
☀
[Figure: producer → partitions p. 1, p. 2, p. 3 → consumer; records numbered 1 to 36 are spread across the three partitions]
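Kafka keeps record order within a partition, not across a topic. A rough sketch of what that means for records 1 to 36 on three partitions, assuming a plain round-robin spread and a consumer that drains one partition after another (both are simplifications chosen for illustration):

```python
from collections import defaultdict

NUM_PARTITIONS = 3
records = list(range(1, 37))          # records 1..36

# No key: spread records round-robin (a simplification of real client behaviour).
partitions = defaultdict(list)
for i, record in enumerate(records):
    partitions[i % NUM_PARTITIONS].append(record)

# Each partition keeps the order in which its records were appended...
for p in range(NUM_PARTITIONS):
    print(f"p. {p + 1}: {partitions[p][:5]} ...")

# ...but a consumer that drains partition after partition sees a different
# global order than the one the producer used.
consumed = [r for p in range(NUM_PARTITIONS) for r in partitions[p]]
print("same global order as produced:", consumed == records)
```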
Balance Your Data Across Apache Kafka Partitions With Olena Kutsenko | Current 2022
Key and data distribution
1. Find the highest cardinality, with most critical ordering.
🦁 🐷 🐻
Instead of customer, use shopping trip
🛒 🛍 🧾 💳 🔎
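To see why a higher-cardinality key helps, here is a sketch that counts records per partition for both key choices. The events are made up, and `zlib.crc32` is a stand-in hash (Kafka's default partitioner uses murmur2), so real partition numbers will differ.

```python
import zlib
from collections import Counter

NUM_PARTITIONS = 3


def partition_for(key: str) -> int:
    # Stand-in hash: Kafka's default partitioner uses murmur2, not CRC32.
    return zlib.crc32(key.encode()) % NUM_PARTITIONS


# Made-up purchases: one very active customer and two occasional ones.
# Each shopping trip has its own id and five purchase events.
events = []
for trip in range(200):
    customer = "lion" if trip < 180 else ("pig" if trip < 190 else "bear")
    events += [{"customer": customer, "trip": f"trip-{trip}"}] * 5

by_customer = Counter(partition_for(e["customer"]) for e in events)
by_trip = Counter(partition_for(e["trip"]) for e in events)

print("keyed by customer:", sorted(by_customer.values(), reverse=True))
print("keyed by trip:    ", sorted(by_trip.values(), reverse=True))
```

All events of one trip still share a key, so they stay on one partition and keep their relative order, while the busy customer no longer overloads a single partition.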
Key and data distribution
1. Find the highest cardinality, with most critical ordering.
2. Use custom partitioner
key
value
✂
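One possible shape for a custom partitioner, as a sketch only. The function name, the `hot_keys` policy (reserving a partition for a known heavy key) and the CRC32 hash are all assumptions made for illustration; how such a function is plugged in depends on the client library (the Java producer, for example, takes a class through `partitioner.class`).

```python
import zlib


def custom_partition(key: str, num_partitions: int, hot_keys: dict) -> int:
    """Pick a partition for `key`.

    hot_keys maps a known heavy key to a partition reserved for it
    (a hypothetical policy); every other key is hashed over the rest.
    Assumes at least one partition is left unreserved.
    """
    if key in hot_keys:
        return hot_keys[key]
    free = [p for p in range(num_partitions) if p not in hot_keys.values()]
    # Stand-in hash; Kafka's default partitioner uses murmur2.
    return free[zlib.crc32(key.encode()) % len(free)]


hot = {"lion": 0}                          # the busiest key gets partition 0
print(custom_partition("lion", 3, hot))    # → 0
print(custom_partition("pig", 3, hot) in (1, 2))
```

The same key always maps to the same partition, so per-key ordering is kept.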
Optimising consumers
1. Fetch size
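A sketch of the settings that shape fetch size, as plain Python dictionaries. The keys are the names used in the Kafka consumer configuration (some client libraries spell them differently); the two profile names and all values are illustrative, not recommendations.

```python
# Standard Kafka consumer settings that control how much data one fetch returns.
# The values and the two profile names are illustrative, not recommendations.
LOW_LATENCY = {
    "fetch.min.bytes": 1,          # answer as soon as any data is available
    "fetch.max.wait.ms": 100,      # upper bound on how long the broker waits
}

HIGH_THROUGHPUT = {
    "fetch.min.bytes": 1_048_576,  # wait until about 1 MiB has accumulated...
    "fetch.max.wait.ms": 500,      # ...or until this much time has passed
    "max.partition.fetch.bytes": 2_097_152,  # cap per partition per fetch
}

for name, profile in (("low latency", LOW_LATENCY), ("high throughput", HIGH_THROUGHPUT)):
    print(name, profile)
```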
Optimising consumers
1. Fetch size
2. Number of partitions
The number of active consumers in the group can be no bigger than the number of partitions; any extra consumers sit idle.
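A quick sketch of that limit. The `assign` function is a hypothetical round-robin assignment, much simpler than the assignors real consumer groups use, but it shows the same effect: a partition has one owner in the group, so surplus consumers get nothing.

```python
def assign(partitions: list, consumers: list) -> dict:
    """Round-robin partitions over consumers; each partition gets one owner."""
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


result = assign(["p. 1", "p. 2", "p. 3"], ["c1", "c2", "c3", "c4", "c5"])
idle = [c for c, owned in result.items() if not owned]
print(result)
print("idle consumers:", idle)   # → idle consumers: ['c4', 'c5']
```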
Action plan
1. Plan in advance
2. Observe and measure
- consumer lag
- fetch-latency-avg and fetch-latency-max
- under-replicated partitions
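Consumer lag is the gap between the newest offset in a partition and the offset the group has committed. A minimal sketch with made-up offsets; the `consumer_lag` helper is hypothetical, and in practice the two inputs would come from your client or monitoring tooling.

```python
def consumer_lag(end_offsets: dict, committed: dict) -> dict:
    """Lag per partition = log end offset - last committed offset."""
    return {p: end_offsets[p] - committed[p] for p in end_offsets}


# Made-up offsets for a three-partition topic.
end_offsets = {0: 1200, 1: 1180, 2: 4300}
committed = {0: 1195, 1: 1180, 2: 2100}

lag = consumer_lag(end_offsets, committed)
print(lag)                                                # → {0: 5, 1: 0, 2: 2200}
print("most lagging partition:", max(lag, key=lag.get))  # → most lagging partition: 2
```

A lag that is much larger on one partition than on the others, as on partition 2 here, is a hint that data is not evenly balanced.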
Aiven’s “Rolling challenge”
Up for a challenge with Aiven for Apache Kafka?
Participate and win a fun prize 🎁
Find us at G10
go.aiven.io/aiven-challenge-current-na-22
