The dark and dirty side of
fixing uneven partitions
Olena Kutsenko
Sr. Developer Advocate
Aiven
Olena Babenko
Staff Software Engineer
Aiven
olena@aiven.io
@OlenaKutsenko aiven.io Olena Babenko:
It all started well…
Recommended strategies for partitioning
➔ Select number of partitions based on how data is consumed
➔ Select a number of partitions that is neither too low nor too high
➔ Use keys with the highest cardinality
➔ Be mindful of data distribution over time
➔ Consider potential edge cases
You were pretty happy
about the results
Nothing predicted the storm
Or so you thought
Partition 1 47%
Partition 2 34%
Partition 3 7%
Partition 4 4%
Partition 5 4%
Partition 6 4%
Data balancing gone wild
How uneven partitions affect the system
➔ Brokers:
◆ Heavy load on the file system -> slower brokers
➔ Consumers:
◆ Increased consumer lag
◆ Consumers that are assigned to a hot partition need more resources
◆ Underutilisation of resources when scaling vertically with k8s
◆ Out-of-memory exception cycle
What to do now?
“Premature optimization
is the root of all evil”
Donald Knuth
You can’t avoid the change.
Embrace the inevitable.
Today you’ll learn
● Different recipes to deal with uneven partitioning
● From easiest 🌶 to more difficult 🌶🌶🌶
🌶🌶🌶 The advanced techniques will help you
● Rebalance records across partitions
● Scale your topic up or down
● Be effective at disaster recovery
Before:
Partition 1 47%
Partition 2 34%
Partition 3 7%
Partition 4 4%
Partition 5 4%
Partition 6 4%
After:
Partition 1 15%
Partition 2 12%
Partition 3 13%
Partition 4 11%
Partition 5 13%
Partition 6 11%
Partition 7 14%
Partition 8 12%
Level 1. Easy🌶
If you don’t use keys…
No keys - increase the number of partitions
- This way you can’t scale down, but you can scale up!
- Pay attention to
- Data retention period
- Number of consumers
- Data distribution over time
- linger.ms and batch.size for sticky partitioning
Level 2. Moderate🌶🌶
One or two keys are hot
You can still add new partitions… kinda
The key challenge:
github.com/apache/kafka/blob/trunk/clients/src/main/java/org/apache/kafka/clients/producer/internals/BuiltInPartitioner.java
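The built-in partitioner picks the partition for a keyed record by hashing the serialized key and taking the result modulo the number of partitions, so changing the partition count moves existing keys. A minimal sketch of that effect, assuming a stand-in hash (`Arrays.hashCode`) instead of Kafka's murmur2; the class and method names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class KeyRemappingSketch {

    // Stand-in for Kafka's murmur2-based hashing. Any deterministic hash
    // shows the same effect; this is NOT the hash Kafka uses.
    static int partitionForKey(byte[] serializedKey, int numPartitions) {
        int positiveHash = Arrays.hashCode(serializedKey) & 0x7fffffff;
        return positiveHash % numPartitions;
    }

    // Counts how many of the given keys land on a different partition
    // once the partition count changes.
    static int countMovedKeys(String[] keys, int oldPartitions, int newPartitions) {
        int moved = 0;
        for (String key : keys) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            if (partitionForKey(bytes, oldPartitions) != partitionForKey(bytes, newPartitions)) {
                moved++;
            }
        }
        return moved;
    }

    public static void main(String[] args) {
        String[] keys = new String[1000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = "customer-" + i;
        }
        System.out.println("Keys that changed partition going from 6 to 8: "
                + countMovedKeys(keys, 6, 8) + " of " + keys.length);
    }
}
```

Records written before the change stay where they are, so a moved key ends up with its history split across two partitions.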
You will need to
● Calculate which key is hot 🔥
● Keep the state
● Not mess up old keys
● Use a custom partitioner
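Example
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import org.apache.kafka.common.utils.Utils;

// The hot key gets special handling; every other key is hashed
// over the remaining numPartitions - 1 partitions.
public static int partitionForKey(final byte[] serializedKey, final int numPartitions) {
    // byte[] must be compared by content, not with ==
    if (Arrays.equals(serializedKey, "bananas🍌🍌".getBytes(StandardCharsets.UTF_8))) {
        // ... do the dirty magic here ...
    } else {
        return Utils.toPositive(Utils.murmur2(serializedKey)) % (numPartitions - 1);
    }
}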
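One possible shape for the "dirty magic", sketched under loud assumptions: the hot key is known up front, exactly one partition was added to the topic, and that new last partition is reserved for the hot key. The hash is a stand-in for Kafka's murmur2, the key is plain `bananas`, and all names are hypothetical:

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class HotKeyPartitionerSketch {

    // Hypothetical hot key, assumed to be known in advance.
    static final byte[] HOT_KEY = "bananas".getBytes(StandardCharsets.UTF_8);

    // Stand-in hash; a real partitioner would use Kafka's murmur2.
    static int positiveHash(byte[] key) {
        return Arrays.hashCode(key) & 0x7fffffff;
    }

    static int partitionForKey(byte[] serializedKey, int numPartitions) {
        if (Arrays.equals(serializedKey, HOT_KEY)) {
            // Assumption: the newly added last partition is reserved for the hot key.
            return numPartitions - 1;
        }
        // All other keys are hashed over the previous partition count,
        // so they stay on the partition they used before the topic grew by one.
        return positiveHash(serializedKey) % (numPartitions - 1);
    }

    public static void main(String[] args) {
        int partitionsAfterGrowth = 7; // the topic grew from 6 to 7 partitions
        for (String key : new String[] {"bananas", "apples", "cherries"}) {
            byte[] bytes = key.getBytes(StandardCharsets.UTF_8);
            System.out.println(key + " -> partition " + partitionForKey(bytes, partitionsAfterGrowth));
        }
    }
}
```

Hashing the remaining keys over `numPartitions - 1` is what keeps the old keys from being messed up: their partition is the same as before the extra partition existed.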
Level 3. Getting hot🌶🌶🌶
Time to migrate to a new topic
The time will come…
when you need to re-create the topic
➔ Rebalance records across partitions
➔ Scale your topic up or down
➔ Do disaster recovery
Let’s re-create the topic
and MIGRATE!
Three main steps
[Diagram sequence: producers, consumers, the old topic (partitions P0, P1) and the new topic (partitions P0 to P3)]
Our goals when migrating
1. Keep downtime to a bare minimum
2. No duplicates
Two options
(in reality way more, but similar in
essence)
Option 1: Stop the world producers
Sharp cut
[Diagram sequence: producers, consumers, the old topic (partitions P0, P1) and the new topic (partitions P0 to P3) during the sharp cut]
Advantages
➔ No skipped messages
➔ Prevention of duplicates
➔ No need for extra compute to replicate data from old to new topic
Limitations
➔ Downtime
➔ Difficult to test new setup and challenging to roll back
➔ Limited time window for migration
➔ Need for seamless collaboration among teams
➔ All-or-nothing migration
Our goals when migrating
1. Keep downtime to a bare minimum
2. No duplicates
Option 2: Gradual switch relying on
replicated data
ABOVE - Olena K
BELOW - Olena B
Time for plan B
with Olena B
Strategy
[Diagram sequence: Steps 1 to 5 of the gradual switch]
Step 1
New topic creation
Partition 1 47%
Partition 2 34%
Partition 3 7%
Partition 4 4%
Partition 5 4%
Partition 6 4%
Partition 1 44%
Partition 2 30%
Partition 3 6%
Partition 4 3%
Partition 5 3%
Partition 6 3%
Partition 7 3%
Partition 8 3%
Risks
Having to redo the whole process
because of too few or too many partitions
Step 2
Fast and reliable data pump
Data pump application requirements
- Simple
- Fast
- Reliable
Kafka Streams, Java
Risks
- Requires too many resources if it is not simple enough
- Cannot keep up if it is too complicated
- Data loss if the application is not reliable
- Data loss or duplicates because records from different
partitions get shuffled
WARNING: Records/keys will almost certainly be mixed.
New partitions have a mix of data from old partitions
Out of order events
If consumers are stopped while the order is not correct, they will either:
- Read some records one more time
OR
- Skip some records
Out of order events
Be careful
during data pump catch-up and
if you use big batches to read data
Out of order events
Old topic timestamps from metadata
could be used to preserve chronological order
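One way to use those timestamps, sketched under the assumption that the data pump keeps the old topic's timestamp with every record it copies: buffer a batch and sort it by the original timestamp before passing it on. `PumpedRecord` and its fields are hypothetical names, not part of any Kafka API:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class ChronologicalOrderSketch {

    // A record as seen in the new topic, carrying the timestamp it had in the old topic.
    static final class PumpedRecord {
        final String value;
        final long oldTimestampMs;

        PumpedRecord(String value, long oldTimestampMs) {
            this.value = value;
            this.oldTimestampMs = oldTimestampMs;
        }
    }

    // Returns a copy of the batch ordered by the original timestamp.
    // List.sort is stable, so records with equal timestamps keep their relative order.
    static List<PumpedRecord> inChronologicalOrder(List<PumpedRecord> batch) {
        List<PumpedRecord> sorted = new ArrayList<>(batch);
        sorted.sort(Comparator.comparingLong((PumpedRecord r) -> r.oldTimestampMs));
        return sorted;
    }

    public static void main(String[] args) {
        List<PumpedRecord> batch = Arrays.asList(
                new PumpedRecord("B1", 3000L),
                new PumpedRecord("A1", 1000L),
                new PumpedRecord("A2", 2000L));
        for (PumpedRecord r : inChronologicalOrder(batch)) {
            System.out.println(r.oldTimestampMs + " " + r.value);
        }
    }
}
```

Sorting only restores order inside one batch; a record that arrives in a later batch with an earlier timestamp is still late.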
Step 3
Gradual consumer switch.
Risks
- Spikes
- Too long downtime for consumers
- Data loss or duplicates
Consumer group translations
Simple Consumer Group Translation
Old Consumer Group:
Partition 0: offset 13
Partition 2: offset 23
Last consumed event
Partition 0: offset 12
Partition 2: offset 22
Timestamps:
Partition 0: 07:01:04
Partition 2: 07:01:03
Earliest timestamp:
07:01:03
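The arithmetic on this slide can be sketched as follows. The numbers come from the slide; the class and method names are hypothetical. In a real client the chosen timestamp would then be resolved to offsets on the new topic with a timestamp lookup such as `KafkaConsumer.offsetsForTimes`, which is left out here to keep the sketch self-contained:

```java
import java.time.LocalTime;
import java.util.LinkedHashMap;
import java.util.Map;

class SimpleGroupTranslationSketch {

    // A committed offset points at the next record to read,
    // so the last consumed record is one position before it.
    static long lastConsumedOffset(long committedOffset) {
        return committedOffset - 1;
    }

    // Picks the earliest timestamp among the last consumed records of all partitions.
    static LocalTime earliestTimestamp(Map<Integer, LocalTime> lastConsumedTimestamps) {
        LocalTime earliest = null;
        for (LocalTime timestamp : lastConsumedTimestamps.values()) {
            if (earliest == null || timestamp.isBefore(earliest)) {
                earliest = timestamp;
            }
        }
        return earliest;
    }

    public static void main(String[] args) {
        System.out.println("Partition 0 last consumed offset: " + lastConsumedOffset(13));
        System.out.println("Partition 2 last consumed offset: " + lastConsumedOffset(23));

        // Timestamps of the last consumed record per old partition.
        Map<Integer, LocalTime> timestamps = new LinkedHashMap<>();
        timestamps.put(0, LocalTime.of(7, 1, 4));
        timestamps.put(2, LocalTime.of(7, 1, 3));
        System.out.println("Start the new group from: " + earliestTimestamp(timestamps));
    }
}
```

Starting from the earliest timestamp avoids skipping records, but anything the other partitions consumed after that moment is read again.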
Streaming Consumer Group
Translations
Offset Translations
Offset Translation
for r in records:
    P = r.metadata.old_partition
    # first record that the old group has not consumed yet
    if offsets[P] <= r.metadata.offset:
        return
Consumer Group 1:
Partition 0: offset 13
Partition 2: offset 23
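The loop above can be sketched in Java. It assumes the data pump stores each record's old partition and old offset alongside the record in the new topic; `PumpedRecord`, its fields and `translate` are hypothetical names:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class OffsetTranslationSketch {

    // A record in one partition of the new topic, with its origin in the old topic.
    static final class PumpedRecord {
        final long newOffset;
        final int oldPartition;
        final long oldOffset;

        PumpedRecord(long newOffset, int oldPartition, long oldOffset) {
            this.newOffset = newOffset;
            this.oldPartition = oldPartition;
            this.oldOffset = oldOffset;
        }
    }

    // Returns the offset in the new partition of the first record that the old
    // consumer group has not consumed yet, or -1 if every record was consumed.
    static long translate(List<PumpedRecord> newPartitionRecords,
                          Map<Integer, Long> committedOldOffsets) {
        for (PumpedRecord r : newPartitionRecords) {
            long committed = committedOldOffsets.getOrDefault(r.oldPartition, 0L);
            if (committed <= r.oldOffset) {
                return r.newOffset;
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        // Committed offsets of the old consumer group, as on the slide.
        Map<Integer, Long> committed = new HashMap<>();
        committed.put(0, 13L);
        committed.put(2, 23L);

        // Made-up content of one new partition: a mix of old partitions 0 and 2.
        List<PumpedRecord> records = Arrays.asList(
                new PumpedRecord(0, 0, 11),
                new PumpedRecord(1, 2, 22),
                new PumpedRecord(2, 0, 12),
                new PumpedRecord(3, 2, 23),
                new PumpedRecord(4, 0, 13));

        System.out.println("New group starts at offset " + translate(records, committed));
    }
}
```

With this made-up partition content the new group starts at offset 3. If an already consumed record sits after that position because the pump mixed the order, it is read again, which is where the few remaining duplicates come from.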
Mirror Maker offset translation
Data pump -> MirrorSourceTask
Old + New records metadata -> Records in Offset Sync topics
Offset translation -> MirrorCheckpointTask
Problems:
- The main use case is data transfer between two clusters, not within the same cluster
- Until version 3.3, offsets were translated by measuring the 'distance' between
the MM2 offset sync and the upstream consumer group, and then
assuming that the same distance applies in the downstream topic.
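The distance-based estimate described above boils down to simple arithmetic. A sketch with hypothetical names and made-up numbers; this is not MirrorMaker's actual code:

```java
class DistanceBasedTranslationSketch {

    // syncUpstreamOffset / syncDownstreamOffset: one known pair of matching positions
    // (an offset sync). groupUpstreamOffset: where the consumer group is in the old topic.
    static long translateDownstream(long syncUpstreamOffset,
                                    long syncDownstreamOffset,
                                    long groupUpstreamOffset) {
        // How far the group is ahead of the last known sync point upstream...
        long distance = groupUpstreamOffset - syncUpstreamOffset;
        // ...is assumed to be exactly how far it is ahead downstream.
        return syncDownstreamOffset + distance;
    }

    public static void main(String[] args) {
        // Sync point: upstream offset 100 corresponds to downstream offset 40.
        // The group has committed upstream offset 110.
        System.out.println("Estimated downstream offset: "
                + translateDownstream(100L, 40L, 110L));
    }
}
```

The estimate only holds when the same number of records lies between the sync point and the group position in both topics. After repartitioning, one new partition holds a different mix of records, so the assumption breaks.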
Risks
- Spikes
- Too long downtime for consumers
- Data loss or duplicates
- Poor offset estimates
- Bad timing for offset translation
Bad timing for Offset Translation
Either
Start from 32:
duplicate B1, B2
OR
Start from 35:
A2 is lost
“Bounded” stream
Gradual consumer switch
Earliest offset -> Duplicates guaranteed
Consumer’s earliest timestamp -> High probability of duplicates
Offset translation -> A few duplicates
Offset translation + “late events” tracking -> Almost no duplicates
Offset translation + “late events” tracking + “Bounded” stream approach -> Rare/no duplicates
More topics to talk about
- Apache Mirror Maker implementation details
- Stateless vs Stateful consumers
- Idempotence
- Changing schemas
- New key selection strategy
Step 4
Gradual producers switch
Risks
- Data loss or duplicates if data pump is not fast enough
To summarize it all
Key learnings
● No keys - add partitions 🌶
● A few hot keys - you can still add partitions 🌶🌶
● Workarounds are not sufficient? - Migrate the topic 🌶🌶🌶
Migrate the topic 🌶🌶🌶
● Sharp cut - stop the producers first
○ Exactly once delivery
○ Expect the downtime
○ All-or-nothing migration
● Generic gradual switch
○ Minimal downtime
○ Possibility to test before switching
○ Switch consumer groups gradually
○ Minimize chance of duplicates