1
Revisiting Exactly Once Semantics (EOS)
Jason Gustafson: Engineer@Confluent
2
Overview
- Review exactly once mechanics
- Identify the remaining gaps
- Discuss how they are being addressed
3
Kafka EOS
4
A B C D E
Read
Process
Write
Input
Output
6
A B C D E
A’
Read
Process
Write
Input
Output
8
A B C D E
A’ B’ C’
Read
Process
Write
Input
Output
10
A B C D E
A’ B’ C’ D’ E’
Read
Process
Write
Input
Output
11
1 1 1 1 1 1 1 1
Read
Process
Write
Input
Output
13
1 1 1 1 1 1 1 1
2
Read
Process
Write
Input
Output
15
1 1 1 1 1 1 1 1
2 4
Read
Process
Write
Input
Output
17
1 1 1 1 1 1 1 1
2 4 2
Read
Process
Write
Input
Output
18
A B C D E
Read
Process
Write
Input
Output
20
A B C D E
B’
Read
Process
Write
Input
Output
A’
22
A B C D E
B’ D’
Read
Process
Write
Input
Output
A’ C’ E’
23
A B C D E
A’ B’
Read
Process
Write
Input
Output
24
A B C D E
A’ B’
Read
Process
Write
Input
Output
2
Position
25
A B C D E
A’ B’
Read
Process
Write
Input
Output
2
Position
ReadPosition
26
A B C D E
A’ B’
Read
Process
Write
Input
Output
2
Position
ReadPosition
WritePosition
28
A B C D E
A’ B’ C’ D’
Read
Process
Write
Input
Output
2
Position
ReadPosition
WritePosition
29
A B C D E
A’ B’ C’ D’
Read
Process
Write
Input
Output
2 4
Position
ReadPosition
WritePosition
31
A B C D E
A’ B’ C’ D’ E’
Read
Process
Write
Input
Output
2 4
Position
ReadPosition
WritePosition
32
A B C D E
A’ B’ C’ D’ E’
Read
Process
Write
Input
Output
2 4 5
Position
ReadPosition
WritePosition
33
A B C D E
A’ B’ C’ D’ E’
Read
Process
Write
Input
Output
2 4 5
Position
ReadPosition
WritePosition
Atomic
34
Read
Process
Write
ReadPosition
WritePosition
Atomic
The Kafka Approach:
1) Transactional writes across multiple
partitions
2) Protocol to guarantee single writer
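To make the atomicity requirement concrete, here is a minimal sketch in Python. It assumes a toy in-memory model: the `Store` class, the `Crash` exception, and the `run` helpers are hypothetical names for illustration and are not part of any Kafka API. It compares writing the output and the read position as two separate steps against writing them as one atomic step, with a crash injected while the second record is being handled.

```python
from dataclasses import dataclass, field


@dataclass
class Store:
    """Toy stand-in for the output log plus the committed read position."""
    output: list = field(default_factory=list)
    position: int = 0


class Crash(Exception):
    """Simulated processor failure."""


def run(store, records, atomic, crash_at=None):
    """Read from the committed position, process, then write output and position."""
    pos = store.position                      # ReadPosition
    while pos < len(records):
        result = records[pos] + "'"           # Process: A -> A'
        if atomic:
            # Output and position become visible together or not at all.
            # Here the crash lands before the atomic step, so nothing is visible.
            if crash_at == pos:
                raise Crash()
            store.output.append(result)
            store.position = pos + 1
        else:
            store.output.append(result)       # Write happens first ...
            if crash_at == pos:
                raise Crash()                 # ... crash before WritePosition
            store.position = pos + 1
        pos += 1


def run_with_one_crash(atomic):
    """Process A, B, C with one crash at record B, then restart."""
    records = ["A", "B", "C"]
    store = Store()
    try:
        run(store, records, atomic, crash_at=1)
    except Crash:
        pass
    run(store, records, atomic)               # restart from the committed position
    return store.output
```

In this model the non-atomic variant writes B' twice after the restart, while the atomic variant produces each output once. The rest of the talk is about how Kafka provides that atomic step across partitions.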
36
Kafka Transactions
37
Read
Process
Write
ReadPosition
WritePosition
Atomic
38
Read
Process
Write
ReadPosition
WritePosition
Atomic
AddPartition
Write
BeginTxn
CommitTxn
39
Output (O)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
41
Output (O)
ongoing
(O)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
42
Output (O)
A’ B’
ongoing
(O)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
43
Output (O)
A’ B’
ongoing ongoing
(O) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
44
Output (O)
2
A’ B’
ongoing ongoing
(O) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
45
Output (O)
2
A’ B’
ongoing ongoing
prepare
commit
(O) (O, P) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
48
Transaction Log
Status:     ongoing, ongoing, prepare commit, done commit
Partitions: (O), (O, P), (O, P), ()
Input: A B C D E
Output (O): A’ B’
Position (P): 2
BeginTxn, AddPartition, Write, CommitTxn
49
Output (O)
2
A’ B’
done
commit
()
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
51
Output (O)
2 5
A’ B’ C’ D’ E’
done
commit
ongoing ongoing
() (O) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
52
Output (O)
2 5
A’ B’ C’ D’ E’
done
commit
ongoing ongoing
() (O) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
Timeout
AbortTxn
53
Output (O)
2 5
A’ B’ C’ D’ E’
done
commit
ongoing ongoing
prepare
abort
() (O) (O, P) (O, P)
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
Timeout
AbortTxn
55
Output (O)
2 5
A’ B’ C’ D’ E’
done
commit
ongoing ongoing
prepare
abort
finish
abort
() (O) (O, P) (O, P) ()
Status
Partitions
A B C D E
Input
Position (P)
Transaction Log
AddPartition
Write
BeginTxn
CommitTxn
Timeout
AbortTxn
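The transaction log walkthrough above can be read as a small state machine. The following is a rough sketch only, assuming a hypothetical `TxnLog` class that appends one (status, partitions) entry per transition, using the status names shown on the slides. It is not the coordinator's real implementation, and writing the commit or abort markers to the registered partitions is reduced to a comment.

```python
class TxnLog:
    """Toy transaction log: an append-only list of (status, partitions) entries."""

    def __init__(self):
        self.entries = []

    def _append(self, status, partitions):
        self.entries.append((status, tuple(sorted(partitions))))

    @property
    def status(self):
        return self.entries[-1][0] if self.entries else None

    @property
    def partitions(self):
        return set(self.entries[-1][1]) if self.entries else set()

    def add_partition(self, partition):
        # The first AddPartition of a transaction starts from an empty set.
        current = self.partitions if self.status == "ongoing" else set()
        self._append("ongoing", current | {partition})

    def _end(self, prepare, done):
        if self.status != "ongoing":
            raise RuntimeError("no ongoing transaction")
        self._append(prepare, self.partitions)   # the decision is durable here
        # Markers would be written to every registered partition at this point.
        self._append(done, set())

    def commit(self):
        self._end("prepare commit", "done commit")

    def abort(self):
        # Also the path taken when the coordinator times out a transaction.
        self._end("prepare abort", "finish abort")
```

Registering partitions O and P, committing, then registering both again and aborting yields the same sequence of statuses as the slides: ongoing, ongoing, prepare commit, done commit, ongoing, ongoing, prepare abort, finish abort.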
57
Read
Process
Write
ReadPosition
WritePosition
Atomic
The Kafka Approach:
1) Transactional writes across multiple
partitions
2) Protocol to guarantee single writer
58
Single Writer
(WTF is a transactional ID?)
59
A B C D E
Input Output
Processor
60
A B C
Input
Output
Processor 1
D E F
G H I
Processor 2
Processor 3
61
Output
A B C
Input
D E F
G H I
Processor 1
Processor 2
Processor 3
62
Output
A B C
Input
D E F
G H I
“Single writer” does not mean that an output partition has only one writer.
Processor 1
Processor 2
Processor 3
63
Output
A B C
Input
D E F
G H I
The guarantee we need is that there is a single writer tied to each input partition.
Processor 1
Processor 2
Processor 3
64
Consumer Group
Output
A B C
Input
D E F
G H I
Processor 1
Processor 2
Processor 3
66
Processor 1
Consumer Group
Output
A B C
Input
D E F
G H I
Processor 2
Processor 3
67
The Initialization Problem
68
Processor 2
A B C D E F
A’
Input
Output
1
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
69
Processor 2
A B C D E F
A’ B’ C’
Input
Output
1 3
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
70
Processor 2
A B C D E F
A’ B’ C’
Input
Output
1 3
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Transaction committing but
not complete
72
Processor 2
A B C D E F
A’ B’ C’
Input
Output
1 3
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Input partition reassigned
to processor 2
73
Processor 2
A B C D E F
A’ B’ C’ B’
Input
Output
1 3 2
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Processor 2 reads latest
committed position of 1
74
Processor 2
A B C D E F
A’ B’ C’ B’
Input
Output
1 3 2
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Transaction from processor
1 completes
75
The Fencing Problem
76
Processor 1’
A B C D E F
A’ B’ C’
Input
Output
1
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Processor 1 has an
ongoing transaction
77
Processor 1’
A B C D E F
A’ B’ C’
Input
Output
1
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Processor 1 is partitioned
from the cluster
78
Processor 1’
A B C D E F
A’ B’ C’
Input
Output
1
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Partition reassigned to
processor 2
79
Processor 1’
A B C D E F
A’ B’ C’
Input
Output
1
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Transaction is aborted
80
Processor 1’
A B C D E F
A’ B’ C’ B’
Input
Output
1 2
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
81
Processor 1’
A B C D E F
A’ B’ C’ B’
Input
Output
1 2
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
Processor 1 is able to
communicate again
82
Processor 1’
A B C D E F
A’ B’ C’ B’ D’
Input
Output
1 2
Position
Read
Process
Write
ReadPosition
WritePosition
Processor 1
Read
Process
Write
ReadPosition
WritePosition
84
Transactional ID
● Configured by the producer `transactional.id` property
● Defines a single writer scope
● Enforced by a monotonic epoch
● Initialization protocol to await pending transaction completion
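As a minimal sketch of how a monotonic epoch can enforce a single writer, consider the toy model below. The `Coordinator` class and its methods are hypothetical stand-ins written for this illustration, and the check is simplified to a single comparison. Each initialization bumps the epoch for a transactional ID, and any write carrying an older epoch is rejected.

```python
class ProducerFenced(Exception):
    """Raised when a write carries an epoch older than the current one."""


class Coordinator:
    """Toy coordinator tracking the latest epoch per transactional ID."""

    def __init__(self):
        self.epochs = {}

    def init_transactions(self, transactional_id):
        # Every initialization bumps the epoch, so epochs only move forward.
        epoch = self.epochs.get(transactional_id, 0) + 1
        self.epochs[transactional_id] = epoch
        return epoch

    def write(self, transactional_id, epoch, record, log):
        # A writer holding an older epoch has been replaced and is fenced.
        if epoch < self.epochs[transactional_id]:
            raise ProducerFenced(f"{transactional_id}: epoch {epoch} is stale")
        log.append(record)
```

This mirrors the fencing scenario from the earlier slides: once Processor 1' initializes with the same transactional ID, the old Processor 1 can no longer append, even if it regains connectivity.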
85
Transactional Id Initialization
1. Bump Epoch
2. Is Transaction In Progress? Yes: Begin Abort. No: continue.
3. Is Transaction Completing? Yes: Await Completion. No: continue.
4. Return New Epoch
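The initialization flow can be sketched as a single function. This is an illustration under assumptions: the ordering of the decisions is inferred from the flowchart labels, the state is a plain dict, and the function name `init_transactional_id` and the status strings are hypothetical. Awaiting completion is modeled as an immediate transition rather than a real wait.

```python
def init_transactional_id(state):
    """Return (new_state, actions) for one initialization request.

    `state` is a dict with keys "epoch" (int) and "status", where status is
    one of "empty", "ongoing", "prepare commit", or "prepare abort".
    """
    actions = ["bump epoch"]
    epoch = state["epoch"] + 1
    status = state["status"]

    if status == "ongoing":
        # A transaction left open by the previous writer must be aborted.
        actions.append("begin abort")
        status = "prepare abort"

    if status in ("prepare commit", "prepare abort"):
        # The outcome is already decided, so wait until it is fully applied.
        actions.append("await completion")
        status = "empty"

    actions.append("return new epoch")
    return {"epoch": epoch, "status": status}, actions
```

The key property is that the new writer never starts while an older transaction is still pending, which is what closes the gap shown in the initialization problem slides.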
90
Consumer Group
A B C
Input
Processor 1
D E F
H I J
Processor 2
Processor 3
txnl.id=A
epoch=1
txnl.id=B
epoch=1
txnl.id=C
epoch=1
91
Processor 1
Consumer Group
A B C
Input
D E F
H I J
txnl.id=A
epoch=1
txnl.id=B
epoch=1
txnl.id=C
epoch=1
Processor 2
Processor 3
92
Processor 1
Consumer Group
A B C
Input
D E F
H I J
txnl.id=A
epoch=2
txnl.id=B
epoch=1
txnl.id=C
epoch=1
Processor 2
Processor 3
93
Read
Process
Write
ReadPosition
WritePosition
The Kafka Approach:
1) Transactional writes across multiple
partitions
2) Protocol to guarantee single writer
Atomic
94
Read
Process
Write
ReadPosition
WritePosition
Atomic
Bump transactional ID
epoch and await pending
transaction completion
95
Read
Process
Write
ReadPosition
WritePosition
Atomic
Multi-partition
transactional write
protected by epoch
Bump transactional ID
epoch and await pending
transaction completion
96
EOS In Code
97
EOS In Code
consumer.assign(partitions)
producer.initTransactions()
while (true) {
    input, offsets = consumer.poll()
    output = process(input)
    producer.beginTransaction()
    producer.send(output)
    producer.sendOffsets(offsets)
    producer.commitTransaction()
}
100
EOS Producer Scalability
101
Consumer Group
Output
A B C
Input
Processor
D E F
G H I
Processor
Processor
102
Kafka Streams
Application
Output
A B C
Input
Thread
D E F
G H I
Thread
Thread
104
Streams Thread
Consumer Producer
state
state
state
Input Partition
Assignment
Output
Partitions
105
Streams Thread
Consumer Producer
state
state
state
Input Partition
Assignment
Output
Partitions
Task
106
Streams Thread
Consumer Producer
state
state
state
Input Partition
Assignment
Output
Partitions
For EOS, how do we guarantee single
writer scope for each input partition?
107
t=0: 0 1 2 3 4 5
At t=0, we have one thread which is assigned all partitions
108
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
At t=1, the group rebalances and we have two threads
109
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
We must track the transactional state dependence across rebalances!
110
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
t=2: 0 1 2 3 4 5
We must track the transactional state dependence across rebalances!
111
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
t=2: 0 1 2 3 4 5
t=3: 0 1 2 3 4 5
We must track the transactional state dependence across rebalances!
112
Streams Thread
Consumer Producer
state
state
state
Input Partition
Assignment
Output
Partitions
113
Streams Thread
Consumer
Producer(s)
state
state
state
Input Partition
Assignment
Output
Partitions
114
transactional.id = input partition
t=0: 0 1 2 3 4 5
115
transactional.id = input partition
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
116
transactional.id = input partition
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
t=2: 0 1 2 3 4 5
117
transactional.id = input partition
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
t=2: 0 1 2 3 4 5
t=3: 0 1 2 3 4 5
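A short sketch of what tying the transactional ID to the input partition implies. The naming scheme in `transactional_id`, the thread names, and the way the six partitions are split between threads are all assumptions made for illustration. The point is that the set of IDs stays the same across rebalances, at the cost of one producer per input partition.

```python
def transactional_id(app_id, topic, partition):
    """Hypothetical naming scheme: one transactional ID per input partition."""
    return f"{app_id}-{topic}-{partition}"


def producers_for(assignment, app_id="app", topic="input"):
    """Map each thread to the transactional IDs of the producers it must run."""
    return {
        thread: [transactional_id(app_id, topic, p) for p in partitions]
        for thread, partitions in assignment.items()
    }


def all_ids(assignment):
    """All transactional IDs in use under an assignment, in sorted order."""
    return sorted(tid for ids in producers_for(assignment).values() for tid in ids)


# The same six partitions before and after a rebalance (illustrative split).
t0 = {"thread-1": [0, 1, 2, 3, 4, 5]}
t1 = {"thread-1": [0, 1, 2], "thread-2": [3, 4, 5]}
```

Because `all_ids(t0)` and `all_ids(t1)` are equal, whichever thread takes over a partition can initialize the same transactional ID and fence the previous owner. The number of producers, however, grows with the number of input partitions, which is the scalability concern raised in this part of the talk.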
118
Streams Thread
Consumer
Producer(s)
state
state
state
Input Partition
Assignment
Output
Partitions
119
Record
Accumulator
Anatomy of the Kafka
Producer
IO
Thread
Network
Layer
Buffer Pool
Txn
Manager
120
What is the problem?
121
What is the problem?
● The transactional producer assumes a static assignment of input partitions
● Consumer group partition assignments are dynamic
122
What is the problem?
consumer.assign(partitions)
producer.initTransactions()
while (true) {
    input, offsets = consumer.poll()
    output = process(input)
    producer.beginTransaction()
    producer.send(output)
    producer.sendOffsets(offsets)
    producer.commitTransaction()
}
124
What is the problem?
consumer.subscribe(topics)
producer.initTransactions()
while (true) {
    input, offsets = consumer.poll()
    output = process(input)
    producer.beginTransaction()
    producer.send(output)
    producer.sendOffsets(offsets)
    producer.commitTransaction()
}
125
How to fix this
126
1. Allow the producer to multiplex many
transactional IDs
Options
127
Txn
Manager
Record
Accumulator
Anatomy of the Kafka
Producer
IO
Thread
Network
Layer
Buffer Pool
128
Txn
Manager
Txn
Manager
Record
Accumulator
Anatomy of the Kafka
Producer
IO
Thread
Network
Layer
Buffer Pool Txn
Manager
129
Record
Accumulator
Record
Accumulator
Txn
Manager
Txn
Manager
Record
Accumulator
Anatomy of the Kafka
Producer
IO
Thread
Network
Layer
Buffer Pool Txn
Manager
130
1. Allow the producer to multiplex many
transactional IDs
Options
131
1. Allow the producer to multiplex many
transactional IDs
2. Producer pooling for better resource sharing
Options
132
Options
1. Allow the producer to multiplex many transactional IDs
2. Producer pooling for better resource sharing
3. Address the assignment dependency problem
134
KIP-447
135
Txn Coordinator 1
Txn Coordinator 2
Txn Coordinator 3
Kafka Streams
Application
Thread
Thread
Thread
136
Txn Coordinator 1
Txn Coordinator 2
Txn Coordinator 3
Kafka Streams
Application
Thread
Thread
Thread
txnl.id=A
137
Txn Coordinator 1
Txn Coordinator 2
Txn Coordinator 3
Kafka Streams
Application
Thread
Thread
Thread
txnl.id=A
txnl.id=B
138
Txn Coordinator 1
Txn Coordinator 2
Txn Coordinator 3
Kafka Streams
Application
Thread
Thread
Thread
txnl.id=A
txnl.id=B
txnl.id=C
txnl.id=D
txnl.id=E
139
1. Use the shared group id to find the transaction
coordinator
KIP-447
Recipe
140
Txn Coordinator 1
Txn Coordinator 2
Txn Coordinator 3
Kafka Streams
Application
Thread
Thread
Thread
141
1. Use the shared group id to find the transaction
coordinator
2. Make the transaction coordinator aware of
group partition assignments
KIP-447
Recipe
142
t=0: 0 1 2 3 4 5
t=1: 0 1 2 3 4 5
t=2: 0 1 2 3 4 5
t=3: 0 1 2 3 4 5
143
generation=0: 0 1 2 3 4 5
generation=1: 0 1 2 3 4 5
generation=2: 0 1 2 3 4 5
generation=3: 0 1 2 3 4 5
144
1. Use the shared group id to find the transaction
coordinator
2. Make the transaction coordinator aware of
group partition assignments
3. Add logic to initialize and fence with
consideration of assignment
KIP-447
Recipe
145
Bump Epoch
Is Transaction
In Progress?
Begin Abort
Is Transaction
Completing?
Await
Completion
Yes
Yes
No
Return New
Epoch
No
Transactional Id
Initialization
146
Bump Epoch
Is Transaction
In Progress?
Begin Abort
Is Transaction
Completing?
Await
Completion
Yes
Yes
No
Return New
Epoch
No
Transaction
Assignment
Initialization
For all
assigned
partitions:
147
KIP-447 Recipe
1. Use the shared group id to find the transaction coordinator
2. Make the transaction coordinator aware of group partition assignments
3. Add logic to initialize and fence with consideration of assignment
4. Expose all this in a nice API
148
KIP-447 In Code
consumer.subscribe(topics)
producer.initTxn()
while (true) {
    input, offsets = consumer.poll()
    output = process(input)
    producer.beginTransaction()
    producer.send(output)
    producer.sendOffsets(offsets)
    producer.commitTransaction()
}
149
KIP-447 In Code
consumer.subscribe(topics,
    onAssignment(partitions, gen) {
        producer.initTxn(partitions, gen)
    })
while (true) {
    input, offsets = consumer.poll()
    output = process(input)
    producer.beginTransaction()
    producer.send(output)
    producer.sendOffsets(offsets)
    producer.commitTransaction()
}
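One way to picture fencing that takes the assignment into account is the toy model below. It is a sketch under assumptions, not the API or protocol of KIP-447: the `GroupTxnCoordinator` class, its method names, and the ownership table are hypothetical. The coordinator records which member owns each input partition at which group generation, and rejects anything that refers to an older assignment.

```python
class StaleGeneration(Exception):
    """Raised when a producer acts on an assignment from an older generation."""


class GroupTxnCoordinator:
    """Toy coordinator that fences per input partition using the group generation."""

    def __init__(self):
        self.owners = {}   # partition -> (generation, member)

    def init_txn(self, member, partitions, generation):
        for p in partitions:
            current_gen, _ = self.owners.get(p, (-1, None))
            if generation < current_gen:
                raise StaleGeneration(f"partition {p} is at generation {current_gen}")
            # A fuller model would also abort or await any pending
            # transaction of the previous owner at this point.
            self.owners[p] = (generation, member)

    def commit_offsets(self, member, partition, generation):
        # Only the current owner at the current generation may commit.
        if self.owners.get(partition) != (generation, member):
            raise StaleGeneration(f"{member} no longer owns partition {partition}")
        return True
```

With this shape, a single producer per thread is enough: the single writer scope is expressed by the (partition, generation) pair rather than by a separate transactional ID per input partition.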
150
Streams Thread
Consumer Producer
state
state
state
Input Partition
Assignment
Output
Partitions
151
Error Resiliency
152
[kafka-producer-network-thread | producer-1] ERROR o.a.k.clients.producer.internals.Sender -
[Producer clientId=producer-1] The broker returned
org.apache.kafka.common.errors.UnknownProducerIdException: This exception is
raised by the broker if it could not locate the producer metadata associated with the producerId in
question. This could happen if, for instance, the producer's records were deleted because their
retention time had elapsed. Once the last records of the producerId are removed, the producer's
metadata is removed from the broker, and future appends by the producer will return this
exception. for topic-partition foo-0 at offset -1. This indicates data loss on the broker, and
should be investigated.
153
Record
Accumulator
Anatomy of the Kafka
Producer
IO
Thread
Network
Layer
Buffer Pool
Txn
Manager
155
Anatomy of the Transaction Manager
State Machine
state: InTransaction
Partition  Sequence Number
foo-0      5
bar-1      23
baz-0      16
156
Broker Partition State
Log
Offset      0  1  2  3  4
ProducerId  3  0  0  1  0
Epoch       2  1  1  1  2
Sequence    1  1  2  1  1
157
Broker Partition State
Log
Offset      0  1  2  3  4
ProducerId  3  0  0  1  0
Epoch       2  1  1  1  2
Sequence    1  1  2  1  1
Producer State Cache
ProducerId  Epoch  Sequence
0           2      1
1           1      1
3           2      1
158
Broker Partition State
Log
Offset      0  1  2  3  4  5
ProducerId  3  0  0  1  0  1
Epoch       2  1  1  1  2  1
Sequence    1  1  2  1  1  2
Producer State Cache
ProducerId  Epoch  Sequence
0           2      1
1           1      2
3           2      1
159
Broker Partition State
Log
Offset      0  1  2  3  4  5  6
ProducerId  3  0  0  1  0  1  0
Epoch       2  1  1  1  2  1  2
Sequence    1  1  2  1  1  2  2
Producer State Cache
ProducerId  Epoch  Sequence
0           2      2
1           1      2
3           2      1
160
Broker Partition State
Log
Offset      0  1  2  3  4  5  6
ProducerId  3  0  0  1  0  1  0
Epoch       2  1  1  1  2  1  2
Sequence    1  1  2  1  1  2  2
Records deleted after hitting retention limit
Producer State Cache
ProducerId  Epoch  Sequence
0           2      2
1           1      2
3           2      1
161
Broker Partition State
Log
Offset      4  5  6
ProducerId  0  1  0
Epoch       2  1  2
Sequence    1  2  2
Producer State Cache
ProducerId  Epoch  Sequence
0           2      2
1           1      2
3           2      1
162
Broker Partition State
Log
Offset      4  5  6
ProducerId  0  1  0
Epoch       2  1  2
Sequence    1  2  2
Producer State Cache
ProducerId  Epoch  Sequence
0           2      2
1           1      2
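The sequence of slides above can be replayed with a small model. This is a sketch under assumptions: the `PartitionState` class and its methods are hypothetical, sequence numbers start at 1 to match the numbering on the slides (real Kafka sequence numbers start at 0), and fencing of older epochs is left out. The cache holds the last (epoch, sequence) per producer and is trimmed when retention removes a producer's last record.

```python
FIRST_SEQUENCE = 1   # matches the slides; an assumption of this sketch


class UnknownProducerId(Exception):
    """The broker has no cached state for this producer."""


class OutOfOrderSequence(Exception):
    """The sequence number is not the next one expected."""


class PartitionState:
    """Toy broker-side partition: a log plus a producer state cache."""

    def __init__(self):
        self.log = []          # (offset, producer_id, epoch, sequence)
        self.cache = {}        # producer_id -> (epoch, last sequence)
        self.next_offset = 0

    def append(self, producer_id, epoch, sequence):
        if producer_id not in self.cache:
            # With no cached state, only a fresh sequence can be accepted.
            if sequence != FIRST_SEQUENCE:
                raise UnknownProducerId(producer_id)
        else:
            last_epoch, last_seq = self.cache[producer_id]
            expected = last_seq + 1 if epoch == last_epoch else FIRST_SEQUENCE
            if sequence != expected:
                raise OutOfOrderSequence(f"expected {expected}, got {sequence}")
        self.log.append((self.next_offset, producer_id, epoch, sequence))
        self.cache[producer_id] = (epoch, sequence)
        self.next_offset += 1

    def delete_records_before(self, offset):
        """Retention: drop old records, then drop producers with no records left."""
        self.log = [r for r in self.log if r[0] >= offset]
        live = {r[1] for r in self.log}
        self.cache = {pid: s for pid, s in self.cache.items() if pid in live}
```

Replaying the seven appends from the slides and then deleting everything before offset 4 removes producer 3 from the cache. If producer 3 then sends its next record with sequence 2, the model raises `UnknownProducerId`, which is the error shown in the log excerpt earlier in this section.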
163
Well..
164
Repartition Topics
- Used to change the partition key in Kafka Streams
- Once repartitioned data has been consumed, it is no longer needed
- Proactively delete unneeded data!
165
Broker Partition State
Log
Offset      0  1  2  3  4  5  6
ProducerId  3  0  0  1  0  1  0
Epoch       2  1  1  1  2  1  2
Sequence    1  1  2  1  1  2  2
Producer State Cache
ProducerId  Epoch  Sequence
0           2      2
1           1      2
3           2      1
167
Offset
ProducerId
Epoch
Sequence
Log
Broker Partition State
ProducerId Epoch Sequence
0 2 2
1 1 2
3 2 1
Producer
State Cache
168
Offset
ProducerId
Epoch
Sequence
Log
Broker Partition State
ProducerId Epoch Sequence
Producer
State Cache
169
How to fix this
170
Log
171
Log
172
m
Log
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
174
m
Log
ProducerId Epoch Sequence
0 3 5
1 1 1
n.snapshot
n
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
176
m
Log
ProducerId Epoch Sequence
0 3 5
n.snapshot
n
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
177
Maybe just don’t do that?
178
m
Log
ProducerId Epoch Sequence
0 3 5
1 1 1
n.snapshot
n
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
179
Don’t forget about replication
180
m n
replica 1
ProducerId Epoch Sequence
0 3 5
1 1 1
n.snapshot
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
181
m n
replica 1
ProducerId Epoch Sequence
0 3 5
1 1 1
n.snapshot
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
replica 2
183
m n
replica 1
ProducerId Epoch Sequence
0 3 5
1 1 1
n.snapshot
ProducerId Epoch Sequence
0 2 1
1 1 1
m.snapshot
replica 2
ProducerId Epoch Sequence
0 3 5
184
Sequence Conflicts
● Log record deletion
● Log start offset inconsistency
● Unclean leader election
185
KIP-360
186
replica 1
ProducerId Epoch Sequence
0 3 5
1 1 1
State Cache
replica 2
ProducerId Epoch Sequence
0 3 5
State Cache
187
replica 1
(leader)
ProducerId Epoch Sequence
0 3 5
1 1 1
State Cache
replica 2
(follower)
ProducerId Epoch Sequence
0 3 5
State Cache
188
replica 1
(leader)
ProducerId Epoch Sequence
0 3 5
1 1 3
replica 2
(follower)
ProducerId Epoch Sequence
0 3 5
State Cache State Cache
189
replica 1
(leader)
ProducerId Epoch Sequence
0 3 5
1 1 3
replica 2
(follower)
ProducerId Epoch Sequence
0 3 5
1 1 3
State Cache State Cache
190
replica 1
(follower)
ProducerId Epoch Sequence
0 3 5
1 1 1
replica 2
(leader)
ProducerId Epoch Sequence
0 3 5
State Cache State Cache
191
replica 1
(follower)
ProducerId Epoch Sequence
0 3 5
1 1 1
replica 2
(leader)
ProducerId Epoch Sequence
0 3 5
1 2 2
State Cache State Cache
192
replica 1
(follower)
ProducerId Epoch Sequence
0 3 5
1 2 2
replica 2
(leader)
ProducerId Epoch Sequence
0 3 5
1 2 2
State Cache State Cache
193
KIP-360 Error Handling
Error               | Current Impact | Future Recovery
OutOfOrderSequence  | FATAL          | Abort transaction, bump epoch and continue
UnknownProducerId   | FATAL          | Abort transaction, bump epoch and continue
194
KIP-360/KIP-447 Error Handling
Error               | Current Impact | Future Recovery
OutOfOrderSequence  | FATAL          | Abort transaction, bump epoch and continue
UnknownProducerId   | FATAL          | Abort transaction, bump epoch and continue
ProducerFenced      | FATAL          | Rejoin consumer group, reinitialize transaction state
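The recovery column of the table can be sketched as a small error handler. This is an illustration only: the `ToyProducer` class, its fields, and the string error names are hypothetical, and the real client changes involve coordination with the broker that is not modeled here.

```python
RECOVERABLE = {"OutOfOrderSequence", "UnknownProducerId"}


class ToyProducer:
    """Toy producer-side state for the proposed error recovery."""

    def __init__(self):
        self.epoch = 0
        self.sequences = {}          # partition -> next sequence number
        self.in_transaction = False
        self.needs_rejoin = False

    def handle_error(self, error):
        if error in RECOVERABLE:
            # Abort the transaction, bump the epoch, and start sequences over.
            self.in_transaction = False
            self.epoch += 1
            self.sequences.clear()
            return "continue"
        if error == "ProducerFenced":
            # Rejoin the consumer group and reinitialize transaction state.
            self.in_transaction = False
            self.needs_rejoin = True
            return "rejoin"
        raise RuntimeError(f"fatal: {error}")
```

The design intent in the table is that errors which are fatal today become recoverable: the application keeps the same producer instance and continues after the abort instead of shutting down.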
195
Conclusion
196
Kafka has an elegant transaction model which has held up well.
To reach full maturity:
● Address the producer/consumer semantic mismatch
● Improve producer resilience to errors
197
Thank you!
198
Resources
- KIP-447: https://cwiki.apache.org/confluence/display/KAFKA/KIP-447%3A+Producer+scalability+for+exactly+once+semantics
- KIP-360: https://cwiki.apache.org/confluence/display/KAFKA/KIP-360%3A+Improve+handling+of+unknown+producer
- Contribute to Apache Kafka: https://kafka.apache.org/contributing
- Join Confluent: https://www.confluent.io/careers/

Exactly Once Semantics Revisited (Jason Gustafson, Confluent) Kafka Summit NYC 2019