Exactly-Once, Again: Adding EOS
Support for Kafka Connect Source
Connectors
Chris Egerton
Nice to meet you!
Chris Egerton
Open Source Program
Office @Aiven
Apache Kafka
committer and PMC
member
(Official bio) (Unofficial bio)
EOS
● “Exactly-once semantics”
● “Semantics” instead of “delivery”, “guarantees”, “delivery
guarantees”, etc. (see Two Generals’ Problem)
● Levels:
○ Probably-once
○ At-least-once
○ At-most-once
○ Exactly-once
● With all else equal, exactly-once is best
● But of course, it’s the hardest to implement
Source Connectors
● Kafka stores and transmits events. Where do these events
come from, and where do they go?
● DIY producer/consumer application? Nah 👎
● Connectors: no-code (or low-code) applications to integrate
Kafka with other systems
● Sink connectors write data from Kafka to the external system
● Source connectors read data from the external system into
Kafka
Kafka Connect
● Distributed, horizontally-scalable, fault-tolerant
ingest/export tool for Kafka
● Developers implement connectors
against the Kafka Connect API
● Cluster administrators install connectors
onto one or more Kafka Connect workers,
which combine to form a cluster
● Users can then create and manage
connectors on that cluster by submitting
JSON configurations via a REST API
● (For users) No code required!
{
  "name": "local-file-source",
  "connector.class": "FileStreamSource",
  "tasks.max": "1",
  "file": "test.txt",
  "topic": "connect-test"
}
We’re going to talk about designing support for exactly-once
semantics (EOS) with source connectors developed for Kafka
Connect.
In summary…
Challenges
1⃣ Defining source offsets
2⃣ Storing and retrieving source offsets
3⃣ Retrying delivery to Kafka
4⃣ Zombies 🧟🧟🧟🧟🧟
Source offsets, in detail
● Source connectors provide source records
● Source records come with source offsets (partition + offset)
● On startup, connectors use source offsets to know where to
resume from
● Source offsets are stored in an offsets topic by Kafka
Connect
✅🎉
// TODO
● Defining source offsets (1⃣) is the responsibility of the connector
● Storing and retrieving source offsets (2⃣) is the responsibility of Kafka Connect
Exactly-once for Kafka clients
● KIP-98: Exactly Once Delivery and Transactional Messaging
○ Released in 0.11.0.0
○ Hence, “Again” in the title
● Idempotent producer: retries without duplicates (3⃣)
● Transactional producer: atomic cross-topic writes
● Transactional IDs: single writer per ID
○ Initialize transactions with a producer to fence out other producers with the same transactional ID
✅🎉
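The "single writer per ID" rule can be illustrated with a toy model. This is a minimal sketch, not Kafka's actual implementation: `ToyTransactionCoordinator`, its epoch counter, and the ID `my-transactional-id` are hypothetical names invented for illustration. It assumes fencing works by handing out a fresh epoch each time transactions are initialized for an ID, and rejecting writes from any producer that holds an older epoch.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy stand-in for broker-side transaction state; not Kafka's real implementation. */
class ToyTransactionCoordinator {
    // Latest epoch handed out for each transactional ID.
    private final Map<String, Integer> latestEpochs = new HashMap<>();

    /** Models "initialize transactions": bumps the epoch, fencing out earlier producers. */
    int initTransactions(String transactionalId) {
        return latestEpochs.merge(transactionalId, 1, Integer::sum);
    }

    /** A write is accepted only from the producer that holds the latest epoch. */
    boolean tryWrite(String transactionalId, int producerEpoch) {
        return latestEpochs.getOrDefault(transactionalId, 0) == producerEpoch;
    }
}

class FencingSketch {
    public static void main(String[] args) {
        ToyTransactionCoordinator coordinator = new ToyTransactionCoordinator();
        // Two producers initialize transactions with the same (hypothetical) ID.
        int oldEpoch = coordinator.initTransactions("my-transactional-id");
        int newEpoch = coordinator.initTransactions("my-transactional-id");
        System.out.println("old producer can write: "
                + coordinator.tryWrite("my-transactional-id", oldEpoch));
        System.out.println("new producer can write: "
                + coordinator.tryWrite("my-transactional-id", newEpoch));
    }
}
```

In this model the older producer is rejected as soon as the newer one initializes transactions, which is the property the zombie fencing slides rely on.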
Tracking source offsets (2⃣)
Before:
● Poll connector for source records
● Write records to Kafka
● Periodically write (commit) source offsets to Kafka
● Provides at-least-once support
After:
● Begin transaction
● Poll connector for source records
● Write records immediately to Kafka (still)
● Write source offsets to Kafka
● Commit transaction
● Provides exactly-once support
✅🎉
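The "after" sequence can be sketched with a toy transaction that covers both the data topic and the offsets topic. This is a simplified model under stated assumptions: `ToyTransaction` is a hypothetical class, and it buffers writes until commit, whereas Kafka appends transactional records to the log right away and hides them from read-committed consumers until the transaction commits. The point it shows is only that records and their source offsets become visible together or not at all.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of one transaction that spans a data topic and the offsets topic. */
class ToyTransaction {
    private final List<String> dataTopic;
    private final List<String> offsetsTopic;
    private final List<String> pendingRecords = new ArrayList<>();
    private String pendingOffset;

    ToyTransaction(List<String> dataTopic, List<String> offsetsTopic) {
        this.dataTopic = dataTopic;
        this.offsetsTopic = offsetsTopic;
    }

    void writeRecord(String record) {
        pendingRecords.add(record);
    }

    void writeSourceOffset(String offset) {
        pendingOffset = offset;
    }

    /** Records and their source offset become visible together. */
    void commit() {
        dataTopic.addAll(pendingRecords);
        if (pendingOffset != null) {
            offsetsTopic.add(pendingOffset);
        }
        pendingRecords.clear();
        pendingOffset = null;
    }

    /** Neither the records nor the source offset become visible. */
    void abort() {
        pendingRecords.clear();
        pendingOffset = null;
    }
}

class AtomicOffsetsSketch {
    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        List<String> offsets = new ArrayList<>();

        ToyTransaction first = new ToyTransaction(data, offsets);
        first.writeRecord("line-1");
        first.writeSourceOffset("position=1");
        first.commit();

        // The task fails before committing: no record and no offset are exposed,
        // so a restarted task resumes from position=1 without producing duplicates.
        ToyTransaction second = new ToyTransaction(data, offsets);
        second.writeRecord("line-2");
        second.abort();

        System.out.println(data + " " + offsets);
    }
}
```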
75% done!
1⃣ Defining source offsets ✅
2⃣ Storing and retrieving source offsets ✅
3⃣ Retrying delivery to Kafka ✅
4⃣ That leaves… 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟
“Chekhov’s gun cabinet”
Kafka Connect: diving deeper
● Connectors generate configurations for one or more tasks
● Tasks can be spread across multiple Kafka Connect workers
● Tasks are assigned to workers during a rebalance
● A single worker, the leader, determines that assignment
● Task configurations are stored in a single-partition Kafka topic called the config topic
● For source connectors, Kafka Connect uses one producer per task to send source records to Kafka
Zombie fencing: correctness goals
Tolerated scenarios:
● (Un)graceful worker shutdown
● Arbitrary worker startups
● Arbitrarily-long, arbitrarily-placed, and arbitrarily-timed pauses
Zombie fencing: UX goals
● Minimal unnecessary interruptions (keep processing data)
● Minimal changes to connector code
● Minimal connector management API changes
Zombie fencing: actually pretty easy?
● Give each task a transactional ID derived from the name of
the connector and the task ID
○ E.g., “reddit-source-0” or “chris-ksl-3”
● Let tasks fence out older instances on startup
○ Fencing: disabling a producer from writing to Kafka
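As a small illustration of the naming scheme, the sketch below derives IDs in the style of the slide's examples. The class and method names are hypothetical, and the exact format used by a real Kafka Connect cluster may include additional components; treat this as an assumption-laden sketch rather than the production format.

```java
class TransactionalIdSketch {
    /** Derives a per-task transactional ID in the style of the slide's examples. */
    static String transactionalId(String connectorName, int taskId) {
        return connectorName + "-" + taskId;
    }

    public static void main(String[] args) {
        System.out.println(transactionalId("reddit-source", 0));
        System.out.println(transactionalId("chris-ksl", 3));
    }
}
```

Because the ID depends only on the connector name and the task ID, a restarted instance of the same task derives the same ID and can fence out its predecessor.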
Zombie fencing: actually pretty easy?
[Diagram: Reddit source (old tasks) and Reddit source (new tasks), each with reddit-source-0 through reddit-source-3. Each new task fences out the old task with the same transactional ID.]
● So… does this work?
● Lol I wish 🥲
Source partition reshuffling problem
● Source connectors read from source partitions
○ Database tables, Kafka topics, etc.
● These are allocated across tasks
● That allocation can change over time
Source partition reshuffling problem
[Diagram, three stages:
1. Initial allocation (2 tasks, 2 partitions): Task 0 reads P0, Task 1 reads P1
2. A new partition (P2) is created: Task 0 reads P0 and P2, Task 1 reads P1
3. An existing partition (P1) is deleted: Task 0 reads P0, Task 1 reads P2]
Source partition reshuffling problem
[Diagram: old tasks: Task 0 reads P0 and P2, Task 1 reads P1. New tasks: Task 1 reads P2. New Task 1 fences out only old Task 1.]
● New task T1 starts before old task T0 stops
● Both are assigned partition P2
● Duplicates abound!
Zombie fencing: second try
● Whenever new task configurations are read from the config topic, perform a round of zombie fencing
○ For every new task, take its transactional ID and preemptively initialize transactions
● Now it’s impossible for any older instances of newly-created tasks to be running before any newer instances are started
● Do this during rebalance
Zombie fencing: first try
[Diagram: Reddit source (old tasks) and Reddit source (new tasks), each with reddit-source-0 through reddit-source-3. Each new task individually fences out the old task with the same transactional ID.]
Zombie fencing: second try
[Diagram: Reddit source (old tasks) and Reddit source (new tasks), each with reddit-source-0 through reddit-source-3. A single zombie fencing round fences out all old tasks before the new tasks start.]
● Woohoo, we did it!
● Just kidding, put your hand back down
Connector resizing
[Diagram: the connector shrinks from four tasks (T0, T1, T2, T3) to three tasks (T0, T1, T2). The zombie fencing round* covers only T0, T1, and T2.]
* - based on new tasks
● Old task running after newer tasks are started
● If a partition is reassigned from old T3 to new T0/T1/T2…
● Duplicates abound
Zombie fencing: third time’s the charm
● Instead of using the new set of task configs for our round of zombie fencing, use the old set
● Again, do this during rebalance
● Alright, this has to be enough, right?
● Well… how do we know about the old set of task configs?
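The gap between fencing by the new task set and fencing by the old one can be shown with a short sketch. It assumes the `connector-taskId` naming from the earlier slides and a connector that shrinks from four tasks to three; `idsToFence` is a hypothetical helper, not a Kafka Connect API.

```java
import java.util.ArrayList;
import java.util.List;

class FencingRoundSketch {
    /** Transactional IDs covered by a zombie fencing round over taskCount tasks. */
    static List<String> idsToFence(String connector, int taskCount) {
        List<String> ids = new ArrayList<>();
        for (int task = 0; task < taskCount; task++) {
            ids.add(connector + "-" + task);
        }
        return ids;
    }

    public static void main(String[] args) {
        int oldTaskCount = 4;
        int newTaskCount = 3;
        // A round based on the new tasks never touches reddit-source-3,
        // so an old instance of that task could keep writing.
        System.out.println("based on new tasks: " + idsToFence("reddit-source", newTaskCount));
        System.out.println("based on old tasks: " + idsToFence("reddit-source", oldTaskCount));
    }
}
```

Only the round based on the old task count covers every producer that might still be running, which is why the worker needs a reliable way to learn that old count.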
Zombie leaders
● The leader of the cluster is the only one allowed to publish task configs to the config topic
● However, we don’t enforce this very strongly
● Some workers may mistakenly believe they are the leader
● Inaccurate task configs may be published in rapid succession, overwriting valid task configs
Zombie leaders
Event                     Config topic
Starting state            my-connector (3 tasks)
Write by zombie leader    my-connector (2 tasks)
Write by actual leader    my-connector (1 task)
Rebalance
● On rebalance, leader sees:
○ One new task
○ Two old tasks
● Leader should fence out three tasks
● But leader only fences out two
Zombie fencing: guarded config topic
● How do we prevent zombie leaders from accessing the config topic?
● Transactional producer to the rescue (again)!
● Is this enough?
● Are we guaranteed that the config topic can be used as a source of truth for zombie fencing now?
● 🙃🙃🙃
Leadership change
● If a leader falls out of the cluster, a new one is chosen
● How does the new leader know what needs fencing?
Leadership change (with new tasks)
Event                           Config topic
Starting state                  my-connector (3 tasks)
Write by leader                 my-connector (2 tasks)
Rebalance (+ zombie fencing)
Leader falls out of cluster
Rebalance (+ new leader)
● On first rebalance, old tasks are fenced out successfully
● On second rebalance, new leader doesn’t have to do any fencing
Leadership change (with new tasks)
Event                                                  Config topic
Starting state                                         my-connector (3 tasks)
Write by leader                                        my-connector (2 tasks)
Leader falls out of cluster (before rebalance)
Rebalance (+ new leader)
● On rebalance, new leader has to fence out old tasks
● But how can it tell?
● The config topic looks the same in both scenarios
● Remember, we want to avoid unnecessary interruptions
Zombie fencing: fence, then write
● Current sequence of events:
○ Publish new task configs
○ Rebalance
○ Zombie fencing
○ Start new tasks
● New order:
○ Zombie fencing
○ Publish new task configs
○ Rebalance
○ Start new tasks
● This has to be it, right?
● You wish 😈
That was not a good idea
● Poor UX
○ Causes tasks to fail in between zombie fencing and end
of rebalance
○ Forcibly kills them, no chance to commit pending offsets
○ Looks like a bug to users
● Correctness issue
○ Users can manually restart failed tasks
○ Even in between zombie fencing and publishing new
task configs
○ Uh oh, a zombie task made it to the other end of the
rebalance!
Zombie fencing: durable task counts
● Forget the “fence then write” logic
● Instead, we explicitly track the number of to-be-fenced tasks
in the config topic with task count records
● These serve two purposes:
○ Explicitly: if fencing is necessary, how many tasks have
to be fenced out
○ Implicitly: determine whether zombie fencing is
necessary
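The two purposes can be sketched as a single scan over a toy config topic. Everything here is an assumption made for illustration: `ToyConfigRecord` and `tasksToFence` are hypothetical names, the topic is modeled as an in-memory list for one connector, and a connector with no task count record yet is treated as having zero tasks to fence. The rule sketched is that fencing is needed exactly when task configs appear after the latest task count record, and the number of tasks to fence is the value of that record.

```java
import java.util.Arrays;
import java.util.List;
import java.util.OptionalInt;

/** Toy config topic record: either a set of task configs or a task count record. */
class ToyConfigRecord {
    final boolean isTaskCount;
    final int tasks;

    private ToyConfigRecord(boolean isTaskCount, int tasks) {
        this.isTaskCount = isTaskCount;
        this.tasks = tasks;
    }

    static ToyConfigRecord taskConfigs(int tasks) {
        return new ToyConfigRecord(false, tasks);
    }

    static ToyConfigRecord taskCount(int tasks) {
        return new ToyConfigRecord(true, tasks);
    }
}

class TaskCountSketch {
    /** Returns how many tasks to fence, or empty if no fencing is needed. */
    static OptionalInt tasksToFence(List<ToyConfigRecord> configTopic) {
        int latestTaskCount = 0;
        boolean taskConfigsAfterCount = false;
        for (ToyConfigRecord record : configTopic) {
            if (record.isTaskCount) {
                latestTaskCount = record.tasks;
                taskConfigsAfterCount = false;
            } else {
                taskConfigsAfterCount = true;
            }
        }
        return taskConfigsAfterCount
                ? OptionalInt.of(latestTaskCount)
                : OptionalInt.empty();
    }

    public static void main(String[] args) {
        // Three tasks were running, then the connector was reconfigured to two.
        List<ToyConfigRecord> configTopic = Arrays.asList(
                ToyConfigRecord.taskConfigs(3),
                ToyConfigRecord.taskCount(3),
                ToyConfigRecord.taskConfigs(2));
        System.out.println(tasksToFence(configTopic));
    }
}
```

Because the decision depends only on what is durably in the topic, a newly elected leader reaches the same answer as its predecessor would have.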
Zombie fencing: durable task counts
Event                           Config topic
Starting state                  my-connector (3 tasks)
                                my-connector-task-count (3)
New task configs                my-connector (2 tasks)
Rebalance (+ zombie fencing)    my-connector-task-count (2)
● On rebalance:
○ Leader fences three tasks based on latest task count record
○ Leader writes new task count of two tasks based on latest task configs
Zombie fencing: durable task counts
Event                           Config topic                    Safe to bring up tasks?
Starting state                  my-connector (3 tasks)
                                my-connector-task-count (3)     ✅
New task configs                my-connector (2 tasks)          ❌
Rebalance (+ zombie fencing)    my-connector-task-count (2)     ✅
● What could possibly break this now?
● Oh yeah, what was that about task restarts?
Laggy task startup
● Zombie fencing disables all initialized task producers from
writing to Kafka
● What if a zombie task lags and hasn’t initialized its producer
by the time zombie fencing for a new generation of tasks
takes place?
● Or, what if a task is restarted on a zombie worker after
zombie fencing takes place?
Laggy task startup
[Diagram: Reddit source (4 old tasks: reddit-source-0 through reddit-source-3) is replaced by Reddit source (3 new tasks: reddit-source-0 through reddit-source-2). Old reddit-source-3 is lagging during startup when the zombie fencing round runs, and has finished startup only afterward.]
Zombie fencing: check your work
● After initializing transactions for a task producer, have to make sure it’s still safe to run the task
● New sequence of events:
○ Decide to (re)start task
○ Create producer for task and initialize transactions
○ Read to end of config topic
○ If new task configs found, abort startup and abandon the task
○ Otherwise, safe to start processing data
● Have we finally done it?
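The check step of this sequence can be sketched over a toy config topic. This is a minimal illustration under stated assumptions: the config topic is an in-memory list whose entries name the connector whose task configs were published, `startedFromIndex` is a hypothetical stand-in for the position of the task configs the task instance was launched from, and the "claim" step (creating the producer and initializing transactions) is only marked by a comment.

```java
import java.util.Arrays;
import java.util.List;

class ClaimThenCheckSketch {
    /**
     * Returns true if it is safe to start processing data.
     *
     * @param configTopic      toy config topic; each entry names the connector
     *                         whose task configs were published
     * @param connector        connector that owns the task being started
     * @param startedFromIndex index of the task-config entry this task instance
     *                         was launched from
     */
    static boolean safeToStart(List<String> configTopic, String connector, int startedFromIndex) {
        // Claim: the real task would create its producer and initialize
        // transactions here, fencing out older instances of itself.

        // Check: read to the end of the config topic, looking for newer task configs.
        for (int i = startedFromIndex + 1; i < configTopic.size(); i++) {
            if (configTopic.get(i).equals(connector)) {
                return false; // newer task configs found: abort startup, abandon the task
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> configTopic = Arrays.asList("reddit-source", "other-connector", "reddit-source");
        System.out.println("stale instance safe: " + safeToStart(configTopic, "reddit-source", 0));
        System.out.println("fresh instance safe: " + safeToStart(configTopic, "reddit-source", 2));
    }
}
```

The ordering matters: claiming first means any fencing round that ran before the claim is detected by the check, while any round that runs after the claim fences the producer directly.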
🎉🎉🎉 Yes! 🎉🎉🎉
(But…)
Caveats
● Fencing during rebalancing is not a good idea
○ Makes rebalances more brittle
○ Requires a new rebalance any time we want to restart a
task that failed due to failed zombie fencing
● Instead, we fence outside of rebalances
○ During task startup, workers issue a REST request to the
leader to perform zombie fencing for the connector
○ The leader will perform that round (if necessary), then
send back a 2XX response
○ If a non-2XX response is received, the task is marked
failed
○ Tasks can easily be restarted
Caveats
● Throwaway producers for initializing transactions is wasteful
● We added a new admin client API in 3.3 to do this instead
● We have to be careful about how the leader uses the transactional producer
● Similar “claim-then-check” logic to source tasks
In practice
Implementation details are boring, how do we actually use this feature?
In practice (cluster administrators)
● New clusters:
○ Use version 3.3.0 or later
○ Configure every worker with exactly.once.source.support = enabled
● Existing clusters:
○ Rolling upgrade 1: bring all workers to 3.3.0 or later and configure every worker with exactly.once.source.support = preparing
○ Rolling upgrade 2: configure every worker with exactly.once.source.support = enabled
In practice (downstream readers)
● Have to filter out records from aborted transactions
● If using the Java consumer, configure with isolation.level
= read_committed
● For sink connectors, do at least one of the following:
○ Configure worker with consumer.isolation.level =
read_committed
○ Configure connector with
consumer.override.isolation.level =
read_committed (3.0.0 or later, with default
worker configuration)
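For the Java consumer case, the configuration can be sketched with plain `java.util.Properties`. The broker address and group ID below are placeholders, and `consumerProps` is a hypothetical helper; the property keys themselves are standard Kafka consumer settings. The resulting object would be passed to a `KafkaConsumer` constructor in a real application, which this sketch leaves out so that it needs no Kafka dependency.

```java
import java.util.Properties;

class ReadCommittedConfigSketch {
    /** Builds consumer properties that hide records from aborted transactions. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.StringDeserializer");
        // Only return records from committed transactions.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group ID; replace with real values.
        Properties props = consumerProps("localhost:9092", "downstream-reader");
        System.out.println(props.getProperty("isolation.level"));
    }
}
```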
In practice (writing connectors)
Have to define source offsets correctly
public abstract class SourceTask {
    public abstract List<SourceRecord> poll();
}
public class SourceRecord {
    public SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, ...)
}
In practice (writing connectors)
Have to use source offsets correctly
public abstract class SourceTask {
    protected SourceTaskContext context;
    public abstract void start(Map<String, String> props);
}
public interface SourceTaskContext {
    OffsetStorageReader offsetStorageReader();
}
public interface OffsetStorageReader {
    <T> Map<Map<String, T>, Map<String, Object>>
            offsets(Collection<Map<String, T>> partitions);
}
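To make "define" and "use" concrete, here is a sketch for a file-like source that does not depend on the Connect API. The field names `filename` and `position` and all method names are hypothetical. The source partition identifies which file is being read, the source offset records how many lines have been emitted, and on startup the task skips everything up to the last committed offset. In a real task, the committed offset would come from `context.offsetStorageReader()` rather than being passed in directly.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class ResumeFromOffsetsSketch {
    // Hypothetical field names for the partition and offset maps.
    static final String FILENAME_FIELD = "filename";
    static final String POSITION_FIELD = "position";

    /** Source partition: identifies which file the records come from. */
    static Map<String, String> sourcePartition(String filename) {
        return Collections.singletonMap(FILENAME_FIELD, filename);
    }

    /** Source offset: how many lines of that file have been emitted so far. */
    static Map<String, Long> sourceOffset(long linesEmitted) {
        return Collections.singletonMap(POSITION_FIELD, linesEmitted);
    }

    /** Lines still to emit, given the last committed offset (null if none exists). */
    static List<String> remainingLines(List<String> fileLines, Map<String, Long> committedOffset) {
        long start = committedOffset == null ? 0L : committedOffset.get(POSITION_FIELD);
        return new ArrayList<>(fileLines.subList((int) start, fileLines.size()));
    }

    public static void main(String[] args) {
        List<String> fileLines = Arrays.asList("first", "second", "third");
        System.out.println(sourcePartition("test.txt"));
        System.out.println(remainingLines(fileLines, null));
        System.out.println(remainingLines(fileLines, sourceOffset(2)));
    }
}
```

If the offset did not advance in step with the records (for example, if it were always zero), a restarted task would re-emit lines, and no amount of transactional machinery in the framework could prevent those duplicates.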
In summary
● Exactly-once is hard to implement
○ Especially handling zombie workers/zombie tasks across task reconfigurations
● Exactly-once is (hopefully) easy to use
○ If it’s not, harass ping me on Jira!
○ https://issues.apache.org/jira/projects/KAFKA/issues
● For all the details, check out KIP-618
○ https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
Thank you!
Chris Egerton
Open Source Program Office @Aiven
chrise@aiven.io
-Г* -Ρφ#Г„. &ό&#–,* . 12 33456 3
@C0urante

More Related Content

PDF
Apache Kafka Architecture & Fundamentals Explained
PDF
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
PDF
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
PDF
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
PDF
Lessons Learned Scaling Stateful Kafka Streams Topologies with Ferran Galí i ...
PDF
Distributed Locking in Kubernetes
PDF
Kafka 101 and Developer Best Practices
PDF
Airflow presentation
Apache Kafka Architecture & Fundamentals Explained
Storage Capacity Management on Multi-tenant Kafka Cluster with Nurettin Omeroglu
Apache kafka 모니터링을 위한 Metrics 이해 및 최적화 방안
왜 쿠버네티스는 systemd로 cgroup을 관리하려고 할까요
Lessons Learned Scaling Stateful Kafka Streams Topologies with Ferran Galí i ...
Distributed Locking in Kubernetes
Kafka 101 and Developer Best Practices
Airflow presentation

What's hot (20)

PDF
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
PPTX
Scylla Summit 2022: Making Schema Changes Safe with Raft
PDF
ksqlDB로 실시간 데이터 변환 및 스트림 처리
PDF
Introduction to Kafka Streams
PDF
Reliable Message Reprocessing Patterns for Kafka with Dunith Dhanushka
PDF
Apache Airflow
PDF
[MeetUp][2nd] 오리뎅이의_쿠버네티스_네트워킹_v1.2
PDF
Secrets of Performance Tuning Java on Kubernetes
PPTX
Introduction to Apache Kafka
PPTX
PostgreSQL and CockroachDB SQL
ODP
Monitoring With Prometheus
PPTX
Everything You Need To Know About Persistent Storage in Kubernetes
PDF
Kafka Streams: What it is, and how to use it?
PPTX
Deep Dive into Keystone Tokens and Lessons Learned
PDF
Spring Cloud Workshop
PDF
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
PDF
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
PDF
Planning for Disaster Recovery (DR) with Galera Cluster
PDF
[Outdated] Secrets of Performance Tuning Java on Kubernetes
PDF
Performance Monitoring: Understanding Your Scylla Cluster
Exactly-Once Semantics Revisited: Distributed Transactions across Flink and K...
Scylla Summit 2022: Making Schema Changes Safe with Raft
ksqlDB로 실시간 데이터 변환 및 스트림 처리
Introduction to Kafka Streams
Reliable Message Reprocessing Patterns for Kafka with Dunith Dhanushka
Apache Airflow
[MeetUp][2nd] 오리뎅이의_쿠버네티스_네트워킹_v1.2
Secrets of Performance Tuning Java on Kubernetes
Introduction to Apache Kafka
PostgreSQL and CockroachDB SQL
Monitoring With Prometheus
Everything You Need To Know About Persistent Storage in Kubernetes
Kafka Streams: What it is, and how to use it?
Deep Dive into Keystone Tokens and Lessons Learned
Spring Cloud Workshop
Trino: A Ludicrously Fast Query Engine - Pulsar Summit NA 2021
Advanced Model Inferencing leveraging Kubeflow Serving, KNative and Istio
Planning for Disaster Recovery (DR) with Galera Cluster
[Outdated] Secrets of Performance Tuning Java on Kubernetes
Performance Monitoring: Understanding Your Scylla Cluster
Ad

More from HostedbyConfluent (20)

PDF
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
PDF
Renaming a Kafka Topic | Kafka Summit London
PDF
Evolution of NRT Data Ingestion Pipeline at Trendyol
PDF
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
PDF
Exactly-once Stream Processing with Arroyo and Kafka
PDF
Fish Plays Pokemon | Kafka Summit London
PDF
Tiered Storage 101 | Kafla Summit London
PDF
Building a Self-Service Stream Processing Portal: How And Why
PDF
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
PDF
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
PDF
Navigating Private Network Connectivity Options for Kafka Clusters
PDF
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
PDF
Explaining How Real-Time GenAI Works in a Noisy Pub
PDF
TL;DR Kafka Metrics | Kafka Summit London
PDF
A Window Into Your Kafka Streams Tasks | KSL
PDF
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
PDF
Data Contracts Management: Schema Registry and Beyond
PDF
Code-First Approach: Crafting Efficient Flink Apps
PDF
Debezium vs. the World: An Overview of the CDC Ecosystem
PDF
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Transforming Data Streams with Kafka Connect: An Introduction to Single Messa...
Renaming a Kafka Topic | Kafka Summit London
Evolution of NRT Data Ingestion Pipeline at Trendyol
Ensuring Kafka Service Resilience: A Dive into Health-Checking Techniques
Exactly-once Stream Processing with Arroyo and Kafka
Fish Plays Pokemon | Kafka Summit London
Tiered Storage 101 | Kafla Summit London
Building a Self-Service Stream Processing Portal: How And Why
From the Trenches: Improving Kafka Connect Source Connector Ingestion from 7 ...
Future with Zero Down-Time: End-to-end Resiliency with Chaos Engineering and ...
Navigating Private Network Connectivity Options for Kafka Clusters
Apache Flink: Building a Company-wide Self-service Streaming Data Platform
Explaining How Real-Time GenAI Works in a Noisy Pub
TL;DR Kafka Metrics | Kafka Summit London
A Window Into Your Kafka Streams Tasks | KSL
Mastering Kafka Producer Configs: A Guide to Optimizing Performance
Data Contracts Management: Schema Registry and Beyond
Code-First Approach: Crafting Efficient Flink Apps
Debezium vs. the World: An Overview of the CDC Ecosystem
Beyond Tiered Storage: Serverless Kafka with No Local Disks
Ad

Recently uploaded (20)

PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Encapsulation theory and applications.pdf
PDF
Mobile App Security Testing_ A Comprehensive Guide.pdf
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
MYSQL Presentation for SQL database connectivity
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Empathic Computing: Creating Shared Understanding
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPTX
sap open course for s4hana steps from ECC to s4
PDF
Approach and Philosophy of On baking technology
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Spectral efficient network and resource selection model in 5G networks
Encapsulation theory and applications.pdf
Mobile App Security Testing_ A Comprehensive Guide.pdf
The AUB Centre for AI in Media Proposal.docx
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
Chapter 3 Spatial Domain Image Processing.pdf
MYSQL Presentation for SQL database connectivity
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Building Integrated photovoltaic BIPV_UPV.pdf
Empathic Computing: Creating Shared Understanding
“AI and Expert System Decision Support & Business Intelligence Systems”
sap open course for s4hana steps from ECC to s4
Approach and Philosophy of On baking technology
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Electronic commerce courselecture one. Pdf
MIND Revenue Release Quarter 2 2025 Press Release
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Understanding_Digital_Forensics_Presentation.pptx
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx

Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors with Chris Egerton

  • 1. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors Chris Egerton
  • 2. Nice to meet you! Chris Egerton Open Source Program Office @Aiven Apache Kafka committer and PMC member (Official bio) (Unofficial bio)
  • 3. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 4. ● “Exactly-once semantics” ● “Semantics” instead of “delivery”, “guarantees”, “delivery guarantees”, etc. (see Two Generals’ Problem) ● Levels: ○ Probably-once ○ At-least-once ○ At-most-once ○ Exactly-once ● With all else equal, exactly-once is best ● But of course, it’s the hardest to implement EOS
  • 5. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 6. Source Connectors ● Kafka stores and transmits events. Where do these events come from, and where do they go? ● DYI producer/consumer application? Nah 👎 ● Connectors: no-code (or low-code) applications to integrate Kafka with other systems ● Sink connectors write data from Kafka to the external system ● Source connectors read data from the external system into Kafka
  • 7. Exactly-Once, Again: Adding EOS Support for Kafka Connect Source Connectors
  • 8. Kafka Connect ● Distributed, horizontally-scalable, fault- tolerant ingest/export tool for Kafka ● Developers implement connectors against the Kafka Connect API ● Cluster administrators install connectors onto one or more Kafka Connect workers, which combine to form a cluster ● Users can then create and manage connectors on that cluster by submitting JSON configurations via a REST API ● (For users) No code required! { "name": "local-file-source", "connector.class": "FileStreamSink", "tasks.max": "1", "file": "test.txt", "topic": "connect-test" }
  • 9. We’re going to talk about designing support for exactly-once semantics (EOS) with source connectors developed for Kafka Connect. In summary…
  • 10. Challenges 1⃣ƿ Ø#μГ& Г& όƿ „)} +Ρ#ƿ )μμ„#–„ 2⃣ƿ q–)+Г& όƿ / & 0ƿ +#–+Г#¤Г& όƿ „)} +Ρ#ƿ )μμ„#–„ 3⃣ƿ ò#–+№ Г& όƿ 0#Ю Г¤#+№ ƿ –)ƿ ǽ/ μЧ/ 4⃣ƿ̀ )8 ΛГ#„ƿ 🧟🧟🧟🧟🧟
  • 11. Source offsets, in detail ● Source connectors provide source records ● Source records come with source offsets (partition + offset) ● On startup, connectors use source offsets to know where to resume from ● Source offsets are stored in an offsets topic by Kafka Connect ✅🎉 // TODO ɀ Ø"μГ%Г%όƿ „)} +Ρ#ƿ )μμ„#–„ƿ ;1⃣<ƿ Г„ƿ –φ#ƿ +#„ъ)& „ГΛГЮ Г–№ ƿ )μƿ –φ#ƿ Ρ)& & #Ρ–)+ ɀ q–)*Г%όƿ , %-ƿ *"–*Г"¤Г%όƿ „)} +Ρ#ƿ )μμ„#–„ƿ ;2⃣<ƿ Г„ƿ –φ#ƿ +#„ъ)& „ГΛГЮ Г–№ ƿ )μƿ ǽ/μЧ/ƿ Ņ)& & #Ρ–
  • 12. Exactly-once for Kafka clients ɀ ǽ@ nBCDEƿ ŚG/ Ρ–Ю № ƿ H& Ρ#ƿ Ø#Ю Г¤#+№ ƿ / & 0ƿ ſ +/ & „/ Ρ–Г)& / Ю ƿ ë #„„/ όГ& ό ɂ ò#Ю #/ „#0ƿ Г& ƿ LMNNMLML ɂ O #& Ρ#P ƿ Q ό/ Г& Sƿ Г& ƿ –φ#ƿ –Г–Ю # ɀ / -"0 ъ)–"%–ƿ ъ*)-} Ρ"*Tƿ +#–+Г#„ƿ $ Г–φ)} –ƿ 0} ъЮ ГΡ/ –#„ƿ ;3⃣< ɀ ſ *, %„, Ρ–Г)%, Ю ƿ ъ*)-} Ρ"*Tƿ / –)8 ГΡƿ Ρ+)„„V –)ъГΡƿ $ +Г–#„ ɀ ſ *, %„, Ρ–Г)%, Ю ƿ / Ø„Tƿ „Г& όЮ #ƿ $ +Г–#+ƿ ъ#+ƿ @ Ø ɂ / %Г–Г, Ю Г7"ƿ –*, %„, Ρ–Г)%„ƿ $ Г–φƿ / ƿ ъ+)0} Ρ#+ƿ –)ƿ μ"%Ρ"ƿ )} –ƿ )–φ#+ƿ ъ+)0} Ρ#+„ƿ $ Г–φƿ –φ#ƿ „/ 8 #ƿ –+/ & „/ Ρ–Г)& / Ю ƿ @ Ø ✅🎉
  • 13. Tracking source offsets (2⃣) Ļ #μ)+#T ɀ n)Ю Ю ƿ Ρ)& & #Ρ–)+ƿ μ)+ƿ „)} +Ρ#ƿ +#Ρ)+0„ ɀ ỳ +Г–#ƿ +#Ρ)+0„ƿ –)ƿ ǽ/ μЧ/ ɀ n#+Г)0ГΡ/ Ю Ю № ƿ $ +Г–#ƿ ;Ρ)8 8 Г–<ƿ „)} +Ρ#ƿ )μμ„#–„ƿ –)ƿ ǽ/ μЧ/ ɀ n+)¤Г0#„ƿ / –V Ю #/ „–V )& Ρ#ƿ „} ъъ)+– μ–#+T ɀ Ļ #όГ& ƿ –+/ & „/ Ρ–Г)& ɀ n)Ю Ю ƿ Ρ)& & #Ρ–)+ƿ μ)+ƿ „)} +Ρ#ƿ +#Ρ)+0„ ɀ ỳ +Г–#ƿ +#Ρ)+0„ƿ Г8 8 #0Г/ –#Ю № ƿ –)ƿ ǽ/ μЧ/ ƿ ;„–ГЮ Ю < ɀ ỳ +Г–#ƿ „)} +Ρ#ƿ )μμ„#–„ƿ –)ƿ ǽ/ μЧ/ ƿ ɀ Ņ)8 8 Г–ƿ –+/ & „/ Ρ–Г)& ɀ n+)¤Г0#„ƿ #G/ Ρ–Ю № V )& Ρ#ƿ „} ъъ)+– ✅🎉
  • 14. 75% done! 1⃣ƿ Ø#μГ& Г& όƿ „)} +Ρ#ƿ )μμ„#–„ 2⃣ƿ q–)+Г& όƿ / & 0ƿ +#–+Г#¤Г& όƿ „)} +Ρ#ƿ )μμ„#–„ 3⃣ƿ ò#–+№ Г& όƿ 0#Ю Г¤#+№ ƿ –)ƿ ǽ/ μЧ/ 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟 🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟🧟 ✅ ✅ ✅ 4⃣ƿ ſ φ/ –ƿ Ю #/¤#„Y
  • 15. “Chekhov’s gun cabinet” Kafka Connect: diving deeper ɀ Ņ)%%"Ρ–)*„ƿ ό#& #+/–#ƿ Ρ)& μГό} +/–Г)& „ƿ μ)+ƿ )& #ƿ )+ƿ 8 )+#ƿ –, „Ч„ ɀ ſ / „Ч„ƿ Ρ/ & ƿ Λ#ƿ „ъ+#/ 0ƿ / Ρ+)„„ƿ 8 } Ю –ГъЮ #ƿ ǽ/ μЧ/ ƿ Ņ)& & #Ρ–ƿ $ )+Ч#+„ ɀ ſ / „Ч„ƿ / +#ƿ / „„Гό& #0ƿ –)ƿ $ )+Ч#+„ƿ 0} +Г& όƿ / ƿ *"Λ, Ю , %Ρ" ɀ ƿ „Г& όЮ #ƿ $ )+Ч#+P ƿ –φ#ƿ Ю ", -"*P ƿ 0#–#+8 Г& #„ƿ –φ/ –ƿ / „„Гό& 8 #& – ɀ ſ / „Чƿ Ρ)& μГό} +/ –Г)& „ƿ / +#ƿ „–)+#0ƿ Г& ƿ / ƿ „Г& όЮ #V ъ/+–Г–Г)& ƿ ǽ/μЧ/ƿ –)ъГΡƿ Ρ/ Ю Ю #0ƿ –φ#ƿ Ρ)%μГόƿ –)ъГΡ ɀ Z)+ƿ „)} +Ρ#ƿ Ρ)& & #Ρ–)+„P ƿ ǽ/ μЧ/ ƿ Ņ)& & #Ρ–ƿ } „#„ƿ )& #ƿ ъ+)0} Ρ#+ƿ ъ#+ƿ –/ „Чƿ –)ƿ „#& 0ƿ „)} +Ρ#ƿ +#Ρ)+0„ƿ –)ƿ ǽ/μЧ/
  • 16. Zombie fencing: correctness goals ſ )Ю #+/ –#0ƿ „Ρ#& / +Г)„T ɀ ;[ & <ό+/ Ρ#μ} Ю ƿ $ )+Ч#+ƿ „φ} –0)$ & ɀ +ΛГ–+/ +№ ƿ $ )+Ч#+ƿ „–/ +–} ъ„ ɀ +ΛГ–+/ +ГЮ № V Ю )& όP ƿ / +ΛГ–+/ +ГЮ № V ъЮ /Ρ#0P ƿ / & 0ƿ /+ΛГ–+/+ГЮ № V –Г8 #0ƿ ъ/ } „#„
  • 17. Zombie fencing: UX goals ● Minimal unnecessary interruptions (keep processing data) ● Minimal changes to connector code ● Minimal connector management API changes
  • 18. Zombie fencing: actually pretty easy? ● Give each task a transactional ID derived from the name of the connector and the task ID ○ E.g., “reddit-source-0” or “chris-ksl-3” ● Let tasks fence out older instances on startup ○ Fencing: disabling a producer from writing to Kafka
  • 19. Zombie fencing: actually pretty easy? reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;)Ю 0ƿ –/ „Ч„< reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;& #$ ƿ –/ „Ч„< Fences out ;"%Ρ"„ƿ )} – ;"%Ρ"„ƿ )} – Fences out ɀ q)Y ƿ 0)#„ƿ –φГ„ƿ $ )+Ч ɀ ])Ю ƿ @ ƿ $ Г„φƿ 🥲
  • 20. Source partition reshuffling problem ɀ q)} +Ρ#ƿ Ρ)& & #Ρ–)+„ƿ +#/ 0ƿ μ+)8 ƿ „)} *Ρ"ƿ ъ, *–Г–Г)%„ ɂ Ø/ –/ Λ/ „#ƿ –/ ΛЮ #„P ƿ ǽ/ μЧ/ ƿ –)ъГΡ„P ƿ #–ΡM ɀ ſ φ#„#ƿ / +#ƿ / Ю Ю )Ρ/ –#0ƿ / Ρ+)„„ƿ –/ „Ч„ ɀ ſ φ/ –ƿ / Ю Ю )Ρ/ –Г)& ƿ Ρ/ & ƿ Ρφ/ & ό#ƿ )¤#+ƿ –Г8 #
  • 21. Source partition reshuffling problem P0 Task 0 P1 ſ / „Чƿ N @ & Г–Г/ Ю ƿ / Ю Ю )Ρ/ –Г)& ƿ ^Ţƿ –/ „Ч„P ƿ Ţƿ ъ/ +–Г–Г)& „< P0 Task 0 P2 ƿ & #$ ƿ ъ/ +–Г–Г)& ƿ ^nŢ`ƿ Г„ƿ Ρ+#/ –#0 P1 ſ / „Чƿ N & ƿ #GГ„–Г& όƿ ъ/ +–Г–Г)& ƿ ^nN`ƿ Г„ƿ 0#Ю #–#0 P0 ſ / „Чƿ L P2 ſ / „Чƿ N P1 P2
  • 22. Source partition reshuffling problem P0 ſ / „Чƿ L P2 HЮ 0ƿ –/ „Ч„ P1 Task 1 ff #$ ƿ –/ „Ч„ P2 ſ / „Чƿ N ;"%Ρ"„ƿ )} – ● New task T1 starts before old task T0 stops ɀ Ļ )–φƿ / +#ƿ / „„Гό& #0ƿ ъ/+–Г–Г)& ƿ nŢ ɀ Ø} ъЮ ГΡ/ –#„ƿ / Λ)} & 0b
  • 23. Zombie fencing: second try ɀ ỳ φ#& #¤#+ƿ & #$ ƿ –/„Чƿ Ρ)& μГό} +/–Г)& „ƿ /+#ƿ +#/0ƿ μ+)8 ƿ –φ#ƿ Ρ)& μГόƿ –)ъГΡP ƿ ъ#+μ)+8 ƿ /ƿ +)} & 0ƿ )μƿ 7)0 ΛГ"ƿ μ"%ΡГ%ό ɂ Z)+ƿ #¤#+№ ƿ & #$ ƿ –/ „ЧP ƿ –/ Ч#ƿ Г–„ƿ –+/ & „/ Ρ–Г)& / Ю ƿ @ Øƿ / & 0ƿ ъ+##8 ъ–Г¤#Ю № ƿ Г& Г–Г/Ю Гc#ƿ –+/& „/Ρ–Г)& „ ɀ ff )$ ƿ Г–d„ƿ Г8 ъ)„„ГΛЮ #ƿ μ)+ƿ / &№ ƿ )Ю 0#+ƿ Г& „–/ & Ρ#„ƿ )μƿ & #$ Ю № V Ρ+#/ –#0ƿ –/ „Ч„ƿ –)ƿ Λ#ƿ +} & & Г& όƿ Λ#μ)+#ƿ / &№ ƿ & #$ #+ƿ Г& „–/ & Ρ#„ƿ / +#ƿ „–/ +–#0 ɀ Ø)ƿ –φГ„ƿ 0} +Г& όƿ +#Λ/ Ю / & Ρ#
  • 24. Zombie fencing: first try reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;)Ю 0ƿ –/ „Ч„< reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;& #$ ƿ –/ „Ч„< ;"%Ρ"„ƿ )} – Fences out ;"%Ρ"„ƿ )} – ;"%Ρ"„ƿ )} –
  • 25. Zombie fencing: second try reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;)Ю 0ƿ –/ „Ч„< reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 ò#00Г–ƿ „)} +Ρ#ƿ ;& #$ ƿ –/ „Ч„< ` )0 ΛГ"ƿ μ"%ΡГ%ό *)} %- ● Woohoo, we did it! ● Just kidding, put your hand back down
  • 26. Zombie fencing: second try ɀ ỳ φ#& #¤#+ƿ & #$ ƿ –/„Чƿ Ρ)& μГό} +/–Г)& „ƿ /+#ƿ +#/0ƿ μ+)8 ƿ –φ#ƿ Ρ)& μГόƿ –)ъГΡP ƿ ъ#+μ)+8 ƿ /ƿ +)} & 0ƿ )μƿ 7)0 ΛГ"ƿ μ"%ΡГ%ό ɂ Z)+ƿ #¤#+№ ƿ & #$ ƿ –/ „ЧP ƿ –/ Ч#ƿ Г–„ƿ –+/ & „/ Ρ–Г)& / Ю ƿ @ Øƿ / & 0ƿ ъ+##8 ъ–Г¤#Ю № ƿ Г& Г–Г/Ю Гc#ƿ –+/& „/Ρ–Г)& „ ɀ ff )$ ƿ Г–d„ƿ Г8 ъ)„„ГΛЮ #ƿ μ)+ƿ / &№ ƿ )Ю 0#+ƿ Г& „–/ & Ρ#„ƿ )μƿ & #$ Ю № V Ρ+#/ –#0ƿ –/ „Ч„ƿ –)ƿ Λ#ƿ +} & & Г& όƿ Λ#μ)+#ƿ / &№ ƿ & #$ #+ƿ Г& „–/ & Ρ#„ƿ / +#ƿ „–/ +–#0 ɀ Ø)ƿ –φГ„ƿ 0} +Г& όƿ +#Λ/ Ю / & Ρ#
  • 27. Connector resizing T0 T1 T2 T3 Four tasks T0 T1 T2 Three tasks Zombie fencing round based on new tasks ● Old task running after newer tasks are started ● If a partition is reassigned from old T3 to new T0/T1/T2… ● Duplicates abound
  • 28. Zombie fencing: third time’s the charm ● Instead of using the new set of task configs for our round of zombie fencing, use the old set ● Again, do this during rebalance ● Alright, this has to be enough, right? ● Well… how do we know about the old set of task configs?
  • 29. Zombie leaders ● The leader of the cluster is the only one allowed to publish task configs to the config topic ● However, we don't enforce this very strongly ● Some workers may mistakenly believe they are the leader ● Inaccurate task configs may be published in rapid succession, overwriting valid task configs
  • 30. Zombie leaders my-connector (3 tasks) Config topic my-connector (2 tasks) my-connector (1 task) Starting state Write by zombie leader Write by actual leader Rebalance Event ● On rebalance, leader sees: ○ One new task ○ Two old tasks ● Leader should fence out three tasks ● But leader only fences out two
  • 31. Zombie fencing: guarded config topic ● How do we prevent zombie leaders from accessing the config topic? ● Transactional producer to the rescue (again)! ● Is this enough? ● Are we guaranteed that the config topic can be used as a source of truth for zombie fencing now? ● 🙃🙃🙃
  • 32. Leadership change ● If a leader falls out of the cluster, a new one is chosen ● How does the new leader know what needs fencing?
  • 33. Leadership change (with new tasks) my-connector (3 tasks) Config topic my-connector (2 tasks) Starting state Write by leader Rebalance (+ zombie fencing) Leader falls out of cluster Event ● On first rebalance, old tasks are fenced out successfully Rebalance (+ new leader) ● On second rebalance, new leader doesn't have to do any fencing
  • 34. Leadership change (with new tasks) my-connector (3 tasks) Config topic my-connector (2 tasks) Starting state Write by leader Leader falls out of cluster (before rebalance) Rebalance (+ new leader) Event ● On rebalance, new leader has to fence out old tasks ● But how can it tell? ● The config topic looks the same in both scenarios ● Remember, we want to avoid unnecessary interruptions
  • 35. Zombie fencing: fence, then write ● Current sequence of events: ○ Publish new task configs ○ Rebalance ○ Zombie fencing ○ Start new tasks ● New order: ○ Zombie fencing ○ Publish new task configs ○ Rebalance ○ Start new tasks ● This has to be it, right? ● You wish 😈
  • 36. That was not a good idea ● Poor UX ○ Causes tasks to fail in between zombie fencing and end of rebalance ○ Forcibly kills them, no chance to commit pending offsets ○ Looks like a bug to users ● Correctness issue ○ Users can manually restart failed tasks ○ Even in between zombie fencing and publishing new task configs ○ Uh oh, a zombie task made it to the other end of the rebalance!
  • 37. Zombie fencing: durable task counts ● Forget the “fence then write” logic ● Instead, we explicitly track the number of to-be-fenced tasks in the config topic with a task count record ● These serve two purposes: ○ Explicitly: if fencing is necessary, how many tasks have to be fenced out ○ Implicitly: determine whether zombie fencing is necessary
  • 38. Zombie fencing: durable task counts my-connector (3 tasks) Config topic my-connector (2 tasks) my-connector-task-count (2) Starting state New task configs Rebalance (+ zombie fencing) Event ● On rebalance: ○ Leader fences three tasks based on latest task count record ○ Leader writes new task count of two tasks based on latest task configs my-connector-task-count (3)
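The two purposes of the task count record can be sketched in a few lines. This is a toy, single-connector model of the config topic, not Connect's internal code: the `Entry` type, the method names, and the choice to return 0 when no task count record exists are all assumptions of the sketch.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model of the task count record idea; names are illustrative, not Connect internals. */
public class TaskCountSketch {

    enum Kind { TASK_CONFIGS, TASK_COUNT }

    /** One record in a simplified, single-connector config topic. */
    static final class Entry {
        final Kind kind;
        final int tasks;

        Entry(Kind kind, int tasks) {
            this.kind = kind;
            this.tasks = tasks;
        }
    }

    /** Index of the most recent record of the given kind, or -1 if there is none. */
    static int lastIndexOf(List<Entry> log, Kind kind) {
        for (int i = log.size() - 1; i >= 0; i--) {
            if (log.get(i).kind == kind) {
                return i;
            }
        }
        return -1;
    }

    /** Implicit purpose: fencing is pending if task configs are newer than the latest task count. */
    static boolean fencingNeeded(List<Entry> log) {
        return lastIndexOf(log, Kind.TASK_CONFIGS) > lastIndexOf(log, Kind.TASK_COUNT);
    }

    /** Explicit purpose: how many tasks to fence. Returning 0 with no record is this sketch's assumption. */
    static int tasksToFence(List<Entry> log) {
        int i = lastIndexOf(log, Kind.TASK_COUNT);
        return i < 0 ? 0 : log.get(i).tasks;
    }

    public static void main(String[] args) {
        List<Entry> log = new ArrayList<>();
        log.add(new Entry(Kind.TASK_CONFIGS, 3));   // my-connector (3 tasks)
        log.add(new Entry(Kind.TASK_COUNT, 3));     // my-connector-task-count (3)
        System.out.println("starting state, fencing needed: " + fencingNeeded(log));

        log.add(new Entry(Kind.TASK_CONFIGS, 2));   // my-connector (2 tasks)
        System.out.println("after new configs, fencing needed: " + fencingNeeded(log)
                + ", tasks to fence: " + tasksToFence(log));

        log.add(new Entry(Kind.TASK_COUNT, 2));     // written by the leader after fencing
        System.out.println("after fencing, fencing needed: " + fencingNeeded(log));
    }
}
```

Because the answer depends only on what is in the log, a newly elected leader reading the same topic reaches the same conclusion as the old one would have.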
  • 39. Zombie fencing: durable task counts my-connector (3 tasks) Config topic my-connector (2 tasks) my-connector-task-count (2) Starting state New task configs Rebalance (+ zombie fencing) Event my-connector-task-count (3) Safe to bring up tasks? ✅ ❌ ✅
  • 40. Zombie fencing: durable task counts ● What could possibly break this now? ● Oh yeah, what was that about task restarts?
  • 41. Laggy task startup ● Zombie fencing disables all initialized task producers from writing to Kafka ● What if a zombie task lags and hasn’t initialized its producer by the time zombie fencing for a new generation of tasks takes place? ● Or, what if a task is restarted on a zombie worker after zombie fencing takes place?
  • 42. Laggy task startup reddit-source-0 reddit-source-1 reddit-source-2 reddit-source-3 Reddit source (4 old tasks) reddit-source-0 reddit-source-1 reddit-source-2 Reddit source (3 new tasks) Zombie fencing round reddit-source-3 (Task is lagging during startup) (Task has finished startup)
  • 43. Zombie fencing: check your work ● After initializing transactions for a task producer, have to make sure it's still safe to run the task ● New sequence of events: ○ Decide to (re)start task ○ Create producer for task and initialize transactions ○ Read to end of config topic ○ If new task configs found, abort startup and abandon the task ○ Otherwise, safe to start processing data ● Have we finally done it?
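The startup sequence above boils down to "claim, then check". A minimal sketch, assuming the config topic can be reduced to a counter of task config generations: `ConfigTopic`, `startTask`, and the generation numbers are hypothetical stand-ins, and the producer's transaction initialization is passed in as a plain `Runnable`.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Toy model of the claim-then-check startup sequence; not Connect's actual code. */
public class ClaimThenCheckSketch {

    /** Hypothetical stand-in for the config topic: counts generations of task configs. */
    static class ConfigTopic {
        private final AtomicInteger generation = new AtomicInteger();

        int publishTaskConfigs() {
            return generation.incrementAndGet();
        }

        int readToEnd() {
            return generation.get();
        }
    }

    /**
     * Returns true if the task may start processing data.
     * generationSeenByWorker is the task config generation the worker decided to start the task with.
     */
    static boolean startTask(ConfigTopic topic, int generationSeenByWorker, Runnable initTransactions) {
        initTransactions.run();            // claim: create producer and initialize transactions
        int latest = topic.readToEnd();    // check: has anyone published newer task configs?
        return latest == generationSeenByWorker;
    }

    public static void main(String[] args) {
        ConfigTopic topic = new ConfigTopic();
        int generation = topic.publishTaskConfigs();

        // Healthy startup: nothing changes while the task initializes.
        boolean healthy = startTask(topic, generation, () -> { });
        System.out.println("healthy task may run: " + healthy);

        // Laggy startup: new task configs are published while the task is still starting up.
        boolean laggy = startTask(topic, generation, topic::publishTaskConfigs);
        System.out.println("laggy task may run: " + laggy);
    }
}
```

The order matters: checking first and claiming second would leave a window in which new configs (and a fencing round) slip in between the two steps.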
  • 45. Caveats ● Fencing during rebalancing is not a good idea ○ Makes rebalances more brittle ○ Requires a new rebalance any time we want to restart a task that failed due to failed zombie fencing ● Instead, we fence outside of rebalances ○ During task startup, workers issue a REST request to the leader to perform zombie fencing for the connector ○ The leader will perform that round (if necessary), then send back a 2XX response ○ If a non-2XX response is received, the task is marked failed ○ Tasks can easily be restarted
  • 46. Caveats ● Using throwaway producers just to initialize transactions is wasteful ● We added a new admin client API in 3.3 to do this instead ● We have to be careful about how the leader uses the transactional producer ● Similar “claim-then-check” logic to source tasks
  • 47. In practice Implementation details are boring, how do we actually use this feature?
  • 48. In practice (cluster administrators) ● New clusters: ○ Use version 3.3.0 or later ○ Configure every worker with exactly.once.source.support = enabled ● Existing clusters: ○ Rolling upgrade 1: ○ Bring all workers to 3.3.0 or later ○ Configure every worker with exactly.once.source.support = preparing ○ Rolling upgrade 2: ○ Configure every worker with exactly.once.source.support = enabled
  • 49. In practice (downstream readers) ● Have to filter out records from aborted transactions ● If using the Java consumer, configure with isolation.level = read_committed ● For sink connectors, do at least one of the following: ○ Configure worker with consumer.isolation.level = read_committed ○ Configure connector with consumer.override.isolation.level = read_committed (3.0.0 or later, with default worker configuration)
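For the Java consumer case, the setting is just one more entry in the consumer's properties. A minimal sketch using only `java.util.Properties`, so it runs without the Kafka client on the classpath; the helper name, bootstrap address, and group ID are placeholders, and in a real application the result would be passed to the consumer's constructor.

```java
import java.util.Properties;

/** Builds consumer properties that hide records from aborted transactions. */
public class ReadCommittedConsumerConfig {

    /** Hypothetical helper; the property keys are standard Kafka consumer configs. */
    static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        props.setProperty("key.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        props.setProperty("value.deserializer",
                "org.apache.kafka.common.serialization.ByteArrayDeserializer");
        // Only return records from committed transactions (and non-transactional records).
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder address and group ID, for illustration only.
        Properties props = consumerProps("localhost:9092", "downstream-reader");
        System.out.println("isolation.level = " + props.getProperty("isolation.level"));
    }
}
```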
  • 50. In practice (writing connectors) Have to define source offsets correctly public abstract class SourceTask { public abstract List<SourceRecord> poll(); } public class SourceRecord { public SourceRecord(Map<String, ?> sourcePartition, Map<String, ?> sourceOffset, ...) }
  • 51. In practice (writing connectors) Have to use source offsets correctly public abstract class SourceTask { protected SourceTaskContext context; public abstract void start(Map<String, String> props); } public interface SourceTaskContext { OffsetStorageReader offsetStorageReader(); } public interface OffsetStorageReader { <T> Map<Map<String, T>, Map<String, Object>> offsets(Collection<Map<String, T>> partitions); }
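Defining and using source offsets correctly is easier to see with a small example. The sketch below does not use the Connect API; `Record` and `LineSourceTask` are hypothetical stand-ins that mirror the shape of the interfaces on these two slides (a source partition map, a source offset map, and a lookup of committed offsets on startup). The "filename" and "line" keys are made up for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model of a source task that defines offsets per record and resumes from them. */
public class SourceOffsetSketch {

    /** Hypothetical stand-in for a source record: partition, offset, and payload. */
    static final class Record {
        final Map<String, String> sourcePartition;
        final Map<String, Object> sourceOffset;
        final String value;

        Record(Map<String, String> sourcePartition, Map<String, Object> sourceOffset, String value) {
            this.sourcePartition = sourcePartition;
            this.sourceOffset = sourceOffset;
            this.value = value;
        }
    }

    /** Reads lines from an in-memory "file"; the offset of a record is its line index. */
    static final class LineSourceTask {
        private final String filename;
        private final List<String> lines;
        private int nextLine;

        LineSourceTask(String filename, List<String> lines) {
            this.filename = filename;
            this.lines = lines;
        }

        Map<String, String> partition() {
            return Collections.singletonMap("filename", filename);
        }

        /** Use offsets correctly: resume right after the last committed offset, if any. */
        void start(Map<Map<String, String>, Map<String, Object>> committed) {
            Map<String, Object> offset = committed.get(partition());
            nextLine = offset == null ? 0 : ((Number) offset.get("line")).intValue() + 1;
        }

        /** Define offsets correctly: every record says exactly where it came from. */
        List<Record> poll() {
            List<Record> batch = new ArrayList<>();
            while (nextLine < lines.size()) {
                Map<String, Object> offset = Collections.<String, Object>singletonMap("line", nextLine);
                batch.add(new Record(partition(), offset, lines.get(nextLine)));
                nextLine++;
            }
            return batch;
        }
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("a", "b", "c");
        Map<Map<String, String>, Map<String, Object>> committed = new HashMap<>();

        LineSourceTask first = new LineSourceTask("test.txt", lines);
        first.start(committed);
        List<Record> batch = first.poll();

        // Pretend only the first two records were committed before the task died.
        Record lastCommitted = batch.get(1);
        committed.put(lastCommitted.sourcePartition, lastCommitted.sourceOffset);

        LineSourceTask restarted = new LineSourceTask("test.txt", lines);
        restarted.start(committed);
        for (Record r : restarted.poll()) {
            System.out.println("re-read after restart: " + r.value);
        }
    }
}
```

If a task ignored the committed offsets on startup, or attached offsets that did not uniquely identify a record's position, a restart would re-read data and exactly-once would quietly degrade to at-least-once.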
  • 52. In summary ● Exactly-once is hard to implement ○ Especially handling zombie workers/zombie tasks across task reconfigurations ● Exactly-once is (hopefully) easy to use ○ If it's not, harass ping me on Jira! ○ https://issues.apache.org/jira/projects/KAFKA/issues ● For all the details, check out KIP-618 ○ https://cwiki.apache.org/confluence/display/KAFKA/KIP-618%3A+Exactly-Once+Support+for+Source+Connectors
  • 53. Thank you! Chris Egerton Open Source Program Office @Aiven @C0urante