Make your world event driven
Krzysztof Debski @DebskiChris
Who am I
15 years as an IT professional
http://hermes.allegro.tech
Allegro
500+ people in IT
50+ independent teams
16 years on market
2 years after technical revolution
Events
Events are everywhere
Log data
Database replication
Warehouse dump
Search engines
Messaging systems
How to handle events?
Stream data platform
Data Integration Stream processing
Events and microservices
[Diagram: eleven services]
Events and microservices
[Diagram: eleven services and three domains]
Kafka
Kafka as a backbone
[Diagram: Service (Producer), Service (Consumer), Kafka Broker, Zookeeper]
Kafka Data
[Diagram: events 1 to 10 split into two groups, (8, 5, 4, 2, 1) and (10, 9, 7, 6, 3)]
Data Topic
[Diagram: Producer_1 … Producer_n publish events to a topic; old events are removed]
Partitioning
[Diagram: events 1 to 10 split into (8, 5, 4, 2, 1) and (10, 9, 7, 6, 3); a partition holding events 5, 4, 2, 8, 1]
Data Topic Partition
[Diagram: a partition holding events 5, 4, 2, 8, 1, and its replicas]
Partitioning
[Diagram: Producer_1 … Producer_n publish events to Partition 0, Partition 1 and Partition 2]
Partitioning
[Diagram: Service (Producer), Service (Consumer), Zookeeper and three brokers holding partitions (P1, P0), (P2, P1) and (P0, P2)]
Topics operations
Create
auto.create.topics.enable=true
Change
Replication factor
Partition count – only increasing
Delete
>= 0.8.2
delete.topic.enable=true
Demo
Initial list of brokers is static
New Producer API from 0.8.2
Async producer by default
Key partitioning is tricky
ACK is set by producer
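A minimal sketch of why key partitioning gets tricky once the partition count grows. It assumes the common "hash of the key modulo partition count" scheme; CRC32 is only a stand-in hash chosen for this example, not the function Kafka's partitioner uses, and the `user-N` keys are made up.

```python
import zlib


def partition_for(key: bytes, partition_count: int) -> int:
    """Map a key to a partition as hash(key) mod partition_count.

    CRC32 is a stand-in hash for this sketch only.
    """
    return zlib.crc32(key) % partition_count


def moved_keys(keys, old_count, new_count):
    """Return the keys whose partition changes when the count changes."""
    return [k for k in keys
            if partition_for(k, old_count) != partition_for(k, new_count)]


# Hypothetical keys, used only to show the effect.
keys = [f"user-{i}".encode() for i in range(1000)]
moved = moved_keys(keys, 3, 4)
print(f"{len(moved)} of {len(keys)} keys change partition when going from 3 to 4")
```

Events published for a moved key before and after the change end up in different partitions, so per-key ordering across the change is no longer guaranteed.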
Subscriber
[Diagram: four brokers holding partitions P0, P1, P2 and P3, read by Consumer 1 to Consumer 6, which belong to Consumer group 1 and Consumer group 2]
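The consumer group idea can be sketched as follows: within one group each partition has exactly one owner, while every group receives all partitions. This is a simple round-robin illustration, not necessarily the assignment strategy Kafka applies, and the group membership below is an assumption because the slide does not say which consumer belongs to which group.

```python
def assign(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin.

    Each partition gets exactly one owner within the group; consumers
    beyond the partition count receive nothing.
    """
    assignment = {c: [] for c in consumers}
    for i, p in enumerate(partitions):
        assignment[consumers[i % len(consumers)]].append(p)
    return assignment


partitions = ["P0", "P1", "P2", "P3"]
# Hypothetical membership, chosen only for illustration.
group_1 = ["Consumer 1", "Consumer 2"]
group_2 = ["Consumer 3", "Consumer 4", "Consumer 5", "Consumer 6"]

for name, members in (("group 1", group_1), ("group 2", group_2)):
    print(name, assign(partitions, members))
```

With more consumers than partitions in a group, the extra consumers stay idle, which is one reason the partition count limits how far consumers can scale.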
Subscriber
[Diagram: Producer_1 … Producer_n publish events to a topic; Consumer_group_1 and Consumer_group_2 each read events; old events (messages) are removed]
Offset management
<=0.8.1 - Zookeeper
>=0.8.2 - Zookeeper or Kafka
>=0.9(?) - Kafka
Demo
Simple consumer vs. High level consumer
Offset storage
Dual commits
Scaling consumers
KAFKA-1682
Kafka <= 0.8.2
No security
Kafka > 0.8.2
unix-like users, permissions, ACL
Performance issues
Rebalancing leaders
Broker 1: P1, P0
Broker 2: P2, P1
Broker 3: P0, P2
Topic: test Partition count: 3 Replication factor: 1
Configs: retention.ms=86400000
Topic: test Partition: 0 Leader: 3 Replicas: 3, 1 ISR: 3, 1
Topic: test Partition: 1 Leader: 1 Replicas: 1, 2 ISR: 1, 2
Topic: test Partition: 2 Leader: 2 Replicas: 2, 3 ISR: 2, 3
Replicas: brokers that should have partition copies
ISR: In Sync Replicas
Leader: leader broker ID
Rebalancing leaders
Topic: test Partition count: 3 Replication factor: 1
Configs: retention.ms=86400000
Topic: test Partition: 0 Leader: 1 Replicas: 3, 1 ISR: 1
Topic: test Partition: 1 Leader: 1 Replicas: 1, 2 ISR: 1, 2
Topic: test Partition: 2 Leader: 2 Replicas: 2, 3 ISR: 2
Rebalancing leaders
Topic: test Partition count: 3 Replication factor: 1
Configs: retention.ms=86400000
Topic: test Partition: 0 Leader: 1 Replicas: 3, 1 ISR: 1, 3
Topic: test Partition: 1 Leader: 1 Replicas: 1, 2 ISR: 1, 2
Topic: test Partition: 2 Leader: 2 Replicas: 2, 3 ISR: 2, 3
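A small sketch for spotting partitions that need leader rebalancing in output like the above. It relies on the first broker in the Replicas list being the preferred leader; the regular expression is written for the line layout shown on these slides and is an assumption about the format, not a general parser.

```python
import re

# Matches the per-partition lines as they appear on the slides.
LINE = re.compile(
    r"Partition: (\d+) Leader: (\d+) Replicas: ([\d, ]+) ISR: ([\d, ]+)"
)


def parse(line):
    """Return (partition, leader, replicas, isr) for one describe line."""
    m = LINE.search(line)
    partition, leader = int(m.group(1)), int(m.group(2))
    replicas = [int(x) for x in m.group(3).split(",")]
    isr = [int(x) for x in m.group(4).split(",")]
    return partition, leader, replicas, isr


def needs_rebalance(line):
    """True when the leader is not the preferred (first listed) replica
    although that replica is back in the ISR."""
    _, leader, replicas, isr = parse(line)
    preferred = replicas[0]
    return leader != preferred and preferred in isr


describe = [
    "Topic: test Partition: 0 Leader: 1 Replicas: 3, 1 ISR: 1, 3",
    "Topic: test Partition: 1 Leader: 1 Replicas: 1, 2 ISR: 1, 2",
    "Topic: test Partition: 2 Leader: 2 Replicas: 2, 3 ISR: 2, 3",
]
for line in describe:
    print(parse(line)[0], needs_rebalance(line))
```

In the last state shown, only partition 0 is flagged: broker 3 is in sync again, but broker 1 still leads two partitions.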
Lost events
ACK levels
0 - don’t wait for response from the leader
1 - only the leader has to respond
-1 - all replicas must be in sync
Speed
Safety
Lost Events
ERROR [Replica Manager on Broker 2]: Error when processing
fetch request for partition [test,1] offset 10000 from consumer
with correlation id 0. Possible cause:
Request for offset 10000 but we only have log segments in the
range 8000 to 9000. (kafka.server.ReplicaManager)
Lost events
Producer: ACK = 1
Replication factor = 1
replica.lag.max.messages = 2000
Broker 1: committed offset = 10000
Broker 2: committed offset = 9000
Zookeeper
Lost events
Producer: ACK = 1
Replication factor = 1
replica.lag.max.messages = 2000
Broker 1: committed offset = 10000
Broker 2: committed offset = 9000
Zookeeper
committed offset = 9000
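A deliberately simplified model of the scenario above, written as one reading of the slides rather than a description of Kafka internals. It assumes Broker 1 is the leader that acknowledged everything up to offset 10000 (ACK = 1), Broker 2 is a follower at 9000 that still counts as in sync because its lag is below replica.lag.max.messages, and the consumer has already read up to 10000. The class and function names are hypothetical.

```python
class Replica:
    def __init__(self, broker_id, log_end_offset):
        self.broker_id = broker_id
        self.log_end_offset = log_end_offset  # next offset to be written


def elect_new_leader(followers, leader_end, max_lag):
    """Pick the first follower whose lag is within max_lag messages."""
    for f in followers:
        if leader_end - f.log_end_offset <= max_lag:
            return f
    return None


leader = Replica(1, 10000)    # acknowledged to the producer with ACK = 1
follower = Replica(2, 9000)   # 1000 messages behind, still "in sync"
consumer_offset = 10000       # consumer already read what broker 1 had

# Broker 1 fails; the lagging follower takes over.
new_leader = elect_new_leader([follower], leader.log_end_offset, max_lag=2000)
lost = leader.log_end_offset - new_leader.log_end_offset
fetch_ok = consumer_offset <= new_leader.log_end_offset
print(f"new leader: broker {new_leader.broker_id}, "
      f"lost events: {lost}, fetch ok: {fetch_ok}")
```

The failed fetch corresponds to the "Request for offset 10000 but we only have log segments in the range 8000 to 9000" error shown earlier.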
Monitoring
Kafka Offset Monitor
Graphite
Slow responses
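Slow responses
[Chart: response time at the 75th, 99th and 99.9th percentiles]
Slow responses vs. message size
[Chart: message size at the 75th, 99th and 99.9th percentiles]
Fixed message size
[Chart: response time at the 75th, 99th and 99.9th percentiles]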
Kafka
kernel 3.2.x
kernel >= 3.8.x
Optimize throughput
[Chart: message size at the 99.9th percentile, all topics vs. the biggest topic]
Optimize message size
JSON human readable
big memory and network footprint
poor support for Hadoop
Optimize message size
JSON
Snappy
ERROR Error when sending message to topic t3 with key: 4 bytes, value: 100
bytes with error: The server experienced an unexpected error when
processing the request (org.apache.kafka.clients.producer.internals.
ErrorLoggingCallback)
java: target/snappy-1.1.1/snappy.cc:423: char* snappy::internal::
CompressFragment(const char*, size_t, char*, snappy::uint16*, int): Assertion
`0 == memcmp(base, candidate, matched)' failed.
errors when publishing a large number of messages
Optimize message size
JSON
Snappy
Lz4
failed on distributed data
[Chart: compression ratio, single topic vs. multiple topics]
Optimize message size
JSON
Snappy
Lz4
Avro
small network footprint
Hadoop friendly
easy schema verification
Allegro QR contest
Hermes
[Diagram: three Hermes Frontend instances and two Hermes Consumer instances; labels: REST; REST, JMS]
Topic management
pl.allegro.JDD2015.demo.basic
Group: pl.allegro.JDD2015.demo
Topic: basic
Delivery model
Exactly once - almost impossible
At most once - risky
At least once
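With at-least-once delivery the same event can arrive more than once, so the receiving service usually has to tolerate duplicates. A minimal sketch of one way to do that, assuming every event carries a unique identifier; the `IdempotentHandler` class and the ids `a1` and `b2` are hypothetical.

```python
class IdempotentHandler:
    """Process each message id at most once on the receiving side."""

    def __init__(self, handle):
        self._handle = handle
        self._seen = set()  # unbounded here; a real service would expire ids

    def on_message(self, message_id, payload):
        """Return True if the payload was processed, False for a duplicate."""
        if message_id in self._seen:
            return False
        self._handle(payload)
        self._seen.add(message_id)
        return True


processed = []
handler = IdempotentHandler(processed.append)
handler.on_message("a1", {"event": "test"})
handler.on_message("a1", {"event": "test"})   # redelivery of the same event
handler.on_message("b2", {"event": "other"})
print(processed)
```

The id is recorded only after the handler returns, so a crash in the middle of processing leads to a retry rather than a silently skipped event.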
Event identification
Hermes
Frontend
Kafka
Broker
POST
{"event": "test"}
{
  "id": "58d7ff07-dd0e-4103-9b1f-55706f3049e6",
  "timestamp": 1430443071995,
  "data": {"event": "test"}
}
HTTP 201 Created
Message-id: 58d7ff07-dd0e-4103-9b1f-55706f3049e6
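A sketch of the wrapping step shown above: the posted body is placed under "data" and gets an "id" and a millisecond "timestamp". The function name `wrap_event` is hypothetical, and the use of a random UUID is an assumption based on the shape of the example id, not a statement about how Hermes generates it.

```python
import json
import time
import uuid


def wrap_event(payload: dict) -> dict:
    """Wrap a published payload in an envelope with id and timestamp."""
    return {
        "id": str(uuid.uuid4()),               # assumed: random UUID
        "timestamp": int(time.time() * 1000),  # milliseconds since epoch
        "data": payload,
    }


envelope = wrap_event({"event": "test"})
print(json.dumps(envelope, indent=2))
print("Message-id:", envelope["id"])
```

Returning the same id to the publisher lets both sides refer to one event when tracing it later.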
Lost events
[Diagram: Producer, Hermes Frontend, Kafka Broker, Zookeeper, Hermes Consumer, Consumer; a Tracker holding publication data and delivery attempts]
Multi data center
[Diagram: Hermes Manager, two Hermes Frontend instances and two Hermes Consumer instances]
Slow responses - normal
[Diagram: Producer, Hermes Frontend, Kafka Broker, Zookeeper, Hermes Consumer, Consumer]
POST: HTTP 201 Created
Slow responses - fail
[Diagram: Producer, Hermes Frontend, Kafka Broker, Zookeeper, Hermes Consumer, Consumer]
POST: HTTP 202 Accepted
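One reading of the two slides above, sketched under assumptions: the frontend waits a limited time for the broker to acknowledge the event and answers 201 Created if it does; otherwise it keeps the event in a local buffer and answers 202 Accepted. The `publish` function, the timeout values and the stand-in broker functions are all hypothetical.

```python
import concurrent.futures
import time


def publish(send_to_kafka, event, ack_timeout_s, local_buffer):
    """Return an HTTP-like status code for one publish attempt."""
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(send_to_kafka, event)
    try:
        future.result(timeout=ack_timeout_s)
        return 201                   # broker acknowledged in time
    except concurrent.futures.TimeoutError:
        local_buffer.append(event)   # keep the event for a later resend
        return 202                   # accepted, not yet confirmed
    finally:
        pool.shutdown(wait=False)


def fast_broker(event):
    return "ack"


def slow_broker(event):
    time.sleep(0.2)                  # simulated slow acknowledgement
    return "ack"


buffer = []
status_fast = publish(fast_broker, {"event": "test"}, 0.1, buffer)
status_slow = publish(slow_broker, {"event": "test"}, 0.05, buffer)
print(status_fast, status_slow, len(buffer))
```

The publisher gets a bounded response time either way, at the cost of a weaker guarantee when the answer is 202.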
Improved security
Authentication and authorization interfaces provided
By Default:
You can create any topic in your group
You can publish everywhere (in progress)
Group owner defines subscriptions
Improved offset management
[Diagram: Hermes, Producer and Hermes consumer around a topic log; labels: Publish event, Read event, Committed, Local unsent events, Remove old messages]
Improved offset management
[Diagram: Hermes consumer delivers a new event to a service instance; labels: Remove old messages, Local unsent events]
Improved offset management
[Diagram: Hermes consumer delivers a new event to a service instance, which answers HTTP 503 Unavailable; labels: Remove old messages, Local unsent events]
Improved offset management
[Diagram: the service instance answers HTTP 503 Unavailable; Hermes consumer checks the TTL and adds the event to the queue; labels: Remove old messages, Local unsent events, New event]
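The "Check TTL & Add to queue" step can be sketched roughly as below. This is an illustration under assumptions, not Hermes code: the field names `created_at` and `ttl`, the `deliver` function and the returned labels are made up for the example.

```python
from collections import deque


def deliver(event, send, now, retry_queue):
    """Try one delivery; requeue on HTTP 503 while the event's TTL lasts."""
    status = send(event)
    if status == 503:
        if now - event["created_at"] < event["ttl"]:
            retry_queue.append(event)   # still within TTL: try again later
            return "queued"
        return "dropped"                # TTL expired: give up on this event
    return "delivered"


def unavailable(event):
    """Stand-in for a service instance answering HTTP 503."""
    return 503


queue = deque()
fresh = {"id": "e1", "created_at": 100.0, "ttl": 60.0}
stale = {"id": "e2", "created_at": 0.0, "ttl": 60.0}
result_fresh = deliver(fresh, unavailable, now=120.0, retry_queue=queue)
result_stale = deliver(stale, unavailable, now=120.0, retry_queue=queue)
print(result_fresh, result_stale, len(queue))
```

Keeping unsent events in a local queue lets the consumer move on with newer events instead of blocking on a single unavailable service instance.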
Consumer backoff
[Chart: delivery rate with labels 100%, adapt, 1/s, 1/min]
Turn back the time
PUT /groups/{group}/topics/{topic}/subscriptions/{subscription}/retransmission -8h
Find us:
Blog: allegrotech.io
Twitter: @allegrotechblog
work with us
kariera.allegro.pl

JDD2015: Make your world event driven - Krzysztof Dębski