KAFKA AS A MQ
CAN YOU DO IT, AND SHOULD YOU DO IT?

Adam Warski, Apache Kafka London Meetup
@adamwarski, SoftwareMill, Kafka London Meetup
THE PLAN
➤ Acknowledgments in plain Kafka
➤ Why selective acknowledgments?
➤ Why not …MQ?
➤ Kmq implementation
➤ Demo
➤ Performance
ACKNOWLEDGMENTS IN PLAIN KAFKA
➤ Offset commits:
➤ Using this, we can implement:
➤ at-least-once processing
➤ at-most-once processing
[Diagram: a topic with partitions 1 to 3 holding messages msg18 to msg25, annotated "commit offset: 20" and "commit offset: 24"]
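The difference between the two guarantees comes down to the order of "process" and "commit". The sketch below is a minimal in-memory model, not a real Kafka client: the list `log`, the function names `consume` and `run`, and the simulated crash are all assumptions made for illustration, and offsets are simply list indexes.

```python
def consume(log, start, handled, commit_first, crash_at=None):
    """Consume from offset `start`; return the last committed offset.

    A crash is simulated between the two steps for the message `crash_at`.
    """
    committed = start
    for offset in range(start, len(log)):
        msg = log[offset]
        if commit_first:
            committed = offset + 1       # at-most-once: commit before processing
        else:
            handled.append(msg)          # at-least-once: process before committing
        if msg == crash_at:
            return committed             # consumer dies between the two steps
        if commit_first:
            handled.append(msg)
        else:
            committed = offset + 1
    return committed


def run(commit_first):
    """Process a small log with one crash, then restart from the committed offset."""
    log = ["msg18", "msg19", "msg20", "msg21"]
    handled = []
    committed = consume(log, 0, handled, commit_first, crash_at="msg20")
    consume(log, committed, handled, commit_first)   # restart after the crash
    return handled


print(run(commit_first=False))  # msg20 is handled twice (at-least-once)
print(run(commit_first=True))   # msg20 is never handled (at-most-once)
```

In both cases the commit covers everything up to an offset, so there is no way to acknowledge msg21 while leaving msg20 pending.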
WHY SELECTIVE ACKNOWLEDGMENTS?
➤ Integrating with external systems
➤ e.g. HTTP/REST endpoints
➤ email
➤ other messaging
➤ Individual calls might fail
➤ should be retried
➤ without retrying the whole batch
➤ without delaying subsequent batches
WHY NOT …MQ?
➤ Typical usage scenario for a message queue
➤ RabbitMQ, ActiveMQ, Artemis, SQS …
➤ Kafka:
➤ proven & reliable clustering & replication mechanisms
➤ performance
➤ convenience: reduce operational complexity
AMAZON SQS
➤ Message queue as-a-service
➤ Simple API:
➤ CreateQueue
➤ SendMessage
➤ ReceiveMessage
➤ DeleteMessage
➤ Received messages are blocked for a period of time
➤ visibility timeout
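The visibility timeout can be illustrated with a toy in-memory queue. This is a sketch of the semantics only, not the AWS SDK: the class `VisibilityQueue`, its method names, and the explicit `now` argument (used instead of a real clock to keep the example deterministic) are assumptions.

```python
import itertools


class VisibilityQueue:
    """Toy in-memory queue with SQS-like receive/delete semantics."""

    def __init__(self, visibility_timeout):
        self.visibility_timeout = visibility_timeout
        self._ids = itertools.count(1)
        self._messages = {}   # id -> (body, time at which it becomes visible)

    def send_message(self, body):
        msg_id = next(self._ids)
        self._messages[msg_id] = (body, 0.0)
        return msg_id

    def receive_message(self, now):
        """Return (id, body) of one visible message and hide it, or None."""
        for msg_id, (body, visible_at) in self._messages.items():
            if visible_at <= now:
                # Block the message until the visibility timeout passes.
                self._messages[msg_id] = (body, now + self.visibility_timeout)
                return msg_id, body
        return None

    def delete_message(self, msg_id):
        """Acknowledge a single message; it will never be delivered again."""
        self._messages.pop(msg_id, None)
```

A message that is received but never deleted becomes visible again once the timeout passes, which is the redelivery behaviour kmq aims to reproduce on top of Kafka.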
KMQ: IMPLEMENTATION
➤ Two topics:
➤ queue: messages to process
➤ markers: for each message, start/end markers
➤ same number of partitions
➤ A number of queue clients
➤ here data is processed
➤ A number of redelivery trackers
QUEUE CLIENT
1. Read message from queue
2. Write start [offset] to markers
➤ wait for send to complete!
3. Commit offset to queue
4. Process the message
5. Write end [offset] to markers
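The five steps can be sketched with plain Python lists standing in for the two topics. This is an in-memory illustration, not the KmqClient API: `handle_next`, `unfinished`, the `state` dictionary and the marker tuples are hypothetical names, and appending to a list is assumed to play the role of a completed send.

```python
def handle_next(queue, markers, state, process):
    """Run the five steps for one message; `state` holds the committed offset."""
    offset = state["committed"]
    if offset >= len(queue):
        return False                        # nothing left to read
    message = queue[offset]                 # 1. read message from queue
    markers.append(("start", offset))       # 2. write start marker (send completes here)
    state["committed"] = offset + 1         # 3. commit offset to queue
    try:
        process(message)                    # 4. process the message
    except Exception:
        return True                         # no end marker: left for redelivery
    markers.append(("end", offset))         # 5. write end marker
    return True


def unfinished(markers):
    """Offsets that have a start marker but no end marker."""
    started = {offset for kind, offset in markers if kind == "start"}
    ended = {offset for kind, offset in markers if kind == "end"}
    return sorted(started - ended)
```

Because the offset is committed before processing, a failed message does not hold back the ones after it; the missing end marker is what tells the redelivery tracker to resend it.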
[Diagram: a queue topic and a markers topic, each with partitions 1 to 3. A queue client (1) reads messages from the queue topic, (2) writes start markers (e.g. offset: 39), (3) commits offsets (e.g. offset: 38), (4) processes the message and (5) on success writes an end marker (e.g. offset: 37) to confirm that the message was processed; if processing fails, the message waits for redelivery. The redelivery tracker consumes the marker stream and keeps the started, not ended markers (offset=10, time=1488010644; offset=15, time=1488141843; offset=24, time=1488289812; …). Every second it triggers redelivery of timed-out messages: it reads the message (msg10) from the queue topic and redelivers it.]
REDELIVERY TRACKER
➤ A Kafka application
➤ consumes the markers topic
➤ Multiple instances for fail-over
➤ Uses Kafka’s auto-partition-assignment
REDELIVERY TRACKER
➤ In-memory priority queue
➤ by Kafka’s marker timestamp
➤ messages with start markers, but no end markers
➤ Checks for messages to redeliver at regular intervals
➤ redelivery: seek + send
➤ in order
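A minimal sketch of the tracking logic, using Python's `heapq` as the priority queue. It is not the kmq implementation (which is written in Scala): the class and method names, the `timeout` parameter and the `redeliver` helper are assumptions, and "seek + send" is modelled as reading the queue list by index and appending the message again.

```python
import heapq


class RedeliveryTracker:
    """Tracks messages that have a start marker but no end marker."""

    def __init__(self, timeout):
        self.timeout = timeout
        self._heap = []           # (marker timestamp, offset), oldest first
        self._in_flight = set()   # offsets started but not ended

    def on_marker(self, kind, offset, timestamp):
        if kind == "start":
            self._in_flight.add(offset)
            heapq.heappush(self._heap, (timestamp, offset))
        else:
            self._in_flight.discard(offset)   # ended: nothing to redeliver

    def due(self, now):
        """Pop and return, in timestamp order, the offsets that have timed out."""
        result = []
        while self._heap and self._heap[0][0] + self.timeout <= now:
            _, offset = heapq.heappop(self._heap)
            if offset in self._in_flight:     # skip entries that were ended
                self._in_flight.discard(offset)
                result.append(offset)
        return result


def redeliver(queue, offsets):
    """Redelivery as seek + send: read each message by offset and append it again."""
    for offset in offsets:
        queue.append(queue[offset])
```

Calling `due` at a regular interval corresponds to the periodic check; ended markers are dropped lazily when their heap entry reaches the top.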
DEMO
PERFORMANCE
➤ 3-node Kafka cluster
➤ m4.2xlarge servers (8 CPUs, 32GiB RAM)
➤ single AZ
➤ 100 byte messages, sent in batches of up to 10
➤ Up to 8 sender/receiver nodes
➤ 64 to 160 partitions
➤ replication-factor=3
➤ min.insync.replicas=2
➤ acks=all (-1)
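The last three settings interact: with acks=all, a write is accepted only while the number of in-sync replicas is at least min.insync.replicas. The helpers below are a small arithmetic sketch of that rule, not Kafka code; the function names are hypothetical and the defaults are the values from the list above.

```python
def write_accepted(in_sync_replicas, min_insync_replicas=2):
    """With acks=all, a write succeeds only if enough replicas are in sync."""
    return in_sync_replicas >= min_insync_replicas


def tolerated_failures(replication_factor=3, min_insync_replicas=2):
    """Replicas that can be lost while the partition still accepts writes."""
    return replication_factor - min_insync_replicas


print(tolerated_failures())   # 1
print(write_accepted(2))      # True: one of three replicas is down
print(write_accepted(1))      # False: two of three replicas are down
```

So in this test setup every acknowledged message is on at least two brokers, and the cluster keeps accepting writes with one broker down.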
[Charts comparing plain Kafka and kmq; panel titles: PLAIN KAFKA, KMQ]
LATENCY
➤ Plain Kafka: ~50 ms
➤ kmq: 50 ms to 130 ms
WHAT IF MESSAGES ARE DROPPED?
➤ 50% drop rate
KMQ INTERNALS
➤ RedeliveryTracker
➤ Implemented in Scala, with a Java API
➤ Uses Akka
➤ One tracking actor per markers topic partition
➤ One redeliver actor per queue topic partition
➤ Started/stopped when partitions are assigned/revoked
➤ KmqClient
➤ Single Java class
➤ + marker value classes
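The per-partition lifecycle can be sketched without Akka: keep one worker per assigned partition, create it on assignment and stop it on revocation. Everything here is hypothetical illustration, not kmq code: `PartitionWorkers`, `Worker` and the callback names only mirror the shape of a consumer rebalance listener.

```python
class Worker:
    """Stand-in for a tracking or redeliver actor bound to one partition."""

    def __init__(self, partition):
        self.partition = partition
        self.running = True

    def stop(self):
        self.running = False


class PartitionWorkers:
    """Keeps exactly one worker per currently assigned partition."""

    def __init__(self, factory=Worker):
        self._factory = factory
        self._workers = {}

    def on_partitions_assigned(self, partitions):
        for partition in partitions:
            if partition not in self._workers:
                self._workers[partition] = self._factory(partition)

    def on_partitions_revoked(self, partitions):
        for partition in partitions:
            worker = self._workers.pop(partition, None)
            if worker is not None:
                worker.stop()

    def active(self):
        return sorted(self._workers)
```

With several tracker instances running, a rebalance moves partitions between them and each instance only holds state for the partitions it currently owns.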
ABOUT ME
➤ Software engineer, co-founder @ SoftwareMill
➤ Custom software development: Scala/Kafka/Java/Cassandra/…
➤ Open-source: sttp, QuickLens, ElasticMQ, Envers, MacWire, …
➤ Blog @ softwaremill.com/blog
➤ Twitter @ twitter.com/adamwarski
SUMMARY
➤ Individual, selective message acknowledgments
➤ similar to SQS
➤ Alternative to batch/up-to-offset acknowledgments in plain Kafka
➤ Storage overhead: additional meta-data topic
➤ Performance: comparable to plain Kafka, with somewhat higher latency (50 ms to 130 ms vs. ~50 ms)
➤ Integrating with external systems
LINKS
➤ GitHub: https://github.com/softwaremill/kmq
➤ Introductory blog: https://softwaremill.com/using-kafka-as-a-message-queue/
➤ Message queue performance: https://softwaremill.com/mqperf/
➤ @adamwarski / adam@warski.org
THANK YOU!
