2016
HOW TO COOK APACHE KAFKA
WITH CAMEL AND SPRING BOOT
Java EE conference 2016
Ivan Vasyliev
Playtika Core Services Team
AGENDA
Basics of Apache Kafka
Apache Camel
Spring Boot
Demo
Q&A
CODE SLIDES
WHY APACHE KAFKA?
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
Designed for large scale
Widely adopted by top tech companies
Hardened production quality product
Data replication out of the box
FEATURES
At most once, at least once guarantees
Batching for high throughput cases
Efficient with DEFAULT settings
EVEN MORE FEATURES
Mirroring between datacenters
Connectors to various DWH
Complex event processing integrations
HIGH LEVEL VIEW
http://kafka.apache.org/documentation.html#introduction
Publisher/subscriber and point-to-point models
Producer: a client that sends messages
Consumer: a client that receives messages
WHAT IS NOT INCLUDED - JMS
Not a JMS compliant server
No message headers: employ the message key, send them in the payload, or wait for it (on roadmap)
No transactions/JTA support
WHAT IS NOT INCLUDED - EXACTLY ONCE GUARANTEE
No exactly once guarantee
Duplicates because of failures
De-duplication is on roadmap
De-duplication on consumer
With camel EIP, by message ID/body
Consumer can tolerate duplicates
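Consumer-side de-duplication by message ID can be sketched in plain Java. This is not Camel's Idempotent Consumer implementation: `DuplicateFilter` and its capacity parameter are hypothetical names chosen for illustration, and the bounded memory is an assumption, so a duplicate that arrives after its ID has been evicted would not be caught.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Remembers recently seen message IDs so that redelivered messages can be skipped. */
final class DuplicateFilter {
    private final Map<String, Boolean> seen;

    DuplicateFilter(final int capacity) {
        // Insertion-ordered map that forgets the oldest ID once capacity is exceeded.
        this.seen = new LinkedHashMap<String, Boolean>() {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Boolean> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Returns true the first time an ID is offered, false when it is a known duplicate. */
    synchronized boolean firstTime(String messageId) {
        return seen.put(messageId, Boolean.TRUE) == null;
    }

    public static void main(String[] args) {
        DuplicateFilter filter = new DuplicateFilter(1000);
        for (String id : new String[] {"m-1", "m-2", "m-1"}) {
            System.out.println(id + (filter.firstTime(id) ? ": processed" : ": skipped as duplicate"));
        }
    }
}
```

An in-memory filter like this loses its state on restart, so it only narrows the window for duplicates; a consumer whose processing is naturally idempotent does not need it at all.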
APACHE KAFKA LANGUAGE
http://research.microsoft.com/en-us/um/people/srikanth/netdb11/netdb11papers/netdb11-final12.pdf
Topic - represents stream of messages
Contains set of partitions
Partition - subset of messages in stream
Partitioning is done by message key on producer
No “queue” in the vocabulary
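Partitioning by message key on the producer boils down to hashing the key and taking the result modulo the partition count. The sketch below assumes a simple byte-array hash as a stand-in; `KeyPartitioner` is a hypothetical name, and Kafka's own partitioner uses a different hash function, so real partition numbers will differ.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class KeyPartitioner {
    /** Maps a message key to a partition index in the range [0, numPartitions). */
    static int partitionFor(String key, int numPartitions) {
        if (numPartitions <= 0) {
            throw new IllegalArgumentException("numPartitions must be positive");
        }
        // Stand-in hash over the key bytes; Kafka's own partitioner hashes differently.
        int hash = Arrays.hashCode(key.getBytes(StandardCharsets.UTF_8));
        // Clear the sign bit instead of Math.abs, which stays negative for Integer.MIN_VALUE.
        return (hash & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        for (String key : new String[] {"user-1", "user-2", "user-1"}) {
            System.out.println(key + " -> partition " + partitionFor(key, 4));
        }
    }
}
```

The same key always lands in the same partition as long as the partition count stays fixed, which is what keeps messages for one key in order.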
TOPICS AND PARTITIONS
http://kafka.apache.org/documentation.html#intro_topics
A partition is the smallest unit of storage in Kafka
A partition is a data file with messages
The producer always appends to the end of the file
Consumers scroll/seek over the file
The consumer offset is persisted (in ZooKeeper or in Kafka)
Strong ordering guarantees for the consumer within a partition
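The append-and-seek model can be illustrated with an in-memory stand-in for one partition. `PartitionLog` is a hypothetical class, not part of any Kafka API, and it keeps messages in a list rather than in a file; it only shows that offsets are assigned on append and that readers choose where to start.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class PartitionLog {
    private final List<String> messages = new ArrayList<>();

    /** Appends to the end of the log and returns the offset assigned to the message. */
    synchronized long append(String message) {
        messages.add(message);
        return messages.size() - 1;
    }

    /** Reads up to maxMessages starting at fromOffset, in append order. */
    synchronized List<String> read(long fromOffset, int maxMessages) {
        if (fromOffset < 0 || fromOffset >= messages.size() || maxMessages <= 0) {
            return Collections.emptyList();
        }
        int from = (int) fromOffset;
        int to = (int) Math.min((long) messages.size(), fromOffset + (long) maxMessages);
        return new ArrayList<>(messages.subList(from, to));
    }

    /** Offset that the next appended message will receive. */
    synchronized long endOffset() {
        return messages.size();
    }

    public static void main(String[] args) {
        PartitionLog log = new PartitionLog();
        log.append("a");
        log.append("b");
        log.append("c");
        System.out.println(log.read(1, 10));
    }
}
```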
QUEUE SEMANTICS ARE IMPLEMENTED ON THE CLIENT
http://kafka.apache.org/documentation.html#intro_consumers
QUEUE
Consumer offset is persisted per group ID and per partition
Queue semantics inside a consumer group
Topic semantics between consumer groups
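A rough model of these two rules: offsets are stored per group and partition, and inside a group each partition has exactly one owner. `ConsumerGroups` and its round-robin assignment are assumptions made for this sketch; Kafka's actual assignment strategies are more involved.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ConsumerGroups {
    // Next offset to read, keyed by "groupId/partition".
    private final Map<String, Long> committed = new HashMap<>();

    void commit(String groupId, int partition, long nextOffset) {
        committed.put(groupId + "/" + partition, nextOffset);
    }

    /** Groups are independent: a group with no commit starts from 0 in this sketch. */
    long position(String groupId, int partition) {
        return committed.getOrDefault(groupId + "/" + partition, 0L);
    }

    /** Spreads partitions over the consumers of one group; each partition gets one owner. */
    static Map<String, List<Integer>> assign(List<String> consumers, int numPartitions) {
        if (consumers.isEmpty()) {
            throw new IllegalArgumentException("a group needs at least one consumer");
        }
        Map<String, List<Integer>> owned = new LinkedHashMap<>();
        for (String consumer : consumers) {
            owned.put(consumer, new ArrayList<>());
        }
        for (int p = 0; p < numPartitions; p++) {
            owned.get(consumers.get(p % consumers.size())).add(p);
        }
        return owned;
    }
}
```

Because every partition has a single owner within a group, the group as a whole behaves like a queue; a second group keeps its own offsets and therefore sees every message again, which gives topic behaviour.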
CONSUMPTION IS ALL ABOUT OFFSETS
https://hadoopabcd.wordpress.com/2015/04/11/kafka-building-a-real-time-data-pipeline/
Consumer polls data from the broker
Consumer offset is sent (committed) to the server
Auto offset commit enabled: committed by a separate thread, periodically
Auto offset commit disabled: committed by your code, when a batch of messages has been processed
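The choice between the two modes is a consumer setting. The sketch below builds such settings with plain `java.util.Properties`; the property names are those of the newer Java consumer (the old consumer API used different names), while `ConsumerSettings`, the broker address and the 5000 ms interval are placeholders assumed for illustration.

```java
import java.util.Properties;

final class ConsumerSettings {
    /** Builds consumer settings with auto-commit switched on or off. */
    static Properties forGroup(String groupId, boolean autoCommit) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("group.id", groupId);
        props.setProperty("enable.auto.commit", Boolean.toString(autoCommit));
        if (autoCommit) {
            // How often offsets are committed in the background (illustrative value).
            props.setProperty("auto.commit.interval.ms", "5000");
        }
        return props;
    }

    public static void main(String[] args) {
        System.out.println(forGroup("payments", false));
    }
}
```

With auto-commit off, the application is expected to commit offsets itself after it has finished processing a batch.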
CONSUMER OFFSET AND AUTO-COMMIT
With “auto-commit” enabled you can lose messages
Step 1: One thread did not finish processing and failed
Step 2: The auto-commit thread does not care and commits the offset anyway
Auto-commit is OK for status heartbeats
Auto-commit is NOT OK if you need an “at least once” guarantee, e.g. payment processing
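The loss scenario can be put into numbers with a deliberately simplified model. `CommitSimulation` is hypothetical and assumes the worst case for auto-commit: the background commit covers the whole polled batch before the processing thread fails.

```java
final class CommitSimulation {
    /** Offset stored on the server when the processing thread crashes mid-batch. */
    static long committedOffset(int polled, int processedBeforeCrash, boolean autoCommit) {
        // Auto-commit (worst case): the whole polled batch is committed regardless of progress.
        // Manual commit: only what the code has finished processing is committed.
        return autoCommit ? polled : processedBeforeCrash;
    }

    /** Messages skipped after a restart because the committed offset is already past them. */
    static long lostAfterCrash(int polled, int processedBeforeCrash, boolean autoCommit) {
        long committed = committedOffset(polled, processedBeforeCrash, autoCommit);
        return Math.max(0L, committed - processedBeforeCrash);
    }

    public static void main(String[] args) {
        System.out.println("auto-commit, lost:   " + lostAfterCrash(10, 3, true));
        System.out.println("manual commit, lost: " + lostAfterCrash(10, 3, false));
    }
}
```

With manual commits nothing is lost in this model, but the message that was in flight during the crash is read again after the restart, which is exactly the at-least-once trade-off.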
DATA REPLICATION
Leader: receives all reads and writes, decides when to commit a message
Follower: syncs messages from the leader, takes over if the leader is down
Replication controller: maintains the leader
ZooKeeper: used for coordination (leader election, consensus protocol)
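One way to picture how the leader decides when a message is committed: a message counts as committed once every in-sync replica has it. The sketch assumes the leader knows each in-sync follower's log end offset; `ReplicatedPartition` is a hypothetical name, and tracking which followers are in sync is left out.

```java
final class ReplicatedPartition {
    /**
     * Returns the offset up to which (exclusive) all in-sync replicas hold the data.
     * Messages below this offset can be treated as committed.
     */
    static long committedUpTo(long leaderEndOffset, long[] inSyncFollowerEndOffsets) {
        long min = leaderEndOffset;
        for (long followerEnd : inSyncFollowerEndOffsets) {
            min = Math.min(min, followerEnd);
        }
        return min;
    }

    public static void main(String[] args) {
        // Leader has 10 messages, one follower has caught up to 8, the other to 10.
        System.out.println(committedUpTo(10, new long[] {8, 10}));
    }
}
```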
APACHE KAFKA PRODUCER
Performs load balancing
Uses the message key to select a partition
Finds the leader broker for that partition
Has a few configurable acknowledgement modes
Can do batching in async mode
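Acknowledgement modes and batching are producer settings. The sketch below collects them in `java.util.Properties`, using the property names of the newer Java producer; `ProducerSettings`, the `Durability` enum and the broker address are assumptions introduced for illustration.

```java
import java.util.Properties;

final class ProducerSettings {
    /** How many acknowledgements the producer waits for before treating a send as done. */
    enum Durability {
        NONE("0"), LEADER_ONLY("1"), ALL_IN_SYNC_REPLICAS("all");

        final String acks;

        Durability(String acks) {
            this.acks = acks;
        }
    }

    static Properties build(Durability durability, int lingerMs, int batchSizeBytes) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // placeholder address
        props.setProperty("acks", durability.acks);
        // Batching: wait up to lingerMs for more records, up to batchSizeBytes per partition.
        props.setProperty("linger.ms", Integer.toString(lingerMs));
        props.setProperty("batch.size", Integer.toString(batchSizeBytes));
        return props;
    }

    public static void main(String[] args) {
        System.out.println(build(Durability.ALL_IN_SYNC_REPLICAS, 5, 16384));
    }
}
```

Waiting for all in-sync replicas trades latency for durability, while a larger linger time trades latency for throughput.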
DELIVERY GUARANTEED
Durability with ack levels on producer side
Data replication between brokers
No in-memory state, efficient persistence
Manually committing offset on consumer side
ISSUES - OPS
Ops is not free
There is Zookeeper on board
Easy to set up with Docker/Rancher
Need to learn the basics to set up and monitor
ISSUES – DATA
Can’t auto-scale existing data
Option 1: Add new partitions, they will go to new nodes
Option 2: Do it manually, move partitions around
Option 3: Wait for it, on roadmap
Mirroring seems to work in one direction only
Can’t handle a very large number of topics
WHY APACHE CAMEL?
Message routing DSL (Java/Scala/Groovy)
Enterprise Integration Patterns: Idempotent Consumer (de-duplication), Aggregator, …
Abstractions for testing: MockEndpoint, Route Advice
APACHE CAMEL
http://camel.apache.org/java-dsl.html
Lightweight and embeddable
Spring boot integration
Connectors to various message and data sources
SPRING BOOT
Fat jar, JEE-containerless deployment
Autoconfiguration and conditionals
Codeless usage of Spring Cloud/Netflix projects
GOTCHAS – PRODUCER FASTER THAN CONSUMER, PRECONDITIONS
It's not recommended to have lots of partitions
Each partition is consumed by one consumer thread
Producer X times faster than consumer
GOTCHAS – PRODUCER FASTER THAN CONSUMERS, ACTIONS
Monitor Kafka lag: messages not yet consumed by the group
Add an intermediate multiplexing queue: see the Camel “seda” component, but think carefully, since in-memory state can lead to data loss
Consider adding more partitions: this allows more consumption threads
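Lag for a group is the gap between where each partition ends and where the group has committed. A minimal sketch, assuming both sets of offsets have already been fetched into maps keyed by partition number; `LagCalculator` is a hypothetical helper, and a partition without a committed offset is counted from offset 0.

```java
import java.util.HashMap;
import java.util.Map;

final class LagCalculator {
    /** Sums (log end offset - committed offset) over all partitions for one consumer group. */
    static long totalLag(Map<Integer, Long> endOffsets, Map<Integer, Long> committedOffsets) {
        long total = 0;
        for (Map.Entry<Integer, Long> entry : endOffsets.entrySet()) {
            long committed = committedOffsets.getOrDefault(entry.getKey(), 0L);
            total += Math.max(0L, entry.getValue() - committed);
        }
        return total;
    }

    public static void main(String[] args) {
        Map<Integer, Long> end = new HashMap<>();
        end.put(0, 100L);
        end.put(1, 50L);
        Map<Integer, Long> committed = new HashMap<>();
        committed.put(0, 90L);
        System.out.println("lag = " + totalLag(end, committed));
    }
}
```

A lag that keeps growing over time is the signal that the producer is outpacing the group.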
GOTCHAS – PRODUCER FASTER THAN CONSUMERS, TOOLS
https://github.com/quantifind/KafkaOffsetMonitor
GOTCHAS – AUTO OFFSET RESET
When you start a test, you do not receive any messages
The producer sends the message before the consumer is up
Check the auto.offset.reset setting in the unit test
latest (largest in the old API) can lead to consumption of only new messages
earliest (smallest in the old API) means “from the beginning”
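The effect of the setting can be expressed as a small decision function. `OffsetReset` is a hypothetical helper that uses a negative value to mean "no committed offset for this group"; it ignores the case where a committed offset has fallen out of the log's range.

```java
final class OffsetReset {
    /** Decides where a consumer starts reading a partition. */
    static long startingOffset(long committedOffset, String autoOffsetReset, long logStart, long logEnd) {
        if (committedOffset >= 0) {
            return committedOffset; // a stored offset wins; the reset policy is not consulted
        }
        switch (autoOffsetReset) {
            case "earliest":
                return logStart; // read everything that is still in the log
            case "latest":
                return logEnd;   // read only messages produced from now on
            default:
                throw new IllegalArgumentException("unsupported policy: " + autoOffsetReset);
        }
    }

    public static void main(String[] args) {
        // A test that produced 5 messages before the consumer started:
        System.out.println("latest   -> " + startingOffset(-1, "latest", 0, 5));
        System.out.println("earliest -> " + startingOffset(-1, "earliest", 0, 5));
    }
}
```

In the test scenario from the slide, a fresh group with `latest` starts at the end of the log and never sees the messages that were sent before it came up.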
GOTCHAS – CLIENT VERSION MIGHT NEED TO MATCH SERVER
Clients are supposed to be “backward compatible”, but …
If you see weird things, check the classpath
GOTCHAS – WATCH THE CLASSPATH
Multiple versions of the Kafka client
Multiple versions of Kafka client dependencies
Multiple versions of the ZooKeeper client
DEPENDENCY MANAGEMENT
Use dependency management to force versions and exclusions
Use the “Maven Helper” IntelliJ plugin to check for issues
https://github.com/krasa/MavenHelper
https://plugins.jetbrains.com/plugin/7179
Thank you!
ivasylyev@playtika.com
Join us:
http://goo.gl/LuWMo3



Editor's Notes

  • #3: I'm Ivan Vasyliev. I currently work at Playtika, where I develop high-load, game-adjacent services.
  • #4: I'll cover the basic concepts of Kafka, explain why you need Apache Camel and where Spring Boot comes in, show an example, and try to answer questions.
  • #5: JMS brokers scale poorly. They need third-party replication solutions or shared storage. Kafka scales well and has a built-in replication mechanism.
  • #6: Uber, Netflix, Cisco, PayPal. Even though the version is still 0.x.x, it is ready for production.
  • #7: At-least-once and at-most-once are implemented. Messages can be sent and received in batches to increase throughput. THIS IS IMPORTANT: it works with the default settings.
  • #8: There is mirroring between datacenters. There are connectors to DWH databases for BI. There are integrations with the latest CEP solutions.
  • #9: From a bird's-eye view, Kafka is a standard broker: an intermediary for asynchronous message exchange and a necessary component for service choreography.
  • #10: Queue and topic semantics. Message senders are producers; receivers are consumers.
  • #11: Kafka is not a JMS server. There is no JMS client either. You can throw away all of your code, and that is not a bad thing.
  • #12: For example, there are no headers; you can use the key or the payload instead. There are no transactions or the other things described in the JMS spec.
  • #13: Kafka does NOT implement exactly-once semantics. I don't think anyone really implements it, but here they admit it honestly. You will have to solve this in application code.
  • #14: Duplicates arise for various reasons: the send button was pressed twice, or there were network errors. Your consumer must be ready to handle duplicates. Use Camel and its patterns.
  • #15: A Kafka broker contains topics. Topics contain partitions. A partition is a file with messages. A topic is a folder of files.
  • #16: A topic is a stream of messages. A partition is a portion of the messages in the stream. Kafka has no queues, but queue semantics are possible.
  • #17: A partition is a file. The producer always appends to the file. The consumer scans (seeks) the file to read messages.
  • #18: The consumer offset is stored on the server, usually in ZooKeeper, but in newer Kafka it can also be stored in the broker. Messages are delivered in the order they were sent (if there are no errors).
  • #19: A queue is built by combining consumers into groups. Each partition is processed by only one consumer in the group. A message is read once per group, regardless of the number of consumers.
  • #20: The offset in a partition is stored per group name. Inside a group we get queue semantics. Between groups we get topic semantics.
  • #21: This means that one group can have a different offset in each partition.
  • #22: Consumers poll data from the broker. The offset is stored on the server. With auto-commit enabled, the offset is sent from a separate thread. With auto-commit disabled, your code sends it once processing has finished.
  • #23: If you use auto-commit, you can lose messages.
  • #24: Auto-commit is fine for sending and processing status messages. Auto-commit is not fine when important messages, such as payments, are being processed.
  • #25: In data replication a broker can play several roles: leader for a partition, follower for a partition, replication controller, coordinator.
  • #26: The leader works with clients. A follower pulls changes from the leader. The controller watches over the leader and the followers. ZooKeeper helps elect the leader.
  • #27: The Kafka producer can do load balancing.
  • #28: The producer selects a partition by key (by hashing), finds the partition's leader and sends the messages. It waits, or does not wait, for acknowledgement from the replicas. It can send messages in batches.
  • #29: Kafka provides guaranteed message delivery. Data is not held in memory; everything is stored on disk.
  • #30: For guaranteed delivery it is important to choose the right number of partition replicas that must acknowledge a send, to configure replication correctly, and to commit the offset manually.
  • #31: Administration is not free, and ZooKeeper is one of the dependencies. However, there are options for installing it with Docker. To monitor Kafka and understand its metrics, you need to understand how it works.
  • #32: If you add nodes to the cluster, existing data will not move by itself. Mirroring works in one direction. You will not be able to create a topic per user.
  • #33: The Kafka API is a low-level layer for working with the broker. You will start inventing a higher level of abstraction. You will get it wrong.
  • #34: Camel already provides declarative message processing. It also lets you use implementations of the EIPs. And there are abstractions for testing.
  • #35: An example of processing a file with Camel. Declarative message processing.
  • #36: I don't recommend ServiceMix/Fuse/Fabric and other pseudo-containers. Camel can be embedded in your application. There is also auto-configuration for Boot, and there are adapters for a whole range of data and message sources.
  • #37: Spring Boot deserves a presentation of its own. For those who don't know it, it is a framework for building services, possibly microservices.
  • #38: In short, Boot provides utilities for fat-jar services. It also provides an auto-configuration mechanism for dependencies. This lets you use, for example, the Netflix projects.
  • #39: And now a bit of code.
  • #40: The authors do not recommend having many partitions. One partition is processed by one consumer in the group. Producers can be much faster than consumers.
  • #41: You need to monitor lag (unread messages) per group. You can add an in-memory queue, move messages into it and process it with a larger number of threads. You can try adding more partitions.
  • #42: There is a tool for monitoring Kafka lag; here is its UI.
  • #43: In tests you may find that you send messages but do not receive them. Check the auto.offset.reset setting; it may be set to "receive only new messages".
  • #44: Ideally the client version should be the same as the server version or lower. If you see something strange, check the classpath.
  • #46: There is a wonderful GUI plugin for IDEA that helps you resolve dependencies.