©2016 LinkedIn Corporation. All Rights Reserved.
Kafka 0.9, Things you should know
Ratish Ravindran
Site Reliability Engineer
LinkedIn, Data Infrastructure Streaming
Agenda
• Security
• Kafka Connect
• User defined quota
• New consumer
• Notable improvements and fixes
• Upgrading from kafka 0.8
• Kafka 0.10 - highlights
Security
• Why?
• Multitenant cluster
• Multiple clusters
• Network ACLs
Security
• Authentication
• Kerberos
• TLS
• Unix like permission
• "Principal P is [Allowed/Denied] Operation O From Host H On Resource R"
bin/kafka-acls.sh --authorizer-properties zookeeper.connect=localhost:2181 --add \
  --allow-principal User:Bob --allow-principal User:Alice \
  --allow-host 198.51.100.0 --allow-host 198.51.100.1 \
  --operation Read --operation Write --topic Test-topic
• Encryption
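The rule format "Principal P is [Allowed/Denied] Operation O From Host H On Resource R" can be read as a small matching problem. The sketch below is a toy model, not Kafka's authorizer: the `Acl` class and `is_authorized` function are hypothetical names, and it assumes that a matching Deny rule overrides any Allow rule and that a request with no matching rule is rejected.

```python
from dataclasses import dataclass

WILDCARD_HOST = "*"  # assumption: "*" stands for "any host" in this toy model


@dataclass(frozen=True)
class Acl:
    principal: str   # e.g. "User:Bob"
    permission: str  # "Allow" or "Deny"
    operation: str   # e.g. "Read" or "Write"
    host: str        # e.g. "198.51.100.0", or "*" for any host
    resource: str    # e.g. "Topic:Test-topic"


def _matches(acl, principal, operation, host, resource):
    """True if the rule applies to this exact request."""
    return (
        acl.resource == resource
        and acl.principal == principal
        and acl.operation == operation
        and acl.host in (host, WILDCARD_HOST)
    )


def is_authorized(acls, principal, operation, host, resource):
    """Deny wins over Allow; a request with no matching rule is rejected."""
    matching = [a for a in acls if _matches(a, principal, operation, host, resource)]
    if any(a.permission == "Deny" for a in matching):
        return False
    return any(a.permission == "Allow" for a in matching)


# Rules equivalent to the kafka-acls.sh command on this slide:
# every combination of principal, host and operation on Test-topic.
acls = [
    Acl(principal, "Allow", operation, host, "Topic:Test-topic")
    for principal in ("User:Bob", "User:Alice")
    for host in ("198.51.100.0", "198.51.100.1")
    for operation in ("Read", "Write")
]
```

With these rules, Bob reading from 198.51.100.0 is allowed, while any other principal, or Bob connecting from a different host, is rejected because no rule matches.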
Kafka Connect
• Why?
• Multiple tools for importing and exporting
• High engineering and operational overhead
• Some tools are a poor fit for the job
Kafka Connect
[Diagram: data sources A, B and C, Kafka, and data sinks 1, 2 and 3, with six separate tools (T1 to T6) moving data between them]
Kafka Connect
[Diagram: data sources A, B and C, Kafka, and data sinks 1, 2 and 3, with two Kafka Connect boxes in place of the separate tools]
Kafka Connect
Key Properties:
• Broad copying by default
• Streaming and batch
• Scales to application
• Focus on copying data only
• Parallel
• Connector API
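To illustrate the "Parallel" and "Connector API" points: a connector typically breaks a copy job into several task configurations, up to a configured maximum, so the work can run in parallel. The sketch below is a minimal Python model of that splitting step. The function names, the `tables` key and the table names are hypothetical, and this is not the Kafka Connect Java API.

```python
def group_partitions(elements, num_groups):
    """Split elements into num_groups contiguous groups of near-equal size."""
    if num_groups <= 0:
        raise ValueError("num_groups must be positive")
    base, extra = divmod(len(elements), num_groups)
    groups, start = [], 0
    for i in range(num_groups):
        # The first `extra` groups take one additional element each.
        size = base + (1 if i < extra else 0)
        groups.append(elements[start:start + size])
        start += size
    return groups


def task_configs(tables, max_tasks):
    """Return one config dict per task, never more tasks than tables."""
    if not tables:
        return []
    num_tasks = min(max_tasks, len(tables))
    return [{"tables": ",".join(group)} for group in group_partitions(tables, num_tasks)]


# Hypothetical example: five source tables copied by at most two tasks.
configs = task_configs(["users", "orders", "items", "carts", "logs"], max_tasks=2)
```

Here `configs` holds two entries: the first task copies `users,orders,items` and the second copies `carts,logs`.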
Kafka Connect
Advantages:
• Fault tolerance
• Partitioning
• Offset management
• Delivery semantics
• Operations
• Monitoring
User defined quota
• Why?
• High reads
• High writes
• SLAs
User defined quota
• Single large cluster
• Producer side (quota.producer.default)
• Consumer side (quota.consumer.default)
• Per client, per broker
• Quota override (per client ID)
./bin/kafka-configs.sh --alter \
  --add-config 'producer_byte_rate=1048576,consumer_byte_rate=1048576' \
  --entity-type clients \
  --entity-name TestTopic \
  --zookeeper localhost:2181
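A byte-rate quota is enforced by slowing down a client that exceeds it rather than by failing its requests. The sketch below shows one simple way such a delay could be computed. The function name and the formula (excess ratio multiplied by the measurement window) are assumptions made for illustration, not a statement of the broker's exact logic.

```python
def throttle_delay_ms(observed_bytes_per_sec, quota_bytes_per_sec, window_ms):
    """Delay in milliseconds that would pull a client back under its quota.

    Assumed formula: (observed - quota) / quota * window.
    A client at or below its quota is not delayed.
    """
    if quota_bytes_per_sec <= 0:
        raise ValueError("quota must be positive")
    if observed_bytes_per_sec <= quota_bytes_per_sec:
        return 0
    excess_ratio = (observed_bytes_per_sec - quota_bytes_per_sec) / quota_bytes_per_sec
    return round(excess_ratio * window_ms)


# The override above sets 1048576 bytes/s (1 MiB/s) for producing and consuming.
# A client observed at 1.5 MiB/s over an assumed 10-second window:
delay = throttle_delay_ms(1_572_864, 1_048_576, 10_000)
```

In this example the client is 50% over its quota, so `delay` comes out at 5000 ms, half of the window.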
New consumer
Motivation:
• Thin consumer client
• Central co-ordination
• Allow manual partition assignment
• Allow manual offset management
• Invocation of user-specified callback on rebalance
• Non-blocking consumer APIs
New consumer
Features:
• Group management protocol
• Consumer
New consumer
State Diagram - consumer
New consumer
Features:
• Group management protocol
• Consumer
• Co-ordinator
New consumer
State Diagram – Co-ordinator
New consumer
Features:
• Group management protocol
• Consumer
• Co-ordinator
• Failure detection protocol
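Failure detection in the group protocol rests on heartbeats: each consumer heartbeats to the co-ordinator, and a member that stays silent longer than its session timeout is dropped, which triggers a rebalance. The sketch below is a toy, single-process model of that idea. The class and method names are hypothetical, and bumping a generation counter stands in for a full rebalance.

```python
class GroupCoordinator:
    """Toy model: tracks heartbeats and expires members that stay silent."""

    def __init__(self, session_timeout_ms):
        self.session_timeout_ms = session_timeout_ms
        self.last_heartbeat_ms = {}  # member id -> time of the last heartbeat
        self.generation = 0          # bumped whenever membership changes

    def join(self, member_id, now_ms):
        """A new member joins; membership changed, so start a new generation."""
        self.last_heartbeat_ms[member_id] = now_ms
        self.generation += 1

    def heartbeat(self, member_id, now_ms):
        """Record a heartbeat; an unknown (expired) member is told to rejoin."""
        if member_id not in self.last_heartbeat_ms:
            return False
        self.last_heartbeat_ms[member_id] = now_ms
        return True

    def expire_members(self, now_ms):
        """Drop members whose last heartbeat is older than the session timeout."""
        expired = sorted(
            member for member, seen in self.last_heartbeat_ms.items()
            if now_ms - seen > self.session_timeout_ms
        )
        for member in expired:
            del self.last_heartbeat_ms[member]
        if expired:
            self.generation += 1  # stands in for triggering a rebalance
        return expired


coordinator = GroupCoordinator(session_timeout_ms=30_000)
coordinator.join("consumer-1", now_ms=0)
coordinator.join("consumer-2", now_ms=0)
coordinator.heartbeat("consumer-1", now_ms=20_000)
expired = coordinator.expire_members(now_ms=35_000)
```

Only `consumer-2` is expired here: it has been silent for 35 seconds against a 30-second timeout, while `consumer-1` sent a heartbeat 15 seconds ago.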
New consumer
Interesting scenarios:
• Co-ordinator failover/connection loss
• Partition changes for subscribed topics
• Offset commit during rebalance
• Heartbeats during rebalance
• Slow consumers
Notable improvements and fixes
• Automated replica lag tuning (replica.lag.time.max.ms)
• New purgatory design – low memory overhead
• Auto-assign node ids
• No data loss in Mirror Maker – unclean shutdown
• Log compaction for compressed topics
• Handling of corrupt index files
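The replica lag item refers to judging followers by time instead of by a message count. A rough sketch of a time-based in-sync check is shown below. It assumes the leader records when each follower was last fully caught up; the function name, data layout and broker ids are hypothetical.

```python
def in_sync_replicas(last_caught_up_ms, now_ms, replica_lag_time_max_ms):
    """Return the replica ids that caught up within the allowed lag time.

    last_caught_up_ms maps a replica id to the last time (in ms) at which
    that follower had fetched up to the leader's log end (an assumption).
    """
    return sorted(
        replica for replica, caught_up in last_caught_up_ms.items()
        if now_ms - caught_up <= replica_lag_time_max_ms
    )


# Hypothetical brokers 1 to 3; broker 2 has not caught up for 12 seconds.
isr = in_sync_replicas({1: 9_500, 2: 0, 3: 9_000}, now_ms=12_000,
                       replica_lag_time_max_ms=10_000)
```

With a 10-second limit, `isr` contains brokers 1 and 3 only. A single time-based setting like this does not need retuning per topic the way a message-count threshold does.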
Upgrading from kafka 0.8
• inter.broker.protocol.version=0.8.2.x
• Update code and restart
• inter.broker.protocol.version=0.9.0.0
• Restart brokers again
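The four bullets above describe two rolling passes over the cluster: first deploy the new code with the inter-broker protocol pinned to the old version, then raise the protocol version and restart again. A small sketch that spells out the resulting order of operations follows. The function name and broker names are hypothetical.

```python
def rolling_upgrade_plan(brokers, old_protocol="0.8.2.x", new_protocol="0.9.0.0"):
    """List the steps of a two-pass rolling upgrade, one broker at a time."""
    plan = []
    # Pass 1: new code everywhere, still speaking the old inter-broker protocol.
    for broker in brokers:
        plan.append(
            f"{broker}: set inter.broker.protocol.version={old_protocol}, "
            "update code, restart"
        )
    # Pass 2: only after every broker runs the new code, switch the protocol.
    for broker in brokers:
        plan.append(
            f"{broker}: set inter.broker.protocol.version={new_protocol}, restart"
        )
    return plan


plan = rolling_upgrade_plan(["broker-1", "broker-2"])
```

The ordering is the point: no broker switches to the new protocol until all brokers have been restarted on the new code, so old and new brokers can keep talking to each other during the first pass.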
Upgrading from kafka 0.8
Potential Breaking Changes:
• Java 1.6 and Scala 2.9 are not supported
• Broker IDs > 1000 (reserved.broker.max.id and broker.id.generation.enable)
• replica.lag.max.messages removed
• replica.lag.time.max.ms
• No compaction for topics without key
Upgrading from kafka 0.8
Potential Breaking Changes (contd.):
• Changes in default JVM options
Why not kafka 0.10?
Kafka 0.10 - Highlights
• Kafka Streams
• Rack Awareness
• Timestamps in messages
• SASL improvements
• Consumer max records per poll (max.poll.records)
• Protocol version improvements
References
• http://kafka.apache.org/090/documentation.html
• http://www.confluent.io/blog/apache-kafka-0.9-is-released
• http://www.confluent.io/blog/announcing-apache-kafka-0.10-and-confluent-platform-3.0
• https://cwiki.apache.org/confluence/display/KAFKA/Kafka+0.9+Consumer+Rewrite+Design