©2017 LinkedIn Corporation. All Rights Reserved.
An Introduction to Apache Kafka and
Kafka Ecosystem at LinkedIn
Dong Lin
Data Infra Streaming @ LinkedIn
Open Data Science Conference
Agenda
▪ Kafka basics (50 min)
▪ Kafka ecosystem at LinkedIn (40 min)
▪ Hands-on (30 min)
Kafka basics
▪ What is Kafka?
– Motivation and design philosophy
▪ Who uses Kafka?
– Adoption in the open source community and use-cases at LinkedIn
▪ What is the fundamental design of Kafka?
– Partition and replication model
▪ How to configure Kafka for your use-case?
– Tradeoff among performance, persistence, availability and message order
▪ What is the development roadmap of Kafka?
– Recent and upcoming features
Publish/Subscribe Messaging
• Multiple producers
• Multiple consumers
• Scalable and durable
• Created by LinkedIn
• Open sourced under Apache
Direct transmission
[Diagram: Web server sends PageViewEvent directly to Hadoop]
Many problems
[Diagram: Web server sends PageViewEvent directly to Hadoop, annotated with the problems below]
▪ Multiple producers
▪ Multiple consumers
▪ Destination is slow
▪ Destination temporarily unavailable
▪ Destination permanent failure
▪ Bug in downstream application
▪ At least once delivery
Use a publish-subscribe messaging system
[Diagram: Web server, Pub/sub system, Hadoop, annotated with the same problems]
▪ Multiple producers
▪ Multiple consumers
▪ Destination is slow
▪ Destination temporarily unavailable
▪ Destination permanent failure
▪ Bug in downstream application
▪ At least once delivery
Use Kafka
[Diagram: Web server, Kafka, Spark streaming, annotated with the same problems and the Kafka properties that address them]
▪ Problems: multiple producers, multiple consumers, destination is slow, destination temporarily unavailable, destination permanent failure, bug in downstream application, at least once delivery
▪ Kafka properties: Functionality, Persistent, Delivery semantics, Performance, Availability
Problem: closely-coupled pipelines
▪ O(N^2) pipelines – limited organizational scalability
▪ Messages are duplicated in proportion to the number of clients
Solution: publish-subscribe messaging system
▪ O(N) pipelines
▪ Space efficient
▪ Producers are decoupled from consumers
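The O(N^2) versus O(N) claim can be checked with a little arithmetic. The sketch below is only an illustration: the function names are made up for this example, and it assumes every producer must reach every consumer.

```python
def point_to_point_pipelines(producers: int, consumers: int) -> int:
    """Direct transmission: one pipeline per (producer, consumer) pair."""
    return producers * consumers


def pubsub_pipelines(producers: int, consumers: int) -> int:
    """Pub/sub: each producer and each consumer connects once to the broker."""
    return producers + consumers


for n in (5, 10, 50):
    print(n, point_to_point_pipelines(n, n), pubsub_pipelines(n, n))
```

With 10 systems on each side, direct transmission needs 100 pipelines while the pub/sub layout needs 20.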
Kafka as Unix Pipes
$ cat *.txt | tr A-Z a-z | grep hello
$ tail -F *.txt | tr A-Z a-z | grep hello
[Diagram: Kafka plays the role of the pipe between stages: producer | kafka | Hadoop | kafka | Hadoop, and Samza | kafka | Samza]
Reference: http://www.confluent.io/blog
Fan In
Fan Out
Add Branch
Switch Branch
Delete Branch
Parallel Consumption
Kafka basics
▪ What is Kafka?
– Motivation and design philosophy
▪ Who uses Kafka?
– Adoption in the open source community and use-cases at LinkedIn
▪ What is the fundamental design of Kafka?
– Partition and replication model
▪ How to configure Kafka for your use-case?
– Tradeoff among performance, persistence, availability and message order
▪ What is the development roadmap of Kafka?
– Recent and upcoming features
Companies that use Kafka
LinkedIn Yahoo Twitter Airbnb
Pinterest Square Coursera Uber
Goldman Sachs Box Paypal Cisco
Dropbox Spotify Wikipedia Microsoft
Netflix CloudFlare Hotels.com …
Reference: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
Apache projects integrated with Kafka
• Stream processing
• Apache Storm
• Apache Samza
• Apache Spark Streaming
• Search and Query
• Apache Hive
• Presto
• Apache Hadoop
…
Kafka volume at LinkedIn
▪ 2 trillion messages: produced per day
▪ 5 Gbps inbound: single cluster, unique data
▪ 18 Gbps outbound: average 3X consumption, before mirroring
▪ 2.5M partitions: largest cluster has 250k partitions, up to 10k partitions per broker
Kafka use-cases at LinkedIn
▪ Activity tracking: member-related
▪ Metrics: application metrics, service calls
▪ Queuing: internal application data, messaging; largest users are Samza and Search
▪ Logging: dedicated cluster for application logs going to ELK; high volume, low retention
Kafka basics
▪ What is Kafka?
– Motivation and design philosophy
▪ Who uses Kafka?
– Adoption in the open source community and use-cases at LinkedIn
▪ What is the fundamental design of Kafka?
– Partition and replication model
▪ How to configure Kafka for your use-case?
– Tradeoff among performance, persistence, availability and message order
▪ What is the development roadmap of Kafka?
– Recent and upcoming features
Design goal
▪ Performance
– High throughput
– Low latency
– Scalable
▪ Persistence and availability
– Data should be available in the event of (permanent) server failure
▪ Functionality
– Rewind back in time
▪ Strong delivery semantics
– At-least-once delivery / exactly-once delivery
– In-order message delivery within partition
Characteristics
• High throughput (~300 MBps per machine)
– Immutable append-only data structure for fast disk access
– Efficient data transfer via zero copy
– Messages are mostly read directly from the page cache
– Partitioning model for scalability
– Batching and compression
• Low latency (~2 ms)
– Make data universally available in near real-time
• Strong guarantees about messages
– Messages strictly ordered within partition
– All data persistent on disk with replication
– Exactly once delivery
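To see why batching and compression go together, the sketch below compresses 1,000 small messages one at a time and then as a single batch. It is an illustration only: `zlib` stands in for whichever codec the producer is configured with, and the message shape is invented for the example.

```python
import json
import zlib

# Hypothetical small, repetitive messages (the field names are assumptions).
messages = [
    json.dumps({"event": "PageViewEvent", "member_id": i, "page": "/feed"}).encode()
    for i in range(1000)
]

# Compress each message on its own: every message pays the codec overhead
# and the compressor cannot exploit redundancy across messages.
per_message_bytes = sum(len(zlib.compress(m)) for m in messages)

# Compress the whole batch at once: shared field names compress very well.
batched_bytes = len(zlib.compress(b"\n".join(messages)))

print(per_message_bytes, batched_bytes)
```

The batch compresses to a small fraction of the per-message total, which is why sending larger batches tends to save both network and disk.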
Is disk slow?
Traditional data copy
▪ 4 copies
▪ 2 context switches
Efficient zero copy
▪ 3 copies
▪ 0 context switches
▪ Only 2 copies if consumers are mostly caught up
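The same idea can be tried from Python. This is a rough analogue rather than Kafka's own code (the broker is JVM-based and relies on the operating system's sendfile facility): `socket.sendfile()` asks the kernel to move file bytes to the socket without copying them through the application, and falls back to ordinary `send()` where that is not supported. The file contents and helper names below are made up for the example.

```python
import os
import socket
import tempfile


def send_file(path: str, sock: socket.socket) -> int:
    """Send a whole file over a connected socket; returns the bytes sent."""
    with open(path, "rb") as f:
        # Uses os.sendfile() where available, plain send() otherwise.
        return sock.sendfile(f)


def recv_exact(sock: socket.socket, n: int) -> bytes:
    """Read exactly n bytes (or fewer if the peer closes early)."""
    chunks = []
    remaining = n
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            break
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


# A small stand-in for a log segment on disk.
payload = b"log-segment-bytes " * 16
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(payload)
    segment_path = tmp.name

# A connected socket pair plays the role of broker and consumer.
sender, receiver = socket.socketpair()
try:
    sent = send_file(segment_path, sender)
    received = recv_exact(receiver, sent)
finally:
    sender.close()
    receiver.close()
    os.remove(segment_path)

print(sent, received == payload)
```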
Kafka as log
Producer -> Topic -> Consumer
Topic divided into partitions
• Partitions are distributed and replicated across brokers
• Parallel produce/consume
• Messages with the same key go to the same partition
Partition consists of messages with offsets
[Diagram: messages in a partition laid out from old to new]
• Append only
• Strict order
• Messages assigned with incremental offsets
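A partition can be pictured as an in-memory list that only grows. The class below is a teaching sketch under that assumption (the name `PartitionLog` and its methods are hypothetical, not Kafka's API); it shows how append-only writes give each message an incremental offset and how a reader can rewind to any earlier offset.

```python
class PartitionLog:
    """Toy append-only log: the list index of a message is its offset."""

    def __init__(self) -> None:
        self._messages = []

    def append(self, message: bytes) -> int:
        """Append at the end and return the offset assigned to the message."""
        offset = len(self._messages)
        self._messages.append(message)
        return offset

    def read(self, offset: int, max_messages: int = 10) -> list:
        """Read messages in order, starting at any previously written offset."""
        return self._messages[offset:offset + max_messages]


log = PartitionLog()
offsets = [log.append(m) for m in (b"a", b"b", b"c")]
print(offsets)        # offsets grow by one per append
print(log.read(1))    # a consumer can rewind and re-read from offset 1
```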
Broker in Kafka
▪ Disk/network/CPU load is distributed across brokers in units of partitions
Producer in Kafka
▪ Messages with same key go to the same partition
▪ Messages without a key go to a random partition
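The two rules above amount to a small function. In this sketch `zlib.crc32` is a stand-in hash chosen only because it ships with Python; it is an assumption of the example, not the hash the Kafka producer uses, and `choose_partition` is a hypothetical name.

```python
import random
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int) -> int:
    """Pick a partition for a message.

    Keyed messages: hash the key, so equal keys always map to the same
    partition. Unkeyed messages: pick any partition at random.
    """
    if key is None:
        return random.randrange(num_partitions)
    return zlib.crc32(key) % num_partitions


print(choose_partition(b"member-42", 8) == choose_partition(b"member-42", 8))
```

Because all messages for one key land in one partition, and a partition is strictly ordered, per-key ordering follows from this rule.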
Consumer in Kafka
▪ A consumer can belong to a consumer group (CG)
▪ Consumers in the same CG
– Parallel processing of messages
– Share the consumer offset
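One way to picture parallel processing within a consumer group is to hand each partition to exactly one member of the group. The round-robin assignment below is a simplified sketch with hypothetical names; real assignment strategies are configurable and more involved.

```python
def assign_partitions(partitions, consumers):
    """Spread partitions over the consumers of one group, round-robin.

    Each partition is owned by exactly one consumer in the group, so
    members process disjoint slices of the topic in parallel.
    """
    assignment = {consumer: [] for consumer in consumers}
    for i, partition in enumerate(sorted(partitions)):
        owner = consumers[i % len(consumers)]
        assignment[owner].append(partition)
    return assignment


print(assign_partitions(range(6), ["c1", "c2", "c3"]))
```

The committed offset is tracked per group and partition, so if `c2` leaves the group, whichever member takes over its partitions can resume from where `c2` stopped.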
When a broker fails…
Partition replication in Kafka
▪ Brokers can fail
– Controlled: e.g., upgrades/config changes
– Uncontrolled: disk failure, power outage, out-of-memory etc.
▪ Need high availability
– Typical failover < 10 ms
▪ Need data persistence
Partition replica assignment
▪ Replicas are laid out evenly across brokers
▪ First assigned replica is preferred as leader
▪ Writes/reads go to the leader; followers replicate messages from the leader
Replication (at a high-level)
Kafka basics
▪ What is Kafka?
– Motivation and design philosophy
▪ Who uses Kafka?
– Adoption in the open source community and use-cases at LinkedIn
▪ What is the fundamental design of Kafka?
– Partition and replication model
▪ How to configure Kafka for your use-case?
– Tradeoff among performance, persistence, availability and message order
▪ What is the development roadmap of Kafka?
– Recent and upcoming features
No one-size-fits-all configuration
Tradeoff between performance and persistence
• Should broker send ack to producer right after step 1?
• Higher persistence and lower throughput with acks = -1 in producer config
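The tradeoff can be stated as "how many replicas hold the message at the moment the producer is told it succeeded". The model below is a simplified sketch (the function names are invented); it covers the three producer `acks` values 0, 1 and -1.

```python
def replicas_holding_message_at_ack(acks: int, in_sync_replicas: int) -> int:
    """How many replicas are guaranteed to hold a message once it is acked."""
    if acks == 0:
        return 0                  # producer does not wait for the broker at all
    if acks == 1:
        return 1                  # leader has written it; followers may lag
    if acks == -1:
        return in_sync_replicas   # every in-sync replica has it
    raise ValueError("acks must be 0, 1 or -1")


def survives_leader_failure(acks: int, in_sync_replicas: int) -> bool:
    """An acked message outlives the leader only if another replica has it."""
    return replicas_holding_message_at_ack(acks, in_sync_replicas) >= 2


for acks in (0, 1, -1):
    print(acks, survives_leader_failure(acks, in_sync_replicas=3))
```

Waiting for more replicas before acknowledging is what costs throughput and latency with acks = -1.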
Tradeoff between performance and message order
• Should producer send new message before ack of the last message?
• In-order delivery and lower throughput with max.in.flight.requests.per.connection = 1 in producer config
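The reordering risk is easy to reproduce in a toy simulation. The model below is an assumption-laden sketch, not the producer's real network code: messages are sent in windows of `max_in_flight`, and a message listed in `fail_first_attempt` fails once and is retried in the next window.

```python
def deliver(messages, max_in_flight, fail_first_attempt):
    """Return the order in which messages end up in the broker's log."""
    log = []
    pending = list(messages)
    failed_once = set()
    while pending:
        window = pending[:max_in_flight]
        retry = []
        for message in window:
            if message in fail_first_attempt and message not in failed_once:
                failed_once.add(message)   # first attempt fails
                retry.append(message)      # it is retried in the next window
            else:
                log.append(message)        # written to the log
        pending = retry + pending[len(window):]
    return log


print(deliver([0, 1], max_in_flight=2, fail_first_attempt={0}))
print(deliver([0, 1], max_in_flight=1, fail_first_attempt={0}))
```

With two requests in flight, message 1 is written while message 0 is still being retried, so the log order becomes 1, 0. With one request in flight the order stays 0, 1, at the price of waiting for each ack.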
[Diagram: Producer and Kafka broker; message 0 fails and is retried while message 1 is already in flight]
Tradeoff between persistence and availability
• Should we allow messages to be produced if all in-sync replicas are offline?
• Higher availability and weaker persistence with unclean.leader.election.enable = true in broker config
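A small model makes the cost of an unclean election concrete. The offsets below are read off the slide's diagram labels (leader 0 to 8, followers 0 to 6 and 0 to 5), and the election rule is deliberately simplified; both are assumptions of this sketch, and the function names are hypothetical.

```python
def elect_new_leader(in_sync, online, unclean_enabled):
    """Pick a new leader name, or None if the partition must stay offline.

    in_sync and online are ordered lists of replica names.
    """
    for replica in in_sync:
        if replica in online:
            return replica            # clean election: no acked data is lost
    if unclean_enabled and online:
        return online[0]              # unclean: any live replica may lead
    return None


def lost_messages(old_leader_log, new_leader_log):
    """Offsets the old leader had that the new leader never replicated."""
    return old_leader_log[len(new_leader_log):]


replicas = {
    "leader": list(range(9)),       # offsets 0..8
    "follower_1": list(range(7)),   # offsets 0..6
    "follower_2": list(range(6)),   # offsets 0..5
}
# The leader was the only in-sync replica and has just failed.
in_sync, online = ["leader"], ["follower_1", "follower_2"]

for unclean in (False, True):
    new_leader = elect_new_leader(in_sync, online, unclean)
    lost = lost_messages(replicas["leader"], replicas[new_leader]) if new_leader else []
    print(unclean, new_leader, lost)
```

With the setting disabled the partition stays unavailable until an in-sync replica returns; with it enabled the partition keeps serving but offsets 7 and 8 are gone.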
[Diagram: the Leader holds offsets 0 to 8, Follower 1 holds 0 to 6 and Follower 2 holds 0 to 5; offsets 7 and 8 exist only on the failed leader]
Tradeoff between availability and cost
• Do we need more replicas for the topic?
• Higher availability and higher cost with RF=3 in comparison to RF=2
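The cost side of the replication factor (RF) is simple arithmetic. The sketch below uses hypothetical function names and assumes storage scales linearly with RF and that data survives as long as one replica remains.

```python
def storage_needed_tb(unique_data_tb: float, replication_factor: int) -> float:
    """Every replica stores a full copy of the partition data."""
    return unique_data_tb * replication_factor


def tolerated_broker_failures(replication_factor: int) -> int:
    """Data survives as long as at least one replica is left."""
    return replication_factor - 1


for rf in (2, 3):
    print(rf, storage_needed_tb(100, rf), tolerated_broker_failures(rf))
```

Going from RF=2 to RF=3 buys tolerance of one more broker failure for 50% more disk and replication traffic.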
[Diagram: a producer writing to three brokers with RF=3 versus two brokers with RF=2]
Kafka basics
▪ What is Kafka?
– Motivation and design philosophy
▪ Who uses Kafka?
– Adoption in the open source community and use-cases at LinkedIn
▪ What is the fundamental design of Kafka?
– Partition and replication model
▪ How to configure Kafka for your use-case?
– Tradeoff among performance, persistence, availability and message order
▪ What is the development roadmap of Kafka?
– Recent and upcoming features
Kafka provides great performance, availability and data persistence
Are there other features that will be valuable to users?
Improved support for multi-tenancy
▪ SASL/Kerberos and SSL support (KIP-12)
▪ Quota (KIP-13)
▪ Namespace in Kafka topics (KIP-37)
▪ ZooKeeper authentication (KIP-38)
▪ End-to-end encryption
Reduced hardware and operational cost
▪ Dynamic configuration (KIP-21)
▪ Rack aware replica assignment (KIP-36)
▪ Self healing (KIP-46)
▪ On demand data deletion (KIP-107)
▪ JBOD support (KIP-112 and KIP-113)
Additional functionality for broader use-cases
▪ Kafka connect for data import/export (KIP-26)
▪ Streaming processor (KIP-28)
▪ Timestamp in message (KIP-32)
▪ Exactly-once delivery and transactional messaging (KIP-98)
Learn more about Kafka
▪ Stream processing meetup
▪ Kafka summit
▪ Kafka improvement proposals
https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
▪ LinkedIn engineering blog https://engineering.linkedin.com/blog
Agenda
▪ Kafka basics (50 min)
▪ Kafka ecosystem at LinkedIn (40 min)
– Projects to monitor and manage Kafka servers
– Projects to monitor and debug Kafka clients
– Projects to make Kafka easier to use
– Projects that are built on Kafka
▪ Hands on (30 min)
Projects to monitor and manage Kafka servers
▪ cruise-control for automatically balancing partitions across brokers
▪ kafka-monitor for monitoring kafka service availability etc.
▪ kafka-audit for monitoring data loss
▪ InGraph for monitoring all JMX metrics from Kafka as time-series graph
Problems before having Cruise Control
▪ SRE needs to wake up at night to move partitions in case of hardware failure
▪ SRE needs to manually move partitions to balance load across brokers
▪ Reduced availability due to need to wait for manual recovery
▪ The partition movement may impact production traffic
Cruise Control was open sourced on GitHub in August 2017
Cruise Control Architecture
▪ Self-heal from broker failure
▪ Balance load across brokers without manual intervention
▪ Controlled impact on PROD traffic when moving partitions
Example Cruise Control goals
▪ Partitions should be distributed across brokers in a rack-aware manner
▪ Broker resource utilization should be below the user-specified threshold
▪ Try to evenly distribute resource utilization across brokers
Projects to monitor and manage Kafka servers
▪ cruise-control for automatically balancing partitions across brokers
▪ kafka-monitor for monitoring kafka service availability etc.
▪ kafka-audit for monitoring data loss
▪ InGraph for monitoring all JMX metrics from Kafka as time-series graph
Problems before having Kafka Monitor
▪ Some issues are discovered only after bug report from Kafka user
▪ Cannot quantify the availability and the latency of a Kafka cluster
▪ Cannot quantify the availability and the latency of a Kafka mirrored pipeline
Kafka Monitor Architecture
▪ Alert on service unavailability
▪ Quantify service availability
▪ Measure end-to-end latency
▪ Detect violation of Kafka semantics
Our availability SLA is 99.99%
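To put the 99.99% figure in perspective, the arithmetic below converts an availability target into the downtime it allows. The function name is made up for the example.

```python
def allowed_downtime_minutes(sla_percent: float, period_days: float) -> float:
    """Minutes of unavailability permitted by an availability SLA."""
    total_minutes = period_days * 24 * 60
    return total_minutes * (1 - sla_percent / 100)


print(round(allowed_downtime_minutes(99.99, 365), 2))  # per year
print(round(allowed_downtime_minutes(99.99, 30), 2))   # per 30-day month
```

A 99.99% target leaves roughly 52.56 minutes per year, or about 4.32 minutes per 30 days, which is why availability has to be measured continuously rather than inferred from user reports.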
Other Kafka Monitor features
▪ Automatically distribute partitions of the monitor topic evenly across brokers
▪ Extensible module to export JMX metrics to various stores (e.g. Graphite)
▪ Pluggable interface to test Kafka service with your own client implementation
Kafka Monitor was open sourced on GitHub in May 2016
Projects to monitor and manage Kafka servers
▪ cruise-control for automatically balancing partitions across brokers
▪ kafka-monitor for monitoring kafka service availability etc.
▪ kafka-audit for monitoring data loss
▪ InGraph for monitoring all JMX metrics from Kafka as time-series graph
Problems before having Kafka Audit
▪ Hard to help user identify why their message is not received
▪ Hard to detect and debug message loss in Kafka pipelines
Kafka Audit Architecture
▪ Detect message loss
▪ Debug message loss
▪ Audit Kafka resource usage
Example Kafka Audit UI
When, where and how many messages are delivered to Kafka
Projects to monitor and manage Kafka servers
▪ cruise-control for automatically balancing partitions across brokers
▪ kafka-monitor for monitoring kafka service availability etc.
▪ kafka-audit for monitoring data loss
▪ InGraph for monitoring all JMX metrics from Kafka as time-series graph
InGraph Architecture
[Diagram: brokers and clients send metric messages to a metric topic in the Kafka cluster; InGraph, with its UI, reads the metric messages]
Example InGraph UI
Projects to monitor and debug Kafka clients
▪ Burrow for monitoring offset lag of consumer groups
▪ kafka-audit for monitoring Kafka resource usage per client
Burrow Architecture
▪ Detect lagging consumers
▪ Detect stalled consumers
▪ Detect stopped consumers
▪ Detect offset rewind
▪ Open sourced on GitHub
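The quantities behind these checks are simple: lag is the log end offset minus the offset the group has committed. The sketch below is loosely inspired by the list above and is not Burrow's actual evaluation logic; the status names, function names and rules are simplifying assumptions for illustration.

```python
def consumer_lag(log_end_offsets, committed_offsets):
    """Per-partition lag: messages written but not yet consumed by the group."""
    return {
        partition: end - committed_offsets.get(partition, 0)
        for partition, end in log_end_offsets.items()
    }


def classify(samples):
    """Classify one partition from (committed_offset, lag) samples, oldest first."""
    offsets = [offset for offset, _ in samples]
    lags = [lag for _, lag in samples]
    if any(later < earlier for earlier, later in zip(offsets, offsets[1:])):
        return "REWIND"    # committed offset moved backwards
    if any(lag == 0 for lag in lags):
        return "OK"        # consumer caught up at least once in the window
    if offsets[0] == offsets[-1]:
        return "STALLED"   # offsets are not advancing while lag remains
    if all(later > earlier for earlier, later in zip(lags, lags[1:])):
        return "LAGGING"   # consuming, but falling further behind
    return "OK"


print(consumer_lag({0: 120, 1: 80}, {0: 100, 1: 80}))
print(classify([(10, 5), (12, 8), (14, 12)]))
```

Looking at a window of samples instead of a single lag value avoids alerting on a consumer that is briefly behind but catching up.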
Projects to monitor and debug Kafka clients
▪ Burrow for monitoring offset lag of consumer groups
▪ kafka-audit for monitoring Kafka resource usage per client
Attribute the hardware cost in $$ to users of Kafka and reduce unnecessary usage of Kafka
Projects to make Kafka easier to use
▪ kafka-rest to allow non-Java clients to produce to and consume from a Kafka cluster
▪ schema-registry for conversion between binary data and IndexedRecord
▪ li-apache-kafka-clients to support large messages etc.
▪ Nuage for users to create and manage properties (e.g. retention time) of their topics by themselves
Kafka Rest Architecture
▪ Support non-Java clients
▪ No need to maintain client libraries in multiple languages
Projects to make Kafka easier to use
▪ kafka-rest to allow non-Java clients to produce to and consume from a Kafka cluster
▪ schema-registry for conversion between binary data and IndexedRecord
▪ li-apache-kafka-clients to support large messages etc.
▪ Nuage for users to create and manage properties (e.g. retention time) of their topics by themselves
Schema Registry Architecture
▪ Enable efficient binary encoding of schema in the Kafka message
▪ Track schema evolution for forward and backward compatibility
[Diagram: a user application passes an IndexedRecord to LiProducer (with schema cache), which registers the schema with the Schema Registry and sends binary data to the Kafka cluster; LiConsumer (with schema cache) fetches the schema, turns the binary data back into an IndexedRecord and passes it to the user application]
Projects to make Kafka easier to use
▪ kafka-rest to allow non-Java clients to produce to and consume from a Kafka cluster
▪ schema-registry for conversion between binary data and IndexedRecord
▪ li-apache-kafka-clients to support large messages etc.
▪ Nuage for users to create and manage properties (e.g. retention time) of their topics by themselves
Large message support in li-apache-kafka-clients
Projects to make Kafka easier to use
▪ kafka-rest to allow non-Java clients to produce to and consume from a Kafka cluster
▪ schema-registry for conversion between binary data and IndexedRecord
▪ li-apache-kafka-clients to support large messages etc.
▪ Nuage for users to create and manage properties (e.g. retention time) of their topics by themselves
Put things together
Help yourself with these open source projects
▪ Cruise Control (https://github.com/linkedin/cruise-control)
▪ Kafka Monitor (https://github.com/linkedin/kafka-monitor)
▪ Burrow (https://github.com/linkedin/burrow)
▪ li-apache-kafka-clients (https://github.com/linkedin/li-apache-kafka-clients)
▪ Future projects open sourced by LinkedIn streaming team can be found at https://github.com/linkedin/streaming
All projects are actively maintained and used in LinkedIn production environment
100% free of charge!
Projects at LinkedIn that are built on Kafka
▪ Stream processing – Apache Samza
▪ Change data capture – Brooklin
▪ Strongly consistent key-value store – Espresso
▪ Efficient key-value store for derived data – Venice
Agenda
▪ Kafka basics (50 min)
▪ Kafka ecosystem at LinkedIn (40 min)
▪ Hands-on (30 min)
Hands-on
▪ Visit goo.gl/D7GFfB
▪ Single cluster
– Download and compile Apache Kafka
– Set up a cluster of one broker
– Create and describe topic
– Produce and consume using Apache Kafka tools
– Monitor availability of your cluster using Kafka Monitor
▪ Mirrored pipeline
– Set up another cluster of one broker
– Set up MirrorMaker (MM) to mirror traffic from the source cluster to the destination cluster
– Produce to the source cluster and consume from the destination cluster
– Monitor availability of your pipeline using Kafka Monitor

More Related Content

PPTX
Kafka at Scale: Multi-Tier Architectures
PPTX
Tuning Kafka for Fun and Profit
PPTX
Design Patterns for working with Fast Data
PPTX
Kafka at Peak Performance
PPTX
Securing Hadoop in an Enterprise Context (v2)
PDF
Architecting for Scale
PPT
Connecting applicationswitha mq
PPTX
Automate Hadoop Cluster Deployment in a Banking Ecosystem
Kafka at Scale: Multi-Tier Architectures
Tuning Kafka for Fun and Profit
Design Patterns for working with Fast Data
Kafka at Peak Performance
Securing Hadoop in an Enterprise Context (v2)
Architecting for Scale
Connecting applicationswitha mq
Automate Hadoop Cluster Deployment in a Banking Ecosystem

What's hot (20)

PPTX
Building Event-Driven Systems with Apache Kafka
PDF
PPTX
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
PPTX
Real time Messages at Scale with Apache Kafka and Couchbase
PDF
PHP and the Cloud: The view from the bazaar
PDF
Building Kafka-powered Activity Stream
PDF
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
KEY
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
PPTX
Building an Event Bus at Scale
PPTX
Reducing Microservice Complexity with Kafka and Reactive Streams
PDF
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
PDF
Fabric8 mq
PDF
A la rencontre de Kafka, le log distribué par Florian GARCIA
PPTX
kafka for db as postgres
PPTX
Introduction to Apache Kafka
PDF
Introduction to Apache Kafka and why it matters - Madrid
PPTX
Micro service architecture
PDF
Fundamentals of Apache Kafka
PPTX
Introduction to Kafka
PPTX
Kafka at scale facebook israel
Building Event-Driven Systems with Apache Kafka
Protecting your data at rest with Apache Kafka by Confluent and Vormetric
Real time Messages at Scale with Apache Kafka and Couchbase
PHP and the Cloud: The view from the bazaar
Building Kafka-powered Activity Stream
Building Machine Learning inference pipelines at scale | AWS Summit Tel Aviv ...
Data Models and Consumer Idioms Using Apache Kafka for Continuous Data Stream...
Building an Event Bus at Scale
Reducing Microservice Complexity with Kafka and Reactive Streams
Building Stream Infrastructure across Multiple Data Centers with Apache Kafka
Fabric8 mq
A la rencontre de Kafka, le log distribué par Florian GARCIA
kafka for db as postgres
Introduction to Apache Kafka
Introduction to Apache Kafka and why it matters - Madrid
Micro service architecture
Fundamentals of Apache Kafka
Introduction to Kafka
Kafka at scale facebook israel
Ad

Similar to An introduction to Apache Kafka and Kafka ecosystem at LinkedIn (20)

PPTX
Apache Kafka at LinkedIn
PPTX
Kafkha real time analytics platform.pptx
PPTX
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
PPTX
Understanding kafka
PDF
Apache Kafka - Free Friday
PDF
Apache Kafka - Scalable Message-Processing and more !
PPT
Apache kafka- Onkar Kadam
PPTX
Apache Kafka: Next Generation Distributed Messaging System
PPTX
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
PPTX
Kafka - Linkedin's messaging backbone
PDF
An Introduction to Apache Kafka
PDF
Fault Tolerance with Kafka
PPTX
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
PPTX
messaging.pptx
PPTX
Copy of Kafka-Camus
PPTX
Kafka Basic For Beginners
PDF
Apache Kafka Introduction
PPTX
Unleashing Real-time Power with Kafka.pptx
PPTX
kafka simplicity and complexity
PPTX
Apache kafka
Apache Kafka at LinkedIn
Kafkha real time analytics platform.pptx
Apache Kafka with Spark Streaming: Real-time Analytics Redefined
Understanding kafka
Apache Kafka - Free Friday
Apache Kafka - Scalable Message-Processing and more !
Apache kafka- Onkar Kadam
Apache Kafka: Next Generation Distributed Messaging System
What is Kafka & why is it Important? (UKOUG Tech17, Birmingham, UK - December...
Kafka - Linkedin's messaging backbone
An Introduction to Apache Kafka
Fault Tolerance with Kafka
Kafka 0.8.0 Presentation to Atlanta Java User's Group March 2013
messaging.pptx
Copy of Kafka-Camus
Kafka Basic For Beginners
Apache Kafka Introduction
Unleashing Real-time Power with Kafka.pptx
kafka simplicity and complexity
Apache kafka
Ad

More from Dong Lin (6)

PPTX
FeatHub_DataFun_2023.pptx
PPTX
FeatHub_GAIDC_2022.pptx
PPTX
FeatHub_FFA_2022
PPTX
基于 Flink 和 AI Flow 的实时推荐系统
PPTX
为实时机器学习设计的算法接口与迭代引擎_FFA_2021
PPTX
Kafka at half the price with JBOD setup
FeatHub_DataFun_2023.pptx
FeatHub_GAIDC_2022.pptx
FeatHub_FFA_2022
基于 Flink 和 AI Flow 的实时推荐系统
为实时机器学习设计的算法接口与迭代引擎_FFA_2021
Kafka at half the price with JBOD setup

Recently uploaded (20)

PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
bas. eng. economics group 4 presentation 1.pptx
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
additive manufacturing of ss316l using mig welding
PPTX
Lesson 3_Tessellation.pptx finite Mathematics
PDF
Structs to JSON How Go Powers REST APIs.pdf
PDF
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
PDF
Digital Logic Computer Design lecture notes
PPTX
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
PDF
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
PPT on Performance Review to get promotions
PPTX
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
PDF
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
PDF
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
PDF
Arduino robotics embedded978-1-4302-3184-4.pdf
PPTX
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
PPTX
Internet of Things (IOT) - A guide to understanding
DOCX
573137875-Attendance-Management-System-original
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
bas. eng. economics group 4 presentation 1.pptx
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
additive manufacturing of ss316l using mig welding
Lesson 3_Tessellation.pptx finite Mathematics
Structs to JSON How Go Powers REST APIs.pdf
July 2025 - Top 10 Read Articles in International Journal of Software Enginee...
Digital Logic Computer Design lecture notes
IOT PPTs Week 10 Lecture Material.pptx of NPTEL Smart Cities contd
Model Code of Practice - Construction Work - 21102022 .pdf
PRIZ Academy - 9 Windows Thinking Where to Invest Today to Win Tomorrow.pdf
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PPT on Performance Review to get promotions
Recipes for Real Time Voice AI WebRTC, SLMs and Open Source Software.pptx
Mohammad Mahdi Farshadian CV - Prospective PhD Student 2026
Mitigating Risks through Effective Management for Enhancing Organizational Pe...
Arduino robotics embedded978-1-4302-3184-4.pdf
MCN 401 KTU-2019-PPE KITS-MODULE 2.pptx
Internet of Things (IOT) - A guide to understanding
573137875-Attendance-Management-System-original

An introduction to Apache Kafka and Kafka ecosystem at LinkedIn

  • 1. ©2017 LinkedIn Corporation. All Rights Reserved. An Introduction to Apache Kafka and Kafka Ecosystem at LinkedIn Dong Lin Data Infra Streaming @ LinkedIn Open Data Science Conference
  • 2. ©2017 LinkedIn Corporation. All Rights Reserved. Agenda ▪ Kafka basics (50 min) ▪ Kafka ecosystem at LinkedIn (40 min) ▪ Hands-on (30 min)
  • 3. ©2017 LinkedIn Corporation. All Rights Reserved. 3 Kafka basics ▪ What is Kafka? – Motivation and design philosophy ▪ Who uses Kafka? – Adoption in the open source community and use-cases at LinkedIn ▪ What is the fundamental design of Kafka? – Partition and replication model ▪ How to configure Kafka for your use-case? – Tradeoff among performance, persistence, availability and message order ▪ What is the development roadmap of Kafka? – Recent and upcoming features
  • 4. ©2017 LinkedIn Corporation. All Rights Reserved. 4 Publish/Subscribe Messaging • Multiple producers • Multiple consumers • Scalable and durable • Created by LinkedIn • Open sourced under Apache
  • 5. ©2017 LinkedIn Corporation. All Rights Reserved. 5 PageViewEvent Hadoop Direct transmission Web server
  • 6. ©2017 LinkedIn Corporation. All Rights Reserved. Many problems Multiple consumers Destination is slow Destination permanent failure Bug in downstream application Destination temporarily unavailable Multiple producers At least once delivery 6 PageViewEvent HadoopWeb server
  • 7. ©2017 LinkedIn Corporation. All Rights Reserved. Use a publish-subscribe messaging system Multiple consumers Destination permanent failure Bug in downstream application Multiple producers Destination temporarily unavailable Pub/sub system 7 Hadoop Destination is slow At least once delivery Web server
  • 8. ©2017 LinkedIn Corporation. All Rights Reserved. Use Kafka Spark streaming Multiple consumers Destination permanent failure Bug in downstream application FunctionalityPersistent Delivery semanticsPerformance Destination temporarily unavailable Availability 8 Destination is slow At least once delivery Multiple producers Web server
  • 9. ©2017 LinkedIn Corporation. All Rights Reserved. Problem: closely-coupled pipelines ▪ O(N^2) pipelines – limited organizational scalability ▪ Messages are duplicated proportional to number of clients 9
  • 10. ©2017 LinkedIn Corporation. All Rights Reserved. Solution: publish-subscribe messaging system ▪ O(N) pipelines ▪ Space efficient ▪ Producers are decoupled from consumers 10
  • 11. ©2017 LinkedIn Corporation. All Rights Reserved. Kafka as Unix Pipes $ cat *.txt | tr A-Z a-z | grep hello $ tail –F *.txt | tr A-Z a-z | grep hello producer kafka Hadoop kafka Hadoop Samza kafka Samza Reference: http://www.confluent.io/blog 11
  • 12. ©2017 LinkedIn Corporation. All Rights Reserved. Fan In 12
  • 13. ©2017 LinkedIn Corporation. All Rights Reserved. Fan Out 13
  • 14. ©2017 LinkedIn Corporation. All Rights Reserved. Add Branch 14
  • 15. ©2017 LinkedIn Corporation. All Rights Reserved. Switch Branch 15
  • 16. ©2017 LinkedIn Corporation. All Rights Reserved. Delete Branch 16
  • 17. ©2017 LinkedIn Corporation. All Rights Reserved. Parallel Consumption 17
  • 18. ©2017 LinkedIn Corporation. All Rights Reserved. 18 Kafka basics ▪ What is Kafka? – Motivation and design philosophy ▪ Who uses Kafka? – Adoption in the open source community and use-cases at LinkedIn ▪ What is the fundamental design of Kafka? – Partition and replication model ▪ How to configure Kafka for your use-case? – Tradeoff among performance, persistence, availability and message order ▪ What is the development roadmap of Kafka? – Recent and upcoming features
  • 19. Companies that use Kafka
    LinkedIn, Yahoo, Twitter, Airbnb, Pinterest, Square, Coursera, Uber, Goldman Sachs, Box, PayPal, Cisco, Dropbox, Spotify, Wikipedia, Microsoft, Netflix, CloudFlare, Hotels.com, …
    Reference: https://cwiki.apache.org/confluence/display/KAFKA/Powered+By
  • 20. Apache projects integrated with Kafka
    • Stream processing
      • Apache Storm
      • Apache Samza
      • Apache Spark Streaming
    • Search and Query
      • Apache Hive
      • Presto
      • Apache Hadoop
      • …
  • 21. Kafka volume at LinkedIn
    • 2 trillion messages
      • Produced
      • Per day
    • 5 Gbps inbound
      • Single cluster
      • Unique data
    • 18 Gbps outbound
      • Average 3X consumption
      • Before mirroring
    • 2.5M partitions
      • Largest cluster has 250k partitions
      • Up to 10k partitions per broker
  • 22. Kafka use-cases at LinkedIn
    • Activity tracking
      • Member-related
    • Metrics
      • Application metrics, service calls
    • Queuing
      • Internal application data, messaging
      • Largest users are Samza and Search
    • Logging
      • Dedicated cluster for application logs going to ELK
      • High volume, low retention
  • 24. Design goal
    ▪ Performance
      – High throughput
      – Low latency
      – Scalable
    ▪ Persistence and availability
      – Data should be available in the event of (permanent) server failure
    ▪ Functionality
      – Rewind back in time
    ▪ Strong delivery semantics
      – At-least-once delivery / exactly-once delivery
      – In-order message delivery within a partition
  • 25. Characteristics
    • High throughput (~300 MBps per machine)
      – Immutable append-only data structure for fast disk access
      – Efficient data transfer via zero copy
      – Messages are mostly read directly from the page cache
      – Partitioning model for scalability
      – Batching and compression
    • Low latency (~2 ms)
      – Make data universally available in near real-time
    • Strong guarantees about messages
      – Messages strictly ordered within a partition
      – All data persistent on disk with replication
      – Exactly-once delivery
  • 26. Is disk slow?
  • 27. Traditional data copy
    ▪ 4 copies
    ▪ 2 context switches
  • 28. Efficient zero copy
    ▪ 3 copies
    ▪ 0 context switches
    ▪ Only 2 copies if consumers are mostly caught up
  • 29. Kafka as log
  • 30. Producer -> Topic -> Consumer
  • 31. Topic divided into partitions
    • Partitions are distributed and replicated across brokers
    • Parallel produce/consume
    • Messages with the same key go to the same partition
  • 32. Partition consists of messages with offsets
    [Figure labels: Old, New]
    • Append only
    • Strict order
    • Messages are assigned incremental offsets
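The three properties on this slide (append only, strict order, incremental offsets) can be sketched with a toy in-memory model. This is only an illustration: `PartitionLog` is a hypothetical name, and a real Kafka partition is a set of segment files on disk.

```python
from typing import List, Tuple


class PartitionLog:
    """Toy in-memory model of one partition: an append-only list of messages."""

    def __init__(self) -> None:
        self._messages: List[bytes] = []

    def append(self, message: bytes) -> int:
        """Append a message to the end of the log and return its offset."""
        self._messages.append(message)
        return len(self._messages) - 1

    def read(self, offset: int, max_messages: int = 10) -> List[Tuple[int, bytes]]:
        """Return (offset, message) pairs starting at `offset`, oldest first."""
        chunk = self._messages[offset:offset + max_messages]
        return [(offset + i, m) for i, m in enumerate(chunk)]
```

Because existing entries are never modified, a reader can "rewind" simply by calling `read` again with an older offset.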
  • 33. Broker in Kafka
    ▪ Disk/network/CPU load is distributed across brokers in units of partitions
  • 34. Producer in Kafka
    ▪ Messages with the same key go to the same partition
    ▪ Messages without a key go to a random partition
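A minimal sketch of the two partitioning rules above. It assumes a CRC32 hash as a stand-in (the Java client's default partitioner uses a different hash function), and `choose_partition` is an illustrative name rather than a Kafka API.

```python
import random
import zlib
from typing import Optional


def choose_partition(key: Optional[bytes], num_partitions: int) -> int:
    """Pick a partition for a message.

    Keyed messages hash to a fixed partition, so all messages with the same
    key stay in order relative to each other. Unkeyed messages are spread
    randomly across partitions.
    """
    if key is None:
        return random.randrange(num_partitions)
    # CRC32 is an assumption made for this sketch, not Kafka's actual hash.
    return zlib.crc32(key) % num_partitions
```

Note that the mapping depends on `num_partitions`, so adding partitions to a topic changes where a given key lands.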
  • 35. Consumer in Kafka
    ▪ A consumer can belong to a consumer group (CG)
    ▪ Consumers in the same CG
      – Process messages in parallel
      – Share the consumer offset
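Parallel processing within a consumer group works because each partition is owned by exactly one member of the group. The sketch below assumes a simple round-robin split; Kafka's built-in assignors differ in detail, and `assign_partitions` is a hypothetical helper.

```python
from typing import Dict, List


def assign_partitions(partitions: List[int], consumers: List[str]) -> Dict[str, List[int]]:
    """Split partitions across the members of one consumer group, round-robin.

    Each partition goes to exactly one consumer, so no message is processed
    twice within the group.
    """
    members = sorted(consumers)
    assignment: Dict[str, List[int]] = {c: [] for c in members}
    for i, partition in enumerate(sorted(partitions)):
        assignment[members[i % len(members)]].append(partition)
    return assignment
```

If the group has more consumers than the topic has partitions, the extra consumers receive an empty list and sit idle.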
  • 36. When a broker fails…
  • 37. Partition replication in Kafka
    ▪ Brokers can fail
      – Controlled: e.g., upgrades/config changes
      – Uncontrolled: disk failure, power outage, out-of-memory, etc.
    ▪ Need high availability
      – Typical failover < 10 ms
    ▪ Need data persistence
  • 38. Partition replica assignment
    ▪ Replicas are laid out evenly across brokers
    ▪ The first assigned replica is preferred as leader
    ▪ Writes/reads go to the leader, which sends messages to followers
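One way to lay replicas out evenly is to rotate through the broker list, as in the sketch below. This is a simplification made for illustration: Kafka's actual assignment also randomizes the starting broker and can take racks into account. The function names are hypothetical.

```python
from typing import Dict, List


def assign_replicas(num_partitions: int, brokers: List[int],
                    replication_factor: int) -> Dict[int, List[int]]:
    """Assign each partition `replication_factor` distinct brokers by rotation."""
    if replication_factor > len(brokers):
        raise ValueError("replication factor cannot exceed the number of brokers")
    layout: Dict[int, List[int]] = {}
    for p in range(num_partitions):
        layout[p] = [brokers[(p + r) % len(brokers)]
                     for r in range(replication_factor)]
    return layout


def preferred_leader(replicas: List[int]) -> int:
    """The first assigned replica is the preferred leader."""
    return replicas[0]
```

With three partitions, brokers 1 to 3 and a replication factor of 2, each broker ends up as preferred leader for one partition and follower for another.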
  • 39. Replication (at a high-level)
  • 40. Replication (at a high-level)
  • 41. Replication (at a high-level)
  • 42. Replication (at a high-level)
  • 44. No one-size-fits-all configuration
  • 45. Tradeoff between performance and persistence
    • Should the broker send an ack to the producer right after step 1?
    • Higher persistence and lower throughput with acks = -1 in producer config
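The effect of the `acks` setting can be summarized in a small decision function. The sketch below is a simplified model, not broker code: the dictionaries use the real producer config name `acks`, while `producer_considers_sent` and its parameters are hypothetical.

```python
# Producer settings as plain dicts keyed by the Kafka producer config name.
LOWER_LATENCY = {"acks": "1"}        # leader acks before followers replicate
HIGHER_PERSISTENCE = {"acks": "all"}  # equivalent to -1: wait for in-sync replicas


def producer_considers_sent(acks: str, leader_written: bool,
                            isr_replicated: bool) -> bool:
    """Return True once the producer treats the message as successfully sent."""
    if acks == "0":
        return True  # fire and forget: the producer does not wait for any ack
    if acks == "1":
        return leader_written  # acked as soon as the leader has the message
    if acks in ("all", "-1"):
        return leader_written and isr_replicated  # wait for in-sync replicas
    raise ValueError(f"unsupported acks value: {acks!r}")
```

With `acks=1`, a message that the leader acknowledged can still be lost if the leader fails before any follower has copied it; `acks=-1` closes that window at the cost of waiting for replication.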
  • 46. Tradeoff between performance and message order
    • Should the producer send a new message before the ack of the last message?
    • In-order delivery and lower throughput with max.in.flight.requests.per.connection = 1 in producer config
    [Figure: producer sends message 0 and message 1 to the Kafka broker; message 0 fails and is retried, so it arrives after message 1]
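The reordering scenario on this slide can be reproduced with a toy model. It assumes the producer sends messages in windows of `max_in_flight`, and that a failed message is retried only after the rest of its window has been written; `deliver` is a hypothetical function, not a client API.

```python
from typing import Iterable, List, Set


def deliver(messages: Iterable[int], max_in_flight: int,
            fail_first_attempt: Set[int]) -> List[int]:
    """Return the order in which messages land in the broker log.

    Messages in `fail_first_attempt` fail once and succeed on retry.
    """
    log: List[int] = []
    pending = list(messages)
    failed_once: Set[int] = set()
    while pending:
        window, pending = pending[:max_in_flight], pending[max_in_flight:]
        retries = []
        for m in window:
            if m in fail_first_attempt and m not in failed_once:
                failed_once.add(m)   # first attempt fails
                retries.append(m)
            else:
                log.append(m)        # written to the log
        pending = retries + pending  # retried messages go out next
    return log


if __name__ == "__main__":
    print(deliver([0, 1, 2], max_in_flight=2, fail_first_attempt={0}))
    print(deliver([0, 1, 2], max_in_flight=1, fail_first_attempt={0}))
```

With two requests in flight, message 1 is written before the retry of message 0; with one request in flight, the retry of message 0 blocks message 1 and order is preserved.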
  • 47. Tradeoff between persistence and availability
    • Should we allow producing messages if all in-sync replicas are offline?
    • Higher availability and weaker persistence with unclean.leader.election.enable = true in broker config
    [Figure: three replica logs of different lengths (offsets 0–8, 0–6 and 0–5), labeled Leader, Follower 1 and Follower 2; two replicas are marked failed (X)]
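The sketch below models the choice the broker config controls. It is a simplified illustration under stated assumptions: the replica names and log contents are made up to resemble the figure, and the rule for picking among out-of-sync replicas (longest log first) is chosen for this example rather than taken from Kafka.

```python
from typing import Dict, List, Optional, Set


def elect_leader(logs: Dict[str, List[int]], in_sync: Set[str],
                 online: Set[str], unclean: bool) -> Optional[str]:
    """Pick a new leader, or return None if the partition must stay offline."""
    candidates = sorted(in_sync & online)
    if candidates:
        return candidates[0]  # clean election: no acknowledged data is lost
    if unclean:
        stale = sorted(online - in_sync, key=lambda r: (-len(logs[r]), r))
        if stale:
            return stale[0]   # unclean election: available again, but behind
    return None


def lost_messages(old_leader_log: List[int], new_leader_log: List[int]) -> List[int]:
    """Messages present on the old leader but missing on the new one."""
    return sorted(set(old_leader_log) - set(new_leader_log))
```

For example, if the leader (offsets 0 to 8) and the only in-sync follower (0 to 6) are both offline, a follower holding offsets 0 to 5 can take over only when unclean election is enabled, and offsets 6 to 8 are then lost.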
  • 48. Tradeoff between availability and cost
    • Do we need more replicas for the topic?
    • Higher availability and higher cost with RF=3 in comparison to RF=2
    [Figure: a producer writing to three brokers (RF=3) versus two brokers (RF=2)]
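A back-of-the-envelope sketch of this tradeoff, assuming all replicas are in sync and that cost scales with stored bytes. The function name and the data size are illustrative only.

```python
from typing import Dict


def replication_tradeoff(replication_factor: int, data_tb: float) -> Dict[str, float]:
    """Estimate what a replication factor (RF) buys and what it costs."""
    return {
        # With every replica in sync, data survives the loss of RF - 1 brokers.
        "tolerated_broker_failures": replication_factor - 1,
        # Every message is stored once per replica.
        "storage_tb": replication_factor * data_tb,
    }


if __name__ == "__main__":
    for rf in (2, 3):
        print(rf, replication_tradeoff(rf, data_tb=10))
```

For 10 TB of unique data, RF=2 stores 20 TB and tolerates one broker failure, while RF=3 stores 30 TB and tolerates two.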
  • 50. Kafka provides great performance, availability and data persistence. Are there other features that will be valuable to users?
  • 51. Improved support for multi-tenancy
    ▪ SASL/Kerberos and SSL support (KIP-12)
    ▪ Quota (KIP-13)
    ▪ Namespace in Kafka topics (KIP-37)
    ▪ ZooKeeper authentication (KIP-38)
    ▪ End-to-end encryption
  • 52. Reduced hardware and operational cost
    ▪ Dynamic configuration (KIP-21)
    ▪ Rack-aware replica assignment (KIP-36)
    ▪ Self healing (KIP-46)
    ▪ On-demand data deletion (KIP-107)
    ▪ JBOD support (KIP-112 and KIP-113)
  • 53. Additional functionality for broader use-cases
    ▪ Kafka Connect for data import/export (KIP-26)
    ▪ Streaming processor (KIP-28)
    ▪ Timestamp in message (KIP-32)
    ▪ Exactly-once delivery and transactional messaging (KIP-98)
  • 54. Learn more about Kafka
    ▪ Stream processing meetup
    ▪ Kafka Summit
    ▪ Kafka improvement proposals: https://cwiki.apache.org/confluence/display/KAFKA/Kafka+Improvement+Proposals
    ▪ LinkedIn engineering blog: https://engineering.linkedin.com/blog
  • 56. Agenda
    ▪ Kafka basics (50 min)
    ▪ Kafka ecosystem at LinkedIn (40 min)
      – Projects to monitor and manage Kafka servers
      – Projects to monitor and debug Kafka clients
      – Projects to make Kafka easier to use
      – Projects that are built on Kafka
    ▪ Hands-on (30 min)
  • 57. Projects to monitor and manage Kafka servers
    ▪ cruise-control for automatically balancing partitions across brokers
    ▪ kafka-monitor for monitoring Kafka service availability etc.
    ▪ kafka-audit for monitoring data loss
    ▪ InGraph for monitoring all JMX metrics from Kafka as time-series graphs
  • 58. Problems before having Cruise Control
    ▪ SREs need to wake up at night to move partitions in case of hardware failure
    ▪ SREs need to manually move partitions to balance load across brokers
    ▪ Reduced availability due to the need to wait for manual recovery
    ▪ The partition movement may impact production traffic
    Open sourced on GitHub in August 2017
  • 59. Cruise Control Architecture
    ▪ Self-heal from broker failure
    ▪ Balance load across brokers without manual intervention
    ▪ Controlled impact on PROD traffic when moving partitions
  • 60. Example Cruise Control goals
    ▪ Partitions should be distributed across brokers in a rack-aware manner
    ▪ Broker resource utilization should be below the user-specified threshold
    ▪ Try to evenly distribute resource utilization across brokers
  • 62. Problems before having Kafka Monitor
    ▪ Some issues are discovered only after a bug report from a Kafka user
    ▪ Cannot quantify the availability and the latency of a Kafka cluster
    ▪ Cannot quantify the availability and the latency of a Kafka mirrored pipeline
  • 63. Kafka Monitor Architecture
    ▪ Alert on service unavailability
    ▪ Quantify service availability
    ▪ Measure end-to-end latency
    ▪ Detect violation of Kafka semantics
    Our availability SLA is 99.99%
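To put the 99.99% figure in perspective, the sketch below converts an availability target into a downtime budget and computes availability from probe results. It assumes availability is measured as the fraction of successful probe messages, which is a simplification; both function names are illustrative.

```python
def downtime_budget_minutes(sla: float, days: float) -> float:
    """Minutes of unavailability allowed over `days` at a given availability target."""
    return (1.0 - sla) * days * 24 * 60


def measured_availability(successful_probes: int, total_probes: int) -> float:
    """Fraction of probe messages that were produced and consumed successfully."""
    if total_probes == 0:
        raise ValueError("no probes recorded")
    return successful_probes / total_probes


if __name__ == "__main__":
    print(round(downtime_budget_minutes(0.9999, 365), 2))  # minutes per year
    print(round(downtime_budget_minutes(0.9999, 30), 2))   # minutes per 30 days
```

A 99.99% target leaves roughly 52.6 minutes of downtime per year, or a little over 4 minutes per 30 days.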
  • 64. Other Kafka Monitor features
    ▪ Automatically distribute partitions of the monitor topic evenly across brokers
    ▪ Extensible module to export JMX metrics to various stores (e.g. Graphite)
    ▪ Pluggable interface to test Kafka service with your own client implementation
    Open sourced on GitHub in May 2016
  • 66. Problems before having Kafka Audit
    ▪ Hard to help users identify why their message was not received
    ▪ Hard to detect and debug message loss in Kafka pipelines
  • 67. Kafka Audit Architecture
    ▪ Detect message loss
    ▪ Debug message loss
    ▪ Audit Kafka resource usage
  • 68. Example Kafka Audit UI
    When, where and how many messages are delivered to Kafka
  • 70. InGraph Architecture
    [Diagram: brokers and clients send metric messages to a metric topic in the Kafka cluster; InGraph (with UI) reads the metric messages]
  • 71. Example InGraph UI
  • 72. Projects to monitor and debug Kafka clients
    ▪ Burrow for monitoring offset lag of consumer groups
    ▪ kafka-audit for monitoring Kafka resource usage per client
  • 73. Burrow Architecture
    ▪ Detect lagging consumers
    ▪ Detect stalled consumers
    ▪ Detect stopped consumers
    ▪ Detect offset rewind
    ▪ Open sourced on GitHub
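Consumer lag is the gap between the newest offset in a partition and the offset the group has committed. The sketch below classifies a window of samples using deliberately simple rules; these rules are assumptions made for illustration and are not Burrow's actual evaluation logic.

```python
from typing import List, Tuple


def classify(samples: List[Tuple[int, int]]) -> str:
    """Classify a consumer from (committed_offset, log_end_offset) samples.

    Samples are ordered oldest first and cover one partition.
    """
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    commits = [committed for committed, _ in samples]
    lags = [end - committed for committed, end in samples]
    if any(later < earlier for earlier, later in zip(commits, commits[1:])):
        return "rewind"    # committed offset moved backwards
    if lags[-1] > 0 and len(set(commits)) == 1:
        return "stalled"   # behind, and no commit progress in the window
    if lags[-1] > 0 and all(b > a for a, b in zip(lags, lags[1:])):
        return "lagging"   # making progress, but falling further behind
    return "ok"
```

Looking at a window of samples rather than a single lag value avoids alerting on a consumer that is briefly behind but catching up.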
  • 74. Projects to monitor and debug Kafka clients
    ▪ Burrow for monitoring offset lag of consumer groups
    ▪ kafka-audit for monitoring Kafka resource usage per client
    Attribute the hardware cost in $$ to users of Kafka and reduce unnecessary usage of Kafka
  • 75. Projects to make Kafka easier to use
    ▪ kafka-rest to allow non-Java clients to produce to and consume from a Kafka cluster
    ▪ schema-registry for conversion between binary data and IndexedRecord
    ▪ li-apache-kafka-clients to support large messages etc.
    ▪ Nuage for users to create their topics and manage topic properties (e.g. retention time) by themselves
  • 76. Kafka Rest Architecture
    ▪ Support non-Java clients
    ▪ No need to maintain client libraries in multiple languages
  • 78. Schema Registry Architecture
    ▪ Enable efficient binary encoding of schema in the Kafka message
    ▪ Track schema evolution for forward and backward compatibility
    [Diagram: user application → IndexedRecord → LiProducer with schema cache → binary data → Kafka cluster → binary data → LiConsumer with schema cache → IndexedRecord → user application; schemas are registered with and fetched from the Schema Registry]
  • 80. Large message support in li-apache-kafka-clients
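The slide gives no detail, but a common way to support messages larger than the broker's size limit is to split them into segments on the producer side and reassemble them on the consumer side. The sketch below illustrates that general idea only; the segment layout and function names are assumptions and do not describe the actual format used by li-apache-kafka-clients.

```python
from typing import List, Tuple

# (message_id, sequence_number, total_segments, chunk)
Segment = Tuple[str, int, int, bytes]


def split_message(message_id: str, payload: bytes,
                  max_segment_bytes: int) -> List[Segment]:
    """Split a payload into segments no larger than `max_segment_bytes`."""
    if max_segment_bytes <= 0:
        raise ValueError("max_segment_bytes must be positive")
    chunks = [payload[i:i + max_segment_bytes]
              for i in range(0, len(payload), max_segment_bytes)] or [b""]
    return [(message_id, seq, len(chunks), chunk)
            for seq, chunk in enumerate(chunks)]


def reassemble(segments: List[Segment]) -> bytes:
    """Rebuild the original payload once every segment has arrived."""
    ordered = sorted(segments, key=lambda s: s[1])
    total = ordered[0][2]
    if [s[1] for s in ordered] != list(range(total)):
        raise ValueError("missing or duplicate segments")
    return b"".join(s[3] for s in ordered)
```

Sending all segments with the same key (here, the message id) would keep them in one partition, so the consumer sees them in order.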
  • 82. Put things together
  • 83. Help yourself with these open source projects
    ▪ Cruise Control (https://github.com/linkedin/cruise-control)
    ▪ Kafka Monitor (https://github.com/linkedin/kafka-monitor)
    ▪ Burrow (https://github.com/linkedin/burrow)
    ▪ li-apache-kafka-clients (https://github.com/linkedin/li-apache-kafka-clients)
    ▪ Future projects open sourced by the LinkedIn streaming team can be found at https://github.com/linkedin/streaming
    All projects are actively maintained and used in the LinkedIn production environment. 100% free of charge!
  • 84. Projects at LinkedIn that are built on Kafka
    ▪ Stream processing
      – Apache Samza
    ▪ Change data capture
      – Brooklin
    ▪ Strongly consistent key-value store
      – Espresso
    ▪ Efficient key-value store for derived data
      – Venice
  • 87. Hands-on
    ▪ Visit goo.gl/D7GFfB
    ▪ Single cluster
      – Download and compile Apache Kafka
      – Set up a cluster of one broker
      – Create and describe a topic
      – Produce and consume using Apache Kafka tools
      – Monitor availability of your cluster using Kafka Monitor
    ▪ Mirrored pipeline
      – Set up another cluster of one broker
      – Set up MirrorMaker (MM) to mirror traffic from the source cluster to the destination cluster
      – Produce to the source cluster and consume from the destination cluster
      – Monitor availability of your pipeline using Kafka Monitor