Apache Kafka
About me
 Software developer, 10+ years
 At Ilegra since November 2016
 Unicred architecture
 Core banking project
 Studying:
 Microservices
 Reactive services
About
 History
 Motivation
 What is it?
 Design concepts
 Use case
History
Motivation
 Complex infrastructure
 Different tools
 Performance issues
 Governance
 Persistent
What is it?
 Distributed streaming platform
 Produce and consume messages, as in conventional messaging systems
 Store data in a fault-tolerant way
 Process events in order
High-level architecture
Log
 Persistent sequence of messages
 New messages are appended to the end of the log
 Similar to database transaction log
 State recovery
 Sequence of actions / events
 Debug
 Audit
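The log idea above can be sketched in a few lines of Python. This is a toy in-memory model, not Kafka's storage format; the `Log` class, its method names, and the event strings are all hypothetical and only illustrate appending at the end and recovering state by replaying from an offset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Log:
    """A toy append-only log: records are only ever added at the end."""
    records: List[str] = field(default_factory=list)

    def append(self, message: str) -> int:
        """Append a message and return the offset it was stored at."""
        self.records.append(message)
        return len(self.records) - 1

    def read_from(self, offset: int) -> List[str]:
        """Return every message at or after the given offset, in order."""
        return self.records[offset:]


log = Log()
log.append("account-opened")
log.append("deposit:100")
last = log.append("withdraw:30")

# State recovery: replaying the events from offset 0 rebuilds the balance.
balance = 0
for event in log.read_from(0):
    if event.startswith("deposit:"):
        balance += int(event.split(":")[1])
    elif event.startswith("withdraw:"):
        balance -= int(event.split(":")[1])
```

Because nothing is ever updated in place, the same replay also serves as a debugging and audit trail.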
Topic
 Logical division
 Subject
 Unit of coupling
 Can be split into partitions
Partition
 Physical unit
 Each partition is a log
 Producers and consumers connect to partitions
 Distributed across brokers
Producer
 Writes messages to partitions
 Round robin: balanced
 Semantic (key-based) strategy: watch out for hot spots
 Messages can be compressed
 Sync
 Send and wait for the response.
 Exceptions must be handled and the message resent manually
 Async
 Register a callback
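The two partitioning strategies can be compared with a small simulation. This is a sketch under stated assumptions: the partition count and the customer keys are made up, and `zlib.crc32` is only a stand-in for a deterministic hash, not the hash function Kafka's own partitioner uses.

```python
import itertools
import zlib
from collections import Counter

NUM_PARTITIONS = 3  # hypothetical partition count


def round_robin_partitioner(num_partitions):
    """Return a function that cycles through partitions for keyless messages."""
    counter = itertools.count()
    return lambda: next(counter) % num_partitions


def key_partitioner(key: str, num_partitions: int) -> int:
    """Map a key to a partition; the same key always lands on the same one."""
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Round robin spreads nine messages evenly over the three partitions.
next_partition = round_robin_partitioner(NUM_PARTITIONS)
balanced = Counter(next_partition() for _ in range(9))

# A skewed key distribution: one customer produces most of the traffic,
# so the partition that owns that key becomes a hot spot.
keys = ["customer-1"] * 8 + ["customer-2"]
skewed = Counter(key_partitioner(k, NUM_PARTITIONS) for k in keys)
```

Key-based partitioning keeps all messages for one key in order on one partition, which is the reason to accept the hot-spot risk.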
Producer
 Several configuration options
 acks = 0
■ Very fast
■ No delivery guarantee
 acks = 1
■ Waits for the leader's acknowledgment
■ Synchronous sends can increase latency
 acks = all
■ Waits until all in-sync replicas have acknowledged
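As a rough illustration, the three `acks` levels could be kept as named configuration profiles. The property names `bootstrap.servers`, `acks`, and `compression.type` are standard Kafka producer settings; the broker address, the profile names, and the `producer_config` helper are hypothetical, and no client library is used here.

```python
# Settings shared by all profiles; the broker address is a placeholder.
BASE = {"bootstrap.servers": "localhost:9092", "compression.type": "gzip"}

PROFILES = {
    "fire_and_forget": {**BASE, "acks": "0"},  # no broker acknowledgment
    "leader_only": {**BASE, "acks": "1"},      # leader has written the record
    "all_replicas": {**BASE, "acks": "all"},   # all in-sync replicas have it
}


def producer_config(profile: str) -> dict:
    """Return a copy of the producer settings for a named durability profile."""
    return dict(PROFILES[profile])


config = producer_config("all_replicas")
```

A dictionary like this would typically be handed to whichever Kafka client the application uses; how it is passed depends on that client.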
Consumer
 Consumes messages
 Controls its own offset
 Commits the offset
 Belongs to a consumer group
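Offset control can be illustrated with a toy consumer for a single partition. This is a minimal sketch, not a real client: the `PartitionConsumer` class and its `poll`, `commit`, and `seek` methods are hypothetical names chosen to mirror the ideas above.

```python
class PartitionConsumer:
    """Toy consumer for one partition, tracking position and committed offset."""

    def __init__(self, log, committed=0):
        self.log = log              # list of messages in the partition
        self.committed = committed  # where a restarted consumer would resume
        self.position = committed   # offset of the next message to read

    def poll(self, max_records=2):
        """Read up to max_records messages and advance the position."""
        batch = self.log[self.position:self.position + max_records]
        self.position += len(batch)
        return batch

    def commit(self):
        """Record the current position as safely processed."""
        self.committed = self.position

    def seek(self, offset):
        """Move the read position, for example to reprocess old messages."""
        self.position = offset


partition = ["m0", "m1", "m2", "m3", "m4"]
consumer = PartitionConsumer(partition)
first = consumer.poll()   # reads m0 and m1
consumer.commit()         # committed offset is now 2
second = consumer.poll()  # reads m2 and m3, then "crashes" before committing

# A restarted consumer resumes from the committed offset,
# so m2 and m3 are delivered again.
restarted = PartitionConsumer(partition, committed=consumer.committed)
replayed = restarted.poll()
```

Committing only after processing gives at-least-once behavior in this model: work done after the last commit is repeated after a restart.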
Summary
 A topic can have multiple partitions
 Each partition is a log
 Producers send messages
 Consumers pull messages and control the offset
 Within a consumer group, each partition is read by exactly one consumer, so each message is delivered once per group
 The partition is the unit of parallelism: the partition count limits how many consumers in a group can work in parallel
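The last two points of the summary can be checked with a simplified assignment function. This is only a sketch: the `assign` helper and the consumer names are hypothetical, and Kafka's real assignment strategies are more elaborate than this plain round robin.

```python
def assign(partitions, consumers):
    """Spread partitions over consumers in turn; each partition gets one owner."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


partitions = [0, 1, 2]

# Two consumers share three partitions.
two = assign(partitions, ["c1", "c2"])

# Four consumers for three partitions: one consumer has nothing to do.
four = assign(partitions, ["c1", "c2", "c3", "c4"])
idle = [consumer for consumer, owned in four.items() if not owned]
```

Adding consumers beyond the partition count therefore adds no throughput; adding partitions is what raises the ceiling.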
Connector
 Used to connect data systems to Kafka
 Can replace common ETL systems
 Available connectors:
 Amazon S3
 JDBC, MySQL
 HANA
 Cassandra
 Elasticsearch
 FTP
Who is using it?
 ⅓ of the Fortune 500
 https://kafka.apache.org/powered-by
 LinkedIn
 Netflix
 Yahoo
 Twitter
 Goldman Sachs
Other features
 Stream
 Aggregation
 Generate new topics
 Complex data pipelines
 Dynamic data processor
 KSQL
 Real-time stream processing with SQL
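The aggregation idea, where a stream is consumed and its results are written to a new topic, can be sketched without any Kafka library. Everything here is hypothetical: topics are modeled as plain Python lists, and `count_by_key` and the page-view records are invented for illustration.

```python
from collections import defaultdict


def count_by_key(source_topic):
    """Consume (key, value) records and emit a running count per key."""
    counts = defaultdict(int)
    output_topic = []
    for key, _value in source_topic:
        counts[key] += 1
        # Every update is emitted as a new record on the derived topic.
        output_topic.append((key, counts[key]))
    return output_topic


page_views = [("home", "u1"), ("cart", "u2"), ("home", "u3")]
view_counts = count_by_key(page_views)
```

The derived topic is itself a stream, so further processors can consume it, which is how multi-stage pipelines are built up.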
Other features - Stream
Use cases - Data replication
App1 App2
Use cases - Messaging
App1 App2
App3
User activity
Analysis / Dashboard
ETL
BI
Events, Stream, Client Join
MS1
MS2
Events, Stream, Client Join
Questions
