1
Beyond the Brokers: A Tour of the
Kafka Ecosystem
Florent Ramière @framiere
Technical Account Manager/SE
Confluent
PARIS - 11 OCTOBER 2018
2
Interactive session!
3
Massive volumes of new data generated every day
Mobile | Cloud | Microservices | Internet of Things | Machine Learning
Distributed across apps, devices, datacenters, clouds
Structured, unstructured, polymorphic
What
4
With
5
How
6
Store & ETL Process
Publish & Subscribe
In short
7
From a simple idea
8
From a simple idea
9
… with great properties
• Scalability
• Replication
• Security
• Resiliency
• Throughput
• Ordering
• Exactly-once semantics
• Transactions
• Idempotency
• Immutability
• Performance
• …
10
Platform
11
… spawned a full platform
Apache Kafka®
Core | Connect API | Streams API
Stream Processing & Compatibility
KSQL | Schema Registry
Operations
Replicator | Auto Data Balancer | Connectors | MQTT Proxy | Operator
Database Changes | Log Events | IoT Data | Web Events | other events
DATA INTEGRATION
Hadoop | Database | Data Warehouse | CRM | other
REAL-TIME APPLICATIONS
Transformations | Custom Apps | Analytics | Monitoring | other
OPEN SOURCE FEATURES | COMMERCIAL FEATURES
Datacenter | Public Cloud | Confluent Cloud
CONFLUENT PLATFORM
Administration & Monitoring
Control Center | Security
Connectivity
Clients | Connectors | REST Proxy
CONFLUENT FULLY-MANAGED | CUSTOMER SELF-MANAGED
12
ETL
13
Apache Kafka Connect API: Import and Export Data In & Out of Kafka
Sources: JDBC | Mongo | MySQL
Sinks: Elastic | Cassandra | HDFS
Sources → Connector → Kafka Connect API → Kafka Pipeline → Kafka Connect API → Connector → Sinks
Fault tolerant
Manage hundreds of data sources and sinks
Preserves data schema
Integrated within Confluent Control Center
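To make the source side of this slide a little more concrete, here is a minimal sketch of the JSON payload one might send to a Kafka Connect worker's REST API (`POST /connectors`) to register a JDBC source. The connector name, database URL, column name and topic prefix are placeholder assumptions, not values from the talk; the configuration keys are those of the Confluent JDBC source connector.

```python
import json


def jdbc_source_connector(name, connection_url, topic_prefix, id_column):
    """Build the payload for POST /connectors on a Kafka Connect worker.

    All argument values used below are illustrative placeholders.
    """
    return {
        "name": name,
        "config": {
            # Connector class shipped with the Confluent JDBC connector.
            "connector.class": "io.confluent.connect.jdbc.JdbcSourceConnector",
            "tasks.max": "1",
            "connection.url": connection_url,
            # Pick up new rows by watching a strictly increasing column.
            "mode": "incrementing",
            "incrementing.column.name": id_column,
            # Each table is written to the topic <topic_prefix><table name>.
            "topic.prefix": topic_prefix,
        },
    }


payload = jdbc_source_connector(
    name="demo-jdbc-source",  # hypothetical connector name
    connection_url="jdbc:mysql://localhost:3306/demo?user=demo&password=demo",
    topic_prefix="mysql-",
    id_column="id",
)
print(json.dumps(payload, indent=2))
```

The snippet only builds and prints the payload; sending it to a running worker is left out so that it runs without a cluster.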
14
Connectors: Connect Kafka Easily with Data Sources and Sinks
Databases Datastore/File Store
Analytics Applications / Other
Orange Logo denotes Connectors developed and fully supported by Confluent
15
Kafka Connect API, Part of the Apache Kafka™ Project
Connect any source to any target system
Integrated
• 100% compatible with Kafka v0.9 and higher
• Integrated with Confluent’s Schema Registry
• Easy to manage with Confluent Control Center
Flexible
• 40+ open source connectors available
• Easy to develop additional connectors
• Flexible support for data types and formats
Compatible
• Maintains critical metadata
• Preserves schema information
• Supports schema evolution
Reliable
• Automated failover
• Exactly-once guarantees
• Balances workload between nodes
16
Confluent Hub - The Kafka App Store
17
Connectivity
18
Clients: Communicate with Kafka in a Broad Variety of Languages
Apache Kafka
Confluent Platform Community Supported
Proxy http/REST
stdin/stdout
Confluent Platform Clients developed and fully supported by Confluent
19
REST Proxy: Talking to Non-native Kafka Apps and Outside the Firewall
REST Proxy
Non-Java Applications
Native Kafka Java
Applications
Schema Registry
REST / HTTP
Simplifies administrative actions
Simplifies message creation and consumption
Provides a RESTful interface to a Kafka cluster
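As an illustration of "message creation" over HTTP, the sketch below builds (without sending) a produce request in the REST Proxy v2 JSON format. The proxy address, topic name and record content are assumptions made for the example.

```python
import json
import urllib.request


def build_produce_request(base_url, topic, values):
    """Build (but do not send) a REST Proxy v2 produce request for JSON values."""
    body = json.dumps({"records": [{"value": v} for v in values]}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/topics/{topic}",
        data=body,
        method="POST",
        headers={"Content-Type": "application/vnd.kafka.json.v2+json"},
    )


request = build_produce_request(
    "http://localhost:8082",  # assumed REST Proxy address
    "web-events",             # hypothetical topic name
    [{"page": "/home", "user": "alice"}],
)
print(request.get_method(), request.full_url)
print(request.data.decode("utf-8"))
# With a proxy running, it could be sent with urllib.request.urlopen(request).
```

Any language with an HTTP client can do the same, which is the point of the proxy for non-Java applications.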
20
IOT
21
MQTT Proxy: Streamline IoT Data Integration with Kafka
Connect all IoT data sources with the streaming platform - leverages all of your infrastructure investments
Reduce operational cost and complexity by eliminating third party MQTT brokers and their intermediate storage and lag
Ensure IoT data delivery at all QoS levels (QoS0, QoS1 and QoS2) of the MQTT protocol
Devices → MQTT → Gateways → MQTT → MQTT Proxy → Kafka Brokers
22
Frictionless MQTT Connectivity with Confluent Platform
Approach 1: Integrate 3rd-party MQTT broker(s) with Kafka Connect:
Devices and gateways → MQTT → MQTT Broker → Connect w/ MQTT connector → Kafka Brokers
Approach 2: Integrate MQTT clients directly via MQTT Proxy (CP 5.x and later):
Devices and gateways → MQTT → MQTT Proxy → Kafka Brokers
23
Processing
24
Stream Processing by Analogy
Kafka Cluster
Connect API Stream Processing Connect API
$ cat < in.txt | grep "ksql" | tr a-z A-Z > out.txt
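The same analogy can be sketched in Python with generators, where each stage consumes records one at a time, much like a stream processor. The function names and sample lines are invented for the illustration; nothing here talks to Kafka.

```python
def source(lines):
    """Plays the role of `cat < in.txt` (or a source connector)."""
    for line in lines:
        yield line


def keep_matching(records, needle):
    """Plays the role of `grep "ksql"`: a stateless filter."""
    return (r for r in records if needle in r)


def to_upper(records):
    """Plays the role of `tr a-z A-Z`: a stateless map over values."""
    return (r.upper() for r in records)


def sink(records):
    """Plays the role of `> out.txt` (or a sink connector)."""
    return list(records)


in_txt = ["ksql is sql for streams", "unrelated line", "try ksql today"]
out_txt = sink(to_upper(keep_matching(source(in_txt), "ksql")))
print(out_txt)  # ['KSQL IS SQL FOR STREAMS', 'TRY KSQL TODAY']
```

The middle two stages correspond to what `filter()` and `mapValues()` do in Kafka Streams, while the first and last stand in for the Connect API.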
25
Consumer, Producer
• subscribe()
• poll()
• send()
• flush()
Kafka Streams
• mapValues()
• filter()
• punctuate()
KSQL
• Select…from…
• Join…where…
• Group by…
Trade-offs: flexibility (Consumer, Producer) versus simplicity (KSQL)
26
KSQL: Enable Stream Processing using SQL-like Semantics
Example Use Cases
• Streaming ETL
• Anomaly detection
• Event monitoring
Leverage Kafka Streams API without any coding required
KSQL server | Engine (runs queries) | REST API
CLI | Clients | Confluent Control Center GUI
Kafka Cluster
Use any programming language
Connect via CLI or Control Center user interface
27
CREATE TABLE possible_fraud AS
SELECT card_number, count(*)
FROM authorization_attempts
WINDOW TUMBLING (SIZE 5 SECONDS)
GROUP BY card_number
HAVING count(*) > 3;
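To show what this query computes, here is a small batch sketch in Python of a 5-second tumbling-window count per card number. It is only an approximation of the idea: KSQL evaluates the query continuously over a stream, whereas this function works on a finite list, and the sample timestamps and card numbers are made up.

```python
from collections import Counter

WINDOW_SIZE_SECONDS = 5
THRESHOLD = 3


def possible_fraud(attempts):
    """Count attempts per (card_number, 5-second tumbling window) and keep
    the groups with more than THRESHOLD attempts.

    `attempts` is an iterable of (timestamp_seconds, card_number) pairs.
    """
    counts = Counter()
    for timestamp, card_number in attempts:
        # A tumbling window is fixed-size and non-overlapping, so each
        # event falls into exactly one window, identified by its start time.
        window_start = (timestamp // WINDOW_SIZE_SECONDS) * WINDOW_SIZE_SECONDS
        counts[(card_number, window_start)] += 1
    return {key: n for key, n in counts.items() if n > THRESHOLD}


attempts = [
    (0, "4111"), (1, "4111"), (2, "4111"), (4, "4111"),  # 4 attempts in [0, 5)
    (3, "5500"),
    (6, "4111"), (7, "4111"),                            # only 2 in [5, 10)
]
print(possible_fraud(attempts))  # {('4111', 0): 4}
```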
28
Schema
29
The Challenge of Data Compatibility at Scale
App 1
App 2
App 3
Many sources without a policy cause mayhem in a centralized data pipeline
Ensuring downstream systems can use the data is key to an operational stream pipeline
Example: Date formats
Even within a single application, different formats can be presented
Incompatibly formatted message
30
Schema Registry: Make Data Backwards Compatible and Future-Proof
● Define the expected fields for each Kafka topic
● Automatically handle schema changes (e.g. new fields)
● Prevent backwards-incompatible changes
● Support multi-data center environments
[Diagram: App 1 and App 2 → Serializer → Kafka Topic, with Schema Registry; example consumers: Elastic, Cassandra, HDFS]
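The idea behind "prevent backwards-incompatible changes" can be sketched with a deliberately simplified check: a new schema is backward compatible if a reader using it can still read data written with the old one. The schema representation below (a dict of field specs) and the function name are assumptions for illustration; real Avro schema resolution, as used by Schema Registry, has more rules (type promotion, aliases, unions).

```python
def is_backward_compatible(old_schema, new_schema):
    """Return True if a reader using new_schema can read data written with old_schema.

    Schemas are simplified to {field_name: {"type": ..., "default": ...}}.
    """
    for name, spec in new_schema.items():
        if name not in old_schema:
            # Old records lack this field, so the reader needs a default.
            if "default" not in spec:
                return False
        elif old_schema[name]["type"] != spec["type"]:
            # Simplification: any type change is treated as incompatible.
            return False
    return True


v1 = {"id": {"type": "long"}, "created": {"type": "string"}}
v2_ok = {**v1, "country": {"type": "string", "default": "unknown"}}
v2_bad = {**v1, "country": {"type": "string"}}

print(is_backward_compatible(v1, v2_ok))   # True
print(is_backward_compatible(v1, v2_bad))  # False
```

Adding a field with a default passes, while adding a required field fails, which matches the "new fields" case mentioned on the slide.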
31
Deployment
32
Multiple options!
• Zip
• Yum/apt
• Ansible
• Docker
• DC/OS
• Helm charts
• Confluent Operator
• ... Cloud!
33
Operator: Achieve End to End Automation on Kubernetes
Confluent Operator operationalizes years of experience delivering a fully-managed service - Confluent Cloud - on the leading public clouds
Confluent Platform: Docker Images | Confluent Operator
Confluent Cloud: Docker Images | Confluent Operator
Public Cloud: AWS | Azure | GCP
On-Premises: Pivotal | Mesosphere | Red Hat
Accelerate time to value with automated zero-touch provisioning
Reduce OpEx and boost DevOps agility with rolling updates, elastic scaling and auto data balancing
Increase resiliency via SLA monitoring through Control Center or Prometheus
34
Tools
35
Auto Data Balancer: Achieve Enterprise-level Performance for Kafka
Before | Rebalance | After
Dynamically move partitions to optimize resource utilization and reliability
Enable elastic scaling by easily adding and removing nodes from your Kafka cluster
ADB traffic is throttled during data transfers to preserve network bandwidth
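A toy sketch of the "before / after" picture: partitions are moved from the fullest broker to the emptiest one until partition counts are even. This is not Auto Data Balancer's actual algorithm, which is not described in the talk; a real rebalance would also weigh things like disk usage, replica placement and throttling. Broker ids and partition names are invented.

```python
def rebalance(assignment):
    """Move partitions from the fullest broker to the emptiest one until
    broker partition counts differ by at most one.

    `assignment` maps broker id -> list of partition names. Returns the new
    assignment and the list of (partition, from_broker, to_broker) moves.
    """
    brokers = {b: list(parts) for b, parts in assignment.items()}
    moves = []
    while True:
        fullest = max(brokers, key=lambda b: len(brokers[b]))
        emptiest = min(brokers, key=lambda b: len(brokers[b]))
        if len(brokers[fullest]) - len(brokers[emptiest]) <= 1:
            return brokers, moves
        partition = brokers[fullest].pop()
        brokers[emptiest].append(partition)
        moves.append((partition, fullest, emptiest))


# Broker 3 has just been added and holds no partitions yet.
before = {1: ["t-0", "t-1", "t-2", "t-3"], 2: ["t-4", "t-5"], 3: []}
after, moves = rebalance(before)
print(after)
print(moves)
```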
36
Replicator: Stretch Kafka Across Data Centers and Public Cloud
Protect business-critical data and metadata by replicating down to topic-level configurations
Minimize recovery time objectives (RTO) through automated failover and switchback
Meet recovery point objectives (RPO) by running more workers to increase replication throughput
Bridge your data center to the cloud with Confluent Cloud
37
Deploy Confluent Platform on K8s via a Growing Partner Ecosystem
38
Monitoring
39
Confluent Control Center– Cluster Health & Administration
Cluster health dashboard
• Monitor the health of your Kafka clusters and get alerts if any problems occur
• Measure system load, performance, and operations
• View aggregate statistics or drill down by broker or topic
Cluster administration
• Monitor topic configurations
40
Operate More Secure, Reliable and Performant Apache Kafka
For Operators
● Broker configuration view → see config across multiple Kafka clusters or check values for specific brokers
● Consumer lag → view how consumers are performing based on offset, spot potential issues and take proactive steps to keep performance high
● Feature access controls → control customer access to topic inspection, schemas, and KSQL
Control Center enhancements in Confluent Platform 5.0
41
Build More Powerful Streaming Applications
For Developers
● Topic inspection → gain insight into the actual data in Kafka topics
● Schema Registry integration → view older and current schema versions in a git-like UI
● KSQL GUI → create streams and tables from topics, experiment with transient queries, and run persistent queries to filter and enrich data
Control Center enhancements in Confluent Platform 5.0
42
Consumer Lag Monitoring
View consumer-partition lag across topics for a consumer group
Alert on max consumer group lag across all topics
Control Center enhancements in Confluent Platform 5.0
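For readers unfamiliar with the metric, consumer lag for a partition is the difference between the latest offset in the log and the offset the consumer group has committed. The sketch below computes per-partition lag and the maximum for a group. The function names, topics and offsets are invented, and treating a missing committed offset as 0 is an assumption of this example.

```python
def partition_lag(log_end_offsets, committed_offsets):
    """Lag per partition = latest offset in the log minus the group's committed offset."""
    return {
        tp: end - committed_offsets.get(tp, 0)
        for tp, end in log_end_offsets.items()
    }


def max_group_lag(lags):
    """Largest per-partition lag for the group, the value one would alert on."""
    return max(lags.values(), default=0)


# Keys are (topic, partition); all numbers are made up for illustration.
log_end = {("orders", 0): 1200, ("orders", 1): 950, ("payments", 0): 400}
committed = {("orders", 0): 1195, ("orders", 1): 700, ("payments", 0): 400}

lags = partition_lag(log_end, committed)
print(lags)
print(max_group_lag(lags))  # 250
```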
43
KSQL UI
Make stream processing more accessible
Build stream processing IP in CE
Manage streams & tables
Run KSQL (transient & persistent)
View persistent queries
Control Center enhancements in Confluent Platform 5.0
44
White papers
https://github.com/framiere/monitoring-demo
45
46
Confluent resources
47
Resources - Confluent Cloud Datasheet
https://www.confluent.io/wp-content/uploads/confluent-cloud-datasheet.pdf
48
Resources - Confluent Enterprise Reference Architecture
https://www.confluent.io/whitepaper/confluent-enterprise-reference-architecture/
49
Optimizing Your Apache Kafka® Deployment
https://www.confluent.io/white-paper/optimizing-your-apache-kafka-deployment/
50
Small Cluster Reference Architecture – 19 software nodes – 8 Hosts
10 nodes | 6 nodes | 2 nodes | 1 node
51
Large Cluster Reference Architecture – 22 software nodes - 19 hosts
3 nodes | 4 nodes | 5 nodes | 4 nodes | 2 nodes | 2 nodes | 1 node
52
Community
53
Resources – Community Slack and Mailing List
https://slackpass.io/confluentcommunity
https://groups.google.com/forum/#!forum/confluent-platform
54
Confluent Blog
55
A Kafka Story
https://github.com/framiere/a-kafka-story
56
cp-demo
https://github.com/confluentinc/cp-demo
With security inside!
57
Kafka Boom Boom
https://github.com/Dabz/kafka-boom-boom
