Confluent Platform 5.4 -
RBAC, Multi-Region Clusters and more
March 25th 2020
Kai Waehner
Technology Evangelist
kai.waehner@confluent.io
LinkedIn
@KaiWaehner
www.confluent.io
www.kai-waehner.de
Ways to Deploy
Confluent Platform
The enterprise distribution of
Apache Kafka
VM
Deploy on any platform
on-prem or cloud
Self Managed Software Fully Managed Software
Confluent Cloud
Apache Kafka re-engineered
for the Cloud
Available on the leading public clouds
Also via Marketplace
Confluent Platform
FREEDOM OF CHOICE
Fully Managed Cloud Service | Self Managed Software
COMMITTER-DRIVEN EXPERTISE
Partners | Training | Professional Services | Enterprise Support
Apache Kafka
UNRESTRICTED DEVELOPER PRODUCTIVITY (Developer)
SQL-based Stream Processing
KSQL (ksqlDB)
Rich Pre-built Ecosystem
Connectors | Hub | Schema Registry
Multi-language Development
non-Java clients | REST Proxy
EFFICIENT OPERATIONS AT SCALE (Operator)
GUI-driven Mgmt & Monitoring
Control Center
Flexible DevOps Automation
Operator | Ansible
Dynamic Performance & Elasticity
Auto Data Balancer | Tiered Storage
PRODUCTION-STAGE PREREQUISITES (Architect)
Enterprise-grade Security
RBAC | Secrets | Audit logs
Data Compatibility
Schema Registry | Schema Validation
Global Resilience
Multi-Region Clusters | Replicator
Open Source | Community licensed
PARTNERSHIP FOR BUSINESS SUCCESS
Complete Engagement Model
Revenue / Cost / Risk Impact
TCO / ROI
Executive Buyer
Rapid Pace of Innovation to Enable Enterprises
January 2020
CP 5.4 (based on AK 2.4)
Security
● Role-Based Access Control
● Structured Audit Logs
Resilience
● Multi-Region Clusters
Data Compatibility
● Schema Validation
Management & Monitoring
● Control Center
○ RBAC management
○ Replicator monitoring
Performance & Elasticity
● Tiered Storage (preview)
Stream Processing
● ksqlDB features (preview)
April 2019
CP 5.2 (based on AK 2.2)
Developers
● Free single-broker
developer license
● librdkafka and clients 1.0
KSQL
● New query expressions
● GUI enhancements
Replicator
● Schema migration to
CCloud
Control Center
● Dynamic broker
configuration
● Schema Registry
management
● Multi-cluster Connect &
KSQL
● Enhanced scalability
July 2018
CP 5.0 (based on AK 2.0)
Security
● AD/LDAP Authorizer
Replicator
● Automatic offset translation
Control Center
● Consumer lag
● View broker configuration
● View topics
● KSQL editor
Ecosystem
● MQTT Proxy
July 2019
CP 5.3 (based on AK 2.3)
Security
● Role-Based Access Control
(preview)
● Secret Protection
DevOps automation
● Kubernetes Operator
● Ansible Playbooks
Management & Monitoring
● Control Center redesigned
user interface
● New CLI
Enterprise-grade Security
Enterprise Grade
Security
• Architecting with security is a
design priority
• Avoiding unnecessary complexity is
key
• As usage of event streaming
spreads, native tools (e.g. Kafka
Access Control Lists) for managing
authorization can become complex
• Problem is exacerbated when
failing to standardize security across
the platform
Why do you need better
authorization?
Role-Based Access Control
Provides platform-wide security
with fine-tuned granularity
• Granular control of access
permissions, including:
• Clusters, topics, consumer
groups, connectors
• Efficient management at large scale
• Delegate authorization
management to true resource
owners
• Platform-wide standardization
• Enforced via GUI, CLI and APIs
• Enforced across all CP
components: Connect, KSQL,
Schema Registry, REST Proxy,
Control Center and MQTT Proxy
[Figure: RBAC authorization. A role binding combines users/groups, roles and resource scoping, and is managed via CLI, GUI and API]
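The role binding idea on this slide can be sketched in a few lines of plain Java. This is a toy model under stated assumptions, not Confluent's implementation: the class `RbacSketch`, the principals `User:alice` and `User:bob`, the `orders-` topic prefix, and the simplified mapping of the role names `DeveloperRead` and `DeveloperWrite` to operations are all illustrative.

```java
import java.util.EnumSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Toy model of role bindings; it does not call any Confluent API.
class RbacSketch {
    enum Operation { READ, WRITE, DESCRIBE }

    // Simplified role-to-operation mapping (an assumption for illustration).
    static final Map<String, Set<Operation>> ROLES = Map.of(
        "DeveloperRead", EnumSet.of(Operation.READ, Operation.DESCRIBE),
        "DeveloperWrite", EnumSet.of(Operation.WRITE, Operation.DESCRIBE));

    // A role binding ties a principal to a role within a resource scope.
    static final class RoleBinding {
        final String principal;
        final String role;
        final String resourceType;
        final String namePrefix;

        RoleBinding(String principal, String role, String resourceType, String namePrefix) {
            this.principal = principal;
            this.role = role;
            this.resourceType = resourceType;
            this.namePrefix = namePrefix;
        }

        boolean allows(String who, Operation op, String type, String name) {
            return principal.equals(who)
                && resourceType.equals(type)
                && name.startsWith(namePrefix)
                && ROLES.getOrDefault(role, Set.of()).contains(op);
        }
    }

    // Deny by default: access is granted only if some binding allows it.
    static boolean authorize(List<RoleBinding> bindings, String who,
                             Operation op, String type, String name) {
        return bindings.stream().anyMatch(b -> b.allows(who, op, type, name));
    }

    // Hypothetical bindings: alice may read and bob may write topics prefixed "orders-".
    static List<RoleBinding> exampleBindings() {
        return List.of(
            new RoleBinding("User:alice", "DeveloperRead", "Topic", "orders-"),
            new RoleBinding("User:bob", "DeveloperWrite", "Topic", "orders-"));
    }
}
```

Scoping a binding to a name prefix rather than to a single topic is one way a resource owner could manage many topics with a handful of bindings, which is the scaling argument the slide makes against per-resource ACLs.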
Enterprise Grade
Security
• Lack of visibility into actions taken
by users/applications
• Difficult to perform forensics to
detect anomalies and identify bad
actors
• Failure to comply with regulatory
requirements
Why do you need better
visibility?
Structured Audit
Logs
Enable security traceability and
regulatory compliance
• Detection of abnormal behavior and
potential security threats
• Capture authorization logs in a set of
dedicated Kafka topics
• Process and analyze with KSQL, or
offload to external systems (e.g.
Splunk, S3)
• Industry Standardization
• Uses CloudEvents specification to
define the syntax of the logs
Sample Audit Logs
Event | Description | Category | Captured by default
Authorize | An RBAC authorization is being requested. | MANAGEMENT | Yes
CreateTopics | A topic is being created. | MANAGEMENT | Yes
Produce | A Kafka producer is writing a batch of records to a topic. | PRODUCE | No
FetchConsumer | A Kafka consumer is reading a batch of records from a topic. | CONSUME | No
LeaderAndIsr | Controller is sending leader and ISR state to a broker. | INTERBROKER | No
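To make the default-capture column concrete, here is a minimal Java sketch that filters the sample events by category. The class `AuditLogSketch` and its types are hypothetical and only mirror the rows of the sample table on this slide; they are not the audit log API or the CloudEvents envelope.

```java
import java.util.List;
import java.util.stream.Collectors;

// Toy model of the sample audit log table; not an actual audit log client.
class AuditLogSketch {
    // Categories and default-capture flags as listed in the sample table.
    enum Category {
        MANAGEMENT(true), PRODUCE(false), CONSUME(false), INTERBROKER(false);

        final boolean capturedByDefault;

        Category(boolean capturedByDefault) {
            this.capturedByDefault = capturedByDefault;
        }
    }

    static final class AuditEvent {
        final String name;
        final Category category;

        AuditEvent(String name, Category category) {
            this.name = name;
            this.category = category;
        }
    }

    // The five sample events from the slide.
    static List<AuditEvent> sampleEvents() {
        return List.of(
            new AuditEvent("Authorize", Category.MANAGEMENT),
            new AuditEvent("CreateTopics", Category.MANAGEMENT),
            new AuditEvent("Produce", Category.PRODUCE),
            new AuditEvent("FetchConsumer", Category.CONSUME),
            new AuditEvent("LeaderAndIsr", Category.INTERBROKER));
    }

    // Keep only the names of events whose category is captured by default.
    static List<String> capturedByDefault(List<AuditEvent> events) {
        return events.stream()
            .filter(e -> e.category.capturedByDefault)
            .map(e -> e.name)
            .collect(Collectors.toList());
    }
}
```

With the sample rows, only the two MANAGEMENT events pass the filter; the high-volume produce, consume and inter-broker events are the ones left out by default.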
Global Resilience
Global Resilience
• Modern companies have high
expectations for durability,
availability, and latency
• Replication based on Kafka Connect
(e.g. Replicator or MirrorMaker 2)
comes with operational complexity
and requires downtime
• Stretch cluster architectures
historically came with a tradeoff:
availability vs. performance
Why do you need better
disaster recovery?
Multi-Region Clusters
Change the game for disaster recovery for Kafka
• Zero downtime and zero data loss
for critical Kafka Topics
• Automated client failover
• Streamlined DR operations
• Leverages Kafka’s internal
replication
• No separate Connect clusters
• Single multi-region cluster with
high write throughput
• Asynchronous replication using
“Observer” replicas
• Low bandwidth costs and high read
throughput
• Remote consumers read data
locally, directly from Observers
[Figure: Single Kafka Cluster stretched across us-west-1, us-central-1 and us-east-1, with Brokers 1-6, ZooKeeper nodes ZK1-ZK3 and Clients A, B, D, F and G. Labels: Observer replicas, “tie-breaker” datacenter, Failover site, Site failure!, automated client failover]
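The "remote consumers read data locally" point can be illustrated with a small Java sketch of replica selection. It is a simplification under stated assumptions: the class `ReplicaSelectionSketch`, its `Replica` type and the example broker layout are hypothetical, and the actual choice is made inside Kafka rather than by application code like this.

```java
import java.util.List;

// Toy model of choosing which replica serves a consumer's reads.
class ReplicaSelectionSketch {
    static final class Replica {
        final int brokerId;
        final String region;
        final boolean leader;
        final boolean observer; // asynchronously replicated copy

        Replica(int brokerId, String region, boolean leader, boolean observer) {
            this.brokerId = brokerId;
            this.region = region;
            this.leader = leader;
            this.observer = observer;
        }
    }

    // Prefer a replica in the client's own region; otherwise fall back to the leader.
    static Replica selectReadReplica(List<Replica> replicas, String clientRegion) {
        return replicas.stream()
            .filter(r -> r.region.equals(clientRegion))
            .findFirst()
            .orElseGet(() -> replicas.stream()
                .filter(r -> r.leader)
                .findFirst()
                .orElseThrow(() -> new IllegalStateException("partition has no leader")));
    }

    // Hypothetical partition: leader and follower in us-west-1, observer in us-east-1.
    static List<Replica> examplePartition() {
        return List.of(
            new Replica(1, "us-west-1", true, false),
            new Replica(2, "us-west-1", false, false),
            new Replica(4, "us-east-1", false, true));
    }
}
```

A consumer in us-east-1 is served by the observer on broker 4 and never crosses regions, which is where the bandwidth saving on the slide comes from; a consumer in a region without any replica falls back to the leader.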
Data Compatibility
Data Compatibility
• Confluent Schema Registry increases
data compatibility through client-level
“agreements”, but Kafka is
unaware of them
• No programmatic way of enforcing
that producers talk to Schema
Registry before publishing
messages to Kafka
• Leads to risk and uncertainty
regarding data quality for large
organizations
Why do you need
enhanced validation
of data quality?
Schema Validation
Provides a centralized way of
controlling data compatibility
• Certainty and peace of mind at scale
regarding data quality
• Automated broker-side schema
validation and enforcement
• Direct interface from the broker
to Confluent Schema Registry
• Granular control over schema
validation
• Enabled at the topic level
[Figure: Producer, Broker and Schema Registry. 1. The producer sends a message with an invalid schema. 2. The broker returns an error message]
confluent.value.schema.validation=true
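A rough Java sketch of what a broker-side check could look like follows. It assumes the Schema Registry wire format (one magic byte `0` followed by a 4-byte big-endian schema ID) and uses an in-memory set of IDs as a stand-in for Schema Registry; the class `SchemaValidationSketch` and its methods are hypothetical and do not reflect the broker's actual code path.

```java
import java.nio.ByteBuffer;
import java.util.Set;

// Toy model of broker-side schema validation; the registry is an in-memory set.
class SchemaValidationSketch {
    static final byte MAGIC_BYTE = 0x0;

    // Returns the schema ID embedded in the payload, or -1 if the framing is wrong.
    static int schemaIdOf(byte[] payload) {
        if (payload == null || payload.length < 5 || payload[0] != MAGIC_BYTE) {
            return -1;
        }
        // ByteBuffer reads big-endian by default; the ID occupies bytes 1..4.
        return ByteBuffer.wrap(payload, 1, 4).getInt();
    }

    // Accept the record only if it carries a schema ID known for this topic.
    static boolean accept(byte[] payload, Set<Integer> registeredIds) {
        int id = schemaIdOf(payload);
        return id >= 0 && registeredIds.contains(id);
    }

    // Helper that frames a message body the way a serializer would.
    static byte[] frame(int schemaId, byte[] body) {
        return ByteBuffer.allocate(5 + body.length)
            .put(MAGIC_BYTE)
            .putInt(schemaId)
            .put(body)
            .array();
    }
}
```

For example, `accept(frame(42, body), Set.of(42))` passes, while a payload framed with an unregistered ID or without the magic byte is rejected, which corresponds to the error message in step 2 of the figure.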
GUI-Driven Management & Monitoring
• Control Center is rapidly becoming
the de facto user interface for many
Confluent Platform users
• We must ensure that Control Center
can manage and monitor Confluent
Platform comprehensively
• Need for a wide variety of use cases
and supported scale
Why these
improvements to
Control Center?
GUI-driven mgmt for
new CP 5.4 features
• Role-Based Access Control
• View own permissions, and
manage subordinate role
bindings
• Multi-Region Clusters
• Track Observer replica
placement in each topic view
• Schema Validation
• Enable at the topic level when
creating or editing topics
RBAC management
Confluent Replicator
integration
• Simplified monitoring for multi-site
replication with the GUI
• Track key metrics such as
throughput and lag
Replicator monitoring
New aggregate views
• Simplified monitoring and
troubleshooting for Kafka clusters
• Cluster Overview: shows overall
status of the Kafka cluster,
including brokers, replicas,
partitions and topics
• Metrics Dashboard: aggregates
all Kafka cluster metrics into a
single page
Cluster Overview
Metrics Dashboard
Dynamic Performance & Elasticity
Dynamic Performance
& Elasticity
• As event streaming spreads, the
platform is required to store larger
amounts of data for longer periods
of time
• Kafka’s tight coupling between
compute and storage makes the
platform difficult to scale
• Longer data retention leads to high
storage costs
Why do you need
enhanced scalability
and data efficiency?
Tiered Storage (preview)
Enable infinite retention in Kafka, cost-effectively
• Infinite retention
• Older data is offloaded to
inexpensive object storage,
accessible at any time
• Reduced storage costs
• Storage limitations, like capacity and
duration, are effectively uncapped
• Elastic scalability
• “Lighter” Kafka brokers enable
instantaneous load balancing when
scaling up
[Figure: Clients connect to a Broker split into Compute (transactions, auth, quota enforcement, compaction, ...) and Storage (Local, plus Remote Object Storage)]
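The offloading idea can be sketched as a simple age-based policy in Java. This is an assumption-laden illustration, not how Tiered Storage decides internally: the class `TieredStorageSketch`, the `Segment` type and the `localRetentionMs` parameter are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

// Toy policy: pick closed log segments that are old enough to live in object storage.
class TieredStorageSketch {
    static final class Segment {
        final long baseOffset;
        final long lastModifiedMs;

        Segment(long baseOffset, long lastModifiedMs) {
            this.baseOffset = baseOffset;
            this.lastModifiedMs = lastModifiedMs;
        }
    }

    // Segments are ordered oldest first; the last one is the active segment.
    static List<Segment> segmentsToOffload(List<Segment> segments, long nowMs, long localRetentionMs) {
        List<Segment> result = new ArrayList<>();
        // Skip the active segment: it is still being written to.
        for (int i = 0; i < segments.size() - 1; i++) {
            Segment s = segments.get(i);
            if (nowMs - s.lastModifiedMs > localRetentionMs) {
                result.add(s);
            }
        }
        return result;
    }
}
```

Because only recent segments stay on local disk under such a policy, a broker holds less data to move when partitions are reassigned, which is the "lighter brokers" argument on the slide.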
Confluent Server
Enables enterprise features
Required to enable:
● Operator
● RBAC
● Structured Audit Logs
● Multi-Region Clusters
● Schema Validation
● Tiered Storage (preview)
Optional software package
● Deploy CP with Confluent
Server or Apache Kafka
● In-place migration between
Confluent Server and Kafka
[Figure: Confluent Server (Apache Kafka plus enterprise capabilities) within Confluent Platform, alongside KSQL, Schema Registry, REST Proxy, Control Center, Replicator and MQTT Proxy]
SQL-based Stream Processing
Kafka producer/consumer | Kafka Streams | ksqlDB
The 3 stream processing modalities with Confluent
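import java.time.Duration;
import java.util.HashMap;
import java.util.Map;
import org.apache.kafka.clients.consumer.ConsumerRecord;
import org.apache.kafka.clients.consumer.ConsumerRecords;

// Sum values per key with the plain consumer API; state handling is manual.
// stateStore, MAX_RETRIES, RetryUtils and StateStoreException are placeholders
// from the slide, not Kafka APIs.
ConsumerRecords<String, Integer> records = consumer.poll(Duration.ofMillis(100));
Map<String, Integer> counts = new HashMap<>();
for (ConsumerRecord<String, Integer> record : records) {
    // merge() stores the value for a new key, otherwise adds it to the running sum
    counts.merge(record.key(), record.value(), Integer::sum);
}
for (Map.Entry<String, Integer> entry : counts.entrySet()) {
    int attempts = 0;
    while (attempts++ < MAX_RETRIES) {
        try {
            // Read-modify-write against the external store, retried on failure
            int stateCount = stateStore.getValue(entry.getKey());
            stateStore.setValue(entry.getKey(), entry.getValue() + stateCount);
            break;
        } catch (StateStoreException e) {
            RetryUtils.backoff(attempts);
        }
    }
}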
The 3 stream processing modalities differ in
flexibility and ease of use
Kafka producer/consumer Kafka Streams ksqlDB
builder
    .stream("input-stream",
        Consumed.with(Serdes.String(), Serdes.String()))
    .groupBy((key, value) -> value)
    .count()
    .toStream()
    .to("counts", Produced.with(Serdes.String(), Serdes.Long()));
SELECT x, count(*) FROM stream GROUP BY x EMIT CHANGES;
Using external processing systems leads to
complicated architectures
[Figure: separate DB, CONNECTOR, STREAM PROCESSING and APP components wired together]
We can put it back together in a simpler way
[Figure: ksqlDB sits between DBs and APPs and combines CONNECTORS, STREAM PROCESSING and STATE STORES, serving PUSH and PULL queries]
Connect integration and pull queries enable end-to-end
streaming in just a few SQL statements
Serve lookups against
materialized views
Create
materialized views
Perform continuous
transformations
Capture data
CREATE STREAM purchases AS
  SELECT viewtime, userid, pageid,
    TIMESTAMPTOSTRING(viewtime, 'yyyy-MM-dd HH:mm:ss.SSS')
  FROM pageviews;
CREATE TABLE orders_by_country AS
  SELECT country, COUNT(*) AS order_count, SUM(order_total) AS order_total
  FROM purchases
    LEFT JOIN user_profiles ON purchases.customer_id = user_profiles.customer_id
  WINDOW TUMBLING (SIZE 5 MINUTES)
  GROUP BY country
  EMIT CHANGES;
SELECT * FROM orders_by_country WHERE country='usa';
CREATE SOURCE CONNECTOR jdbcConnector WITH (
  'connector.class' = '...JdbcSourceConnector',
  'connection.url' = '...',
  …);
Confluent Platform 5.4 + Apache Kafka 2.4 Overview (RBAC, Tiered Storage, Multi-Region Clusters, Audit Logs, and more)
Confluent Platform
and Confluent Cloud
are always built on the
latest version of
Apache Kafka
If you want to learn what’s included in
Apache Kafka 2.4, we have resources
available for you:
• Technical Blog:
https://www.confluent.io/blog/apache-kafka-2-4-latest-version-updates
• Overview Video:
https://youtu.be/Ipzc--mbvzg
Apache Kafka 2.4
All events here:
https://events.confluent.io/
confluentkitchenseries2020
