Lessons on running Kafka on K8S
Avinash Upadhyaya
Tech Socialite @ Platformatory
Ashwin Venkatesh
Principal Engineer @ Platformatory
Speaker Info
Avinash Upadhyaya
● Platform engineer @ platformatory.io
● Meetup organizer for Kong, Kafka, Grafana, Docker and Bangalore Streams
● CCDAK among other cloud certs
Ashwin Venkatesh
● Principal Engineer @ platformatory.io
● Experienced Apache Kafka consultant
● CCDAK
Cloud native what?
Hold my beer while I rebalance stuff
You want to run Kafka on K8S?
- More gluttony for torture
- Surprisingly simpler than configuring server.properties by hand (or with Ansible)
- (if done well)
The Operator Pattern in summary
- A Kubernetes operator watches a custom resource (CR) type and takes application-specific actions to make the current state match the desired state declared in that resource
- Implements domain-specific knowledge using Kubernetes
- Allows managing complex applications through the Kubernetes API and the kubectl interface
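
The "make the current state match the desired state" idea can be sketched in a few lines. This is a minimal illustration only, not how any particular operator is implemented: `ClusterState`, its fields, and the action strings are hypothetical stand-ins for a real custom resource and real Kubernetes API calls.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterState:
    """Hypothetical, heavily simplified view of a Kafka cluster."""
    brokers: int
    version: str


def reconcile(desired: ClusterState, current: ClusterState) -> list[str]:
    """Return the actions needed to move `current` towards `desired`.

    A real operator would call the Kubernetes API here; this sketch
    only names the actions so the control flow is visible.
    """
    actions: list[str] = []
    if current.version != desired.version:
        actions.append(f"rolling-upgrade to {desired.version}")
    if current.brokers < desired.brokers:
        actions.append(f"add {desired.brokers - current.brokers} broker(s)")
    elif current.brokers > desired.brokers:
        actions.append(f"drain and remove {current.brokers - desired.brokers} broker(s)")
    return actions


desired = ClusterState(brokers=5, version="3.6.0")  # what the CR declares
current = ClusterState(brokers=3, version="3.5.1")  # what is actually running
print(reconcile(desired, current))  # two actions are needed
print(reconcile(desired, desired))  # already converged: no actions
```

An operator runs this kind of comparison every time the watched resource or the cluster changes, which is why the desired state can simply be edited with kubectl.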
Any complex stateful workload that can’t be run as a fully managed service will be provided as a K8S operator
Scope of coverage: A mental model of Kubernetes Operators for Kafka
- Operator Core
  - Custom Resources
  - Workload Type
  - Networking
  - Storage
- Security
  - Authentication
  - Authorization
- Operational Features
  - Balancing
  - Monitoring
  - Disaster Recovery
  - Scale up/out
  - Deployments & Rollouts
- Extensibility
Security: What are the typical requirements for Kafka?
● Auto-generate certificates for TLS and mTLS between brokers and other internal components
● Natively support authentication mechanisms such as SASL/PLAIN, SASL/SCRAM, SASL/OAUTHBEARER and SASL/GSSAPI
● Authorization with ACLs: provide user management capabilities through the k8s API
Operations: What are the typical requirements for Kafka?
● Rebalancing partitions when the load on the brokers is uneven, or when a broker is added or removed
● Monitoring cluster health with JMX metrics
● Rolling upgrades with no downtime
● Replicating data across clusters
● Rack awareness for durability
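
To make the rack-awareness requirement concrete, here is a minimal sketch of rack-aware replica placement. It is an illustration under stated assumptions, not the algorithm used by Kafka or by any of the operators: it assumes every rack holds the same number of brokers and that the replication factor does not exceed the number of racks. The function names and the rack layout are hypothetical.

```python
from __future__ import annotations


def interleave_by_rack(brokers_by_rack: dict[str, list[int]]) -> list[int]:
    """Order brokers so that neighbours in the list sit in different racks."""
    racks = sorted(brokers_by_rack)
    size = len(brokers_by_rack[racks[0]])
    if any(len(brokers_by_rack[r]) != size for r in racks):
        raise ValueError("this sketch assumes equally sized racks")
    return [brokers_by_rack[r][i] for i in range(size) for r in racks]


def assign_replicas(
    brokers_by_rack: dict[str, list[int]],
    partitions: int,
    replication_factor: int,
) -> dict[int, list[int]]:
    """Place each partition's replicas on consecutive brokers of the
    rack-interleaved ordering, so no two replicas share a rack."""
    if replication_factor > len(brokers_by_rack):
        raise ValueError("replication factor must not exceed the number of racks")
    ordered = interleave_by_rack(brokers_by_rack)
    n = len(ordered)
    return {
        p: [ordered[(p + i) % n] for i in range(replication_factor)]
        for p in range(partitions)
    }


# Hypothetical layout: three racks (for example, availability zones), two brokers each.
racks = {"a": [0, 1], "b": [2, 3], "c": [4, 5]}
for partition, replicas in assign_replicas(racks, partitions=4, replication_factor=3).items():
    print(partition, replicas)
```

With this layout, losing an entire rack takes out at most one replica of any partition, which is the durability property the slide refers to.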
Confluent for Kubernetes (CFK)
● Confluent Platform on Kubernetes
● Based on the experience of running Kafka on Kubernetes for Confluent Cloud
● Uses StatefulSets to restore a Kafka pod with the same Kafka broker ID, configuration, and persistent storage volumes if a failure occurs
● Provides server properties, JVM, and Log4j configuration overrides for customization of all Confluent Platform components
● Complete, granular RBAC
● Support for credential management systems, such as HashiCorp Vault, to inject sensitive configurations in memory into Confluent deployments
● Supports tiered storage
● Supports multi-region deployments
Strimzi
● Open source, CNCF sandbox project
● Implements security in a Kubernetes-native fashion
● Uses StrimziPodSets to overcome the challenges of StatefulSets
○ Add/remove brokers arbitrarily
○ Stretch a cluster across k8s clusters
○ Different configurations and volumes for different brokers
● KafkaBridge for a RESTful HTTP interface
Koperator (Banzai Cloud)
● Open-source core component of Banzai Cloud Supertubes
○ Most of the compelling features and integrations are only available as part of the Supertubes Core or Supertubes Pro product suites
● Envoy-based load balancing for external access
● Uses pods instead of StatefulSets, in order to
○ modify the configuration of individual brokers
○ remove specific brokers from clusters
○ use multiple Persistent Volumes for each broker
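
The reason both Strimzi and Koperator move away from StatefulSets can be shown with a small sketch. A StatefulSet numbers its pods 0..N-1 and scales down by removing the highest ordinal first, so a specific broker cannot be singled out. The two functions below are hypothetical simplifications for illustration, not operator code.

```python
from __future__ import annotations


def statefulset_scale_down(replicas_now: int, replicas_wanted: int) -> list[int]:
    """StatefulSet-style scale-down: pods are ordinals 0..N-1 and the
    highest ordinals go first, so the survivors are always 0..wanted-1."""
    if not 0 <= replicas_wanted <= replicas_now:
        raise ValueError("replicas_wanted must be between 0 and replicas_now")
    return list(range(replicas_wanted))


def podset_remove(broker_ids: list[int], to_remove: set[int]) -> list[int]:
    """Pod-per-broker management: any broker can be removed by ID."""
    return [b for b in broker_ids if b not in to_remove]


brokers = [0, 1, 2, 3, 4]

# Suppose broker 1 sits on a node that has to be retired.
print(statefulset_scale_down(len(brokers), 4))  # broker 4 goes, broker 1 stays
print(podset_remove(brokers, {1}))              # broker 1 goes, as intended
```

The same per-pod control is what allows different configurations and volumes per broker, since each pod is described individually rather than stamped out from one template.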
Comparison
Prescriptive Advice
- As with all things k8s, it is important to set up resource constraints (CPU and memory limits)
- It is generally advised to taint Kafka nodes with NoSchedule and run brokers on dedicated nodes
  - i.e. no bin-packing of other workloads onto those nodes
- For most real-life use cases, CRs are a starting point. They will need to be packaged into “platform recipes” with different components, orienting some level of tenancy around the brokers as well as the components
  - Typically a higher-order Helm chart, preferably with GitOps-style deployments
- Prospective users must also think about operator tenancy itself: it could be a global operator or a namespaced operator
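
As a rough sketch of the first two points, the fragment below pairs a NoSchedule taint on dedicated nodes with a matching toleration and explicit resource requests/limits on the broker container. The field names (`tolerations`, `operator`, `effect`, `resources`) follow the Kubernetes pod spec; the taint key `dedicated=kafka`, the CPU/memory figures, and the simplified `tolerates` check are assumptions made for illustration. Each operator exposes these settings through its own CR fields.

```python
from __future__ import annotations

# Hypothetical taint applied to the nodes reserved for Kafka.
KAFKA_TAINT = {"key": "dedicated", "value": "kafka", "effect": "NoSchedule"}

# Fragment of a broker pod spec; the numbers are placeholders, not sizing advice.
broker_pod_fragment = {
    "tolerations": [
        {"key": "dedicated", "operator": "Equal", "value": "kafka", "effect": "NoSchedule"}
    ],
    "containers": [
        {
            "name": "kafka",
            "resources": {
                "requests": {"cpu": "2", "memory": "8Gi"},
                "limits": {"cpu": "2", "memory": "8Gi"},
            },
        }
    ],
}


def tolerates(taint: dict, tolerations: list[dict]) -> bool:
    """Simplified version of the Kubernetes matching rule: keys must match,
    an unset toleration effect matches any effect, and the operator is
    either Exists (any value) or Equal (same value)."""
    for t in tolerations:
        if t.get("key") != taint["key"]:
            continue
        if t.get("effect") not in (None, "", taint["effect"]):
            continue
        op = t.get("operator", "Equal")
        if op == "Exists" or (op == "Equal" and t.get("value") == taint.get("value")):
            return True
    return False


print(tolerates(KAFKA_TAINT, broker_pod_fragment["tolerations"]))  # broker pods may land here
print(tolerates(KAFKA_TAINT, []))                                  # other pods are kept off
```

A toleration only permits scheduling onto the tainted nodes. Keeping brokers exclusively on those nodes additionally needs a node selector or node affinity, which is omitted here.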
Key Takeaways
- Running Kafka on K8S can be a lot of toil without an operator. If you are running Kafka at scale (and not on a managed service), consider running one. It will save you time, money & sanity
- You can choose an operator based on your environment, features (or the lack thereof), licensing and other specialized needs
- YMMV with operator CRs. Each operator has its own opinions, based on the realities it was designed for
- Kafka is ultimately not “k8s native”. The operator only provides so much operational sugar
- As a result, there are several shoehorning mechanisms (such as built-in config overrides to inject component properties); full expressivity of the workload doesn’t quite exist
- All operators provide comparable performance
Thank you
hello@platformatory.com
www.platformatory.io

A Primer Towards Running Kafka on Top of Kubernetes.pdf
