Centralizing Kubernetes and Container
Operations
Oleg Chunikhin | CTO, Kublr
Introductions
Oleg Chunikhin
CTO, Kublr
• Nearly 20 years in the field of software
architecture and development.
• Joined Kublr as the CTO in 2016.
• Kublr is an enterprise Kubernetes management and
operations platform that helps accelerate Kubernetes
adoption and containerized applications management for
enterprises.
History
• Custom software development company
• Dozens of projects per year
• Varying target environments: clouds, on-prem, hybrid
• Unified application delivery and ops platform wanted:
monitoring, logs, security, multiple env, ...
Docker and Kubernetes to the Rescue
• Docker is great, but local
• Kubernetes is great... when it is up and running
• Who sets up and operates K8S clusters?
• Who takes care of operational aspects at scale?
• How do you provide governance and ensure
compliance?
Enterprise Kubernetes Needs
Developers
• Self-service
• Compatible
• Conformant
• Configurable
• Open & Flexible
SRE/Ops/DevOps/SecOps
• Org multi-tenancy
• Single pane of glass
• Operations
  • Monitoring
  • Log collection
  • Image management
  • Identity management
• Security
• Reliability
• Performance
• Portability
Kubernetes Management Platform Wanted
• Portability – clouds, on-prem, air-gapped, different OSes
• Centralized multi-cluster operations save resources – many
environments (dev, prod, QA, ...), teams, applications
• Self-service and governance for Kubernetes operations
• Reliability – cluster self-healing, self-reliance
• Limited management profile – cloud and K8S API
• Architecture – flexible, open, pluggable, compatible
• Sturdy – secure, scalable, modular, HA, DR etc.
Central Control Plane: Operations
[Diagram: a central control plane (API, UI, operations, log collection, monitoring, audit, image repo, authn and authz with SSO and federation, infrastructure management, backup & DR) manages K8S clusters (Dev, Prod, PoC) in cloud(s) and data centers through the K8S API and cloud API.]
Cluster: Self-Sufficiency
[Diagram: infrastructure automation, driven by the central control plane, provisions MASTER and NODE machines. Each machine runs Docker, the kubelet, and the Kublr agent (a simple orchestration and configuration agent), plus overlay network, discovery, and connectivity components. Masters run the K8s master components: etcd, scheduler, API, controller. Nodes run infrastructure and application containers. Secrets and discovery data are shared through the orchestration store.]
Cluster: Portability
• (Almost) everything runs in containers
• Simple (single-binary) management agent
• Minimal store requirements
  • Shared, eventually consistent
  • Secure: RW files for masters, RO for nodes
  • Thus the store can be anything:
    S3, SA, NFS, rsynced dir, provided files, ...
• Minimal infra automation requirements
  • Configure and run configuration agent
  • Enable access to the store
  • Can be AWS CF, Azure ARM, BOSH,
    Ansible, ...
• Load balancer is not required for multi-master;
each agent can independently fail over to a healthy
master
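The last bullet (agent-side failover instead of a load balancer) can be illustrated with a minimal sketch. The function name, the endpoint list, and the injected `probe` callable are assumptions made for illustration; the slides do not show how the Kublr agent actually implements this.

```python
from typing import Callable, Iterable, Optional


def first_healthy_master(
    endpoints: Iterable[str],
    probe: Callable[[str], bool],
) -> Optional[str]:
    """Return the first endpoint whose probe succeeds, or None if none do.

    `probe` is a hypothetical health check supplied by the caller, for
    example an HTTPS request to the API server's health endpoint.
    """
    for endpoint in endpoints:
        try:
            if probe(endpoint):
                return endpoint
        except Exception:
            # Any probe error counts as "unhealthy"; try the next master.
            continue
    return None


if __name__ == "__main__":
    # Hypothetical master addresses and health states for the demo.
    health = {
        "https://master-0:443": False,
        "https://master-1:443": True,
        "https://master-2:443": True,
    }
    print(first_healthy_master(health.keys(), lambda e: health[e]))
```

Each node runs this selection on its own, so no shared load balancer sits in front of the masters; the trade-off is that every agent needs the current list of master addresses, which in this design would come from the store.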
Cluster: Reliability
• Rely on underlying platform as much as
possible
  • ASG on AWS
  • IAM on AWS for store access
  • SA on Azure, S3 on AWS
  • ARM on Azure, CF on AWS
• Minimal infrastructure SLA:
tolerate temporary failures
• Multi-master API failover on nodes
• Resource management, memory requests
and limits for OS and k8s components
Central Control Plane: Logs and Metrics
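[Diagram: the central control plane (API, UI, operations, authn and authz with SSO and federation, image repo, infrastructure management, backup & DR) manages K8S clusters (Dev, Prod, PoC) in cloud(s) and data centers through the K8S API and cloud API; this slide singles out log collection, monitoring, and audit.]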
Centralized Monitoring and Log Collection.
Why Bother?
• Prometheus and ELK are heavy and not easy to operate;
they need attention and at least 4-8 GB of RAM... each, per cluster
• Cloud/SaaS monitoring is not always permitted or available
• Existing monitoring is often not container-aware
• No aggregated view and analysis
• No alerting governance
K8S Monitoring with Prometheus
• Discover nodes, services, pods
via K8S API
• Query metrics from discovered
endpoints
• Endpoints are accessed directly
via internal cluster addresses
[Diagram: inside a Kubernetes cluster, Prometheus discovers nodes, pods, and services through the K8S API, collects their metrics, and serves them to Grafana.]
Centralized Monitoring
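[Diagram: in the control plane, a configurator reads the cluster registry and maintains the Prometheus config; the central Prometheus (with Grafana) reaches each Kubernetes cluster's Prometheus collector, nodes, pods, and service endpoints through the K8S proxy API. Prometheus data can also be shipped externally from both the cluster and the control plane.]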
Centralized Monitoring: Considerations
• Prometheus resource usage tuning
• Long-term storage (M3)
• Configuration file growth with many clusters
• Metrics labeling
• Additional load on API server
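The configurator's job, and the "configuration file growth" and "metrics labeling" points above, can be sketched as follows. This is an illustration under stated assumptions, not Kublr's implementation: it assumes the central Prometheus pulls from each cluster's collector through Prometheus federation via the Kubernetes API service proxy, and that the collector is a service named `prometheus` on port 9090 in a `monitoring` namespace. The registry entries and the function name are hypothetical; TLS and authorization settings are omitted.

```python
import json
from typing import Dict, List


def federation_scrape_configs(clusters: List[Dict[str, str]]) -> List[dict]:
    """Build one Prometheus scrape job per registered cluster.

    Each entry in `clusters` is a hypothetical registry record with a
    `name` and an `api_host` (host:port of that cluster's API server).
    """
    jobs = []
    for cluster in sorted(clusters, key=lambda c: c["name"]):
        jobs.append({
            "job_name": f"federate-{cluster['name']}",
            "scheme": "https",
            # Keep the labels assigned by the in-cluster collector.
            "honor_labels": True,
            # Kubernetes API service proxy path to the collector's
            # federation endpoint (namespace and service are assumed).
            "metrics_path": (
                "/api/v1/namespaces/monitoring/services/"
                "prometheus:9090/proxy/federate"
            ),
            "params": {"match[]": ['{job=~".+"}']},
            "static_configs": [{
                "targets": [cluster["api_host"]],
                # A per-cluster label keeps series from different
                # clusters distinguishable in the central store.
                "labels": {"cluster": cluster["name"]},
            }],
        })
    return jobs


if __name__ == "__main__":
    registry = [
        {"name": "prod", "api_host": "prod.example.com:443"},
        {"name": "dev", "api_host": "dev.example.com:443"},
    ]
    config = {"scrape_configs": federation_scrape_configs(registry)}
    print(json.dumps(config, indent=2))
```

The generated section grows by one job per cluster, which is the growth the considerations warn about, and every request goes through the target cluster's API server, which is the additional load mentioned in the last bullet.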
K8S Logging with Elasticsearch
• Fluentd runs on nodes
• OS, K8s, and container logs
collected and shipped to
Elasticsearch
• Kibana for visualization
[Diagram: within a Kubernetes cluster, logs from pods are shipped to Elasticsearch and viewed in Kibana.]
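Centralized Log Collection
[Diagram: in each Kubernetes cluster, Fluentd filters logs and publishes them over MQTT to a local RabbitMQ, next to the Prometheus collector. A RabbitMQ Shovel or forwarder, connected through K8S proxy API port forwarding, moves the messages to RabbitMQ in the control plane, where Logstash filters and analyzes them before they reach Elasticsearch. A configurator driven by the cluster registry maintains the messaging config. Logs can also be shipped externally from both the cluster and the control plane.]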
Centralized Log Collection: Considerations
• Tune Elasticsearch resource usage
• Take into account additional load on API server
• Log index structure normalization
Nested source document:
{
  "data": {
    "elasticsearch": {
      "version": "6.x"
    }
  }
}
Normalized (flattened) form:
{
  "flatData": [
    {
      "key": "elasticsearch.version",
      "type": "string",
      "key_type": "elasticsearch.version.string",
      "value_string": "6.x"
    },
    ...
  ]
}
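The normalization shown in the two JSON documents can be sketched as a small recursive function. The output fields (`key`, `type`, `key_type`, `value_string`) come from the slide; the function name and the type names other than `string` (`boolean`, `long`, `double`) are assumptions for illustration. Presumably the point of this shape is that arbitrary log fields from many clusters then share one fixed index mapping instead of each adding a new field or clashing on type.

```python
from typing import Any, Dict, List, Tuple


def _typed(value: Any) -> Tuple[str, Any]:
    """Map a Python value to a (type name, stored value) pair."""
    if isinstance(value, bool):  # bool first: bool is a subclass of int
        return "boolean", value
    if isinstance(value, int):
        return "long", value
    if isinstance(value, float):
        return "double", value
    if isinstance(value, str):
        return "string", value
    # Anything else (lists, None, ...) is stored as its string form.
    return "string", str(value)


def flatten(data: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Turn a nested dict into a list of typed key/value entries."""
    entries: List[Dict[str, Any]] = []
    for name, value in data.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            entries.extend(flatten(value, key))
        else:
            type_name, stored = _typed(value)
            entries.append({
                "key": key,
                "type": type_name,
                "key_type": f"{key}.{type_name}",
                f"value_{type_name}": stored,
            })
    return entries


if __name__ == "__main__":
    doc = {"data": {"elasticsearch": {"version": "6.x"}}}
    print({"flatData": flatten(doc["data"])})
```

Applied to the `data` object from the slide, `flatten` yields the single `elasticsearch.version` entry shown in the normalized form.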
The Rest: Considerations
• Identity management
Use an identity broker (e.g. Keycloak): users, authn, authz, SSO, RBAC, federation, ...
• Backup and disaster recovery
K8s metadata + app data/volumes: full cluster recovery or copy
• Docker image management
Docker image registry (e.g. Nexus, Artifactory, Docker Hub);
image scanning;
air-gapped or isolated environments: image registry proxying and caching,
“system” images
Q&A
Oleg Chunikhin
Chief Technology Officer
oleg@kublr.com
@olgch
Kublr | kublr.com
@kublr
Thank you!



Editor's Notes

  • #4: Where the project comes from
    • Company overview
    • Kubernetes as a solution: a standardized delivery platform
    • Kubernetes is great for managing containers, but who manages Kubernetes?
    • How to streamline monitoring and log collection across multiple Kubernetes clusters?
  • #7: Requirements
    • Portability: support for cloud environments, on-prem deployments, and isolated deployments
    • Multi-cluster operations support
    • Centralized log collection and monitoring
    • Reliability: self-healing, modularity, cluster self-reliance
    • Limited connectivity profile: do not require many ports
    • Architecture: flexible, open, pluggable
    • Security
  • #10:
    • The control plane is only critically involved in the cluster when the cluster is created.
    • The control plane uses cloud-specific infrastructure management automation frameworks: CF, ARM, BOSH, VMware, etc.
    • After the cluster infrastructure is created and configured, the cluster does not need the control plane.
    • Self-coordination happens via the orchestration store. The orchestration store and the underlying platform are the only coordination devices the cluster needs to operate and recover from failures.
    • Masters and nodes are configured for orchestration store access.
    • Master(s) will try to get secrets and discovery information from the store; if none is available, they will generate and publish a new set.
    • With multiple masters, the latest published package wins.
    • Nodes will take the latest published data and use it.
  • #14: Prometheus
  • #15: Prometheus
  • #16:
    • Control plane keeps track of managed clusters
    • Configurator reconfigures Prometheus when the cluster list changes
    • Prometheus configuration is stored in K8S config maps
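The store-based self-coordination described in note #10 (masters publish secrets and discovery data, the latest package wins, nodes read the latest) can be modelled with a short sketch. Everything here is hypothetical: the `Store` class is an in-memory stand-in for S3, a storage account, or NFS, and the package fields and function names are assumptions, not Kublr's actual format.

```python
import secrets
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Package:
    """A published set of secrets and discovery data (fields assumed)."""
    version: int
    token: str
    master_addresses: List[str]


@dataclass
class Store:
    """In-memory stand-in for the shared orchestration store."""
    packages: List[Package] = field(default_factory=list)

    def latest(self) -> Optional[Package]:
        # "The latest published package wins."
        return max(self.packages, key=lambda p: p.version, default=None)

    def publish(self, token: str, master_addresses: List[str]) -> Package:
        current = self.latest()
        version = current.version + 1 if current else 1
        package = Package(version, token, list(master_addresses))
        self.packages.append(package)
        return package


def master_bootstrap(store: Store, my_address: str) -> Package:
    """Use the published package, or generate and publish a new one."""
    package = store.latest()
    if package is None:
        package = store.publish(secrets.token_hex(16), [my_address])
    return package


def node_bootstrap(store: Store) -> Optional[Package]:
    """Nodes only read; None means no master has published yet."""
    return store.latest()


if __name__ == "__main__":
    store = Store()
    first = master_bootstrap(store, "10.0.0.1")
    second = master_bootstrap(store, "10.0.0.2")
    print(first.version, second.token == first.token)
```

A real store is eventually consistent, so two masters starting at the same moment may both publish; resolving that by "latest wins" is what lets the cluster converge without the control plane being involved.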