Mastering Kubernetes: A Comprehensive Guide to Cluster Architecture, Upgrades, and Maintenance
Kubernetes is much more than just a container orchestrator — it’s a robust platform that transforms the way you deploy, manage, and scale applications in the cloud. Whether you’re new to Kubernetes or looking to refine your cluster management skills, this guide will walk you through the core architectural components, essential tools, and practical strategies for maintenance, upgrades, and backup/restore operations.
Kubernetes Architecture Overview
At its core, Kubernetes is built on a robust architecture that separates cluster management from workload execution. Here’s a closer look at the key components:
Control Plane Components
Control Plane Nodes: The control plane is the brain of your cluster. It can span multiple servers for high availability and is ideally run on dedicated controller machines, where it makes the global decisions that keep the cluster in its desired state.
Kube-API Server: This component provides the Kubernetes APIs — the primary interface for all cluster interactions. When you issue commands using kubectl, they go through the API server.
Etcd: etcd is the consistent, highly available key-value store that holds the entire state of the cluster. Every change (creating pods, updating services, etc.) is recorded in etcd, making it the cornerstone of cluster state management.
Kube-Scheduler: Responsible for the scheduling process, the kube-scheduler examines the available nodes and assigns each unscheduled pod to a suitable node based on resource availability and scheduling policies.
Kube-Controller-Manager: Think of this as the “catch-all” component — it runs a collection of controllers that continuously monitor the cluster state and handle routine tasks (e.g., node management, replication, endpoint management).
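On a cluster bootstrapped with kubeadm, these control plane components typically run as static pods in the kube-system namespace. A quick way to see them in such a cluster (assuming you have kubectl access) is:
kubectl get pods -n kube-system -o wide
# typical entries include kube-apiserver-<node>, etcd-<node>, kube-scheduler-<node>, and kube-controller-manager-<node>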
Worker Nodes
Worker nodes are where your containers actually run. They include several critical components:
Kubelet: This is an agent running on each worker node. It communicates with the control plane to ensure that the containers are running and healthy.
Container Runtime: While not part of Kubernetes itself, the container runtime (such as containerd, Docker, or CRI-O) is essential for running containerized applications on the worker nodes.
Kube-Proxy: Acting as a network proxy, kube-proxy manages networking rules on each node to enable smooth communication between pods and services.
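You can inspect these pieces directly: the node summary from the API shows the kubelet version and container runtime for each worker, and the kubelet itself runs as a service on the node. The commands below are a minimal sketch:
kubectl get nodes -o wide        # shows status, kubelet version, and container runtime per node
sudo systemctl status kubelet    # run on the worker itself to check the kubelet agent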
Etcd Design Patterns
Kubernetes offers flexibility in how etcd is deployed:
Stacked etcd: In this design, etcd runs on the same nodes as the control plane components. It simplifies management but can consume additional resources.
External etcd: Here, etcd is deployed on separate servers, isolating the data store from the control plane. This is beneficial for larger clusters that require high availability and scalability.
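If you opt for external etcd, kubeadm can be pointed at the existing etcd cluster through its ClusterConfiguration. The sketch below is illustrative only; the endpoint address and certificate paths are placeholders you would replace with your own values:
cat <<EOF > kubeadm-config.yaml
apiVersion: kubeadm.k8s.io/v1beta3
kind: ClusterConfiguration
etcd:
  external:
    endpoints:
      - https://10.0.1.101:2379
    caFile: /etc/kubernetes/pki/etcd/ca.crt
    certFile: /etc/kubernetes/pki/apiserver-etcd-client.crt
    keyFile: /etc/kubernetes/pki/apiserver-etcd-client.key
EOF
sudo kubeadm init --config kubeadm-config.yaml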
Essential Kubernetes Tools
Kubernetes is supported by a rich ecosystem of tools that simplify cluster management, configuration, and deployment:
kubectl: The official command-line interface for interacting with your cluster. Most day-to-day work with Kubernetes, whether it's deploying applications or troubleshooting, is done through kubectl.
kubeadm: A tool that streamlines the process of creating and configuring a Kubernetes cluster. It’s the go-to solution for bootstrapping a production-grade cluster.
Minikube: A tool that runs a local, typically single-node Kubernetes cluster designed for development and testing. It allows you to experiment with Kubernetes on your own machine.
Helm: A package manager for Kubernetes that bundles complex configurations into reusable charts and templates, making deployments easier and more consistent.
Kompose: For those transitioning from Docker, Kompose converts Docker Compose files into Kubernetes objects, easing the migration process.
Kustomize: A configuration management tool that enables you to customize raw, template-free YAML files for different environments. It offers functionality similar to Helm but without templating.
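A few representative commands show how these tools fit into a workflow. The chart, file, and directory names below are placeholders rather than part of any real project, and the Helm example assumes the Bitnami chart repository has already been added:
helm install my-release bitnami/nginx      # deploy a packaged application from a Helm chart
kompose convert -f docker-compose.yaml     # generate Kubernetes manifests from a Compose file
kubectl apply -k overlays/production/      # apply a Kustomize overlay (Kustomize is built into kubectl)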
Node Management and Maintenance
Managing nodes effectively is essential for a healthy Kubernetes cluster. Here are some key concepts:
Draining Nodes
During maintenance, you might need to remove a node from service. Draining evicts the pods on that node so they can be gracefully terminated and rescheduled on other nodes with minimal disruption. Use:
kubectl drain <node_name> --ignore-daemonsets
The --ignore-daemonsets flag ensures that daemonset-managed pods (which are tied to the node) are skipped.
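After the drain completes, you can confirm that scheduling is disabled on the node and that its workloads have moved elsewhere:
kubectl get nodes                          # the drained node should show SchedulingDisabled
kubectl get pods --all-namespaces -o wide  # only daemonset pods should remain on that node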
Uncordoning Nodes
Once maintenance is complete, you can bring a node back into service using uncordon:
kubectl uncordon <node_name>
This command allows the node to start receiving new pods again.
Upgrading Kubernetes with kubeadm
Upgrading Kubernetes in production requires a careful, node-by-node approach to minimize downtime. Here’s a high-level process:
Upgrading the Control Plane
Drain the Control Plane Node:
kubectl drain control-node --ignore-daemonsets
Update and Install kubeadm:
sudo apt-get update
sudo apt-get install -y --allow-change-held-packages kubeadm=1.27.2-00
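Before planning the upgrade, you can confirm that the expected kubeadm version is now installed:
kubeadm version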
Plan the Upgrade:
sudo kubeadm upgrade plan v1.27.2
Apply the Upgrade:
sudo kubeadm upgrade apply v1.27.2
Update kubelet and kubectl:
sudo apt-get update
sudo apt-get install -y --allow-change-held-packages kubelet=1.27.2-00 kubectl=1.27.2-00
sudo systemctl daemon-reload
sudo systemctl restart kubelet
Uncordon the Control Plane Node:
kubectl uncordon control-node
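At this point it is worth confirming that the control plane node reports the new version and is Ready again:
kubectl get nodes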
Upgrading Worker Nodes
Drain the Worker Node:
kubectl drain workernode1 --ignore-daemonsets --force
Update kubeadm on the Worker:
sudo apt-get update
sudo apt-get install -y --allow-change-held-packages kubeadm=1.27.2-00
sudo kubeadm upgrade node
Update kubelet and kubectl on the Worker:
sudo apt-get update
sudo apt-get install -y --allow-change-held-packages kubelet=1.27.2-00 kubectl=1.27.2-00
sudo systemctl daemon-reload
sudo systemctl restart kubelet
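Finally, as with the control plane node, uncordon the worker (from a machine with kubectl access) and confirm that it rejoins the cluster at the new version:
kubectl uncordon workernode1
kubectl get nodes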
Backing Up and Restoring etcd Data
Since etcd is the backbone of your Kubernetes cluster, regular backups are crucial.
Backing Up etcd
Use the following command to create a snapshot backup:
ETCDCTL_API=3 etcdctl snapshot save /home/cloud_user/etcd_backup.db \
--endpoints=https://10.0.1.101:2379 \
--cacert=/home/cloud_user/etcd-certs/etcd-ca.pem \
--cert=/home/cloud_user/etcd-certs/etcd-server.crt \
--key=/home/cloud_user/etcd-certs/etcd-server.key
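It is worth sanity-checking the snapshot before relying on it; etcdctl can report its hash, revision, and size:
ETCDCTL_API=3 etcdctl snapshot status /home/cloud_user/etcd_backup.db --write-out=table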
Restoring etcd
Stop etcd:
sudo systemctl stop etcd
Remove Existing Data:
sudo rm -rf /var/lib/etcd
Restore the Snapshot:
sudo ETCDCTL_API=3 etcdctl snapshot restore /home/cloud_user/etcd_backup.db \
--initial-cluster etcd-restore=https://10.0.1.101:2380 \
--initial-advertise-peer-urls https://10.0.1.101:2380 \
--name etcd-restore \
--data-dir /var/lib/etcd
Adjust Ownership:
sudo chown -R etcd:etcd /var/lib/etcd
Restart etcd:
sudo systemctl start etcd
In the above commands, --endpoints points etcdctl at the etcd client URL, while --cacert, --cert, and --key supply the TLS credentials needed to authenticate. For the restore, --name sets the name of the restored member, --data-dir is where the restored data is written, and --initial-cluster together with --initial-advertise-peer-urls defines the peer configuration for the new single-member cluster. The IP addresses and file paths shown are examples; substitute the values from your own environment.
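Once etcd is running again, a quick health check against the same endpoint and certificates used for the backup (assuming the restored member is configured with the same client URL) confirms that it is serving requests:
ETCDCTL_API=3 etcdctl endpoint health \
  --endpoints=https://10.0.1.101:2379 \
  --cacert=/home/cloud_user/etcd-certs/etcd-ca.pem \
  --cert=/home/cloud_user/etcd-certs/etcd-server.crt \
  --key=/home/cloud_user/etcd-certs/etcd-server.key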