Kubernetes HA
Montreal Kubernetes Meetup
October 12
Hello, my name is Alexandre
@alex_gervais
alexgervais
AppDirect background
- Chef provisioning
- CentOS 7
- Multiple deployments
  - AWS
  - On-premise
- Automation, automation, automation!
  - Packer
  - Terraform
Although it is easy to deploy and make your applications and micro-services highly
available within a Kubernetes cluster, Kubernetes masters are not HA in typical
setups.
It requires a little more work, but not that much…
Here’s the 3-step program.
0. Single master
1. etcd clustering
$ curl https://discovery.etcd.io/new?size=3
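The `size=3` in the discovery request is easier to justify with the quorum arithmetic written out. The following is a minimal sketch, assuming etcd's Raft majority rule (a write needs more than half of the members); the function names are illustrative and not part of any etcd API.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with `members` voting members."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def tolerated_failures(members: int) -> int:
    """Members that can be lost while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for size in (1, 2, 3, 4, 5):
        print(f"size={size} quorum={quorum(size)} "
              f"tolerates={tolerated_failures(size)}")
```

Under this rule a 3-member cluster survives the loss of one member, while a 4-member cluster survives no more than a 3-member one, which is why odd sizes are the usual choice.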
2. Master election
podmaster and hyperkube
On every master node:
/etc/kubernetes/manifests/podmaster.yaml
gcr.io/google_containers/podmaster:1.1
/srv/kubernetes/kube-controller-manager.yaml
gcr.io/google_containers/hyperkube:1.4.0
/srv/kubernetes/kube-scheduler.yaml
gcr.io/google_containers/hyperkube:1.4.0
On the elected node:
The podmaster will copy kube-controller-manager.yaml and
kube-scheduler.yaml to /etc/kubernetes/manifests and kubelet picks
them up!
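The election-then-copy behaviour can be modelled in a few lines. This is only a sketch under stated assumptions, not the podmaster source: `FakeEtcd` is a hypothetical in-memory stand-in for etcd, the `/podmaster/...` key and the `master-N` node names are made up, temporary directories replace `/srv/kubernetes` and `/etc/kubernetes/manifests`, and key expiry (TTL) is left out entirely.

```python
import shutil
import tempfile
from pathlib import Path
from typing import Dict, List, Optional


class FakeEtcd:
    """In-memory stand-in for etcd offering only compare-and-swap and get."""

    def __init__(self) -> None:
        self._data: Dict[str, str] = {}

    def compare_and_swap(self, key: str, expected: Optional[str], new: str) -> bool:
        """Set `key` to `new` only if its current value equals `expected`."""
        if self._data.get(key) != expected:
            return False
        self._data[key] = new
        return True

    def get(self, key: str) -> Optional[str]:
        return self._data.get(key)


def sync_manifest(store: FakeEtcd, component: str, node: str,
                  src_dir: Path, manifests_dir: Path) -> bool:
    """Try to win the election for `component`, then add or remove its manifest."""
    key = f"/podmaster/{component}"          # hypothetical key layout
    store.compare_and_swap(key, None, node)  # succeeds only if the key is unset
    is_master = store.get(key) == node
    target = manifests_dir / f"{component}.yaml"
    if is_master:
        shutil.copy(src_dir / f"{component}.yaml", target)
    elif target.exists():
        target.unlink()
    return is_master


def run_demo() -> Dict[str, List[str]]:
    """Run one election round on three simulated masters; list manifests per node."""
    store = FakeEtcd()
    seen: Dict[str, List[str]] = {}
    with tempfile.TemporaryDirectory() as tmp:
        for node in ("master-1", "master-2", "master-3"):
            src = Path(tmp) / node / "srv"
            manifests = Path(tmp) / node / "manifests"
            src.mkdir(parents=True)
            manifests.mkdir(parents=True)
            (src / "kube-scheduler.yaml").write_text("# placeholder manifest\n")
            sync_manifest(store, "kube-scheduler", node, src, manifests)
            seen[node] = sorted(p.name for p in manifests.iterdir())
    return seen


if __name__ == "__main__":
    print(run_demo())
```

Only the first node to set the key ends up with `kube-scheduler.yaml` in its manifests directory; the other two keep the file parked in their source directory, which mirrors the slide's split between "every master node" and "the elected node".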
Disclaimer
Since Kubernetes 1.2
--leader-elect
--apiserver-count=3
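What `--leader-elect` buys can be pictured as a lease that one replica keeps renewing while the others wait for it to expire. The sketch below is a simplified in-memory model with a fake integer clock, not the Kubernetes implementation; the `Lease` class, the 3-tick TTL and the `master-N` names are assumptions chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Lease:
    holder: Optional[str] = None
    expires_at: float = 0.0


def try_acquire_or_renew(lease: Lease, candidate: str, now: float, ttl: float) -> bool:
    """Return True if `candidate` holds the lease after this attempt."""
    if lease.holder == candidate or now >= lease.expires_at:
        lease.holder = candidate
        lease.expires_at = now + ttl
        return True
    return False


def simulate(replicas: List[str], crash_at: Dict[str, int],
             ttl: float = 3.0, steps: int = 8) -> List[Optional[str]]:
    """Record which replica (if any) runs its control loop at each tick."""
    lease = Lease()
    active: List[Optional[str]] = []
    for tick in range(steps):
        leader = None
        for name in replicas:
            if name in crash_at and tick >= crash_at[name]:
                continue  # this replica has crashed and no longer renews
            if try_acquire_or_renew(lease, name, float(tick), ttl):
                leader = name
        active.append(leader)
    return active


if __name__ == "__main__":
    timeline = simulate(["master-1", "master-2", "master-3"], {"master-1": 3})
    for tick, leader in enumerate(timeline):
        print(tick, leader)
```

In this model all three replicas stay running, but at most one executes its loop per tick; after `master-1` stops renewing there is a short gap until the lease expires and `master-2` takes over.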
3. API load balancing
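Because every kube-apiserver instance can serve requests, the balancer only has to rotate over the instances that pass a health check. Here is a minimal round-robin sketch; the backend addresses are the three master IPs from the deck, while the `is_healthy` callback and the set of "down" backends are hypothetical. A real deployment would use an actual load balancer rather than code like this.

```python
from itertools import cycle
from typing import Callable, Iterator, List


def healthy_round_robin(backends: List[str],
                        is_healthy: Callable[[str], bool]) -> Iterator[str]:
    """Yield backends in rotation, skipping any that fail the health check."""
    ring = cycle(backends)
    while True:
        for _ in range(len(backends)):
            candidate = next(ring)
            if is_healthy(candidate):
                yield candidate
                break
        else:
            # A full pass over the ring found nothing healthy.
            raise RuntimeError("no healthy kube-apiserver backend")


if __name__ == "__main__":
    masters = ["172.31.29.97:6443", "172.31.52.169:6443", "172.31.7.176:6443"]
    down = {"172.31.52.169:6443"}  # pretend this master is being upgraded
    balancer = healthy_round_robin(masters, lambda b: b not in down)
    for _ in range(4):
        print(next(balancer))
```

With one master marked down, requests alternate between the two remaining ones, so clients pointed at the balancer never notice the missing instance.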
🎉
$ kubectl get po --namespace=kube-system -o wide
NAME                                                   READY   STATUS    RESTARTS   AGE   IP              NODE
kube-addon-manager-ip-172-31-29-97.ec2.internal        1/1     Running   1          40d   172.31.29.97    ip-172-31-29-97.ec2.internal
kube-controller-manager-ip-172-31-29-97.ec2.internal   1/1     Running   1          40d   172.31.29.97    ip-172-31-29-97.ec2.internal
kube-dns-v19-5ut0y                                     3/3     Running   3          40d   10.0.55.2       ip-172-31-51-130.ec2.internal
kube-dns-v19-srphp                                     3/3     Running   0          13d   10.0.50.5       ip-172-31-46-232.ec2.internal
kube-dns-v19-tf5u6                                     3/3     Running   1          33d   10.0.20.3       ip-172-31-29-97.ec2.internal
kube-scheduler-ip-172-31-29-97.ec2.internal            1/1     Running   1          40d   172.31.29.97    ip-172-31-29-97.ec2.internal
kubernetes-dashboard-v1.1.0-zta4y                      1/1     Running   0          40d   10.0.55.5       ip-172-31-51-130.ec2.internal
podmaster-ip-172-31-29-97.ec2.internal                 3/3     Running   3          40d   172.31.29.97    ip-172-31-29-97.ec2.internal
podmaster-ip-172-31-52-169.ec2.internal                3/3     Running   6          33d   172.31.52.169   ip-172-31-52-169.ec2.internal
podmaster-ip-172-31-7-176.ec2.internal                 3/3     Running   3          40d   172.31.7.176    ip-172-31-7-176.ec2.internal
$ kubectl get ep
NAME         ENDPOINTS                                                AGE
kubernetes   172.31.29.97:6443,172.31.52.169:6443,172.31.7.176:6443   40d
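A quick sanity check on the `kubectl get ep` output is to count the addresses behind the `kubernetes` endpoint and compare the count with the number of masters. The sketch below parses the plain-text table shown above; it assumes whitespace-separated columns with the address list in the second column, and in practice a structured output format such as `-o json` would be a more robust input.

```python
from typing import Dict, List

SAMPLE = """\
NAME ENDPOINTS AGE
kubernetes 172.31.29.97:6443,172.31.52.169:6443,172.31.7.176:6443 40d
"""


def parse_endpoints(output: str) -> Dict[str, List[str]]:
    """Map each endpoint name to its list of 'ip:port' strings."""
    rows = [line for line in output.strip().splitlines() if line.strip()]
    result: Dict[str, List[str]] = {}
    for row in rows[1:]:  # skip the header row
        fields = row.split()
        result[fields[0]] = fields[1].split(",")
    return result


if __name__ == "__main__":
    apiservers = parse_endpoints(SAMPLE)["kubernetes"]
    print(len(apiservers), "apiserver endpoints:", apiservers)
```

For the output on this slide the count is 3, matching `--apiserver-count=3`.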
Cluster-wide upgrades
- Chef(ing)
  - Rolling upgrades of existing nodes
- Terraform(ing)
  - Replace nodes, one-by-one
- Datadog monitoring
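The one-node-at-a-time idea behind both upgrade paths can be sketched as a loop that halts as soon as a node fails its health check. This is an illustrative model only: the `upgrade` callback stands in for whatever Chef or Terraform actually does, and the node names and the older version string are hypothetical.

```python
from typing import Callable, Dict, List


def rolling_upgrade(nodes: Dict[str, str], target: str,
                    upgrade: Callable[[str], bool]) -> List[str]:
    """Upgrade nodes one at a time; stop at the first unhealthy result."""
    upgraded: List[str] = []
    for name in sorted(nodes):
        if nodes[name] == target:
            continue  # already on the target version
        if not upgrade(name):
            raise RuntimeError(f"{name} unhealthy after upgrade; halting rollout")
        nodes[name] = target
        upgraded.append(name)
    return upgraded


if __name__ == "__main__":
    cluster = {"master-1": "1.3.7", "master-2": "1.4.0", "master-3": "1.3.7"}
    done = rolling_upgrade(cluster, "1.4.0", upgrade=lambda name: True)
    print("upgraded:", done)
    print("cluster:", cluster)
```

Halting on the first failure keeps at most one master out of service at any moment, which is the property that makes an upgrade on a live HA cluster tolerable.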
References
- etcd clustering
https://coreos.com/etcd/docs/latest/clustering.html
- hyperkube
https://github.com/kubernetes/kubernetes/tree/master/cluster/images/hyperkube
- Master node deployments
https://coreos.com/kubernetes/docs/latest/deploy-master.html
- Kubernetes HA recipe
http://kubernetes.io/docs/admin/high-availability/
AppDirect Shameless Plug

Kubernetes HA @ AppDirect - Montreal Kubernetes Meetup


Editor's Notes

  • #2: Welcome to this talk on setting up a highly available Kubernetes cluster. This is not a beginner’s talk, so I assume you know what Kubernetes can do for you, and hopefully you have already scheduled some pods in your own cluster or in minikube.
  • #3: Backend software engineer, turned full-stack software developer, turned DevOps. AppDirect is a unicorn tech startup based in SF. AppDirect’s mission has always been to help people find, buy and use software. White-label marketplace -- think app store or Shopify for the cloud. As developers, we started our container infrastructure a while ago, and it led us to Kubernetes.
  • #4: The existing Ops team of sysadmins had constraints... On-prem: SoftLayer, OpenStack, bare metal. Launching a new cluster takes roughly 10 minutes. We still call our worker nodes “minions”.
  • #5: Even if the master dies, your application/service survives… the running containers on minions won’t disappear! Losing the master just makes it less reliable to update your deployment, scale or orchestrate in case of cluster-wide failures.
  • #6: 3 dependent services and 5 Kubernetes processes/components. For us, these all run under systemd supervision. Kubelet, kube-proxy and kube-apiserver are stateless -- YAY! But kube-scheduler and kube-controller-manager are not… we would not want the scheduler to “double create” or “double destroy” a running pod because of a race condition… we will need to figure out a way around this.
  • #7: Etcd is the underlying Kubernetes datastore. Etcd is meant to be clustered, so it’s easy to bootstrap with etcd’s built-in discovery. There are many more ways to cluster your etcd store.
  • #8: Kubelet has a “manifest” mechanism, which loads any pod definition from a specific folder on the host, independently of the apiserver, scheduler and controller-manager. Every master node has a podmaster manifest, so we can expect 3 podmaster pods. Each podmaster pod runs 2 containers. Each of those containers is responsible for the election of either kube-scheduler or kube-controller-manager. The election is achieved using the underlying etcd store’s “CompareAndSwap” functionality.
  • #9: Podmaster does the election. Hyperkube is released for every version and bundles the Kubernetes binaries. All elections are independent; kube-scheduler could win the election on the first node and kube-controller-manager could win the election on the second node.
  • #10: A new “leader-elect” flag was added to controller-manager and scheduler. Although it went pretty much undocumented, the flag allows leader election through the kube-apiserver without the need for podmaster. Using this flag allows 3 controller-managers or schedulers to run in parallel, with only a single execution of the logic loop at any given time. Also, kube-apiserver added the “apiserver-count” flag so that all 3 of our masters are available in the DNS-resolvable “kubernetes” endpoint.
  • #11: kube-apiserver is active-active-active. Every client of kube-apiserver must go through load balancing.
  • #12: Here we see our podmaster running on each master node. The controller-manager and scheduler are scheduled on a single master. We also did the same with the newly added addon-manager. kube-apiserver and etcd could also run as Docker processes instead of under systemd; we just chose not to. Master nodes are also “cordoned”, so no pod is scheduled on these nodes except for manifests. This allows us to run Kubernetes master components on cheaper hardware.
  • #13: Now that we have achieved HA and are resilient to failure, let’s put it to good use… like live cluster upgrades. Run chef-client on existing master nodes to bring them up to date. Since it’s HA, we don’t mind losing 1 master’s processes during the upgrade. Just like `kubectl rolling-update`, we spawn new minions from a pre-baked AMI into the cluster and destroy the old ones.
  • #15: We are recruiting! Whether you are a frontend or backend developer, are passionate about security or do performance testing, if you are a 10x talent, we have a place for you!