Efficient Kubernetes scaling using Karpenter
EFFICIENT
KUBERNETES SCALING
USING KARPENTER_
Marko Bevc
ABOUT
ME_
● Head of Consultancy at The Scale Factory (B2B SaaS consultancy, AWS Advanced Consulting Partner and K8s service provider)
● Ops background, wearing different hats, engaged with many different technologies
● Open source contributor, maintainer and supporter
● HashiCorp Ambassador, OpenUK Ambassador
● Certifications and competencies: AWS, CKA, RHEL, HCTA
● Fan of automation/simplifying things, hiking and travelling
@_MarkoB | https://www.linkedin.com/in/marko-bevc/ | @marko@hachyderm.io
KUBERNETES
SCALING_
• No autoscaling out of the box: scaling is manual 👩‍💻👨‍💻
• Kubernetes resources:
– Pods: the smallest execution unit
– Nodes: compute/instances to run Pods on
– Other: storage, network, etc.
HPA
CONCEPT_
• Horizontal Pod Autoscaler
• Adds more instances (e.g. Pods)
• Doesn’t apply to non-scalable objects (e.g. DaemonSet)
• Targets observed metrics (e.g. average CPU or memory utilization)
• Scaling out
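The scaling-out decision above can be sketched with the replica formula from the Kubernetes HPA documentation: desired = ceil(current replicas × current metric / target metric), skipped when the ratio is within a tolerance of 1.0. The function name and the example numbers are illustrative assumptions; the 0.1 tolerance is the documented default.

```python
import math


def desired_replicas(current_replicas: int, current_metric: float,
                     target_metric: float, tolerance: float = 0.1) -> int:
    """Replica count suggested by the documented HPA formula."""
    ratio = current_metric / target_metric
    # No scaling when the observed metric is close enough to the target.
    if abs(ratio - 1.0) <= tolerance:
        return current_replicas
    return math.ceil(current_replicas * ratio)


if __name__ == "__main__":
    # 4 Pods averaging 90% CPU against a 60% target.
    print(desired_replicas(4, current_metric=90, target_metric=60))
```

The real controller also applies stabilisation windows and min/max replica bounds, which this sketch leaves out.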
VPA
CONCEPT_
• Vertical Pod Autoscaler
• Adjusts size/power (e.g. resource requests/limits)
• “Right-sizing” your workloads to actual usage
• Most commonly used on Deployment objects
• Scaling up
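As a rough illustration of right-sizing, the sketch below derives a CPU request from observed usage by taking a high percentile and adding headroom. This is a deliberate simplification and not the VPA recommender's actual algorithm; the function name, the 90th percentile and the 15% margin are all assumptions chosen for the example.

```python
import math


def recommend_request(usage_samples, percentile=90, margin_percent=15):
    """Suggest a request (same unit as the samples, e.g. millicores)."""
    if not usage_samples:
        raise ValueError("need at least one usage sample")
    ordered = sorted(usage_samples)
    # Nearest-rank percentile: smallest sample covering `percentile` % of data.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    # Add headroom on top of the chosen sample and round up.
    return math.ceil(ordered[rank - 1] * (100 + margin_percent) / 100)


if __name__ == "__main__":
    cpu_millicores = [100, 110, 120, 130, 140, 150, 160, 170, 180, 400]
    print(recommend_request(cpu_millicores))
```

Using a percentile rather than the maximum keeps a single spike (400m here) from inflating the request.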
PODS
SCALING_
• Other approaches:
– HPA | VPA* (HorizontalPodAutoscaler | VerticalPodAutoscaler)
– GCP: MultidimPodAutoscaler
– KEDA (Kubernetes Event-driven Autoscaling)
– Knative (K8s-based serverless platform)
CLUSTER
AUTOSCALER_
• De facto industry standard for cluster auto-scaling
• Cost efficiency: automatically adjusts the cluster size (scale up/down)
• Leans on existing Cloud building blocks
• Challenges: Node Group limitations (AZ, instance type, labels), complex to use, tightly bound to the scheduler, global controller
CLUSTER
AUTOSCALER
SCALE-UP_
● Reconciliation and filtering
● Scale up (in-memory simulation, <10sec)
● Expanders: random, most-pods, least-waste, price, priority
● Scale down (<10min)
[Diagram: Pending Pods → New Nodes, 10 sec]
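An expander picks which Node Group to grow once the simulation has produced several viable options. The sketch below shows the idea for two strategies; the `Option` type, its fields and the sample values are hypothetical, and the real least-waste expander weighs both CPU and memory rather than the single fraction used here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    node_group: str
    pods_scheduled: int          # pending Pods that would fit after scale-up
    wasted_fraction: float       # share of new capacity left idle (assumed metric)


def most_pods(options):
    """Prefer the Node Group that schedules the most pending Pods."""
    return max(options, key=lambda o: o.pods_scheduled)


def least_waste(options):
    """Prefer the Node Group that leaves the least idle capacity."""
    return min(options, key=lambda o: o.wasted_fraction)


if __name__ == "__main__":
    candidates = [
        Option("general", 3, 0.40),
        Option("compute", 5, 0.55),
        Option("memory", 4, 0.10),
    ]
    print(most_pods(candidates).node_group, least_waste(candidates).node_group)
```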
NODE
SCHEDULING_
[Diagram sequence: unscheduled Pods on the Kubernetes Control Plane; Node scheduling with CA; Node scheduling with Karpenter. Diagram label: size, arch, GPU, etc.]
KARPENTER
ARCHITECTURE_
[Architecture diagram; source: https://karpenter.sh]
KEY
CONCEPTS_
• Straightforward setup:
– Provision AWS IAM Roles for Service Accounts (IRSA)
– Install controllers (leader-elected HA)
– Apply Provisioner CRD (configuration): one or more!
– Deploy workloads
• Capacity life-cycle loop: watch → evaluate → provision → remove
• Well-known labels as Provisioner constraints:
– kubernetes.io/arch = amd64
– kubernetes.io/os = linux
– node.kubernetes.io/instance-type = m5.large
– topology.kubernetes.io/zone = eu-west-1a
– karpenter.sh/capacity-type = on-demand | spot
• Multi-dimension scaling (up/down and in/out)!
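To make the Provisioner constraints concrete, here is a minimal sketch that builds a Provisioner manifest as a Python dict and checks whether a Node's labels satisfy it. It assumes the `karpenter.sh/v1alpha5` Provisioner schema (`spec.requirements` with `key`, `operator`, `values`), so check the Karpenter docs for your version; the helper names are hypothetical, and only the `In` operator is handled.

```python
import json


def build_provisioner(name, requirements):
    """Build a Provisioner manifest from {label: allowed values}."""
    return {
        "apiVersion": "karpenter.sh/v1alpha5",  # assumed API version
        "kind": "Provisioner",
        "metadata": {"name": name},
        "spec": {
            "requirements": [
                {"key": key, "operator": "In", "values": sorted(values)}
                for key, values in requirements.items()
            ],
        },
    }


def node_satisfies(provisioner, node_labels):
    """True when every 'In' requirement is met by the Node's labels."""
    for req in provisioner["spec"]["requirements"]:
        if req["operator"] == "In" and node_labels.get(req["key"]) not in req["values"]:
            return False
    return True


if __name__ == "__main__":
    provisioner = build_provisioner("default", {
        "kubernetes.io/arch": ["amd64"],
        "karpenter.sh/capacity-type": ["on-demand", "spot"],
    })
    # kubectl accepts JSON manifests as well as YAML.
    print(json.dumps(provisioner, indent=2))
```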
SCALING
UP_
• Provisioning and scaling
• Adding just-in-time capacity to meet demand
• Early binding to nodes
• Scheduling constraints: resources.requests, nodeAffinity, nodeSelector, PodDisruptionBudget, topologySpreadConstraints, inter-pod (anti-)affinity
• Removing tight coupling to the scheduler
[Diagram: Pending Pods → New Node, <10 sec]
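Just-in-time provisioning boils down to picking capacity that fits the pending Pods' resource requests. The sketch below chooses the cheapest instance type that fits all pending Pods on one Node; the catalogue, its names and its prices are made up for illustration, and real provisioning also accounts for daemonsets, reserved resources and the other scheduling constraints listed above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InstanceType:
    name: str
    cpu_millis: int
    memory_mib: int
    hourly_price: float  # made-up price, for illustration only


def cheapest_fit(pods, catalog):
    """pods: list of (cpu_millis, memory_mib) requests. Returns a type or None."""
    total_cpu = sum(cpu for cpu, _ in pods)
    total_mem = sum(mem for _, mem in pods)
    fitting = [i for i in catalog
               if i.cpu_millis >= total_cpu and i.memory_mib >= total_mem]
    return min(fitting, key=lambda i: i.hourly_price, default=None)


if __name__ == "__main__":
    catalog = [
        InstanceType("small", 2000, 4096, 0.05),
        InstanceType("medium", 4000, 8192, 0.10),
        InstanceType("large", 8000, 16384, 0.20),
    ]
    pending = [(500, 1024), (1500, 2048), (1000, 1024)]
    print(cheapest_fit(pending, catalog))
```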
SCALING
IN_
• Terminating obsolete capacity → reducing costs
• Removing underutilised or empty nodes
• Node TTLs (emptiness & expiration)
• Consolidation
• Interruption
• Drift
[Diagram: Obsolete Node, Pending Pods, <10 sec]
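The two Node TTLs can be sketched as a simple check. The parameter names mirror the `ttlSecondsAfterEmpty` and `ttlSecondsUntilExpired` settings of the v1alpha5 Provisioner, which is an assumption about the version in use; the `NodeState` type and the function are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class NodeState:
    age_seconds: int
    empty_for_seconds: Optional[int]  # None while workload Pods still run


def removal_reason(node, ttl_after_empty=None, ttl_until_expired=None):
    """Return 'expired', 'empty', or None when the Node should stay."""
    if ttl_until_expired is not None and node.age_seconds >= ttl_until_expired:
        return "expired"
    if (ttl_after_empty is not None
            and node.empty_for_seconds is not None
            and node.empty_for_seconds >= ttl_after_empty):
        return "empty"
    return None


if __name__ == "__main__":
    node = NodeState(age_seconds=3600, empty_for_seconds=45)
    print(removal_reason(node, ttl_after_empty=30, ttl_until_expired=86400))
```

Expiration is what makes the TTL useful for compliance: Nodes get recycled on a schedule even when they are busy.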
CAPACITY
CONSOLIDATION_
● Consolidation, a.k.a. off-line bin packing
● Rebalancing Node workloads based on utilisation (CPU, memory)
● Mechanisms for cluster consolidation:
– Delete (on-demand | spot)
– Replace (on-demand)
● Optimises for cost while minimising disruption, obeying:
– Scheduling constraints (PDBs, AZ affinity, topology spread constraints)
– Termination grace period and expiration TTL
– Instance unhealthy events and spot events (termination)
● Picks the least disruptive option when multiple Nodes could be consolidated:
– Nodes running fewer Pods
– Nodes that will expire soon
– Nodes with lower priority Pods
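The "delete" mechanism can be sketched as a bin-packing check: a Node is removable when its Pods fit into the spare capacity of the remaining Nodes. This is a simplified illustration, not Karpenter's implementation; it considers CPU only, ignores PDBs and the other constraints above, and orders candidates solely by Pod count. All names are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str
    capacity_millis: int
    pod_requests: list = field(default_factory=list)

    @property
    def free_millis(self):
        return self.capacity_millis - sum(self.pod_requests)


def can_delete(candidate, others):
    """First-fit decreasing: place the candidate's Pods into others' free CPU."""
    free = [n.free_millis for n in others]
    for request in sorted(candidate.pod_requests, reverse=True):
        for i, available in enumerate(free):
            if request <= available:
                free[i] -= request
                break
        else:
            return False  # this Pod fits nowhere
    return True


def pick_node_to_delete(nodes):
    """Try Nodes running fewer Pods first (least disruption)."""
    for candidate in sorted(nodes, key=lambda n: len(n.pod_requests)):
        others = [n for n in nodes if n is not candidate]
        if can_delete(candidate, others):
            return candidate.name
    return None


if __name__ == "__main__":
    cluster = [
        Node("a", 4000, [1000, 1000]),
        Node("b", 4000, [500]),
        Node("c", 4000, [2000, 1500]),
    ]
    print(pick_node_to_delete(cluster))
```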
OTHER
OPTIONS_
● Custom User Data and AMI (e.g. Bottlerocket)
● Kubelet configuration (containerRuntime, systemReserved)
● Taints (or startupTaints)
● Control Pod Density
– Network limitations:
  ● Number of ENIs
  ● Number of IP addresses that can be assigned to each ENI
– Static Pod Density (maxPods)
– Dynamic Pod Density (podsPerCore)
– Limit Pod Density: topology spread, restrict instance types
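The network limit on Pod density follows the formula used for Amazon EKS with the VPC CNI (without prefix delegation): ENIs × (IPv4 addresses per ENI − 1) + 2. The sketch below applies it and then caps the result with `podsPerCore` and `maxPods`; taking the minimum of all three is a simplifying assumption for illustration, and the function names are hypothetical.

```python
def eni_max_pods(enis: int, ips_per_eni: int) -> int:
    """Pods supported by ENI/IP limits: one IP per ENI is the ENI's own."""
    return enis * (ips_per_eni - 1) + 2


def effective_pod_limit(enis, ips_per_eni, vcpus,
                        pods_per_core=None, max_pods=None):
    """Lowest of the network limit and any configured density settings."""
    limits = [eni_max_pods(enis, ips_per_eni)]
    if pods_per_core is not None:
        limits.append(pods_per_core * vcpus)   # dynamic: scales with cores
    if max_pods is not None:
        limits.append(max_pods)                # static: fixed cap
    return min(limits)


if __name__ == "__main__":
    # m5.large: 3 ENIs, 10 IPv4 addresses per ENI, 2 vCPUs.
    print(eni_max_pods(3, 10))
    print(effective_pod_limit(3, 10, vcpus=2, pods_per_core=10))
```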
TIME FOR
A DEMO!_
CONCLUSIONS_
& TAKEAWAYS
● Capacity planning is hard! 🧪
● Key advantages: 🔥
– Flexible, portable, and lowers complexity
– Fast: provisioning latency <1min → down to 15sec (group-less)
– Efficient: multi-dimension scaling, consolidation (delete or replace)
– Adaptive: right-sizing, interruption events
– Compliance (TTL) 📖
● To keep in mind: 🧑‍🏫
– Currently supported provider is AWS (adoption in the future?*)
– Not supporting Spot Rebalance Recommendations
– Careful with non-interruptible workloads, edge case of 1 replica
– https://github.com/aws/karpenter/issues ➡️ ⚒️
FURTHER
READING_
● Resources:
– https://github.com/mbevc1/public-speaking/
– https://github.com/aws/karpenter/
– https://kubernetes.io/docs/reference/labels-annotations-taints/
– https://github.com/kubernetes/autoscaler
– https://docs.aws.amazon.com/eks/latest/userguide/cluster-autoscaler.html
– https://github.com/kubernetes/autoscaler/blob/master/cluster-autoscaler/proposals/scalability_tests.md
– https://blog.kloia.com/karpenter-cluster-autoscaler-76d7f7ec0d0e
– https://blog.scaleway.com/understanding-kubernetes-autoscaling/
– https://aws.amazon.com/blogs/aws/introducing-karpenter-an-open-source-high-performance-kubernetes-cluster-autoscaler/
KEEP IN
TOUCH_
Web: https://www.scalefactory.com/
Twitter: @_MarkoB
GitHub: @mbevc1
GitLab: @mbevc1
LinkedIn: https://www.linkedin.com/in/marko-bevc/
