Introduction
Me @ NU.nl
NU.nl
About
• First Dutch digital news platform.
• Unique visitors:
• 7 mln. / month
• 2.1 mln. / day
• Page hits: ~12 mln / day
• API: ~150k rpm (~2,500 rps)
NU.nl
Sanoma
• Part of Sanoma
• NL: NU.nl, Viva, Libelle, Scoopy
• FI: Helsingin Sanomat
• Reaching ~9.8 mln Dutch people / month
IT organization
Teams
• NU.nl teams
• Web 1 (application / front-end-ish)
• Web 2 (application / back-end-ish / infra)
• Feature 1 & 2 (cross-discipline)
• iOS
• Android
• Sanoma teams
• DevSupport, Mediatool, Content Aggregation
NU.nl
Growing number of teams
• Increased number of parallel workflows
• Testing
• Releasing
• Roadmaps
• Knowing about everything no longer possible
• Aligning ‘procedures by agreement’ increasingly hard
Why Kubernetes?
Current infrastructure
AWS accounts & VPCs
[Diagram: VPC "sanoma": RDS, ElastiCache, ALBs, EC2, CloudFront; applications API, CMS, WWW, XYZ. VPC "nu-test": FOO, K8S. VPC "nu-prod": BAR, K8S.]
Infrastructure provisioning
Terrible (Terraform + Ansible)
terrible plan
terrible apply
terrible ansible
Development workflow
From code to release
• Code
• Automated tests
• Code review
• Manually initiated deploy to test
• Feature test
• Manually initiated deploy to staging
• Exploratory test
• Manually initiated deploy to production
DevOps practices
Solid foundation
• All infra in code
• Terraform
• Terrible providing mechanisms:
• Authorization
• Managing TF state files
DevOps practices
But…
• Setting up additional test environments slow
• Slow feedback loop
• Terraform plan vs apply (surprise surprise, it didn’t work)
• Ansible (~20 minutes)
• Vagrant? (but not fully representative of EC2)
• Config drift
• Hard to nail down every system package version
• EC2 instances having different lifecycle
DevOps practices
But… (part 2)
• No scaling infra*
• Heavily invested in Ansible
• Config & secrets management problematic
• GUIs time consuming
• No change history
• Or highly detached from code history
• No context
• Not overly secret
*Yes, we know it’s 2019
DevOps practices
But… (part 3)
• Current deployment system assumes fixed set of servers
• Possible alternatives include:
• ASG rolling updates (can get slow)
• Pull current application code on start-up (even slower)
• Bake AMI
• Periodically poll for application version to be deployed
• Works quite well
• …as long as new code combined with config doesn’t break.
• So a certain level of orchestration would be needed.
Where to start?
Everything’s connected
Timing
What direction to move?
• DevOps challenges
• Desire to improve delivery process, having true artifacts
• Early 2018
• Containers are a well-established way of ‘packaging’ an application
• Kubernetes getting out of early-adopters phase
• NU.nl (re-)launching a new product: NUjij
Improvement layers
A journey or a destination?
1: Containers as artifacts
• Versatile
• Forces us to do certain things right
• 12factor
• Centralized logging
• Easily moved through a pipeline
• Lots of tooling
Improvement layers
A journey or a destination?
2: A flexible platform to deploy and run containerized applications on
• Tackling challenges at platform level instead of per-application:
• Scaling
• Security updates
• Observability
• Deployment & configuration process
Improvement layers
A journey or a destination?
2: A flexible platform to deploy and run containerized applications on
• Kubernetes
• Rapidly increasing adoption
• Short feedback loop
• Ability to run locally (unlike, say, ECS)
• Easily stamp out deployments for:
• feature testing/demo-ing
• e2e tests
Narrowing the scope
Let’s not get carried away
The goal is not:
• To chop up all of our applications into nano-/micro-services
• They’re not that monolithic anyway
• To put everything in Kubernetes
• Managed AWS services where possible
• Redis, RDS
Focus on agility and efficiency of what we change most frequently: Code
Initial cluster setup
The journey begins
Multiple clusters
By criticality
3 AWS accounts, 3 clusters:
• osc-nu-prod
• production
• osc-nu-test
• test
• staging
• osc-nu-dev
• proofing infra changes
Kops
Why Kops?
• Manages cluster upgrades
• Rolling upgrade
• Draining nodes
• EKS not yet available
• Let alone in eu-west-1
Kops
Gluing together cluster setup and kube-system setup
Kops
Upgrading a cluster
Kops
Templating Terraform and custom vars
Components
kube-system
• Networking
• Calico
• EFS
• previousnext/k8s-aws-efs
• No AZ-restrictions when re-scheduling pods
• Creates new EFS filesystem for each PersistentVolumeClaim
• Security & reliability (isolated IOPS budgets)
• Slow on initial deploy
Components
kube-system
• AWS IAM Authenticator
• The ‘Zalando suite’
• Skipper
• Skipper Daemonset
• kube-ingress-aws-controller Deployment
• ExternalDNS
• Configures PowerDNS (& others) based on ingress host
Components
Zalando skipper
• Skipper Daemonset
• Feature rich (metrics, shadow traffic, blue/green)
• kube-ingress-aws-controller Deployment
• https://github.com/zalando-incubator/kube-ingress-aws-controller
• Sets up & manages ALB
• Finds appropriate ACM certificate
• Supports multiple ACM certificates per ALB
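Together, these components mean a plain Ingress is enough to get an ALB, a certificate, and a DNS record. A minimal sketch (hostname and service name are illustrative, not our actual config):

```yaml
# Illustrative Ingress: skipper routes it, kube-ingress-aws-controller
# provisions the ALB and finds the ACM cert, ExternalDNS creates the record.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: www
spec:
  rules:
  - host: www.example-news-site.nl   # placeholder hostname
    http:
      paths:
      - backend:
          serviceName: www
          servicePort: 80
```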
Components
Autoscaling
• Horizontal Pod Autoscaler
• Scales number of pods based on
(CPU) utilization
• Cluster autoscaler
• Running on master nodes
• Scales ASG out when pods are pending
• Scales ASG in when nodes are underutilized
Components
Logging & metrics
• ELK
• Prometheus / Grafana
Jenkins
Build & Deploy pipeline
Jenkins
Temporary deployment for running tests
• Deploy to temp. namespace
• Jenkins-SU
• Run tests in deployment
• Deploy to test/staging/production
• By bumping image version
• Production: Jenkins-SU
• Clean up temp. namespace
• Jenkins-SU
Jenkins
Jenkins-SU
• Sets up namespace
• Adding RBAC for Jenkins
• Only if ns name matches pattern ‘Jenkins-*’
• Deletes namespace
• Only if ns name matches pattern ‘Jenkins-*’
• Avoids need for Jenkins to be able to delete every namespace
curl -X POST --user "${JENKINS_SU_AUTH}" --data "{\"name\": \"${K8S_BUILD_NS}\"}" http://su.jenkins-su/ns/
curl -X DELETE --user "${JENKINS_SU_AUTH}" --data "{\"name\": \"${K8S_BUILD_NS}\"}" http://su.jenkins-su/ns/
Kubernetes in action
Questions
• Will it be stable?
• Will we be able to operate?
• Should we wait for EKS?
• Do we actually want EKS? What will EKS be like?
Learning from failure
1
No memory limits
Incident 1
Accidentally trying to load an Elasticsearch index of 90 GB
• Misconfigured ElastAlert (trying to read the entire index)
• No memory limit configured
• Required manual intervention: Yes
• Stopping the bleeding:
• Remove ElastAlert
• Permanent fixes:
• Don’t load entire index
• Apply limits
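The ‘apply limits’ fix amounts to giving every container explicit resource requests and limits. A minimal sketch (values are illustrative, not our actual config):

```yaml
# Illustrative container resources for an ElastAlert-style pod.
# With a memory limit set, the pod gets OOMKilled instead of
# starving the whole node.
resources:
  requests:
    memory: "256Mi"
    cpu: "100m"
  limits:
    memory: "512Mi"
    cpu: "500m"
```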
2
No CPU limits
Incident 2
Rapid traffic increase affecting core components
• 2019-03-18 Utrecht shooting
• 11:11 First article published
• 11:56 breaking push
• CPU-burstable pods driving node CPU to 100%
• Core components (kubelet, ingress) suffering
[Diagram: two nodes running kubelet, skipper and pods with 0.4 CPU request / 0.8 CPU limit. At 80% CPU utilization everything is healthy; when bursting pods push a node to 120% utilization, kubelet and skipper are starved and problems start.]
• Required manual intervention: No
• Fixes:
• Reduce the amount of CPU burst allowed to pods (limit closer to request)
• Increase resource requests of skipper
• Mind QoS: Guaranteed, Burstable, Best effort
• Reserve cpu & memory for kubelet
• --kube-reserved
• --system-reserved
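In a Kops cluster spec, reserving capacity for the kubelet and system daemons can be expressed roughly like this (values are illustrative; check the Kops docs for your version):

```yaml
# Illustrative Kops cluster spec fragment: carve out CPU/memory
# so kubelet and system daemons keep running under node pressure.
spec:
  kubelet:
    kubeReserved:
      cpu: "200m"
      memory: "512Mi"
    systemReserved:
      cpu: "100m"
      memory: "256Mi"
```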
3
Memory limits
OOMkiller
Incident 3
Application update increasing memory footprint
• Upgrade including moving from MongoDB 3 to MongoDB 4
• HorizontalPodAutoscaler based on CPU
• Scaling based on CPU not kicking in
• New increased memory footprint causing OOMkilled
• Required manual intervention: Yes
• Stopping the bleeding:
• Increase memory limit of Talk pods
• Permanent fixes:
• Adjust CPU request/limit & HPA thresholds
• Scale on both CPU and memory
• Note: Not all applications ‘give back’ memory
• Set memory limit higher than request to prevent ‘snowball effect’
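Scaling on both CPU and memory can be expressed with an autoscaling/v2beta2 HorizontalPodAutoscaler. A sketch with illustrative names and thresholds:

```yaml
# Illustrative HPA scaling on both CPU and memory utilization;
# deployment name, replica counts and thresholds are examples.
apiVersion: autoscaling/v2beta2
kind: HorizontalPodAutoscaler
metadata:
  name: talk
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: talk
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 75
```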
Incident 3
OOMKilled snowball effect
[Diagram: OOMKilled snowball effect in four steps: a pod exceeding its memory limit is killed; while it restarts, the remaining pods absorb its traffic, exceed their own limits and are killed in turn, cascading across the deployment.]
3
Memory limits
!?
(obligatory this-is-fine meme)
That’s not fine
Is it?
• On the positive side:
• All are a result of (missing) resource limit configuration
• This can be learned
• On the negative side:
• This needs to be learned
• Note: ‘Availability bias’
Improving
Automation
Improving the pipeline
• Automating setting the image version is not enough
• Rolling out Kubernetes manifests still manual task
• Updating configuration & secrets still manual task
• Duplication in manifests between stages
• Not easily seen what parts are different
• Differences intentional or accidental?
• This actually slows us down
• Does git represent the current state?
kubectl -n talk get secrets env -o json | jq -r '.data | map_values(@base64d) | to_entries | .[] | .key + "=" + .value'
Helm
The package manager for Kubernetes
• Charts
• Configured via values
• It’s like Terraform modules
• Or Ansible group_vars
• Leveraging community knowledge and efforts
• E.g. prometheus-operator
• No need to copy charts, able to reference.
• Helm v3
SOPS: Secrets OPerationS
Secrets management stinks, use some sops!
• By Mozilla
• Manage AWS API access, not keys
• Versatile
• YAML, JSON, ENV, INI, binary (plain text)
• Not limited to Kubernetes
• Meaningful diffs
• Alternatives considered:
• Kamus
• Bitnami SealedSecrets
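A SOPS-encrypted file keeps its keys readable, which is what makes diffs meaningful. Roughly what such a file looks like (values shortened, key ID and account are placeholders):

```yaml
# Sketch of a SOPS-encrypted secrets file: keys stay readable,
# only values are encrypted, so git diffs remain meaningful.
db_password: ENC[AES256_GCM,data:Xw3f...,iv:...,tag:...,type:str]
api_key: ENC[AES256_GCM,data:9kQz...,iv:...,tag:...,type:str]
sops:
  kms:
  - arn: arn:aws:kms:eu-west-1:111111111111:key/example-key-id
    created_at: "2019-09-05T00:00:00Z"
  version: 3.4.0
```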
Helmfile
Wiring it together
• Charts
• Referenced from online chart sources or local
• Environments
• Test, staging, production
• Referencing values and secrets
• Releases
• Release name
• Reference to chart
• Values (can be a templated file, using vars and secrets from environment)
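A minimal helmfile.yaml sketching this wiring (paths, environment names and release names are illustrative):

```yaml
# Illustrative helmfile.yaml: environments supply values and
# SOPS-encrypted secrets, releases reference charts and templated values.
environments:
  test:
    values: [envs/test/values.yaml]
    secrets: [envs/test/secrets.yaml]   # decrypted via SOPS
  production:
    values: [envs/production/values.yaml]
    secrets: [envs/production/secrets.yaml]

releases:
- name: talk
  namespace: talk
  chart: ./charts/talk
  values:
  - values/talk.yaml.gotmpl   # templated; can use environment values and env vars
```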
Helmfile
Wiring it together
environment
values
secrets
(SOPS)
release X
release Y
release Z
ENV
values
values
values
Helmfile
Helmfile
Wiring it together
• Advantages:
• Meaningful git diffs
• Easily manage multiple releases in single pipeline, e.g.:
• Everything related to monitoring and logging
• Kube-system
• Declarative definition
• Of what would otherwise be numerous helm args and steps in CI/CD pipeline
Helmfile
Wiring it together
• Advantages (continued):
• Ability to pass in ENV vars
• E.g. build result image tags
• Ability to reference complex charts created by community
• Charts as a building block allows re-use. Example:
• Instead of plain yaml you write a chart
• If fitting workflow, the chart can be a published artifact
• Chart can be re-used e.g. in e2e tests
Helmfile
Wiring it together
• Disadvantages:
• 2 levels of templating
• Chart itself
• Only if writing own charts
• Environment & release values into Helm values
• Template error message not overly clear
• Or even misleading
• At least it breaks
Helmfile
Example
Helmfile
Jenkins
Kubernetes at NU.nl   (Kubernetes meetup 2019-09-05)
Helmfile
Final words
But tiller?
• Helm as a templating engine
• Option: Using Helm 2 ‘Tillerless’
• Tiller outside of cluster, not bypassing RBAC
• Start using Helm as package manager when Helm 3 settles down
• Easy removal of temp. per-feature deploys
• Diffs
Challenge
Auto-scaling
scale fast… scale far…
Auto-scaling
Breaking news push
Auto-scaling
Types of scaling
• Reactive
• Breaking news
• K8S cluster-autoscaler
• Can’t schedule pod? Add nodes.
• Predictive
• Ticket sale start
• Black Friday
Auto-scaling
Types of scaling
• From within cluster
• K8S cluster-autoscaler
• From outside of cluster
• ASG scaling policies
Auto-scaling
Scaling speed
[Chart: node spin-up duration vs. node count at 70% utilization]
Auto-scaling
Times 5 within 5 minutes?
Cluster auto-scaler
Bag of tricks
• Mix predictive and reactive
• Add asg instances without telling cluster-autoscaler
• Traffic expected to arrive by the time cluster-autoscaler starts to scale in, leaving plenty of resources as needed.
• Pause pods
• Lower priority pods that can safely be evicted
• Effectively ‘creating headroom’ in cluster
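The pause-pod trick can be sketched as a low-priority Deployment of do-nothing pods (names, priority value and sizes are illustrative):

```yaml
# Illustrative "headroom" setup: a low-priority class plus a Deployment
# of pause pods that reserve capacity and are evicted as soon as real
# workloads need the room, while cluster-autoscaler adds nodes.
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
spec:
  replicas: 2
  selector:
    matchLabels: {app: overprovisioning}
  template:
    metadata:
      labels: {app: overprovisioning}
    spec:
      priorityClassName: overprovisioning
      containers:
      - name: pause
        image: k8s.gcr.io/pause:3.1
        resources:
          requests: {cpu: "1", memory: "1Gi"}
```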
Considerations
When engaging ‘ludicrous mode’™
Can control-plane handle scale?
• KOPS
• Size master nodes for max. cluster size
• Overhead cost
• EKS
• What’s behind the abstraction?
• ELB 503s exist after all
• Plan: Proof of concepts
Pending
Not the pods…
Consider EKS
Managed control plane
EKS:
• Managed control plane
• Easier: IAM roles for pods (launched 2019-09-04, yesterday)*
• Probably cheaper (2/3 of 3× m4.large)
Kops:
• Total control over setup
• Smooth rolling upgrade process
• No VPC CNI pod density limitations
* https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
EKS IAM roles for pods
Also possible on DIY clusters, officially launched yesterday
• OIDC federation access (OpenID Connect identity provider)
• Assume role via Secure Token Service (STS)
• Projected service account tokens (JWT) in pod
• STS can validate JWT tokens against OIDC provider
• Boils down to:
• Enable/set-up prerequisites in cluster
• Add ServiceAccount having IAM role annotation to pod
• Use recent AWS SDK
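The per-pod wiring boils down to an annotated ServiceAccount; a sketch with placeholder account and role names:

```yaml
# Illustrative IRSA wiring: the ServiceAccount carries the IAM role
# annotation; a projected JWT in the pod is exchanged via STS for
# credentials of this role. ARN and account ID are placeholders.
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: default
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111111111111:role/my-app-role
```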
Multiple clusters per AWS account
Don’t lock ourselves in a corner.
api.<aws-account-name>.<k8s-sanoma-domain>
api.<cluster-name>.<aws-account-name>.<k8s-sanoma-domain>
[Diagram: Route53 zone 1 (<aws-account-name>.<k8s-sanoma-domain>) delegates each <cluster-name> subdomain to Route53 zone 2 via NS records]
CI/CD to separate cluster
Similar flows
• No more taints and tolerations
• Similar authorization mechanism to all deploy targets
• Possibly IAM
• No need for Jenkins-SU
• Clusters should be cattle anyway
Pipelines
GitOps
• Manage namespaces via pipeline:
• kube-system
• monitor
• Creation of application namespaces including RBAC
• Helmfile
System applications
Small improvements
• Prometheus-operator
• PrometheusRule resource type
• Default dashboards
• EFS
• https://github.com/previousnext/k8s-aws-efs
• Current. Works well but not a lot of active development.
• 2 contributors. 46 stars.
• https://github.com/kubernetes-incubator/external-storage
• De facto EFS provisioner. 146 contributors. 1630 stars.
• Bonus: No more time-consuming initial volume set-up
Expand
Increase Return on Investment
• Add more applications
• Facilitate parallel testing & development workflows
• Feature testing
• Mobile app development
• E2e tests
Links
Further reading
Scaling & spot instances:
• https://itnext.io/the-definitive-guide-to-running-ec2-spot-instances-as-kubernetes-worker-nodes-68ef2095e767
EKS:
• https://medium.com/glia-tech/productionproofing-eks-ed52951ffd6c
QoS:
• https://www.replex.io/blog/everything-you-need-to-know-about-kubernetes-quality-of-service-qos-classes
Failure stories:
• https://k8s.af/
Summary
Know your limits
Automate all the things
Everything code
Kubernetes is a journey, not a destination
All should be cattle. No pets allowed!
?
More Related Content

PDF
Innovating faster with SBT, Continuous Delivery, and LXC
PDF
KubeCon 2019 Recap (Parts 1-3)
PPTX
Distributed automation sel_conf_2015
PDF
Five Years of EC2 Distilled
PPTX
London Hashicorp Meetup #22 - Congruent infrastructure @zopa by Ben Coughlan
PPTX
Sas 2015 event_driven
PDF
Xen_and_Rails_deployment
PDF
DCSF19 Container Security: Theory & Practice at Netflix
Innovating faster with SBT, Continuous Delivery, and LXC
KubeCon 2019 Recap (Parts 1-3)
Distributed automation sel_conf_2015
Five Years of EC2 Distilled
London Hashicorp Meetup #22 - Congruent infrastructure @zopa by Ben Coughlan
Sas 2015 event_driven
Xen_and_Rails_deployment
DCSF19 Container Security: Theory & Practice at Netflix

What's hot (15)

PPT
Docker in the Cloud
PDF
How DreamHost builds a Public Cloud with OpenStack
PDF
Mini-Training: Netflix Simian Army
PDF
Exactly-once Semantics in Apache Kafka
PDF
20140708 - Jeremy Edberg: How Netflix Delivers Software
PPTX
Building Micro-Services with Scala
PDF
FunctionalConf '16 Robert Virding Erlang Ecosystem
PDF
HA SOA Application with GlusterFS
PPTX
Kubernetes
PPT
DevOpsCon Cloud Workshop
PPTX
To Build My Own Cloud with Blackjack…
PPTX
Distributed automation selcamp2016
PDF
SaltConf14 - Justin Carmony, Deseret Digital Media - Teaching Devs About DevOps
PPTX
Autoscaled Distributed Automation Expedia Know How
PPTX
All the troubles you get into when setting up a production ready Kubernetes c...
Docker in the Cloud
How DreamHost builds a Public Cloud with OpenStack
Mini-Training: Netflix Simian Army
Exactly-once Semantics in Apache Kafka
20140708 - Jeremy Edberg: How Netflix Delivers Software
Building Micro-Services with Scala
FunctionalConf '16 Robert Virding Erlang Ecosystem
HA SOA Application with GlusterFS
Kubernetes
DevOpsCon Cloud Workshop
To Build My Own Cloud with Blackjack…
Distributed automation selcamp2016
SaltConf14 - Justin Carmony, Deseret Digital Media - Teaching Devs About DevOps
Autoscaled Distributed Automation Expedia Know How
All the troubles you get into when setting up a production ready Kubernetes c...
Ad

Similar to Kubernetes at NU.nl (Kubernetes meetup 2019-09-05) (20)

PDF
Elastic Kubernetes Services (EKS)
PPTX
DevOps with Kubernetes and Helm - OSCON 2018
PPTX
Aks: k8s e azure
PDF
Kubernetes lessons learned
PPTX
DevOps with Kubernetes and Helm
PPTX
Simplify Your Way To Expert Kubernetes Management
PPTX
DevOps with Kubernetes and Helm - Jenkins World Edition
PDF
Evolving for Kubernetes
PDF
Kubecon seattle 2018 recap - Application Deployment aspects
PDF
Deploying on Kubernetes - An intro
PPTX
Introduction+to+Kubernetes-Details-D.pptx
PPTX
Kubernetes 101
PPTX
Kubernetes Manchester - 6th December 2018
PPTX
Why kubernetes matters
PPTX
Working with kubernetes
PPTX
Kubernetes Internals
PDF
Xpdays: Kubernetes CI-CD Frameworks Case Study
PDF
Kubernetes: My BFF
PPTX
Kubernetes PPT.pptx
PDF
Kubernetes Architecture - beyond a black box - Part 1
Elastic Kubernetes Services (EKS)
DevOps with Kubernetes and Helm - OSCON 2018
Aks: k8s e azure
Kubernetes lessons learned
DevOps with Kubernetes and Helm
Simplify Your Way To Expert Kubernetes Management
DevOps with Kubernetes and Helm - Jenkins World Edition
Evolving for Kubernetes
Kubecon seattle 2018 recap - Application Deployment aspects
Deploying on Kubernetes - An intro
Introduction+to+Kubernetes-Details-D.pptx
Kubernetes 101
Kubernetes Manchester - 6th December 2018
Why kubernetes matters
Working with kubernetes
Kubernetes Internals
Xpdays: Kubernetes CI-CD Frameworks Case Study
Kubernetes: My BFF
Kubernetes PPT.pptx
Kubernetes Architecture - beyond a black box - Part 1
Ad

Recently uploaded (20)

PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PPTX
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
PDF
Empathic Computing: Creating Shared Understanding
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
NewMind AI Monthly Chronicles - July 2025
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PPTX
Understanding_Digital_Forensics_Presentation.pptx
PPTX
Cloud computing and distributed systems.
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PPTX
Big Data Technologies - Introduction.pptx
PPTX
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PDF
Machine learning based COVID-19 study performance prediction
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
Advanced methodologies resolving dimensionality complications for autism neur...
KOM of Painting work and Equipment Insulation REV00 update 25-dec.pptx
Empathic Computing: Creating Shared Understanding
“AI and Expert System Decision Support & Business Intelligence Systems”
Building Integrated photovoltaic BIPV_UPV.pdf
NewMind AI Monthly Chronicles - July 2025
Encapsulation_ Review paper, used for researhc scholars
Review of recent advances in non-invasive hemoglobin estimation
Detection-First SIEM: Rule Types, Dashboards, and Threat-Informed Strategy
NewMind AI Weekly Chronicles - August'25 Week I
Understanding_Digital_Forensics_Presentation.pptx
Cloud computing and distributed systems.
Unlocking AI with Model Context Protocol (MCP)
Chapter 3 Spatial Domain Image Processing.pdf
Big Data Technologies - Introduction.pptx
PA Analog/Digital System: The Backbone of Modern Surveillance and Communication
The Rise and Fall of 3GPP – Time for a Sabbatical?
Machine learning based COVID-19 study performance prediction
Diabetes mellitus diagnosis method based random forest with bat algorithm
How UI/UX Design Impacts User Retention in Mobile Apps.pdf

Kubernetes at NU.nl (Kubernetes meetup 2019-09-05)

  • 1. @
  • 4. NU.nl About • First dutch digital news platform. • Unique visitors: • 7 mln. / month • 2.1 mln. / day • Page hits: ~12 mln / day • API: ~150k rpm / 2500rps
  • 5. NU.nl Sanoma • Part of Sanoma • NL: NU.nl, Viva, Libelle, Scoopy • FI: Helsingin Sanomat • Reaching ~9.8 mln dutch people / month
  • 6. IT organization Teams • NU.nl teams • Web 1 (application / front-end-ish) • Web 2 (application / back-end-ish / infra) • Feature 1 & 2 (cross-discipline) • iOS • Android • Sanoma teams • DevSupport, Mediatool, Content Aggregation
  • 7. NU.nl Growing number of teams • Increased number of parallel workflows • Testing • Releasing • Roadmaps • Knowing about everything no longer possible • Aligning ‘procedures by agreement’ increasingly hard
  • 9. Current infrastructure AWS accounts & VPCs VPC sanoma RDS Elasticache ALBs EC2 Cloudfront API CMS WWW XYZ VPC nu-test FOO K8S VPC nu-prod BAR K8S
  • 10. Infrastructure provisioning Terrible (Terraform + Ansible) terrible plan terrible apply terrible ansible
  • 11. Development workflow From code to release • Code • Automated tests • Code review • Manually initiated deploy to test • Feature test • Manually initiated deploy to staging • Exploratory test • Manually initiated deploy to production
  • 12. DevOps practices Solid foundation • All infra in code • Terraform • Terrible providing mechanisms: • Authorization • Managing TF state files
  • 13. DevOps practices But… • Setting up additional test environments slow • Slow feedback loop • Terraform plan vs apply (surprise surprise, it didn’t work) • Ansible (~20 minutes) • Vagrant? (but not fully representative of EC2) • Config drift • Hard to nail down every system package version • EC2 instances having different lifecycle
  • 14. DevOps practices But… (part 2) • No scaling infra* • Heavily invested in Ansible • Config & secrets management problematic • GUIs time consuming • No change history • Or highly detached from code history • No context • Not overly secret *Yes, we know it’s 2019
  • 15. DevOps practices But… (part 3) • Current deployment system assumes fixed set of servers • Possible alternatives include: • ASG rolling updates (can get slow) • Pull current application code on start-up (even slower) • Bake AMI • Periodically poll for application version to be deployed • Works quite well • …as long as new code combined with config doesn’t break. • So a certain level of orchestration would be needed.
  • 17. Timing What direction to move? • DevOps challenges • Desire to improve delivery process, having true artifacts • Early 2018 • Containers are a well-established way of ‘packaging’ an application • Kubernetes getting out of early-adopters phase • NU.nl (re-)launching a new product: NUjij
  • 18. Improvement layers A journey or a destination? 1: Containers as artifacts • Versatile • Forces us to do certain things right • 12factor • Centralized logging • Easily moved through a pipeline • Lots of tooling
  • 19. Improvement layers A journey or a destination? 2: A flexible platform to deploy and run containerized applications on • Tackling challenges at platform level instead of per-application: • Scaling • Security updates • Observability • Deployment & configuration process
  • 20. Improvement layers A journey or a destination? 2: A flexible platform to deploy and run containerized applications on • Kubernetes • Rapidly increasing adoption • Short feedback loop • Ability to run locally (unlike, say, ECS) • Easily stamp out deployments for: • feature testing/demo-ing • e2e tests
  • 21. Narrowing the scope Lets not get carried away The goal is not: • To chop up change all of our applications into nano- micro-services • They’re not that monolithic anyway • To put everything in Kubernetes • Managed AWS services where possible • Redis, RDS Focus on agility and efficiency of what we change most frequently: Code
  • 22. Initial cluster setup The journey begins
  • 23. Multiple clusters By criticality 3 AWS accounts, 3 clusters: • osc-nu-prod • production • osc-nu-test • test • staging • osc-nu-dev • proofing infra changes
  • 24. Kops Why Kops? • Manages cluster upgrades • Rolling upgrade • Draining nodes • EKS not yet available • Let alone in eu-west-1
  • 25. Kops Glueing together cluster setup and kube-system setup
  • 29. Components kube-system • Networking • Calico • EFS • previousnext/k8s-aws-efs • No AZ-restrictions when re-scheduling pods • Creates new EFS filesystem for each PersistentVolumeClaim • Security & reliability (isolated IOPs budgets) • Slow on initial deploy
  • 30. Components kube-system • AWS IAM Authenticator • The ‘Zalando suite’ • Skipper • Skipper Daemonset • kube-ingress-aws-controller Deployment • ExternalDNS • Configures PowerDNS (& others) based on ingress host
  • 31. Components Zalando skipper • Skipper Daemonset • Feature rich (metrics, shadow traffic, blue/green) • kube-ingress-aws-controller Deployment • https://github.com/zalando-incubator/kube- ingress-aws-controller • Sets up & manages ALB • Finds appropriate ACM certificate • Supports multiple ACM certificates per ALB
  • 32. Components Autoscaling • Horizontal Pod Autoscaler • Scales number of pods based on (CPU) utilization • Cluster autoscaler • Running on master nodes • Scales asg out when pods pending • Scales asg in when nodes underutilized
  • 33. Components Logging & metrics • ELK • Prometheus / Grafana
  • 35. Jenkins Temporary deployment for running tests • Deploy to temp. namespace • Jenkins-SU • Run tests in deployment • Deploy to test/staging/production • By bumping image version • Production: Jenkins-SU • Clean up temp. namespace • Jenkins-SU
  • 36. Jenkins Jenkins-SU • Sets up namespace • Adding RBAC for Jenkins • Only if ns name matches pattern ‘Jenkins-*’ • Deletes namespace • Only if ns name matches pattern ‘Jenkins-*’ • Avoids need for Jenkins to be able to delete every namespace curl -X POST --user ${JENKINS_SU_AUTH} --data '{"name": "${K8S_BUILD_NS}"}' http://su.jenkins-su/ns/ curl -X DELETE --user ${JENKINS_SU_AUTH} --data '{"name": "${K8S_BUILD_NS}"}' http://su.jenkins-su/ns/
  • 38. Kubernetes in action Questions • Will it be stable? • Will we be able to operate? • Should we wait for EKS? • Do we actually want EKS? What will EKS be like?
  • 41. Incident 1 Accidentally trying to load a ElasticSearch index of 90Gb • Misconfigured elast-alert (trying to read entire index) • No memory limit configured
  • 42. Incident 1 Accidentally trying to load a ElasticSearch index of 90Gb • Required manual intervention: Yes • Stopping the bleeding: • Remove elast-alert • Permanent fixes: • Don’t load entire index • Apply limits
  • 44. Incident 2 Rapid traffic increase affecting core components • 2019-03-18 Utrecht shooting • 11:11 First article published • 11:56 breaking push • CPU burstable pods causing node 100% CPU • Core components (kubelet, ingress) suffering
  • 45. Incident 2 Rapid traffic increase affecting core components
  • 46. Incident 2 Rapid traffic increase affecting core components
  • 47. Incident 2 Rapid traffic increase affecting core components
  • 48. Incident 2 Rapid traffic increase affecting core components
  • 49. Incident 2 Rapid traffic increase affecting core components pod pod kubelet skipper node Pods: 0.4 CPU req. 0.8 CPU limit 80% CPU utilization pod kubelet skipper node pod Pods: 0.4 CPU req. 0.8 CPU limit 120% CPU utilization problems
  • 50. Incident 2 Rapid traffic increase affecting core components • Required manual intervention: No • Fixes: • Reduce CPU burstable amount of pods • Increase resource requests of skipper • Mind QoS: Guaranteed, Burstable, Best effort • Reserve cpu & memory for kubelet • --kube-reserved • --system-reserved
  • 53. Incident 3 Application update increasing memory footprint • Upgrade including moving from MongoDB 3 to MongoDB 4 • HorizontalPodAutoscaler based on CPU • Scaling based on CPU not kicking in • New increased memory footprint causing OOMkilled
  • 54. Incident 3 Application update increasing memory footprint
  • 55. Incident 3 Application update increasing memory footprint • Required manual intervention: Yes • Stopping the bleeding: • Increase memory limit of Talk pods • Permanent fixes: • Adjust CPU request/limit & HPA thresholds • Scale on both CPU and memory • Note: Not all applications ‘give back’ memory • Set memory limit higher than request to prevent ‘snowball effect’
  • 56. Incident 3 OOMKilled snowball effect pod pod pod pod pod pod pod pod pod pod starting … 1 2 3 4
  • 58. That’s not fine Is it? • On the positive side: • All are result of (lack of) resource limit configuration • This can be learned • On the negative side: • This needs to be learned • Note: ‘Availability bias’
  • 60. Automation Improving the pipeline • Automating setting the image version is not enough • Rolling out Kubernetes manifests still manual task • Updating configuration & secrets still manual task • Duplication in manifests between stages • Not easily seen what parts are different • Differences intentional or accidental? • This actually slows us down • Does git represent the current state? kubectl -n talk get secrets env -o json |jq -r '.data | map_values(@base64d) | to_entries | .[] | .key + "="" + .value +"""'
  • 61. Helm The package manager for Kubernetes • Charts • Configured via values • It’s like Terraform modules • Or Ansible group_vars • Leveraging community knowledge and efforts • E.g. prometheus-operator • No need to copy charts, able to reference. • Helm v3
  • 62. SOPS: Secrets OPerationS Secrets management stinks, use some sops! • By Mozilla • Manage AWS API access, not keys • Versatile • YAML, JSON, ENV, INI, binary (plain text) • Not limited to Kubernetes • Meaningful diffs • Alternatives considered: • Kamus • Bitnami SealedSecrets
  • 63. Helmfile Wiring it together • Charts • Referenced from online chart sources or local • Environments • Test, staging, production • Referencing values and secrets • Releases • Release name • Reference to chart • Values (can be a templated file, using vars and secrets from environment)
  • 64. Helmfile Wiring it together environment values secrets (SOPS) release X release Y release Z ENV values values values Helmfile
  • 65. Helmfile Wiring it together • Advantages: • Meaningful git diffs • Easily manage multiple releases in single pipeline, e.g.: • Everything related to monitoring and logging • Kube-system • Declarative definition • Of what would otherwise be numerous helm args and steps in CI/CD pipeline
  • 66. Helmfile Wiring it together • Advantages (continued): • Ability to pass in ENV vars • E.g. build result image tags • Ability to reference complex charts created by community • Charts as a building block allows re-use. Example: • Instead of plain yaml you write a chart • If fitting workflow, the chart can be a published artifact • Chart can be re-used e.g. in e2e tests
• 67. Helmfile Wiring it together • Disadvantages: • 2 levels of templating • The chart itself • Only when writing your own charts • Environment & release values into Helm values • Template error messages are not overly clear • Or even misleading • At least it fails loudly
• 73. Helmfile Final words But Tiller? • Helm as a templating engine • Option: using Helm 2 'Tillerless' • Tiller runs outside the cluster, without bypassing RBAC • Start using Helm as a package manager once Helm 3 settles down • Easy removal of temporary per-feature deploys • Diffs
  • 77. Auto-scaling Types of scaling • Reactive • Breaking news • K8S cluster-autoscaler • Can’t schedule pod? Add nodes. • Predictive • Ticket sale start • Black Friday
  • 78. Auto-scaling Types of scaling • From within cluster • K8S cluster-autoscaler • From outside of cluster • ASG scaling policies
• 79. Auto-scaling Scaling speed [chart: node count and utilization over time; with a 70% utilization target, the headroom must cover traffic growth for the duration of a node spin-up]
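The relationship on this slide can be sketched numerically: a utilization target below 100% buys headroom, and that headroom must absorb traffic growth for as long as a new node takes to boot. All numbers below are illustrative assumptions:

```python
# Hedged sketch: how much traffic growth a utilization target can absorb
# while replacement capacity is still spinning up.

def max_traffic_growth(target_utilization: float, spinup_minutes: float) -> float:
    """Max tolerable traffic growth (fraction per minute) such that
    utilization stays below 100% during the node spin-up window."""
    # At a 70% target the cluster can absorb 1/0.7 - 1 ≈ 43% extra traffic
    headroom = 1.0 / target_utilization - 1.0
    # That headroom has to last for the whole spin-up window
    return headroom / spinup_minutes

rate = max_traffic_growth(target_utilization=0.7, spinup_minutes=5.0)
print(f"{rate:.1%} traffic growth per minute")  # → 8.6% traffic growth per minute
```

Faster node spin-up (or a lower utilization target) directly raises the traffic ramp the cluster can survive without predictive scaling.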
• 81. Cluster auto-scaler Bag of tricks • Mix predictive and reactive • Add ASG instances without telling the cluster-autoscaler • Traffic is expected to arrive before the cluster-autoscaler starts scaling the extra nodes back in, so capacity is there when it's needed • Pause pods • Lower-priority pods that can safely be evicted • Effectively 'creating headroom' in the cluster
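The pause-pod trick can be sketched as a low-priority deployment that reserves real resources (names, sizes and replica counts are illustrative):

```yaml
# A negative-priority class so these pods are always evicted first
apiVersion: scheduling.k8s.io/v1
kind: PriorityClass
metadata:
  name: overprovisioning
value: -10
globalDefault: false
---
# Placeholder pods that hold capacity until real workloads need it
apiVersion: apps/v1
kind: Deployment
metadata:
  name: overprovisioning
  namespace: kube-system
spec:
  replicas: 2
  selector:
    matchLabels: {app: overprovisioning}
  template:
    metadata:
      labels: {app: overprovisioning}
    spec:
      priorityClassName: overprovisioning
      containers:
        - name: pause
          image: k8s.gcr.io/pause:3.1       # does nothing, just occupies the request
          resources:
            requests: {cpu: "1", memory: 1Gi}
```

When a real pod arrives, the scheduler evicts a pause pod immediately; the cluster-autoscaler then adds a node to reschedule the evicted pause pod, moving the spin-up wait off the critical path.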
• 82. Considerations When engaging 'ludicrous mode'™ Can the control plane handle the scale? • KOPS • Size master nodes for the max. cluster size • Overhead cost • EKS • What's behind the abstraction? • ELB 503s exist after all • Plan: proofs of concept
• 84. Consider EKS Managed control plane • EKS: • Managed control plane • Easier: IAM roles for pods • Launched 2019-09-04 (yesterday)* • Kops: • Total control over setup • Smooth rolling upgrade process • Probably cheaper (2/3 the cost of 3× m4.large masters) • No VPC CNI pod density limitations • * https://aws.amazon.com/blogs/opensource/introducing-fine-grained-iam-roles-service-accounts/
• 85. EKS IAM roles for pods Also possible on DIY clusters, officially launched yesterday • OIDC federated access (OpenID Connect identity provider) • Assume role via the Security Token Service (STS) • Projected service account tokens (JWT) in the pod • STS validates the JWT tokens against the OIDC provider • Boils down to: • Enable/set up the prerequisites in the cluster • Add a ServiceAccount with an IAM role annotation to the pod • Use a recent AWS SDK
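On the pod side this boils down to a ServiceAccount annotation (account ID, role, and names below are placeholders):

```yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: my-app
  namespace: my-app
  annotations:
    eks.amazonaws.com/role-arn: arn:aws:iam::111111111111:role/my-app
---
# Pod fragment: reference the ServiceAccount; a recent AWS SDK
# automatically exchanges the projected token for STS credentials
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  namespace: my-app
spec:
  serviceAccountName: my-app
  containers:
    - name: app
      image: example/my-app:1.0.0
```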
• 86. Multiple clusters per AWS account Don't paint ourselves into a corner. • api.<aws-account-name>.<k8s-sanoma-domain> → api.<cluster-name>.<aws-account-name>.<k8s-sanoma-domain> • [diagram: one Route53 zone per cluster, delegated from the account-level zone via NS records]
  • 87. CI/CD to separate cluster Similar flows • No more taints and tolerations • Similar authorization mechanism to all deploy targets • Possibly IAM • No need for Jenkins-SU • Clusters should be cattle anyway
  • 88. Pipelines GitOps • Manage namespaces via pipeline: • kube-system • monitor • Creation of application namespaces including RBAC • Helmfile
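Per-application namespace creation with RBAC can be sketched as a manifest applied by the pipeline (names and the group binding are illustrative):

```yaml
apiVersion: v1
kind: Namespace
metadata:
  name: my-app
---
# Let the owning team deploy into its namespace, nothing more
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: my-app-deployers
  namespace: my-app
subjects:
  - kind: Group
    name: my-app-team
    apiGroup: rbac.authorization.k8s.io
roleRef:
  kind: ClusterRole
  name: edit                 # built-in aggregate role
  apiGroup: rbac.authorization.k8s.io
```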
• 89. System applications Small improvements • Prometheus-operator • PrometheusRule resource type • Default dashboards • EFS • https://github.com/previousnext/k8s-aws-efs • Currently used; works well but not a lot of active development • 2 contributors, 46 stars • https://github.com/kubernetes-incubator/external-storage • De facto EFS provisioner. 146 contributors, 1630 stars • Bonus: no more time-consuming initial volume set-up
  • 90. Expand Increase Return on Investment • Add more applications • Facilitate parallel testing & development workflows • Feature testing • Mobile app development • E2e tests
  • 91. Links Further reading Scaling & spot instances: • https://itnext.io/the-definitive-guide-to-running-ec2-spot-instances-as-kubernetes-worker-nodes-68ef2095e767 EKS: • https://medium.com/glia-tech/productionproofing-eks-ed52951ffd6c QoS: • https://www.replex.io/blog/everything-you-need-to-know-about-kubernetes-quality-of-service-qos-classes Failure stories: • https://k8s.af/
  • 93. Know your limits Automate all the things Everything code Kubernetes is a journey, not a destination All should be cattle. No pets allowed!
  • 94. ?