Blue-green & canary deployments
Ivan Kruglov | 13.09.2019
Facts
16 Kubernetes clusters
multitenant
1.5K nodes
700 services
CI/CD
eu-west-1
us-west-1
management cluster
shipper
eu-west-1
us-west-1
1. staging
2. canary
3. full on
application.yaml
• helm chart + values
• cluster selector
• rollout strategy
application.yaml
• capacity management
• traffic management
• staging
• canary
• full-on
example rollout strategy
• 10% capacity
• no traffic
• 10% capacity
• 10% traffic
• 100% capacity
• 100% traffic
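
The example rollout strategy above can be written down as data. The following is a minimal Python sketch, not Shipper's actual configuration format: the `Step` class, its field names, and the helper functions are hypothetical, and only the step names and percentages come from the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Step:
    """One rollout step (hypothetical model, not Shipper's schema)."""
    name: str
    capacity_pct: int  # share of the target replicas run by the new version
    traffic_pct: int   # share of live traffic routed to the new version


# Percentages taken from the "example rollout strategy" slide.
DEFAULT_STRATEGY: List[Step] = [
    Step("staging", capacity_pct=10, traffic_pct=0),
    Step("canary", capacity_pct=10, traffic_pct=10),
    Step("full-on", capacity_pct=100, traffic_pct=100),
]


def new_version_replicas(step: Step, total_replicas: int) -> int:
    """Replicas of the new version at this step, rounded up."""
    return -(-(total_replicas * step.capacity_pct) // 100)


def describe(strategy: List[Step], total_replicas: int) -> List[str]:
    """Human-readable summary of each step for a given replica count."""
    return [
        f"{s.name}: {new_version_replicas(s, total_replicas)} replicas, "
        f"{s.traffic_pct}% traffic"
        for s in strategy
    ]


if __name__ == "__main__":
    for line in describe(DEFAULT_STRATEGY, total_replicas=20):
        print(line)
```

Keeping capacity and traffic as separate fields mirrors the two aspects each step controls: staging brings up capacity without exposing it to users, and canary then opens traffic to capacity that already exists.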
Shipper
https://github.com/bookingcom/shipper
https://docs.shipper-k8s.io
Thank you!
Ivan Kruglov
ivan.kruglov@booking.com

Editor's Notes

  • #2: I came here to talk briefly about how we do blue-green and canary deployments at Booking.com.
  • #3: But before I start, some facts. We have 16 multitenant k8s clusters, which add up to roughly 1,500 nodes and host around 700 services.
  • #4: Due to our scale, we always deploy to multiple regions, and we love advanced deployment strategies because they let us ensure a seamless customer experience while shipping products at a fast pace. However, such strategies are not trivial to manage. One way is to configure CI/CD to deploy to several regions at the same time. In our experience, this proved to be an error-prone approach, because any failure in either CI/CD or a cluster leaves inconsistent versions deployed in different regions.
  • #5: That's why we decided to go a different way. What we wanted was a setup that ensures consistency across clusters but also gives us advanced control over how we roll out changes, which Kubernetes doesn't give by default.
  • #6: So we decided to go with a so-called "management cluster", which hosts a set of k8s controllers that we jointly call "shipper". A user deploys an application spec to the management cluster, and shipper coordinates the deployment of the app to the application clusters. It also continuously reconciles the state to make sure that the applications in all clusters are in sync. On top of that, shipper lets us define deployment steps. For instance, we have staging, canary, and full-on steps. I'll talk a bit more about what these steps are in a second.
  • #7: But first, I would like to show you what an application spec consists of. A spec defines a link to a Helm chart with its customizations, a cluster selector that instructs shipper which regions to deploy to, and finally the rollout steps. Each rollout step controls two aspects: the capacity of the application and the traffic that reaches it.
  • #8: So, for instance, our default strategy consists of three steps. Step #1 is staging. At this step we create a new version of the application but route no traffic to it, so this is your last chance for any final checks. At the next step, canary, we route a portion of live traffic to the newly deployed application. If something goes wrong, we can always roll back. And when we are satisfied with the canary, we go full-on.
  • #9: This is what we do, and the cool thing is that you can do the same, because shipper is an open-source project that we created and keep developing.
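
The continuous reconciliation mentioned in note #6 can be illustrated with a small Python sketch. This is an assumption-laden toy, not Shipper's controller code: the function names, the version strings, and the `deploy` callback are all hypothetical. It only shows the idea of comparing one desired state against what each application cluster reports and correcting any drift.

```python
from typing import Callable, Dict, List, Optional


def out_of_sync(desired: str, observed: Dict[str, Optional[str]]) -> List[str]:
    """Clusters whose reported version differs from the desired one."""
    return sorted(c for c, version in observed.items() if version != desired)


def reconcile(
    desired: str,
    observed: Dict[str, Optional[str]],
    deploy: Callable[[str, str], None],
) -> List[str]:
    """Deploy the desired version to every drifted cluster; return those fixed."""
    fixed = out_of_sync(desired, observed)
    for cluster in fixed:
        deploy(cluster, desired)     # hypothetical deployment callback
        observed[cluster] = desired  # record the new state
    return fixed


if __name__ == "__main__":
    # One region lags behind, e.g. after a failed deployment.
    state: Dict[str, Optional[str]] = {"eu-west-1": "v2", "us-west-1": "v1"}
    reconcile("v2", state, deploy=lambda c, v: print(f"deploying {v} to {c}"))
    print(state)
```

Because the desired version lives in one place and the loop can be run repeatedly, a failed deployment to one region is corrected on a later pass instead of leaving regions on different versions.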