Es-operator
Building an Operator
From the Bottom Up
MIKKEL LARSEN
@mikkeloscar
2019-05-21
$ whoami
Mikkel Larsen
Software Engineer
Cloud Infrastructure (Kubernetes/AWS)
@ Zalando SE
@mikkeloscar
“EUROPE’S LEADING ONLINE FASHION PLATFORM”
WE BRING FASHION TO PEOPLE IN 17 COUNTRIES
● 17 markets
● 7 fulfillment centers
● 26 million active customers
● 5.4 billion € revenue 2018
● 250 million visits per month
● 15,000 employees in Europe
KUBERNETES @ ZALANDO
● ~125 clusters
● ~1400 nodes
● Since Oct 2016
● Node Autoscaling
● From v1.4 to v1.13
● Default Deployment Target
SEARCH @ ZALANDO
● 300k+ Products per country
● ~2000 Brands
● ~700 Categories
● 45% Mobile Traffic
● 12K QPS
● 8K Updates/s
Es-operator: Building an Elasticsearch Operator from the bottom up - kube-con eu 2019
WORKLOAD
~200 instances
EC2
K8S
RUNNING ELASTICSEARCH IN KUBERNETES
1. Safe automatic updates
(Including Kubernetes cluster updates)
2. Advanced auto-scaling for cost efficiency
UPDATING ELASTICSEARCH (STATEFULSET)
[Diagram: ES Pods on several nodes, in the states ready, draining, and terminating]
1) PreStop Hook (bash script)
● Exclude node in ES
● Wait for node to drain (up to 1h)
● Data is moved to existing nodes
2) PostStart Hook (bash script)
● Remove all excludes
● Let ES rebalance from existing nodes
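The two hooks above can be sketched in Python as pure functions that build the request body and check whether a node has drained. This is a minimal sketch, not the bash scripts from the talk. It assumes the node is excluded by IP through the Elasticsearch cluster setting `cluster.routing.allocation.exclude._ip` (sent with `PUT /_cluster/settings`) and that shard placement is read from a `_cat/shards`-style listing. The function names and the sample data are hypothetical.

```python
import json


def exclude_settings(excluded_ips):
    """Body for PUT /_cluster/settings that excludes the given node IPs.

    Assumption: exclusion by IP via cluster.routing.allocation.exclude._ip.
    A null value removes the setting, which is what the PostStart step
    ("remove all excludes") needs.
    """
    value = ",".join(sorted(excluded_ips)) if excluded_ips else None
    return {"transient": {"cluster.routing.allocation.exclude._ip": value}}


def is_drained(node_ip, shards):
    """True once no shard in a _cat/shards-style listing sits on node_ip."""
    return all(shard["ip"] != node_ip for shard in shards)


# Hypothetical shard listing: one shard is still on the node being drained.
shards = [
    {"index": "products", "shard": "0", "ip": "10.0.0.1"},
    {"index": "products", "shard": "1", "ip": "10.0.0.2"},
]

print(json.dumps(exclude_settings({"10.0.0.2"})))  # PreStop: exclude the node
print(is_drained("10.0.0.2", shards))              # node still holds a shard
print(json.dumps(exclude_settings(set())))         # PostStart: clear excludes
```

A real hook would poll `is_drained` until it returns true or the one-hour limit from the slide is reached.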
OPERATOR PATTERN
coreos.com/blog/introducing-operators.html
v0: MANAGE STATEFULSET
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: test-cluster
  annotations:
    es-operator/desired-replicas: "3"
spec:
  updateStrategy:
    type: OnDelete
  replicas: 2
  template: # PodTemplate
    {...}
● Complicated to update without changing replicas.
● State must be stored in annotations
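The annotation-based bookkeeping in v0 can be illustrated with a small Python sketch. It assumes the manifest has been parsed into a dict, that `es-operator/desired-replicas` holds the target count, and that `spec.replicas` holds the current one. The helper name `pending_scale` is hypothetical and not taken from es-operator.

```python
ANNOTATION = "es-operator/desired-replicas"


def pending_scale(statefulset):
    """Return (current, desired) replica counts for a parsed StatefulSet.

    Annotation values are always strings in Kubernetes, so the desired
    count has to be parsed. A missing annotation is treated as "no change".
    """
    current = statefulset["spec"]["replicas"]
    annotations = statefulset["metadata"].get("annotations", {})
    desired = int(annotations.get(ANNOTATION, current))
    return current, desired


# The manifest from the slide as a dict (PodTemplate omitted).
sts = {
    "metadata": {
        "name": "test-cluster",
        "annotations": {"es-operator/desired-replicas": "3"},
    },
    "spec": {"updateStrategy": {"type": "OnDelete"}, "replicas": 2},
}

current, desired = pending_scale(sts)
print(f"current={current} desired={desired}")
```

Having to parse operator state out of string annotations like this is one reason the slide calls the approach complicated.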
v1: ELASTICSEARCH DATA SETS
apiVersion: zalando.org/v1
kind: ElasticsearchDataSet
metadata:
  name: test-cluster
spec:
  scaling:
    {...}
  replicas: 3
  template: # PodTemplate
    {...}
  volumeClaimTemplates:
    {...}
github.com/zalando-incubator/es-operator
ELASTICSEARCH DATA SETS
[Diagram: ES Cluster with ES Data pods, ES Master pods, and the ES Operator]
github.com/zalando-incubator/es-operator
UPDATING ELASTICSEARCH (OPERATOR)
[Diagram: ES Operator, ES Service, and ES Pods on several nodes, in the states ready and draining]
1) Scale out by 1
2) Drain node
3) Delete Pod
github.com/zalando-incubator/es-operator
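The three steps above can be sketched as a function that derives the next action from the observed state alone, so that a restarted operator resumes where it left off. This is a minimal Python sketch under stated assumptions, not es-operator's actual code: the pod fields (`version`, `shards`), the action tuples, and the final scale back to the original replica count are hypothetical.

```python
def next_action(pods, replicas, desired_version):
    """Pick the next update step from observed state only.

    pods: list of dicts with "name", "version", and "shards" (shards held).
    replicas: the replica count wanted outside of an update.
    """
    outdated = [p for p in pods if p["version"] != desired_version]
    if not outdated:
        # Update finished. Assumption: remove the extra pod again.
        return ("scale_to", replicas) if len(pods) > replicas else None
    if len(pods) <= replicas:
        return ("scale_to", replicas + 1)   # 1) scale out by 1
    target = outdated[0]
    if target["shards"] > 0:
        return ("drain", target["name"])    # 2) drain node
    return ("delete", target["name"])       # 3) delete pod


pods = [
    {"name": "es-0", "version": "v1", "shards": 2},
    {"name": "es-1", "version": "v1", "shards": 2},
]
print(next_action(pods, replicas=2, desired_version="v2"))
```

Because nothing is remembered between calls, the function returns the same step again if the operator dies before completing it.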
SCALING UP ELASTICSEARCH (1)
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Max # Pod replicas
● Min # Shards per node
[Diagram: ready ES Pods, one per node, each labeled with its shard count (between 1 and 6 shards)]
Increase pod replicas
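The thresholds and boundaries above can be combined into a scale-up decision along these lines. This is a minimal Python sketch, not the operator's algorithm: it assumes one pod is added per step, that the duration threshold is modeled as "every CPU sample in the window is above the threshold", and it leaves out the cooldown. All parameter names are hypothetical.

```python
def scale_up_replicas(current, total_shards, cpu_samples, *,
                      cpu_threshold, max_replicas, min_shards_per_node):
    """Return the pod replica count after one scale-up evaluation."""
    # Threshold + duration: every sample in the window must be above it.
    if not cpu_samples or min(cpu_samples) <= cpu_threshold:
        return current
    # Boundary: the most pods we can run while each still holds at least
    # min_shards_per_node shards.
    most_by_shards = total_shards // min_shards_per_node
    limit = min(max_replicas, most_by_shards)
    # Add one pod, but never exceed the limit and never shrink here.
    return max(current, min(current + 1, limit))


print(scale_up_replicas(2, 12, [80, 85, 90],
                        cpu_threshold=75, max_replicas=6,
                        min_shards_per_node=2))
```

With 12 shards and a minimum of 2 shards per node, this sketch stops adding pods at 6 replicas even if CPU stays high.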
SCALING DOWN ELASTICSEARCH
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Min # Replica
● Max # Shards per node
● Max disk usage (%)
[Diagram: ready ES Pods, one per node, each labeled with its shard count (between 1 and 6 shards)]
Decrease Pod replicas
DON’T OPERATE WHEN CLUSTER IS NOT GREEN!
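The scale-down rules, including the "not green" guard, can be sketched the same way. This is a minimal Python sketch under stated assumptions: one pod is removed per step, the duration threshold means every CPU sample in the window is below the threshold, disk usage is checked against its maximum before scaling rather than projected afterwards, and the cooldown is left out. All parameter names are hypothetical.

```python
import math


def scale_down_replicas(current, total_shards, cpu_samples, health,
                        disk_usage_percent, *, cpu_threshold, min_replicas,
                        max_shards_per_node, max_disk_usage_percent):
    """Return the pod replica count after one scale-down evaluation."""
    if health != "green":
        return current  # never operate on a cluster that is not green
    if not cpu_samples or max(cpu_samples) >= cpu_threshold:
        return current  # CPU was not low for the whole window
    if disk_usage_percent >= max_disk_usage_percent:
        return current  # remaining pods would have no room for the data
    candidate = current - 1
    if candidate < min_replicas:
        return current
    # Shards on the fullest pod if they are spread evenly over fewer pods.
    if math.ceil(total_shards / candidate) > max_shards_per_node:
        return current
    return candidate


limits = dict(cpu_threshold=30, min_replicas=2,
              max_shards_per_node=6, max_disk_usage_percent=75)
print(scale_down_replicas(6, 12, [10, 12, 8], "green", 40, **limits))
print(scale_down_replicas(6, 12, [10, 12, 8], "yellow", 40, **limits))
```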
SCALING UP ELASTICSEARCH (2)
METRICS
Thresholds
● CPU
● Duration
● Cooldown
Boundaries
● Min # Shards per node
● Max # Pod replicas
[Diagram: ready ES Pods, one per node, each labeled with its shard count (between 1 and 3 shards)]
Increase index replicas
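When the minimum shards-per-node boundary blocks further pods, increasing index replicas creates more shard copies to spread. A minimal Python sketch of that choice follows. It is a simplification, not the operator's logic: it assumes a single index whose total shard count is `primaries * (1 + index_replicas)`, and the function and parameter names are hypothetical.

```python
def total_shards(primaries, index_replicas):
    """Primary shards plus one full copy per index replica."""
    return primaries * (1 + index_replicas)


def scale_up_step(pods, primaries, index_replicas, *,
                  max_pods, min_shards_per_node):
    """Return (pods, index_replicas) after one scale-up step."""
    wanted = pods + 1
    if wanted > max_pods:
        return pods, index_replicas
    if total_shards(primaries, index_replicas) // wanted >= min_shards_per_node:
        return wanted, index_replicas      # enough shards: add a pod
    # Too few shards for another pod: add an index replica instead.
    return pods, index_replicas + 1


state = (4, 1)  # 4 pods, 1 index replica, 4 primaries => 8 shards
state = scale_up_step(*state[:1], 4, state[1], max_pods=8, min_shards_per_node=2)
print(state)
state = scale_up_step(state[0], 4, state[1], max_pods=8, min_shards_per_node=2)
print(state)
```

In this example the first step raises index replicas from 1 to 2 (12 shards in total), which lets the second step add a fifth pod.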
SCALING IN PRODUCTION (7d)
SCALING IN PRODUCTION (24h)
LESSONS LEARNED / TAKEAWAYS
● Turn those bash scripts into an operator!
● Assume Operator can die at any point.
● Start simple, add abstractions only when needed.
OPEN SOURCE
Elasticsearch Operator
github.com/zalando-incubator/es-operator
Kubernetes on AWS
github.com/zalando-incubator/kubernetes-on-aws
Postgres Operator
github.com/zalando/postgres-operator
Kubernetes Operator Pythonic Framework (Kopf)
github.com/zalando-incubator/kopf
MIKKEL LARSEN
mikkel.larsen@zalando.de
@mikkeloscar
2019-05-21
THANK YOU!
