TRAINING THE NEXT GENERATION OF EUROPEAN FOG COMPUTING EXPERTS
Container orchestration in
geo-distributed cloud computing platforms
Keynote at HotCloudPerf
April 20th 2021
Mulugeta Ayalew Tamiru, Guillaume Pierre, Johan Tordsson and Erik Elmroth
Elastisys AB & Université de Rennes 1
Geo-distributed cloud platforms
▪ Fault tolerance
▪ Proximity
▪ Resource aggregation
▪ Regulatory compliance
Goal: reliably deploy software across the full platform
▪ Containers everywhere
• To abstract away the heterogeneity of the host hardware and hypervisors
▪ Deploy potentially large numbers of containers
• If necessary: burst to a public cloud
▪ Control container placements
• Manually
• Semi-automatically: “as close as possible to X”
• Automatically: load-balanced across all locations
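The three placement modes listed above can be illustrated with a minimal sketch. Everything in it is hypothetical: the `Cluster` fields, the mode names, and the rule of picking the least-loaded cluster as a simple stand-in for load balancing are assumptions made for illustration, not the mechanism used in the talk.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cluster:
    name: str
    latency_ms_to_x: float  # hypothetical: measured latency to reference point X
    load: float             # hypothetical: fraction of capacity in use (0.0 to 1.0)


def place(clusters: List[Cluster], mode: str, pinned: Optional[str] = None) -> Cluster:
    """Pick one cluster for a new container according to the placement mode."""
    if mode == "manual":
        # The operator names the target cluster explicitly.
        for cluster in clusters:
            if cluster.name == pinned:
                return cluster
        raise ValueError(f"unknown cluster: {pinned!r}")
    if mode == "closest":
        # Semi-automatic: as close as possible to X.
        return min(clusters, key=lambda c: c.latency_ms_to_x)
    if mode == "balanced":
        # Automatic: greedy load balancing, send work to the least-loaded site.
        return min(clusters, key=lambda c: c.load)
    raise ValueError(f"unknown mode: {mode!r}")
```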
Kubernetes Federation (KubeFed)
▪ Resource management and
application deployment on
multiple Kubernetes clusters
(member clusters) from a
single control plane (host
cluster)
▪ BUT: KubeFed was not
specifically designed for
worldwide geo-distribution
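For readers unfamiliar with KubeFed, the sketch below builds a FederatedDeployment manifest as a Python dictionary: an ordinary Deployment wrapped in a `template`, plus a `placement` section naming the member clusters. The application name, namespace, image, and cluster names are placeholders chosen for illustration; they do not come from the talk.

```python
import json
from typing import Any, Dict, List


def federated_deployment(name: str, namespace: str, image: str,
                         replicas: int, member_clusters: List[str]) -> Dict[str, Any]:
    """Build a KubeFed FederatedDeployment manifest (all argument values are placeholders)."""
    labels = {"app": name}
    return {
        "apiVersion": "types.kubefed.io/v1beta1",
        "kind": "FederatedDeployment",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            # The template is a regular Deployment body.
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    "replicas": replicas,
                    "selector": {"matchLabels": labels},
                    "template": {
                        "metadata": {"labels": labels},
                        "spec": {"containers": [{"name": name, "image": image}]},
                    },
                },
            },
            # The host cluster propagates the template to these member clusters.
            "placement": {"clusters": [{"name": c} for c in member_clusters]},
        },
    }


manifest = federated_deployment("nginx", "demo", "nginx", 3, ["cluster1", "cluster2"])
print(json.dumps(manifest, indent=2))
```

Note that the placement is a static list of cluster names: nothing in the manifest expresses distance, latency, or load.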
Experimental setup
▪ 1 host cluster and 5 member clusters running Kubernetes 1.14
▪ Each cluster has one master node and five worker nodes
▪ Host cluster nodes: 4 vCPUs, 16 GB RAM
▪ Member cluster nodes: 4 vCPUs, 4 GB RAM
▪ Workload: a simple nginx web server application
Problem -- Instability
[Figure; axis label: Stability]
Impact of network configuration on stability
Table: Average number of timeout errors per minute (N) and stability (υ) of the uncontrolled system for the three evaluation scenarios.
[Figure annotations, network variability scenario: network delay/packet loss rate increased; network delay/packet loss rate restored]
[Figure annotations, cluster failure scenario: cluster failure; cluster restored]
KubeFed configuration parameters
Parameter                                 Default
Cluster Available Delay                   20s
Cluster Unavailable Delay                 60s
Leader Elect Lease Duration               15s
Leader Elect Renew Deadline               10s
Leader Elect Retry Period                 5s
Cluster Health Check Timeout              3s
Cluster Health Check Period               10s
Cluster Health Check Failure Threshold    3
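The defaults above can be captured in a small configuration object, which also makes the trade-off between stability and failure detection delay easy to explore. The field names are illustrative, and `rough_detection_delay_s` is a back-of-envelope model assumed here (the failure threshold times the check period, plus one timeout); it is not taken from KubeFed's implementation or from the paper.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class KubeFedTimings:
    """Default values from the table above; durations are in seconds."""
    cluster_available_delay_s: float = 20.0
    cluster_unavailable_delay_s: float = 60.0
    leader_elect_lease_duration_s: float = 15.0
    leader_elect_renew_deadline_s: float = 10.0
    leader_elect_retry_period_s: float = 5.0
    health_check_timeout_s: float = 3.0
    health_check_period_s: float = 10.0
    health_check_failure_threshold: int = 3


def rough_detection_delay_s(t: KubeFedTimings) -> float:
    """Assumed simplified model: a cluster is flagged after `threshold`
    consecutive failed checks, one check per period, the last one
    waiting for the full timeout before it counts as failed."""
    return t.health_check_failure_threshold * t.health_check_period_s + t.health_check_timeout_s


defaults = KubeFedTimings()
longer_timeout = replace(defaults, health_check_timeout_s=10.0)
```

Under this assumed model, raising the timeout makes spurious timeouts on slow links less likely but lengthens the time needed to notice a real failure.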
Stability vs. failure detection delay
Solution -- Controller to adjust the Cluster Health Check Timeout (CHCT) at run-time
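A proportional controller for CHCT could look roughly like the sketch below. The slides do not give the controller's input signal, gain, or bounds, so the choice of timeout errors per minute as the measured signal and every numeric value except the 3 s default are assumptions made for illustration.

```python
def next_chct_s(timeouts_per_min: float,
                target_timeouts_per_min: float = 0.0,  # assumed setpoint
                base_chct_s: float = 3.0,              # KubeFed default CHCT
                gain_s_per_timeout: float = 0.5,       # assumed proportional gain
                min_chct_s: float = 3.0,               # assumed lower bound
                max_chct_s: float = 30.0) -> float:    # assumed upper bound
    """Proportional control law: output = bias + gain * error, then clamped."""
    error = timeouts_per_min - target_timeouts_per_min
    proposed = base_chct_s + gain_s_per_timeout * error
    return max(min_chct_s, min(max_chct_s, proposed))
```

With these assumed values, a period with no timeout errors keeps CHCT at its 3 s default, while 4 errors per minute would raise it to 5 s. The clamp keeps failure detection from being delayed without bound.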
Results -- Stationary scenario
Results -- Network variability scenario
[Figure annotations: network delay/packet loss rate increased; network delay/packet loss rate restored]
Results -- Cluster failure scenario
[Figure annotations: cluster failure; cluster restored]
(Temporary) conclusion
▪ We observe significant instability in KubeFed-based
geo-distributed fog platforms due to:
• poor network conditions
• default / static configuration parameters
▪ We designed a proportional controller to adjust CHCT at
run-time
• Improves the system stability from 83–92% with no controller to
99.5–100% using the controller
Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. Instability in Geo-Distributed Kubernetes Federation:
Causes and Mitigation. In Proceedings of IEEE MASCOTS, Nov 2020.
Now that we have fixed the instability problem, is KubeFed ready
to manage large-scale geo-distributed platforms?
Not quite: in KubeFed, any deployment request is pushed to the
requested cluster regardless of the resource availability in that cluster.
Let’s replay 1 hour of the Google cluster trace, assigning each job to one of 5 clusters according to a binomial distribution:
▪ 3 overloaded clusters
▪ 2 mostly idle clusters
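To see why a binomial assignment produces such an imbalance, consider the sketch below. It assumes each job draws a cluster index from Binomial(n=4, p=0.5); the slides do not state the parameters actually used, so `p`, the seed, and the index-to-cluster mapping are illustrative.

```python
import random
from collections import Counter
from math import comb

NUM_CLUSTERS = 5


def binomial_pmf(k: int, n: int, p: float) -> float:
    """Probability that a Binomial(n, p) draw equals k."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def pick_cluster(rng: random.Random, p: float = 0.5) -> int:
    """Draw from Binomial(NUM_CLUSTERS - 1, p): a cluster index in 0..4."""
    return sum(rng.random() < p for _ in range(NUM_CLUSTERS - 1))


def assign_jobs(num_jobs: int, seed: int = 0, p: float = 0.5) -> Counter:
    """Count how many jobs land on each cluster index."""
    rng = random.Random(seed)
    return Counter(pick_cluster(rng, p) for _ in range(num_jobs))


# Expected share of jobs per cluster index under the assumed parameters.
shares = [binomial_pmf(k, NUM_CLUSTERS - 1, 0.5) for k in range(NUM_CLUSTERS)]
```

Under these assumptions the three middle clusters receive 25%, 37.5%, and 25% of the jobs, while the two outer clusters receive 6.25% each.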
Problems to address
▪ Make sure applications are not deployed in overloaded clusters
• Even if this requires choosing another cluster automatically…
▪ Support application autoscaling in multi-cluster environments
• Vary the number of replicas within a single cluster…
• … or across multiple clusters
▪ Allow the system to burst out to a public cloud in case of resource overload
• And release public-cloud resources as early as possible
▪ Integrate seamlessly with existing KubeFed platforms
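The first and third requirements above can be sketched as a resource-aware placement decision: honour the requested cluster if it has room, otherwise fall back to another cluster, and burst to the public cloud only when nothing fits. The data model, the tie-breaking rule (most free CPU), and the `public-cloud` label are hypothetical; this is not presented as the scheduler described in the talk.

```python
from dataclasses import dataclass
from typing import List

PUBLIC_CLOUD = "public-cloud"  # hypothetical label for the burst target


@dataclass(frozen=True)
class ClusterState:
    name: str
    free_cpu: float     # hypothetical metric: free CPU cores
    free_mem_gb: float  # hypothetical metric: free memory in GB


def choose_cluster(requested: str, clusters: List[ClusterState],
                   cpu: float, mem_gb: float) -> str:
    """Return the name of the cluster that should host the deployment."""

    def fits(c: ClusterState) -> bool:
        return c.free_cpu >= cpu and c.free_mem_gb >= mem_gb

    # 1. Honour the requested cluster when it has enough free resources.
    for c in clusters:
        if c.name == requested and fits(c):
            return c.name
    # 2. Otherwise pick another cluster automatically (most free CPU first).
    candidates = [c for c in clusters if fits(c)]
    if candidates:
        return max(candidates, key=lambda c: c.free_cpu).name
    # 3. No cluster has room: burst out to the public cloud.
    return PUBLIC_CLOUD
```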
Deploy mcd-app-1 across the two clusters that receive the most network traffic
Make sure end-user requests are distributed across both clusters
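Selecting the clusters that receive the most traffic amounts to a top-k ranking, sketched below. The traffic figures, the cluster names, and the idea of weighting requests in proportion to observed traffic are assumptions made for illustration; the slides only state that requests are distributed across both clusters.

```python
from typing import Dict, List


def top_traffic_clusters(traffic: Dict[str, float], k: int = 2) -> List[str]:
    """Return the k cluster names with the highest observed traffic."""
    return sorted(traffic, key=lambda name: traffic[name], reverse=True)[:k]


def request_weights(traffic: Dict[str, float], selected: List[str]) -> Dict[str, float]:
    """Split end-user requests across the selected clusters, proportionally to traffic."""
    total = sum(traffic[name] for name in selected)
    if total == 0:
        # No traffic observed yet: fall back to an even split.
        return {name: 1.0 / len(selected) for name in selected}
    return {name: traffic[name] / total for name in selected}


# Hypothetical traffic measurements (arbitrary units) for five member clusters.
traffic = {"cluster1": 120.0, "cluster2": 300.0, "cluster3": 80.0,
           "cluster4": 500.0, "cluster5": 10.0}
selected = top_traffic_clusters(traffic)
weights = request_weights(traffic, selected)
```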
Autoscale the application deployment
to maintain reasonable CPU usage
Dynamically provision more resources
from the public cloud if necessary
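One way to picture CPU-driven autoscaling with cloud bursting is sketched below. `desired_replicas` uses the scaling rule of the standard Kubernetes Horizontal Pod Autoscaler; whether the system shown in the talk uses exactly this rule is not stated in the slides. The per-cluster capacities and the `public-cloud` label are hypothetical.

```python
from math import ceil
from typing import Dict


def desired_replicas(current_replicas: int, current_cpu_pct: float,
                     target_cpu_pct: float) -> int:
    """Kubernetes HPA rule: ceil(currentReplicas * currentMetric / targetMetric)."""
    return max(1, ceil(current_replicas * current_cpu_pct / target_cpu_pct))


def split_replicas(desired: int, capacities: Dict[str, int]) -> Dict[str, int]:
    """Fill member clusters in order; replicas that do not fit go to the public cloud."""
    placement: Dict[str, int] = {}
    remaining = desired
    for name, capacity in capacities.items():
        take = min(capacity, remaining)
        if take > 0:
            placement[name] = take
        remaining -= take
    if remaining > 0:
        # Hypothetical burst target, provisioned only when the clusters are full.
        placement["public-cloud"] = remaining
    return placement
```

For example, 4 replicas running at 90% CPU against a 60% target give 6 desired replicas; if the two selected clusters can host 3 and 2 replicas, the sixth is placed in the public cloud and can be released once the load drops.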
Conclusion
Geo-distributed Kubernetes federations are now:
▪ Stable
▪ Resource availability aware
▪ Network traffic and network latency aware
▪ Burstable between available clusters, and to the public cloud
mck8s is available: https://github.com/moule3053/mck8s
Mulugeta Tamiru, Guillaume Pierre, Johan Tordsson, Erik Elmroth. mck8s: an orchestration platform for geo-distributed
multi-cluster environments. In Proceedings of ICCCN, Jul 2021.
The FogGuru project has received funding from the European Union’s
Horizon 2020 research and innovation programme under the Marie
Skłodowska-Curie grant 765452.
www.fogguru.eu
