Running Cloud Foundry 
An Experience Report
About this talk
• Get an opinion on running Cloud 
Foundry (CF) 
• How to shoot yourself in the foot with CF and 
over-commitment settings 
• How to perform CF updates 
• How to harden CF 
• Wise words about CF services
Introduction
about.me/fischerjulian
Running a public 
Cloud Foundry 
for more than a year.
It works.
In order to run 
Cloud Foundry smoothly …
… refer to the package leaflet for 
risks and side effects and consult 
Pivotal, CloudCredo or anynines.
The details
The anynines Stack
Cloud Foundry 
OpenStack 
VMware 
Hardware
We migrated from a 
rented VMware to a 
self-hosted OpenStack.
For more details on this: 
http://rh.gd/a9vmw2sos
Proof point made…
Cloud Foundry 
protects investments in 
software development 
by being 
infrastructure agnostic.
Running Cloud Foundry. 
What happened.
Security Issues
• Pivotal informs partners early about 
issues 
• Usually along with fixes
OpenStack Issues
• Ext4 vs. Ext3 
• DEA MTU 
• rsyslogd command not found
CF Gotchas
DEA evacuate & Bosh 
timeout race-condition
• Removing a DEA 
→ Apps will be evacuated 
→ DEA will be stopped 
• Bosh deployment will fail when 
evacuation takes longer than the Bosh 
timeout 
• Set your Bosh timeout accordingly!
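The race above comes down to one comparison: does the estimated evacuation time fit into the timeout with some headroom? A minimal sketch, assuming made-up numbers for app count, per-instance move time and parallelism. The function names and the safety margin are hypothetical and not part of Bosh:

```python
def estimated_evacuation_seconds(app_instances: int,
                                 seconds_per_instance: float,
                                 parallel_moves: int = 1) -> float:
    """Rough upper bound for how long a DEA needs to hand off its apps."""
    if parallel_moves < 1:
        raise ValueError("parallel_moves must be at least 1")
    # Instances are moved in batches of `parallel_moves`.
    batches = -(-app_instances // parallel_moves)  # ceiling division
    return batches * seconds_per_instance


def evacuation_fits(timeout_seconds: float, evacuation_seconds: float,
                    safety_margin: float = 1.5) -> bool:
    """True if the timeout leaves headroom over the estimated evacuation."""
    return timeout_seconds >= evacuation_seconds * safety_margin


# Hypothetical DEA: 40 instances, 30 s each, 4 moved in parallel.
estimate = estimated_evacuation_seconds(40, 30, parallel_moves=4)
print(estimate)                        # 300
print(evacuation_fits(600, estimate))  # True
print(evacuation_fits(300, estimate))  # False
```

If the check fails, raise the timeout before deploying rather than finding out halfway through the deployment.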
DEA over-commitment
Default overcommitment 
factor = 4
RAM peaks may cause 
random errors
• Failures during staging 
• Random application crashes 
• No meaningful log information
Reducing over-commitment
• Native strategy 
• Reduce over-commitment factor 
• Bosh deploy
Running Cloud Foundry for 12 months - An experience report | anynines
• 8 GB VM, OC factor 4 
→ Announces 32 GB (V)RAM 
• 8 GB VM, OC factor 2 
→ Announces 16 GB (V)RAM 
• When evacuating a 32 GB (V)RAM host, 
another 32 GB (V)RAM host will be 
preferred (more free space)
Evacuation Wave
1 GB 
1 GB 
1 GB 
1 GB
= maximum impact on 
running apps!
New DEAs (OC 2) will 
receive apps when old DEAs 
(OC 4) have been stopped.
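The arithmetic and the evacuation wave can be replayed in a toy simulation. This is a simplified sketch, not the real DEA placement logic: it assumes every app uses 1 GB, that placement looks only at announced free memory, and that old DEAs are stopped one after another. All class and function names are made up for illustration.

```python
from dataclasses import dataclass, field

APP_GB = 1  # every app instance in this toy example uses 1 GB


@dataclass
class Dea:
    name: str
    physical_gb: int
    oc_factor: int
    apps: list = field(default_factory=list)

    @property
    def announced_gb(self) -> int:
        # An 8 GB VM with OC factor 4 announces 32 GB, with factor 2 only 16 GB.
        return self.physical_gb * self.oc_factor

    @property
    def free_gb(self) -> int:
        return self.announced_gb - len(self.apps) * APP_GB


def evacuate(dea: Dea, targets: list, moves: dict) -> None:
    """Move every app to the target that announces the most free memory."""
    for app in list(dea.apps):
        target = max(targets, key=lambda d: d.free_gb)
        target.apps.append(app)
        dea.apps.remove(app)
        moves[app] = moves.get(app, 0) + 1


def rolling_replace(old: list, new: list) -> dict:
    """Stop old DEAs one by one; return how often each app was moved."""
    moves = {}
    remaining = list(old)
    while remaining:
        dea = remaining.pop(0)
        evacuate(dea, remaining + new, moves)
    return moves


old = [Dea(f"old-{i}", 8, 4, [f"app-{i}-{n}" for n in range(4)]) for i in range(3)]
new = [Dea(f"new-{i}", 8, 2) for i in range(3)]
moves = rolling_replace(old, new)
print(old[0].announced_gb, new[0].announced_gb)   # 32 16
print(max(moves.values()), sum(moves.values()))   # 3 22
```

In this run, 12 apps cause 22 moves and some apps are restarted three times, because the remaining OC 4 hosts always look emptier than the new OC 2 hosts. Moving every app directly to a new DEA would take 12 moves.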
Hints
• Create a 2nd resource pool for new DEAs 
• Deploy the 2nd resource pool before 
starting to stop old DEAs 
• (-) Needs more resources 
• (+) Smoother transition
Updating Cloud Foundry
Required: 
Staging System
• Structurally identical 
• Fewer VMs
1. Determine new features since last release
2. Study deployment manifest changes
3. Apply deployment manifest changes
4. First staging attempt
5. Debug and Fix it!
6. Simulate the live-upgrade
7. Schedule maintenance on status.anynines.com
8. Perform the upgrade and cross fingers.
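Studying manifest changes (step 2) is easier with a plain diff between the old and the new deployment manifest. A minimal sketch using Python's standard `difflib`. The manifest snippets and the keys `example_job` and `overcommit_factor` are made-up placeholders, not real CF release properties:

```python
import difflib


def manifest_diff(old_text: str, new_text: str,
                  old_name: str = "old.yml", new_name: str = "new.yml") -> str:
    """Return a unified diff between two deployment manifests."""
    diff = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_name,
        tofile=new_name,
    )
    return "".join(diff)


# Placeholder manifests; real ones are far longer.
old_manifest = """\
properties:
  example_job:
    overcommit_factor: 4
"""
new_manifest = """\
properties:
  example_job:
    overcommit_factor: 2
"""
print(manifest_diff(old_manifest, new_manifest))
```

Running the same diff between the staging and the live manifest is a cheap way to check that both systems are still structurally identical.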
CF Hardening
Accept that VMs are 
ephemeral
VM Failover Strategies
Resurrect
• Monitor VM 
• Re-Build VMs automatically 
• e.g. using Cloud Foundry Bosh 
• + Easy 
• - Takes long (minutes not seconds) 
• - OpenStack doesn’t release persistent 
disks automatically
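The resurrect strategy is a monitor-and-rebuild loop at its core. A minimal sketch of one pass, assuming two hypothetical callbacks, `is_alive` and `rebuild`, that stand in for whatever the monitoring and the IaaS provide. This is not how Bosh implements it:

```python
from typing import Callable, Iterable, List


def resurrect_once(vms: Iterable[str],
                   is_alive: Callable[[str], bool],
                   rebuild: Callable[[str], None]) -> List[str]:
    """Check every VM once and rebuild the ones that do not respond."""
    rebuilt = []
    for vm in vms:
        if not is_alive(vm):
            rebuild(vm)  # in practice this takes minutes, not seconds
            rebuilt.append(vm)
    return rebuilt


# Stand-in state instead of real health checks.
state = {"uaa/0": True, "cc/0": False, "dea/0": True}
rebuilt = resurrect_once(state, state.get, lambda vm: state.update({vm: True}))
print(rebuilt)  # ['cc/0']
```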
Failover to Standby VM
• Provide standby VM 
• Monitor VM and perform failover 
• IP failover using Pacemaker 
• + Fast failover (seconds) 
• - Pacemaker not easy to use (& boshify) 
• - Increased resource usage by standby 
VM(s)
Distribute CF components 
across availability zones
• Build disjoint networks, racks, etc. 
• Each disjoint zone = availability zone 
• Tell your IaaS about availability zones 
• On provision choose the AZ 
• Build Bosh releases accordingly
• 2 * UAA 
• 2 * CC 
• 2 * n * DEAs 
• 2 * Health Manager 
• …
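Doubling each component only helps if the copies land in different availability zones. A minimal sketch of a round-robin assignment, assuming two zones with the made-up names `az1` and `az2`. The function is purely illustrative and not a Bosh feature:

```python
from collections import defaultdict
from itertools import cycle


def spread_across_zones(instances: dict, zones: list) -> dict:
    """Assign each component instance to a zone, round-robin per component."""
    if not zones:
        raise ValueError("at least one availability zone is required")
    placement = defaultdict(list)
    for component, count in instances.items():
        zone_cycle = cycle(zones)  # restart at the first zone per component
        for index in range(count):
            placement[next(zone_cycle)].append(f"{component}/{index}")
    return dict(placement)


placement = spread_across_zones(
    {"uaa": 2, "cc": 2, "dea": 4, "health_manager": 2},
    ["az1", "az2"],
)
for zone, jobs in placement.items():
    print(zone, jobs)
```

With two zones, every component ends up with one instance per zone (two per zone for the four DEAs), so losing a zone leaves a complete set running.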
UAA & CC DB 
= 
SPOF
HA Postgres
• UAA and Cloud Controller database 
• Single point of failure for Cloud Foundry
• Postgres is not inherently clusterable → 
failover with standby VM 
• Master/slave replication 
• Pacemaker/corosync 
• IP-Failover using NIC-reattachment
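The failover decision itself is small. The hard parts are promotion, IP takeover and fencing, which is what Pacemaker/corosync take care of. A minimal sketch of the decision logic only, assuming hypothetical callbacks for the health check, the promotion and the IP move, and a made-up threshold of three consecutive failures:

```python
from typing import Callable


class FailoverMonitor:
    """Promote the standby after several consecutive failed health checks."""

    def __init__(self, check_master: Callable[[], bool],
                 promote_standby: Callable[[], None],
                 move_ip: Callable[[], None],
                 max_failures: int = 3) -> None:
        self.check_master = check_master
        self.promote_standby = promote_standby
        self.move_ip = move_ip
        self.max_failures = max_failures
        self.failures = 0
        self.failed_over = False

    def tick(self) -> bool:
        """Run one health check; return True if this tick triggered a failover."""
        if self.failed_over:
            return False
        if self.check_master():
            self.failures = 0  # a single good check resets the counter
            return False
        self.failures += 1
        if self.failures < self.max_failures:
            return False
        # Promote first, then redirect clients via the service IP.
        self.promote_standby()
        self.move_ip()
        self.failed_over = True
        return True


events = []
checks = iter([True, False, True, False, False, False])
monitor = FailoverMonitor(
    check_master=lambda: next(checks),
    promote_standby=lambda: events.append("promote standby"),
    move_ip=lambda: events.append("reattach NIC / move IP"),
)
results = [monitor.tick() for _ in range(6)]
print(results)  # [False, False, False, False, False, True]
print(events)   # ['promote standby', 'reattach NIC / move IP']
```

Requiring consecutive failures keeps a single lost health check from triggering a failover. The sketch leaves out fencing of the old master.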
That’s half way towards a 
PostgreSQL CF Service
• Add a V2 Service Broker 
• Add provisioning logic 
• Provision 2-node DB cluster on 
cf create-service postgres medium-cluster
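What the broker adds on top of the HA setup can be sketched without any HTTP plumbing. The example below assumes the v2 broker API's catalog endpoint (`GET /v2/catalog`) and provisioning endpoint (`PUT /v2/service_instances/:instance_id`). The IDs, the in-memory instance store and the `deploy_cluster` callback are made up, and the status codes should be checked against the broker API documentation:

```python
CATALOG = {
    "services": [{
        "id": "postgres-service-id",          # made-up identifier
        "name": "postgres",
        "description": "PostgreSQL databases",
        "bindable": True,
        "plans": [{
            "id": "medium-cluster-plan-id",   # made-up identifier
            "name": "medium-cluster",
            "description": "2-node cluster",
        }],
    }]
}

PLAN_NODES = {"medium-cluster-plan-id": 2}  # plan id -> number of DB nodes
instances = {}                              # instance id -> plan id


def get_catalog():
    """Response for GET /v2/catalog."""
    return 200, CATALOG


def provision(instance_id, plan_id, deploy_cluster):
    """Response for PUT /v2/service_instances/:instance_id."""
    if plan_id not in PLAN_NODES:
        return 400, {"description": "unknown plan"}
    if instance_id in instances:
        # Same request again is fine; a conflicting one is rejected.
        same = instances[instance_id] == plan_id
        return (200, {}) if same else (409, {})
    deploy_cluster(instance_id, PLAN_NODES[plan_id])  # e.g. a Bosh deployment
    instances[instance_id] = plan_id
    return 201, {}


deployed = []
record = lambda name, nodes: deployed.append((name, nodes))
status, _ = provision("instance-1", "medium-cluster-plan-id", record)
print(status, deployed)  # 201 [('instance-1', 2)]
```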
Services
“The best way to find yourself is to lose 
yourself in the service of others.” 
― Mahatma Gandhi
Wardenized Services 
(community services) 
are cute for pet projects.
Not suitable for production.
• Implementations are outdated 
• One size doesn’t fit all!
No production CF without 
high quality services.
CF Service Design
• Use clusterable services if possible 
• Implement automatic failover if not 
• Autoprovisioning using Bosh 
• Organize self-healing 
• (Semi-)Automatic recovery from 
degraded mode
Summary
• Bosh & the CF release are powerful, yet 
you can cut yourself. 
• HA services are essential. 
• CF is ready to be used in production.
Questions?
Thank you!
