Deep Dive Into the CERN Cloud Infrastructure - November, 2013
Deep Dive into the CERN Cloud Infrastructure
OpenStack Design Summit – Hong Kong, 2013
Belmiro Moreira
belmiro.moreira@cern.ch @belmiromoreira
What is CERN?
•  Conseil Européen pour la Recherche Nucléaire – aka European Organization for Nuclear Research
•  Founded in 1954 with an international treaty
•  20 member states; other countries contribute to experiments
•  Situated between Geneva and the Jura Mountains, straddling the Swiss-French border
What is CERN?
CERN Cloud Experiment
What is CERN?
CERN provides particle accelerators and other infrastructure for high-energy physics research
[Diagram: the CERN accelerator complex – LINAC 2/3, BOOSTER, PS, SPS and LHC, feeding the ALICE, ATLAS, CMS and LHCb experiments and the ISOLDE, AD, LEIR, n-ToF, CTF3 and CNGS facilities]
LHC - Large Hadron Collider
https://www.google.com/maps/views/streetview/cern?gl=us
LHC and Experiments
CMS detector
LHC and Experiments
Proton-lead collisions in the ALICE detector
CERN - Computer Center - Geneva, Switzerland
•  3.5 MW
•  ~91000 cores
•  ~120 PB HDD
•  ~100 PB Tape
•  ~310 TB Memory
CERN - Computer Center - Budapest, Hungary
•  2.5 MW
•  ~20000 cores
•  ~6 PB HDD
Computer Centers location
[Map: locations of the Geneva and Budapest computer centers]
CERN IT Infrastructure in 2011
•  ~10k servers
•  Dedicated compute, dedicated disk server, dedicated service nodes
•  Mostly running on real hardware
•  Server consolidation of some service nodes using Microsoft Hyper-V/SCVMM
•  ~3400 VMs (~2000 Linux, ~1400 Windows)
•  Various other virtualization projects around
•  Many diverse applications (“clusters”)
•  Managed by different teams (CERN IT + experiment groups)
CERN IT Infrastructure challenges in 2011
•  New Computer Center expected in 2013
•  Need to manage twice the number of servers
•  No increase in staff numbers
•  Increasing number of users / computing requirements
•  Legacy tools - high maintenance and brittle
Why Build CERN Cloud
Improve operational efficiency
•  Machine reception and testing
•  Hardware interventions with long-running programs
•  Multiple operating system demand
Improve resource efficiency
•  Exploit idle resources
•  Highly variable load such as interactive or build machines
Improve responsiveness
•  Self-service
Identify a new Tool Chain
•  Identify the tools needed to build our Cloud Infrastructure
•  Configuration Manager tool
•  Cloud Manager tool
•  Monitoring tools
•  Storage Solution
Strategy to deploy OpenStack
•  Configuration infrastructure based on Puppet
•  Community Puppet modules for OpenStack
•  SLC6 Operating System
•  EPEL/RDO - RPM Packages
Strategy to deploy OpenStack
•  Deliver a production IaaS service through a series of time-based pre-production services of increasing functionality and Quality-of-Service
•  Budapest Computer Center hardware deployed as OpenStack compute nodes
•  Have an OpenStack production service in Q2 2013
Pre-Production Infrastructure
"Guppy" (Essex) – June, 2012
- Deployed on Fedora 16
- Community OpenStack puppet modules
- Used for functionality tests
- Limited integration with CERN infrastructure

"Hamster" (Folsom) – October, 2012
- Open to early adopters
- Deployed on SLC6 and Hyper-V
- CERN Network DB integration
- Keystone LDAP integration

"Ibex" (Folsom) – March, 2013
- Open to a wider community (ATLAS, CMS, LHCb, …)
- Some OpenStack services in HA
- ~14000 cores
OpenStack at CERN - Grizzly release
•  2 child cells – Geneva and Budapest Computer Centers
•  HA+1 architecture
•  Ceilometer deployed
•  Integrated with CERN accounts and network infrastructure
•  Monitoring OpenStack components status
•  Glance - Ceph backend
•  Cinder - Testing with Ceph backend
Infrastructure Overview
•  Adding ~100 compute nodes every week
•  Geneva, Switzerland Cell
•  ~11000 cores
•  Budapest, Hungary Cell
•  ~10000 cores
•  Today we have more than 2500 VMs
•  Several VMs have more than 8 cores
Architecture Overview
[Diagram: a Load Balancer (Geneva, Switzerland) in front of the Top Cell controllers (Geneva, Switzerland), which fan out to two Child Cells – Geneva, Switzerland and Budapest, Hungary – each with its own controllers and compute nodes]
Architecture Components
Top Cell – Controller:
- Keystone, Nova api, Nova consoleauth, Nova novncproxy, Nova cells, Horizon
- Glance api, Glance registry
- Ceilometer api
- Cinder api, Cinder volume, Cinder scheduler
- Flume
- rabbitmq

Children Cells – Controller:
- Keystone, Nova api, Nova conductor, Nova scheduler, Nova network, Nova cells
- Glance api
- Ceilometer agent-central, Ceilometer collector
- Flume
- rabbitmq

Children Cells – Compute node:
- Nova compute
- Ceilometer agent-compute
- Flume

Supporting services:
- MySQL, MongoDB
- Ceph
- HDFS, Elastic Search, Kibana
- Stacktach
Infrastructure Overview
•  SLC6 and Microsoft Windows 2012
•  KVM and Microsoft Hyper-V
•  All infrastructure “puppetized” (also the Windows compute nodes!)
•  Using stackforge OpenStack puppet modules
•  Using CERN Foreman/Puppet configuration infrastructure
•  Master/Client architecture
•  Puppet-managed VMs share the same configuration infrastructure
Infrastructure Overview
•  HAProxy as load balancer
•  Master and Compute nodes
•  3+ Master nodes per Cell
•  O(1000) Compute nodes per Child Cell (KVM and Hyper-V)
•  3 availability zones per Cell
•  RabbitMQ
•  At least 3 brokers per Cell
•  RabbitMQ cluster with mirrored queues
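As an illustration of the mirrored-queue setup, the commands below join brokers into a cell's cluster and mirror every queue; this is a minimal sketch assuming RabbitMQ 3.x with placeholder hostnames, not the exact CERN configuration:

# on each additional broker: join the cell's cluster
rabbitmqctl stop_app
rabbitmqctl join_cluster rabbit@broker1        # "broker1" is a placeholder hostname
rabbitmqctl start_app
# mirror all queues across the brokers of the cluster
rabbitmqctl set_policy ha-all "^" '{"ha-mode":"all"}'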
Infrastructure Overview
•  MySQL instance per Cell
•  MySQL managed by the CERN DB team
•  Running on top of Oracle CRS
•  active/slave configuration
•  NetApp storage backend
•  Backups every 6 hours
Nova Cells
•  Why Cells?
•  Scale transparently between different Computer Centers
•  With cells we lost functionality
•  Security groups
•  Live migration
•  "Parents" don't know about “children” compute
•  Flavors not propagated to "children” cells
27
Nova Cells
•  Scheduling
•  Random cell selection on Grizzly
•  Implemented simple scheduler based on project
•  CERN Geneva only, CERN Wigner only, “both”
•  “both” selects the cell with more available free memory
•  Cell/Cell communication doesn’t support multiple RabbitMQ servers
•  https://bugs.launchpad.net/nova/+bug/1178541
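For reference, a minimal sketch of how a cell and its custom scheduler are wired up in nova.conf; the option names follow the Grizzly-era cells config group, and the ProjectCellScheduler class path is hypothetical since the deck does not name the CERN implementation:

# nova.conf on the cell controllers
[cells]
enable = True
name = geneva                                          # placeholder cell name
# hypothetical class path for the project-based cell scheduler described above
scheduler = cern.cells.scheduler.ProjectCellScheduler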
Nova Network
•  CERN network infrastructure
[Diagram: VMs on a compute node registered in the CERN network DB with their IP and MAC addresses]
Nova Network
•  Implemented a Nova Network CERN driver
•  Considers the “host” picked by nova-scheduler
•  MAC address selected from the pre-registered addresses of the “host” IP Service
•  Updates the CERN network database with the instance hostname and the person responsible for the device
•  Network constraints in some nova operations
•  Resize, Live-Migration
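The driver plugs into nova-network through the usual network_manager option; the class path below is hypothetical, as the deck does not give the module name:

# nova.conf
network_manager = cern.network.manager.CERNNetworkManager   # hypothetical class path
# the driver picks a pre-registered MAC/IP pair for the scheduled host and
# writes the instance hostname and responsible back to the CERN network DB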
Nova Scheduler
•  ImagePropertiesFilter
•  Linux/Windows hypervisors in the same infrastructure
•  ProjectsToAggregateFilter
•  Projects need dedicated resources
•  Instances from defined projects are created in specific Aggregates
•  Aggregates can be shared by a set of projects
•  Availability Zones
•  Implemented “default_schedule_zones”
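A hedged sketch of the corresponding scheduler settings in nova.conf; the ProjectsToAggregateFilter module path is an assumption (it is a CERN-written filter), and the stock option is the singular default_schedule_zone, of which the deck's “default_schedule_zones” is a multi-zone variant:

# nova.conf on the cell controllers
scheduler_available_filters = nova.scheduler.filters.all_filters
scheduler_available_filters = cern.scheduler.filters.ProjectsToAggregateFilter   # hypothetical path
scheduler_default_filters = RetryFilter,AvailabilityZoneFilter,RamFilter,ComputeFilter,ImagePropertiesFilter,ProjectsToAggregateFilter
default_schedule_zone = geneva-a                                                  # placeholder zone name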
Nova Conductor
•  Reduces “dramatically” the number of DB connections
•  Conductor “bottleneck”
•  Only 3+ processes for “all” DB requests
•  General “slowness” in the infrastructure
•  Fixed with backport
•  https://review.openstack.org/#/c/42342/
Nova Compute
•  KVM and Hyper-V compute nodes share the same infrastructure
•  Hypervisor selection based on “Image” properties
•  Hyper-V driver still lacks some functionality on Grizzly
•  Console access, metadata support with nova-network, resize support, ephemeral disk support, ceilometer metrics support
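For example, hypervisor selection works by tagging images and letting ImagePropertiesFilter match them against the compute nodes; the image IDs are placeholders:

# route Windows images to the Hyper-V compute nodes
glance image-update --property hypervisor_type=hyperv <windows-image-id>
# route Linux images to the KVM/libvirt compute nodes
glance image-update --property hypervisor_type=qemu <slc6-image-id>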
Keystone
•  CERN’s Active Directory infrastructure
•  Unified identity management across the site
•  +44000 users
•  +29000 groups
•  ~200 arrivals/departures per month
•  Keystone integrated with CERN Active Directory
•  LDAP backend
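A minimal sketch of a read-only LDAP identity backend in keystone.conf; the endpoint and DNs are placeholders rather than CERN's actual Active Directory layout:

[identity]
driver = keystone.identity.backends.ldap.Identity

[ldap]
url = ldap://ad.example.ch                       # placeholder endpoint
user_tree_dn = OU=Users,DC=example,DC=ch         # placeholder DNs
user_objectclass = person
group_tree_dn = OU=Workgroups,DC=example,DC=ch
group_objectclass = group
user_allow_create = False                        # identities are managed in AD, not in Keystone
user_allow_update = False
user_allow_delete = False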
Keystone
•  A CERN user subscribes to the “cloud service”
•  A “Personal Tenant” is created with a limited quota
•  Shared projects created by request
•  Project life cycle
•  owner, member, admin – roles
•  “Personal project” disabled when the user leaves
•  Delete resources (VMs, Volumes, Images, …)
•  User removed from “Shared Projects”
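The life cycle maps onto ordinary Keystone and Nova calls; a sketch with hypothetical names and quota values:

# on subscription: create the personal tenant with a small quota
keystone tenant-create --name "personal-jdoe"
keystone user-role-add --user jdoe --role Member --tenant "personal-jdoe"
nova quota-update --instances 5 --cores 10 --ram 20480 <tenant-id>
# on departure: disable the personal project, then delete its resources
keystone tenant-update --enabled false <tenant-id>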
Ceilometer
•  Users are not directly billed
•  Metering needed to adjust Project quotas
•  MongoDB backend – sharded and replicated
•  Collector, Central-Agent
•  Running on “children” Cells controllers
•  Compute-Agent
•  Uses nova-api running on “children” Cells controllers
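A hedged example of the metering database setting; the option is database_connection in the Grizzly-era configuration (later releases moved it to [database]/connection), and the hostnames and replica set name are placeholders:

# ceilometer.conf on the children cell controllers
database_connection = mongodb://ceilo1:27017,ceilo2:27017,ceilo3:27017/ceilometer?replicaSet=ceilometer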
Glance
•  Glance API
•  Using Glance API v1
•  python-glanceclient doesn’t completely support v2
•  Glance Registry
•  With v1 we need to keep the Glance Registry
•  Only runs in the Top Cell behind the load balancer
•  Glance backend
•  File Store (AFS)
•  Ceph
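A sketch of the RBD store settings in glance-api.conf; the pool and user names are placeholders:

# glance-api.conf
default_store = rbd
rbd_store_ceph_conf = /etc/ceph/ceph.conf
rbd_store_user = glance                  # placeholder Ceph client name
rbd_store_pool = images                  # placeholder pool
rbd_store_chunk_size = 8                 # MB per RADOS object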
Glance
•  Maintain small set of SLC5/6 images as default
•  Difficult to offer only the most up-to-date set of images
•  Resize and Live Migration are not available if the image has been deleted from Glance
•  Users can upload images up to 25GB
•  Users don’t pay for storage!
•  Glance in Grizzly doesn’t support quotas per Tenant!
Cinder
•  Ceph backend
•  Still in evaluation
•  SLC6 with qemu-kvm patched by Inktank to support RBD
•  Cinder doesn't support cells in Grizzly
•  Fixed with backport: https://review.openstack.org/#/c/31561/
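The evaluation setup corresponds to the standard RBD volume driver; a cinder.conf sketch with placeholder pool, user and secret values:

# cinder.conf
volume_driver = cinder.volume.drivers.rbd.RBDDriver
rbd_pool = volumes                        # placeholder pool
rbd_user = cinder                         # placeholder Ceph client name
rbd_secret_uuid = <libvirt-secret-uuid>   # libvirt secret holding the Cinder key on the hypervisors
glance_api_version = 2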
Ceph as Storage Backend
•  3 PB cluster available for Ceph
•  48 OSD servers
•  5 monitor servers
•  Initial testing with FIO, libaio, bs 256k

fio --size=4g --bs=256k --numjobs=1 --direct=1 --rw=randrw --ioengine=libaio --name=/mnt/vdb1/tmp4

Rand RW: 99 MB/s    Rand R: 103 MB/s    Rand W: 108 MB/s
Ceph as Storage Backend
•  ulimits
•  With more than 1024 OSDs, we’re getting various errors where clients cannot create enough processes
•  cephx for security (key lifecycle is a challenge, as always)
•  need librbd (from EPEL)
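One mitigation is to raise the per-user process/thread and file-descriptor limits on the client nodes, since librbd opens a connection (and threads) per OSD; the values below are placeholders, not a tuned CERN setting:

# /etc/security/limits.d/91-ceph-clients.conf
*    soft    nproc     32768
*    hard    nproc     32768
*    soft    nofile    65536
*    hard    nofile    65536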
Monitoring - Lemon
•  Monitor “physical” and “virtual” servers with Lemon
Monitoring - Flume, Elastic Search, Kibana
•  How to monitor OpenStack status in all nodes?
•  ERRORs, WARNINGs – log visualization
•  identify in “real time” possible problems
•  preserve all logs for analytics
•  visualization of cloud infrastructure status
•  service managers
•  resource managers
•  users
Monitoring - Flume, Elastic Search, Kibana
[Diagram: OpenStack infrastructure → Flume gateway → HDFS and Elasticsearch → Kibana]
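A minimal Flume agent sketch for this pipeline, tailing one OpenStack log into both HDFS and Elasticsearch; the paths, hostnames and index name are assumptions:

# flume.conf on a gateway node
agent.sources  = nova-logs
agent.channels = c-hdfs c-es
agent.sinks    = hdfs-sink es-sink

# tail an OpenStack log file (placeholder path)
agent.sources.nova-logs.type = exec
agent.sources.nova-logs.command = tail -F /var/log/nova/nova-api.log
agent.sources.nova-logs.channels = c-hdfs c-es
agent.sources.nova-logs.selector.type = replicating

agent.channels.c-hdfs.type = memory
agent.channels.c-es.type = memory

# archive raw logs in HDFS for analytics
agent.sinks.hdfs-sink.type = hdfs
agent.sinks.hdfs-sink.channel = c-hdfs
agent.sinks.hdfs-sink.hdfs.path = hdfs://namenode/openstack/logs/%Y-%m-%d
agent.sinks.hdfs-sink.hdfs.useLocalTimeStamp = true

# index events in Elasticsearch for the Kibana dashboards
agent.sinks.es-sink.type = org.apache.flume.sink.elasticsearch.ElasticSearchSink
agent.sinks.es-sink.channel = c-es
agent.sinks.es-sink.hostNames = es1.example.ch:9300
agent.sinks.es-sink.indexName = openstack_logs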
Monitoring - Kibana
[Kibana dashboard screenshots]
Challenges
•  Moving resources to the infrastructure
•  +100 compute nodes per week
•  15000 servers – more than 300000 cores
•  Migration from Grizzly to Havana
•  Deploy Neutron
•  Deploy Heat
•  Kerberos, X.509 user certificate authentication
•  Keystone Domains
belmiro.moreira@cern.ch
@belmiromoreira