ACDC - AutomatiC DataCenter
Felix Cantournet & Xavier Krantz
2017-11-07
Agenda
1. Leboncoin
2. History
3. Rethinking
4. ACDC
5. Next
6. Lessons learned (REX)
Leboncoin
A few figures
1.2 - Tech stack
● 2 datacenters
● 600 physical servers (more than 1000 including virtual machines)
● 12 Gbit/s of outbound traffic
● 6 TB of databases
● 300M images
● 15k req/s on leboncoin.fr
History & evolution
2.1 - Initial situation
● 1 - Operator
○ Find a free IP (Welcome, ping!)
● 2 - Puppet
○ Reserve the IP / DNS name in DNS
○ Commit + push
○ Run the agent on every DHCP node
● 3 - Foreman
○ Go into Foreman and select a node
○ Get the MAC address
○ Create the node + put it in build mode
● 4 - Puppet
○ Reserve the MAC address / DNS name in DHCP
○ Commit + push
○ Run the agent on every DHCP node
● 5 - Foreman
○ Reboot the node via the BMC plugin
● 6 - Node installs
○ Boots from the network (PXE)
○ DHCP redirects to TFTP
○ TFTP serves the custom PXE config
○ The preseed is rendered by Foreman
● 7 - Operator
○ Follows the install through the Java console
6 manual steps
Error-prone
Human conflicts
Time-consuming
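To make the first manual step concrete, here is a minimal sketch of "find a free IP by pinging". It is an illustration only, not the team's actual tooling: the function names, the subnet and the Linux `ping` flags (`-c`, `-W`) are assumptions. It also hints at why the step is fragile: a reserved machine that is powered off does not answer, and two operators running the check at the same time can pick the same address.

```python
import ipaddress
import subprocess
from typing import Callable, Iterable, Optional


def ping_once(ip: str) -> bool:
    """Return True if the address answers one ICMP echo.

    Assumes a Linux `ping` binary (-c = count, -W = timeout in seconds).
    """
    result = subprocess.run(
        ["ping", "-c", "1", "-W", "1", ip],
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def find_free_ip(
    subnet: str,
    reserved: Iterable[str],
    is_alive: Callable[[str], bool] = ping_once,
) -> Optional[str]:
    """Return the first host address that is neither reserved nor answering."""
    taken = set(reserved)
    for host in ipaddress.ip_network(subnet).hosts():
        candidate = str(host)
        if candidate in taken:
            continue  # already known to be reserved, no need to probe it
        if not is_alive(candidate):
            return candidate  # silent address: assumed free (may be wrong)
    return None
```

The `is_alive` parameter is injectable so the logic can be exercised without touching the network, for example `find_free_ip("10.0.0.0/30", reserved=["10.0.0.1"], is_alive=lambda ip: False)`.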
2.2 - Problem statement
● Simplify bare-metal provisioning
○ Unattended provisioning / installation
○ 1 manual step
2.3 - Attempt 1 - Foreman + SmartProxies
Observation: Foreman was under-used.
Solution: Smart proxies to automate:
- IPAM + DHCP
- DNS
2.3 - Attempt 1 - Foreman + SmartProxies
Pain points: DNS
● We
○ 1 big zone file
● Foreman Smart-proxy
○ Dynamic updates = nsupdate
○ Binary journal file + serial conflicts
Pain points: DHCP
● We
○ Do NIC bonding
○ Need to register n MAC addresses <> 1 IP
● Foreman Smart-proxy
○ Not supported
2.3 - Attempt 1 - Foreman + SmartProxies
Pain points: Foreman
● We
○ Have not mastered Ruby
○ Are not “a Tech company”
○ Are not that big
● Foreman & Smart-proxy
○ Very complex code base
○ Very complex UI
○ Generic, with a lot of (too many) features
Rethinking
3.1 - Interface with the contractor
Celeris: contractor for on-site datacenter interventions
● Spreadsheet
● DCIM: Netbox
○ Open source
○ DigitalOcean
○ Python + PostgreSQL
Integration with Foreman?
3.2 - Overlapping solutions
IPAM
DCIM
CMDB
???
Problem statement 2
● Automate lifecycle management of physical machines
○ Discovery / intake
○ Unattended provisioning / installation
○ Maintenance, decommissioning
Collins
● Open source project: https://github.com/tumblr/collins
● Enforced state machine
● Arbitrary hook / callback system on state transitions
● Arbitrary key / value metadata attached to each asset
● Web UI + HTTP API + firehose
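The combination of an enforced state machine, hooks on transitions and free-form metadata can be pictured with a small sketch. This is a toy model and not Collins code: the class, its methods, the state names and the asset tag are all hypothetical and only illustrate the idea.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Set, Tuple

# A hook receives (asset_tag, previous_state, new_state).
Hook = Callable[[str, str, str], None]


class AssetLifecycle:
    """Toy enforced state machine with hooks on transitions (not Collins code)."""

    def __init__(self, allowed: Dict[str, Set[str]]) -> None:
        self.allowed = allowed                       # state -> states reachable from it
        self.states: Dict[str, str] = {}             # asset tag -> current state
        self.attributes: Dict[str, Dict[str, str]] = defaultdict(dict)  # key/value metadata
        self.hooks: Dict[Tuple[str, str], List[Hook]] = defaultdict(list)

    def register(self, tag: str, initial: str) -> None:
        """Start tracking an asset in its initial state."""
        self.states[tag] = initial

    def on_transition(self, previous: str, new: str, hook: Hook) -> None:
        """Attach a hook to one specific (previous, new) transition."""
        self.hooks[(previous, new)].append(hook)

    def transition(self, tag: str, new: str) -> None:
        """Move an asset to a new state, refusing transitions that are not allowed."""
        previous = self.states[tag]
        if new not in self.allowed.get(previous, set()):
            raise ValueError(f"{tag}: transition {previous} -> {new} is not allowed")
        self.states[tag] = new
        for hook in self.hooks[(previous, new)]:
            hook(tag, previous, new)


if __name__ == "__main__":
    lifecycle = AssetLifecycle({"New": {"Provisioning"}, "Provisioning": {"Provisioned"}})
    lifecycle.register("srv-0042", "New")
    lifecycle.attributes["srv-0042"]["rack"] = "A12"
    lifecycle.on_transition("New", "Provisioning", lambda t, p, n: print(f"{t}: {p} -> {n}"))
    lifecycle.transition("srv-0042", "Provisioning")
```

Rejecting unknown transitions is what "enforced" means here: automation can only hang off the paths the lifecycle explicitly allows.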
Collins: Tooling
API Clients
● Go-collins
● pycollins
● Ruby libs
○ collins-auth
○ collins-client
○ collins-notify
○ collins-state
○ ...
CLI
● collins-shell
Collins: Web UI
Collins: Lifecycle
Specified workflows:
- Intake
- Commissioning
- Maintenance
- Decommissioning
Collins: Callback registry
ACDC
4.1 - Overview
4.2 - Lorie
4.3 - iPXE Router
4.4 - Collins callbacks
● nowProvisioned
○ on = "asset_update"
○ When
■ previous.state = "isProvisioning"
■ && current.state = "isProvisioned"
● provisionEvent
○ on = "asset_update"
○ When
■ current.state = "isNew"
● unallocated
○ on = "asset_update"
○ When
■ current.state = "isUnallocated"
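The three callbacks above can be read as rules matched against an event plus the previous and current state of an asset. The sketch below models that matching only; it is not how Collins evaluates its callback registry, and the `Callback` class and `matching_callbacks` function are hypothetical names. The callback names, event name and state predicates are taken from the slide.

```python
from dataclasses import dataclass
from typing import List, Optional, Sequence


@dataclass(frozen=True)
class Callback:
    name: str
    on: str                                # event name, e.g. "asset_update"
    current_state: str                     # required state after the update
    previous_state: Optional[str] = None   # None means "any previous state"


# The three rules listed on the slide.
CALLBACKS = [
    Callback("nowProvisioned", "asset_update", "isProvisioned",
             previous_state="isProvisioning"),
    Callback("provisionEvent", "asset_update", "isNew"),
    Callback("unallocated", "asset_update", "isUnallocated"),
]


def matching_callbacks(
    event: str,
    previous_state: str,
    current_state: str,
    callbacks: Sequence[Callback] = CALLBACKS,
) -> List[str]:
    """Return the names of the callbacks whose conditions all hold."""
    return [
        cb.name
        for cb in callbacks
        if cb.on == event
        and cb.current_state == current_state
        and (cb.previous_state is None or cb.previous_state == previous_state)
    ]
```

Note that `nowProvisioned` constrains both sides of the transition, so an asset that reaches the provisioned state from anywhere other than provisioning does not trigger it.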
4.5 - Provisioning
4.6 - Tooling
$ collins-shell
INFO - ENV Variable COLLINS_CONFIG=/home/xkrantz/Sources/github.schibsted.io/leboncoin/acdc/conf/collins.yaml
Tasks:
collins-shell asset <command> # Asset related commands
collins-shell asset_type <command> # Asset Type related commands
collins-shell console # drop into the interactive collins shell
collins-shell help [TASK] # Describe available tasks or one specific task
collins-shell ip_address <command> # IP address related commands
collins-shell ipmi <command> # IPMI related commands
collins-shell latest # check if there is a newer version of collins-shell
collins-shell log MESSAGE # log a message on an asset
collins-shell logs TAG # fetch logs for an asset specified by its tag. Use "all" for a...
collins-shell power ACTION --reason=REASON --tag=TAG # perform power action (off, on, rebootSoft, rebootHard, etc) o...
collins-shell power_status # check power status on an asset
collins-shell provision <command> # Provisioning related commands
collins-shell search_logs QUERY # search for asset logs
collins-shell state <command> # State management related commands - use with care
collins-shell tag <command> # Tag related commands
collins-shell version # current version of collins-shell
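As an illustration of scripting around this CLI, the sketch below builds the argument list for the `power` task using only the syntax shown in the help output (`power ACTION --reason=REASON --tag=TAG`). The helper name, the asset tag and the reason are hypothetical, and the action set is deliberately limited to the four actions the help text names, even though it ends with "etc".

```python
import shlex
from typing import List

# Only the actions spelled out in the help text; the real CLI accepts more.
KNOWN_ACTIONS = {"off", "on", "rebootSoft", "rebootHard"}


def power_command(action: str, tag: str, reason: str) -> List[str]:
    """Build the argv for `collins-shell power ACTION --reason=... --tag=...`."""
    if action not in KNOWN_ACTIONS:
        raise ValueError(f"unknown power action: {action!r}")
    return ["collins-shell", "power", action, f"--reason={reason}", f"--tag={tag}"]


if __name__ == "__main__":
    argv = power_command("rebootSoft", tag="srv-0042", reason="kernel upgrade")
    # Print a copy-pastable command line; nothing is executed here.
    print(shlex.join(argv))
```

Passing the resulting list to `subprocess.run(argv, check=True)` would execute it without going through a shell, which avoids quoting problems in the reason text.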
Next
5 - Next
ACDC v2
Rework
● Discovery
● OS bootstrapping
Add
● Disk management
● Firmware updates
● Any maintenance tasks
Discovery
● Currently:
○ Genesis (Tumblr)
○ Ruby DSL (Chef like)
● Next:
○ CoreOS in Memory + Ansible
OS Bootstrapping
● Currently:
○ Preseed / Kickstart
○ Shell scripts
● Next:
○ CoreOS in Memory + Ansible
5.1 - Ansible jobs runner
5.2 - Visualization & federation
5.3 - Integration
Lessons learned (REX)
● Specs
● 20% projects are not enough
● Services & ownership transition (for Ops)

New bare-metal provisioning setup built around Collins
