©2013 LinkedIn Corporation. All Rights Reserved. ORGANIZATION NAME©2014 LinkedIn Corporation. All Rights Reserved.
Salt at
Web Scale
Craig Sebenik
SRE
29 January 2014
SaltConf
Who Am I?
•Programming for 30-ish years
•Scientific computing
•Java and Perl Developer (web apps)
•HATE doing the same thing more than once
•Been at LinkedIn over 3 years
•From the very beginning of us using salt
•Manage/architect the entire salt infrastructure at
LinkedIn
What is LinkedIn?
•Social media company connecting the world’s
professionals
•5000+ employees
•Offices throughout the world
•Based in Mountain View, CA
How Big Is LinkedIn.com?
•Several data centers
•Customer facing apps (aka “production”)
•Staging for production apps
•Internal only apps
•Several Hundred Apps
•30+K Hosts
•90+% Linux
•Solaris
•Mac and Linux Desktops
LinkedIn Operations
•Several operations groups
•Systems (e.g. OS install/config, “rack and stack”)
•Database Admins
•Network
•Application (i.e. SRE)
•Different groups have different needs for automation
What Is An SRE?
•Help application developers deploy their apps
•Advise on rollout plans
•Coordinate rollouts
•Generally, the group in-between all of operations and
all of the developers
•Lots of troubleshooting
•SREs write code (automation)
SREs Use Salt
•Using salt since 0.8.9
•Installation of new apps
•Config management
•Some troubleshooting
Salt Architecture
•Each physical data center
•multiple “fabrics” (logical grouping of hosts)
•single salt master (largest set of minions = 8+k)
•warm backup (same private key)
•minions configured with CNAME to master
•Files stored in subversion
•states, grains, modules
•runners
•reactor
Building Salt
•Internal fork from github
•Add another number. E.g. 2014.01.0.0
•Allows for internal only patches
•Create specific package for testing
•same git repo, with same tags
•LNKD-salt-dev-2014.01.0.0-12345.noarch.rpm
•Allows for emergency changes elsewhere
•salt-dev is deployed on a set of virtual machines
•custom test suite is run
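
The version scheme above (upstream release plus one extra internal number) can be sketched in a few lines of Python. This is only an illustration: the regular expression is inferred from the single file name on the slide, and reading the trailing "12345" as a build identifier is an assumption, as are the names `InternalBuild`, `parse_package` and `next_internal`.

```python
import re
from typing import NamedTuple


class InternalBuild(NamedTuple):
    upstream: str  # upstream release, e.g. "2014.01.0"
    internal: int  # extra number added for internal-only patches
    build: str     # trailing identifier in the file name (meaning assumed)


# Pattern inferred from the one example file name; not an official format.
PACKAGE_RE = re.compile(
    r"^LNKD-salt-dev-(?P<upstream>\d+\.\d+\.\d+)\.(?P<internal>\d+)"
    r"-(?P<build>\d+)\.noarch\.rpm$"
)


def parse_package(filename: str) -> InternalBuild:
    """Split a test-package file name into its version components."""
    match = PACKAGE_RE.match(filename)
    if match is None:
        raise ValueError(f"unrecognised package name: {filename}")
    return InternalBuild(match["upstream"], int(match["internal"]), match["build"])


def next_internal(build: InternalBuild) -> str:
    """Version string for the next internal patch on the same upstream release."""
    return f"{build.upstream}.{build.internal + 1}"
```

For the file name on the slide, `parse_package` yields upstream `2014.01.0` with internal number `0`, so the next internal-only patch would be versioned `2014.01.0.1`.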
Installing Salt
•OS is managed by cfengine
•cfengine will push new salt releases and restart
minions
•cfengine also manages minion configs
•master is a set of RPMs
•includes config
•Solaris install is handled by systems team
•Roll out to one data center at a time
•Entire process can take over a week
Salt Master
•salt master is wrapped in a “runit” script
•runit is a process supervisor
•restarts the master if it dies/stops
•salt API
•use the reactor system to send metrics
•metrics gathering is all home grown
•trying to open source it
•file updates (every 5 mins)
•modules, states, grains
Master Access
•Logins to the host are managed via cfengine
•Have to be in a whitelisted group to log on
•Access to salt command controlled via sudo
•sudo logs provide audit trail
•Disable cmd.* from salt cli
•If you want to automate, write a state and/or module
•salt API access via a whitelist of IPs
•Auth using LDAP
•Only a handful of commands
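
A minimal sketch of the "disable cmd.*" idea, assuming the check is a simple glob match on the function name. The slides do not say how the block is actually enforced, so `BLOCKED_PATTERNS` and `is_blocked` are hypothetical names and this is not Salt's own mechanism.

```python
from fnmatch import fnmatchcase

# Only "cmd.*" comes from the slide; the list form is an assumption.
BLOCKED_PATTERNS = ["cmd.*"]


def is_blocked(function_name, patterns=BLOCKED_PATTERNS):
    """Return True if the execution function matches any blocked glob."""
    return any(fnmatchcase(function_name, pattern) for pattern in patterns)
```

With this check `cmd.run` is rejected while a function such as `state.highstate` passes, which matches the intent of pushing users toward states and modules instead of ad hoc shell commands.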
Minions
•basic salt RPM
•includes “salt” command (unfortunately)
•module sync
•every hr
•small python script using client API
•minion metrics
•“age” of modules (via a tracker file)
•uptime of minion
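
The "age of modules via a tracker file" metric could look roughly like the sketch below. It assumes the hourly sync script touches a tracker file after each successful sync and that age is measured from the file's modification time; the slides give no path or threshold, so the function names and the two-hour default are illustrative only.

```python
import os
import time


def touch_tracker(tracker_path):
    """Record a successful sync by updating the tracker file's mtime."""
    with open(tracker_path, "a"):
        pass
    os.utime(tracker_path, None)


def module_age_seconds(tracker_path, now=None):
    """Seconds since the last recorded sync, or None if never synced."""
    try:
        last_sync = os.path.getmtime(tracker_path)
    except OSError:
        return None
    current = time.time() if now is None else now
    return max(0.0, current - last_sync)


def is_stale(tracker_path, max_age_seconds=2 * 3600, now=None):
    """Treat a missing tracker or an old one as stale (threshold assumed)."""
    age = module_age_seconds(tracker_path, now)
    return age is None or age > max_age_seconds
```

Reporting the age rather than a boolean lets the metrics system decide what counts as too old, which matters when syncing only happens once an hour.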
Deployment With Salt
•LinkedIn.com apps are deployed via a custom app
•App is showing its age and needs to be replaced
•Team outside of operations is writing new
deployment app
•Uses salt api
•Has a lot of custom code
•Not in salt
•Needs to deploy locally (for testing)
•This includes Mac desktops/laptops
Custom Modules and
States
•couchbase management (via runner)
•runit
•Apache Traffic Server
•metrics system
•alerts
•data collection
•data display
Module Promotion
•Small oversight last year caused massive issues
•Developed process to “promote” modules
•Salt environments:
•dev -> vm -> test -> stage -> prod
•different dirs in svn
•sparse directories
•minions are configured to look at certain
environments
•Changes are managed with “review board”
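
The promotion chain can be expressed as a small ordering check. The environment names come from the slide; the rule that a module may only move one step at a time is an assumption made for this sketch, and `next_environment` and `can_promote` are hypothetical helpers.

```python
# Order taken from the slide: dev -> vm -> test -> stage -> prod
PROMOTION_ORDER = ["dev", "vm", "test", "stage", "prod"]


def next_environment(current):
    """Return the environment after `current`, or None for the last one."""
    index = PROMOTION_ORDER.index(current)  # ValueError for unknown names
    if index == len(PROMOTION_ORDER) - 1:
        return None
    return PROMOTION_ORDER[index + 1]


def can_promote(source, target):
    """Allow promotion only to the immediately following environment."""
    return next_environment(source) == target
```

Under this rule a change in `test` can go to `stage`, but a jump from `dev` straight to `prod` is refused.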
Problems
•Education!
•Most salt customizations in 2 groups (out of 10)
•Few power users
•Corrupted keys
•Syncing only every hour
•No syncing on solaris
•No highstate enforcement
More Problems
•Lots of CPU issues on master
•Key management
•Reinstall of OS with same host name
Future
•Multi master
•shared job cache via file system isn’t what we want
•investigating using a returner to share job info
•More training
•Whitelist of states
•Non-ops users
•E.g. devs that want to deploy just their code
•Increase amount of data in grains
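
As a rough idea of what "using a returner to share job info" might involve: a Salt returner is a Python module exposing a `returner(ret)` function that receives each job return as a dict (including `jid`, `id`, `fun` and `return`). The sketch below stores those returns in SQLite purely as a stand-in for whatever shared datastore would really be chosen; `DB_PATH`, the table layout and the helper names are assumptions, not the design described in the talk.

```python
import json
import sqlite3

DB_PATH = "/tmp/salt_job_returns.sqlite"  # hypothetical location


def save_job(conn, ret):
    """Persist one minion's return for a job."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS job_returns "
        "(jid TEXT, minion_id TEXT, fun TEXT, result TEXT)"
    )
    conn.execute(
        "INSERT INTO job_returns VALUES (?, ?, ?, ?)",
        (ret["jid"], ret["id"], ret["fun"], json.dumps(ret["return"])),
    )
    conn.commit()


def returner(ret):
    """Entry point Salt calls with the return data of a job."""
    conn = sqlite3.connect(DB_PATH)
    try:
        save_job(conn, ret)
    finally:
        conn.close()


def returns_for_job(conn, jid):
    """Collect every minion's result for one job id."""
    rows = conn.execute(
        "SELECT minion_id, result FROM job_returns WHERE jid = ? "
        "ORDER BY minion_id",
        (jid,),
    )
    return {minion: json.loads(result) for minion, result in rows}
```

Any master with access to the same datastore could then call something like `returns_for_job` to look up results for a job that a different master published.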
More Future
•Pillar data
•Metrics
•Better visibility when things go wrong
•Tools to see job cache
•Logs on master are too chatty
•Ability to watch all traffic from specific minion(s)
•Key management
•reactor system, possibly
Questions?
http://www.linkedin.com/in/craigsebenik

SaltConf14 - Craig Sebenik, LinkedIn - SaltStack at Web Scale
