OpenStack at Ebsco
Nate Baechtold, IT Architect
Ebsco Information Services
August 23, 2016
Bulleted List
• The leading discovery service provider for libraries worldwide
with more than 10,000 discovery customers in over 100
countries.
• Preeminent provider of online research content for libraries,
including hundreds of research databases, historical archives,
point-of-care medical reference, and corporate learning tools
serving millions of end users at tens of thousands of institutions.
• Leading provider of electronic journals & books for libraries, with
more than 360,000 serials, including more than 57,000 e-journals,
as well as online access to more than 800,000 e-books.
What did we need?
• Self-service infrastructure for all development teams.
• Full-stack automation for all environments.
• Increase agility and productivity of operations and development
teams.
• Lower costs by leveraging open source solutions.
• Provide a solution that integrates well with other products and
allows other products and tools to easily integrate with it.
Why OpenStack?
• Easy-to-consume API that commoditizes infrastructure with the same
methodology used by public clouds.
• Abstraction of the underlying infrastructure, so that configuration or
hardware differences do not propagate to consumers and
automation.
• Standardized interface for compute, network and storage
• When software supports OpenStack it tends to “just work”
• Allows us to build an IaaS platform fit for live services and safely
hand out access to diverse teams through built-in project isolation.
• Prefer to tell consumers that “if you break it then it is our fault” rather than
giving them a long list of things that they should never do.
Current Scale
• 3 OpenStack clouds
• Approximately 1100 running
instances
• Almost 500,000 instances
created and destroyed since
general availability
• 68% of workloads concentrated
in development environments
• Around 1/3 of all virtualized
workloads currently on
OpenStack
Chart: Distribution By Running Instance (legend: DevQa, Live DC 1, Live DC 2; segment labels: 68%, 10%, 22%)
Design Philosophy
• Build a platform to run production applications.
• Multi-tenant at its core
• Should be able to safely support development and operations teams sharing
the same cloud.
• All tools needed to build a highly available production application
need to be available
• Good enough for development but not production is not an acceptable
permanent state.
• Build general purpose solutions. Customize as little as possible.
• Provide an easy menu of infrastructure offerings
• Easy to use solution with safeguards to encourage experimentation
• Development is easier when you don’t need to worry about breaking the
environment
Current Architecture
Ebsco Private Cloud Platform
• OpenStack Cloud: Nova, Neutron, Cinder, Glance, Keystone, Heat, Ceilometer, Horizon
• Monitoring
• Operations
• Dashboards
• Load Balancing
What we learned…
Problems to Solve:
• Skills and training
• Selection of vendors and
integrations
• Deployment
• Adoption
• Productionization
Skills and training:
Our Experiences
• Internally develop a core group of OpenStack
SMEs before progressing too far.
• Do not waste learning opportunities by relying
too much on professional services.
• Look for candidates with strong Linux,
networking, virtualization and Python skills
rather than OpenStack experience.
• Give your team the time and opportunity to
experiment and learn how OpenStack works.
• Vendor support lowers the amount of
expertise you need to go to production.
• OpenStack skills are
VERY hard to hire
• Administration requires
good Linux experience
• Inexperienced
administrators can
cause huge amounts of
damage
Vendors and
integrations:
Our Experiences
• Prefer products that align with
OpenStack’s multi-tenancy model
whenever possible.
• Focus on vendors building for cloud rather
than trying to integrate it afterwards.
• Look at areas to improve everywhere in
the stack. Re-evaluate your product
decisions. There is high value when an
integration is done right.
• You will not know how good a vendor’s
integration is until you try it. There can be
many hidden landmines with missing
capabilities or API support.
• Tons of vendor
integrations with varying
degrees of quality
• Many established
vendors
• Users need access to
everything that they
need to deploy and
manage a highly
available production
application
Case Study – Existing Load Balancing
• Existing vendor had limited OpenStack knowledge and bare bones
integration at the time.
• Actual quote from support after a bug was discovered (vendor specific
lines edited)
• “For now, to avoid a failover, I would recommend to program the OpenStack not
to delete IPs.”
• LBaaS v1 was extremely limited. Would not have covered all
production use cases.
• Product did not support safe multi-tenancy. There were shared resources that
were a point of failure.
• Prolonged evaluation period of 6-8 months resulting in rejection.
Case Study – Cloud Load Balancer (AVI)
• Installation involves providing OpenStack credentials and it handles the
rest.
• Allowed us to make production-grade load balancing generally available in
development within a week and production within a month.
• Multi-tenancy model aligns with OpenStack Projects and with Keystone
• Nobody had to ask for access. If you had access to OpenStack, then you had
access to load balancing services.
• No fighting with permissions or concerns about preventing untrained users from
damaging the environment.
Problems to Solve:
Our Experiences
• Align resources for storage, networking
and datacenter teams and make sure that
someone on each team will make
troubleshooting installation issues a top
priority.
• OpenStack requires tight integration with all of
these elements. A slow troubleshooting
feedback loop will have a very negative effect
on the deployment.
• Understand what deployment choices are
difficult to change afterwards and make
sure that you got them right.
• Assume multiple tries to get a production
ready configuration.
• Deployment
• Deployments take a
long time and are
complex
• Some OpenStack
functionality is not ready
for production
Problems to Solve:
Our Experiences
• Have a close relationship with your early
adopters. They will help you increase the
resiliency of your deployment.
• Regularly speak with them in person to help them
understand OpenStack and to let them tell you
about issues before they become a problem.
• Get deployments into your users' hands as
soon as possible.
• Do not stall getting to production. Teams will
not want to code to an API that they cannot
use in production.
• Adoption will be limited until you can get
production availability.
• Solving problems “just for development
environments” is the wrong mentality.
• Early feedback is critical.
• Adoption
• Adoption is one of the
most critical elements to
success.
Problems to Solve:
Our Experiences
• Monitor OpenStack by actually using
OpenStack. Build instances and use
OpenStack functionality to detect failures.
• OpenStack is very complex and understanding
the effect of a failure can be difficult.
• If you monitor by using OpenStack you will
catch most failures before your users do and
know what functionality is impacted.
• Automate common operational and
maintenance tasks.
• OpenStack HA is complex but needed for
all environments.
• Productionization
• OpenStack provides
building blocks but
some assembly is
required to build a
product out of it.
• Monitoring and common
operational tasks are
not solved out of the
box.
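The idea of monitoring OpenStack by actually using it can be sketched roughly as follows. This is a minimal illustration, not the framework described in the talk: `FakeCloud`, `build_steps`, and the step names are hypothetical stand-ins, and in a real check each step would wrap a call made through whichever OpenStack client you use.

```python
import time
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Step = Tuple[str, Callable[[], None]]


@dataclass
class StepResult:
    name: str
    ok: bool
    seconds: float
    error: Optional[str] = None


def run_synthetic_check(
    steps: List[Step], cleanup: Optional[Callable[[], None]] = None
) -> List[StepResult]:
    """Run the steps in order, stop at the first failure, always attempt cleanup."""
    results: List[StepResult] = []
    try:
        for name, action in steps:
            started = time.monotonic()
            try:
                action()
            except Exception as exc:
                results.append(
                    StepResult(name, False, time.monotonic() - started, str(exc))
                )
                break
            results.append(StepResult(name, True, time.monotonic() - started))
    finally:
        if cleanup is not None:
            try:
                cleanup()
            except Exception:
                pass  # a real check would alert on leaked probe resources
    return results


def unverified_steps(results: List[StepResult], steps: List[Step]) -> List[str]:
    """Names of steps that failed or never ran, i.e. functionality not confirmed."""
    passed = {r.name for r in results if r.ok}
    return [name for name, _ in steps if name not in passed]


class FakeCloud:
    """Hypothetical stand-in for a real OpenStack client (illustration only)."""

    def __init__(self, broken_operation: Optional[str] = None):
        self.broken_operation = broken_operation
        self.probe_exists = False

    def _call(self, operation: str) -> None:
        if operation == self.broken_operation:
            raise RuntimeError(f"{operation} failed")

    def boot_instance(self) -> None:
        self._call("boot_instance")
        self.probe_exists = True

    def attach_volume(self) -> None:
        self._call("attach_volume")

    def ping_instance(self) -> None:
        self._call("ping_instance")

    def delete_instance(self) -> None:
        self.probe_exists = False


def build_steps(cloud: FakeCloud) -> List[Step]:
    return [
        ("boot_instance", cloud.boot_instance),
        ("attach_volume", cloud.attach_volume),
        ("ping_instance", cloud.ping_instance),
    ]


if __name__ == "__main__":
    cloud = FakeCloud(broken_operation="attach_volume")
    steps = build_steps(cloud)
    results = run_synthetic_check(steps, cleanup=cloud.delete_instance)
    print(unverified_steps(results, steps))
```

Because each step is named after the functionality it exercises, a failed run points at what is impacted rather than only reporting that something is wrong.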
What we did…
Phased Environments…
Prototype
• Single machine all
in one deployment
• Learn basics
• Validate direction
• Disposable
environment
Interim
• Break apart compute
and control
• Limited release to
early adopters
• Get feedback and
determine desired
configuration
DevQa
• Highly available
environment
• Treated like production
• General availability for
development workloads
• Determine
productionization tasks
needed
Production
• Implement
productionization
tasks
• Deploy production
clouds
What wound up happening…
Prototype Interim DevQa Production
Took too long to get
to production…
• Critical team member left
• Took too long finding a
replacement due to focus on
hiring OpenStack skillset.
• Additional work for monitoring
and operations automation
was required before we were
confident hosting production
workloads.
• Required skillsets that were not
a part of the OpenStack team
and focused manpower.
Solution: Create a focus squad
• Kicked off a 6-week effort with a cross-functional team that had
all required skills.
• This team would focus 100% on getting OpenStack to live.
• OpenStack tasks must be top priority for all team members.
• Director quote: “Set your email to out of office if you have to.”
• The focused effort was incredibly efficient.
• Feedback loops for troubleshooting massively reduced.
• Reduction of blocked tasks created a higher quality implementation.
What did the focus squad do?
• Created a reliable monitoring solution based on Zabbix and a Python
framework for executing OpenStack checks.
• Created automated recovery for problems discovered in DevQa.
• Automated compute node evacuation
• Automated failed OpenStack service recovery
• Increased visibility into the environment with Zabbix and Grafana.
• Automated common operational tasks as push-button jobs in
Rundeck.
• Taking a compute or control node out of service
• Restarting OpenStack services
• Deployed all production OpenStack, Zabbix and Rundeck
infrastructure.
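Automated recovery of failed services, as listed above, could look something like the sketch below. The slides do not show the actual implementation, so everything here is an assumption: `recover_services` and `SimulatedHost` are hypothetical names, the service names are only examples, and the health check and restart are injected callables rather than real Zabbix or Rundeck integrations.

```python
from typing import Callable, Dict, List


def recover_services(
    is_healthy: Callable[[str], bool],
    restart: Callable[[str], None],
    services: List[str],
    max_restarts: int = 2,
) -> Dict[str, str]:
    """Restart unhealthy services a bounded number of times, then escalate."""
    outcome: Dict[str, str] = {}
    for service in services:
        if is_healthy(service):
            outcome[service] = "healthy"
            continue
        for _attempt in range(max_restarts):
            restart(service)
            if is_healthy(service):
                outcome[service] = "recovered"
                break
        else:
            # Restarts did not help; hand the problem to a human.
            outcome[service] = "escalate"
    return outcome


class SimulatedHost:
    """Hypothetical host where each service needs N restarts to become healthy."""

    def __init__(self, restarts_needed: Dict[str, int]):
        self.restarts_needed = dict(restarts_needed)

    def is_healthy(self, service: str) -> bool:
        return self.restarts_needed.get(service, 0) == 0

    def restart(self, service: str) -> None:
        if self.restarts_needed.get(service, 0) > 0:
            self.restarts_needed[service] -= 1


if __name__ == "__main__":
    host = SimulatedHost(
        {"nova-compute": 0, "neutron-server": 1, "cinder-volume": 5}
    )
    print(recover_services(host.is_healthy, host.restart, list(host.restarts_needed)))
```

Capping the number of restarts keeps the automation from masking a persistent fault: anything that does not recover quickly is surfaced for a person to investigate.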
Tracking Success…
• Critical to getting continued commitment but hard to determine.
• We track the following metrics:
• Instance count and resource usage
• Number of teams and products leveraging OpenStack
• The number of instances created and deleted
• This can be a good indicator as to whether OpenStack was the right fit for your
organization. Indicates people using automation as opposed to manual usage.
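As a rough illustration of the metrics listed above, the sketch below derives created, deleted, and running instance counts plus the number of active projects from a list of lifecycle events. The event format and the project names are assumptions made for the example; the talk does not say how these numbers are actually collected.

```python
from collections import Counter
from typing import Dict, Iterable, Tuple


def summarize_usage(events: Iterable[Tuple[str, str]]) -> Dict[str, int]:
    """Summarize (action, project) events where action is 'create' or 'delete'."""
    totals: Counter = Counter()
    running_by_project: Counter = Counter()
    for action, project in events:
        if action == "create":
            running_by_project[project] += 1
        elif action == "delete":
            running_by_project[project] -= 1
        else:
            raise ValueError(f"unknown action: {action!r}")
        totals[action] += 1
    active = {p: n for p, n in running_by_project.items() if n > 0}
    return {
        "created": totals["create"],
        "deleted": totals["delete"],
        "running": sum(active.values()),
        "active_projects": len(active),
    }


if __name__ == "__main__":
    sample = [
        ("create", "search"),
        ("create", "search"),
        ("delete", "search"),
        ("create", "billing"),
        ("delete", "billing"),
    ]
    print(summarize_usage(sample))
```

A created count far larger than the running count is the pattern the slide describes: instances are being built and torn down by automation rather than managed by hand.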
Thank You
Questions?

