#unidevops

Software Operability,
Run Book Collaboration,
and DevOps

Matthew Skelton
27th February 2014
DevOps Summit,
London, UK
www.devopssummit.com
@matthewpskelton
softwareoperability.com
• Software Operability
• Run Book Collaboration
• Making Operability Work
• Questions

Agenda
• Software systems since 1998
• Continuous Delivery specialist,
DevOps enthusiast, Operability nut
• London Continuous Delivery meetup
group - londoncd.org.uk
• Experience DevOps workshops
• PIPELINE Conference

Background

Software
Operability
• Definitions
• Examples
• Why focus on operability?
• How DevOps can help

Software Operability

Operability?
• Cognates:
– Opera
– Operate
– Operational
– Inter-operability

Etymology of Operability?
• Operability: the properties of a
system which make it work well in
Production

Software Operability
Since 1929,
Mallorca, Spain

Operable Systems
• David Copeland (@davetron5000):
“How your software runs in
production is all that matters. The
most amazing abstractions, cleanest
code, or beautiful algorithms are
meaningless if your code doesn’t run
well on production.”
• http://www.naildrivin5.com/blog/2013/06/16/production-is-all-that-matters.html

Software Operability
• Deploy
• Monitor
• Diagnose
• Debug
• Query
• Control
• Inspect
• Clear
• ...

Operational Criteria

“Non-Functional”
• Hooks (internal APIs) for:
– Logging
– Monitoring
– Diagnostics
– Health checks
– Data clear-down
– Service / daemon / container control
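
One way to picture a health-check hook from the list above is a small registry that the application exposes to whoever operates it. The following is a minimal sketch, not something from the talk: the names `CheckResult`, `HealthRegistry`, `register` and `report` are all hypothetical, and a real system would likely expose the report over HTTP or a management port.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class CheckResult:
    """Outcome of one health check (hypothetical structure)."""
    healthy: bool
    detail: str = ""


class HealthRegistry:
    """Collects named health checks so Ops can query them through one hook."""

    def __init__(self) -> None:
        self._checks: Dict[str, Callable[[], CheckResult]] = {}

    def register(self, name: str, check: Callable[[], CheckResult]) -> None:
        self._checks[name] = check

    def report(self) -> dict:
        """Run every check; a check that raises is reported as unhealthy."""
        results = {}
        for name, check in self._checks.items():
            try:
                result = check()
            except Exception as exc:
                result = CheckResult(False, f"check raised {exc!r}")
            results[name] = {"healthy": result.healthy, "detail": result.detail}
        overall = all(entry["healthy"] for entry in results.values())
        return {"healthy": overall, "checks": results}
```

A component would call something like `registry.register("database", check_database)` at start-up, and a monitoring tool would read `registry.report()`. Treating a crashing check as unhealthy, rather than letting it propagate, keeps the hook itself usable during an incident.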

Shaped by Operability

Ops Folk are Users Too!
• Deploy more rapidly, frequently
• High cost of Production outage
• Systems now more complicated

Why focus on Operability?

Outages are Embarrassing!

Operational considerations

Operational considerations

Operational considerations
• DevOps is one way to address
poor operability
• Improved collaboration and
communication between Dev
teams and Ops teams
• Example: Run Book Collaboration

How DevOps can help

Run Book
Collaboration
• Feedback loops and learning
• What is a run book?
• How can run book collaboration
help operability?

Run Book Collaboration
Gene Kim:
http://itrevolution.com/the-three-ways-principles-underpinning-devops/

Feedback Loops

Run Book

Templates

Example
• 1 Table of Contents
• 2 System Overview
– 2.1 Service Overview
– 2.2 Contributing Applications, Daemons, and Windows Services
– 2.3 Hours of Operation
– 2.4 Execution Design
– 2.5 Infrastructure and Network Design
– 2.6 Resilience, Fault Tolerance and High-Availability
– 2.7 Throttling and Partial Shutdown
– 2.8 Required Resources
– 2.9 Expected Traffic and Load
  – 2.9.1 Hot or Peak Periods
  – 2.9.2 Warm Periods
  – 2.9.3 Cool or Quiet Periods
– 2.10 Environmental Differences
– 2.11 Tools
• 3 Security and Access Control
• 4 System Configuration
– 4.1 Configuration Management
• 5 System Backup and Restore
– 5.1 Backup Requirements
  – 5.1.1 Special Files
– 5.2 Backup Procedures
– 5.3 Restore Procedures
• 6 Monitoring and Alerting
– 6.1 Error Messages
– 6.2 Events
– 6.3 Health Checks
– 6.4 Other Messages
• 7 Operational Tasks
– 7.1 Deployment
– 7.2 Batch Processing
– 7.3 Power Procedures
– 7.4 Routine Checks
  – 7.4.1 System Rebuilds
– 7.5 Troubleshooting
• 8 Maintenance Tasks
– 8.1 Maintenance Procedures
  – 8.1.1 Patching
    – 8.1.1.1 Normal Cycle
    – 8.1.1.2 Zero-Day Vulnerabilities
  – 8.1.2 GMT/BST time changes
  – 8.1.3 Cleardown Activities
    – 8.1.3.1 Log Rotation
– 8.2 Testing
  – 8.2.1 Technical Testing
  – 8.2.2 Post-Deployment
• 9 Failure and Recovery Procedures
– 9.1 Failover
– 9.2 Recovery
– 9.3 Troubleshooting Failover and Recovery
• 10 Contact Details
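
A template like this can be kept as data so that Dev and Ops start each new system from the same skeleton and can see at a glance which sections are still unaddressed. The sketch below is an illustration only: it uses the ten top-level headings from the example, while the `PLACEHOLDER` text and the function names `render_skeleton` and `unfinished_sections` are assumptions, not part of any published template.

```python
from typing import List, Tuple

# Top-level headings copied from the example run book template.
SECTIONS: List[Tuple[str, str]] = [
    ("1", "Table of Contents"),
    ("2", "System Overview"),
    ("3", "Security and Access Control"),
    ("4", "System Configuration"),
    ("5", "System Backup and Restore"),
    ("6", "Monitoring and Alerting"),
    ("7", "Operational Tasks"),
    ("8", "Maintenance Tasks"),
    ("9", "Failure and Recovery Procedures"),
    ("10", "Contact Details"),
]

# Hypothetical marker for sections nobody has discussed yet.
PLACEHOLDER = "TODO: agree content with Dev and Ops"


def render_skeleton(system_name: str) -> str:
    """Return a plain-text run book skeleton with a placeholder per section."""
    lines = [f"Run Book: {system_name}", ""]
    for number, title in SECTIONS:
        lines.append(f"{number} {title}")
        lines.append(f"  {PLACEHOLDER}")
        lines.append("")
    return "\n".join(lines)


def unfinished_sections(run_book_text: str) -> List[str]:
    """Return the headings whose body still contains the placeholder."""
    unfinished: List[str] = []
    current = None
    for line in run_book_text.splitlines():
        if line[:1].isdigit():
            # Headings are the only lines that start with a digit.
            current = line.strip()
        elif PLACEHOLDER in line and current and current not in unfinished:
            unfinished.append(current)
    return unfinished
```

Running `unfinished_sections` over the shared run book before a release gives a simple prompt for the next conversation between the teams.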

Example
• 1 Table of Contents
• 2 System Overview
– 2.1 Service Overview
– 2.2 Contributing Applications, Daemons, and Windows Services
– 2.3 Hours of Operation
– 2.4 Execution Design
– 2.5 Infrastructure and Network Design
– 2.6 Resilience, Fault Tolerance and High-Availability
– 2.7 Throttling and Partial Shutdown
– 2.8 Required Resources
– 2.9 Expected Traffic and Load
• 3 Security and Access Control
• 4 System Configuration
• 5 System Backup and Restore
• 6 Monitoring and Alerting
• 7 Operational Tasks
• 8 Maintenance Tasks
• 9 Failure and Recovery Procedures
• 10 Contact Details

Example
2.1 Service Overview
2.2 Contributing Applications, Daemons, and Windows Services
2.3 Hours of Operation
2.4 Execution Design
2.5 Infrastructure and Network Design
2.6 Resilience, Fault Tolerance and High-Availability
2.7 Throttling and Partial Shutdown
2.8 Required Resources
2.9 Expected Traffic and Load

It’s Not Documentation

Focus on Collaboration
• Better understanding
• Better cross-team working
• Reduction in operational problems
• Fewer outages
• Reduced long-term cost-of-ownership

Outcomes
• Focus on the collaboration
• Run book is a means, not an end
• Throw it away when complete (?)
• Aim to automate more over time
• See http://runbookcollab.info/
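
As an illustration of automating more over time, a manual run book step such as a cleardown activity can be turned into a small script once both teams agree what it should do. This is a sketch under assumptions: the `*.log` pattern, the age threshold and the function name `clear_down_old_logs` are hypothetical, and it defaults to a dry run so the list of files can be reviewed before anything is deleted.

```python
import time
from pathlib import Path
from typing import List, Optional


def clear_down_old_logs(log_dir: Path, max_age_days: int,
                        now: Optional[float] = None,
                        dry_run: bool = True) -> List[Path]:
    """Return the *.log files older than max_age_days; delete them unless dry_run."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # seconds in a day
    expired = sorted(
        path for path in log_dir.glob("*.log")
        if path.is_file() and path.stat().st_mtime < cutoff
    )
    if not dry_run:
        for path in expired:
            path.unlink()
    return expired
```

The run book entry then shrinks to "run the cleardown script and check its output", and the script itself becomes the shared, reviewable description of the task.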

Run Book as Collaboration

Making Operability
Work
• NFRs vs Operational Features
• Budget changes
• Organisation changes
• Responsibility changes
• Avoid on-call anti-patterns

Making Operability Work

“Non-Functional”
Features

Operational Features
• Single product backlog
– End-user + Operational features
– New features + bugs

• Product Owner on call
– Accountable for operational failures
– Seriously!

Taking Operability Seriously
• “What is your budget code?”
• Capex vs. Opex?
• Remove budget barriers to
regular, effective communication

Budget changes

Niek Bartholomeus (@niekbartho) - http://niek.bartholomeus.be/

https://speakerdeck.com/niekbartho/self-organization-vs-global-optimization-a-comparison-between-traditional-and-modern-organizations
• “I’ll need to ask my manager first”
• Lack of autonomy
• Remove reporting barriers to regular, effective communication
• More at http://bit.ly/DevOpsTopologies

Organisation changes

“I just want to write code”

Mysterious Coding Tricks

On-call for Responsibility
• Too much overtime pay
• Too little overtime pay
• Rota team too small
• No training in incident response
• No team ownership of product
• No team autonomy for changes

On-call Anti-Patterns
• Team members want to help
make things better
• Empowered to fix problems
• Reduce the times they are woken
up

On call - Goal
• Operational Features, not “NFRs”
• Sustainable collaboration
• Sensible, fair on-call rotas
• Over-compensate in time off
• Avoid burn-out

The operability of operability

Recapitulation
Making software
systems work well
in Production

Software Operability
Shared focus on
operability
throughout the
delivery cycle

Run Book Collaboration
Use DevOps
team patterns for
sustainable
operability

Making Operability Operable

What’s Next?
• Patterns for
Performance and
Operability
– Ford, Gileadi, Purba,
Moerman

• http://whoownsmyoperability.com/
– Recommended reading lists

Further Reading
• Release It!
– Michael Nygard
(@mnygard)

• http://www.michaelnygard.com/

Further Reading
• Software Operability – How to make
software work well in Production
– Due late 2014

• Sign up at OperabilityBook.com

• Discount code for DevOps Summit
attendees

Operability Book
• A hands-on workshop for DevOps
culture

• Forthcoming dates:
– London: 28th February 2014

• http://experiencedevops.org/

Experience DevOps
• Continuous Delivery
• ‘Unconference’ format
• Tuesday 8th April 2014
• London, UK
• http://pipelineconf.info/
• @PipelineConf

PIPELINE

Matthew Skelton
@matthewpskelton

Questions &
Discussion

softwareoperability.com
operabilitybook.com
bit.ly/DevOpsTopologies
http://www.blinkenlights.nl/images/blinkenlights-big.jpeg
http://www.danatronics.com/s db_apps.html
http://riverbankoftruth.com/wp-content/uploads/2013/07/embarrassedchimp22.jpg
http://www.thinkgeek.com/edm/20040709.html

Acknowledgements

http://indianaohindiana.com/wp-content/uploads/2013/10/Tome.jpg
http://www.guavaworks.com/companyblog/guava-doesnt-do-cookie-cutter.html
http://www.carpages.co.uk/ford/ford-sandsculptures-05-09-11.asp
http://www.thisismoney.co.uk/money/experts/article-2324270/Take-smaller-pension-pots-taxfree-leave-final-salary-untouched.html
http://paranoidnews.org/wp-content/uploads/2010/10/Alien-Hunt-AlarmClock.jpg
http://particulations.blogspot.co.uk/2010/08/headingley-hole.html
http://marvel.wikia.com/Stephen_Strange_(Earth-616)
http://pianofortekeys.files.wordpress.com/2013/04/ariadnne_wideweb__470x3300.jpg

Further Slides

The Phoenix Project

Continuous Delivery

Software operability and run book collaboration London Feb 2014