FROM APOLLO 13 TO GOOGLE SRE

WHEN DEVOPS MET SRE
Sanjeev Sharma
@sd_architect | http://sdarchitect.blog
#WHOAMI
• 20+ Years in Software Development and Delivery
• Past: IBM Distinguished Engineer and CTO for DevOps Adoption
• Now: Global Practice Lead for Data Transformation at Delphix
• Author of two DevOps books:
  • DevOps For Dummies: https://ibm.biz/BdsPMX
  • The DevOps Adoption Playbook: http://amzn.to/2hH7rt2
• Blog: https://sdarchitect.blog
• Tweets: @sd_architect
WHAT IS SRE?
“SRE is what happens when you ask a software engineer to design an operations team.”
Betsy Beyer, Chris Jones, Jennifer Petoff, and Niall Richard Murphy, “Site Reliability Engineering”
Site Reliability Engineering (SRE): Google’s approach to Service Management
RELIABILITY: THE REAL AVAILABILITY NUMBERS!
How much downtime does 5-nines 99.999% availability translate to?
• Daily: 0.9s
• Weekly: 6.0s
• Monthly: 26.3s
• Yearly: 5m 15.6s
4-nines or 99.99% translates to downtime of:
• Daily: 8.6s
• Weekly: 1m 0.5s
• Monthly: 4m 23.0s
• Yearly: 52m 35.7s
Even the more common 99.95% availability SLO allows a mere 43 seconds/day, or about 5 minutes 2 seconds/week.
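The downtime figures above follow from simple arithmetic: allowed downtime = period length × (1 − availability). The sketch below is one way to reproduce them; the function name `allowed_downtime` and the use of average Gregorian month and year lengths are assumptions made for illustration, not something stated on the slide.

```python
# Period lengths in seconds. The month and year use average Gregorian lengths
# (an assumption; it matches the monthly and yearly figures quoted above).
SECONDS_PER_DAY = 24 * 60 * 60
PERIODS = {
    "daily": SECONDS_PER_DAY,
    "weekly": 7 * SECONDS_PER_DAY,
    "monthly": 365.2425 / 12 * SECONDS_PER_DAY,
    "yearly": 365.2425 * SECONDS_PER_DAY,
}


def allowed_downtime(availability_percent: float) -> dict[str, float]:
    """Return the downtime budget, in seconds, for each period."""
    unavailable_fraction = 1 - availability_percent / 100
    return {name: seconds * unavailable_fraction for name, seconds in PERIODS.items()}


if __name__ == "__main__":
    for slo in (99.999, 99.99, 99.95):
        budget = allowed_downtime(slo)
        summary = ", ".join(f"{name}: {secs:.1f}s" for name, secs in budget.items())
        print(f"{slo}% -> {summary}")
```

Running it for a different SLO only requires adding the percentage to the tuple in the loop.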
APOLLO 13 – THE REAL HEROES
Image Courtesy:
Universal Pictures, NASA
EIGHT TENETS OF GOOGLE SRE
1. Ensuring a Durable Focus on Engineering
2. Pursuing Maximum Change Velocity Without Violating a Service’s SLO
3. Monitoring
4. Emergency Response
5. Change Management
6. Demand Forecasting and Capacity Planning
7. Provisioning
8. Efficiency and Performance
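Tenet 2 is often put into practice with an error budget: the failures an SLO tolerates are treated as a budget that releases may spend. The sketch below is a minimal illustration under assumptions not found in the slides: a request-based SLO, and the hypothetical helper names `error_budget_remaining` and `can_release`.

```python
def error_budget_remaining(slo_percent: float, total_requests: int, failed_requests: int) -> float:
    """Fraction of the error budget still unspent (negative once the SLO is violated)."""
    allowed_failures = total_requests * (1 - slo_percent / 100)
    if allowed_failures == 0:
        # A 100% SLO (or no traffic) leaves no budget to spend.
        return 0.0 if failed_requests == 0 else float("-inf")
    return 1 - failed_requests / allowed_failures


def can_release(slo_percent: float, total_requests: int, failed_requests: int) -> bool:
    """Hypothetical release gate: allow changes only while budget remains."""
    return error_budget_remaining(slo_percent, total_requests, failed_requests) > 0


if __name__ == "__main__":
    # Illustrative numbers only: a 99.9% SLO over one million requests tolerates about 1,000 failures.
    print(can_release(99.9, 1_000_000, 250))    # budget left, keep shipping
    print(can_release(99.9, 1_000_000, 1_500))  # budget exhausted, slow down
```

The design choice is that velocity is throttled by measured reliability rather than by a fixed release calendar.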
BEST PRACTICES OF INCIDENT MANAGEMENT
1. Prioritize
2. Prepare
3. Trust
4. Introspect
5. Consider alternatives
6. Practice
7. Change it around
Image Courtesy:
Universal Pictures, NASA
DevOps + SRE in the Enterprise
[Diagram: four delivery pipelines, each with the stages Development, SCM, Build, Package, Repo, Deploy, leading into Test, Stage, and Production. Labels: Enterprise Release, Business Capability.]
Pipelines shown, one per application:
• Mainframe Hosted App
• Mobile App
• App Server Monolithic App
• Cloud Native App
Agile/Innovation Edge: Rapid Delivery for Innovation • Agile • Antifragile • Experimentation • New and Innovative • Hybrid Cloud • IaaS/PaaS • Containers
Industrialized Core: Deliver at regular cadence • Agile • Stability • Predictability • Lean Delivery pipeline • Core and Legacy Systems
Hybrid Infrastructure – Physical, Cloud • IaaS/PaaS • Containers
Standardization Across Delivery Pipelines
[Diagram: four delivery pipelines, each with the stages Development, SCM, Build, Package, Repo, Deploy, leading into Test, Stage, and Production, for Application A, Application B, Application C, and Application N. Labels: Enterprise Release, Business Capability.]
Agile/Innovation Edge: Rapid Delivery for Innovation • Agile • Antifragile • Experimentation • New and Innovative • Hybrid Cloud • IaaS/PaaS • Containers
Industrialized Core: Deliver at regular cadence • Agile • Stability • Predictability • Lean Delivery pipeline • Core and Legacy Systems
Hybrid Infrastructure – Physical, Cloud • IaaS/PaaS • Containers
Areas to standardize:
• Deployment Automation and Orchestration
• Service and Test Environment Virtualization
• Architecture
• Planning
• Release Management
• Operational Readiness
PLANNING
Your Delivery Pipeline will be only as fast as the slowest Delivery Pipeline it depends on.
Data Friction is usually the last challenge to be addressed.
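The "slowest pipeline" point can be sketched as a maximum over a dependency graph. Everything in the example is hypothetical: the pipeline names, the lead times in days, and the function `effective_lead_time` are invented for illustration and do not come from the talk.

```python
def effective_lead_time(pipeline, lead_time_days, depends_on, _seen=None):
    """Slowest lead time among a pipeline and everything it transitively depends on."""
    seen = set() if _seen is None else _seen
    if pipeline in seen:
        return 0.0  # already counted (also guards against dependency cycles)
    seen.add(pipeline)
    slowest = lead_time_days[pipeline]
    for dependency in depends_on.get(pipeline, []):
        slowest = max(slowest, effective_lead_time(dependency, lead_time_days, depends_on, seen))
    return slowest


# Hypothetical lead times (days) and dependencies.
LEAD_TIME_DAYS = {"mobile_app": 1, "app_server": 5, "mainframe": 30, "test_data": 21}
DEPENDS_ON = {"mobile_app": ["app_server", "test_data"], "app_server": ["mainframe"]}

if __name__ == "__main__":
    for name in LEAD_TIME_DAYS:
        print(name, effective_lead_time(name, LEAD_TIME_DAYS, DEPENDS_ON))
```

In this made-up graph the mobile app can build daily, yet it is gated by the mainframe pipeline, and test data is the next-slowest dependency behind it.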
ARCHITECTURE
Modernizing to a Microservices-based Architecture: Refactoring Code and Data
Code can be started afresh; Data cannot.
APPLICATION DEPLOYMENT AND ENVIRONMENT ORCHESTRATION
Developers are paid to write code, not maintain deployment and configuration scripts.
DBAs are paid to manage Data and Datastores, not generate Test Data sets.
TEST SERVICE AND ENVIRONMENT VIRTUALIZATION
If you are doing 2-week Sprints, but it takes 3 weeks to get a Test Environment and Test Data sets, how long are your Sprints?
RELEASE MANAGEMENT
It is not possible to patch the software of a missile AFTER it has been launched.
OPERATIONAL READINESS FOR SRE
Shift thinking from Mean Time Between Failures (MTBF) to Mean Time To Repair (MTTR).
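One way to see why the shift matters is the standard steady-state relation availability = MTBF / (MTBF + MTTR). The numbers below are made up for illustration, and the function name `availability` is an assumption; the slide itself gives no figures.

```python
def availability(mtbf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability from mean time between failures and mean time to repair."""
    return mtbf_hours / (mtbf_hours + mttr_hours)


if __name__ == "__main__":
    # Hypothetical baseline: a failure every 500 hours, 4 hours to repair.
    print(f"baseline:           {availability(500, 4):.4%}")
    # Doubling MTBF (failing half as often).
    print(f"double MTBF:        {availability(1000, 4):.4%}")
    # Cutting MTTR to 30 minutes instead.
    print(f"MTTR of 30 minutes: {availability(500, 0.5):.4%}")
```

With these assumed numbers, shortening repair time improves availability more than halving the failure rate does.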
ANTIFRAGILE SYSTEMS
Antifragile: Things that are neither fragile nor robust, but rather thrive in chaos.
WHEN DEVOPS MEETS SRE
DevOps: “Everyone is responsible for delivering Business Value.”
SRE: “(Everyone) is responsible for delivering Continuous Business Value.”
THANK YOU

Any questions?
@sd_architect
http://sdarchitect.blog
delphix.com

Cloud expo 2018: From Apollo 13 to Google SRE - When DevOps meets SRE
