Rewriting DevOps
Lessons from a 15-month software rewrite
Matthew Boeckman
Dryas.io
Strategic Consulting
Startup Advisor
DevOps Coaching
19 years (dev)Ops
Craftsy
Former VP - Infrastructure
@matthewboeckman
Yarn?!
Founded 4/2010
200+% YoY growth through 2015
Yes, Yarn!
Before & After … in our DevOps
Before:
● Devs wrote code
● Ops wrote infrastructure
● Configuration automation (Puppet)
● Shared on-call (with escalation)
● Separate workstreams
15-month agilefall (waterscrum?) rewrite
After:
● Dev and Ops write code and infrastructure
● Immutable infrastructure
● Shared on-call (with less escalation)
● Joined workstreams
Before & After … in the app
Before:
● Java/Postgres monolith
● Sessions
● Poor caching model
● Tightly coupled FE/BE
● Limited scale
15-month agilefall (waterscrum?) rewrite
After:
● (micro)Services
● API calls
● Flexible caching (multi-tier)
● Decoupled logic
● Less limited scale!
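The slides do not say how the "flexible caching (multi-tier)" was built, so the following is only a minimal sketch of the general idea, not the Craftsy implementation. It assumes a small in-process LRU tier in front of a larger shared tier (a plain dict stands in for it here), with a hypothetical `loader` callable representing the slow backend call. The class name `TwoTierCache` and all of its attributes are illustrative.

```python
from collections import OrderedDict
from typing import Any, Callable, Dict, Hashable


class TwoTierCache:
    """Illustrative two-tier cache: a small LRU (tier 1) over a larger store (tier 2)."""

    def __init__(self, loader: Callable[[Hashable], Any], l1_size: int = 2) -> None:
        self.loader = loader            # hypothetical slow backend call
        self.l1: "OrderedDict[Hashable, Any]" = OrderedDict()  # tier 1: in-process LRU
        self.l1_size = l1_size
        self.l2: Dict[Hashable, Any] = {}  # tier 2: stand-in for a shared cache
        self.loads = 0                  # counts how often the backend was hit

    def get(self, key: Hashable) -> Any:
        # Tier 1 hit: refresh recency and return.
        if key in self.l1:
            self.l1.move_to_end(key)
            return self.l1[key]
        # Tier 2 hit: reuse the stored value without touching the backend.
        if key in self.l2:
            value = self.l2[key]
        else:
            # Miss in both tiers: load from the backend and fill tier 2.
            value = self.loader(key)
            self.loads += 1
            self.l2[key] = value
        self._promote(key, value)
        return value

    def _promote(self, key: Hashable, value: Any) -> None:
        # Insert into tier 1 and evict the least recently used entries if full.
        self.l1[key] = value
        self.l1.move_to_end(key)
        while len(self.l1) > self.l1_size:
            self.l1.popitem(last=False)


if __name__ == "__main__":
    cache = TwoTierCache(loader=lambda k: str(k).upper(), l1_size=2)
    for key in ["a", "b", "c", "a"]:
        cache.get(key)
    print("backend loads:", cache.loads)
```

In this sketch the second lookup of `"a"` is served from tier 2 even though it was evicted from tier 1, which is the property that lets each tier be sized and expired independently.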
Before & After … in our culture
Before:
● Weekly release ceremony
● High risk profile
● Full day regression
● Dev and Ops work streams
● Functional teams
15-month agilefall (waterscrum?) rewrite
After:
● CD
● Low risk
● Test automation
● Shared work, responsibility
● Objective based teams
ewww
The new stack
Fastly - Content Delivery
F5 & ALB - load balancing
Node.js/React - FE
Java/Spring - BE
Packer - AMIs
Consul - service discovery
Terraform - Infrastructure
Postgres/RDS - database
SQS/SNS/Lambda/S3 - everything else
Jenkins - Orchestration & Delivery
Timeline
2010-2015 | 7/15-9/15 | 9/15 - 2/16 | 2/16 - 9/16 | 9/30/16
DevOps! Bi-Modal SRE Foundations Hugs!
Phase I Phase II Phase III LAUNCH
Phase II for the Ops team
Take a chance that went badly?
Try something new that exploded?
Accidentally push something past the edge?
Here’s an award!
The enrolled sash
DevOps needs culture
Phase III
2010-2015 | 7/15-9/15 | 9/15 - 2/16 | 2/16 - 9/16 | 9/30/16
Phase I Phase II Phase III LAUNCH
DevOps! Bi-Modal SRE Foundations Hugs!
Silos? In my DevOps culture?
Backend
Ops
Frontend
Separate workstreams
Joined workstreams
DevOps teams need shared perspective.
Common tools
Site Reliability Engineering
"Fundamentally, it's what happens when you ask a software engineer to design an operations function."
Ben Treynor Sloss, founder of Google SRE
Common goals
SRE is a good start
SRE Phase I (Feb-May)
● Focus on SRE stack
○ Nagios, Graphite, Splunk, Atlassian
● Reliability metrics
○ Errors; response time
● Runbooks
● Blameless PIR everything
● Iterate
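The two reliability metrics named above (errors and response time) can be computed from plain request records. This is a minimal sketch under stated assumptions: the `Request` record, the choice to count HTTP 5xx responses as errors, and the nearest-rank percentile method are all illustrative and not taken from the slides, which do not describe how the metrics were calculated.

```python
import math
from dataclasses import dataclass
from typing import List


@dataclass
class Request:
    """Hypothetical request record; the field names are assumptions."""
    status: int          # HTTP status code
    duration_ms: float   # response time in milliseconds


def error_rate(requests: List[Request]) -> float:
    """Fraction of requests that returned a 5xx status (assumed error definition)."""
    if not requests:
        return 0.0
    errors = sum(1 for r in requests if r.status >= 500)
    return errors / len(requests)


def percentile(values: List[float], pct: float) -> float:
    """Nearest-rank percentile: the smallest value with at least pct% of data at or below it."""
    if not values:
        raise ValueError("percentile of an empty list is undefined")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


if __name__ == "__main__":
    sample = [
        Request(200, 100.0),
        Request(200, 120.0),
        Request(500, 900.0),
        Request(200, 80.0),
    ]
    print("error rate:", error_rate(sample))
    print("p95 ms:", percentile([r.duration_ms for r in sample], 95))
```

Tracking a high percentile rather than the mean keeps a few slow requests visible, which is one common reason to pair it with an error rate when tuning alert thresholds.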
Runbooks:
● System overview
● Escalation path
● Alert descriptions
● Common failure conditions
● Known recovery procedures
● Incident history
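The six runbook sections listed above map naturally onto a small record type, which makes it possible to check that no section was left empty. The sketch below is one possible shape, not the format the team used; the `Runbook` class, the `missing_sections` helper, and the `checkout` service name are hypothetical.

```python
from dataclasses import dataclass, field, fields
from typing import List


@dataclass
class Runbook:
    """One runbook per service; section names follow the slide, the layout is assumed."""
    service: str
    system_overview: str = ""
    escalation_path: List[str] = field(default_factory=list)
    alert_descriptions: List[str] = field(default_factory=list)
    common_failure_conditions: List[str] = field(default_factory=list)
    known_recovery_procedures: List[str] = field(default_factory=list)
    incident_history: List[str] = field(default_factory=list)


def missing_sections(runbook: Runbook) -> List[str]:
    """Return the names of sections that are still empty, in declaration order."""
    return [
        f.name
        for f in fields(runbook)
        if f.name != "service" and not getattr(runbook, f.name)
    ]


if __name__ == "__main__":
    # "checkout" is a made-up service name used only for illustration.
    draft = Runbook(
        service="checkout",
        system_overview="Handles cart and payment flow.",
        escalation_path=["primary on-call", "secondary on-call"],
    )
    print("still to write:", missing_sections(draft))
```

A check like `missing_sections` could run in a build job so that an incomplete runbook is flagged before a service ships, though whether to enforce that is a team decision.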
SRE Phase II (May -> Launch -> Forever)
● Build a production environment
● Pentesting and lockdown
● Tune reliability metrics
● Load tests
● Resilience tests
● Recovery tests
● Blameless PIR everything
● Runbooks
● Iterate
Quack like a duck
Space to try
DevOps teams need low-risk places to experiment and learn
Launch!
● RDS database failed in 1st hour -- zero impact
● 17 pushes in 3 days
● 50+ pushes in week 1
● 85% automated test coverage
● Meaningful bump in all business KPIs
60 hours of round-the-clock coverage.
Zero SevOne or SevTwo incidents
Before & After … in our DevOps
Before:
● Devs wrote code
● Ops wrote infrastructure
● Configuration automation (Puppet)
● Shared on-call (with escalation)
● Separate workstreams
15-month agilefall (waterscrum?) rewrite
After:
● Dev and Ops write code and infrastructure
● Immutable infrastructure
● Shared on-call (with less escalation)
● Joined workstreams
Before & After … in the app
Before:
● Java/Postgres monolith
● Sessions
● Poor caching model
● Tightly coupled FE/BE
● Limited scale
15-month agilefall (waterscrum?) rewrite
After:
● (micro)Services
● API calls
● Flexible caching (multi-tier)
● Decoupled logic
● Less limited scale!
Before & After … in our culture
Before:
● Weekly release ceremony
● High risk profile
● Full day regression
● Dev and Ops work streams
● Functional teams
15-month agilefall (waterscrum?) rewrite
After:
● CD
● Low risk
● Test automation
● Shared work, responsibility
● Objective based teams
DevOps needs
Shared culture
Shared tools
Shared goals
and
space to try
Thank you!
@matthewboeckman
dryas.io
