© 2013 – 2017 naked Agility Limited All Rights Reserved
A DevOps Story
Martin Hinshelwood | @MrHinsh
martin@nkdagility.com | http://nkdagility.com/blog
“Firms today experience a much higher velocity of business change. Market opportunities appear or dissolve in months or weeks instead of years.”
Diego Lo Giudice and Dave West, Forrester, Transforming Application Delivery, February 2011
This is the story of:
Insanity is doing the same thing over and over again
expecting a different result.
-Albert Einstein
Developer Division
Before 1ES and Azure DevOps
Faster Value Delivery
Increase flow of
Shorten cycle times
Reduce re-work costs
Typical day at Microsoft
Data: Internal Microsoft engineering system activity, August 2018
12.4k pull requests per day
67k Git commits per day
78,000 deployments per day
146k builds per day
500m test executions per day
500k work items updated per day
5m work items viewed per day
Azure DevOps Services is the toolchain of choice for Microsoft engineering with over 90,000 internal users
https://aka.ms/DevOpsAtMicrosoft
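To put the daily figures in perspective, they can be converted to average per-minute rates. This is a minimal sketch that assumes activity is spread evenly over 24 hours, which the slide does not claim; the dictionary and function names are illustrative only.

```python
# Daily activity figures as quoted on the slide (August 2018).
DAILY_ACTIVITY = {
    "pull requests": 12_400,
    "git commits": 67_000,
    "deployments": 78_000,
    "builds": 146_000,
    "test executions": 500_000_000,
}

MINUTES_PER_DAY = 24 * 60


def per_minute(daily_count: int) -> float:
    """Average events per minute, assuming an even spread over the day."""
    return daily_count / MINUTES_PER_DAY


if __name__ == "__main__":
    for name, count in DAILY_ACTIVITY.items():
        print(f"{name}: about {per_minute(count):,.1f} per minute")
```

Under that even-spread assumption, 78,000 deployments a day works out to roughly 54 deployments every minute.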
Schedule
Code Test & Stabilize Code Test & Stabilize
Beta RTM
Feedback
Planning
Customer feedback – we should
change the way a feature works. We
didn’t get it quite right…
… but we’re booked solid already.
S1 S2 S3 S4 S5 Stabilization S6
Story: Sprint 1-5
A
B
Now
2 years
3 weeks
Deliver more value to customers
Faster responses to customers and market changes
Improved engineering satisfaction
2x productivity increase
Features Delivered per Year
https://www.visualstudio.com/en-us/articles/news/features-timeline
2012: 22
2013: 58
2014: 65
2015: 111
2016: 262
2017: 249
Organization
Roles
Teams
Cadence
Taxonomy
Plan
Practices
Guiding Principles
Alignment
Autonomy
“Let’s try to give our teams three things….
Autonomy, Mastery, Purpose”
Alignment
Every team and business
tracks scenarios and
features consistently.
Autonomy
Every team chooses how to manage
stories and/or tasks
Taxonomy & Staying Aligned
Planning
Sprint: 3 weeks, Confident (95%)
Plan: 3 sprints, Thoughtful (90%)
Season: 6 months, Hopeful (80%)
Epic: 18 months, Aspirational (60%)
Teams are responsible for the detail
Leadership is responsible for the big picture
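The four planning horizons can be written down as a small data structure. This is a sketch only: the class name, field names and helper are hypothetical, and reading each percentage as the confidence attached to that horizon is an interpretation of the slide.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Horizon:
    name: str          # label used on the slide
    length: str        # duration as stated on the slide
    attitude: str      # how firmly the plan is held
    confidence: float  # assumed meaning of the percentage on the slide


# Values copied from the slide, shortest horizon first.
HORIZONS: List[Horizon] = [
    Horizon("Sprint", "3 weeks", "Confident", 0.95),
    Horizon("Plan", "3 sprints", "Thoughtful", 0.90),
    Horizon("Season", "6 months", "Hopeful", 0.80),
    Horizon("Epic", "18 months", "Aspirational", 0.60),
]


def horizons_at_least(confidence: float) -> List[Horizon]:
    """Return the horizons whose stated confidence meets the threshold."""
    return [h for h in HORIZONS if h.confidence >= confidence]


if __name__ == "__main__":
    for h in horizons_at_least(0.9):
        print(f"{h.name}: {h.length}, {h.attitude} ({h.confidence:.0%})")
```

The point the structure makes visible is that confidence falls as the horizon lengthens, so only the sprint and the three-sprint plan are treated as near-commitments.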
Scenarios
Features
Stories
Tasks
Aligned Autonomy
Alignment
The big picture in light of our
business goals
Autonomy
The detail about what we’ll deliver
to achieve our business goals
[Figure: timeline of Sprints 68, 69 and 70, each spanning Week 1 to Week 3]
Sprint Planning Done!
What we accomplished
The sprint plan
Sprint Mails
Value delivered
during the sprint
Video demonstrating
the value
What the team is
planning to accomplish
in the next sprint
It’s not 2 years, but…
• Updates were large
• Months apart
• Lots of problems!
Timeline, 4/1/2010 to 4/23/2012:
5/3/2010: TFS 2010 RTM
4/23/2011: Service Deployment
8/5/2011: Service Update
9/26/2011: //BUILD 2011
12/7/2011: Service Update
1/30/2012: Service Update
2/20/2012: Service Update
3/12/2012: Service Update
4/2/2012: Service Update
Organization Chart… before
Program Management Development Testing
Operations
Organization Chart
Program Management Engineering
Operations
Engineering
Program Management is responsible for:
WHAT we’re building, and
WHY we’re building it
Engineering is responsible for
HOW we’re building it, and that
we’re building it with QUALITY
Teams
Program Management Engineering
[Figure: timeline of Sprints 68, 69 and 70, each spanning Week 1 to Week 3]
Deployment
Sprint Planning Done!
If it’s bad, YOU wake up
But we have many teams
Everyone creates a branch…
Week 1 Week 2 Week 3
Writes a lot of code…
It needs to come together…
Merge Debt
Organizations which design systems... are
constrained to produce designs which are
copies of the communication structures of
these organizations…
Typical Server Based Branching Structure
Organizations tend to produce
branching structures that copy the
organization chart.
Maintaining enterprise rigor
Branching
https://guides.github.com/introduction/flow/
Branching
Internal Open Source
Starts from a position of trust
Share everything
Encourage contributions
Quality - Before
Code Test & Stabilize Code Test & Stabilize
Beta RTM
Planning
Code
Complete
Quality - After
There’s no place
like production!
Customer Intelligence, Business Intelligence, Operational Intelligence
Gather everything
Dashboard DevOps Debug Experiments
TAKE
AWAY
• Microsoft has changed the way that they run their
business to support faster delivery
• Create the right balance of autonomy and
alignment
• Focus on Engineering Excellence with continuous
delivery
• Gather as much telemetry as you can to make
better decisions
Summary
Martin Hinshelwood
martin@nkdagility.com
• Hear about their journey: http://aka.ms/engineeringstories
• Learn how they deploy VSTS: https://blogs.msdn.microsoft.com/devops/2017/04/25/how-we-use-rm-part-1/
• Follow their ongoing journey and VSTS updates: https://blogs.msdn.microsoft.com/bharry/
• Use VSTS: http://visualstudio.com/team-services
Connect With Martin Hinshelwood:
+52 1 998 894 1898
@MrHinsh
martin@nkdagility.com
https://nkdAgility.com/blog
Starting with what is most important/most pain, go from
there
Designing metrics is as hard as designing features
Baking it into the review culture – from top to bottom –
cadence is the heartbeat – spurs activity
Health Dashboards
Getting the availability model right
Experience: Coverage too narrow as service footprint grows
Experience: Loses sensitivity as command volumes grow
Experience: Emphasizes individual customer impact
[Chart: Sept 25th 2013 LSI, 9/25/13 2:24 PM to 10:48 PM. Series: FailedExecutionCount, SlowExecutionCount, Start, End, Availability (ID4 - Activity Only), Availability (Current). Availability axis runs from 0.8 to 1; count axis runs from -200 to 1600.]
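The "loses sensitivity as command volumes grow" experience can be illustrated with a short sketch. It assumes availability is the share of commands that were neither failed nor slow, and compares one aggregate number with a per-customer view. The record shape, function names and formula are assumptions for illustration, not the model VSTS actually uses.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

# Each record is (customer_id, succeeded); "succeeded" means the command
# was neither failed nor slow. This record shape is an assumption.
Command = Tuple[str, bool]


def aggregate_availability(commands: Iterable[Command]) -> float:
    """Share of all commands that succeeded, across every customer."""
    commands = list(commands)
    if not commands:
        return 1.0
    good = sum(1 for _, ok in commands if ok)
    return good / len(commands)


def per_customer_availability(commands: Iterable[Command]) -> Dict[str, float]:
    """Share of succeeded commands, computed separately per customer."""
    totals: Dict[str, int] = defaultdict(int)
    good: Dict[str, int] = defaultdict(int)
    for customer, ok in commands:
        totals[customer] += 1
        if ok:
            good[customer] += 1
    return {c: good[c] / totals[c] for c in totals}


if __name__ == "__main__":
    # One small customer is fully down; one large customer is healthy.
    sample = [("small", False)] * 100 + [("large", True)] * 9_900
    print(aggregate_availability(sample))
    print(per_customer_availability(sample))
```

In the sample, the aggregate figure still looks healthy while one customer has no working commands at all, which is the kind of blind spot a model centred on individual customer impact is meant to expose.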
Alerting is key to fast detection
Every alert must be actionable and represent a
real issue with the system.
Alerts should create a sense of urgency – false
alerts dilute that
Redundant alerts for the same issue
Needed to set right thresholds and tune often
Stateless alerts contributed to further noise
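The points about redundant and stateless alerts can be sketched as a small stateful tracker that notifies only when a signal changes state, instead of on every failing check. The class and method names are hypothetical; this is one possible shape, not the alerting system the slide describes.

```python
from typing import Optional, Set


class AlertTracker:
    """Remembers which alert keys are open so repeats stay silent."""

    def __init__(self) -> None:
        self._open: Set[str] = set()

    def observe(self, key: str, healthy: bool) -> Optional[str]:
        """Record one health check for `key`.

        Returns "raise" on the first unhealthy check, "resolve" on the
        first healthy check after an alert, and None when nothing changed.
        """
        if not healthy and key not in self._open:
            self._open.add(key)
            return "raise"
        if healthy and key in self._open:
            self._open.discard(key)
            return "resolve"
        return None


if __name__ == "__main__":
    tracker = AlertTracker()
    checks = [False, False, False, True, True]
    for healthy in checks:
        print(tracker.observe("memory-pressure", healthy))
```

Five checks produce two notifications here (one raise, one resolve), where a stateless rule would have paged three times for the same issue.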
Health model in action
• 3 errors for memory and
performance
• All 3 related to same code
defect
• APM component mapped to feature team
• Auto-dialer engaged Global DRI
Reduced alert noise from ~928
alerts per week to ~22 and
reduced DRI escalations by
~56%
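As a quick check on the scale of that change, the drop in weekly alerts can be expressed as a percentage. The helper name is illustrative, and the inputs are the approximate figures quoted on the slide.

```python
def percent_reduction(before: float, after: float) -> float:
    """Percentage drop from `before` to `after`."""
    if before <= 0:
        raise ValueError("before must be positive")
    return (before - after) / before * 100


if __name__ == "__main__":
    # Approximate weekly alert counts quoted on the slide.
    print(f"{percent_reduction(928, 22):.1f}% fewer alerts per week")
```

With the slide's approximate figures, that is a reduction of a little under 98% in weekly alert volume.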
VSTS Scorecard
Service Availability & Health Metrics (all charts marked DRAFT)
[Charts. Panels: Time to Detect, Time to Mitigate, Error By Source, Incidents by Severity, User Impact Minutes During Incidents [TFS Only]. Axes: % of incidents, incident count, user minutes. Callouts 1 to 4 on the charts are explained in the notes that follow.]
1. TFS Availability is on an improving trend. No Sev0/Sev1 LSIs for July.
2. App Insights switched from synthetic availability to real-user experience in Ibiza portal. A high
volume of SEV-2 LSIs (72) contributed to customer impact in addition to intermittent UX errors.
(UX fixes applied on 8/11 improve availability.)
3. App Insights was impacted by 3 long running LSIs related to ES maintenance, Ibiza updates and an
Azure Storage outage.
4. TFS Service attainment (SLO) improved significantly MoM with focus on minimizing failed/slow
commands and reviewing in weekly LiveSite reviews
Service status
RCA (Root Cause Analysis) transparency
Changing the test portfolio balance
Tests should be written at the lowest level
possible
Write once, run anywhere including production
system
Product is designed for testability
Test code is product code, only reliable tests
survive
Testing infrastructure is a shared Service
Agenda
Feedback - Before
Code Test & Stabilize Code Test & Stabilize
Beta RTM
Planning
??
Feedback - After
? ? ? ? ? ? ? ? ? ? ? ? ? ? ?
Agenda
Staying connected
Chat Chat Chat Chat Chat Chat
Every 3 sprints we sit down with
the team for a “chat”
• What’s next on your backlog?
• How are you doing with
regards to debt?
• Any issues?
Team “Chats”
Version Control
Team “Chats”
Team “Chats”
Sprint mails
Plan Accomplished

ScotSoft 2018 - A DevOps Story: 70k deployments a day

  • 1. © 2013 – 2017 naked Agility Limited All Rights Reserved A DevOps Story @MrHinsh 1 Martin Hinshelwood | @MrHinsh martin@nkdagility.com | http://nkdagility.com/blog
  • 2. © 2013 – 2017 naked Agility Limited All Rights Reserved 2
  • 3. © 2013 – 2017 naked Agility Limited All Rights Reserved 3
  • 4. © 2013 – 2017 naked Agility Limited All Rights Reserved
  • 5. © 2013 – 2017 naked Agility Limited All Rights Reserved Diego Lo Giudice and Dave West, Forrester February 2011 Transforming Application Delivery Firms today experience a much higher velocity of business change. Market opportunities appear or dissolve in months or weeks instead of years. “ ”
  • 6. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh This is the story of:
  • 7. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Insanity is doing the same thing over and over again expecting a different result. -Albert Einstein
  • 8. © 2013 – 2017 naked Agility Limited All Rights Reserved Developer Division
  • 9. © 2013 – 2017 naked Agility Limited All Rights Reserved
  • 10. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Before 1ES and Azure DevOps
  • 11. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Faster Value Delivery Increase flow of Shorten cycle times Reduce re-work costs
  • 12. Typical day at Microsoft Data: Internal Microsoft engineering system activity, August 2018 12.4k Pull Requests per day 67k Git commits per day 78,000Deployments per day 146k Builds per day 500m Test executions per day 500k Work items updated per day 5m Work items viewed per day Azure DevOps Services is the toolchain of choice for Microsoft engineering with over 90,000 internal users https://aka.ms/DevOpsAtMicrosoft
  • 14. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Schedule Code Test & Stabilize Code Test & Stabilize Beta RTM
  • 15. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Feedback Planning Customer feedback – we should change the way a feature works. We didn’t get it quite right… … but we’re booked solid already.
  • 16. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh S1 S2 S3 S4 S5 Stabilization S6 Story: Sprint 1-5 A B
  • 17. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Now 2 years 3 weeks
  • 18. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Deliver more value to customers Faster responses to customers and market changes Improved engineering satisfaction 2x productivity increase Features Delivered per Year https://www.visualstudio.com/en-us/articles/news/features-timeline 22 58 65 111 262 249 2012 2013 2014 2015 2016 2017
  • 19. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh
  • 20. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Organization Roles Teams Cadence Taxonomy Plan Practices Guiding Principles Alignment Autonomy “Let’s try to give our teams three things…. Autonomy, Mastery, Purpose”
  • 21. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Alignment Every team and business tracks scenarios and features consistently. Autonomy Every team chooses how to manage stories and/or tasks Taxonomy & Staying Aligned
  • 22. © 2013 – 2017 naked Agility Limited All Rights Reserved Planning Epic 18 months Aspirational (60%) Plan 3 sprints Thoughtful (90%) 3 Sprint 3 weeks Confident (95%) 1 Season 6 months Hopeful (80%) 6 Teams are responsible for the detail Leadership is responsible for the big picture
  • 23. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Scenarios Features Stories Tasks Aligned Autonomy Alignment The big picture in light of our business goals Autonomy The detail about what we’ll deliver to achieve our business goals
  • 24. Week 1 Week 2 Week 3 Week 1 Week 2 Week 3Week 2 Week 3 Sprint 69Sprint 68 Sprint 70 Sprint Planning Done!
  • 25. What we accomplished Week 1 Week 2 Week 3 Week 1 Week 2 Week 3Week 2 Week 3 Sprint 69Sprint 68 Sprint 70 The sprint plan
  • 26. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Sprint Mails Value delivered during the sprint Video demonstrating the value What the team is planning to accomplish in the next sprint
  • 27. © 2013 – 2017 naked Agility Limited All Rights Reserved It’s not 2 years, but… • Updates were large • Months apart • Lots of problems! 4/1/2010 4/23/2012 5/3/2010 TFS 2010 RTM 4/23/2011 ServiceDeployment 8/5/2011 ServiceUpdate 9/26/2011 //BUILD2011 12/7/2011 ServiceUpdate 1/30/2012 ServiceUpdate 2/20/2012 ServiceUpdate 3/12/2012 ServiceUpdate 4/2/2012 ServiceUpdate
  • 28. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Organization Chart… before Program Management Development Testing Operations
  • 29. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Organization Chart Program Management Engineering Operations Engineering Program Management is responsible for: WHAT we’re building, and WHY we’re building it Engineering is responsible for HOW we’re building it, and that we’re building it with QUALITY
  • 30. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Teams Program Management Engineering
  • 31. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh
  • 32. Week 1 Week 2 Week 3 Week 1 Week 2 Week 3Week 2 Week 3 Sprint 69Sprint 68 Sprint 70 Deployment Sprint Planning Done! If it’s bad, YOU wake up
  • 33. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh
  • 34. © 2013 – 2017 naked Agility Limited All Rights Reserved But we have many teams
  • 35. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Everyone creates a branch… Week 1 Week 2 Week 3
  • 36. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Writes a lot of code… Week 1 Week 2 Week 3
  • 37. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh It needs to come together… Week 1 Week 2 Week 3
  • 38. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Merge Debt Week 1 Week 2 Week 3
  • 39. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Organizations which design systems... are constrained to produce designs which are copies of the communication structures of these organizations…
  • 40. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Typical Server Based Branching Structure
  • 41. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Organizations tend to produce branching structures that copy the organization chart.
  • 42. © 2013 – 2017 naked Agility Limited All Rights Reserved Maintaining enterprise rigor
  • 43. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Branching https://guides.github.com/introduction/flow/
  • 44. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Branching
  • 45. © 2013 – 2017 naked Agility Limited All Rights Reserved Internal Open Source Starts from a position ofTrust Share everything Encourage contributions
  • 46. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Quality- Before Code Test & Stabilize Code Test & Stabilize Beta RTM Planning Code Complete
  • 47. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Quality- After
  • 48. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh There’s no place like production!
  • 49. © 2013 – 2017 naked Agility Limited All Rights Reserved Customer IntelligenceBusiness IntelligenceOperational Intelligence Gather everything Dashboard DevOps Debug Experiments
  • 50. © 2013 – 2017 naked Agility Limited All Rights Reserved TAKE AWAY • Microsoft has changed the way that they run their business to support faster delivery • Create the right balance of autonomy and alignment • Focus on Engineering Excellence with continuous delivery • Gather as much telemetry as you can to make better decisions Summary
  • 51. © 2013 – 2017 naked Agility Limited All Rights Reserved Martin Hinshelwood martin@nkdagility.com • Hear about their Journey journey: http://aka.ms/engineeringstories • Learn how they deployVSTS: https://blogs.msdn.microsoft.com/devops/2017/04/25/ how-we-use-rm-part-1/ • Follow their ongoing journey andVSTS updates: https://blogs.msdn.microsoft.com/bharry/ • UseVSTS • http://visualstudio.com/team-services
  • 52. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Connect With Martin Hinshelwood: 55 +52 1 998 894 1898@MrHinsh martin@nkdagility.com https://nkdAgility.com/blog
  • 53. 56
  • 54. Starting with what is most important/most pain, go from there Designing metrics is as hard as designing features Baking it into the review culture – from top to bottom – cadence is the heartbeat – spurs activity
  • 55. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Health Dashboards
  • 56. © 2013 – 2017 naked Agility Limited All Rights Reserved Getting the availability model right Experience: Coverage too narrow as service footprint grows Experience: Loses sensitivity as command volumes grow Experience: Empathizes individual customer impact 0.8 0.82 0.84 0.86 0.88 0.9 0.92 0.94 0.96 0.98 1 -200 0 200 400 600 800 1000 1200 1400 1600 9/25/13 2:24 PM 9/25/13 3:36 PM 9/25/13 4:48 PM 9/25/13 6:00 PM 9/25/13 7:12 PM 9/25/13 8:24 PM 9/25/13 9:36 PM 9/25/13 10:48 PM Sept 25th 2013 LSI FailedExecutionCount SlowExecutionCount Start End Availability (ID4 - Activity Only) Availability (Current)
  • 57. © 2013 – 2017 naked Agility Limited All Rights Reserved Alerting is key to fast detection Every alert must be actionable and represent a real issue with the system. Alerts should create a sense of urgency – false alerts dilutes that Redundant alerts for same the issue Needed to set right thresholds and tune often Stateless alerts contributed to further noise
  • 58. © 2013 – 2017 naked Agility Limited All Rights Reserved @MrHinsh Health model in action • 3 errors for memory and performance • All 3 related to same code defect • APM component mapped to feature team • Auto-dialer engaged Global DRI Eliminated alert noise ~928 alerts per week to ~22 and reduced DRI escalations by ~56%
  • 59. © 2013 – 2017 naked Agility Limited All Rights Reserved
  • 60. © 2013 – 2017 naked Agility Limited All Rights Reserved VSTS Scorecard
  • 61. © 2013 – 2017 naked Agility Limited All Rights Reserved Time to MitigateTime to Detect %ofIncidents DRAFT DRAFT Microsoft Confidential 64 Service Availability & Health Metrics DRAFT DRAFT DRAFT IncidentCount IncidentCount DRAFT DRAFT DRAFT %ofIncidents UserMinutes DRAFT DRAFTDRAFT Error By SourceIncidents by Severity User Impact Minutes During Incidents [TFS Only] 3 2 1 4 1. TFS Availability is on an improving trend. No Sev0/Sev1 LSIs for July. 2. App Insights switched from synthetic availability to real-user experience in Ibiza portal. A high volume of SEV-2 LSIs (72) contributed to customer impact in addition to intermittent UX errors. (UX fixes applied on 8/11 that improves availability) 3. App Insights was impacted by 3 long running LSIs related to ES maintenance, Ibiza updates and an Azure Storage outage. 4. TFS Service attainment (SLO) improved significantly MoM with focus on minimizing failed/slow commands and reviewing in weekly LiveSite reviews
  • 62. Service status
  • 63. RCA (Root Cause Analysis) transparency
  • 64. Changing the test portfolio balance • Tests should be written at the lowest level possible • Write once, run anywhere, including the production system • Product is designed for testability • Test code is product code, only reliable tests survive • Testing infrastructure is a shared service
  • 65. Agenda
  • 66. Feedback - Before [Diagram: Planning, Code, Test & Stabilize, Code, Test & Stabilize, Beta, RTM; feedback points marked “?”]
  • 67. Feedback - After [Diagram: many feedback points, each marked “?”]
  • 68. Agenda
  • 69. Staying connected • Every 3 sprints we sit down with the team for a “chat”
  • 70. Team “Chats” (Version Control) • What’s next on your backlog? • How are you doing with regard to debt? • Any issues?
  • 71. Team “Chats”
  • 72. Team “Chats”
  • 73. Sprint mails • Plan • Accomplished

Editor's Notes

  • #5: In pursuit of that goal, in the last year alone, I have visited 46 customers in 38 cities and 14 different countries. And sometimes I still get surprised by some of the strange and wonderful things that they do… it’s happening less and less… Everyone has the same problems, faulty understandings, and dysfunctional behaviours… some are just better at working around them than others…
  • #11: Visual Studio depends on Windows. Windows depends on Visual Studio for compilers. What happens when there’s a compiler bug? Windows vendors the entire toolchain. NOTE TO SILVERCLOUD: this image is licensed stock photography from shutterstock
  • #12: Plan, learn, react to feedback Quality Chaos
  • #13: Enterprise scale
  • #15: A product cycle looked something like this. It worked given the environment… but the environment has changed. We needed something different.
  • #18: Today we look more like this. That is to say that our teams work in 3 week sprints… and we plan continuously. This is how we run the business. Just move on to the next sprint, it’s right there.
  • #21: I can’t tell you there was a day we made a decision to be Agile… instead, a group said “Hey, this agile thing sounds interesting… we want to try that”. The decision I made was not stopping them from trying Agile.
  • #23: This approach aligns with what I was describing earlier… ALIGNED AUTONOMY. We see alignment through the SCENARIOS and SEASONS… and we give teams autonomy through their use of STORIES and TASKS.
  • #24: In fact, if you looked at my backlogs, and the backlogs of my teams… you’d see these exact terms.
  • #27: Need more room for the email so people can see it
  • #28: When we started planning the service, our initial thinking was more like a box product. Started by looking at major/minor updates. All updates were major! Story: the December 2011 update went very badly and took a week to complete. Larger sets of changes are harder to test and diagnose. Risk is proportional to the ship cycle. Ship frequently and stay near ship quality.
  • #29: Our organizational chart is by discipline. PMs report to PMs. Devs report to Devs. Testers report to Testers.
  • #30: Our organizational chart is by discipline. PMs report to PMs. Devs report to Devs. Testers report to Testers.
  • #31: However, our business is managed through cross-discipline teams.
  • #42: Fred Brooks coined this term…
  • #43: Conway’s Law: “organizations which design systems ... are constrained to produce designs which are copies of the communication structures of these organizations”. Aka shipping the org chart.
  • #45: Hard requirements around release branches, while devs want freedom in topic branches. Starts with getting the branch workflows right in a single repo. Adding support for robust cross-repo sharing across release trains.
  • #50: Tell stories here. Code complete… we celebrated this achievement. But what did we have? A lot of code… with a lot of bugs. No way to deliver it to customers. We’re in Test & Stabilization… how do you think the team felt? Morale? Bad. We’re now just climbing a mountain.
  • #51: No, we pay our debt as we go. Our new bug curve looks more like this. We don’t let it ever grow out of control. This enables us to ship features when they’re ready… instead of only at these “big events”.
  • #52: Availability and usage are hard to troubleshoot with all of this going on
  • #53: https://vsowiki.com/index.php?title=Customer_Intelligence
  • #54: and reinforce what this transformation made possible for our engineers and the opportunity for it to do the same in their orgs and the way they work with us as synergistic technologies Find ways your methodologies/solutions could plug in
  • #60: Synthetic tests – “test in production”, used in the earliest days of the service. We ran broad functional tests against one test account and left this behind pretty quickly. Command health – an aggregate availability number based on command pass/fail. This is a pretty standard model. It worked well for us when command volumes were relatively low, but it lost sensitivity as command volumes grew. Customer impact – the main message here is that we chose to create buckets to deal with the scale problem. We measure failed user minutes: pass/fail and performance are grouped in time and aggregated to the account, then to the service. The dashed black line at the top is the Command Health (old) model, which is not sensitive. The solid black line shows Customer Impact. We now clearly see that there is a customer impact during this event even though failed/slow command numbers are small. At a very high level, we’ve normalized so that every active account contributes an equal amount to the availability measure. This gives small accounts a voice.
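The difference between the Command Health and Customer Impact models in note #60 can be illustrated with a small sketch. This is not the actual VSTS calculation: the `Command` record, the one-minute bucket, and the simple per-account average are all assumptions made for illustration. It only shows why an aggregate pass/fail ratio hides a small account's outage while a per-account measure of failed user minutes does not.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Command:
    """Hypothetical telemetry record; field names are illustrative."""
    account: str   # account that issued the command
    user: str      # user within that account
    minute: int    # minute bucket in which the command ran
    ok: bool       # False if the command failed or was too slow


def command_health(commands):
    """Old model: share of all commands that succeeded."""
    if not commands:
        return 1.0
    return sum(c.ok for c in commands) / len(commands)


def customer_impact(commands):
    """Sketch of the newer model: a user-minute is 'failed' if any command
    in it failed; user-minutes are averaged per account, and every active
    account then contributes equally to the service-level number."""
    bucket_failed = defaultdict(bool)
    for c in commands:
        key = (c.account, c.user, c.minute)
        bucket_failed[key] = bucket_failed[key] or not c.ok

    per_account = defaultdict(list)
    for (account, _user, _minute), failed in bucket_failed.items():
        per_account[account].append(0.0 if failed else 1.0)

    scores = [sum(v) / len(v) for v in per_account.values()]
    return sum(scores) / len(scores) if scores else 1.0


# A large healthy account and a small account that is completely down.
big = [Command("big", "alice", m, True) for m in range(10) for _ in range(100)]
small = [Command("small", "bob", m, False) for m in range(2)]
commands = big + small

old_model = command_health(commands)   # 1000 good commands out of 1002
new_model = customer_impact(commands)  # (1.0 + 0.0) / 2 accounts
```

In this toy data the aggregate ratio stays above 99% because the large account dominates the command volume, while the per-account measure drops to 0.5 because one of the two active accounts is fully impacted.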
  • #61: Classified alerts and reduced noise: repeat, non actionable, and
  • #62: One of our biggest achievements was tuning our alerts accurately enough to autodial the DRI without the need for human escalation. We achieve this by having a health model to eliminate noisy, redundant alerts, and smart boundaries to indicate when action is actually needed, as shown in Figure 15. It has given us a 40x improvement in alert precision, to the extent that by February 2015, all P0 and P1 alerts were routed correctly by the autodialer.
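One way to picture how keeping state removes redundant alerts, as described in note #62, is the minimal sketch below. It is an assumption-laden illustration and not the actual health model: the `AlertState` class, its method names, and the `(component, issue)` key are hypothetical. The idea is only that an alert pages the DRI on the transition from healthy to unhealthy, and repeats for the same open issue are suppressed until it is resolved.

```python
class AlertState:
    """Hypothetical stateful alert filter keyed by (component, issue)."""

    def __init__(self):
        self._open = set()  # issues that have already paged and are unresolved

    def raise_alert(self, component, issue):
        """Return True if this alert should page the DRI, False if it is a
        repeat of an issue that is already open."""
        key = (component, issue)
        if key in self._open:
            return False
        self._open.add(key)
        return True

    def resolve(self, component, issue):
        """Mark the issue as resolved so a later recurrence pages again."""
        self._open.discard((component, issue))


# Three alerts that all trace back to the same defect page only once.
state = AlertState()
pages = [state.raise_alert("APM", "defect-123") for _ in range(3)]

# After resolution, a recurrence is treated as a new incident.
state.resolve("APM", "defect-123")
paged_again = state.raise_alert("APM", "defect-123")
```

A stateless alerter would have paged three times in this example; mapping alerts to a shared key is what collapses them into one.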
  • #65: Deployments don’t take much. Can redeploy, not roll back. Optimize for time to detect and time to mitigate, not mean time to failure: redundancy in the system. Root cause analysis, all of it. Revisiting telemetry and alerting to find earlier signs. By being built on Azure we get a lot of the failure/redundancy for free. Jeff’s whitepaper aka.ms.vsosecurity
  • #70: We’d ask for feedback after each milestone – planning, Beta, etc. The problem was, there was never time to react to any of it. For the most part, we’d tell people “sorry”… and push those things off to the next release. We’ll get to that in 2.5 years when the next release comes out. We’d find bugs with the process… and fix them. No problems there. But we couldn’t react to anything our customers using the product were telling us… or very, very little.
  • #71: We’ve now got channels for feedback continually. We still have the “big event” feedback at preview, etc. But we’ve got a channel to talk to customers… constantly. In fact, to make that a bit more real… here are examples from our release notes that we write every 3 weeks with updates to our service. At each of these intervals we’ve got a chance to listen to customers and react.
  • #73: We’ve included in our process some lightweight methods to stay connected. Every 3 sprints we sit down with teams for a “chat”. This is a direct conversation with leadership talking about 3 things (next slide).
  • #74: What’s next on your backlog? Is debt under control? Any issues?
  • #75: Team chats are direct conversations with the leadership team. Every org has a layer in the “middle”… we’ve got that too although we’ve done a lot to flatten our orgs in recent years. Instead of this… we do this (next slide).
  • #76: That’s not to say that the folks in the middle aren’t involved – they are. But we talk directly with the team.