OBSERVABILITY DRIVEN
DEVELOPMENT
GEERT VAN DER CRUIJSEN
@GEERTVDC
CLOUD NATIVE ARCHITECT, FULL CYCLE DEVELOPER, DEVOPS COACH
CBO (CHIEF BEER OFFICER) AT XPIRIT NETHERLANDS
I LOVE BELGIUM
SINCE WE ALL LOVE BEER
I BROUGHT SOME DUTCH BEERS!!
FIND THIS LOGO DURING MY
PRESENTATION, TAKE A PICTURE,
TWEET IT
MENTION/FOLLOW @GEERTVDC
AND WIN BEER!
I HAVE TO MAKE
A CONFESSION
I TEST IN
PRODUCTION
I’M NOT LIKE
THIS GUY THOUGH
TODAY’S PREACH
YOU SHOULD TEST IN
PRODUCTION TOO
STOP BEING AFRAID
OF PRODUCTION!
WHO’S DOING
AGILE OR DEVOPS?
COMMON
AGILE / DEVOPS
MISTAKES
FOCUS ON SPEED?
DO YOU WANT FAST WHEN YOU’RE
NOT GOING IN THE RIGHT DIRECTION?
TEST IN PRODUCTION
USER BEHAVIOR
A/B TESTING
EXPERIMENTS
BEING ABLE TO BRAKE AND STEER
THAT IS WHAT MAKES YOU GO FASTER!
DEVOPS IS THE UNION OF PEOPLE, PROCESS,
AND PRODUCTS TO ENABLE CONTINUOUS
DELIVERY OF VALUE TO OUR END USERS.
DONOVAN BROWN
VALUE IS ONLY VALUE WHEN
IT’S RUNNING IN PRODUCTION
TEST IN PRODUCTION
CANARY RELEASING
RING BASED DEPLOYMENTS
MULTI REGION
CHAOS ENGINEERING
SHADOW TESTING
BUT I USE STAGING?
DOES STAGING HAVE REAL DATA?
DOES STAGING HAVE REAL USERS?
DOES STAGING REPRESENT PRODUCTION ENOUGH?
HOW MUCH TIME DO YOU SPEND ON STAGING?
WHAT IS KEY TO TESTING ON PROD?
OBSERVABILITY
OBSERVABILITY
“OBSERVABILITY IS A MEASURE OF HOW
WELL INTERNAL STATES OF A SYSTEM CAN
BE INFERRED FROM KNOWLEDGE OF ITS
EXTERNAL OUTPUTS”
CONTROL THEORY
WHAT IS THE DIFFERENCE
WITH MONITORING?
MONITORING
KNOWN UNKNOWNS
OBSERVABILITY
UNKNOWN UNKNOWNS
COMPLEX
APPLICATION LANDSCAPES
DISTRIBUTED SYSTEMS – MICROSERVICES – CLOUD
“IN A COMPLEX LANDSCAPE YOUR
APPLICATION IS NEVER FULLY UP”
MICROSERVICES
TRADITIONAL MONITORING
TOOLS ARE DEAD
MEASURE
USER IMPACT
https://medium.com/netflix-techblog/sps-the-pulse-of-netflix-streaming-ae4db0e05f8a
RELIABILITY
AVAILABILITY
LATENCY
THROUGHPUT
CORRECTNESS
FRESHNESS
COVERAGE
QUALITY
DURABILITY
FAIL OPEN
PARTIAL FAILURE MODE
OBSERVABILITY IS THE
KEY TO SOFTWARE
OWNERSHIP
WE’VE TAUGHT OPS TO DEV
SOURCE CONTROL
INFRASTRUCTURE AS CODE
AUTOMATION
SCRIPTING
TIME HAS COME
DEVS GET PROD ACCESS
DEVS TAKE OWNERSHIP
DEVS TAKE ON CALL!
DEVOPS CYCLE
BUSINESS + DEV
IT OPERATIONS
IMPROVE THE COMPANY
OBSERVABILITY
CONNECT DEV TO BUSINESS
OBSERVABILITY
CONNECT DEV TO OPERATIONS
3 PILLARS OF
OBSERVABILITY
LOGS METRICS TRACES
LOGGING
EXAMPLE: REQUEST DURATION
SERVICE REQUEST X FOR USER Y
TOOK 50 MILLISECONDS
LOGGING
EASY TO GENERATE, HARD TO QUERY?
STRUCTURED LOGGING
FROM
Log.Information("Request by userA took 35ms");
TO
Log.Information(
    "Request by {User} took {Duration}",
    user,
    duration);
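Why the structured form is easier to query: the message template and the named values travel separately, so a log store can filter on Duration instead of parsing text. Below is a minimal sketch of that idea using only the .NET base class library. `StructuredLog.ToJson` is a hypothetical helper made up for illustration; it is not Serilog's API or implementation.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;
using System.Text.Json;

// Hypothetical helper (an assumption, not part of Serilog): keeps the message
// template and the named values separate so the values stay queryable.
public static class StructuredLog
{
    public static string ToJson(string template, IDictionary<string, object> properties)
    {
        // Render the human-readable message by filling in the {Name} holes.
        string message = properties.Aggregate(
            template,
            (text, p) => text.Replace(
                "{" + p.Key + "}",
                Convert.ToString(p.Value, CultureInfo.InvariantCulture)));

        // Emit every named value as its own JSON field next to the message.
        var logEvent = new Dictionary<string, object>(properties)
        {
            ["Template"] = template,
            ["Message"] = message,
        };
        return JsonSerializer.Serialize(logEvent);
    }
}
```

Calling `StructuredLog.ToJson("Request by {User} took {Duration}", ...)` with `User` and `Duration` values yields one JSON object in which `Duration` is a number field, which is the kind of shape a log store can index and filter on.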
STRUCTURED LOGGING
GENERATE LOGS
SERILOG
APPLICATION INSIGHTS
NLOG
STORE & QUERY LOGS
AZURE LOG ANALYTICS
LOGGING
SHOULD YOU SAMPLE?
STORAGE == MONEY
AUDIT LOGS: DO NOT SAMPLE
OPERATIONAL LOGS: DO SAMPLE
DYNAMIC SAMPLING
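The sampling rules above could be sketched as follows. This is an illustration under assumptions: the `LogSampler` type, the `LogKind` enum, and the rule "always keep errors, sample the rest" as one interpretation of dynamic sampling are all hypothetical and not taken from the talk or from any logging library.

```csharp
using System;

public enum LogKind { Audit, Operational }

// Hypothetical sampler: the names and the error rule are illustrative assumptions.
public sealed class LogSampler
{
    private readonly double _operationalRate;
    private readonly Random _random;

    // operationalRate is the fraction (0..1) of ordinary operational logs to keep.
    public LogSampler(double operationalRate, Random random)
    {
        if (operationalRate < 0 || operationalRate > 1)
            throw new ArgumentOutOfRangeException(nameof(operationalRate));
        _operationalRate = operationalRate;
        _random = random;
    }

    public bool ShouldKeep(LogKind kind, bool isError)
    {
        // Audit logs are never sampled away.
        if (kind == LogKind.Audit) return true;

        // Dynamic part (assumed rule): errors are rare and valuable, keep them all.
        if (isError) return true;

        // Everything else is kept with the configured probability.
        return _random.NextDouble() < _operationalRate;
    }
}
```

With a rate of 0.1, roughly one in ten successful operational log lines is stored, while every audit line and every error still reaches the log store.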
METRICS
AGGREGATE INFORMATION INTO TIME SERIES
CREATE REAL TIME GRAPHS OR HISTOGRAMS
CHEAPER TO STORE
METRICS
EXAMPLE: REQUEST DURATION
A 50 MILLISECOND REQUEST IS 15 MILLISECONDS
SLOWER THAN AVERAGE
IN MECHELEN
ON FRIDAYS
FOR PEOPLE WHO LIKE BEER
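A small sketch of the two ideas on the metrics slides: folding raw durations into a per-minute time series, and comparing one request against the average for a dimension such as the city. The `RequestSample` record and the `DurationMetrics` class are hypothetical names chosen for this example, and the in-memory LINQ approach is an assumption; a real metrics backend aggregates this for you.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical sample type: one measured request.
public record RequestSample(DateTime Timestamp, string City, double DurationMs);

public static class DurationMetrics
{
    // Average duration per one-minute bucket: one time-series point per minute.
    public static SortedDictionary<DateTime, double> AveragePerMinute(
        IEnumerable<RequestSample> samples)
    {
        var series = new SortedDictionary<DateTime, double>();
        var buckets = samples.GroupBy(s => new DateTime(
            s.Timestamp.Year, s.Timestamp.Month, s.Timestamp.Day,
            s.Timestamp.Hour, s.Timestamp.Minute, 0));
        foreach (var bucket in buckets)
            series[bucket.Key] = bucket.Average(s => s.DurationMs);
        return series;
    }

    // How far a request is above (+) or below (-) the average for its city.
    // Throws if the history holds no samples for that city.
    public static double DeviationFromCityAverage(
        RequestSample request, IEnumerable<RequestSample> history)
        => request.DurationMs
           - history.Where(s => s.City == request.City).Average(s => s.DurationMs);
}
```

With a Mechelen history of 30 ms and 40 ms, a 50 ms request comes out 15 ms above the city average, matching the example on the slide. Adding more dimensions (weekday, user segment) is a matter of extending the filter.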
DISTRIBUTED TRACING
EXAMPLE: REQUEST DURATION
WHY DID THIS REQUEST TAKE 50 MILLISECONDS?
DID IT CALL THE DB OR OTHER SERVICES?
DISTRIBUTED TRACING
APPLICATION FLOW FROM FRONT TO BACK
USER SESSION
TRANSACTION
AMOUNT OF DATA IN MICROSERVICE
LANDSCAPE?
WHAT TO MEASURE?
USE RED
FOCUS ON YOUR USERS
LOG ALL USER EVENTS
USE (RESOURCE SCOPE)
UTILIZATION
SATURATION
ERROR RATE
RED (REQUEST SCOPE)
RATE
ERRORS
DURATION
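The request-scope half (RED) can be computed from a window of observed requests. The sketch below is an assumption-laden illustration: `ObservedRequest`, `RedSnapshot` and `Red.Summarize` are hypothetical names, and reporting duration as a nearest-rank 95th percentile is one common choice rather than something the slides prescribe.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for one observed request and the resulting summary.
public record ObservedRequest(double DurationMs, bool Failed);
public record RedSnapshot(double RatePerSecond, double ErrorRatio, double P95DurationMs);

public static class Red
{
    public static RedSnapshot Summarize(IReadOnlyList<ObservedRequest> requests, TimeSpan window)
    {
        if (requests.Count == 0) return new RedSnapshot(0, 0, 0);

        double[] sorted = requests.Select(r => r.DurationMs).OrderBy(d => d).ToArray();

        // Nearest-rank percentile: ceil(0.95 * n), computed with integer arithmetic.
        int rank = (95 * sorted.Length + 99) / 100;

        return new RedSnapshot(
            RatePerSecond: requests.Count / window.TotalSeconds,                 // Rate
            ErrorRatio: requests.Count(r => r.Failed) / (double)requests.Count,  // Errors
            P95DurationMs: sorted[rank - 1]);                                    // Duration
    }
}
```

The resource-scope half (USE) usually comes from the platform's own counters, such as CPU, memory and queue depth, so it is not sketched here.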
FEATURE FLAGS
if (_featureFlag.IsEnabled("NewCheckoutFlow"))
{
    // Log which path ran so usage of the new feature is observable.
    log.Information("NewCheckoutFlow feature used");
    ExecuteNewCheckoutFlow();
}
else
{
    log.Information("LegacyCheckout feature used");
    ExecuteLegacyCheckoutFlow();
}
FEATURE FLAGS
INITIAL DEPLOYMENT
BUG FOUND
SOLVED THE BUG
ROLL OUT TO MORE USERS
REMOVE FEATURE FLAG
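The "roll out to more users" step is often implemented as a percentage rollout. The slides do not say how the flag is evaluated, so the sketch below is purely an assumption: `PercentageRollout` is a hypothetical class that hashes the flag name and user id into a stable bucket from 0 to 99, using an FNV-1a-style hash over the characters because `string.GetHashCode` is not stable across processes.

```csharp
using System;

// Hypothetical rollout helper; not the API of any feature flag product.
public static class PercentageRollout
{
    // Maps (flag, user) to a stable bucket in the range 0..99.
    public static int Bucket(string flagName, string userId)
    {
        unchecked
        {
            uint hash = 2166136261;            // FNV offset basis
            foreach (char c in flagName + ":" + userId)
            {
                hash ^= c;
                hash *= 16777619;              // FNV prime
            }
            return (int)(hash % 100);
        }
    }

    // A user is in the rollout when their bucket falls below the percentage.
    public static bool IsEnabled(string flagName, string userId, int percentage)
        => Bucket(flagName, userId) < percentage;
}
```

Because the bucket is stable, raising the percentage from 10 to 50 only adds users; nobody who already had the new checkout flow loses it. Setting the percentage to 0 acts as the brake when a bug is found.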
EXPERIMENT IN PRODUCTION
public bool CanAccess(IUser user)
{
return Scientist.Science<bool>("widget-permissions", experiment =>
{
experiment.Use(() => IsCollaborator(user)); // old way
experiment.Try(() => HasAccess(user)); // new way
}); // returns the control value
}
SCIENTIST.NET
https://github.com/scientistproject/Scientist.net
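To show what such an experiment does conceptually, here is a hand-rolled sketch: run the old path, run the new path, report any difference, and always return the old result. `MiniExperiment` is a hypothetical stand-in written for this explanation. It is not Scientist.NET's implementation, and the real library does considerably more than this.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for an experiment runner.
public static class MiniExperiment
{
    public static T Run<T>(string name, Func<T> control, Func<T> candidate, Action<string> report)
    {
        // The control (old way) decides what the caller gets back.
        T controlResult = control();

        try
        {
            // The candidate (new way) runs on real production input...
            T candidateResult = candidate();

            // ...but its result is only compared and reported, never returned.
            if (!EqualityComparer<T>.Default.Equals(controlResult, candidateResult))
                report($"{name}: mismatch, control={controlResult}, candidate={candidateResult}");
        }
        catch (Exception ex)
        {
            // A failing candidate must not break the user's request.
            report($"{name}: candidate threw {ex.GetType().Name}");
        }

        return controlResult;
    }
}
```

The design point is that users never see the new code's result until the reported mismatches have dropped to zero, which is what makes this kind of test safe to run in production.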
FROM OBSERVABILITY
TO OBSERVABILITY DRIVEN DEVELOPMENT
TDD
WRITE TESTS
PASS TESTS
REFACTOR
ODD: OBSERVABILITY DRIVEN DEVELOPMENT
DEFINE EXPECTED OUTCOME
MEASURE THE OUTCOME
CHANGE FEATURE & KEEP MEASURING
OBSERVABILITY DRIVEN DEVELOPMENT
PLAN DESIGN DEVELOP TEST DEPLOY OPERATE
TDD
WHAT IS THE USER IMPACT?
IS THE FEATURE BEHAVING LIKE WE EXPECTED?
DEPLOYMENT FEEDBACK
KNOWING HOW OUR SYSTEM
OPERATES SHOULD BE IN
OUR SYSTEM AS DEVELOPERS
WHAT IS NORMAL?
RELEASE GATES TO NEXT STAGE?
SLI: SERVICE LEVEL INDICATOR
SLO: SERVICE LEVEL OBJECTIVE
SLA: SERVICE LEVEL AGREEMENT
SLI SERVICE LEVEL INDICATOR
QUANTITATIVE MEASURE FOR YOUR SERVICE
AVAILABILITY
ERROR RATE
DURATION
LATENCY
FRESHNESS
SLO SERVICE LEVEL OBJECTIVE
TARGET MEASURE FOR A SERVICE
MEASURED BY SLIS
AVAILABILITY OF 99.9% FOR LAST 30 DAYS
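To make the example concrete: a 99.9% availability objective over 30 days leaves about 43 minutes of allowed unavailability (0.1% of 43,200 minutes). A minimal sketch of the arithmetic follows; the `Slo` class and its method names are hypothetical, and measuring availability as good requests divided by total requests is an assumption, since the slides do not say how the indicator is computed.

```csharp
using System;

// Hypothetical helper for SLI/SLO arithmetic.
public static class Slo
{
    // SLI: availability as the percentage of requests that succeeded.
    public static double AvailabilityPercent(long goodRequests, long totalRequests)
        => totalRequests == 0 ? 100.0 : 100.0 * goodRequests / totalRequests;

    // Minutes of full unavailability the objective tolerates within the window.
    public static double ErrorBudgetMinutes(double sloPercent, int windowDays)
        => (100.0 - sloPercent) / 100.0 * windowDays * 24 * 60;

    // The objective is met when the measured indicator reaches the target.
    public static bool IsMet(double measuredPercent, double sloPercent)
        => measuredPercent >= sloPercent;
}
```

For instance, 999 good requests out of 1,000 gives an indicator of 99.9%, which just meets the objective.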
SLA SERVICE LEVEL AGREEMENT
CONTRACT WITH USERS WITH
CONSEQUENCES WHEN
MISSING YOUR SLO
10% DISCOUNT FOR EACH 0.1%
BELOW AVAILABILITY SLO
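The discount rule above works out as simple arithmetic. The sketch assumes that only full 0.1% steps count and that the discount is capped at 100%; neither detail is stated on the slide, and `SlaDiscount` is a hypothetical name. `decimal` is used so that 0.1 steps are exact.

```csharp
using System;

// Hypothetical calculator for the example SLA: 10% discount per 0.1% below the SLO.
public static class SlaDiscount
{
    public static decimal Percent(decimal sloPercent, decimal measuredPercent)
    {
        // No discount when the objective was met.
        if (measuredPercent >= sloPercent) return 0m;

        // Count full 0.1% steps below the objective (assumption: partial steps do not count).
        decimal steps = Math.Floor((sloPercent - measuredPercent) / 0.1m);

        // Assumption: the discount can never exceed the full price.
        return Math.Min(100m, steps * 10m);
    }
}
```

With an objective of 99.9% and a measured 99.65%, the service is two full steps short, so the customer would get a 20% discount under these assumptions.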
HOW TO DO THIS IN PRACTICE?
DEFINE AN SLO
BUILD INDICATORS BY LOGGING / METRICS
BUILD A DASHBOARD – START MEASURING
MAKE CHOICES BASED ON SERVICE LEVEL
LEAVE SLA PART FOR SALES PEOPLE
MAKE IT VISIBLE
SLO AVAILABILITY – 99.9954%
RING 0 – 98%
RING 1 – 99.91%
RING 2 – 100%
USER SIGN UP FLOW – 100%
CHECKOUT – 99.91%
SEARCH – 98%
CLIENT A - USER SIGN UP FLOW – 100%
CLIENT A - CHECKOUT – 99.91%
CLIENT A - SEARCH – 90%
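Numbers like these can also feed a release gate: only promote a deployment to the next ring when every flow meets the objective. The sketch below is an assumption about how such a gate might look; `ReleaseGate` is a hypothetical class, and the 99.9% target used in the usage note is borrowed from the earlier SLO example rather than from this dashboard.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical release gate driven by per-flow availability percentages.
public static class ReleaseGate
{
    // Names of the flows whose measured availability is below the objective.
    public static IReadOnlyList<string> FlowsBelowSlo(
        IReadOnlyDictionary<string, double> availabilityByFlow, double sloPercent)
        => availabilityByFlow
            .Where(flow => flow.Value < sloPercent)
            .Select(flow => flow.Key)
            .OrderBy(name => name)
            .ToList();

    // Promote to the next ring only when no flow misses the objective.
    public static bool CanPromote(
        IReadOnlyDictionary<string, double> availabilityByFlow, double sloPercent)
        => FlowsBelowSlo(availabilityByFlow, sloPercent).Count == 0;
}
```

Fed with the per-flow numbers above and a 99.9% objective, the gate would block promotion and name SEARCH (98%) as the flow to look at first.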
TAKEAWAYS
START SMALL AT KEY AREAS OF YOUR APP
EXPLORE TOOLS
EMBRACE TESTING ON PROD!
FOCUS ON CUSTOMERS
TAKE OWNERSHIP OF CODE
GEERT VAN DER CRUIJSEN
@GEERTVDC
THANK YOU!
MOBILEFIRSTCLOUDFIRST.NET

Cloudbrew 2019 observability driven development