@PierreVincent
Observability & Testing
Explore what’s really happening
under the hood
March 8th, 2018 – QCon London
@PierreVincent pvincent.io
Pierre Vincent
Infra. & Reliability Manager
@PierreVincent
pvincent.io
Reaching production is
only the beginning
No system is immune to failure
Be ready to recover
When distributing a system,
we’re also distributing the places
where things might go wrong
Pre-Prod Testing & Monitoring
only cover known failure modes.
What about everything else?
“Monitoring tells you whether the system works. Observability lets you ask why it's not working.”
– Baron Schwartz
Metrics
Logging
Tracing
Metrics
System metrics: CPU usage
Application metrics: Error rates
Business metrics: Customer conversions
Watch out for over-reliance on metrics
Not every metric deserves an alert: limit alerting to user-impacting symptoms
Limitations at high cardinality: poor fine-grained debugging (e.g. CustomerId)
Real-time querying means some trade-offs on retention: not suitable for long-term trend analysis
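The high-cardinality limitation can be made concrete with some arithmetic. In a typical label-based metrics store, each distinct combination of label values becomes its own time series. A rough sketch, with label names and counts invented for illustration (they are not from the talk):

```python
from math import prod
from typing import Mapping


def series_count(label_cardinalities: Mapping[str, int]) -> int:
    """Upper bound on time series for one metric: the product of the
    number of distinct values each label can take."""
    return prod(label_cardinalities.values())


# Hypothetical label sets, chosen only to show the order of magnitude.
low = {"service": 10, "status_code": 5}
high = {**low, "customer_id": 50_000}

print(series_count(low))   # 50
print(series_count(high))  # 2500000
```

Adding a single per-customer label multiplies the series count by the number of customers, which is why this kind of question tends to be pushed to logs or events instead.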
Logging
Making sense of (a lot of) logs
Centralised, Searchable, Correlated
Log Correlation
[Diagram: services A, B, C, D, E, F, G, H and J, with trace ID a1b2c3 attached along the request path]
ERROR [svc=H][trace=a1b2c3] Failed to save order
Cause: Cassandra timeout exception
ERROR [svc=F][trace=a1b2c3] Failed to complete order
Cause: Shipping service responded with 500
ERROR [svc=A][trace=a1b2c3] Failed to process order
Cause: Order process manager responded with 500
INFO [svc=G][trace=a1b2c3] Items verified in stock
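A minimal sketch of what correlation buys you, assuming logs from all services land in one place and follow the `LEVEL [svc=X][trace=ID] message` layout shown on the slide. The parser, the function name and the unrelated extra line are illustrative only:

```python
import re
from typing import Dict, List

# Matches the layout used on the slide: LEVEL [svc=X][trace=ID] message
LOG_LINE = re.compile(
    r"^(?P<level>\w+) \[svc=(?P<svc>\w+)\]\[trace=(?P<trace>\w+)\] (?P<msg>.*)$"
)


def correlate(lines: List[str], trace_id: str) -> List[Dict[str, str]]:
    """Return the parsed entries that belong to one trace, in input order."""
    entries = []
    for line in lines:
        match = LOG_LINE.match(line)
        if match and match.group("trace") == trace_id:
            entries.append(match.groupdict())
    return entries


logs = [
    "ERROR [svc=H][trace=a1b2c3] Failed to save order",
    "INFO [svc=B][trace=ffff00] Unrelated request",  # hypothetical noise
    "ERROR [svc=F][trace=a1b2c3] Failed to complete order",
    "ERROR [svc=A][trace=a1b2c3] Failed to process order",
    "INFO [svc=G][trace=a1b2c3] Items verified in stock",
]

for entry in correlate(logs, "a1b2c3"):
    print(entry["svc"], entry["level"], entry["msg"])
```

Filtering on the trace ID turns a pile of per-service errors into one story: the failure in H surfaces in F and then in A.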
2018-02-20T16:38:23+00:00 ERROR Read timed out
Hmmm thanks?
JSON
When did it happen?    timestamp: 2018-02-20T16:38:23+00:00
Can I trace it?        requestID: ec667cb45
What is it?            team: events, service: registration-service
What is it running?    commit: 542a8b8e, build: 542a8b8e.7, runtime: java-1.8.0_161
Where is it?           node: node_e79f3e52, region: europe-west2
What is the message?   level: ERROR, log: Read timed out
Who caused it?         customerID: 55123, userID: 458
Any other info?        ...
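One way to produce logs shaped like the structured example, sketched with Python's standard `logging` module. The `JsonFormatter` class and the `context` key passed through `extra` are illustrative choices, not something the talk prescribes; field names and values are copied from the slide:

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object: standard fields, then
    static deployment fields, then any per-call context."""

    def __init__(self, static_fields):
        super().__init__()
        self.static_fields = static_fields

    def format(self, record):
        created = datetime.fromtimestamp(record.created, timezone.utc)
        entry = {
            "timestamp": created.isoformat(timespec="seconds"),
            "level": record.levelname,
            "log": record.getMessage(),
        }
        entry.update(self.static_fields)
        # "context" is a hypothetical attribute set via logging's `extra`.
        entry.update(getattr(record, "context", {}))
        return json.dumps(entry)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter({
    "team": "events",
    "service": "registration-service",
    "build": "542a8b8e.7",
    "region": "europe-west2",
}))
logger = logging.getLogger("registration-service")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.propagate = False

logger.error(
    "Read timed out",
    extra={"context": {"requestID": "ec667cb45", "customerID": 55123, "userID": 458}},
)
```

Static fields (team, service, build, region) are set once per process, while request-scoped fields travel with each call, so every line can answer the questions above without extra effort at the call site.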
Structured logs unleash high-cardinality exploration
Error rate spike isolated by build version
Activity spike isolated for a single customer
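Both isolations above come down to grouping structured events by a field. A small sketch, assuming events are already parsed into dictionaries; the sample events, the `count_by` helper and the second build number are made up for illustration:

```python
from collections import Counter

# Hypothetical structured log events. Field names follow the slides,
# values are invented.
events = [
    {"level": "ERROR", "build": "542a8b8e.7", "customerID": 55123},
    {"level": "ERROR", "build": "542a8b8e.7", "customerID": 90210},
    {"level": "INFO", "build": "542a8b8e.6", "customerID": 55123},
    {"level": "ERROR", "build": "542a8b8e.7", "customerID": 55123},
    {"level": "INFO", "build": "542a8b8e.7", "customerID": 1234},
]


def count_by(events, field, level=None):
    """Count events per value of `field`, optionally for one log level."""
    return Counter(
        e[field] for e in events if level is None or e["level"] == level
    )


errors_by_build = count_by(events, "build", level="ERROR")
activity_by_customer = count_by(events, "customerID")

print(errors_by_build.most_common(1))       # build with the most errors
print(activity_by_customer.most_common(1))  # busiest customer
```

In practice the same group-by would run in the log store's query language rather than in application code.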
Tracing
[Figure: one trace of 172ms on a 0ms to 172ms timeline, made of spans across event-mgt-api, attendees-service and registration-service. Span operations: get_confirmed_attendees, get_attendees, get_confirm_status, cassandra/select, mysql/select. Span durations shown: 172ms, 73ms, 54ms, 78ms, 41ms.]
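To make the trace/span vocabulary concrete, here is a toy in-process tracer. It is only a sketch: real tracing systems also propagate the trace ID across service boundaries, and the `Tracer` class is hypothetical. The pairing of operations to services and the nesting below are assumptions, since the slide text does not preserve them:

```python
import itertools
import time
import uuid
from contextlib import contextmanager


class Tracer:
    """Toy tracer: one trace ID, spans linked to their parent, timed in ms."""

    def __init__(self):
        self.trace_id = uuid.uuid4().hex[:6]
        self.finished = []
        self._stack = []
        self._ids = itertools.count(1)

    @contextmanager
    def span(self, service, operation):
        span = {
            "trace": self.trace_id,
            "id": next(self._ids),
            "parent": self._stack[-1]["id"] if self._stack else None,
            "service": service,
            "operation": operation,
        }
        self._stack.append(span)
        start = time.perf_counter()
        try:
            yield span
        finally:
            span["duration_ms"] = (time.perf_counter() - start) * 1000
            self._stack.pop()
            self.finished.append(span)


tracer = Tracer()
# Assumed nesting: the API call fans out to the two backing services.
with tracer.span("event-mgt-api", "get_confirmed_attendees"):
    with tracer.span("attendees-service", "get_attendees"):
        pass
    with tracer.span("registration-service", "get_confirm_status"):
        pass

for s in tracer.finished:
    print(s["service"], s["operation"], "parent:", s["parent"])
```

Each span records who called it and how long it took, which is the information the timeline view is drawn from.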
Known Unknowns: Monitoring & Resiliency
Healthchecks, Metrics (Alerting)
Unknown Unknowns: Debugging & Exploration
Logs, Metrics (Queries), Tracing, Events
Observability is the key to
unlock new ways of testing
Test (a little bit*) less
Don’t spend all your time testing
...keep some for instrumenting
Thank you!
@PierreVincent
pvincent.io

[Test Bash Manchester] Observability and Testing