@molly_struve
Building A Scalable Monitoring System
Monitoring Mistakes
Overhauling the System
The Payoff
Incredibly Inconsistent
Inconsistent Alerts
Required no action
Reported data
Immediate action required
Inconsistent alerts make on-call devs miserable
Monitoring Must Haves
1. Consolidate Monitoring To a Single Place
Monitoring Must Haves
1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
Action Required: #ops_alerts, #dev_alerts
No Action Needed: #ops_reporting, #dev_reporting
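The split between alert channels and reporting channels can be sketched as a small routing table. This is a minimal illustration, not the system described in the talk: only the four channel names come from the slides, while the `Notification` type, its fields, and the `route` function are hypothetical.

```python
from dataclasses import dataclass

# Channel names are taken from the slides; the (team, action_required)
# keying is an assumption made for this sketch.
CHANNELS = {
    ("ops", True): "#ops_alerts",
    ("dev", True): "#dev_alerts",
    ("ops", False): "#ops_reporting",
    ("dev", False): "#dev_reporting",
}


@dataclass(frozen=True)
class Notification:
    """Hypothetical notification emitted by a monitor."""
    team: str              # "ops" or "dev"
    action_required: bool  # True only if someone must act on it
    message: str


def route(notification: Notification) -> str:
    """Return the channel this notification should be posted to."""
    key = (notification.team, notification.action_required)
    try:
        return CHANNELS[key]
    except KeyError:
        raise ValueError(f"unknown team: {notification.team!r}") from None
```

Routing on an explicit `action_required` flag forces whoever adds a monitor to decide up front whether it is an alert or a report, which is one way to keep the alert channels free of noise.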
Monitoring Must Haves
1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Alerts Mutable
30, 60, 90 minutes
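A time-boxed mute could look roughly like the sketch below. It assumes the 30, 60 and 90 minutes on the slide are the mute durations on offer; the `MuteRegistry` class and its methods are hypothetical names, not part of any tool named in the talk.

```python
from datetime import datetime, timedelta

# Assumption: these are the only mute durations allowed.
ALLOWED_MUTE_MINUTES = (30, 60, 90)


class MuteRegistry:
    """Hypothetical in-memory record of which alerts are muted, and until when."""

    def __init__(self) -> None:
        self._muted_until: dict[str, datetime] = {}

    def mute(self, alert_name: str, minutes: int, now: datetime) -> datetime:
        """Mute an alert for a fixed window and return the expiry time."""
        if minutes not in ALLOWED_MUTE_MINUTES:
            raise ValueError(f"mute must be one of {ALLOWED_MUTE_MINUTES} minutes")
        expiry = now + timedelta(minutes=minutes)
        self._muted_until[alert_name] = expiry
        return expiry

    def is_muted(self, alert_name: str, now: datetime) -> bool:
        """An alert is muted only until its window expires."""
        expiry = self._muted_until.get(alert_name)
        return expiry is not None and now < expiry
```

Because every mute expires on its own, an alert that is still firing after the window comes back without anyone having to remember to unmute it.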
Miss new alerts
Monitoring Must Haves
1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable
4. Track Alert History
Tracking Alert History
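One simple way to keep alert history is to record every trigger and then ask which alerts fire most often. The sketch below is an assumption about what such tracking might look like; the slides do not say how history was stored, and `AlertHistory` and its methods are hypothetical.

```python
from collections import Counter
from datetime import datetime


class AlertHistory:
    """Hypothetical append-only log of alert triggers."""

    def __init__(self) -> None:
        self._events: list[tuple[str, datetime]] = []

    def record(self, alert_name: str, triggered_at: datetime) -> None:
        """Store one trigger of the named alert."""
        self._events.append((alert_name, triggered_at))

    def most_frequent(self, since: datetime, limit: int = 3) -> list[tuple[str, int]]:
        """Return the alerts that triggered most often at or after `since`."""
        counts = Counter(name for name, at in self._events if at >= since)
        return counts.most_common(limit)
```

A report like `most_frequent` points at the alerts that are either too sensitive or signal a recurring problem worth fixing for good.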
Onboarding is a breeze
3 onboarding steps:
1. Show them the monitoring setup
2. If an alert goes off, you have to address it
3. How to mute a triggered alert
Happier on-call developers
All alerts must be actionable
No more noise
Developers began helping to improve our monitoring system
30-40 Alerts
>90 Alerts
If you want…
1. Onboarding to be a breeze
2. On-call developers to be a lot happier
3. Developers to help improve your monitoring systems
Monitoring Must Haves
1. Consolidate Monitoring To a Single Place
2. Make ALL Alerts Actionable
3. Make Sure Alerts Are Mutable
4. Track Alert History
❤❤❤
Questions?