Observability at Scale
Presented By: Rahul Miglani
VP Engineering - DevOps Practice Head
Knoldus Inc.
About Knoldus
Knoldus is a technology consulting firm with a focus on modernizing digital systems
at the pace your business demands.
DevOps
Functional. Reactive. Cloud Native
Our Agenda
01 What Is Observability in DevOps?
02 Components of Observability
03 Benefits of Observability
04 Common Pitfalls in Observability
05 Observability at Scale and Best Practices
What Is Observability in DevOps?
Observability is the foundation of reliability. When things
inevitably go wrong, it enables engineers to diagnose and fix
issues quickly. The more complex a system becomes, and the
higher users' expectations of reliability, the more important it
is to invest in advanced observability methods for reasoning
about what is going on.
Full Stack Observability
Components of Observability
LOGGING
METRICS
TRACING
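As a rough illustration of how the three components relate, the sketch below emits a log line, a metric, and two trace spans for a single request, tied together by a shared trace ID. It uses only the Python standard library, with in-memory stand-ins for real backends. The service name, the `span` helper, and `handle_request` are hypothetical names chosen for this example, not part of the original deck.

```python
import json
import logging
import time
import uuid
from collections import Counter
from contextlib import contextmanager

logging.basicConfig(level=logging.INFO, format="%(message)s")
log = logging.getLogger("checkout")  # hypothetical service name

metrics = Counter()  # metric name -> count (stand-in for a metrics backend)
spans = []           # finished spans (stand-in for a tracing backend)


@contextmanager
def span(name, trace_id):
    """Record how long a named unit of work took within one trace."""
    start = time.perf_counter()
    try:
        yield
    finally:
        spans.append({
            "trace_id": trace_id,
            "name": name,
            "duration_ms": (time.perf_counter() - start) * 1000,
        })


def handle_request(order_id):
    trace_id = uuid.uuid4().hex  # correlates the log, metric, and spans
    with span("handle_request", trace_id):
        metrics["requests_total"] += 1  # METRIC: cheap aggregate counter
        with span("charge_card", trace_id):  # TRACE: nested unit of work
            time.sleep(0.01)  # pretend to call a payment provider
        # LOG: one discrete, structured event with context
        log.info(json.dumps({
            "event": "order_processed",
            "order_id": order_id,
            "trace_id": trace_id,
        }))
    return trace_id


trace_id = handle_request(order_id=42)
```

In a real system the counter and span list would be replaced by exporters to a metrics store and a tracing backend. The point of the sketch is only that all three signals carry the same `trace_id`, so an engineer can move from a metric spike to the matching logs and spans.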
Observability Pipeline
Benefits of Observability
● It gives the organization a complete understanding
of the internal workings of its systems.
● It reduces downtime by bringing the likely causes of
an issue into focus, so less time is spent resolving it.
● It gives the DevOps team the ability to identify the root
causes of issues.
● It makes debugging and troubleshooting easier.
● It helps companies monitor the performance of an
application or system.
● It helps reduce the Mean Time to Detection (MTTD)
and the Mean Time to Resolution (MTTR) for
software infrastructure and services.
● It can also improve customer satisfaction when staff
use data from logs and metrics to improve services.
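To make MTTD and MTTR concrete, here is a minimal worked example in Python. The incident timestamps are invented for illustration. It also assumes both durations are measured from the moment the incident started, whereas some teams measure MTTR from detection instead.

```python
from datetime import datetime
from statistics import mean

# Hypothetical incident records: (started, detected, resolved)
incidents = [
    (datetime(2024, 1, 1, 10, 0), datetime(2024, 1, 1, 10, 5),
     datetime(2024, 1, 1, 10, 35)),
    (datetime(2024, 1, 2, 14, 0), datetime(2024, 1, 2, 14, 15),
     datetime(2024, 1, 2, 15, 25)),
]


def mean_minutes(deltas):
    """Average a sequence of timedeltas and return the result in minutes."""
    return mean(d.total_seconds() for d in deltas) / 60


# MTTD: average time from incident start to detection
mttd = mean_minutes(detected - started for started, detected, _ in incidents)

# MTTR: average time from incident start to resolution (assumed definition)
mttr = mean_minutes(resolved - started for started, _, resolved in incidents)

print(f"MTTD: {mttd:.0f} min, MTTR: {mttr:.0f} min")
# prints "MTTD: 10 min, MTTR: 60 min"
```

Tracking these two numbers over time is one way to check whether an investment in observability is actually shortening detection and resolution.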
Common Pitfalls in Observability
Pitfall 1: Uneven Distribution of Information
Pitfall 2: Working Without the Right Tools
Pitfall 3: Poor Alerting System
Best Practices in Observability
● Don’t try to monitor everything. Instead, gather only the data you need.
● Focus on monitoring the essential things and fixing them when they fail.
● Avoid storing every available log or data point. Store those that give insight into critical events.
● Set up alerts on critical events.
● Create graphs that every team member can easily understand, as this improves
the usability of the information.
Measure Everything
● Changes made to monitoring configuration.
● "Out of hours" alerts.
● Team alerting balance.
● False positives.
● False negatives.
● Alert creation.
● Alert acknowledgement.
● Alert silencing and silence duration.
● Unactionable alerts.
● Usability: alerts, runbooks, dashboards.
● MTTD, MTTR, impact.
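Several of the items in the "Measure Everything" list can be derived from a simple record of fired alerts. The sketch below is one possible way to do that in Python. The `Alert` fields, the working-hours window (Mon-Fri, 09:00-18:00), and the sample data are all assumptions made for this example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alert:
    fired_at: datetime
    real_incident: bool  # False means the alert was a false positive
    actionable: bool     # could the responder do anything about it?
    team: str


def out_of_hours(ts, start_hour=9, end_hour=18):
    """Assumed working hours: Mon-Fri, 09:00-18:00 local time."""
    return ts.weekday() >= 5 or not (start_hour <= ts.hour < end_hour)


def summarize(alerts):
    """Aggregate alert-quality figures from a list of Alert records."""
    total = len(alerts)
    per_team = {}
    for a in alerts:
        per_team[a.team] = per_team.get(a.team, 0) + 1
    false_positives = sum(not a.real_incident for a in alerts)
    return {
        "total": total,
        "false_positive_rate": false_positives / total if total else 0.0,
        "unactionable": sum(not a.actionable for a in alerts),
        "out_of_hours": sum(out_of_hours(a.fired_at) for a in alerts),
        "per_team": per_team,  # team alerting balance
    }


# Hypothetical sample data
alerts = [
    Alert(datetime(2024, 3, 4, 11, 0), True, True, "payments"),    # Mon 11:00
    Alert(datetime(2024, 3, 5, 2, 30), False, False, "payments"),  # Tue 02:30
    Alert(datetime(2024, 3, 9, 15, 0), True, False, "search"),     # Sat 15:00
    Alert(datetime(2024, 3, 6, 16, 0), False, True, "search"),     # Wed 16:00
]
report = summarize(alerts)
```

False negatives cannot be computed from alert records alone, because they are incidents for which no alert fired. Measuring them requires comparing the alert history against a separate incident log.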
Rahul Miglani
DevOps Practice Head
DevOps@Knoldus.com
Thank You!
