From the course: Monitoring and Observability with Datadog
Datadog incidents - Datadog Tutorial
Datadog incidents
- Incidents are events or occurrences that cause a deviation from normal system performance. Throughout the course, we've set up metrics, and we've set up monitors to keep an eye on these critical metrics. Whenever these monitors go into an alert or alarm state, they could, depending on their severity, constitute incidents. Incident response, or incident management, is the set of actions taken to manage the aftermath of an infrastructure or application malfunction. Observability is not complete without an incident response or incident management plan in place to handle our services whenever they start to malfunction. Incident response consists of different phases, such as detection, triage, diagnosis, resolution, monitoring, and postmortem. As we've seen throughout the course, Datadog plays a huge role in all of these phases. Datadog is great for detecting, triaging, diagnosing, and monitoring our services…
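To make the phases and the severity rule a bit more concrete, here is a minimal Python sketch. It does not call Datadog. The names `Phase`, `next_phase`, and `is_incident`, the numeric severity scale (1 is most severe), and the default threshold are all assumptions made for illustration, not part of the course or of any Datadog API.

```python
from enum import IntEnum
from typing import Optional


class Phase(IntEnum):
    """Incident response phases, in the order listed in the transcript."""
    DETECTION = 1
    TRIAGE = 2
    DIAGNOSIS = 3
    RESOLUTION = 4
    MONITORING = 5
    POSTMORTEM = 6


def next_phase(phase: Phase) -> Optional[Phase]:
    """Return the phase that follows `phase`, or None after the postmortem."""
    if phase is Phase.POSTMORTEM:
        return None
    return Phase(phase + 1)


def is_incident(monitor_state: str, severity: int, threshold: int = 2) -> bool:
    """Hypothetical rule: an alerting monitor counts as an incident only
    when its severity is at or above the threshold (lower number = worse).

    `monitor_state` is a plain string such as "Alert" or "OK"; the exact
    values and the severity scale are assumptions for this sketch.
    """
    return monitor_state == "Alert" and severity <= threshold


if __name__ == "__main__":
    # Walk through the phases in order.
    phase: Optional[Phase] = Phase.DETECTION
    while phase is not None:
        print(phase.name.lower())
        phase = next_phase(phase)

    # A severe alert is treated as an incident; a minor one is not.
    print(is_incident("Alert", severity=1))
    print(is_incident("Alert", severity=4))
```

In practice, the rule for when an alerting monitor becomes an incident is a team decision; the threshold here only stands in for "depending on their severity."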