Shape Up
Skills Builder - September 4th, 2020
Confidential
How to Streamline Incident Response with
InfluxDB, PagerDuty and Rundeck
April 20th, 2021
Speaker: Craig Hobbs
Craig is a Solution Consultant, SyFy Super Fan,
and Do-gooder at Rundeck.
He has 10 years of experience in system
integrations, environment observability, and
application performance. At Rundeck, he helps
DevOps and IT teams leverage Runbook
Automation and Orchestration to solve complex
automation challenges. #BlackLivesMatter
#FightsForTheUser
Twitter: @chobbs
Agenda
1 MTTR Impact on DevOps
2 Shorten Resolution Time
3 Solution Overview
4 Demo
5 Q/A
2021 Prediction for IT Automation:
“Organizations will lower operational costs by 30% by combining
hyper-automation technologies with redesigned operational
processes”
- Gartner’s IT Automation Predictions for 2021
MTTR (mean time to resolution)
“Average time it takes to fully resolve a
failure.”
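As a rough illustration of that definition, MTTR can be computed as the mean of (resolved time minus opened time) over a set of incidents. The sketch below assumes a hypothetical incident log of (opened, resolved) timestamp pairs; the function name and sample data are illustrative only and do not come from the presentation.

```python
from datetime import datetime, timedelta


def mean_time_to_resolution(incidents):
    """Return the average (resolved - opened) duration across incidents."""
    if not incidents:
        raise ValueError("need at least one resolved incident")
    total = sum((resolved - opened for opened, resolved in incidents), timedelta())
    return total / len(incidents)


# Hypothetical incident log: (opened, resolved) timestamps.
incidents = [
    (datetime(2021, 4, 1, 9, 0), datetime(2021, 4, 1, 11, 30)),   # 150 min
    (datetime(2021, 4, 2, 14, 0), datetime(2021, 4, 2, 14, 36)),  # 36 min
    (datetime(2021, 4, 3, 8, 0), datetime(2021, 4, 3, 8, 6)),     # 6 min
]

mttr = mean_time_to_resolution(incidents)
print(mttr)  # 1:04:00, i.e. (150 + 36 + 6) / 3 = 64 minutes
```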
Impact of MTTR on DevOps
While MTTR is a critical metric for DevOps teams on its own, it also encourages
DevOps practices in a variety of ways:
● Lower impact of production incidents
● Save time and reduce escalation
● Monitor for problems actively
● Improve velocity, quality and performance
What can you do?
Shortening Your Resolution Time
[Figure: three panels, each plotted against Incident Quantity]
Monitoring & Alerting: Act 30 min, Mobilize 1 hr, Resolve 1 hr
Incident Management: Act 5 min, Mobilize < 1 min, Resolve 30 mins
Runbook Automation: Resolve 1-5 mins
Identify: 5-10 secs
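To make the stage timings on this slide concrete, the arithmetic can be sketched as below. The grouping of durations into three scenarios is a reading of the figure, not something the slide states outright. The sketch also assumes that "< 1 min" counts as 1 minute, that "1-5 mins" counts as its upper bound of 5, and that the Runbook Automation scenario keeps the Act and Mobilize times from the Incident Management scenario. The 5-10 second Identify stage is left out as negligible.

```python
# Stage durations in minutes, read off the slide (grouping is an assumption).
STAGES = {
    "monitoring_and_alerting": {"act": 30, "mobilize": 60, "resolve": 60},
    "incident_management": {"act": 5, "mobilize": 1, "resolve": 30},
    # Assumption: Act and Mobilize are unchanged; only Resolve is automated.
    "runbook_automation": {"act": 5, "mobilize": 1, "resolve": 5},
}


def total_minutes(stage_durations):
    """Sum the per-stage durations for one scenario."""
    return sum(stage_durations.values())


totals = {name: total_minutes(stages) for name, stages in STAGES.items()}
for name, minutes in totals.items():
    print(f"{name}: {minutes} min")
```

Under these assumptions the end-to-end time drops from 150 minutes to 36 minutes, and then to about 11 minutes.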
For Monitoring and Alerting?
Additional InfluxData Solution Components
● InfluxDB Templates
● Telegraf Agent
● InfluxDB Alert Pipeline
For Incident Management?
For Runbook Automation?
Automate Your Incident Resolution
Stages: Identify, Act / Mobilize, Resolve
[Diagram labels: Triggered Alerts; Service Automation; Trigger; Job Execution Events; Incident Enhancement; IT Infrastructure Monitoring; Auto-Remediation Jobs; Metrics]
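The hand-off from Identify to Act / Mobilize could look roughly like the sketch below, which turns a monitoring alert into an event body in the shape of the PagerDuty Events API v2 (`routing_key`, `event_action`, `dedup_key`, `payload`). The function name, the alert field names, and the mapping from alert levels (`crit`, `warn`, `info`, `ok`) to severities are assumptions for illustration; the presentation does not show how its integration is configured.

```python
import json

# Assumed mapping from monitoring check levels to PagerDuty severities.
LEVEL_TO_SEVERITY = {"crit": "critical", "warn": "warning", "info": "info", "ok": "info"}


def build_pagerduty_event(routing_key, check_name, host, level, message):
    """Build an Events API v2 body; an 'ok' level resolves the open incident."""
    return {
        "routing_key": routing_key,
        "event_action": "resolve" if level == "ok" else "trigger",
        # Same key for trigger and resolve so both refer to one incident.
        "dedup_key": f"{check_name}:{host}",
        "payload": {
            "summary": f"{check_name} on {host}: {message}",
            "source": host,
            "severity": LEVEL_TO_SEVERITY.get(level, "error"),
        },
    }


# Hypothetical alert; the body would be POSTed to
# https://events.pagerduty.com/v2/enqueue (no request is sent here).
event = build_pagerduty_event("EXAMPLE_ROUTING_KEY", "cpu_check", "web-01", "crit", "usage above 90%")
print(json.dumps(event, indent=2))
```

Reusing one `dedup_key` per check and host lets a later `ok` alert resolve the incident that the `crit` alert opened.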
Use-Case - “Virtual DevOps”
Status:
1. Customer care teams are an
integral part of any organization.
2. DevOps teams are often inundated with
manual requests from the customer
service team to assist with
resources to resolve trivial issues.
3. Customer care teams then diagnose
and resolve the issue.
Customer Care DevOps
Demo
Rundeck Template for InfluxDB
The Rundeck template does the following:
● Leverage the Rundeck API for
execution meta-data
● Facilitate secure, portable, and
source-controlled Rundeck job
states.
● Simplify sharing and using pre-built
InfluxDB solutions.
Working Together for Shorter Incidents
By combining real-time monitoring from
InfluxDB, faster response organization
from PagerDuty, and automated runbook
orchestration from Rundeck, DevOps
teams can shorten incident time and
reduce errors.
Questions
Archive Slides
However you measure resolution time, the one
constant is the need to keep that number down.
Runbook Automation
● Enable anyone to have
self-service automation access
to operations tasks that were
only available to subject matter
experts.
● Make existing automation more
secure, auditable, and easier to
run.
Starting Rundeck Use Cases
[Figure labels: Before, After]
Automate Incident Response: Shorter Incidents. Fewer Escalations.
Automate Service Requests: Faster Turnaround. Fewer Interruptions.
Rundeck Enterprise
Capabilities
Distributed execution
Orchestration workflows
Error handling
Healthchecks
Webhooks
Scheduling
Guided tours
Secure key storage
Role-based access
SSO support
History and audit trail
Ticket integration
Clustering
HA and failover
Plugin repositories
Use your existing tools and scripts (any language or automation tools)
Infrastructure aware
Made for DevOps and Cloud Native ways of working
Security and compliance friendly
Infrastructure details and state
Collect and Process Output
Authentication and Roles
Tickets, Work Status, Approvals
Workflow and Scheduling
Case for Self-Service Automation
How can we reduce the burden on SRE teams
and empower customer care?
Make this information available at the click of a
button! This is where the power of self-service
automation comes into the picture.
● Self-service automation: Equip customer
care teams with the ability to resolve issues
quickly and allow the SRE team to maintain a
set of standards and practices for accessing
secure internal operations.
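A minimal sketch of the self-service idea: a customer care tool builds a "run job" request for the Rundeck API, but only for jobs the SRE team has approved for that role. The request shape (`POST /api/<version>/job/<id>/run` with an `X-Rundeck-Auth-Token` header and an `options` body) follows Rundeck's API. The role names, job IDs, API version, and the client-side allowlist are all hypothetical; in a real deployment Rundeck's own access control policies would enforce this.

```python
import json
import urllib.request

# Hypothetical allowlist maintained by the SRE team: role -> runnable job IDs.
ALLOWED_JOBS = {
    "customer-care": {"restart-web-service", "check-account-status"},
    "sre": {"restart-web-service", "check-account-status", "failover-database"},
}


def build_run_request(base_url, api_version, token, role, job_id, options=None):
    """Return a prepared (unsent) request that would run the job if permitted."""
    if job_id not in ALLOWED_JOBS.get(role, set()):
        raise PermissionError(f"role {role!r} may not run job {job_id!r}")
    body = json.dumps({"options": options or {}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/api/{api_version}/job/{job_id}/run",
        data=body,
        method="POST",
        headers={
            "X-Rundeck-Auth-Token": token,
            "Content-Type": "application/json",
            "Accept": "application/json",
        },
    )


# Hypothetical server URL, API version and token; nothing is sent over the network.
request = build_run_request(
    "https://rundeck.example.com/", 41, "EXAMPLE_TOKEN",
    role="customer-care", job_id="restart-web-service",
    options={"service": "web"},
)
print(request.get_method(), request.full_url)
```

Passing the prepared request to `urllib.request.urlopen` would submit it; that step is left out so the sketch stays free of side effects.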
We look forward to bringing together our
community of developers to learn, interact
and share tips and use cases.
10-11 May 2021
Hands-On Flux Training
18-19 May 2021
Virtual Experience
www.influxdays.com/emea-2021-virtual-experience/
