SENSE AND
SENSU-BILITY
Painless Metrics And Monitoring
In The Cloud with Sensu
Bethany Erskine
nycdevops Meetup
http://github.com/skymob/sensu-tutorial

Thursday, November 14, 13
DO YOU LOVE
YOUR
MONITORING
SETUP?
#MONITORINGLOVE

MY STORY

+

(╯︵╰,)

WHY SENSU
✓Ruby
✓Plugins can be written in any language
✓sensu-chef cookbook
✓community

WHY SENSU
✓re-use Nagios checks
✓metrics and checks all collected by one system
✓Graphite integration
✓easy to scale

WHY SENSU

✓“Can I do X with Sensu?” probably!

WHY SENSU

WHY SENSU?
✓Sensu source is well-written and easy to parse
✓https://github.com/sensu

WHY SENSU?
✓sensu-community-plugins
✓80 contributors
✓over 600 plugins
✓https://github.com/sensu/sensu-community-plugins

TODAY at
PAPERLESS
Two Sensu environments (prod/testing)
~ 250 - 275 instances of sensu-client
4-6 Sensu-server instances
25k Metrics/Hour to Graphite
1 custom dashboard
1 custom CLI

RESOURCES
✓All of our Sensu infrastructure is virtualized.
✓We typically give a sensu-server box 1.5GB RAM and 4 processors,
scaling up RAM for any box running more than one Sensu service on it.
✓4GB RAM for a monolithic Sensu install (Rabbit, Redis, all Sensu
components on one)

AS WE GREW
Growing pains and lessons learned...

NEEDS MORE
SENSU
✓High load on Sensu server
✓Backed-up queues in RabbitMQ
✓TIP: set up a check to monitor the RabbitMQ ready queue size; you'll
want an email when the queue grows above 10K and stays there
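
The "grows above 10K and stays there" rule in the tip can be sketched as a small function. This is only an illustration: the 10,000 threshold comes from the slide, while `sustained` (how many consecutive samples count as "stays there") and the sample values are assumptions, and fetching the queue depth from RabbitMQ is left out.

```python
# Sketch: decide whether the RabbitMQ "ready" queue depth has stayed high.
# threshold=10_000 is from the slide; sustained=3 is an assumed value.

OK, WARNING = 0, 1  # exit-status convention used by Nagios-style checks


def queue_status(samples, threshold=10_000, sustained=3):
    """Return WARNING when the last `sustained` samples all exceed threshold."""
    if len(samples) < sustained:
        return OK  # not enough history yet to say the queue "stays there"
    recent = samples[-sustained:]
    return WARNING if all(depth > threshold for depth in recent) else OK


if __name__ == "__main__":
    print(queue_status([2_000, 12_000, 11_500, 13_000]))  # 1: high for 3 samples
    print(queue_status([12_000, 9_000, 13_000]))          # 0: dipped below 10K
```

Requiring several consecutive samples keeps a brief spike from sending email.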

HOW TO SCALE
✓Add more sensu-server instances
✓No special configuration needed
✓checks will be distributed in round-robin fashion to the sensu-servers
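
Round-robin distribution can be pictured with a short simulation. This is not Sensu's own code; it only shows what "round-robin" means here, and the server and check names are hypothetical.

```python
from itertools import cycle


def distribute(results, servers):
    """Assign each check result to the next server in turn (round-robin)."""
    assignment = {server: [] for server in servers}
    for result, server in zip(results, cycle(servers)):
        assignment[server].append(result)
    return assignment


if __name__ == "__main__":
    servers = ["sensu-server-1", "sensu-server-2"]  # hypothetical host names
    results = ["disk", "swap", "zombies", "ro_fs", "cornbread"]
    for server, handled in distribute(results, servers).items():
        print(server, handled)
```

Adding a third name to `servers` spreads the same results over three lists with no other change.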

GRAPHITE PAINS
✓symptoms: backed up queues in RabbitMQ, spotty graphs
✓cluster couldn’t keep up with the large amount of metrics we were
now serving it via AMQP

GRAPHITE PAINS
✓Solution: stop collecting metrics every 10 seconds (excessive!)
✓Moved staging metrics to staging Graphite cluster
✓Moved prod Graphite cluster to SSD

THE MIGRATION
or, How To Quit Nagios in Ten Easy Steps

STEP 1: NUKE AND
PAVE

STEP 2: PLAN
METRICS AND MONITORING SURVEY

STEP 3: DEFINE
GLOBALS
✓CHECKS: must be actionable!
✓METRICS: go nuts
✓HANDLERS: EMAIL for everything initially, added Pagerduty later.

OUR GLOBALS
✓CHECKS: disk usage, swap usage, zombie processes, RO filesystems
✓METRICS: vmstat, disk usage, cpu, memory, interface and disk perf
✓HANDLERS: Email, Campfire, Pagerduty

STEP 4: DEFINE
SPECIFICS
✓For each server role, define additional states to be checked and
alerted on:

✓Process Checks
✓System Checks
✓Service Checks
✓Service Metrics
STEP 5: SET UP A
PLACE TO TEST
✓Set up a permanent testing Sensu stack using your CM tool of choice
✓we used sensu-chef cookbook

STEP 6: SET A
WORKFLOW
✓Develop and document a workflow for implementing, testing,
deploying and signing off on checks
✓You’ll get the best coverage if anyone (developers or ops) can
easily add checks and metrics to Sensu

EXAMPLE
WORKFLOW
✓add new sensu_check definitions to the appropriate cookbook in Chef
✓deploy new check to staging env using Chef
✓Pull Request with sample graphs or alerts
✓Code Review from colleague
✓Deploy to Prod
SENSU IN CHEF

STEP 7: EXECUTE
WORKFLOW
✓Starting with the low-hanging fruit (plugins that already existed in
sensu-community-plugins repository), configure and deploy each check
in the worksheet to the testing Sensu server
✓deploy sensu-client to a few select machines

STEP 8: WATCH
THE WATCHER
✓Set up some bare-minimum 3rd party monitoring for the Sensu servers

MONITOR THE
MONITOR
✓Other ideas: have Testing Sensu monitor Prod Sensu
✓Sensu can collect metrics about itself

STEP 9: ROLLOUT
✓Deploy your Production server infrastructure
✓Roll out the client and checks to the rest of your prod environments.

STEP 10: TUNE
✓Expect to need to tune thresholds and alert occurrences.
✓Laissez le bon alertes roulent! (Let the good alerts roll!)

SENSU
ARCHITECTURE

OMNIBUS
INSTALLER
is awesome

LET’S PLAY WITH
SENSU
If you haven’t been able to get your
sandboxes up and running,
please pair with someone near you.

SANDBOX GOALS
✓Get familiar with Sensu configuration
✓Install a Handler
✓Deploy a check
✓Trigger an alert on that check
✓Give you something to take home and hack on

OOPS
If you mess anything up:
vagrant halt; vagrant up
Worst case:
vagrant destroy; vagrant up

TWO
VIRTUALBOXES
Sensu-Server and Sensu-Client
Vagrant/Chef
Centos 6.4
Sensu Version 0.10.2

SENSU
CONFIGURATION
✓Please open up a terminal and SSH into both your sensu-server and
sensu-client VMs
✓sudo su
✓cd /etc/sensu

SENSU
CONFIGURATION
✓/etc/sensu/config.json - config for redis, rabbitmq, api and dashboard
✓/etc/sensu/conf.d/ - checks go here
✓/etc/sensu/conf.d/client.json - client configuration, subscriptions
✓/etc/sensu/{extensions|handlers|mutators|plugins}
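
One way to think about this layout: the fragments in config.json and conf.d/ end up combined into a single settings document. The sketch below imitates that with a recursive merge. It is an approximation for illustration only, not Sensu's loader, and the three fragments are hypothetical stand-ins for real files.

```python
import json


def deep_merge(base, extra):
    """Recursively merge `extra` into a copy of `base`; `extra` wins on conflicts."""
    merged = dict(base)
    for key, value in extra.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical fragments standing in for config.json and two conf.d files.
config_json = {"rabbitmq": {"host": "localhost", "port": 5672}}
client_json = {"client": {"name": "sensu-client", "subscriptions": ["all"]}}
check_json = {"checks": {"cornbread_process": {"interval": 60}}}

settings = {}
for fragment in (config_json, client_json, check_json):
    settings = deep_merge(settings, fragment)

print(json.dumps(settings, indent=2, sort_keys=True))
```

Keeping one concern per file (one check, one handler) means a new check is just a new file dropped into conf.d/.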

TRIGGER AN
ALERT!
On sensu-client:
service sensu-client stop
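
Stopping the client raises an alert because the server stops hearing keepalives from it. A minimal sketch of that logic follows; the 120-second warning and 180-second critical thresholds are assumptions for illustration, so check them against your Sensu version.

```python
OK, WARNING, CRITICAL = 0, 1, 2


def keepalive_status(last_seen, now, warning=120, critical=180):
    """Map the seconds since the last keepalive onto a check status.

    `warning` and `critical` are assumed thresholds, in seconds.
    """
    age = now - last_seen
    if age >= critical:
        return CRITICAL
    if age >= warning:
        return WARNING
    return OK


if __name__ == "__main__":
    print(keepalive_status(last_seen=0, now=30))   # 0: client seen recently
    print(keepalive_status(last_seen=0, now=200))  # 2: client silent too long
```

So expect a short delay between stopping the client and seeing the event on the dashboard.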

CHECK YOUR
DASHBOARD
✓Open a web browser and go to http://10.254.254.10:8080
✓username: admin / password: secret

HANDLERS
✓A HANDLER takes action on an event using a pipe, TCP, UDP,
AMQP, or a set of other handlers
✓Examples: send an email, send event to Pagerduty, send metrics to
Graphite
✓Default is “debug”
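
A pipe handler is a program that receives the event as JSON on standard input. Here is a minimal sketch of one in Python. The event fields used (`client.name`, `check.name`, `check.status`, `check.output`) are assumed to match your Sensu version, and the sample event is made up.

```python
import io
import json

STATUS_NAMES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


def summarize(event):
    """Build a one-line summary from a Sensu-style event dictionary."""
    check = event["check"]
    status = STATUS_NAMES.get(check["status"], "UNKNOWN")
    return "{} {}/{}: {}".format(
        status, event["client"]["name"], check["name"], check["output"].strip()
    )


def handle(stream):
    """Read one JSON event from `stream` and print its summary."""
    print(summarize(json.load(stream)))


if __name__ == "__main__":
    # A real pipe handler would call handle(sys.stdin); a hypothetical
    # in-memory event is used here so the sketch runs on its own.
    sample = {
        "client": {"name": "sensu-client"},
        "check": {"name": "cornbread_process", "status": 2,
                  "output": "no cornbread processes found\n"},
    }
    handle(io.StringIO(json.dumps(sample)))
```

Anything that can read stdin can be a handler, which is why email, chat and paging integrations are all small scripts.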
HANDLER
EXAMPLES
✓BASIC: send an email to ops@
✓ADVANCED: attempt to remediate the alert (e.g. run a custom script
that spins up additional EC2 instances)

HANDLERS
✓Let’s configure an EMAIL handler to send an informative email for
an event.
✓/etc/sensu/handlers/mailer.rb plugin is installed for you, we just
need to configure and install it

CONFIGURE THE
PLUGIN
ON SENSU SERVER:
vim /etc/sensu/conf.d/handlers/mailer.json
{
  "mailer": {
    "mail_from": "sensu@you.com",
    "mail_to": "you@yourdomain.com"
  }
}

CONFIGURE THE
HANDLER
cp /etc/sensu/conf.d/handlers/default.json /etc/sensu/conf.d/handlers/email.json
vim /etc/sensu/conf.d/handlers/email.json

EMAIL.JSON
{
  "handlers": {
    "email": {
      "type": "pipe",
      "command": "/etc/sensu/handlers/mailer.rb"
    }
  }
}

CHECK GEM
DEPENDENCIES
/opt/sensu/embedded/bin/gem list | grep mail

FIX PERMISSIONS

chown -R .sensu /etc/sensu/conf.d/

RESTART
SERVICES
service sensu-server restart
tail -100 /var/log/sensu/sensu-server.log | grep mail

CHECKS
✓Sensu-client runs CHECKS that are defined and scheduled either
locally (standalone) or on the sensu-server (subscription).
✓A CHECK sends a RESULT as an EVENT to a HANDLER - this
applies to anything - service checks, metrics, etc

CHECK
EXECUTION
✓Either scheduled by the server (subscription) or scheduled by the
client (standalone)
✓Today we will configure a subscription-based check on the
server that will run on our client

LET’S CONFIGURE
A CHECK
✓Use check-procs.rb to make sure at least one instance of cornbread
is running

DETERMINE OUR
CHECK COMMAND
On your SENSU CLIENT:
/opt/sensu/embedded/bin/ruby /etc/sensu/plugins/check-procs.rb -p cornbread -W1
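
A check plugin reports its result through a line of output and its exit status, following the Nagios convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The sketch below mirrors the "warn when fewer than one process" idea in Python. It is not check-procs.rb: the message text and the `evaluate` helper are made up, and the process count is passed in rather than read from the system.

```python
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def evaluate(process_count, warn_under=1):
    """Return (exit_status, message) for a count of matching processes."""
    if process_count < warn_under:
        return WARNING, "WARNING: found {} cornbread processes".format(process_count)
    return OK, "OK: found {} cornbread processes".format(process_count)


if __name__ == "__main__":
    # A real plugin would print the message and call sys.exit(status);
    # the sketch only prints both values so it stays side-effect free.
    for count in (0, 2):
        status, message = evaluate(count)
        print(status, message)
```

Because only the exit status and output matter, existing Nagios plugins can be reused unchanged.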

INSTALL OUR
CHECK
✓On your SENSU SERVER:
✓vim /etc/sensu/conf.d/checks/cornbread_process.json

CORNBREAD_PROCESS.JSON
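
The contents of this slide did not survive in the text, so what follows is a guess at the shape of a subscription check definition, not the file from the talk. The command is the one determined on the client above. The subscription name `all`, the 60-second interval and the `email` handler reference are assumptions. The snippet builds the definition in Python and prints it as JSON.

```python
import json

check_definition = {
    "checks": {
        "cornbread_process": {
            "command": "/opt/sensu/embedded/bin/ruby "
                       "/etc/sensu/plugins/check-procs.rb -p cornbread -W1",
            "subscribers": ["all"],  # assumed subscription name
            "interval": 60,          # assumed interval, in seconds
            "handlers": ["email"],   # the handler configured earlier
        }
    }
}

print(json.dumps(check_definition, indent=2))
```

The client only runs the check if one of its own subscriptions in client.json matches an entry in `subscribers`.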

RESTART
SERVICES
service sensu-server restart
tail -100 /var/log/sensu/sensu-server.log | grep cornbread

CHECK YOUR
DASHBOARD

CHECK YOUR
EMAIL

SENSU API
✓REST API
✓HTTP/4567
✓on SENSU SERVER try:

curl -s http://localhost:4567/events | python -m json.tool
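
The same query can be scripted. In this sketch `fetch_events` performs the HTTP GET and `failing_checks` filters the result. The event shape (flat `client`, `check` and `status` keys) is an assumption that may differ between Sensu versions, and the sample data is hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_events(base_url="http://localhost:4567"):
    """GET /events from the Sensu API and decode the JSON body."""
    with urlopen(base_url + "/events") as response:
        return json.loads(response.read().decode("utf-8"))


def failing_checks(events):
    """Return sorted (client, check) pairs for events with a non-zero status."""
    return sorted(
        (event["client"], event["check"])
        for event in events
        if event.get("status", 0) != 0
    )


if __name__ == "__main__":
    # Hypothetical events; on the sandbox server, use fetch_events() instead.
    sample = [
        {"client": "sensu-client", "check": "keepalive", "status": 2},
        {"client": "sensu-client", "check": "cornbread_process", "status": 1},
    ]
    for client, check in failing_checks(sample):
        print(client, check)
```

This is the kind of call a command-line tool or custom dashboard can build on.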

SENSU SERVICES
✓Sensu API
✓Sensu Server
✓Sensu Client
✓Sensu Dashboard

EVERYTHING OK?
✓/etc/init.d/sensu-service {client|server|api|dashboard} {start|stop|status|restart}
✓ps -ef | grep sensu
✓tail -f /var/log/sensu/*.log
✓curl -s localhost:4567/info

COOL SENSU
TRICKS

SEND DIRECTLY
TO SENSU
netcat to: 127.0.0.1:3030
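
The client socket accepts a JSON check result, so any script can report into Sensu without being a scheduled check. A minimal sketch, assuming the client is listening on localhost port 3030 and that a payload with `name`, `output` and `status` is accepted. The check name used here is hypothetical.

```python
import json
import socket


def build_result(name, output, status=0):
    """Serialize a check result in the JSON shape sent to the client socket."""
    return json.dumps({"name": name, "output": output, "status": status})


def send_result(payload, host="127.0.0.1", port=3030):
    """Open a TCP connection to the local sensu-client and send the payload."""
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload.encode("utf-8"))


if __name__ == "__main__":
    payload = build_result("cornbread_batch", "batch finished", 0)
    print(payload)
    # send_result(payload)  # uncomment on a host running sensu-client
```

This suits cron jobs and deploy scripts that want to raise or resolve an event themselves.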

AGGREGATE
ALERTS
✓Alert when X% of checks are not OK
✓Handy for preventing alert floods
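
The "X% not OK" rule is simple arithmetic over the result counts for one check across all clients. A sketch under the assumption that the counts arrive as a dictionary keyed by status name. The numbers are made up.

```python
def percent_not_ok(counts):
    """Percentage of results that are anything other than OK."""
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return 100.0 * (total - counts.get("ok", 0)) / total


def should_alert(counts, threshold_percent):
    """True when the not-OK share reaches the threshold."""
    return percent_not_ok(counts) >= threshold_percent


if __name__ == "__main__":
    counts = {"ok": 45, "warning": 2, "critical": 3}  # hypothetical results
    print(percent_not_ok(counts))    # 10.0
    print(should_alert(counts, 20))  # False
    print(should_alert(counts, 10))  # True
```

One alert on the aggregate replaces fifty individual alerts when a shared dependency fails.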

MY SENSU TIPS
✓install the RabbitMQ management web interface and bookmark it (see
http://10.254.254.10:15672/#/ )
✓lock your plugins’ gem dependency versions

TIPS TIPS TIPS
✓have alternate ways to access your Dashboard information
✓we integrated our command-line developer tools with Sensu API
✓we also created our own Ops dashboard that queries Sensu,
Graphite and our app for data

MORE TIPS

✓Put NGINX in front of sensu-dashboard

HA SENSU
✓Redundancy is easy (bring up more sensu-servers)
✓Making Redis and RabbitMQ HA is more challenging
✓We’re still running one solitary Redis and RabbitMQ but are OK
with this risk for now

WHERE TO GO
FOR HELP
✓IRC: #sensu - freenode
✓sensu-users mailing list
✓http://docs.sensuapp.org

QUESTIONS

THANK YOU
bethany@paperlesspost.com
@skymob - twitter
robotwitharose - #sensu on IRC (freenode)


An Introduction to Sensu by Bethany Erskine