Writing Custom Nagios Plugins

        Nathan Vonnahme
Nathan.Vonnahme@bannerhealth.com
Why write Nagios plugins?


 • Checklists are boring.
 • Life is complicated.
 • “OK” is complicated.
What tool should we use?


 Anything!


 I’ll show
   1. Perl
   2. JavaScript
   3. AutoIt


 Follow along!


                     2012
Why Perl?


 • Familiar to many sysadmins
 • Cross-platform
 • CPAN
 • Mature Nagios::Plugin API
 • Embeddable in Nagios (ePN)
 • Examples and documentation
 • “Swiss army chainsaw”
 • Perl 6… someday?

Buuuuut I don’t like Perl




 Nagios plugins are very simple. Use any language
    you like. Eventually, imitate Nagios::Plugin.
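 What "very simple" means in practice: a plugin prints one status line and exits with 0 (OK), 1 (WARNING), 2 (CRITICAL) or 3 (UNKNOWN). Below is a minimal sketch of that contract in JavaScript for Node.js, one of the three languages in this talk. The names `STATES` and `formatStatus` are illustrative assumptions, not part of any Nagios API.

```javascript
// Exit codes that Nagios interprets as service states.
const STATES = { OK: 0, WARNING: 1, CRITICAL: 2, UNKNOWN: 3 };

// Build the single line of output a plugin prints before exiting.
// `name` and `message` are free text; `state` should be a key of STATES.
function formatStatus(name, state, message) {
  if (!Object.prototype.hasOwnProperty.call(STATES, state)) {
    state = 'UNKNOWN'; // anything unexpected is reported as UNKNOWN
  }
  return { line: name + ' ' + state + ' - ' + message, code: STATES[state] };
}

const result = formatStatus('BACKUP', 'CRITICAL', 'file not found');
console.log(result.line); // BACKUP CRITICAL - file not found
```

 A real plugin would end with `process.exit(result.code)` so that Nagios sees the state; it is left out here so the sketch can be pasted into a larger script.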

got Perl?


            perl.org/get.html
            Linux and Mac already have it:
              which perl
            On Windows, I prefer
              1. Strawberry Perl
              2. Cygwin (N.B. make, gcc4)
              3. ActiveState Perl
            Any version of Perl 5 should work.


got Documentation?


 http://nagiosplug.sf.net/developer-guidelines.html
 Or,
   goo.gl/kJRTI (case sensitive!)




got an idea?


 Check the validity of my backup file F.




Simplest Plugin Ever


 #!/usr/bin/perl
 if (-e $ARGV[0]) { # File in first arg exists.
   print "OK\n";
   exit(0);
 }
 else {
   print "CRITICAL\n";
   exit(2);
 }




Simplest Plugin Ever



 Save, then run with one argument:
    $ ./simple_check_backup.pl foo.tar.gz
    CRITICAL
    $ touch foo.tar.gz
    $ ./simple_check_backup.pl foo.tar.gz
    OK


 But: Will it succeed tomorrow?


But “OK” is complicated.


 • Check the validity* of my backup file F.
    • Existent
    • Less than X hours old
    • Between Y and Z MB in size


 * further opportunity: check the restore process!
 BTW: Gavin Carr of Open Fusion in Australia has already written
   a check_file plugin that could do this, but we’re learning here.
   See also the 2001 check_backup plugin by Patrick Greenwell, which
   predates Nagios::Plugin.


Bells and Whistles


 • Argument parsing
 • Help/documentation
 • Thresholds
 • Performance data
 These things make up the majority of the code in any good plugin.
 We’ll demonstrate them all.

Bells, Whistles, and Cowbell


 • Nagios::Plugin
   • Ton Voon rocks
   • Gavin Carr too
   • Used in production Nagios plugins everywhere
   • Since ~ 2006




Bells, Whistles, and Cowbell


 • Install Nagios::Plugin
   sudo cpan
   Configure CPAN if necessary...
   cpan> install Nagios::Plugin
 • Potential solutions if the install fails:
   • Set the http_proxy environment variable if
     behind a firewall
   • cpan> o conf prerequisites_policy follow
     cpan> o conf commit
   • cpan> install Params::Validate

got an example plugin template?


 • Use check_stuff.pl from the Nagios::Plugin
   distribution as your template.

   goo.gl/vpBnh

 • This is always a good place to
   start a plugin.
 • We’re going to be turning
   check_stuff.pl into the finished
   check_backup.pl example.
got the finished example?

Published with Gist:
     https://gist.github.com/1218081
or
     goo.gl/hXnSm
• Note the “raw” hyperlink for downloading the
  Perl source code.
• The Roman numerals in the comments match
  the next series of slides.
Check your setup

 1. Save check_stuff.pl (goo.gl/vpBnh) as e.g.
    my_check_backup.pl.
 2. Change the first “shebang” line to point to the Perl
    executable on your machine.
       #!c:/strawberry/bin/perl
 3. Run it
       ./my_check_backup.pl
 4. You should get:
    MY_CHECK_BACKUP UNKNOWN - you didn't supply a threshold argument

 5. If yours works, help your neighbors.

Design: Which arguments do we need?


 • File name
 • Age in hours
 • Size in MB




Design: Thresholds


 • Non-existence: CRITICAL
 • Age problem: CRITICAL if over age threshold
 • Size problem: WARNING if outside size
   threshold (min:max)
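 The design above does not depend on the language. As a sketch, here it is as a pure decision function in JavaScript (the deck's second language). The name `judgeBackup` and the shape of the `facts` and `limits` objects are assumptions made for illustration; gathering the facts (stat calls and so on) is left out.

```javascript
// Decide the state of a backup from facts gathered elsewhere.
// facts  = { exists, ageHours, sizeMB }
// limits = { maxAgeHours, minMB, maxMB }
function judgeBackup(facts, limits) {
  if (!facts.exists) {
    return 'CRITICAL'; // non-existence
  }
  if (facts.ageHours > limits.maxAgeHours) {
    return 'CRITICAL'; // older than the age threshold
  }
  if (facts.sizeMB < limits.minMB || facts.sizeMB > limits.maxMB) {
    return 'WARNING'; // size outside min:max
  }
  return 'OK';
}

const limits = { maxAgeHours: 24, minMB: 1024, maxMB: 2048 };
console.log(judgeBackup({ exists: true, ageHours: 3, sizeMB: 1500 }, limits)); // OK
console.log(judgeBackup({ exists: true, ageHours: 3, sizeMB: 0.5 }, limits)); // WARNING
```

 The age test comes before the size test so that a CRITICAL age problem is never downgraded to a WARNING.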




I. Prologue (working from check_stuff.pl)


 use strict;
 use warnings;

 use Nagios::Plugin;
 use File::stat;

 use vars qw($VERSION $PROGNAME $verbose $timeout $result);
 $VERSION = '1.0';

 # get the base name of this script for use in the examples
 use File::Basename;
 $PROGNAME = basename($0);


II. Usage/Help

 Changes from check_stuff.pl in bold
 my $p = Nagios::Plugin->new(
   usage => "Usage: %s [ -v|--verbose ] [-t <timeout>]
 [ -f|--file=<path/to/backup/file> ]
 [ -a|--age=<max age in hours> ]
 [ -s|--size=<acceptable min:max size in MB> ]",

   version => $VERSION,
   blurb => "Check the specified backup file's age and size",
   extra => "
 Examples:

 $PROGNAME -f /backups/foo.tgz -a 24 -s 1024:2048

 Check that foo.tgz exists, is less than 24 hours old, and is between
 1024 and 2048 MB.
 ");

III. Command line arguments/options

 Replace the 3 add_arg calls from check_stuff.pl with:
 # See Getopt::Long for more
 $p->add_arg(
     spec => 'file|f=s',
     required => 1,
     help => "-f, --file=STRING
         The backup file to check. REQUIRED.");
 $p->add_arg(
     spec => 'age|a=i',
     default => 24,
     help => "-a, --age=INTEGER
         Maximum age in hours. Default 24.");
 $p->add_arg(
     spec => 'size|s=s',
     help => "-s, --size=INTEGER:INTEGER
         Minimum:maximum acceptable size in MB (1,000,000 bytes)");

 # Parse arguments and process standard ones (e.g. usage, help, version)
 $p->getopts;



Now it’s RTFM-enabled


 If you run it with no args, it shows usage:

 $ ./check_backup.pl
 Usage: check_backup.pl [ -v|--verbose ] [-t <timeout>]
     [ -f|--file=<path/to/backup/file> ]
     [ -a|--age=<max age in hours> ]
     [ -s|--size=<acceptable min:max size in MB> ]




Now it’s RTFM-enabled

 $ ./check_backup.pl --help
    check_backup.pl 1.0

    This nagios plugin is free software, and comes with ABSOLUTELY NO WARRANTY.
    It may be used, redistributed and/or modified under the terms of the GNU
    General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).

    Check the specified backup file's age and size

    Usage: check_backup.pl [ -v|--verbose ] [-t <timeout>]
        [ -f|--file=<path/to/backup/file> ]
        [ -a|--age=<max age in hours> ]
        [ -s|--size=<acceptable min:max size in MB> ]

     -?, --usage
       Print usage information
     -h, --help
       Print detailed help screen
     -V, --version
       Print version information



Now it’s RTFM-enabled

   --extra-opts=[section][@file]
     Read options from an ini file. See http://nagiosplugins.org/extra-opts
     for usage and examples.
   -f, --file=STRING
                      The backup file to check. REQUIRED.
   -a, --age=INTEGER
                  Maximum age in hours. Default 24.
   -s, --size=INTEGER:INTEGER
                   Minimum:maximum acceptable size in MB (1,000,000 bytes)
   -t, --timeout=INTEGER
     Seconds before plugin times out (default: 15)
   -v, --verbose
     Show details for command-line debugging (can repeat up to 3 times)

    Examples:

      check_backup.pl -f /backups/foo.tgz -a 24 -s 1024:2048

    Check that foo.tgz exists, is less than 24 hours old, and is between
    1024 and 2048 MB.



IV. Check arguments for sanity


 • Basic syntax checks already defined with
   add_arg, but replace the “sanity checking” with:

 # Perform sanity checking on command line options.
 if ( (defined $p->opts->age) && $p->opts->age < 0 ) {
      $p->nagios_die( " invalid number supplied for the age option " );
 }



 • Your next plugin may be more complex.


Ooops




At first I used -M, which Perl defines as “Script
  start time minus file modification time, in days.”
Nagios uses embedded Perl by default, so the
 “script start time” may be hours or days ago and
 -M understates the file’s age. The next slide uses
 time minus mtime instead.

V. Check the stuff

 # Check the backup file.
 my $f = $p->opts->file;
 unless (-e $f) {
   $p->nagios_exit(CRITICAL, "File $f doesn't exist");
 }
 my $mtime = File::stat::stat($f)->mtime;
 my $age_in_hours = (time - $mtime) / 60 / 60;
 my $size_in_mb = (-s $f) / 1_000_000;

 my $message = sprintf
     "Backup exists, %.0f hours old, %.1f MB.",
     $age_in_hours, $size_in_mb;




VI. Performance Data

 # Add perfdata, enabling pretty graphs etc.
 $p->add_perfdata(
    label => "age",
    value => $age_in_hours,
    uom => "hours"
 );
 $p->add_perfdata(
    label => "size",
    value => $size_in_mb,
    uom => "MB"
 );

 • This adds Nagios-friendly output like:
    | age=2.91611111111111hours;; size=0.515007MB;;


VII. Compare to thresholds

 Add this section. check_stuff.pl combines
 check_threshold with nagios_exit at the very end.
 # We already checked for file existence.
 my $result = $p->check_threshold(
     check => $age_in_hours,
     warning => undef,
     critical => $p->opts->age
 );
 if ($result == OK) {
     $result = $p->check_threshold(
         check => $size_in_mb,
         warning => $p->opts->size,
         critical => undef,
     );
 }

VIII. Exit Code


 # Output the result and exit.
 $p->nagios_exit(
     return_code => $result,
     message => $message
 );




Testing the plugin
 $ ./check_backup.pl -f foo.gz
 BACKUP OK - Backup exists, 3 hours old, 0.5 MB |
   age=3.04916666666667hours;; size=0.515007MB;;


 $ ./check_backup.pl -f foo.gz   -s 100:900
 BACKUP WARNING - Backup exists, 23 hours old, 0.5 MB
   | age=23.4275hours;; size=0.515007MB;;


 $ ./check_backup.pl -f foo.gz   -a 8
 BACKUP CRITICAL - Backup exists, 23 hours old, 0.5 MB
   | age=23.4388888888889hours;; size=0.515007MB;;



Telling Nagios to use your plugin


     1. misccommands.cfg*

 define command{
   command_name      check_backup
   command_line      $USER1$/myplugins/check_backup.pl
                       -f $ARG1$ -a $ARG2$ -s $ARG3$
 }




     * Lines wrapped for slide presentation


Telling Nagios to use your plugin


   2. services.cfg (wrapped)
   define service{
     use                     generic-service
     normal_check_interval   1440    ; 24 hours
     host_name               fai01337
     service_description     MySQL backups
     check_command           check_backup!/usr/local/backups
                               /mysql/fai01337.mysql.dump.bz2
                               !24!0.5:100
     contact_groups          linux-admins
   }


   3. Reload config:
       $ sudo /usr/bin/nagios -v /etc/nagios/nagios.cfg
          && sudo /etc/rc.d/init.d/nagios reload

Remote execution


 • Hosts/filesystems other than the Nagios host
 • Requirements
   • NRPE, NSClient or equivalent
   • Perl with Nagios::Plugin




Profit


 $ plugins/check_nt -H winhost -p 1248
   -v RUNSCRIPT -l check_my_backup.bat


 OK - Backup exists, 12 hours old, 35.7
   MB | age=12.4527777777778hours;;
   size=35.74016MB;;




Share



 exchange.nagios.org




Other tools and languages


 • C
 • TAP – Test Anything Protocol
   • See check_tap.pl from my other talk
 • Python
 • Shell
 • Ruby? C#? VB? JavaScript?
 • AutoIt!



Now in JavaScript


 Why JavaScript?
 • Node.js “Node's problem is that some of its
   users want to use it for everything? So what?”
 • Cool kids
 • Crockford
 • “Always bet on JS” – Brendan Eich




Check_stuff.js – the short part

 var plugin_name = 'CHECK_STUFF';


 // Set up command line args and usage etc using commander.js.
 var cli = require('commander');


 cli
   .version('0.0.1')
   .option('-c, --critical <critical threshold>',
     'Critical threshold using standard format', parseRangeString)
   .option('-w, --warning <warning threshold>',
     'Warning threshold using standard format', parseRangeString)
   .option('-r, --result <Number4>',
     'Use supplied value, not random', parseFloat)
   .parse(process.argv);

Check_stuff.js – the short part

 var val = cli.result; // value supplied with -r, if any (assumed; not shown on the slide)
 if (val == undefined) {
     val = Math.floor((Math.random() * 20) + 1);
 }
 var message = ' Sample result was ' + val.toString();

 var perfdata = "'Val'="+val + ';' + cli.warning + ';' +
    cli.critical + ';';

 if (cli.critical && cli.critical.check(val)) {
     nagios_exit(plugin_name, "CRITICAL", message, perfdata);
 } else if (cli.warning && cli.warning.check(val)) {
     nagios_exit(plugin_name, "WARNING", message, perfdata);
 } else {
     nagios_exit(plugin_name, "OK", message, perfdata);
 }
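 When no -w or -c option is given, plain string concatenation puts the text "undefined" into the perfdata. A small helper can leave those fields empty instead. This is a sketch, not code from the gist; `formatPerfdata` and its `opts` keys are hypothetical names, and the layout follows the plugin guidelines' 'label'=value[UOM];[warn];[crit];[min];[max].

```javascript
// Format one performance-data item:
//   'label'=value[UOM];[warn];[crit];[min];[max]
// Missing fields are left empty rather than printed as "undefined".
function formatPerfdata(label, value, opts) {
  opts = opts || {};
  const field = function (v) {
    return v === undefined || v === null ? '' : String(v);
  };
  return "'" + label + "'=" + value + field(opts.uom) + ';' +
    field(opts.warning) + ';' + field(opts.critical) + ';' +
    field(opts.min) + ';' + field(opts.max);
}

console.log(formatPerfdata('Val', 7, { warning: '10', critical: '20' }));
// 'Val'=7;10;20;;
console.log(formatPerfdata('Val', 7));
// 'Val'=7;;;;
```

 Because `field` calls `String(v)`, a threshold object with its own `toString()` (such as the Range object listed on the next slide) can be passed directly.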




The rest


 • Range object
   • Range.toString()
   • Range.check()
   • Range.parseRangeString()
 • nagios_exit()
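 The slide only names these pieces. The following is one possible sketch of them, not the code from the gist: it assumes the standard plugin range syntax `[@]start:end` and the `nagios_exit(plugin_name, state, message, perfdata)` call shape used on the previous slide. Everything inside the functions is an assumption.

```javascript
// A threshold range in the standard plugin format: [@]start:end
//   "10"     alert if value < 0 or value > 10
//   "10:"    alert if value < 10
//   "~:10"   alert if value > 10
//   "10:20"  alert if value < 10 or value > 20
//   "@10:20" alert if 10 <= value <= 20
function Range(start, end, inside) {
  this.start = start;   // lower bound, -Infinity for "~"
  this.end = end;       // upper bound, Infinity when omitted
  this.inside = inside; // true when the string began with "@"
}

// True when `value` should raise an alert for this range.
Range.prototype.check = function (value) {
  const within = value >= this.start && value <= this.end;
  return this.inside ? within : !within;
};

Range.prototype.toString = function () {
  const s = this.start === -Infinity ? '~' : String(this.start);
  const e = this.end === Infinity ? '' : String(this.end);
  return (this.inside ? '@' : '') + s + ':' + e;
};

function parseRangeString(text) {
  let s = String(text);
  const inside = s.charAt(0) === '@';
  if (inside) { s = s.slice(1); }
  let start = 0;
  let end = Infinity;
  const colon = s.indexOf(':');
  if (colon === -1) {
    end = Number(s); // "10" means 0..10
  } else {
    const lo = s.slice(0, colon);
    const hi = s.slice(colon + 1);
    if (lo === '~') { start = -Infinity; } else if (lo !== '') { start = Number(lo); }
    if (hi !== '') { end = Number(hi); }
  }
  if (s === '' || Number.isNaN(start) || Number.isNaN(end) || start > end) {
    throw new Error('invalid range: ' + text);
  }
  return new Range(start, end, inside);
}
Range.parseRangeString = parseRangeString;

// Print the status line (with optional perfdata) and exit with the matching code.
function nagios_exit(plugin_name, state, message, perfdata) {
  const codes = { OK: 0, WARNING: 1, CRITICAL: 2, UNKNOWN: 3 };
  const known = Object.prototype.hasOwnProperty.call(codes, state);
  console.log(plugin_name + ' ' + state + ' - ' + message +
    (perfdata ? ' | ' + perfdata : ''));
  process.exit(known ? codes[state] : codes.UNKNOWN);
}

const warn = parseRangeString('10:20');
console.log(warn.toString(), warn.check(25)); // 10:20 true
```

 `check()` returns true when an alert should be raised, which matches how the short part calls `cli.critical.check(val)` before `cli.warning.check(val)`.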


 Who’s going to make it an NPM module?



A silly but newfangled example


 Facebook friends is WARNING!


 ./check_facebook_friends.js -u
   nathan.vonnahme -w @202 -c @203




Check_facebook_friends.js


 See the code at

  gist.github.com/3760536
 Note: functions as callbacks instead of loops or
  waiting...
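 To illustrate the callback style in isolation, here is a small sketch that is not taken from the gist. `fetchFriendCount` is a hypothetical stand-in for a network request and returns a made-up fixed number; `checkFriends` and its parameters are also assumptions. The status is decided inside the callback, and a timer reports UNKNOWN if the answer never arrives.

```javascript
// Stand-in for a network request; a real plugin might use the http module.
// Calls back with (error, value) on a later turn of the event loop.
function fetchFriendCount(user, callback) {
  setTimeout(function () {
    callback(null, 150); // hypothetical fixed value
  }, 10);
}

// Nothing blocks here: `done` is called exactly once, either from the
// fetch callback or from the timeout, whichever happens first.
function checkFriends(user, fetch, timeoutMs, done) {
  let finished = false;
  const timer = setTimeout(function () {
    finished = true;
    done('UNKNOWN', 'timed out after ' + timeoutMs + ' ms');
  }, timeoutMs);
  fetch(user, function (err, count) {
    if (finished) { return; } // the timeout already reported
    finished = true;
    clearTimeout(timer);
    if (err) { return done('UNKNOWN', err.message); }
    done('OK', count + ' friends');
  });
}

checkFriends('someuser', fetchFriendCount, 1000, function (state, message) {
  console.log('FRIENDS ' + state + ' - ' + message);
});
```

 A real check would compare the count against the -w and -c ranges inside the callback and then exit with the matching code.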




A horrifying/inspiring example


    The worst things need the most monitoring.




Chart “servers”


 • MS Word macro
 • Mail merge
 • Runs in user session
 • Need about a dozen




It gets worse.


                   • Not a service
                   • Not even a process
                   • 100% CPU is normal
                   • “OK” is complicated.




Many failure modes




AutoIt to the rescue
 Func CompareTitles()
   For $title=1 To $all_window_titles[0][0] Step 1
     $state=WinGetState($all_window_titles[$title][0])
     $foo=0
     $do_test=0
     For $foo In $valid_states
       If $state=$foo Then
          $do_test +=1
       EndIf
     Next
     If $all_window_titles[$title][0] <> "" AND $do_test>0 Then
       $window_is_valid=0

       For $string=0 To $num_of_strings-1 Step 1
          $match=StringRegExp($all_window_titles[$title][0], $valid_windows[$string])
          $window_is_valid += $match
       Next

       if $window_is_valid=0 Then
          $return=2
          $detailed_status="Unexpected window *" & $all_window_titles[$title][0] & "* present" & @LF _
             & "***" & $all_window_titles[$title][0] & "*** doesn't match anything we expect."
          NagiosExit()
       EndIf

       If StringRegExp($all_window_titles[$title][0], $valid_windows[0])=1 Then
          $expression=ControlGetText($all_window_titles[$title][0], "", 1013)
       EndIf
     EndIf
   Next
   $no_bad_windows=1
 EndFunc

 Func NagiosExit()
   ConsoleWrite($detailed_status)
   Exit($return)
 EndFunc

 CompareTitles()

 if $no_bad_windows=1 Then
   $detailed_status="No chartserver anomalies at this time -- " & $expression
   $return=0
 EndIf

 NagiosExit()




Nagios now knows when they’re broken




Life is complicated
 “OK” is complicated.
 Custom plugins make Nagios much smarter about
  your environment.




Questions?
               Comments?
       Perl and JS plugin example code at
              gist.github.com/n8v





Nagios Conference 2012 - Nathan Vonnahme - Writing Custom Nagios Plugins in Perl

  • 1. Writing Custom Nagios Plugins Nathan Vonnahme Nathan.Vonnahme@bannerhealth.com
  • 2. Why write Nagios plugins? • Checklists are boring. • Life is complicated. • “OK” is complicated.
  • 3. What tool should we use? Anything! I’ll show 1. Perl 2. JavaScript 3. AutoIt Follow along! 2012
  • 4. Why Perl? • Familiar to many sysadmins • Cross-platform • CPAN • Mature Nagios::Plugin API • Embeddable in Nagios (ePN) • Examples and documentation • “Swiss army chainsaw” • Perl 6… someday? 2012
  • 5. Buuuuut I don’t like Perl Nagios plugins are very simple. Use any language you like. Eventually, imitate Nagios::Plugin. 2012
  • 6. got Perl? perl.org/get.html Linux and Mac already have it: which perl On Windows, I prefer 1. Strawberry Perl 2. Cygwin (N.B. make, gcc4) 3. ActiveState Perl Any version Perl 5 should work. 2012 6
  • 7. got Documentation? http://nagiosplug.sf.net/ developer-guidelines.html Or, goo.gl/kJRTI Case sensitive! 2012
  • 8. got an idea? Check the validity of my backup file F. 2012
  • 9. Simplest Plugin Ever #!/usr/bin/perl if (-e $ARGV[0]) { # File in first arg exists. print "OKn"; exit(0); } else { print "CRITICALn"; exit(2); } 2012 9
  • 10. Simplest Plugin Ever Save, then run with one argument: $ ./simple_check_backup.pl foo.tar.gz CRITICAL $ touch foo.tar.gz $ ./simple_check_backup.pl foo.tar.gz OK But: Will it succeed tomorrow? 2012
  • 11. But “OK” is complicated. • Check the validity* of my backup file F. • Existent • Less than X hours old • Between Y and Z MB in size * further opportunity: check the restore process! BTW: Gavin Carr with Open Fusion in Australia has already written a check_file plugin that could do this, but we’re learning here. Also confer 2001 check_backup plugin by Patrick Greenwell, but it’s pre-Nagios::Plugin. 2012
  • 12. Bells and Whistles • Argument parsing • Help/documentation • Thresholds • Performance data These things make up the majority of the code in any good plugin. We’ll demonstrate them all. 2012
  • 13. Bells, Whistles, and Cowbell • Nagios::Plugin • Ton Voon rocks • Gavin Carr too • Used in production Nagios plugins everywhere • Since ~ 2006 2012
  • 14. Bells, Whistles, and Cowbell • Install Nagios::Plugin sudo cpan Configure CPAN if necessary... cpan> install Nagios::Plugin • Potential solutions: • Configure http_proxy environment variable if behind firewall • cpan> o conf prerequisites_policy follow cpan> o conf commit • cpan> install Params::Validate 2012
  • 15. got an example plugin template? • Use check_stuff.pl from the Nagios::Plugin distribution as your template. goo.gl/vpBnh • This is always a good place to start a plugin. • We’re going to be turning check_stuff.pl into the finished check_backup.pl example. 2012
  • 16. got the finished example? Published with Gist: https://gist.github.com/1218081 or goo.gl/hXnSm • Note the “raw” hyperlink for downloading the Perl source code. • The roman numerals in the comments match the next series of slides. 2012
  • 17. Check your setup 1. Save check_stuff.pl (goo.gl/vpBnh) as e.g. my_check_backup.pl. 2. Change the first “shebang” line to point to the Perl executable on your machine. #!c:/strawberry/bin/perl 3. Run it ./my_check_backup.pl 4. You should get: MY_CHECK_BACKUP UNKNOWN - you didn't supply a threshold argument 5. If yours works, help your neighbors. 2012
  • 18. Design: Which arguments do we need? • File name • Age in hours • Size in MB 2012
  • 19. Design: Thresholds • Non-existence: CRITICAL • Age problem: CRITICAL if over age threshold • Size problem: WARNING if outside size threshold (min:max) 2012
  • 20. I. Prologue (working from check_stuff.pl) use strict; use warnings; use Nagios::Plugin; use File::stat; use vars qw($VERSION $PROGNAME $verbose $timeout $result); $VERSION = '1.0'; # get the base name of this script for use in the examples use File::Basename; $PROGNAME = basename($0); 2012
  • 21. II. Usage/Help Changes from check_stuff.pl in bold my $p = Nagios::Plugin->new( usage => "Usage: %s [ -v|--verbose ] [-t <timeout>] [ -f|--file=<path/to/backup/file> ] [ -a|--age=<max age in hours> ] [ -s|--size=<acceptable min:max size in MB> ]", version => $VERSION, blurb => "Check the specified backup file's age and size", extra => " Examples: $PROGNAME -f /backups/foo.tgz -a 24 -s 1024:2048 Check that foo.tgz exists, is less than 24 hours old, and is between 1024 and 2048 MB. “); 2012
  • 22. III. Command line arguments/options Replace the 3 add_arg calls from check_stuff.pl with: # See Getopt::Long for more $p->add_arg( spec => 'file|f=s', required => 1, help => "-f, --file=STRING The backup file to check. REQUIRED."); $p->add_arg( spec => 'age|a=i', default => 24, help => "-a, --age=INTEGER Maximum age in hours. Default 24."); $p->add_arg( spec => 'size|s=s', help => "-s, --size=INTEGER:INTEGER Minimum:maximum acceptable size in MB (1,000,000 bytes)"); # Parse arguments and process standard ones (e.g. usage, help, version) $p->getopts; 2012
  • 23. Now it’s RTFM-enabled
        If you run it with no args, it shows usage:
        $ ./check_backup.pl
        Usage: check_backup.pl [ -v|--verbose ] [-t <timeout>]
            [ -f|--file=<path/to/backup/file> ]
            [ -a|--age=<max age in hours> ]
            [ -s|--size=<acceptable min:max size in MB> ]
  • 24. Now it’s RTFM-enabled
        $ ./check_backup.pl --help
        check_backup.pl 1.0
        This nagios plugin is free software, and comes with ABSOLUTELY NO WARRANTY.
        It may be used, redistributed and/or modified under the terms of the GNU
        General Public Licence (see http://www.fsf.org/licensing/licenses/gpl.txt).
        Check the specified backup file's age and size
        Usage: check_backup.pl [ -v|--verbose ] [-t <timeout>]
            [ -f|--file=<path/to/backup/file> ]
            [ -a|--age=<max age in hours> ]
            [ -s|--size=<acceptable min:max size in MB> ]
         -?, --usage
           Print usage information
         -h, --help
           Print detailed help screen
         -V, --version
           Print version information
  • 25. Now it’s RTFM-enabled
         --extra-opts=[section][@file]
           Read options from an ini file. See http://nagiosplugins.org/extra-opts
           for usage and examples.
         -f, --file=STRING
           The backup file to check. REQUIRED.
         -a, --age=INTEGER
           Maximum age in hours. Default 24.
         -s, --size=INTEGER:INTEGER
           Minimum:maximum acceptable size in MB (1,000,000 bytes)
         -t, --timeout=INTEGER
           Seconds before plugin times out (default: 15)
         -v, --verbose
           Show details for command-line debugging (can repeat up to 3 times)
        Examples:
            check_backup.pl -f /backups/foo.tgz -a 24 -s 1024:2048
            Check that foo.tgz exists, is less than 24 hours old,
            and is between 1024 and 2048 MB.
  • 26. IV. Check arguments for sanity
      • Basic syntax checks already defined with add_arg, but replace the “sanity checking” with:
        # Perform sanity checking on command line options.
        if ( (defined $p->opts->age) && $p->opts->age < 0 ) {
            $p->nagios_die( " invalid number supplied for the age option " );
        }
      • Your next plugin may be more complex.
  • 27. Ooops At first I used -M, which Perl defines as “Script start time minus file modification time, in days.” Nagios uses embedded Perl by default so the “script start time” may be hours or days ago. 2012
  • 28. V. Check the stuff
        # Check the backup file.
        my $f = $p->opts->file;
        unless (-e $f) {
            $p->nagios_exit(CRITICAL, "File $f doesn't exist");
        }
        my $mtime = File::stat::stat($f)->mtime;
        my $age_in_hours = (time - $mtime) / 60 / 60;
        my $size_in_mb = (-s $f) / 1_000_000;
        my $message = sprintf "Backup exists, %.0f hours old, %.1f MB.",
            $age_in_hours, $size_in_mb;
  • 29. VI. Performance Data
        # Add perfdata, enabling pretty graphs etc.
        $p->add_perfdata(
            label => "age",
            value => $age_in_hours,
            uom => "hours"
        );
        $p->add_perfdata(
            label => "size",
            value => $size_in_mb,
            uom => "MB"
        );
      • This adds Nagios-friendly output like:
        | age=2.91611111111111hours;; size=0.515007MB;;
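The perfdata suffix follows the plugin guidelines' `label=value[UOM];warn;crit` layout, with unset thresholds left empty. As a rough illustration, here is a minimal sketch in JavaScript (one of the other languages this talk covers). `formatPerfdata` and `perfdataSuffix` are hypothetical helper names, not part of Nagios::Plugin, and the optional min/max fields are left out.

```javascript
// Hypothetical helper: format one perfdata item as label=value[UOM];warn;crit.
// Unset thresholds are emitted as empty fields, as in the slide's output.
function formatPerfdata(label, value, uom, warn, crit) {
  // Labels containing whitespace are wrapped in single quotes.
  var safeLabel = /\s/.test(label) ? "'" + label + "'" : label;
  var w = warn === undefined ? '' : warn;
  var c = crit === undefined ? '' : crit;
  return safeLabel + '=' + value + (uom || '') + ';' + w + ';' + c;
}

// Hypothetical helper: join several items into the part after the status text.
function perfdataSuffix(items) {
  return '| ' + items.join(' ');
}
```

Calling `perfdataSuffix([formatPerfdata('age', 2.5, 'hours'), formatPerfdata('size', 0.515007, 'MB')])` gives a string shaped like the slide's sample output.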
  • 30. VII. Compare to thresholds
        Add this section. check_stuff.pl combines check_threshold with nagios_exit at the very end.
        # We already checked for file existence.
        my $result = $p->check_threshold(
            check => $age_in_hours,
            warning => undef,
            critical => $p->opts->age
        );
        if ($result == OK) {
            $result = $p->check_threshold(
                check => $size_in_mb,
                warning => $p->opts->size,
                critical => undef,
            );
        }
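The order of the two checks matters: age can only produce CRITICAL, and size is only consulted when age came back OK, so it can only downgrade the result to WARNING. A minimal JavaScript sketch of that decision order follows. `checkBackup` and the `limits` fields are hypothetical names used for illustration, and the plain numeric comparisons stand in for Nagios::Plugin's range handling.

```javascript
// Standard plugin states (also the process exit codes).
var STATE = { OK: 0, WARNING: 1, CRITICAL: 2 };

// Hypothetical sketch of the slide's logic:
//   1. age outside 0..maxAgeHours  -> CRITICAL
//   2. otherwise, size outside minSizeMb..maxSizeMb -> WARNING
//   3. otherwise OK
// The size limits are optional, mirroring the optional -s argument.
function checkBackup(ageHours, sizeMb, limits) {
  if (ageHours < 0 || ageHours > limits.maxAgeHours) {
    return STATE.CRITICAL;
  }
  var hasSizeLimits =
    limits.minSizeMb !== undefined && limits.maxSizeMb !== undefined;
  if (hasSizeLimits &&
      (sizeMb < limits.minSizeMb || sizeMb > limits.maxSizeMb)) {
    return STATE.WARNING;
  }
  return STATE.OK;
}
```

For example, a 0.5 MB file that is 23 hours old is a WARNING under `{ maxAgeHours: 24, minSizeMb: 100, maxSizeMb: 900 }` and CRITICAL under `{ maxAgeHours: 8 }`.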
  • 31. VIII. Exit Code
        # Output the result and exit.
        $p->nagios_exit(
            return_code => $result,
            message => $message
        );
  • 32. Testing the plugin
        $ ./check_backup.pl -f foo.gz
        BACKUP OK - Backup exists, 3 hours old, 0.5 MB | age=3.04916666666667hours;; size=0.515007MB;;
        $ ./check_backup.pl -f foo.gz -s 100:900
        BACKUP WARNING - Backup exists, 23 hours old, 0.5 MB | age=23.4275hours;; size=0.515007MB;;
        $ ./check_backup.pl -f foo.gz -a 8
        BACKUP CRITICAL - Backup exists, 23 hours old, 0.5 MB | age=23.4388888888889hours;; size=0.515007MB;;
  • 33. Telling Nagios to use your plugin
        1. misccommands.cfg*
        define command{
            command_name    check_backup
            command_line    $USER1$/myplugins/check_backup.pl -f $ARG1$ -a $ARG2$ -s $ARG3$
        }
        * Lines were wrapped on the slide; command_line is a single line.
  • 34. Telling Nagios to use your plugin
        2. services.cfg (check_command was wrapped on the slide; it is a single line)
        define service{
            use                     generic-service
            normal_check_interval   1440    ; 24 hours
            host_name               fai01337
            service_description     MySQL backups
            check_command           check_backup!/usr/local/backups/mysql/fai01337.mysql.dump.bz2!24!0.5:100
            contact_groups          linux-admins
        }
        3. Reload config:
        $ sudo /usr/bin/nagios -v /etc/nagios/nagios.cfg && sudo /etc/rc.d/init.d/nagios reload
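The `!`-separated values in check_command are what end up in `$ARG1$`, `$ARG2$` and `$ARG3$` of the command definition. The JavaScript sketch below is a simplified illustration of that substitution, not Nagios's actual implementation: `expandCommand` is a hypothetical function, the `USER1` value in the usage note is an assumed plugin directory, and escaping rules are ignored.

```javascript
// Hypothetical, simplified model of Nagios macro expansion.
// checkCommand: "name!arg1!arg2..." as written in the service definition.
// commandLine:  the command_line from the matching command definition.
// macros:       other macro values, e.g. { USER1: '/path/to/plugins' }.
function expandCommand(commandLine, checkCommand, macros) {
  var args = checkCommand.split('!').slice(1); // drop the command name
  return commandLine.replace(/\$(\w+)\$/g, function (whole, name) {
    var argMatch = /^ARG(\d+)$/.exec(name);
    if (argMatch) {
      var value = args[Number(argMatch[1]) - 1];
      return value === undefined ? '' : value; // missing args become empty
    }
    // Unknown macros are left untouched in this sketch.
    return Object.prototype.hasOwnProperty.call(macros, name)
      ? macros[name]
      : whole;
  });
}
```

With the definitions from these two slides and an assumed `USER1` of `/usr/local/nagios/libexec`, the result is the plugin path followed by `-f /usr/local/backups/mysql/fai01337.mysql.dump.bz2 -a 24 -s 0.5:100`.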
  • 35. Remote execution • Hosts/filesystems other than the Nagios host • Requirements • NRPE, NSClient or equivalent • Perl with Nagios::Plugin 2012
  • 36. Profit
        $ plugins/check_nt -H winhost -p 1248 -v RUNSCRIPT -l check_my_backup.bat
        OK - Backup exists, 12 hours old, 35.7 MB | age=12.4527777777778hours;; size=35.74016MB;;
  • 38. Other tools and languages • C • TAP – Test Anything Protocol • See check_tap.pl from my other talk • Python • Shell • Ruby? C#? VB? JavaScript? • AutoIt! 2012
  • 39. Now in JavaScript Why JavaScript? • Node.js: “Node's problem is that some of its users want to use it for everything? So what?” • Cool kids • Crockford • “Always bet on JS” – Brendan Eich
  • 40. Check_stuff.js – the short part
        var plugin_name = 'CHECK_STUFF';
        // Set up command line args and usage etc using commander.js.
        var cli = require('commander');
        cli
          .version('0.0.1')
          .option('-c, --critical <critical threshold>',
                  'Critical threshold using standard format', parseRangeString)
          .option('-w, --warning <warning threshold>',
                  'Warning threshold using standard format', parseRangeString)
          .option('-r, --result <Number4>',
                  'Use supplied value, not random', parseFloat)
          .parse(process.argv);
  • 41. Check_stuff.js – the short part
        if (val == undefined) {
            val = Math.floor((Math.random() * 20) + 1);
        }
        var message = ' Sample result was ' + val.toString();
        var perfdata = "'Val'="+val + ';' + cli.warning + ';' + cli.critical + ';';
        if (cli.critical && cli.critical.check(val)) {
            nagios_exit(plugin_name, "CRITICAL", message, perfdata);
        } else if (cli.warning && cli.warning.check(val)) {
            nagios_exit(plugin_name, "WARNING", message, perfdata);
        } else {
            nagios_exit(plugin_name, "OK", message, perfdata);
        }
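The slide calls `nagios_exit(plugin_name, status, message, perfdata)` without showing its body. The sketch below is one plausible shape for it, not the code from the gist: it prints a single `NAME STATUS - message | perfdata` line and sets the matching exit code. `EXIT_CODES` and `formatOutput` are names introduced here for illustration.

```javascript
// Exit codes Nagios expects from a plugin.
var EXIT_CODES = { OK: 0, WARNING: 1, CRITICAL: 2, UNKNOWN: 3 };

// Build the one-line status text; perfdata is optional.
function formatOutput(pluginName, status, message, perfdata) {
  var line = pluginName + ' ' + status + ' - ' + String(message).trim();
  if (perfdata) {
    line += ' | ' + perfdata;
  }
  return line;
}

// Hypothetical nagios_exit: print the status line and set the exit code.
// Any unrecognized status name is reported as UNKNOWN (3).
function nagios_exit(pluginName, status, message, perfdata) {
  var known = Object.prototype.hasOwnProperty.call(EXIT_CODES, status);
  console.log(formatOutput(pluginName, known ? status : 'UNKNOWN',
                           message, perfdata));
  // Setting exitCode (rather than calling process.exit) lets Node flush
  // stdout before the process ends on its own.
  process.exitCode = known ? EXIT_CODES[status] : EXIT_CODES.UNKNOWN;
}
```

One design consequence of using `process.exitCode`: the function returns to its caller, so it should be the last thing the plugin does on each code path.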
  • 42. The rest • Range object • Range.toString() • Range.check() • Range.parseRangeString() • nagios_exit() Who’s going to make it an NPM module? 2012
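For readers who want to see what a Range object might look like, here is a minimal sketch of the standard threshold format from the plugin guidelines (`10`, `10:`, `~:10`, `10:20`, `@10:20`). It is an illustration written for this transcript, not the code from the gist; only the names `Range`, `check`, `toString` and `parseRangeString` come from the slide, and `toNumber` is a helper introduced here.

```javascript
// A threshold range. check(value) returns true when the value should ALERT.
function Range(start, end, alertInside) {
  this.start = start;             // -Infinity for "~"
  this.end = end;                 // Infinity when the end is omitted
  this.alertInside = alertInside; // true for a leading "@"
}

Range.prototype.check = function (value) {
  var within = value >= this.start && value <= this.end; // endpoints inclusive
  return this.alertInside ? within : !within;
};

Range.prototype.toString = function () {
  var s = this.start === -Infinity ? '~' : String(this.start);
  var e = this.end === Infinity ? '' : String(this.end);
  return (this.alertInside ? '@' : '') + s + ':' + e;
};

// Strict numeric conversion: an empty string is not a number.
function toNumber(text) {
  return text === '' ? NaN : Number(text);
}

// Parse "[@]start:end". A bare "N" means 0..N; "N:" means N..infinity.
function parseRangeString(text) {
  var alertInside = text.charAt(0) === '@';
  var parts = (alertInside ? text.slice(1) : text).split(':');
  var start, end;
  if (parts.length === 1) {
    start = 0;
    end = toNumber(parts[0]);
  } else if (parts.length === 2) {
    start = parts[0] === '~' ? -Infinity
          : parts[0] === '' ? 0
          : toNumber(parts[0]);
    end = parts[1] === '' ? Infinity : toNumber(parts[1]);
  } else {
    throw new Error('Invalid range: ' + text);
  }
  if (isNaN(start) || isNaN(end) || start > end) {
    throw new Error('Invalid range: ' + text);
  }
  return new Range(start, end, alertInside);
}
```

Because `parseRangeString` takes the raw option string and returns an object, it can be passed directly as the coercion function in a commander.js `.option(...)` call.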
  • 43. A silly but newfangled example Facebook friends is WARNING! ./check_facebook_friends.js -u nathan.vonnahme -w @202 -c @203 2012
  • 44. Check_facebook_friends.js See the code at gist.github.com/3760536 Note: functions as callbacks instead of loops or waiting... 2012
  • 45. A horrifying/inspiring example The worst things need the most monitoring. 2012
  • 46. Chart “servers” • MS Word macro • Mail merge • Runs in user session • Need about a dozen 2012
  • 47. It gets worse. • Not a service • Not even a process • 100% CPU is normal • “OK” is complicated. 2012
  • 49. AutoIt to the rescue
        Func CompareTitles()
            For $title=1 To $all_window_titles[0][0] Step 1
                $state=WinGetState($all_window_titles[$title][0])
                $foo=0
                $do_test=0
                For $foo In $valid_states
                    If $state=$foo Then
                        $do_test +=1
                    EndIf
                Next
                If $all_window_titles[$title][0] <> "" AND $do_test>0 Then
                    $window_is_valid=0
                    For $string=0 To $num_of_strings-1 Step 1
                        $match=StringRegExp($all_window_titles[$title][0], $valid_windows[$string])
                        $window_is_valid += $match
                    Next
                    if $window_is_valid=0 Then
                        $return=2
                        $detailed_status="Unexpected window *" & $all_window_titles[$title][0] & "* present" & @LF & "***" & $all_window_titles[$title][0] & "*** doesn't match anything we expect."
                        NagiosExit()
                    EndIf
                    If StringRegExp($all_window_titles[$title][0], $valid_windows[0])=1 Then
                        $expression=ControlGetText($all_window_titles[$title][0], "", 1013)
                    EndIf
                EndIf
            Next
            $no_bad_windows=1
        EndFunc
        Func NagiosExit()
            ConsoleWrite($detailed_status)
            Exit($return)
        EndFunc
        CompareTitles()
        if $no_bad_windows=1 Then
            $detailed_status="No chartserver anomalies at this time -- " & $expression
            $return=0
        EndIf
        NagiosExit()
  • 50. Nagios now knows when they’re broken 2012
  • 51. Life is complicated “OK” is complicated. Custom plugins make Nagios much smarter about your environment. 2012
  • 52. Questions? Comments? Perl and JS plugin example code at gist.github.com/n8v 2012

Editor's Notes

  • #5: Cf. Mike Weber’s presentation: Perl plugins can be more of a performance load.
  • #15: Max 5 minute wait here. Again, we may not have time to troubleshoot your CPAN configuration right now. If you can't get it to work immediately, just watch or look on with someone else, or use another language. Unix people, you may want to help or observe someone with Windows because you'll want to do it too eventually. This worked like a dream for me with fresh Strawberry Perl, after I got the proxy configured.
  • #29: Again, replacing the section in check_stuff.pl
  • #30: This isn’t in check_stuff.pl
  • #37: This is not working for me in production anymore.