Distributed Monitoring
with Raspberry Pi
Mike Weber
mweber@spidertools.com
2013 2
The Problem: Remote Monitoring at Low Cost
Limited Service Checks
Limited Cost
Low Power Usage
Central Nagios Server
Low Tech Skills
Possible Solutions
Virtual Container
Requires VMware, etc.
Requires Expertise to Configure Nagios
Hardware
Cost
Resource Waste
Tech Skills Required (RAID, Nagios Config)
Passive Checks
Scripts on Hosts (more resources than compiled plugins)
Tech Skills
Possible Solutions: ITX
Mini-ITX ($400-600)
6.7 x 6.7 inch motherboard developed by VIA in 2001
Intel Atom 1.8 GHz Processor
2 GB of RAM
SSD
60 Watt Power Supply
Nano-ITX ($500-700)
4.7 x 4.7 inch motherboard developed by VIA in 2003
VIA 1.2 GHz Processor
1 GB of RAM
SSD
60 Watt Power Supply
Pico-ITX ($600-700)
3.9 x 2.8 inch motherboard developed by VIA in 2007
Raspberry Pi
Low Cost
$75.00 (board, case, power supply)
Low Power Usage
Power Usage of a Cell Phone
Low Tech Skills
Clone Disks
Distributed Model
Flexible
Low Cost on Nagios Server
Pi: 512 MB RAM, 700 MHz
Installation of wheezy-raspbian
Download the image file, which is about 500 MB: http://www.raspberrypi.org/downloads
Verify the Image
sha1sum 2013-02-09-wheezy-raspbian.zip
b4375dc9d140e6e48e0406f96dead3601fac6c81  2013-02-09-wheezy-raspbian.zip
Unzip the Image
unzip 2013-02-09-wheezy-raspbian.zip
Archive:  2013-02-09-wheezy-raspbian.zip
  inflating: 2013-02-09-wheezy-raspbian.img
Username: pi 
Password: raspberry 
Verify Disk Location
su -
fdisk -l
Disk /dev/sdd: 4102 MB, 4102889984 bytes 
255 heads, 63 sectors/track, 498 cylinders 
Units = cylinders of 16065 * 512 = 8225280 bytes 
Sector size (logical/physical): 512 bytes / 512 bytes 
I/O size (minimum/optimal): 512 bytes / 512 bytes 
Disk identifier: 0x295b8178 
   Device Boot      Start         End      Blocks   Id  System 
/dev/sdd1               1         497     3992135+   b  W95 FAT32 
Create Disk
dd bs=4M if=~/2013-02-09-wheezy-raspbian.img of=/dev/sdd
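A minimal sketch of the verify step as a reusable check, so that unzip and dd only run when the checksum matches. verify_sha1 is a hypothetical helper name, not part of any Raspbian tooling. The dd line stays commented out because it overwrites whatever device it is pointed at, and /dev/sdd is only correct if fdisk -l reported it as the SD card.

```shell
#!/bin/sh
# verify_sha1 FILE EXPECTED: succeed only if FILE's SHA-1 equals EXPECTED.
# (hypothetical helper, shown for illustration)
verify_sha1() {
    actual=$(sha1sum "$1" | cut -d' ' -f1)
    [ "$actual" = "$2" ]
}

# Intended use (commented out: dd destroys the contents of the target device):
# verify_sha1 2013-02-09-wheezy-raspbian.zip b4375dc9d140e6e48e0406f96dead3601fac6c81 \
#     && unzip 2013-02-09-wheezy-raspbian.zip \
#     && dd bs=4M if=2013-02-09-wheezy-raspbian.img of=/dev/sdd
```

Chaining with && means a corrupted download stops the sequence before anything is written to the card.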
Network Configuration: Wireless
Edimax Wireless 802.11b/g/n (supports WPS,WPA2,802.1x)
* works out of the box
/etc/network/interfaces
auto lo
iface lo inet loopback
iface eth0 inet dhcp
allow-hotplug wlan0
iface wlan0 inet dhcp
  wpa-ssid pi
  wpa-psk Pi89YQbg56)
Mod-Gearman
Why Mod-Gearman?
Distributes Tasks to Multiple Workers
Multiple Pi Workers
Supports Multiple Programming Languages
C, Java, Perl, PHP, Python, Shell
Provides a Distributed Model
Client Uses Very Small Resources
In Contrast to DNX Workers
Why Not DNX?
Not Currently Updated (2010-4-13)
Uses UDP (less dependable)
Client Uses More Resources
Chart: Memory in MB, DNX Worker vs. Mod-Gearman Worker
NEB: Nagios Event Broker
Mod-Gearman
Installation of Mod-Gearman on Pi
Install Prerequisites
sudo apt-get update
sudo apt-get install gearman mod-gearman-worker libgearman6 nagios-plugins
cd /etc/mod-gearman
Edit the worker.conf
sudo nano worker.conf
server=192.168.5.212:4730
key=Modlinux23
hosts=no
services=no
eventhandler=no
min-worker=6
max-worker=8
servicegroups=pi_srv
logfile=/var/log/mod_gearman/mod_gearman_worker.log
p1_file=/usr/share/mod-gearman/mod_gearman_p1.pl
Save your changes and then start the Mod-Gearman worker:
sudo /etc/init.d/mod-gearman-worker start
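Before starting the worker it can help to confirm that the edited file really contains the settings it needs. The sketch below reads key=value lines from worker.conf. conf_get and require_keys are hypothetical helper names, and which keys count as required is an assumption made for illustration.

```shell
#!/bin/sh
# conf_get KEY FILE: print the value of the last "KEY=value" line in FILE.
conf_get() {
    awk -F= -v k="$1" '
        $1 == k { sub(/^[^=]*=/, ""); v = $0 }
        END     { if (v != "") print v }' "$2"
}

# require_keys FILE KEY...: report each KEY that has no value in FILE.
# Returns non-zero if anything is missing.
require_keys() {
    file=$1; shift
    status=0
    for key in "$@"; do
        if [ -z "$(conf_get "$key" "$file")" ]; then
            echo "missing: $key"
            status=1
        fi
    done
    return $status
}
```

For example, require_keys /etc/mod-gearman/worker.conf server key servicegroups prints nothing and succeeds when all three settings are present.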
Gearman Resource Usage
ps axo pid,ppid,pcpu,size,cmd|grep gearman
Process Parent CPU Memory CMD
 1747     1   0.0  1224  /usr/sbin/mod_gearman_worker 
 3255   1747   2.5  1488  /usr/sbin/mod_gearman_worker (working) 
 3256   1747   6.6  1488  /usr/sbin/mod_gearman_worker (working) 
 3257   1747   7.0  1488  /usr/sbin/mod_gearman_worker (working) 
 3258   1747   0.0  1356  /usr/sbin/mod_gearman_worker 
 3259   1747   0.0  1356  /usr/sbin/mod_gearman_worker 
 3260   1747   0.0  1356  /usr/sbin/mod_gearman_worker
size = virtual size of the process (code+data+stack) 
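To turn a listing like this into a single figure, the rows can be totalled with awk. This is a sketch only: sum_worker_size is a hypothetical name, and it matches on the command column so the awk process itself is never counted.

```shell
#!/bin/sh
# sum_worker_size: read "pid ppid pcpu size cmd" rows on stdin and print
# the number of mod_gearman_worker processes followed by their summed SIZE column.
sum_worker_size() {
    awk '$5 ~ /mod_gearman_worker/ { n++; total += $4 }
         END { printf "%d %d\n", n, total }'
}
```

Run as ps axo pid,ppid,pcpu,size,cmd | sum_worker_size. Fed the seven rows shown above, it prints 7 9756.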
Mod-Gearman Queues
Mod-Gearman
Worker Capacity
75-100 Service Checks
5 Minute Intervals
Compiled Plugins
6 Workers
2 Workers Always Available
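One way to read these numbers is as a time budget per check. The arithmetic below is a back-of-envelope sketch, not a measurement. It assumes that "2 Workers Always Available" means 4 of the 6 workers are busy at any moment, and check_budget is a hypothetical name.

```shell
#!/bin/sh
# check_budget CHECKS INTERVAL_MIN BUSY_WORKERS
# Print the average number of seconds each check may take before the busy
# workers can no longer finish CHECKS checks within one interval.
check_budget() {
    echo $(( $2 * 60 * $3 / $1 ))
}

check_budget 100 5 4   # 100 checks, 5-minute interval, 4 busy workers -> 12
check_budget 75 5 4    # 75 checks under the same assumptions -> 16
```

Under these assumptions a check that regularly runs longer than 12 to 16 seconds eats into the two spare workers.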
Mod-Gearman Worker Configuration
identifier
Unique identifier for the worker; defaults to the hostname
min-worker
Minimum number of total workers
max-worker
Maximum number of total workers
idle-timeout
Time in seconds before idle worker exits
max-jobs
Maximum number of jobs before worker exits
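As an illustration of how these options look together, the sketch below prints a worker.conf fragment. The min-worker and max-worker values are the ones used elsewhere in this deck. The idle-timeout and max-jobs values are illustrative assumptions rather than recommendations from the talk, and worker_tuning_fragment is a hypothetical name.

```shell
#!/bin/sh
# worker_tuning_fragment: print the tuning-related lines of a worker.conf.
# idle-timeout and max-jobs below are example values (assumptions).
worker_tuning_fragment() {
    cat <<EOF
identifier=$(uname -n)
min-worker=6
max-worker=8
idle-timeout=30
max-jobs=1000
EOF
}
```

The identifier matters for monitoring later on: the check_gearman example in this deck queries a queue named worker_raspberrypi, which is the worker's identifier with a worker_ prefix.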
Install Process
Install Nagios Event Broker
broker_module=/usr/local/lib/mod_gearman/mod_gearman.o config=/etc/mod_gearman/mod_gearman_neb.conf
Install Server: gearmand
/etc/init.d/gearmand start
Install Worker: mod_gearman_worker
/etc/init.d/mod_gearman_worker start
Configuration File
/etc/mod_gearman/mod_gearman_neb.conf
Distributed Monitoring
Distributed Monitoring: Hostgroups
Server Configuration: /etc/mod_gearman/mod_gearman_neb.conf
server=localhost:4730
eventhandler=yes
services=yes
hosts=yes
hostgroups=debian-servers
encryption=yes
key=linux23_Qg549K
Pi Worker Configuration: /etc/mod-gearman/worker.conf
server=192.168.5.99:4730
eventhandler=no
services=no
hosts=no
min-worker=6
max-worker=8
encryption=yes
key=linux23_Qg549K
p1_file=/usr/share/mod-gearman/mod_gearman_p1.pl
hostgroups=debian-servers
Distributed Monitoring: Servicegroups
Server Configuration: /etc/mod_gearman/mod_gearman_neb.conf
server=localhost:4730
eventhandler=yes
services=yes
hosts=yes
servicegroups=pi_srv
encryption=yes
key=linux23_Qg549K
Pi Worker Configuration: /etc/mod-gearman/worker.conf
server=192.168.5.99:4730
eventhandler=no
services=no
hosts=no
min-worker=6
max-worker=8
encryption=yes
key=linux23_Qg549K
p1_file=/usr/share/mod-gearman/mod_gearman_p1.pl
servicegroups=pi_srv
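The hostgroups and servicegroups settings decide which Gearman queues a worker pulls jobs from: each group becomes a queue named hostgroup_NAME or servicegroup_NAME (the debug log later in this deck shows servicegroup_pi_srv). The sketch below lists the queue names implied by a worker.conf. queues_for is a hypothetical helper, and it ignores the generic queues enabled by hosts=yes or services=yes.

```shell
#!/bin/sh
# queues_for CONF: print the group queues a worker.conf subscribes to.
queues_for() {
    awk -F= '
        $1 == "hostgroups"    { prefix = "hostgroup_" }
        $1 == "servicegroups" { prefix = "servicegroup_" }
        prefix != "" {
            # a setting may hold several comma-separated group names
            n = split($2, names, ",")
            for (i = 1; i <= n; i++) print prefix names[i]
            prefix = ""
        }' "$1"
}
```

If the name printed here does not match the group configured on the Nagios server, jobs pile up in a queue that no worker reads, which shows up as orphaned checks.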
Performance Tuning Pi
noatime
mtime: contents of file changed
ctime: inode changed (permissions, ownership)
atime: file accessed; updating it forces a write on every read
/etc/fstab
proc            /proc           proc    defaults          0       0
/dev/mmcblk0p1  /boot           vfat    defaults          0       2
/dev/mmcblk0p2  /               ext4    defaults,noatime  0       1
mount -o remount /
Verify Changes with:
mount
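Reading the mount output by eye works, but the same verification can be scripted. The sketch below parses /proc/mounts-style lines and succeeds only when the root filesystem carries the noatime option. root_has_noatime is a hypothetical name.

```shell
#!/bin/sh
# root_has_noatime: read /proc/mounts-style lines on stdin;
# exit 0 if the entry mounted on "/" lists noatime among its options.
root_has_noatime() {
    awk '$2 == "/" { opts = $4 }
         END { if (opts ~ /(^|,)noatime(,|$)/) exit 0; exit 1 }'
}
```

Typical use is root_has_noatime < /proc/mounts && echo "noatime active".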
Maximize Resources
Reduce Logging
* Turn Off rsyslog
* Minimize Logging
Shutdown Other Services
* mail server
Firewall Issues
Understanding Network Connections: Pi
tcp        0      0 192.168.5.47:43965      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43948      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43964      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43962      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43960      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43977      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43956      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43947      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43975      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43969      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43978      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43967      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43973      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43959      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43951      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43961      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43957      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43963      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43976      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43945      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43972      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43970      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43950      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43958      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43952      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43955      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43954      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43946      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43966      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43968      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43953      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43979      192.168.5.212:4730      ESTABLISHED
tcp        0      0 192.168.5.47:43971      192.168.5.212:4730      TIME_WAIT  
tcp        0      0 192.168.5.47:43974      192.168.5.212:4730      ESTABLISHED
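The listing above holds 7 ESTABLISHED connections and 27 in TIME_WAIT, all opened by the Pi toward port 4730 on the server. TIME_WAIT entries are connections that have already closed and are waiting out the TCP timer. A sketch for producing that tally from netstat output follows. count_states is a hypothetical name.

```shell
#!/bin/sh
# count_states PORT: read netstat-style lines on stdin and tally TCP states
# for connections whose remote address ends in :PORT.
count_states() {
    awk -v port=":$1" '
        $1 == "tcp" && $5 ~ (port "$") { c[$6]++ }
        END { for (s in c) print s, c[s] }' | sort
}
```

Run on the Pi as netstat -tn | count_states 4730.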
  
Understanding Network Connections: Nagios
tcp        0      0 0.0.0.0:4730                0.0.0.0:*                   LISTEN      
 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44254          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44258          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44257          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44255          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44259          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44253          ESTABLISHED 
tcp        0      0 192.168.5.212:4730          192.168.5.47:44256          ESTABLISHED 
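Both listings show the same direction of travel: the Pi opens connections from a high port to port 4730 on the Nagios server, and the server only listens. A firewall in between therefore needs to admit inbound TCP 4730 on the server side. The sketch below prints a candidate iptables rule instead of applying it. gearman_allow_rule is a hypothetical name, and whether appending with -A is right depends on the rules already in the chain.

```shell
#!/bin/sh
# gearman_allow_rule WORKER_IP: print (do not apply) an iptables rule for the
# Nagios server that accepts Gearman traffic from one worker.
gearman_allow_rule() {
    echo "iptables -A INPUT -p tcp -s $1 --dport 4730 -m state --state NEW,ESTABLISHED -j ACCEPT"
}

gearman_allow_rule 192.168.5.47
```

Printing first keeps the step reviewable; the output can be run with sudo on the server once it looks right.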
  
Creating Checks
Create Service Check
Create Servicegroup
Add Services to Servicegroup
Graphing with Pi Checks
Monitoring Pi
Monitor Pi: Workers and Jobs
Create a Script on Nagios to Monitor Workers and Jobs
#!/bin/bash
check_gearman -H 192.168.5.99 -q worker_raspberrypi -t 10 -s check
Monitor Pi: Service Check
Monitor Gearman Workers
Monitor Gearman Workers/Jobs
Warning Signals
Nagios Server: Check Latency
Nagios Server: Orphaned Checks
service check orphaned, is the mod-gearman worker on queue 'servicegroup_pi' running?
Pi: Load Over 1
1 = 100% (the Pi has a single CPU core)
Pi: Defunct Workers
15824 14129 2.1 0 [mod_gearman_wor] <defunct>
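The two Pi-side signals can be checked from a script on the Pi itself. The sketch below counts defunct workers in ps output and tests the 1-minute load average against a limit. count_defunct and load_over are hypothetical names, and the limit of 1 follows the rule of thumb on this slide.

```shell
#!/bin/sh
# count_defunct: read "pid ppid pcpu size cmd" rows on stdin and print how
# many are defunct mod_gearman workers, e.g. "[mod_gearman_wor] <defunct>".
count_defunct() {
    awk '$5 ~ /^\[mod_gearman/ && $6 == "<defunct>" { n++ }
         END { print n + 0 }'
}

# load_over LIMIT: read /proc/loadavg-style text on stdin;
# exit 0 if the 1-minute load average is above LIMIT.
load_over() {
    awk -v limit="$1" '
        NR == 1 { over = ($1 + 0 > limit + 0) }
        END     { exit(over ? 0 : 1) }'
}
```

For example: ps axo pid,ppid,pcpu,size,cmd | count_defunct, and load_over 1 < /proc/loadavg && echo "load above 1".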
Pi: Overloaded
Load Approaching Limit
ps axo pid,ppid,pcpu,size,cmd|grep gearman|grep -v grep
pid   ppid  pcpu  size  cmd
14129     1  0.0  1224 /usr/sbin/mod_gearman_worker 
15634 14129 12.0  1488 /usr/sbin/mod_gearman_worker 
15635 14129 12.0  1488 /usr/sbin/mod_gearman_worker 
15636 14129 12.0  1488 /usr/sbin/mod_gearman_worker 
15637 14129 13.0  1488 /usr/sbin/mod_gearman_worker 
15638 14129 12.0  1488 /usr/sbin/mod_gearman_worker 
15639 14129 12.0  1488 /usr/sbin/mod_gearman_worker
15640 14129 12.0  1488 /usr/sbin/mod_gearman_worker 
15641 14129 11.0  1488 /usr/sbin/mod_gearman_worker
15642 14129 11.0  1488 /usr/sbin/mod_gearman_worker 
Increased CPU Usage Indicating Impending DOOM
ps axo pid,ppid,pcpu,size,cmd|grep gearman|grep -v grep
pid   ppid   pcpu size  cmd
14129     1  0.0  1224 /usr/sbin/mod_gearman_worker 
15658 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15659 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15660 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15661 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15662 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15663 14129  2.1  1488 /usr/sbin/mod_gearman_worker 
15664 14129 21.0  1488 /usr/sbin/mod_gearman_worker 
15665 14129 21.0  1488 /usr/sbin/mod_gearman_worker 
15666 14129 21.0  1488 /usr/sbin/mod_gearman_worker 
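Spotting the busy workers in a listing like these can be automated with a CPU threshold. The sketch below prints the PID and %CPU of every worker above a limit. hot_workers is a hypothetical name, and the threshold of 20 in the usage note is an assumption chosen to match the second listing, not a value from the talk.

```shell
#!/bin/sh
# hot_workers LIMIT: read "pid ppid pcpu size cmd" rows on stdin and print
# "pid pcpu" for each mod_gearman_worker whose %CPU is above LIMIT.
hot_workers() {
    awk -v limit="$1" '
        $5 ~ /mod_gearman_worker/ && $3 + 0 > limit + 0 { print $1, $3 }'
}
```

With ps axo pid,ppid,pcpu,size,cmd | hot_workers 20, the second listing above would report workers 15664, 15665 and 15666.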
Plugin Resource Usage: RAM
Chart: RAM used per plugin type (Compiled, NSCA, NSClient++, SSH, Perl)
Plugin Resource Use: Time
Example: check_ping
PID PPID CPU RAM Time Command
12106 12105 0.0 280 00:01 25 /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
12106 12105 0.0 280 00:02 25 /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
12106 12105 0.0 280 00:03 25 /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
Plugins Resource Hog: Network Bandwidth
CPU   RAM   Time      Plugin
13.0  7696  00:01  20 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 6.5  7696  00:02  20 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 4.3  7696  00:03  20 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 3.2  7696  00:04  20 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 2.6  7696  00:05  20 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 2.1  7696  00:06  15 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 1.8  7696  00:07  15 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 1.6  7696  00:08  15 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 1.4  7696  00:09  15 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
 1.3  7696  00:10  15 /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
Latency Evaluation
Turn On Debug=1 
[2013-08-20 10:24:36][11574][DEBUG] received job for queue servicegroup_pi_srv: centos - FTP
[2013-08-20 10:24:36][11574][DEBUG] service: 'centos' - 'FTP', next_check is at 2013-08-20 10:24:36, latency so far: 0
[2013-08-20 10:25:17][11574][DEBUG] received job for queue servicegroup_pi_srv: centos - HTTP
[2013-08-20 10:25:17][11574][DEBUG] service: 'centos' - 'HTTP', next_check is at 2013-08-20 10:25:17, latency so far: 0
[2013-08-20 10:25:17][11574][DEBUG] service job completed: centos HTTP: 2
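With debug logging on, the "latency so far" figures can be pulled out of the worker log rather than read line by line. The sketch below prints the largest value seen. max_latency is a hypothetical name, and the log format is assumed to match the excerpt above.

```shell
#!/bin/sh
# max_latency: read a mod_gearman worker debug log on stdin and print the
# largest "latency so far" value (0 if none is found).
max_latency() {
    awk -F'latency so far: ' '
        NF > 1 { v = $2 + 0; if (v > max) max = v }
        END    { print max + 0 }'
}
```

Using the logfile path configured earlier in this deck: max_latency < /var/log/mod_gearman/mod_gearman_worker.log. For the excerpt above the result is 0.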
Troubleshooting: Return code 127
CRITICAL: Return code of 127 is out of bounds. Make sure the plugin you're trying to run actually exists. (worker: raspberrypi)
Check the Path to the plugins directory.
sudo mkdir -p /usr/local/nagios
sudo ln -s /usr/lib/nagios/plugins /usr/local/nagios/libexec
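Exit status 127 is what a shell returns when the command it was asked to run cannot be found. The symlink above addresses the case where the check command arrives with the /usr/local/nagios/libexec path while the packaged plugins on the Pi live in /usr/lib/nagios/plugins. The sketch below reproduces the symptom for any plugin path. plugin_status is a hypothetical helper name.

```shell
#!/bin/sh
# plugin_status CMD [ARGS...]: run a plugin quietly and print its exit status,
# with a hint when the status is 127 (command not found).
plugin_status() {
    "$@" >/dev/null 2>&1
    rc=$?
    if [ "$rc" -eq 127 ]; then
        echo "127: command not found: $1"
    else
        echo "$rc"
    fi
}
```

Running plugin_status /usr/local/nagios/libexec/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5 on the Pi before and after creating the symlink shows whether the path was the problem.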
Questions?

Nagios Conference 2013 - Mike Weber - Distributed Monitoring with Raspberry Pi

  • 1. Distributed Monitoring with Raspberry Pi Mike Weber mweber@spidertools.com
  • 2. 2013 2 The Problem: Remote Monitoring at Low Cost Limited Service Checks Limited Cost Low Power Usage Central Nagios Server Low Tech Skills
  • 3. 2013 3 Possible Solutions Virtual Container Requires VMWare etc. Requires Expertise to Configure Nagios Hardware Cost Resource Waste Tech Skills Required (RAID, Nagios Config) Passive Checks Scripts on Hosts (more resources than compiled plugins) Tech Skills
  • 4. 2013 4 Possible Solutions: ITX Mini-ITX ($400-600) 6.7 x 6.7 inch motherboard developed by VIA in 2001 Intel Atom 1.8 GHz Processor 2 GB of RAM SSD 60 Watt Power Supply Nano-ITX ($500-700) 4.7 x 4.7 inch motherboard developed by VIA in 2003 VIA 1.2 GHz Processor 1 GB of RAM SSD 60 Watt Power Supply Pico-ITX ($600-700) 3.9 x 2.8 inch motherboard developed by VIA in 2007
  • 6. 2013 6 Raspberry Pi Low Cost $75.00 (board, case, power supply) Low Power Usage Power Usage of a Cell Phone Low Tech Skills Clone Disks Distributed Model Flexible Low Cost on Nagios Server
  • 7. 2013 7 Pi: 512 RAM 700MHz
  • 8. 2013 8 Installation of wheezy-raspbian Download the image file which is about 500 MB: http://www.raspberrypi.org/downloads  Verify the Image   sha1sum 2013­02­09­wheezy­raspbian.zip  b4375dc9d140e6e48e0406f96dead3601fac6c81  2013­02­09­wheezy­raspbian.zip  Unzip the Image unzip 2013­02­09­wheezy­raspbian.zip  Archive:  2013­02­09­wheezy­raspbian.zip    inflating: 2013­02­09­wheezy­raspbian.img  Username: pi  Password: raspberry  Verify Disk Location su ­  fdisk ­l  Disk /dev/sdd: 4102 MB, 4102889984 bytes  255 heads, 63 sectors/track, 498 cylinders  Units = cylinders of 16065 * 512 = 8225280 bytes  Sector size (logical/physical): 512 bytes / 512 bytes  I/O size (minimum/optimal): 512 bytes / 512 bytes  Disk identifier: 0x295b8178     Device Boot      Start         End      Blocks   Id  System  /dev/sdd1               1         497     3992135+   b  W95 FAT32  Create Disk dd bs=4M if=~/2012­10­28­wheezy­raspbian.img of=/dev/sdd 
  • 9. 2013 9 Network Configuration: Wireless Edimax Wireless 802.11b/g/n (supports WPS,WPA2,802.1x) * works out of the box /etc/network/interfaces auto lo iface lo inet loopback iface eth0 inet dhcp allow­hotplug wlan0 iface wlan0 inet dhcp   wpa­ssid pi   wpa­psk Pi89YQbg56)
  • 11. 2013 11 Why Mod-Gearman? Distributes Tasks to Multiple Workers Multiple Pi Workers Supports Multiple Programming Languages C, Java, Perl, PHP, Python, Shell Provides a Distributed Model Client Uses Very Small Resources In Contrast to DNX Workers
  • 12. 2013 12 Why Not DNX? Not Currently Updated (2010-4-13) Uses UDP (less dependable) Client Uses More Resources DNX Worker Mod-Gearman Worker 0 50 100 150 200 250 Memory in MB
  • 13. 2013 13 NEB: Nagios Event Broker
  • 15. 2013 15 Installation of Mod-Gearman on Pi Install Prerequisites sudo apt­get update  sudo apt­get install gearman mod­gearman­worker libgearman6 nagios­plugins  cd /etc/mod­gearman Edit the worker.conf sudo nano worker.conf server=192.168.5.212:4730 key=Modlinux23 hosts=no services=no eventhandlers=no min­worker=6 max­worker=8 servicegroups=pi_srv logfile=/var/log/mod_gearman/mod_gearman_worker.log p1_file=/usr/share/mod­gearman/mod_gearman_p1.pl Save your changes and then start the Mod-Gearman worker: sudo /etc/init.d/mod­gearman­worker start
  • 16. 2013 16 Gearman Resource Usage ps axo pid,ppid,pcpu,size,cmd|grep gearman Process Parent CPU Memory CMD  1747     1   0.0  1224  /usr/sbin/mod_gearman_worker   3255   1747   2.5  1488  /usr/sbin/mod_gearman_worker (working)   3256   1747   6.6  1488  /usr/sbin/mod_gearman_worker (working)   3257   1747   7.0  1488  /usr/sbin/mod_gearman_worker (working)   3258   1747   0.0  1356  /usr/sbin/mod_gearman_worker   3259   1747   0.0  1356  /usr/sbin/mod_gearman_worker   3260   1747   0.0  1356  /usr/sbin/mod_gearman_worker size = virtual size of the process (code+data+stack) 
  • 19. 2013 19 Worker Capacity 75-100 Service Checks 5 Minute Intervals Compiled Plugins 6 Workers 2 Workers Always Available
  • 20. 2013 20 Mod-Gearman Worker Configuration Worker Identifier Unique identifier for worker, hostname min-worker Minimum number of total workers max-worker Maximum number of total workers idle-timeout Time in seconds before idle worker exits max-jobs Maximum number of jobs before worker exits
  • 21. 2013 21 Install Process Install Nagios Event Broker broker_module=/usr/local/lib/mod_gearman/mod_gearman.o  config=/etc/mod_gearman/mod_gearman_neb.conf  Install Server: gearmand /etc/init.d/gearmand start Install Worker: mod_gearman_worker /etc/init.d/mod_gearman_worker start Configuration File /etc/mod_gearman/mod_gearman_neb.conf
  • 24. 2013 24 Distributed Monitoring: Hostgroups Server Configuration: /etc/mod_gearman/mod_gearman_neb.conf server=localhost:4730 eventhandler=yes services=yes hosts=yes hostgroups=debian-servers encryption=yes key=linux23_Qg549K Pi Worker Configuration: /etc/mod-gearman/worker.conf server=192.168.5.99:4730 eventhandler=no services=no hosts=no min-worker=6 max-worker=8 encryption=yes key=linux23_Qg549K p1_file=/usr/share/mod­gearman/mod_gearman_p1.pl hostgroups=debian­servers
  • 25. 2013 25 Distributed Monitoring: Servicegroups Server Configuration: /etc/mod_gearman/mod_gearman_neb.conf server=localhost:4730 eventhandler=yes services=yes hosts=yes servicegroups=pi_srv encryption=yes key=linux23_Qg549K Pi Worker Configuration: /etc/mod-gearman/worker.conf server=192.168.5.99:4730 eventhandler=no services=no hosts=no min-worker=6 max-worker=8 encryption=yes key=linux23_Qg549K p1_file=/usr/share/mod­gearman/mod_gearman_p1.pl servicegroups=pi_srv
  • 27. 2013 27 noatime mtime contents of file changed ctime inode changed (permissions,ownership) atime accessed time forces a write /etc/fstab proc            /proc           proc    defaults          0       0 /dev/mmcblk0p1  /boot           vfat    defaults          0       2 /dev/mmcblk0p2  /               ext4    defaults,noatime  0       1 mount ­o remount / Verify Changes with: mount
  • 28. 2013 28 Maximize Resources Reduce Logging * Turn Off rsyslog * Minimize Logging Shutdown Other Services * mail server
  • 30. Understanding Network Connections: Pi
    tcp        0      0 192.168.5.47:43965      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43948      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43964      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43962      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43960      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43977      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43956      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43947      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43975      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43969      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43978      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43967      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43973      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43959      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43951      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43961      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43957      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43963      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43976      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43945      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43972      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43970      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43950      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43958      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43952      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43955      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43954      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43946      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43966      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43968      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43953      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43979      192.168.5.212:4730      ESTABLISHED
    tcp        0      0 192.168.5.47:43971      192.168.5.212:4730      TIME_WAIT
    tcp        0      0 192.168.5.47:43974      192.168.5.212:4730      ESTABLISHED
  • 31. Understanding Network Connections: Nagios
    tcp        0      0 0.0.0.0:4730                0.0.0.0:*                   LISTEN
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44254          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44258          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44257          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44255          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44259          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44253          ESTABLISHED
    tcp        0      0 192.168.5.212:4730          192.168.5.47:44256          ESTABLISHED
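Long connection listings like the two above are easier to read as a count per TCP state. The following is a minimal sketch, assuming the input has the column layout of `netstat -tn` (protocol, two queue counters, local address, remote address, state). The function name `count_states` is hypothetical.

```shell
#!/bin/bash
# count_states PORT: read netstat-style lines on stdin and print "STATE COUNT"
# for every TCP connection whose local or remote port equals PORT.
count_states() {
  awk -v port="$1" '
    $1 == "tcp" {
      nl = split($4, l, ":")   # local address, last piece is the port
      nr = split($5, r, ":")   # remote address, last piece is the port
      if (l[nl] == port || r[nr] == port) count[$6]++
    }
    END { for (s in count) print s, count[s] }
  ' | sort
}

# Usage on a live system (4730 is the Gearman port shown on the slides):
#   netstat -tn | count_states 4730
```

Fed the Pi listing from slide 30, this would report 7 ESTABLISHED and 27 TIME_WAIT connections. The seven established connections match the seven the Nagios server shows on slide 31.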
  • 35. Add Services to Servicegroup
  • 38. Monitor Pi: Workers and Jobs
    Create a Script on Nagios to Monitor Workers and Jobs
      #!/bin/bash
      check_gearman -H 192.168.5.99 -q worker_raspberrypi -t 10 -s check
  • 39. Monitor Pi: Service Check
  • 41. Monitor Gearman Workers/Jobs
  • 42. Warning Signals
    Nagios Server: Check Latency
    Nagios Server: Orphaned Checks
      service check orphaned, is the mod-gearman worker on queue 'servicegroup_pi' running?
    Pi: Load Over 1
      A load of 1 = 100% of the Pi's single CPU core
    Pi: Defunct Workers
      15824 14129 2.1 0 [mod_gearman_wor] <defunct>
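The two Pi-side warning signals can be checked from a script. This is a rough sketch, not the author's method: `load_over` and `count_defunct` are hypothetical helper names, the load is read from `/proc/loadavg` (Linux), and the `ps` column layout is assumed to be the one used elsewhere in this deck.

```shell
#!/bin/bash
# load_over LIMIT [LOAD]: exit 0 when the 1-minute load is greater than LIMIT.
# LOAD defaults to the first field of /proc/loadavg.
load_over() {
  limit="$1"
  load="${2:-$(cut -d' ' -f1 /proc/loadavg)}"
  awk -v l="$load" -v m="$limit" 'BEGIN { exit !(l > m) }'
}

# count_defunct: read `ps axo pid,ppid,pcpu,size,cmd` output on stdin and
# print how many mod_gearman worker entries are marked <defunct>.
count_defunct() {
  grep -c 'mod_gearman.*<defunct>' || true
}

# Usage on the Pi:
#   load_over 1 && echo "WARNING: load over 1"
#   ps axo pid,ppid,pcpu,size,cmd | count_defunct
```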
  • 43. Pi: Overloaded
    Load Approaching Limit
      ps axo pid,ppid,pcpu,size,cmd|grep gearman|grep -v grep
        pid   ppid  pcpu  size  cmd
      14129     1   0.0  1224  /usr/sbin/mod_gearman_worker
      15634 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15635 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15636 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15637 14129  13.0  1488  /usr/sbin/mod_gearman_worker
      15638 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15639 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15640 14129  12.0  1488  /usr/sbin/mod_gearman_worker
      15641 14129  11.0  1488  /usr/sbin/mod_gearman_worker
      15642 14129  11.0  1488  /usr/sbin/mod_gearman_worker
    Increased CPU Usage Indicating Impending DOOM
      ps axo pid,ppid,pcpu,size,cmd|grep gearman|grep -v grep
        pid   ppid  pcpu  size  cmd
      14129     1   0.0  1224  /usr/sbin/mod_gearman_worker
      15658 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15659 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15660 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15661 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15662 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15663 14129   2.1  1488  /usr/sbin/mod_gearman_worker
      15664 14129  21.0  1488  /usr/sbin/mod_gearman_worker
      15665 14129  21.0  1488  /usr/sbin/mod_gearman_worker
      15666 14129  21.0  1488  /usr/sbin/mod_gearman_worker
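Rather than reading the per-worker pcpu column by eye, the listing can be reduced to a worker count and a summed percentage. This is a sketch under the assumption that the input uses the same five `ps` columns as the slide. The function name `worker_cpu` is hypothetical.

```shell
#!/bin/bash
# worker_cpu: read `ps axo pid,ppid,pcpu,size,cmd` lines on stdin and print
# how many mod_gearman_worker processes there are and their summed pcpu.
# The parent process (ppid 1) is counted along with its children.
worker_cpu() {
  awk '$5 ~ /mod_gearman_worker/ { n++; sum += $3 }
       END { printf "workers=%d total_pcpu=%.1f\n", n, sum }'
}

# Usage on the Pi:
#   ps axo pid,ppid,pcpu,size,cmd | worker_cpu
```

Note that `ps` reports pcpu as CPU time divided by the lifetime of each process, so recently started workers can show high values and the sum is only a rough indicator.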
  • 44. Plugin Resource Usage: RAM
    [Bar chart: RAM by plugin type (Compiled, NSCA, NSClient++, SSH, Perl); axis scale 0 to 12]
  • 45. Plugin Resource Use: Time
    Example: check_ping
      PID    PPID   CPU  RAM  Time       Command
      12106  12105  0.0  280  00:01  25  /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
      12106  12105  0.0  280  00:02  25  /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
      12106  12105  0.0  280  00:03  25  /usr/lib/nagios/plugins/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
  • 46. Plugin Resource Hog: Network Bandwidth
      CPU   RAM   Time       Plugin
      13.0  7696  00:01  20  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       6.5  7696  00:02  20  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       4.3  7696  00:03  20  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       3.2  7696  00:04  20  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       2.6  7696  00:05  20  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       2.1  7696  00:06  15  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       1.8  7696  00:07  15  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       1.6  7696  00:08  15  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       1.4  7696  00:09  15  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
       1.3  7696  00:10  15  /usr/bin/perl -w /usr/lib/nagios/plugins/check_iftraffic3.pl
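The falling CPU column for check_iftraffic3.pl is worth a second look. `ps` reports %CPU as CPU time used divided by how long the process has been running, so a plugin that does its work at startup and then waits shows a value that shrinks as time passes. The sketch below converts each sample back into CPU seconds. It assumes the Time column is elapsed time in mm:ss, which the slide does not state. The function name `cpu_seconds` is hypothetical.

```shell
#!/bin/bash
# cpu_seconds: read "PCPU MM:SS" pairs on stdin and print each pair followed by
# the CPU seconds it implies (pcpu / 100 * elapsed seconds).
cpu_seconds() {
  awk '{
    split($2, t, ":")
    elapsed = t[1] * 60 + t[2]
    printf "%s %s %.2f\n", $1, $2, $1 / 100 * elapsed
  }'
}

# Usage with three samples taken from the slide:
#   printf '%s\n' '13.0 00:01' '6.5 00:02' '1.3 00:10' | cpu_seconds
```

Under that assumption every sample on the slide works out to roughly 0.13 CPU seconds, which suggests the Perl plugin is expensive to start but idle afterwards.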
  • 48. Troubleshooting: Return code 127
    CRITICAL: Return code of 127 is out of bounds. Make sure the plugin you're trying to run actually exists. (worker: raspberrypi)
    Check the path to the plugins directory:
      sudo mkdir -p /usr/local/nagios
      sudo ln -s /usr/lib/nagios/plugins /usr/local/nagios/libexec
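Nagios expects plugins to return 0 to 3. The shell itself returns 127 when it cannot find the command, which is why a wrong plugins path usually surfaces as this error. A minimal sketch for testing a plugin path by hand on the worker follows. `describe_rc` and `plugin_rc` are hypothetical helper names, and the plugin path in the usage line is only an example.

```shell
#!/bin/bash
# describe_rc CODE: say how a plugin return code reads.
describe_rc() {
  case "$1" in
    0) echo "OK" ;;
    1) echo "WARNING" ;;
    2) echo "CRITICAL" ;;
    3) echo "UNKNOWN" ;;
    126) echo "out of bounds: plugin found but not executable" ;;
    127) echo "out of bounds: plugin not found at that path" ;;
    *) echo "out of bounds" ;;
  esac
}

# plugin_rc PATH [ARGS...]: run the plugin, discard its output, and describe
# the return code the shell reports for it.
plugin_rc() {
  "$@" >/dev/null 2>&1
  describe_rc "$?"
}

# Usage on the worker (path as configured on the Nagios server):
#   plugin_rc /usr/local/nagios/libexec/check_ping -H 192.168.5.220 -w 3000.0,80% -c 5000.0,100% -p 5
```

If the path used by the Nagios server reports "not found" on the Pi, the symlink shown on the slide makes that path resolve to the Debian plugins directory.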