So you want to switch off ?
Time to say goodbye to
your Nagios based setup!
© 2014 - Olivier Jan - Check my Website - @olivjan - ojan@monitoring-fr.org
About me
❖ System admin and architect
❖ Co-founder of « Communauté Francophone de la Supervision Libre »
❖ Writer of the book « Nagios 3 au cœur de la supervision Open Source »
❖ Co-founder of Check my Website, a SaaS service for remote monitoring of
websites and applications (current)
Content
❖ Why switch off ? the good and maybe not so good reasons to do so !
❖ Which way to take ?
❖ Building a monitoring solution without Nagios :
❖ Tools available
❖ A personal work in progress
❖ Migrating from Nagios to this kind of solution
Some reasons to switch off…
❖ The godfather of OSS monitoring is dead as an
Open Source project ?
❖ Can’t do better with it
❖ Cool new kids out there
❖ Better « cloud » support
❖ Clear distinction between state, metric and
message monitoring
❖ Better charting solution
❖ Near realtime monitoring
❖ Routing, aggregation, correlation…
❖ YOUR reasons ;)
Which way to take ?
❖ The « 4 musketeers »
❖ Naemon
❖ Icinga 2
❖ Shinken
❖ Centreon
❖ Reboot from building blocks
❖ Collect
❖ Store
❖ Visualize
❖ Alert
Tools : Collecting metrics and messages
❖ Packetbeat (metrics & messages)
❖ Rsyslog, NXLog, Syslog-ng
(messages)
❖ sFlow Toolkit, Host sFlow
❖ Logstash-forwarder (messages)
❖ Collectd (metrics)
❖ Diamond (metrics)
❖ osquery, WMI (metrics)
❖ Network level (sFlow)
❖ System Level
❖ Application Level
Tools : External collecting
❖ End user perspective
❖ Controls done closest to the
end-user
❖ Application behavior
❖ Real User Monitoring
❖ Webpagetest
❖ Selenium
❖ PhantomasJS
❖ Boomerang
❖ Bucky
Tools : Routing metrics and messages
❖ Messages : Logstash, Flume, Fluentd
❖ Metrics : StatsD
❖ Metrics : Carbon Relay NG
One or more messages can fire an event
Tools : Databases
❖ Graphite : The most used.
❖ OpenTSDB : HBase
❖ KairosDB : Cassandra
❖ InfluxDB : The most promising ?
❖ Elasticsearch : Index database
Tools : Visualizing metrics
and messages
❖ Kibana
❖ Grafana
❖ Dashboards collection
Tools : Alerting
❖ Seyren : Alerting dashboard for
Graphite.
❖ Cabot : Get alerted when services go
down or metrics go crazy
❖ Bosun : An advanced, open-source
monitoring and alerting system
❖ Skyline : Real-time anomaly
detection system
❖ Oculus : Anomaly correlation
component of Etsy's Kale system
❖ Esper : Complex Event Processing
The French Monitoring Community Xperience
❖ Reboot from building blocks
❖ Collect
❖ Store
❖ Visualize
❖ Alert
The French Monitoring Community Xperience
Is it working ? What is not working ?
Collecting metrics : Collectd
❖ InfluxDB Collectd proxy
❖ In Golang like InfluxDB
❖ Temporary solution
❖ Native Collectd plugin
LoadPlugin network
<Plugin network>
  # proxy address
  Server "127.0.0.1" "8096"
</Plugin>
❖ PHP5-FPM metrics
❖ Nginx metrics
❖ MariaDB metrics
❖ System metrics
❖ <metricname>:<value>|<type>
Collecting messages : Rsyslog
❖ Logs nearly ready for consumption
❖ Native distribution package
❖ Nginx Log, MySQL slow query
log
template(name="ls_json" type="list" option.json="on") {
  constant(value="{")
  constant(value="\"@timestamp\":\"") property(name="timereported" dateFormat="rfc3339")
  constant(value="\",\"@version\":\"1")
  constant(value="\",\"message\":\"") property(name="msg")
  constant(value="\",\"host\":\"") property(name="hostname")
  constant(value="\",\"severity\":\"") property(name="syslogseverity-text")
  constant(value="\",\"facility\":\"") property(name="syslogfacility-text")
  constant(value="\",\"programname\":\"") property(name="programname")
  constant(value="\",\"procid\":\"") property(name="procid")
  constant(value="\"}\n")
}
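For reference, the JSON document that a template like this is meant to emit for each log line can be sketched in Python. This is only an illustration: the function name `to_logstash_json` and its parameters are hypothetical, and only the field names come from the template.

```python
import json
from datetime import datetime, timezone


def to_logstash_json(msg, hostname, severity, facility, programname, procid, when=None):
    """Build one JSON line carrying the same fields as the ls_json template."""
    when = when or datetime.now(timezone.utc)
    event = {
        "@timestamp": when.isoformat(),  # RFC 3339 style timestamp
        "@version": "1",
        "message": msg,
        "host": hostname,
        "severity": severity,
        "facility": facility,
        "programname": programname,
        "procid": procid,
    }
    # json.dumps takes care of escaping quotes inside the values,
    # which is the job option.json="on" does on the rsyslog side.
    return json.dumps(event) + "\n"
```

Each call returns one newline-terminated JSON object, which is the shape a Logstash input with a JSON codec expects.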
Collecting @ network level : Packetbeat
❖ Specific agent
❖ Collect traffic for
❖ HTTP
❖ MySQL
❖ PostgreSQL
❖ Redis
Routing messages : Logstash
❖ Inputs
❖ Codecs/filters
❖ Outputs
input {
  udp {
    port => 10514
    codec => "json"
    type => "syslog"
  }
}
filter {
  # This replaces the host field with the host that generated the message (sysloghost)
  if [sysloghost] {
    mutate {
      replace => [ "host", "%{sysloghost}" ]
      remove_field => "sysloghost"
    }
  }
}
output {
  elasticsearch { host => localhost }
}
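What the filter stage does to an event can be approximated in a few lines of Python. This is a rough sketch, not Logstash code: the function name `replace_host` is hypothetical, and a plain dict stands in for a Logstash event.

```python
def replace_host(event):
    """Approximate the mutate filter: copy sysloghost into host, then drop sysloghost."""
    if "sysloghost" in event:
        event = dict(event)  # work on a copy, leave the caller's dict untouched
        event["host"] = event.pop("sysloghost")
    return event
```

Events without a `sysloghost` field pass through unchanged, as in the conditional of the config.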
Routing metrics : StatsD
❖ Is now a de facto protocol with clients
in most languages
❖ InfluxDB plugin
❖ Collectd can behave as a statsD
daemon (plugin)
❖ Very easy to push metrics

echo "foo:1|c" | nc -u -w0 127.0.0.1 8125
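The same push can be done from application code. Below is a minimal Python sketch using only the standard library; the helper names `statsd_line` and `send_metric` are hypothetical, and the host and port are simply those of the `nc` example.

```python
import socket


def statsd_line(name, value, metric_type):
    """Format one metric as <metricname>:<value>|<type>."""
    return f"{name}:{value}|{metric_type}"


def send_metric(name, value, metric_type="c", host="127.0.0.1", port=8125):
    """Fire-and-forget a single metric over UDP."""
    payload = statsd_line(name, value, metric_type).encode("ascii")
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, (host, port))
```

Calling `send_metric("foo", 1)` sends the same counter as the one-liner above, without the trailing newline that `echo` adds. Since UDP is connectionless, the call gives no indication of whether a daemon actually received the metric.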
Storing metrics : InfluxDB
❖ Make it behave like Graphite
❖ graphite-api
❖ carbon-relay-ng
❖ graphite-influxdb
❖ Cluster, cluster, cluster
❖ Designed for events and metrics
Storing messages : Elasticsearch
❖ Index database
❖ Cluster, cluster, cluster
❖ Full text search
Visualizing @ network level : Packetbeat
❖ Kibana 3 modified version
❖ Dashboards ready out of the box
Visualizing metrics : Grafana
❖ Compatible
❖ Graphite
❖ InfluxDB
❖ OpenTSDB
❖ Built on Kibana 3
Visualizing messages : Kibana 4
❖ Easy install
❖ Interactive dashboards
❖ Multiple indices
What's missing ? Wishes
❖ Alerting
❖ External monitoring
❖ Repository for dashboards…
❖ Giving sense to metrics and
messages
Alerting reboot
❖ Alert only on end user problems from an end
user perspective
❖ IRC, Chat channel…
❖ Alert thresholds based on history vs static
thresholds
❖ Statistics functions
❖ Boolean conditions
❖ Dynamic thresholds
❖ Anomaly detection
❖ Standard deviation
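A threshold based on history rather than a static value can be sketched with a standard deviation test. The Python below is a minimal illustration under assumptions of my own: the function name `is_anomalous` and the default of 3 standard deviations are hypothetical choices, not something taken from the tools listed in this talk.

```python
from statistics import mean, stdev


def is_anomalous(history, current, n_sigma=3.0):
    """True when current lies more than n_sigma standard deviations from the historical mean."""
    if len(history) < 2:
        return False  # not enough history to estimate a deviation
    mu = mean(history)
    sigma = stdev(history)  # sample standard deviation
    if sigma == 0:
        return current != mu  # flat history: any change is a deviation
    return abs(current - mu) > n_sigma * sigma
```

With a history of `[100, 102, 98, 101, 99]`, a new value of 103 stays inside the band while 120 falls outside it. The threshold follows the data instead of being fixed in a configuration file.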
Coming from Nagios
❖ Graphios will inject perfdata into Graphite or InfluxDB
❖ Check_graphite can query the Graphite API from Nagios for alerts based on
history
❖ Logstash will send events to NSCA
❖ Nagios log in Kibana with Grok %{NAGIOSLINE}
❖ Keep Nagios for states ?
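The idea behind injecting perfdata into Graphite can be sketched in Python. This is not how Graphios itself is implemented: the function name `perfdata_to_graphite`, the metric prefix and the label sanitizing rule are hypothetical. The sketch only relies on the Nagios perfdata layout (`'label'=value[UOM];warn;crit;min;max`) and on Graphite's plaintext line format (`<path> <value> <timestamp>`).

```python
import re
import time

# Matches 'quoted label'=value or bare_label=value; UOM and thresholds are ignored.
PERF_RE = re.compile(r"(?:'([^']+)'|([^\s=']+))=(-?[0-9.]+)")


def perfdata_to_graphite(prefix, perfdata, timestamp=None):
    """Turn a Nagios perfdata string into Graphite plaintext lines."""
    ts = int(timestamp if timestamp is not None else time.time())
    lines = []
    for quoted, bare, value in PERF_RE.findall(perfdata):
        # Graphite uses dots as path separators, so keep labels to safe characters.
        label = re.sub(r"[^A-Za-z0-9_-]", "_", quoted or bare)
        lines.append(f"{prefix}.{label} {value} {ts}")
    return lines
```

Each returned line can then be written, newline-terminated, to a Carbon plaintext listener.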
Questions ?
@olivjan
ojan@monitoring-fr.org

OSMC 2014 | Time to say goodbye to your Nagios based setup? by Oliver Jan
