Monitoring nginx 
Alexis Lê-Quôc, Datadog 
@alq
Agenda 
• Dramatis personae 
• Observations 
• Monitoring 1 nginx (plus) with logs 
• Monitoring 1 nginx (plus) with metrics 
• Monitoring N nginx effectively
@alq 
CTO at Datadog
Datadog == monitoring 
• Monitoring as a service 
• Works really well with large, dynamic environments (e.g. clouds) 
• Aggregate performance metrics 
• Correlate nginx performance with the rest of your infrastructure
Monitoring NGINX (plus): key metrics and how-to
Observations 
From the field
Some stats 
• Across all monitored servers 
  • nginx ~10% 
  • Apache ~5% 
• CPU (and CPU/$) is the dominant resource
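% of instances per core count 
[Bar chart. x-axis: core count (1, 2, 4, 8, 12, 16, 24, 32); y-axis: % of instances (0% to 40%). Bar labels as extracted, order not preserved: 10%, 3%, 1%, 10%, 30%, 7%, 39%, 10%.]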
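% of instances per type (AWS only) 
[Bar chart. x-axis: EC2 type (c3.l, c3.2xl, c1.xl, c3.8xl, m3.l, c3.xl, m3.m, cc2.8xl, t2.m, c3.4xl, rest); y-axis: % of instances (0% to 30%). Bar labels as extracted, order not preserved: 8.6%, 3.1%, 5%, 4.7%, 4.5%, 4.4%, 5.3%, 7.6%, 13%, 14%, 30%.]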
Monitoring nginx 
1. Monitoring with logs 
2. Monitoring with status 
3. Monitoring with statsd
Monitoring with logs 
nginx → log forwarder → indexer → UI 
• Canonical example of log indexers 
• Your choice of: 
• logstash 
• splunk 
• logentries, sumologic, loggly, etc.
Monitoring with logs 
nginx → log forwarder → indexer → UI 
Strengths 
• forensics & anomalies 
• content-driven analysis 
Weaknesses 
• low signal-to-noise ratio 
• “black box”
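
As a rough illustration of the content-driven analysis that logs allow, here is a minimal Python sketch that counts responses per status class. It assumes the access log uses nginx's default "combined" format; the sample lines and the `status_classes` helper are made up for this example and stand in for what a real forwarder and indexer would do at scale.

```python
import re
from collections import Counter

# Matches nginx's default "combined" access log format:
# $remote_addr - $remote_user [$time_local] "$request" $status
# $body_bytes_sent "$http_referer" "$http_user_agent"
COMBINED = re.compile(
    r'(?P<addr>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def status_classes(lines):
    """Count responses per status class (2xx, 4xx, ...) in access log lines."""
    counts = Counter()
    for line in lines:
        match = COMBINED.match(line)
        if match is None:
            continue  # skip lines that do not follow the combined format
        counts[match.group("status")[0] + "xx"] += 1
    return counts


# Hypothetical sample lines, invented for illustration only.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2014:13:55:36 +0000] "GET / HTTP/1.1" 200 612 "-" "curl/7.35.0"',
    '203.0.113.8 - - [10/Oct/2014:13:55:37 +0000] "GET /missing HTTP/1.1" 404 169 "-" "curl/7.35.0"',
    '203.0.113.9 - - [10/Oct/2014:13:55:38 +0000] "POST /api HTTP/1.1" 502 172 "-" "curl/7.35.0"',
]

if __name__ == "__main__":
    print(dict(status_classes(SAMPLE)))
```

Every request produces a line, which is where both the forensic value and the low signal-to-noise ratio come from.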
Monitoring with metrics 
nginx (status) → collector → aggregator → UI/alerts 
• open-source: ngx_http_stub_status_module 
  • bare-bones metrics 
  • human-readable text presentation 
• plus: ngx_http_status_module 
  • a lot more metrics for each function 
  • JSON format 
• Your choice of… 
  • Datadog, Nagios, Zabbix, etc. for open-source 
  • Datadog for nginx plus
Monitoring with metrics 
nginx (status) → collector → aggregator → UI/alerts 
Strengths 
• lightweight & real-time 
• “white box” 
Weaknesses 
• no insight into content
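
To make the collector step concrete, here is a minimal Python sketch that parses the human-readable text produced by ngx_http_stub_status_module into numbers. The sample text follows the layout shown in the module's documentation; the `parse_stub_status` helper is a name chosen for this example, and how the text is fetched (URL, access control) is left out because it depends on your configuration.

```python
import re

# Layout of the text emitted by ngx_http_stub_status_module.
STUB_STATUS = re.compile(
    r"Active connections:\s*(?P<active>\d+)\s+"
    r"server accepts handled requests\s+"
    r"(?P<accepts>\d+)\s+(?P<handled>\d+)\s+(?P<requests>\d+)\s+"
    r"Reading:\s*(?P<reading>\d+)\s+"
    r"Writing:\s*(?P<writing>\d+)\s+"
    r"Waiting:\s*(?P<waiting>\d+)"
)


def parse_stub_status(text):
    """Return the stub_status fields as a dict of integers."""
    match = STUB_STATUS.search(text)
    if match is None:
        raise ValueError("unrecognized stub_status output")
    return {name: int(value) for name, value in match.groupdict().items()}


# Sample output in the documented layout.
SAMPLE = (
    "Active connections: 291\n"
    "server accepts handled requests\n"
    " 16630948 16630948 31070465\n"
    "Reading: 6 Writing: 179 Waiting: 106\n"
)

if __name__ == "__main__":
    print(parse_stub_status(SAMPLE))
```

The active, reading, writing and waiting values are gauges; accepts, handled and requests are counters that only become useful once compared across samples.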
Simple metrics taxonomy 
1. What it measures 
• Work or resource 
• Focus on work because work == value 
• Resource analysis useful to understand performance 
• Use Brendan Gregg’s USE 
• Utilization (% over time) 
• Saturation (queue length) 
• Errors (count over time) 
2. Type 
• Gauge: sample 
• Counter: accumulated sample, needs to be differentiated (converted to a rate) to be meaningful 
http://www.brendangregg.com/usemethod.html
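
The difference between a gauge and a counter is easiest to see in code. The sketch below turns two samples of a counter into a per-second rate; the `rate` function and its handling of a counter that goes backwards (treated here as a restart, with no rate reported) are assumptions made for this example, not the behavior of any particular collector.

```python
def rate(prev_value, prev_time, curr_value, curr_time):
    """Turn two samples of an ever-increasing counter into a per-second rate.

    Times are in seconds. Returns None when the counter went backwards,
    which most likely means the server restarted between the two samples.
    """
    elapsed = curr_time - prev_time
    if elapsed <= 0:
        raise ValueError("samples must be in increasing time order")
    delta = curr_value - prev_value
    if delta < 0:
        return None  # counter reset: no meaningful rate for this interval
    return delta / elapsed


if __name__ == "__main__":
    # 600 more requests over 60 seconds is 10 requests per second.
    print(rate(1000, 0.0, 1600, 60.0))
```

A gauge, by contrast, can be plotted as sampled with no further processing.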
Open-source metrics 
Class                | Type    | Resource/Work | Notes 
Current connections  | Gauge   | Resource      | reading, writing, idle 
Accepted connections | Counter | Resource      | 
Handled connections  | Counter | Resource      | <= accepted; lower only when a resource limit is hit 
Requests             | Counter | Work          | true purpose of the server 
• Latency must be measured using logs or statsd.
Key “plus” metrics 
Class                    | Type    | Resource/Work | Notes 
5xx errors               | Counter | Work          | without log analysis 
5xx/sum(Nxx)             | Gauge   | Work          | error rate % 
idle/dropped connections | Gauge   | Resource      | saturation 
active/total connections | Gauge   | Resource      | upstream capacity 
Requests                 | Counter | Work          | true purpose of the server 
• Latency must be measured using logs or statsd.
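
The 5xx/sum(Nxx) error rate can be sketched as follows. Because the per-class response counts are counters, the rate is computed on the difference between two samples. The dictionary layout, the sample numbers and the `error_rate` name are assumptions made for this example; they are modeled loosely on per-class response counters and are not the exact JSON emitted by ngx_http_status_module.

```python
CLASSES = ("1xx", "2xx", "3xx", "4xx", "5xx")


def error_rate(prev, curr):
    """Percentage of 5xx responses among all responses between two samples.

    prev and curr map each status class to its cumulative response count.
    Returns None if any counter went backwards (e.g. after a restart).
    """
    deltas = {cls: curr[cls] - prev[cls] for cls in CLASSES}
    if any(delta < 0 for delta in deltas.values()):
        return None
    total = sum(deltas.values())
    if total == 0:
        return 0.0  # no traffic in the interval, so no errors either
    return 100.0 * deltas["5xx"] / total


# Hypothetical cumulative counts taken at two consecutive polls.
PREV = {"1xx": 0, "2xx": 9000, "3xx": 500, "4xx": 400, "5xx": 100}
CURR = {"1xx": 0, "2xx": 9900, "3xx": 540, "4xx": 410, "5xx": 150}

if __name__ == "__main__":
    print(error_rate(PREV, CURR))
```

Computing the ratio on deltas rather than on lifetime totals keeps the gauge responsive: a burst of errors is not diluted by hours of healthy traffic.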
Monitoring with statsd 
nginx → statsd → UI/alerts 
Strengths 
• lightweight, real-time, standard 
• custom metrics, content-aware 
Weaknesses 
• not comprehensive 
https://github.com/zebrafishlabs/nginx-statsd
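
For readers unfamiliar with statsd, the following Python sketch shows what travels over the wire: one short text line per metric, sent over UDP. It illustrates the statsd line protocol only and is not how the nginx-statsd module is configured. The metric names are hypothetical, and the host and port (8125, the usual statsd default) are assumptions.

```python
import socket


def statsd_line(name, value, kind, sample_rate=1.0):
    """Format one metric in the statsd line protocol.

    kind is "c" for a counter, "ms" for a timing or "g" for a gauge.
    """
    line = f"{name}:{value}|{kind}"
    if sample_rate < 1.0:
        line += f"|@{sample_rate}"  # tells the server the metric is sampled
    return line


def send(line, host="127.0.0.1", port=8125):
    """Send one metric line to a statsd server over UDP (fire and forget)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("ascii"), (host, port))


if __name__ == "__main__":
    # Hypothetical metric names, chosen for illustration.
    print(statsd_line("nginx.requests", 1, "c"))
    print(statsd_line("nginx.response_time", 42, "ms", sample_rate=0.1))
```

Because the metric name and value are chosen per request, this is where custom, content-aware metrics such as latency per endpoint can come from.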
Example
Monitoring nginx 
1. Logs for content analysis (forensics, anomalies, marketing) 
2. Status for (white box) performance monitoring 
3. statsd for custom metrics 
No single method gives you everything you need.
Monitoring a lot of nginx 
1. Requires aggregation 
2. It’s all about metadata (“pet-to-cattle” mindset) 
3. Correlation
Aggregation 
• By default for log-based monitoring 
• Not by default for metric-based monitoring
Metadata 
• Analyze by properties that are not the host identity 
• Find anomalies that are not obvious 
• Pet-to-cattle evolution: hosts don’t matter, services do
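
A minimal sketch of aggregation by metadata: per-host samples are summed by a tag instead of by host name. The sample data, tag names and the `aggregate` helper are all hypothetical; a real monitoring system would also handle time alignment and aggregations other than a sum.

```python
from collections import defaultdict


def aggregate(samples, tag):
    """Sum per-host metric values by the given tag instead of by host."""
    totals = defaultdict(float)
    for sample in samples:
        key = sample["tags"].get(tag, "untagged")
        totals[key] += sample["value"]
    return dict(totals)


# Hypothetical requests-per-second samples, one per nginx host.
SAMPLES = [
    {"host": "i-0a1", "tags": {"service": "api", "zone": "us-east-1a"}, "value": 120.0},
    {"host": "i-0b2", "tags": {"service": "api", "zone": "us-east-1b"}, "value": 80.0},
    {"host": "i-0c3", "tags": {"service": "static", "zone": "us-east-1a"}, "value": 300.0},
]

if __name__ == "__main__":
    print(aggregate(SAMPLES, "service"))
    print(aggregate(SAMPLES, "zone"))
```

The same samples answer different questions depending on the tag chosen, and none of them require knowing which host served the traffic.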
Correlation 
• nginx is only one piece of the infrastructure
#plug 
www.datadog.com
Thank you! 
Questions/Comments? @alq
