Mattias Schlenker
Technical Writer - Team Knowledge
Building a better check_http
Re-thinking the most important
active check
OSMC 2024
2
Who am I and why am I here with you?
Long-term tech journo
⬢ Once upon a time… Data Becker, Heise and others…
⬢ You might know me from Desinfec't (c't magazine)
My job at Checkmk
⬢ Working on The Official Checkmk User Guide
⬢ Early access to new features ('Can this be documented?')
⬢ Product communication in general (blogs, articles, conventions…)
Why am I here
⬢ Convincing you to try out the new check_httpv2 with your monitoring solution
⬢ Ultimately helping you decide which hosts to migrate soon, which later and which never
3
Agenda of today
Our starting point
Decision
Practical use
check_cert
Why and how did we decide for a new check_http?
4
Our starting point
Decision
Practical use
check_cert
Let's get started…
5
At Checkmk most development
is customer driven
Customers complained about bugs, missing
features and annoyances like...
⬢ … more than one thing to check usually
means more than one call
⬢ … problems with chunked content
⬢ … limited range of login options
⬢ … opaque priority handling of command line
arguments
⬢ ... SSL handling needed improvement
In general: users expect an HTTP check to
behave like a browser
6
Should we have worked on the existing check_http?
⬢ A large number of pull requests could have overwhelmed the maintainers
⬢ New dependencies would have made check_http harder to build
⬢ Changes to how CLI options are interpreted could have broken existing configurations
⬢ "Bug compatibility" is an issue
Conclusion
⬢ Time needed
⬢ IFDEFs needed (probably many)
⬢ Diplomacy needed to introduce new
paradigms
7
Our starting point
Decision
Practical use
check_cert
Decision: make it new!
We decided on a
complete re-implementation.
8
Which fighter would you have chosen?
C/C, Python or Rust?
9
10
C/C, Python or Rust?
+ Integration of our new check into monitoring-plugins.org collection would have been
possible
- Hard to maintain
+ Possibility to move from active check to Checkmk special agent
- Would have broken many debugging possibilities
- Harder migration
- Performance
- "Checkmk only" would have been very likely
+ Easy to debug, extend and maintain
+ Easy migration
+ Compatible to other monitoring software that uses Nagios call/return syntax
+ Performance
- Slightly less optimal integration with Checkmk
C/C
Python
Rust
Rust won the race!
11
12
It's open source and will work with Icinga, Zabbix,
Nagios, Naemon…
13
Our starting point
Decision
Practical use
check_cert
14
That's the theory and the background
Now let's get practical - Build it…
# Dependencies
apt install git curl
apt install build-essential libssl-dev pkg-config
# The Rust compiler
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
. "$HOME/.cargo/env"
# Get the sources
git clone https://github.com/Checkmk/checkmk
# Use the time to get a coffee!
# Check out 2.3.0 from inside the clone; omit the checkout if you are brave:
cd checkmk
git checkout 2.3.0
# Build...
cd packages/check-http/
cargo build --release # Get another coffee...
15
And run it in the simplest way possible
# ...and test:
ls -lah target/release/
time ./target/release/check_http --url https://docs.checkmk.com/
echo $?
You might now install it to a matching location,
for example as check_httpv2 to the monitoring-plugins directory...
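The `echo $?` in the test run prints the exit code of the check. Since the new check keeps the Nagios call/return syntax, the code should follow the usual plugin convention: 0 is OK, 1 is WARNING, 2 is CRITICAL, 3 is UNKNOWN. A minimal sketch of a helper that turns the number into a state name; `state_name` is a made-up name for illustration and not part of check_http or Checkmk:

```shell
#!/bin/sh
# Translate a Nagios-style plugin exit code into its state name.
# state_name is a hypothetical helper, not part of check_http.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        *) echo "UNKNOWN" ;;  # 3 by convention; any other value is treated the same here
    esac
}

# Example: translate a sample exit code of 2.
state_name 2
```

After a real run you would call `state_name $?` directly behind the check_http command instead of passing a fixed number.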
16
The biggest advantage:
Check several aspects within one call!
⬢ Response code 200
⬢ Response time 0.2 and 0.5s
⬢ Certificate validity 15 and 7 days
⬢ TLS version 1.3
check_httpv2 \
    --timeout 2 \
    --response-time-levels 0.2,1.0 \
    --min-tls-version tls13 \
    --certificate-levels 15,7 \
    --status-code 200 \
    --url https://docs.checkmk.com/
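If the same set of checks should apply to more than one URL, the call can be generated in a small loop. The sketch below only prints the command lines and does not run them, so it works without the binary installed. It reuses exactly the options from the call above; the function name `print_check_cmd` and the second URL are assumptions for illustration:

```shell
#!/bin/sh
# Print one check_httpv2 call per URL, reusing only the options shown above.
# print_check_cmd and the URL list are hypothetical; nothing is executed.
print_check_cmd() {
    printf 'check_httpv2 --timeout 2 --response-time-levels 0.2,1.0 --min-tls-version tls13 --certificate-levels 15,7 --status-code 200 --url %s\n' "$1"
}

for url in https://docs.checkmk.com/ https://checkmk.com/; do
    print_check_cmd "$url"
done
```

Printing first makes it easy to review the generated calls before pasting them into the configuration of whatever monitoring system is in use.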
17
Unsure how to build the command line?
Install Checkmk, tinker with the rules for this active check, and let it show you the command line!
20
Our starting point
Decision
Practical use
check_cert
21
Are you also that type of person …
… that just runs a minimal web server
on other servers
just to check certificates with check_http?
22
I was until...
...check_cert became available!
23
This is very similar to check_httpv2, ...
⬢ Rust source in packages/check-cert, builds the same way
⬢ Connects to any SSL-enabled service (StartTLS under consideration)
⬢ Even more detailed checks for SNI, subjects, issuer available...
⬢ Use your own CA store to work with internal CAs (see also
https://checkmk.io/cert-authority)
⬢ Check for expected algorithm
Use it where you do not have an HTTP server to connect to or for HTTP where the
certificate checks in check_httpv2 are not sufficient.
Checkmk GmbH
Kellerstraße 27
81667 München
Germany
Web — checkmk.com
mattias.schlenker@checkmk.com
feedback@checkmk.com
Questions? Get in touch
Thank you!
25
Fast track your time-to-resolution
⬢ 2,000+ plug-ins
⬢ Auto-discovery
⬢ Smart and granular alerts
⬢ Ticket auto-creation in ITSM tools
⬢ Out-of-the-box health assessment
⬢ Built-in time series graphs
⬢ Forecasting and capacity management
⬢ Built-in and custom dashboards
⬢ Root-cause analysis
⬢ Custom self-healing
26
Our mission: bringing visibility into your hybrid IT
Built to address your needs
27
Monitor Everything Highly Automated Hyper-Scalable Extensible
28
2,000+ well-maintained plug-ins
Monitor Everything
Data center, Cloud
Cloud services: Archives, Load Balancers, Firewall, Caches, DNS, Databases, Costs, VPN, Block storage, Containers, Object storage, Kubernetes, Functions, Notifications, Apps, CDN, Cloud Health, Virtual Machines
Networking: VPNs, Router, Switches, Firewalls
OS: Linux, Windows, MacOS, AIX, OpenVMS, Solaris, HPUX, IBM zOS, FreeBSD, OpenBSD, NetBSD
Virtualization: Hypervisors, Containers, Kubernetes
Applications: Web Apps, Databases, Application servers, Web servers, Message queues / caches, Business software
Hardware & Sensors: HVAC, Storage, Power supply, Server, Sensors, Security
29
Highly automated to monitor at scale
Auto-register workloads
⬢ Asset starts (VM, pod, …)
⬢ Asset auto-registers
⬢ Monitoring services & metrics automatically added
⬢ Asset is terminated
⬢ Monitoring objects and metrics can automatically be removed from monitoring
Auto-discover services
Metrics dashboards
Application dashboards
Automated alerting
⬢ Messaging: Webex Teams, Slack, Mattermost, SMS, MS Teams, Email
⬢ ITSM: Splunk On-Call, Ops Genie, Jira, ServiceNow, PagerDuty
REST API: Build your fully automated monitoring
⬢ Automate entire monitoring configuration & operation
⬢ Leverage auto-generated documentation with code examples and Ansible playbooks
Hyper-scalable distributed set-ups
Scale vertically: 100k+ services per instance
Scale horizontally: with massively distributed set-ups
Customer examples
Instances   Hosts   Services
170         128k    6.1m
12          108k    3.7m
36
Extensible open-source monitoring
⬢ Majority of code base open source
⬢ Easily readable and modifiable Python code
⬢ Developer APIs for writing monitoring integrations
⬢ Built-in logic to handle customized code
⬢ Large partner ecosystem for customizations
Build your own integrations
with simple scripts extending agents or
by writing entire plug-ins yourself
Extend existing integrations
to accommodate own requirements
Extensible
Checkmk Editions: Your use case, your edition
37
Raw
Free & open source IT monitoring
for mid-sized infrastructures.
Monitor your entire IT
⬡ Auto-discover your IT
⬡ Monitor out-of-the-box with
2000+ plug-ins
⬡ Auto-detect issues
and more
Support via community
Enterprise
Scalable and automated
enterprise-wide IT monitoring.
Everything in Raw, plus:
⬡ Speed up your monitoring
⬡ Scale up your monitoring
⬡ Automate your monitoring
⬡ Monitor dynamic workloads
⬡ Visualize your IT
and much more
Enterprise-grade support
Cloud
State-of-the-art IT monitoring for
cloud and hybrid infrastructures.
Everything in
Enterprise, plus:
⬡ Monitor cloud workloads
⬡ Auto-register any load
⬡ Push and pull agents
⬡ Visualize your cloud
⬡ Deploy from cloud
marketplaces
and much more
Enterprise-grade support
MSP
Monitor your customersʼ hybrid IT.
Designed for IT service providers.
Everything in Cloud,
plus:
⬡ Multi-customer
management & dashboards
⬡ Data segregation
⬡ Data loss protection, if
customer connections fail
⬡ White label branding
and much more
Enterprise-grade support
Why trust us? Because they do
38
Award-winning IT monitoring
#3 IT infrastructure #1 Recommended 4.7/5 Customer
Review
41
Award-winning IT monitoring
#3 IT infrastructure #1 2023 Summer 4.7/5 Customer
Review
42
43
The Checkmk Community
Where IT Monitoring experts meet
User forum
6,000+ users
10,000+ daily visits
Translations
6 languages
Integration exchange
500+ packages
GitHub
100+ contributors
Checkmk — The Company
160+ employees, privately held, debt free
Based in Munich, Germany, and Atlanta, USA
Focusing on IT monitoring for 15+ years
Open-source enthusiasts
# Customers
44
02 Monitoring coverage
Monitor everything
45
⬢ Native and lightweight agents for
Linux, Windows, and many more OS
⬢ Inventory of the hardware &
software assets of your servers
⬢ Built-in dashboards for Linux and
Windows servers
⬢ Integrated virtual server monitoring
for real-time monitoring of all major
virtualization platforms (VMware
ESXi, Hyper-V, Proxmox, Nutanix)
46
Streamlined troubleshooting for any server
Server monitoring
⬢ Monitor all network devices with just a
few clicks, thanks to plug-ins for
almost any vendor and device
⬢ Monitor bandwidth, error rates and
state on each port and receive alerts,
e.g., when throughput is too high
⬢ Powerful graphing helps identify
bandwidth-related peaks and patterns
⬢ Visualize your network topology and
understand which devices are talking
with each other*
⬢ Netflow monitoring via ntop
Holistic view from core to edge
Network monitoring
* requires layer 2 network topology data (e.g. CDP, LLDP) to be ingested by the user into Checkmk
Designed for enterprise cloud ecosystems
⬢ Full visibility across cloud and
on-premises, including OS-level and
network metrics
⬢ Monitor cloud applications and
infrastructure, including
auto-detection of new or deleted
cloud resources
⬢ Create custom dashboards for all
nodes, servers, and hosts
⬢ Manage system health and easily
trace errors across complex,
distributed architecture
Cloud monitoring
48
⬢ Monitor Kubernetes holistically,
including clusters, nodes, pods,
namespaces, deployments, etc.
⬢ Monitor unmanaged containers, e.g.,
Docker, Podman
⬢ Automatically adapts to dynamic,
ephemeral container infrastructure
⬢ Navigate through all the details, from
cluster down to pod level, thanks to
interconnected, built-in Kubernetes
dashboards
⬢ Intelligent alerting that takes K8sʼ
self-healing into account
Cut through complexity of dynamic infrastructures
Kubernetes and OpenShift monitoring
49
⬢ Performance and health monitoring for
on-premise and cloud databases
⬢ Monitor database attributes that are
critical to business operations
⬢ Implement standard queries for
automated health monitoring and quick
troubleshooting
⬢ Compatible with Oracle, MSSQL Server,
MySQL / MariaDB, SAP HANA,
PostgreSQL, Amazon RDS, Azure SQL
Databases, as well as other systems,
such as MongoDB, IBM DB2, IBM
Informix, etc.
Safeguard uptime of databases and their servers
Database monitoring
Note: Oracle Performance Monitoring not included off-the-shelf, but can easily be built
50
52
Monitor anything out-of-the-box…
Applications Networking
Virtualization OS Server, Storage, Sensors
53
… in any infrastructure: on-premises & cloud
Amazon Web Services: EC2, EBS, S3, Glacier, CloudWatch, Cost & Usage, App ELB, Network ELB, RDS, DynamoDB, CloudFront, Route53, WAF, ECS, EKS, Lambda, SNS, ElastiCache
Microsoft Azure: VM, RSV, Blob Storage, Storage Accounts, Resource Health, Status, Azure LB, MySQL, PostgreSQL, VPN Gateway, Traffic Manager, AKS, App Gateway, App Registrations, AD Connect
Google Cloud: CE, Cloud Storage, Filestore, Cost, GCP Health, GCP LB, Cloud SQL, Cloud Run, GKE, Functions, Memorystore
03 The monitoring workflow
Built-in monitoring power
54
55
The Monitoring Lifecycle.
Checkmk Is There For You.
Get Notifications
Customize Alerts
Auto-Discover
Services Of A Host
Monitor The
Services
⬢ Detection: 2,000
vendor-maintained plug-ins for
automated detection of hosts and
services
⬢ Configuration: auto-recognize
metrics of your devices and apply
pre-defined thresholds
⬢ Updating: automatically keep your
monitoring up-to-date and ensure
that no important metrics remain
uncovered
⬢ Management: automatically create
labels for operating systems, as well
as for cloud and container systems,
such as AWS, Azure, K8s
From zero to monitoring in ten minutes
Auto-discover services of a host
56
One-click drill down into
all performance metrics
Combine metrics and health
data into services
Quickly identify problems in your IT environment
through an easy to identify 'state'
Everything you need to know about the service,
dynamically created based on state
A comprehensive visualization
of the relevant metrics
57
All important information at a glance
Monitor services
58
Get more context with one-click drill-down
Everything you need to know about the service.
Dynamically created based on state
A comprehensive visualization of the relevant metrics
Monitor services
⬢ Benefit from industry expertise with built-in
thresholds to automatically generate alerts
⬢ Adapt alerting to your needs with granular
options, all easily configurable via the UI without
the need to learn yet another query language
⬢ Flexibly restrict alerts with ‘conditionsʼ to a
subset of your infrastructure with a wide range
of filters, from explicit hosts/services to dynamic
elements like tags and labels
59
Adapt built-in alerting to your needs
Customize alerts
Alert the right team at the right time
60
⬢ Leverage comprehensive rule-based
notifications to fulfill complex
enterprise requirements about time periods,
service levels, and many more
⬢ Notify the responsible team quickly, e.g.: notify
storage admins when a disk fails, but not the
network admins
⬢ Escalate problems if they are not handled in time
⬢ Handle alerts centrally, even in distributed
environments
⬢ Use the ‘Alert Handlerʼ to automatically trigger
script-based remedies, e.g., for self-healing
Get notifications
⬢ Out-of-the-box integrations with popular
ITSM and messaging tools
⬢ Receive notifications via Email, SMS, or your
messaging tool
⬢ Streamline your workflows by automatically
creating tickets in your project management
system
Integrate with almost any ITSM & messaging tool
ITSM
Webex teams
Slack
Mattermost SMS
MS Teams Email
Messaging
VictorOps
OpsGenie
Jira
ServiceNow PagerDuty
Get notifications
61
04 Editions
One mission – Four editions
62
Raw Enterprise Cloud MSP
Everything in Raw,
plus:
Everything in
Enterprise,
plus:
Everything in
Enterprise,
plus:
Monitoring
coverage
On-premises
infrastructure
(static)
Basic cloud services,
Kubernetes, OpenShift
Advanced cloud services
Performance
1,000+ hosts
Medium resource
efficiency
100,000+ hosts
Very high resource efficiency
Automation
Auto-discovery
RESTAPI
Automated agent
management
Automated host
registration
Visualization
Table views
Standard graphing
Customizable
dashboards & graphing
Cloud dashboards
Grafana Cloud support
Customer dashboards
White-labeling
Reporting Availability analysis PDF reporting, SLA reporting Customer reports
Analytics Trend prediction, advanced forecasting
Security /
Availability
Encrypted comms,
secure pull agent
2FA, SAML,
HA via appliance
Push agent for
secure networking
Customer data
segregation
Support Community support Enterprise support
64
MSP
Specially designed for IT service providers.
Achieve Service Level Agreements and proactively
detect issues within your customersʼ IT infrastructure.
Monitor your customersʼ IT infrastructure
⬡ Based on Checkmk Cloud
⬡ Multi-customer management & dashboards
⬡ Data security compliance with data segregation
⬡ Central configuration via web interface
⬡ Automated deployment, configuration, and
reporting
⬡ Minimal resource requirements
⬡ Data loss protection if customer connections fail
⬡ Alarms routing to service provider or customers
⬡ Available as White label solution
05 Support
65
                        Pro                 Advanced
Support contacts        3                   7
Support availability    8 hours x 5 days    10 hours x 5 days
Support hours           9am - 5pm CET       8am - 6pm CET
                        or 9am - 5pm ET     or 8am - 6pm ET
Response time
  Critical L1           best effort         4 hours
  Significant L2        best effort         8 hours
  Limited L3            best effort         next business day
  Minimal L4            best effort         2 business days
Access to consulting
Customer support
Two enterprise-grade support packages
66
67
Documentation
⬢ Step-by-step guides to set up a functional
monitoring system, both in written form and as
YouTube videos
⬢ Detailed explanations about all kinds of
operational tasks
⬢ Tips, tricks and best practices from our
experienced consultants
⬢ In-depth insights on complex topics and use
cases
User guides for all skill levels
540+ packages
68
Checkmk Community
6,000 Forum users
180 GitHub contributors
6 language packages
#CMKConf, Partner Day
Code Contributions
Checkmk Events Translations
800+ requests
Where IT Monitoring experts meet
06 Customer Stories
69
70
Large scale network monitoring for financial
services
IT service provider for German savings banks with 4,400 employees and four major data centers
Service several hundred banks with 112 million bank accounts plus many insurance companies
Challenge
Solution
Outcome
12 sites
100,000 hosts 3,500,000 services
Challenge
⬢ Need to monitor entire network
comprised of approx. 100,000
hosts
⬢ Diverse set of WAN, LAN, access
point and other equipment
⬢ Daily changes of the
infrastructure
⬢ Customer reporting and layer 2
network topology visualization
requirements
Outcome
⬢ Monitoring system in full operation
⬢ 50 concurrent users with large
scale increase planned over coming
months
⬢ First CA tool being replaced
Solution
⬢ Use the Checkmk Enterprise Edition to replace CA
Performance Management and CA Spectrum
⬢ Monitor 100,000 hosts and 3,500,000 services
across 12 monitoring sites
⬢ Use Checkmk Appliance Cluster Rack4+ in HA
mode as highly available monitoring system hardware
⬢ Work with Checkmk team to develop network
topology visualization (released in Checkmk 2.3)
⬢ Work with Checkmk partner to develop 50 customer
specific monitoring extensions
Automated monitoring of European networks
Challenge
Solution
Outcome
Challenge
⬢ Monitor all European networks
within and across its regional data
centers
⬢ Fully automated monitoring; a
state-of-the-art REST API is thus a
critical requirement
⬢ Very heterogeneous
infrastructure across data center
networks, global backbone
networks and similar
⬢ Massive scalability needs, require
a platform that can work globally
71
IONOS is the largest hosting company in Europe, partnering with small and medium-sized businesses
It manages more than 8 million customer contracts and hosts more than 12 million domains
12 sites
11,000 hosts 640,000 services
Outcome
⬢ Able to pinpoint difficult to identify
root causes from incidents, thus
reducing mean-time-to-resolution
Solution
⬢ Employ Checkmk Enterprise Edition, HA set-up
⬢ Currently monitor 11,000 hosts and 640,000
services across 12 sites, further expansion planned
⬢ Deployment and updating of monitoring is highly
automated and scalable, using the Checkmk
RESTAPI
⬢ Visualization of incidents shows the OP Center
where problems occur
Monitoring the banking systems of 170 banks
Largest telecommunications company in Switzerland, also one of the leading MSPs in the region
Over 170 banks place their trust in Swisscomʼs banking services
Challenge
Solution
Outcome
20 sites
1,800 hosts 175,000 services
Challenge
⬢ Banking systems – Operate
customers' Centralized Real-Time
Exchange Banking Systems (CBS)
⬢ Regulatory requirements –
operate on-premises in Switzerland
⬢ Consolidate monitoring tools –
heterogeneous legacy landscape
⬢ Managed services environment –
separation of customer data
⬢ Comprehensive scalability – large
scale Oracle databases, growing
customer base
Outcome
⬢ With Checkmk Swisscom banking
ensures high performance of the
banking systems of 170 banks
⬢ "With Checkmk we can provide
real-time insight. Among other
things, we can show the customer
exactly how his applications are
performing." (Daniel Röttgermann,
Swisscom Banking)
Solution
⬢ Swisscom employs the Checkmk Managed
Services Edition
⬢ Monitors more than 1,800 hosts and 175,000
services in over 20 Checkmk instances
⬢ Multi-client capability allows precise control of
access rights for internal and external users
⬢ Swisscom & Checkmk partnered to enhance the
Oracle monitoring, which Swisscom further
expanded
⬢ Offers a new service thanks to Checkmk: precise
forecasting of future resource utilization
72
07 Additional Features
Built-in monitoring power
73
⬢ Map application dependencies from a
single overview
⬢ View availability and performance of
complex systems at a glance
⬢ Aggregate various services and hosts
into a single state
⬢ Review historical states to determine
the root cause of degraded
performance
⬢ Simulate worst case scenarios in real
time, analyzing the impact of failing
hosts to determine areas of operational
weakness
A birdʼs-eye view on key processes health
Business Intelligence
74
⬢ Combine metrics and log data for
fast problem identification and root
cause analysis
⬢ Filter and forward events, triggering
scripts or generating notifications
⬢ Collapse duplicate entries into a
single event (e.g., several failed user
logins) to prevent operator overload
⬢ Filter incoming messages to only
show important events and avoid
overload
Efficient processing and analysis of logs
Log & event monitoring
75
Identify all assets in your IT
⬢ Automatically identify and inventory all hardware and
software on your hosts
⬢ Integrate regularly updated data from monitoring
services, such as CPU utilization
⬢ Proactively track changes to hardware and software
⬢ Identify servers that have not yet had a specific
service pack installed
⬢ Import the data into CMDBs and keep them
up to date
Hardware and software inventory
76
⬢ Generate branded PDF reports
containing pre-built or custom views
– either on-demand, or automated at
regular intervals
⬢ Review the history of states over any
desired timeframe with a single click,
and compute availability metrics in
real time
⬢ Monitor the compliance of complex
SLAs – even if the time unit is
measured in hours
Generate custom reports automatically
Reporting
77
78
⬢ Analyze historical data, predict trends, forecast
resource utilization and avoid unforeseen
surprises
⬢ Use sophisticated predictive monitoring
algorithms to dynamically adapt thresholds
based on historical events
⬢ Make capacity management a breeze with
forecasting that takes into account one-off
effects or seasonal factors
Prevent bottlenecks and failures
Forecasting & Capacity Mgmt
Visualize your data with customizable dashboards
⬢ Get full visibility on the state of your IT, thanks to
Checkmk's modern and customizable dashboards
⬢ Out-of-the-box dashboards provide key metrics for
AWS and Azure cloud environments, Linux and
Windows servers, and Kubernetes clusters
⬢ Leverage graphic maps and diagrams with live
monitoring data
⬢ Analyze time-series metrics over long time horizons
with interactive HTML5 graphs
⬢ Customize dashboards and views to your specific
needs with different dashboard elements to visualize
your most important metrics
Dashboards
⬢ Automates configuration and
operations
⬢ Built on REST API best practices
⬢ Provides auto-generated
documentation with
code examples
REST API
Automate your monitoring with the REST API
80
Appliance
Operate Checkmk within
your existing
virtualization platform or
in a dedicated appliance
– already preconfigured
for a quick start
Deployment
81
Native Linux
Debian, Ubuntu, RedHat
Enterprise Linux, SLES -
you choose your
preferred platform
Container
Easily deploy Checkmk
as a container image
from our registry into
your containerized
infrastructure
Cloud Marketplaces
Install an image of the
Checkmk Cloud,
including all required
dependencies, from
AWS and Azure
marketplaces
Many options for maximum flexibility

More Related Content

PDF
OSMC 2023 | Newest developments in Checkmk Raw – the open-source monitoring s...
PDF
Compliance as Code Everywhere
PDF
DevOps Case Studies
PPTX
System Center Operations Manager (SCOM) 2007 R2 & Non Microsoft Monitoring
PPTX
Intro to Puppet Enterprise for a Windows Environment - 08.23
PDF
Prometheus and Docker (Docker Galway, November 2015)
PPTX
PDF
What is this DevOps thing and why do I need it?
OSMC 2023 | Newest developments in Checkmk Raw – the open-source monitoring s...
Compliance as Code Everywhere
DevOps Case Studies
System Center Operations Manager (SCOM) 2007 R2 & Non Microsoft Monitoring
Intro to Puppet Enterprise for a Windows Environment - 08.23
Prometheus and Docker (Docker Galway, November 2015)
What is this DevOps thing and why do I need it?

Similar to OSMC 2024 | Building a better check_http by Mattias Schlenker.pdf (20)

PDF
Quick wins in the NetOps Journey by Vincent Boon, Opengear
PDF
AWS live hack: Atlassian + Snyk OSS on AWS
PDF
An Introduction to Microservices
PDF
The 5 elements of IoT security
PDF
DevSec Delight with Compliance as Code - Matt Ray - AgileNZ 2017
PPTX
ANIn Kolkata August 2022 | DevOps in daily Life by Mohana Chattopadhyay
PDF
Mastering the move
PDF
Choosing a Citrix Monitoring Strategy: Key Capabilities and Pitfalls to Avoid
PPTX
Ship code like a keptn
PDF
Iot in-production
PPTX
Cloud Platform Symantec Meetup Nov 2014
PPTX
Q Con New York 2015 Presentation - Conjur
PPTX
Intro to Puppet Enterprise Webinar 07.27.2017
PDF
DevOpsDays Singapore - Continuous Auditing with Compliance as Code
PDF
CertsOut Checkpoint-156-587 exam dumps pdf
PDF
Tracking license compliance made easy - intro to Grant (OSS)
PDF
Cncf checkov and bridgecrew
PDF
Enterprise-Grade DevOps Solutions for a Start Up Budget
PDF
Kubernetes Security Best Practices - With tips for the CKS exam
PPTX
measuring and monitoring client side performance / Nir Nahum
Quick wins in the NetOps Journey by Vincent Boon, Opengear
AWS live hack: Atlassian + Snyk OSS on AWS
An Introduction to Microservices
The 5 elements of IoT security
DevSec Delight with Compliance as Code - Matt Ray - AgileNZ 2017
ANIn Kolkata August 2022 | DevOps in daily Life by Mohana Chattopadhyay
Mastering the move
Choosing a Citrix Monitoring Strategy: Key Capabilities and Pitfalls to Avoid
Ship code like a keptn
Iot in-production
Cloud Platform Symantec Meetup Nov 2014
Q Con New York 2015 Presentation - Conjur
Intro to Puppet Enterprise Webinar 07.27.2017
DevOpsDays Singapore - Continuous Auditing with Compliance as Code
CertsOut Checkpoint-156-587 exam dumps pdf
Tracking license compliance made easy - intro to Grant (OSS)
Cncf checkov and bridgecrew
Enterprise-Grade DevOps Solutions for a Start Up Budget
Kubernetes Security Best Practices - With tips for the CKS exam
measuring and monitoring client side performance / Nir Nahum
Ad

Recently uploaded (20)

PPTX
Human Mind & its character Characteristics
PDF
Parts of Speech Prepositions Presentation in Colorful Cute Style_20250724_230...
PPTX
Emphasizing It's Not The End 08 06 2025.pptx
PPTX
BIOLOGY TISSUE PPT CLASS 9 PROJECT PUBLIC
PDF
oil_refinery_presentation_v1 sllfmfls.pdf
PDF
Swiggy’s Playbook: UX, Logistics & Monetization
PPTX
Presentation for DGJV QMS (PQP)_12.03.2025.pptx
PDF
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
PPTX
Non-Verbal-Communication .mh.pdf_110245_compressed.pptx
PPTX
Effective_Handling_Information_Presentation.pptx
PPTX
The Effect of Human Resource Management Practice on Organizational Performanc...
PPTX
2025-08-10 Joseph 02 (shared slides).pptx
PPTX
_ISO_Presentation_ISO 9001 and 45001.pptx
PDF
Instagram's Product Secrets Unveiled with this PPT
PPTX
Hydrogel Based delivery Cancer Treatment
PPTX
Relationship Management Presentation In Banking.pptx
PPTX
An Unlikely Response 08 10 2025.pptx
PPTX
Tour Presentation Educational Activity.pptx
DOCX
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
PPT
First Aid Training Presentation Slides.ppt
Human Mind & its character Characteristics
Parts of Speech Prepositions Presentation in Colorful Cute Style_20250724_230...
Emphasizing It's Not The End 08 06 2025.pptx
BIOLOGY TISSUE PPT CLASS 9 PROJECT PUBLIC
oil_refinery_presentation_v1 sllfmfls.pdf
Swiggy’s Playbook: UX, Logistics & Monetization
Presentation for DGJV QMS (PQP)_12.03.2025.pptx
Nykaa-Strategy-Case-Fixing-Retention-UX-and-D2C-Engagement (1).pdf
Non-Verbal-Communication .mh.pdf_110245_compressed.pptx
Effective_Handling_Information_Presentation.pptx
The Effect of Human Resource Management Practice on Organizational Performanc...
2025-08-10 Joseph 02 (shared slides).pptx
_ISO_Presentation_ISO 9001 and 45001.pptx
Instagram's Product Secrets Unveiled with this PPT
Hydrogel Based delivery Cancer Treatment
Relationship Management Presentation In Banking.pptx
An Unlikely Response 08 10 2025.pptx
Tour Presentation Educational Activity.pptx
ENGLISH PROJECT FOR BINOD BIHARI MAHTO KOYLANCHAL UNIVERSITY
First Aid Training Presentation Slides.ppt
Ad

OSMC 2024 | Building a better check_http by Mattias Schlenker.pdf

  • 1. Mattias Schlenker Technical Writer - Team Knowledge Building a better check_http Re-thinking the most important active check OSMC 2024
  • 2. 2 Who am I and why am I here with you? Long term tech journo My job at Checkmk Why am I here ⬢ Once upon a time… Data Becker, Heise and others… ⬢ You might know me from Desinfecʼt (cʼt magazine) ⬢ Working on The Official Checkmk User Guide ⬢ Early access to new to features (‘Can this be documented?ʼ) ⬢ Product communication in general (blogs, articles, conventions…) ⬢ Convincing you to try out the new check_httpv2 with your monitoring solution ⬢ Ultimately helping you with the decision which hosts to migrate soon, which later and which never
  • 3. 3 Agenda of today Our starting point Decision Practical use check_cert Why and how did we decide for a new check_http?
  • 4. 4 Our starting point Decision Practical use check_cert Letʼs start into it…
  • 5. At Checkmk most development is customer driven. Customers complained about bugs, missing features and annoyances like: ⬢ more than one thing to check usually means more than one call ⬢ problems with chunked content ⬢ a limited range of login options ⬢ opaque priority handling of command line arguments ⬢ SSL handling that needed improvement. In general: users expect an HTTP check to behave like a browser.
  • 6. Should we have worked on the existing check_http? ⬢ Very many pull requests could have overwhelmed maintainers ⬢ New dependencies would have made check_http harder to build ⬢ Changes to how CLI options are interpreted could have broken existing configurations ⬢ "Bug compatibility" is an issue. Conclusion: ⬢ Time needed ⬢ IFDEFs needed (probably many) ⬢ Diplomacy needed to introduce new paradigms
  • 7. Decision: make it new!
  • 8. We decided on a complete re-implementation.
  • 9. Which fighter would you have chosen? C/C++, Python or Rust?
  • 10. C/C++, Python or Rust? C/C++: + Integration of our new check into the monitoring-plugins.org collection would have been possible - Hard to maintain. Python: + Possibility to move from active check to Checkmk special agent - Would have broken many debugging possibilities - Harder migration - Performance - "Checkmk only" would have been very likely. Rust: + Easy to debug, extend and maintain + Easy migration + Compatible with other monitoring software that uses Nagios call/return syntax + Performance - Slightly less optimal integration with Checkmk
  • 11. Rust won the race!
  • 12. It's open source and will work with Icinga, Zabbix, Nagios, Naemon…
  • 14. That's the theory and the background. Now let's get practical: build it…
        # Dependencies (run as root or with sudo)
        apt install git curl
        apt install build-essential libssl-dev pkg-config
        # The Rust compiler
        curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
        . "$HOME/.cargo/env"
        # Get the sources. Use the time to get a coffee!
        git clone https://github.com/Checkmk/checkmk
        cd checkmk
        # Omit this if you are brave:
        git checkout 2.3.0
        # Build... and get another coffee
        cd packages/check-http/
        cargo build --release
  • 15. And run it in the simplest way possible
        # ...and test:
        ls -lah target/release/
        time ./target/release/check_http --url https://docs.checkmk.com/
        echo $?
    You might now install it to a matching location, for example as check_httpv2 in the monitoring-plugins directory...
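
The `echo $?` on this slide prints the plugin's exit code. A minimal sketch of how to read it, assuming the Nagios-style return convention the talk refers to (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN). The helper name `state_name` is made up for illustration and is not part of check_http:

```shell
#!/bin/sh
# Map a Nagios-style plugin exit code to its conventional state name.
# state_name is a hypothetical helper, not something shipped with the plugin.
state_name() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "WARNING" ;;
        2) echo "CRITICAL" ;;
        3) echo "UNKNOWN" ;;
        *) echo "UNKNOWN (non-standard exit code $1)" ;;
    esac
}

# Possible usage right after running the freshly built binary:
#   ./target/release/check_http --url https://docs.checkmk.com/
#   state_name $?
```

Because only the exit code and the output line matter, any monitoring core that follows this convention can schedule the binary unchanged.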
  • 16. The biggest advantage: check several aspects within one call! ⬢ Response code 200 ⬢ Response time 0.2 and 0.5s ⬢ Certificate validity 15 and 7 days ⬢ TLS version 1.3
        check_httpv2 --timeout 2 --response-time-levels 0.2,1.0 --min-tls-version tls13 --certificate-levels 15,7 --status-code 200 --url https://docs.checkmk.com/
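
How a pair of levels turns into a state can be sketched in plain shell. This illustrates the usual warn/crit convention and is not check_httpv2's actual code: the function names are hypothetical, response times are given in milliseconds to stay within integer arithmetic, and both the boundary handling (at-or-above, at-or-below) and the "worst state wins" aggregation are assumptions.

```shell
#!/bin/sh
# Upper levels: a value at or above a level is bad (e.g. response time in ms).
# Usage: upper_state VALUE WARN CRIT -> prints 0 (OK), 1 (WARN) or 2 (CRIT)
upper_state() {
    if [ "$1" -ge "$3" ]; then echo 2
    elif [ "$1" -ge "$2" ]; then echo 1
    else echo 0
    fi
}

# Lower levels: a value at or below a level is bad
# (e.g. remaining days of certificate validity).
lower_state() {
    if [ "$1" -le "$3" ]; then echo 2
    elif [ "$1" -le "$2" ]; then echo 1
    else echo 0
    fi
}

# One call, several aspects: report the worst of the individual states.
worst_state() {
    worst=0
    for state in "$@"; do
        if [ "$state" -gt "$worst" ]; then worst=$state; fi
    done
    echo "$worst"
}

# 350 ms against levels 200/1000 ms, 30 days left against levels 15/7 days
worst_state "$(upper_state 350 200 1000)" "$(lower_state 30 15 7)"
```

With these inputs the response time is in the warning range and the certificate is fine, so the sketch prints 1.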
  • 17. Unsure how to build the command line? Install Checkmk, tinker with the rules for this active check, and let it show you the command line!
  • 21. Are you also the type of person who runs a minimal web server on other servers just to check certificates with check_http?
  • 23. This is very similar to check_httpv2, ... ⬢ Rust source in packages/check-cert, builds the same way ⬢ Connects to any SSL enabled service (StartTLS under consideration) ⬢ Even more detailed checks for SNI, subjects, issuer available... ⬢ Use your own CA store to work with internal CAs (see also https://checkmk.io/cert-authority) ⬢ Check for expected algorithm. Use it where you do not have an HTTP server to connect to or for HTTP where the certificate checks in check_httpv2 are not sufficient.
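
The slide does not show check_cert's command line, so the following sketch uses the openssl CLI instead to illustrate the kind of data such a check inspects on a non-HTTP TLS service (subject, issuer, expiry date). The function names and the example host and port are placeholders, and this is not how check_cert is implemented.

```shell
#!/bin/sh
# Print subject, issuer and expiry date of a PEM certificate read from stdin.
cert_summary() {
    openssl x509 -noout -subject -issuer -enddate
}

# Fetch the certificate presented by any TLS service (no HTTP server required)
# and summarize it. The host name is also sent as SNI via -servername.
remote_cert_summary() {
    host="$1"
    port="$2"
    openssl s_client -connect "$host:$port" -servername "$host" </dev/null 2>/dev/null \
        | cert_summary
}

# Hypothetical usage against an IMAPS server:
#   remote_cert_summary mail.example.com 993
```

A dedicated plugin adds what this sketch lacks: thresholds, a proper exit code, and a single output line for the monitoring core.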
  • 24. Thank you! Questions? Get in touch: mattias.schlenker@checkmk.com, feedback@checkmk.com. Checkmk GmbH, Kellerstraße 27, 81667 München, Germany. Web: checkmk.com
  • 25. Fast track your time-to-resolution ⬢ 2,000+ plug-ins ⬢ Auto-discovery ⬢ Smart and granular alerts ⬢ Ticket auto-creation in ITSM tools ⬢ Out-of-the-box health assessment ⬢ Built-in time series graphs ⬢ Forecasting and capacity management ⬢ Built-in and custom dashboards ⬢ Root-cause analysis ⬢ Custom self-healing
  • 26. Our mission: bringing visibility into your hybrid IT
  • 27. Built to address your needs: Monitor Everything ⬢ Highly Automated ⬢ Hyper-Scalable ⬢ Extensible
  • 28. Monitor Everything: 2,000+ well-maintained plug-ins for data center and cloud. Cloud services: Archives, Load Balancers, Firewall, Caches, DNS, Databases, Costs, VPN, Block storage, Containers, Object storage, Kubernetes, Functions, Notifications, Apps, CDN, Cloud Health, Virtual Machines. Networking: VPNs, Router, Switches, Firewalls. OS: Linux, Windows, MacOS, AIX, OpenVMS, Solaris, HPUX, IBM zOS, FreeBSD, OpenBSD, NetBSD. Virtualization: Hypervisors, Containers, Kubernetes. Applications: Web Apps, Databases, Application servers, Web servers, Message queues / caches, Business software. Hardware & Sensors: HVAC, Storage, Power supply, Server, Sensors, Security
  • 29. Highly automated to monitor at scale. Auto-register workloads: Asset starts (VM, pod, …) ⬢ Asset auto-registers ⬢ Monitoring services & metrics automatically added ⬢ Asset is terminated ⬢ Monitoring objects and metrics can automatically be removed from monitoring
  • 30. Highly automated to monitor at scale: Auto-register workloads ⬢ Auto-discover services
  • 31. Highly automated to monitor at scale: Auto-register workloads ⬢ Auto-discover services ⬢ Metrics dashboards
  • 32. Highly automated to monitor at scale: Auto-register workloads ⬢ Auto-discover services ⬢ Metrics dashboards ⬢ Application dashboards
  • 33. Highly automated to monitor at scale: Auto-register workloads ⬢ Auto-discover services ⬢ Metrics dashboards ⬢ Application dashboards ⬢ Automated alerting. Messaging: Webex Teams, Slack, Mattermost, SMS, MS Teams, Email. ITSM: Splunk On-Call, Opsgenie, Jira, ServiceNow, PagerDuty
  • 34. Highly automated to monitor at scale: Auto-register workloads ⬢ Auto-discover services ⬢ Metrics dashboards ⬢ Application dashboards ⬢ Automated alerting ⬢ REST API. Build your fully automated monitoring: ⬢ Automate entire monitoring configuration & operation ⬢ Leverage auto-generated documentation with code examples and Ansible playbooks
  • 35. Hyper-scalable distributed set-ups. Scale vertically: 100k+ services per instance. Scale horizontally with massively distributed set-ups. Customer examples: 170 instances, 128k hosts, 6.1m services; 12 instances, 108k hosts, 3.7m services
  • 36. Extensible open-source monitoring ⬢ Majority of code base open source ⬢ Easily readable and modifiable Python code ⬢ Developer APIs for writing monitoring integrations ⬢ Built-in logic to handle customized code ⬢ Large partner ecosystem for customizations. Build your own integrations with simple scripts extending agents or by writing entire plug-ins yourself. Extend existing integrations to accommodate your own requirements.
  • 37. Checkmk Editions: Your use case, your edition. Raw: Free & open source IT monitoring for mid-sized infrastructures. Monitor your entire IT ⬡ Auto-discover your IT ⬡ Monitor out-of-the-box with 2000+ plug-ins ⬡ Auto-detect issues and more. Support via community. Enterprise: Scalable and automated enterprise-wide IT monitoring. Everything in Raw, plus: ⬡ Speed up your monitoring ⬡ Scale up your monitoring ⬡ Automate your monitoring ⬡ Monitor dynamic workloads ⬡ Visualize your IT and much more. Enterprise-grade support. Cloud: State-of-the-art IT monitoring for cloud and hybrid infrastructures. Everything in Enterprise, plus: ⬡ Monitor cloud workloads ⬡ Auto-register any load ⬡ Push and pull agents ⬡ Visualize your cloud ⬡ Deploy from cloud marketplaces and much more. Enterprise-grade support. MSP: Monitor your customers' hybrid IT. Designed for IT service providers. Everything in Cloud, plus: ⬡ Multi-customer management & dashboards ⬡ Data segregation ⬡ Data loss protection, if customer connections fail ⬡ White label branding and much more. Enterprise-grade support.
  • 38. Why trust us? Because they do
  • 41. Award-winning IT monitoring: #3 IT infrastructure ⬢ #1 Recommended ⬢ 4.7/5 Customer Review
  • 42. Award-winning IT monitoring: #3 IT infrastructure ⬢ #1 2023 Summer ⬢ 4.7/5 Customer Review
  • 43. The Checkmk Community: Where IT Monitoring experts meet ⬢ User forum: 6,000+ users, 10,000+ daily visits ⬢ Translations: 6 languages ⬢ Integration exchange: 500+ packages ⬢ GitHub: 100+ contributors
  • 44. Checkmk, The Company ⬢ 160+ employees, privately held, debt free ⬢ Based in Munich, Germany, and Atlanta, USA ⬢ Focusing on IT monitoring for 15+ years ⬢ Open-source enthusiasts
  • 46. Server monitoring: Streamlined troubleshooting for any server ⬢ Native and lightweight agents for Linux, Windows, and many more OS ⬢ Inventory of the hardware & software assets of your servers ⬢ Built-in dashboards for Linux and Windows servers ⬢ Integrated virtual server monitoring for real-time monitoring of all major virtualization platforms (VMware ESXi, Hyper-V, Proxmox, Nutanix)
  • 47. Network monitoring: Holistic view from core to edge ⬢ Monitor all network devices with just a few clicks, thanks to plug-ins for almost any vendor and device ⬢ Monitor bandwidth, error rates and state on each port and receive alerts, e.g., when throughput is too high ⬢ Powerful graphing helps identify bandwidth-related peaks and patterns ⬢ Visualize your network topology and understand which devices are talking with each other* ⬢ Netflow monitoring via ntop. * Requires layer 2 network topology data (e.g. CDP, LLDP) to be ingested by the user into Checkmk
  • 48. Cloud monitoring: Designed for enterprise cloud ecosystems ⬢ Full visibility across cloud and on-premises, including OS-level and network metrics ⬢ Monitor cloud applications and infrastructure, including auto-detection of new or deleted cloud resources ⬢ Create custom dashboards for all nodes, servers, and hosts ⬢ Manage system health and easily trace errors across complex, distributed architecture
  • 49. Kubernetes and OpenShift monitoring: Cut through complexity of dynamic infrastructures ⬢ Monitor Kubernetes holistically, including clusters, nodes, pods, namespaces, deployments, etc. ⬢ Monitor unmanaged containers, e.g., Docker, Podman ⬢ Automatically adapts to dynamic, ephemeral container infrastructure ⬢ Navigate through all the details, from cluster down to pod level, thanks to interconnected, built-in Kubernetes dashboards ⬢ Intelligent alerting that takes K8s' self-healing into account
  • 50. Database monitoring: Safeguard uptime of databases and their servers ⬢ Performance and health monitoring for on-premise and cloud databases ⬢ Monitor database attributes that are critical to business operations ⬢ Implement standard queries for automated health monitoring and quick troubleshooting ⬢ Compatible with Oracle, MSSQL Server, MySQL / MariaDB, SAP HANA, PostgreSQL, Amazon RDS, Azure SQL Databases, as well as other systems, such as MongoDB, IBM DB2, IBM Informix, etc. Note: Oracle Performance Monitoring not included off-the-shelf, but can easily be built
  • 51. Monitor everything: 2,000+ well-maintained plug-ins for data center and cloud. Cloud services: Archives, Load Balancers, Firewall, Caches, DNS, Databases, Costs, VPN, Block storage, Containers, Object storage, Kubernetes, Functions, Notifications, Apps, CDN, Cloud Health, Virtual Machines. Networking: VPNs, Router, Switches, Firewalls. OS: Linux, Windows, MacOS, AIX, OpenVMS, Solaris, HPUX, IBM zOS, FreeBSD, OpenBSD, NetBSD. Virtualization: Hypervisors, Containers, Kubernetes. Applications: Web Apps, Databases, Application servers, Web servers, Message queues / caches, Business software. Hardware & Sensors: HVAC, Storage, Power supply, Server, Sensors, Security
  • 52. Monitor anything out of the box… Applications ⬢ Networking ⬢ Virtualization ⬢ OS ⬢ Server, Storage, Sensors
  • 53. … in any infrastructure: on-premises & cloud. Amazon Web Services: EC2, EBS, S3, Glacier, CloudWatch, Cost & Usage, App ELB, Network ELB, RDS, DynamoDB, CloudFront, Route53, WAF, ECS, EKS, Lambda, SNS, ElastiCache. Microsoft Azure: VM, RSV, Blob Storage, Storage Accounts, Resource Health Status, Azure LB, MySQL, PostgreSQL, VPN Gateway, Traffic Manager, AKS, App Gateway, App Registrations, AD Connect. Google Cloud: CE, Cloud Storage, Filestore, Cost, GCP Health, GCP LB, Cloud SQL, Cloud Run, GKE, Functions, Memorystore
  • 54. 03 The monitoring workflow: Built-in monitoring power
  • 55. The Monitoring Lifecycle. Checkmk Is There For You. Auto-Discover Services Of A Host ⬢ Monitor The Services ⬢ Customize Alerts ⬢ Get Notifications
  • 56. Auto-discover services of a host: From zero to monitoring in ten minutes ⬢ Detection: 2,000 vendor-maintained plug-ins for automated detection of hosts and services ⬢ Configuration: auto-recognize metrics of your devices and apply pre-defined thresholds ⬢ Updating: automatically keep your monitoring up-to-date and ensure that no important metrics remain uncovered ⬢ Management: automatically create labels for operating systems, as well as for cloud and container systems, such as AWS, Azure, K8s
  • 57. Monitor services: All important information at a glance ⬢ One-click drill down into all performance metrics ⬢ Combine metrics and health data into services ⬢ Quickly identify problems in your IT environment through an easy to identify 'state' ⬢ Everything you need to know about the service, dynamically created based on state ⬢ A comprehensive visualization of the relevant metrics
  • 58. Monitor services: Get more context with one-click drill-down ⬢ Everything you need to know about the service, dynamically created based on state ⬢ A comprehensive visualization of the relevant metrics
  • 59. Customize alerts: Adapt built-in alerting to your needs ⬢ Benefit from industry expertise with built-in thresholds to automatically generate alerts ⬢ Adapt alerting to your needs with granular options, all easily configurable via the UI without the need to learn yet another query language ⬢ Flexibly restrict alerts with 'conditions' to a subset of your infrastructure with a wide range of filters, from explicit hosts/services to dynamic elements like tags and labels
  • 60. Get notifications: Alert the right team at the right time ⬢ Leverage comprehensive rule-based notifications to fulfill complex enterprise requirements about time periods, service levels, and many more ⬢ Notify the responsible team quickly, e.g.: notify storage admins when a disk fails, but not the network admins ⬢ Escalate problems if they are not handled in time ⬢ Handle alerts centrally, even in distributed environments ⬢ Use the 'Alert Handler' to automatically trigger script-based remedies, e.g., for self-healing
  • 61. Get notifications: Integrate with almost any ITSM & messaging tool ⬢ Out-of-the-box integrations with popular ITSM and messaging tools ⬢ Receive notifications via Email, SMS, or your messaging tool ⬢ Streamline your workflows by automatically creating tickets in your project management system. Messaging: Webex Teams, Slack, Mattermost, SMS, MS Teams, Email. ITSM: VictorOps, OpsGenie, Jira, ServiceNow, PagerDuty
  • 62. 04 Editions: One mission, four editions
  • 63. Edition comparison: Raw ⬢ Enterprise (everything in Raw, plus) ⬢ Cloud (everything in Enterprise, plus) ⬢ MSP (everything in Enterprise, plus). Features per row, listed in order from Raw to MSP. Monitoring coverage: On-premises infrastructure (static), Basic cloud services, Kubernetes, OpenShift, Advanced cloud services. Performance: 1,000+ hosts, Medium resource efficiency, 100,000+ hosts, Very high resource efficiency. Automation: Auto-discovery, REST API, Automated agent management, Automated host registration. Visualization: Table views, Standard graphing, Customizable dashboards & graphing, Cloud dashboards, Grafana Cloud support, Customer dashboards, White-labeling. Reporting: Availability analysis, PDF reporting, SLA reporting, Customer reports. Analytics: Trend prediction, advanced forecasting. Security / Availability: Encrypted comms, secure pull agent, 2FA, SAML, HA via appliance, Push agent for secure networking, Customer data segregation. Support: Community support, Enterprise support
  • 64. MSP: Monitor your customers' IT infrastructure. Specially designed for IT service providers. Achieve Service Level Agreements and proactively detect issues within your customers' IT infrastructure. ⬡ Based on Checkmk Cloud ⬡ Multi-customer management & dashboards ⬡ Data security compliance with data segregation ⬡ Central configuration via web interface ⬡ Automated deployment, configuration, and reporting ⬡ Minimal resource requirements ⬡ Data loss protection if customer connections fail ⬡ Alarm routing to service provider or customers ⬡ Available as white label solution
  • 66. Customer support: Two enterprise-grade support packages, Pro and Advanced. Support contacts: Pro 3, Advanced 7. Support availability: Pro 8 hours x 5 days, Advanced 10 hours x 5 days. Support hours: Pro 9am - 5pm CET or 9am - 5pm ET, Advanced 8am - 6pm CET or 8am - 6pm ET. Response time, Critical (L1): Pro best effort, Advanced 4 hours. Significant (L2): Pro best effort, Advanced 8 hours. Limited (L3): Pro best effort, Advanced next business day. Minimal (L4): Pro best effort, Advanced 2 business days. Access to consulting
  • 67. Documentation: User guides for all skill levels ⬢ Step-by-step guides to set up a functional monitoring system, both in written form and as YouTube videos ⬢ Detailed explanations about all kinds of operational tasks ⬢ Tips, tricks and best practices from our experienced consultants ⬢ In-depth insights on complex topics and use cases
  • 68. Checkmk Community: Where IT Monitoring experts meet ⬢ 6,000 forum users ⬢ 540+ packages ⬢ Code contributions: 180 GitHub contributors, 800+ requests ⬢ Translations: 6 language packages ⬢ Checkmk events: #CMKConf, Partner Day
  • 70. Large scale network monitoring for financial services. IT service provider for German savings banks with 4,400 employees and four major data centers. Serves several hundred banks with 112 million bank accounts plus many insurance companies. 12 sites, 100,000 hosts, 3,500,000 services. Challenge: ⬢ Need to monitor entire network comprised of approx. 100,000 hosts ⬢ Diverse set of WAN, LAN, access point and other equipment ⬢ Daily changes of the infrastructure ⬢ Customer reporting and layer 2 network topology visualization requirements. Solution: ⬢ Use the Checkmk Enterprise Edition to replace CA Performance Management and CA Spectrum ⬢ Monitor 100,000 hosts and 3,500,000 services across 12 monitoring sites ⬢ Use Checkmk Appliance Cluster Rack4+ in HA mode as highly available monitoring system hardware ⬢ Work with Checkmk team to develop network topology visualization (release in Checkmk 2.3) ⬢ Work with Checkmk partner to develop 50 customer specific monitoring extensions. Outcome: ⬢ Monitoring system in full operation ⬢ 50 concurrent users with large scale increase planned over coming months ⬢ First CA tool being replaced
  • 71. Automated monitoring of European networks. IONOS is the largest hosting company in Europe, partnering with small and medium-sized businesses. It manages more than 8 million customer contracts and hosts more than 12 million domains. 12 sites, 11,000 hosts, 640,000 services. Challenge: ⬢ Monitor all European networks within and across its regional data centers ⬢ Fully automated monitoring, a state-of-the-art REST API is thus a critical requirement ⬢ Very heterogeneous infrastructure across data center networks, global backbone networks and similar ⬢ Massive scalability needs, require a platform that can work globally. Solution: ⬢ Employ Checkmk Enterprise Edition, HA set-up ⬢ Currently monitor 11,000 hosts and 640,000 services across 12 sites, further expansion planned ⬢ Deployment and updating of monitoring is highly automated and scalable, using the Checkmk REST API ⬢ Visualization of incidents shows the OP Center where problems occur. Outcome: ⬢ Able to pinpoint difficult to identify root causes from incidents, thus reducing mean-time-to-resolution
  • 72. Monitoring the banking systems of 170 banks. Largest telecommunications company in Switzerland, also one of the leading MSPs in the region. Over 170 banks place their trust in Swisscom's banking services. 20 sites, 1,800 hosts, 175,000 services. Challenge: ⬢ Banking systems – operate customers' Centralized Real-Time Exchange Banking Systems (CBS) ⬢ Regulatory requirements – operate on-premises in Switzerland ⬢ Consolidate monitoring tools – heterogeneous legacy landscape ⬢ Managed services environment – separation of customer data ⬢ Comprehensive scalability – large scale Oracle databases, growing customer base. Solution: ⬢ Swisscom employs the Checkmk Managed Services Edition ⬢ Monitors more than 1,800 hosts and 175,000 services in over 20 Checkmk instances ⬢ Multi-client capability allows precise control of access rights for internal and external users ⬢ Swisscom & Checkmk partnered to enhance the Oracle monitoring, which Swisscom further expanded ⬢ Offers a new service thanks to Checkmk: precise forecasting of future resource utilization. Outcome: ⬢ With Checkmk Swisscom banking ensures high performance of the banking systems of 170 banks ⬢ "With Checkmk we can provide real-time insight. Among other things, we can show the customer exactly how his applications are performing." [Daniel Röttgermann, Swisscom Banking]
  • 73. 07 Additional Features: Built-in monitoring power
  • 74. Business Intelligence: A bird's-eye view on key processes health ⬢ Map application dependencies from a single overview ⬢ View availability and performance of complex systems at a glance ⬢ Aggregate various services and hosts into a single state ⬢ Review historical states to determine the root cause of degraded performance ⬢ Simulate worst case scenarios in real time, analyzing the impact of failing hosts to determine areas of operational weakness
  • 75. Log & event monitoring: Efficient processing and analysis of logs ⬢ Combine metrics and log data for fast problem identification and root cause analysis ⬢ Filter and forward events, triggering scripts or generating notifications ⬢ Collapse duplicate entries into a single event (e.g., several failed user logins) to prevent operator overload ⬢ Filter incoming messages to only show important events and avoid overload
  • 76. Hardware and software inventory: Identify all assets in your IT ⬢ Automatically identify and inventory all hardware and software on your hosts ⬢ Integrate regularly updated data from monitoring services, such as CPU utilization ⬢ Proactively track changes to hardware and software ⬢ Identify servers that have not yet had a specific service pack installed ⬢ Import the data into CMDBs and keep them based on current data
  • 77. Reporting: Generate custom reports automatically ⬢ Generate branded PDF reports containing pre-built or custom views – either on-demand, or automated at regular intervals ⬢ Review the history of states over any desired timeframe with a single click, and compute availability metrics in real time ⬢ Monitor the compliance of complex SLAs – even if the time unit is measured in hours
  • 78. Forecasting & Capacity Mgmt: Prevent bottlenecks and failures ⬢ Analyze historical data, predict trends, forecast resource utilization and avoid unforeseen surprises ⬢ Use sophisticated predictive monitoring algorithms to dynamically adapt thresholds based on historical events ⬢ Make capacity management a breeze with forecasting that takes into account one-off effects or seasonal factors
  • 79. Dashboards: Visualize your data with customizable dashboards ⬢ Get full visibility on the state of your IT, thanks to Checkmk's modern and customizable dashboards ⬢ Out-of-the-box dashboards provide key metrics for AWS and Azure cloud environments, Linux and Windows servers, and Kubernetes clusters ⬢ Leverage graphic maps and diagrams with live monitoring data ⬢ Analyze time-series metrics over long time horizons with interactive HTML5 graphs ⬢ Customize dashboards and views to your specific needs with different dashboard elements to visualize your most important metrics
  • 80. REST API: Automate your monitoring with the REST API ⬢ Automates configuration and operations ⬢ Built on REST API best practice ⬢ Provides auto-generated documentation with code examples
  • 81. Deployment: Many options for maximum flexibility. Appliance: Operate Checkmk within your existing virtualization platform or in a dedicated appliance, already preconfigured for a quick start. Native Linux: Debian, Ubuntu, RedHat Enterprise Linux, SLES; you choose your preferred platform. Container: Easily deploy Checkmk as a container image from our registry into your containerized infrastructure. Cloud Marketplaces: Install an image of the Checkmk Cloud, including all required dependencies, from AWS and Azure marketplaces