Sensu data monitoring system
at scale
LEANDRO TOTINO PEREIRA
SYSTEM ENGINEER
Sensu Kafka Kafka-connect Cassandra PrestoDB
Agenda
• Problem?
• Sensu monitoring system
• Talking a little about the components/concepts used
• PrestoDB vs circuit breaker
• Talking about the full architecture built
• Questions?
Problem?
• Open-source and commercial monitoring systems don't scale very well (pollers, proxies and databases).
• These systems often rely on a SQL database, which doesn't scale well (sharding, master/slave replication, no time-series database for metrics).
• These systems are hard to customize for our needs (integrations with other systems and dashboards).
• They don't provide any queue layer to avoid overload.
Sensu Monitoring System
Pros
• Provides a nice API (good for customizations)
• Supports the Nagios command pattern (exit codes 0, 1, 2, 3 for OK, WARNING, CRITICAL, UNKNOWN)
• Scalable and distributed
• Mutator support (modifies event output)
• Handler support
Cons
• No support for perfdata (alternatives shown later)
• No support for metrics (alternatives shown later)
• Checks must be implemented on the monitored machine and executed by the sensu-agent
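To make the Nagios command pattern concrete, here is a minimal sketch of a check in Python. The disk-usage metric, the thresholds and the names `classify` and `main` are illustrative assumptions, not something taken from the slides; only the 0/1/2/3 exit-code convention comes from the pattern itself.

```python
import shutil

# Nagios-style exit codes, as listed on the slide.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3


def classify(used_percent, warn=80.0, crit=90.0):
    """Map a usage percentage to a (status, message) pair."""
    if used_percent >= crit:
        return CRITICAL, f"CRITICAL - disk usage {used_percent:.1f}%"
    if used_percent >= warn:
        return WARNING, f"WARNING - disk usage {used_percent:.1f}%"
    return OK, f"OK - disk usage {used_percent:.1f}%"


def main(path="/"):
    """Print one line of output and return the exit code for the check."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        print(f"UNKNOWN - cannot read {path}: {exc}")
        return UNKNOWN
    status, message = classify(100.0 * usage.used / usage.total)
    print(message)
    return status

# In a real check script the last line would be: sys.exit(main())
```

The agent only needs the printed line and the exit code, so the same script could be run by hand to debug it.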
Cassandra Database
• Elastic scalability
• Peer-to-peer architecture
• High availability and fault tolerance
• Table modeling restrictions (cqlsh tells you whether your data model suits the queries you want to run, which helps performance)
• High performance (tables can be tailored to your queries for even better performance)
• TTL support for inserted rows
• Tunable consistency
Kafka?
• Kafka® is used for building real-time data pipelines and streaming apps. It is horizontally scalable, fault-tolerant, wicked fast, and runs in production in thousands of companies. (Kafka website)
• Elastic
• Highly scalable
• Fault-tolerant
• Equally viable for small, medium, & large use cases
• Exactly-once processing semantics
• No separate processing cluster required
• Develop on Mac, Linux, Windows
Stream Data Platform: A central
hub for all your data
A stream data platform captures streams of events
or data changes and feeds them to other data
systems such as relational databases, key-value
stores, Hadoop, or the data warehouse. In our
case, we are going to use the Kafka Cassandra sink to
write data into the Cassandra database.
Kafka Connect / Cassandra Sink
The DataStax Certified Connector, developed by DataMountaineer, simplifies writing data from Kafka into
Cassandra. The connector converts the value from the Kafka Connect SinkRecords to JSON. A fail-fast thread
pool is then used to insert the records asynchronously into Cassandra.
Kafka Connect is a framework included in Apache Kafka that integrates Kafka with other systems. Its purpose is to
make it easy to add new systems to your scalable and secure stream data pipelines.
To copy data between Kafka and another system, users instantiate Kafka Connectors for the systems they want to
pull data from or push data to. Source Connectors import data from another system (e.g. a relational database into
Kafka) and Sink Connectors export data (e.g. the contents of a Kafka topic to an HDFS file).
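As a rough sketch of what instantiating a sink connector looks like, the snippet below builds the JSON body that the Kafka Connect REST API accepts when creating a connector. Only `name`, `config`, `connector.class`, `topics` and `tasks.max` are standard Kafka Connect fields; the connector name, the topic name and the placeholder class string are assumptions, and the Cassandra-specific settings are deliberately left out because they depend on the connector version.

```python
import json


def build_sink_request(name, topics, connector_class, extra_config=None):
    """Build the body for creating a connector via the Kafka Connect REST API."""
    config = {
        "connector.class": connector_class,
        "topics": ",".join(topics),  # sink connectors read from these topics
        "tasks.max": "1",
    }
    # Connector-specific settings (contact points, keyspace, ...) go here.
    config.update(extra_config or {})
    return {"name": name, "config": config}


payload = build_sink_request(
    name="cassandra-sink-metrics",  # hypothetical connector name
    topics=["metrics"],  # hypothetical topic name
    connector_class="<cassandra sink connector class>",  # placeholder
)
body = json.dumps(payload, indent=2)
```

The resulting `body` would be sent with a POST to the `/connectors` endpoint of a Kafka Connect worker.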
PrestoDB vs Circuit Breaker
Presto allows querying data where it lives, including Hive, Cassandra, relational databases or
even proprietary data stores. A single Presto query can combine data from multiple sources,
allowing for analytics across your entire organization.
Another option is to use the circuit breaker pattern in your application, through Consul/etcd, to check
that the services are up and notify when something is down, avoiding impact on other resources.
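The circuit breaker idea can be sketched in a few lines of Python. This is a generic in-process version under stated assumptions: the class name, the thresholds and the injectable `clock` are illustrative, and the Consul/etcd health lookup mentioned above is not modeled here.

```python
import time


class CircuitOpenError(Exception):
    """Raised when the breaker is open and the call is skipped."""


class CircuitBreaker:
    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                raise CircuitOpenError("circuit open; skipping call")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A dashboard could wrap each database query in `breaker.call(...)` and show a "source unavailable" panel when `CircuitOpenError` is raised, instead of waiting on a dead backend.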
Talking about architecture
Note: if you don't need to get data from all the Cassandra databases and correlate it, I really recommend using the circuit breaker concept at the dashboard in case some database goes down.
Kafka has two topics: an alarms topic, which receives all output from the Nagios command that has a status different from 0 and is not silenced, and another one that receives all output to be saved in the metrics database.
Sensu data
The Sensu servers get all output from the checks (because the checks are of type metric), and Kafka-send validates/normalizes the data and decides which topic to send it to (alarms or metrics). Kafka Connect then sinks the data into the Cassandra database.
Handler configuration
Check configuration
Sensu output (all the information) as received by the Kafka_send command
Check output from the server (monitoring client)
Kafka-send checks whether the output contains “status != 0” or “silenced != 0” to decide whether to send it to the alarms Kafka topic; all output is sent to the metrics topic in a normalized format.
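The routing decision can be sketched as a small pure function. This follows the rule stated on the architecture slide (non-zero status and not silenced goes to alarms, everything goes to metrics); the topic names, the flat `status` and `silenced` fields and the function name are assumptions, since the real Kafka-send code is only shown as a screenshot.

```python
ALARMS_TOPIC = "alarms"  # hypothetical topic name
METRICS_TOPIC = "metrics"  # hypothetical topic name


def topics_for(event):
    """Return the Kafka topics a normalized check event should be sent to."""
    # A missing status is treated as UNKNOWN (3) so it is not silently dropped.
    status = event.get("status", 3)
    silenced = bool(event.get("silenced", False))

    topics = [METRICS_TOPIC]  # every output is stored as a metric
    if status != 0 and not silenced:
        topics.append(ALARMS_TOPIC)
    return topics
```

Keeping the decision separate from the Kafka producer call makes it easy to test without a broker.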
Sensu – Deleting servers
Even after you delete a server from Sensu, you will still get keepalive checks for it; you should delete it from the Sensu registry as well to get rid of the keepalive checks.
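One way to clean up the registry is through the API. The sketch below only builds the HTTP request and does not send it. It assumes a Sensu Core 1.x style API, where a client is removed with `DELETE /clients/:client`; the base URL, port and client name are placeholders, and newer Sensu versions expose a different API.

```python
import urllib.parse
import urllib.request


def build_delete_client_request(api_base, client_name):
    """Build (but do not send) a DELETE request for a client in the registry."""
    quoted = urllib.parse.quote(client_name, safe="")
    url = f"{api_base.rstrip('/')}/clients/{quoted}"
    return urllib.request.Request(url, method="DELETE")


# Placeholder host, port and client name; adjust to your environment.
request = build_delete_client_request("http://localhost:4567", "web-01")
```

Sending it would be a matter of passing `request` to `urllib.request.urlopen`, with whatever authentication the API is configured to require.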
Sensu – Silence checks
Cassandra Database and testing sink demo
When modeling your table for a time-series database, we strongly recommend using a composite partition key to avoid hitting partition limits (2 billion cells per partition) and a clustering order of DESC on the timestamp column.
We shouldn't rely on materialized views to join tables. Instead, model the table so that it serves your queries even better.
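A possible table layout following this advice is sketched below. The keyspace, table and column names and the one-partition-per-day bucketing are assumptions for illustration, not the schema used in the demo.

```python
from datetime import datetime, timezone

# Composite partition key (host, check_name, day) keeps each partition bounded;
# rows inside a partition are ordered newest first.
CREATE_TABLE_CQL = """
CREATE TABLE IF NOT EXISTS monitoring.metrics (
    host text,
    check_name text,
    day text,
    ts timestamp,
    value double,
    PRIMARY KEY ((host, check_name, day), ts)
) WITH CLUSTERING ORDER BY (ts DESC)
"""


def day_bucket(ts):
    """Return the UTC day (YYYY-MM-DD) used as the time bucket."""
    return ts.astimezone(timezone.utc).strftime("%Y-%m-%d")


def partition_key(host, check_name, ts):
    """Compute the composite partition key for one data point."""
    return (host, check_name, day_bucket(ts))
```

With this layout, "latest values for a check on a host" reads one partition from the top, which is the query the DESC clustering order is meant to serve.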
Sending formatted data to Kafka
Showing the data saved in the Cassandra table through Kafka Connect
Presto demo
Create another Cassandra database and register it in Presto as cassandra2
Querying both tables on different Cassandra servers
Querying both tables and taking the union of them.
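The cross-database query from the demo can be approximated as follows. The catalog names `cassandra` and `cassandra2` come from the slide; the schema, table and column names are assumptions, and `UNION ALL` is used here to keep duplicate rows, which may differ from the query shown in the screenshot.

```python
def union_query(catalogs, schema, table, columns):
    """Build a Presto query that reads the same table from several catalogs."""
    cols = ", ".join(columns)
    selects = [
        # Presto addresses tables as catalog.schema.table.
        f"SELECT '{catalog}' AS source, {cols} FROM {catalog}.{schema}.{table}"
        for catalog in catalogs
    ]
    return "\nUNION ALL\n".join(selects)


query = union_query(
    ["cassandra", "cassandra2"],
    schema="monitoring",  # hypothetical keyspace
    table="metrics",  # hypothetical table
    columns=["host", "ts", "value"],
)
```

The extra `source` column records which Cassandra cluster each row came from, which helps when correlating the two data sets.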
Presto UI
Presto MySQL database (OT)
Thank you!
Questions?
More Information:
LinkedIn:
https://www.linkedin.com/in/leandro-totino-pereira/
Facebook:
www.facebook.com/leandro.totinopereira
