This document provides an overview of monitoring for big data frameworks. It discusses the challenges of monitoring large-scale cloud environments that run big data applications and describes several open-source monitoring tools, including Hadoop Performance Monitoring UI, SequenceIQ, Ganglia, Apache Chukwa, and Nagios. It also outlines key requirements for monitoring big data platforms, such as scalability, timeliness, and the ability to handle constant change. The document concludes by introducing the DICE monitoring platform, which uses Collectd to collect metrics from Hadoop, YARN, Spark, Storm, and Kafka and stores them in Elasticsearch for analysis and for visualization with Kibana.