Ceilometer
CERN use case:
● CERN delivers resources both as virtual machines and via traditional batch and Grid computing
● Individual batch nodes execute payloads from different users and communities
● Accounting should cover both use cases
● Interesting accounting questions include:
– What was the resource usage of experiment A during December?
– What was the resource usage of user B last year?
● Accounting information has to be reported to Grid bodies (WLCG) by experiment
Facts:
● Details of users' jobs are already present in the batch accounting database
● It is a large database, with around 400,000 records added every day
Solution:
● Use Ceilometer as the single source of truth for accounting data
● Batch data is put into the Ceilometer database for accounting purposes
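The accounting questions on this slide amount to filtering usage records by owner and time window and then summing a usage figure. A minimal sketch in Python, assuming a hypothetical record layout (`experiment`, `user`, `end_time`, `cpu_seconds`) and made-up sample data, neither of which is taken from the slides:

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class UsageRecord:
    # Hypothetical fields; the real accounting schema is not shown in the slides.
    experiment: str
    user: str
    end_time: datetime
    cpu_seconds: float


def total_usage(records, start, end, experiment=None, user=None):
    """Sum cpu_seconds for records ending in [start, end) that match the filters."""
    total = 0.0
    for record in records:
        if not (start <= record.end_time < end):
            continue
        if experiment is not None and record.experiment != experiment:
            continue
        if user is not None and record.user != user:
            continue
        total += record.cpu_seconds
    return total


# Illustrative data only.
RECORDS = [
    UsageRecord("A", "alice", datetime(2013, 12, 3), 3600.0),
    UsageRecord("A", "bob", datetime(2013, 12, 20), 1800.0),
    UsageRecord("B", "bob", datetime(2013, 6, 1), 7200.0),
    UsageRecord("A", "alice", datetime(2014, 1, 2), 900.0),
]

# "Resource usage of experiment A during December"
december_a = total_usage(
    RECORDS, datetime(2013, 12, 1), datetime(2014, 1, 1), experiment="A"
)
# "Resource usage of user bob last year"
year_bob = total_usage(
    RECORDS, datetime(2013, 1, 1), datetime(2014, 1, 1), user="bob"
)
```

With a single store holding both VM and batch records, the same kind of query could serve both use cases.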
CERN's idea to use ceilometer
Ceilometer: Current Implementation
[Architecture diagram. Components shown: Batch accounting database, Ceilometer Agent Central with batch plugin, RabbitMQ-LSF, Ceilometer Collector for batch data, Ceilometer Agent Compute, Ceilometer Agent Central, RabbitMQ, Ceilometer Collector, Ceilometer Database (mongodb) and Ceilometer API, grouped into "batch specific instances" and "IaaS specific instances".]
Ceilometer: Current Implementation
● A ceilometer-agent-central plugin was written that polls the batch accounting database for unpublished records
● The unpublished records are then pushed to the metering queue (RabbitMQ)
● The ceilometer-collector instance consumes the messages from the metering queue and inserts them into the Ceilometer database (MongoDB)
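The poll, publish and collect steps above can be sketched as follows. This is not the actual plugin: it uses only the Python standard library, with an in-memory SQLite table standing in for the batch accounting database, `queue.Queue` standing in for RabbitMQ, and a list standing in for MongoDB. The table layout and the `published` flag column are assumptions.

```python
import json
import queue
import sqlite3


def setup_batch_db():
    """Create an in-memory stand-in for the batch accounting database."""
    db = sqlite3.connect(":memory:")
    db.execute(
        "CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, owner TEXT, "
        "cpu_seconds REAL, published INTEGER NOT NULL DEFAULT 0)"
    )
    db.executemany(
        "INSERT INTO jobs (job_id, owner, cpu_seconds) VALUES (?, ?, ?)",
        [(1, "alice", 120.0), (2, "bob", 45.5), (3, "alice", 300.0)],
    )
    db.commit()
    return db


def poll_and_publish(db, metering_queue):
    """Read unpublished jobs, push each to the queue, then mark it published."""
    rows = db.execute(
        "SELECT job_id, owner, cpu_seconds FROM jobs "
        "WHERE published = 0 ORDER BY job_id"
    ).fetchall()
    for job_id, owner, cpu_seconds in rows:
        message = json.dumps(
            {"job_id": job_id, "owner": owner, "cpu_seconds": cpu_seconds}
        )
        metering_queue.put(message)
        db.execute("UPDATE jobs SET published = 1 WHERE job_id = ?", (job_id,))
    db.commit()
    return len(rows)


def collect(metering_queue, store):
    """Drain the queue and append the decoded messages to the store."""
    count = 0
    while True:
        try:
            message = metering_queue.get_nowait()
        except queue.Empty:
            break
        store.append(json.loads(message))
        count += 1
    return count


batch_db = setup_batch_db()
metering_queue = queue.Queue()
ceilometer_store = []  # stand-in for the Ceilometer database

first_run = poll_and_publish(batch_db, metering_queue)   # publishes all 3 jobs
second_run = poll_and_publish(batch_db, metering_queue)  # nothing left to publish
collected = collect(metering_queue, ceilometer_store)
```

Marking a record as published only after it has been queued means a crash between the two steps could publish that record twice on the next run, but it would not silently drop it.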
Ceilometer: Current Implementation
● To reduce the load on the OpenStack messaging server, batch data is pushed to a different messaging server from the one that receives other OpenStack messages (e.g. those from agent-compute)
● This means there are dedicated instances of agent-central and collector for VM metering and for batch metering
● The collectors write the data into a single database
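The effect of the split can be illustrated with a toy model: two independent queues stand in for the two messaging servers, each drained by its own collector, with both collectors writing to one shared store. All names and sample messages below are hypothetical.

```python
import queue

# Two independent queues stand in for the two messaging servers.
iaas_broker = queue.Queue()   # receives VM metering messages
batch_broker = queue.Queue()  # receives batch job records only

shared_database = []  # stand-in for the single database


def run_collector(broker, database, source):
    """Drain one broker and write its messages, tagged with their source."""
    written = 0
    while not broker.empty():
        sample = broker.get()
        database.append({"source": source, **sample})
        written += 1
    return written


# Hypothetical sample messages: one VM sample and a burst of batch records.
iaas_broker.put({"resource": "vm-1", "cpu_seconds": 10.0})
for job_id in range(5):
    batch_broker.put({"resource": f"job-{job_id}", "cpu_seconds": 60.0})

# The batch burst does not lengthen the backlog on the IaaS broker.
iaas_backlog_before = iaas_broker.qsize()

iaas_written = run_collector(iaas_broker, shared_database, "iaas")
batch_written = run_collector(batch_broker, shared_database, "batch")
```

Both kinds of record end up in the same store, so queries can still cover VM and batch usage together.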
Ceilometer: LSF Data Statistics
● The batch plugin runs once per hour, provided the previous run has finished
● Most runs find no unpublished data, because data arrives in the batch accounting database in bursts
● Most of a day's data is published to the messaging server in 2 runs of around 200,000 job records each
● One such run takes around 5 hours to complete
Ceilometer: Batch Data Statistics
● The average rate of publishing records to the batch RabbitMQ server is 11 Hz (about 11 records per second). This includes the time to
– read the unpublished records,
– push them to the RabbitMQ server, and
– mark the records as published in the batch accounting database
● Most of this time is spent on publishing the records
● The time spent on the other activities is minuscule
● The MongoDB database grows by about 2 GB/day
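The figures quoted on the statistics slides can be cross-checked with a little arithmetic. The sketch below takes 1 GB as 10**9 bytes and, as a loudly labelled assumption, attributes the whole 2 GB/day growth to batch records, so the per-record figure is an upper bound if VM metering data contributes to that growth as well.

```python
RECORDS_PER_RUN = 200_000      # job records in one large run
PUBLISH_RATE_HZ = 11           # average records published per second
RECORDS_PER_DAY = 400_000      # records added to the batch database per day
DB_GROWTH_BYTES_PER_DAY = 2e9  # 2 GB/day, assuming 1 GB = 10**9 bytes

# Duration of one large run at the average publishing rate.
run_hours = RECORDS_PER_RUN / PUBLISH_RATE_HZ / 3600

# Assumption: all database growth is due to batch records.
bytes_per_record = DB_GROWTH_BYTES_PER_DAY / RECORDS_PER_DAY

print(f"run duration: {run_hours:.2f} h")
print(f"storage per record: {bytes_per_record:.0f} bytes")
```

At 11 records per second, 200,000 records take about 5.05 hours, which matches the run time of around 5 hours quoted for one such run. Under the stated assumption, each record accounts for roughly 5 kB of database growth.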

Ceilometer lsf-intergration-openstack-summit
