Web scale monitoring
 using gearman, redis, mojolicious, Angular.js,
gnuplot and PostgreSQL as NoSQL store
                Dobrica Pavlinušić
                http://blog.rot13.org
                DORS/CLUC 2012
Goals
● define problem in terms of scaling
   ○ Gearman as distributed fork
● don't lock yourself into a technology choice
   ○ relational database, so what?
● don't munge or rename data
   ○ preserve naming through the whole stack
● test-driven development
   ○ small iterations, easy deployment
● is your cache really useful?
   ○ can you make a web interface out of it?
● why are web interfaces hard?
   ○ Angular.js comes to the rescue!
Project specification
● Existing Perl scripts parse telnet output
   ○ end-users (CPE)
   ○ equipment in-between (MSAN, DSLAM)
● Create monitoring system!
● User data split between LDAP and CRM
● Horizontal scalability (on single box!)
   ○ number of users grows
● Store data in relational database for
  reporting
   ○ All collected data is interesting
● Web interface to inspect data
  ○ prototype http://youtu.be/Cp31xUdyZBQ
Proposed architecture
● Gearman as queue server
  ○ workers collect, process and store data
  ○ Gearman::Driver forks workers on demand
● PostgreSQL with hstore
  ○ don't munge data - not normalized
  ○ use views for reporting
  ○ table inheritance for easy expiry of data
● Redis rich structures for data caching
  ○ provide "warm" data for Web interface
● Web: mojolicious, Angular.js, gnuplot
  ○ gearman calls and SQL queries to JSONP
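The last bullet says results of Gearman calls and SQL queries reach the browser as JSONP, that is, JSON wrapped in a function call whose name the client chooses. A minimal sketch of that wrapping follows. It is written in Python for illustration only (the talk's stack is Perl and mojolicious); `to_jsonp` and the sample row are hypothetical names, not taken from the slides.

```python
import json
import re

# Accept only plain (optionally dotted) JavaScript identifiers as callback
# names, so the callback parameter cannot smuggle in arbitrary script.
_CALLBACK_RE = re.compile(r"[A-Za-z_$][\w$]*(\.[A-Za-z_$][\w$]*)*")


def to_jsonp(callback, payload):
    """Wrap a JSON-serializable payload in a JSONP callback invocation."""
    if not _CALLBACK_RE.fullmatch(callback):
        raise ValueError("invalid JSONP callback name: %r" % callback)
    return "%s(%s);" % (callback, json.dumps(payload, sort_keys=True))


# Hypothetical row, shaped like something a SQL query or Gearman call returns.
row = {"login": "user1", "snr_down": 12.5}
print(to_jsonp("angular.callbacks._0", row))
```

Validating the callback name is a design choice of this sketch; the slides do not say how the original handles it.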
[Architecture diagram; recoverable labels only]
● components: Web UI, Redis, Gearman, PostgreSQL (store), LDAP, CRM (XML/RPC)
● polled devices: CPE *40 (~5300), DSLAM 34*1-5 (~2300), MSAN 40*1-5 (~1100)
● poll *1, cron
● 15 min pull interval
● dual-core, 4Gb RAM, 130-300 processes
More information
http://gearman.org/

http://redis.io/

http://www.postgresql.org/

http://mojolicio.us/

http://angularjs.org/
Queue
● distributed (across cores) on-demand fork
● Gearman::Driver manages workers
    ○ min, max process limits
    ○ copy-on-write fork
    ○ three master processes (services, MSAN, DSLAM)
    ○ modify process name for status info (ps ax)
● Gearman workers
    ○ pollers generate timestamp for data (inserts are
      queued!)
    ○ one worker per job (CPE pollers)
    ○ persistent workers (TCP connection to
      MSAN/DSLAM is reused for all jobs)
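Two ideas from this slide can be sketched in a few lines: keeping the number of forked workers between a minimum and a maximum, and stamping data at poll time because the insert runs later from the queue. The Python below is only an illustration under assumptions: the function names and the example limits are hypothetical, and this is not Gearman::Driver's actual algorithm.

```python
import time


def target_workers(queued_jobs, min_procs, max_procs):
    """Desired process count: one per queued job, clamped to the limits."""
    if min_procs < 0 or max_procs < min_procs:
        raise ValueError("need 0 <= min_procs <= max_procs")
    return max(min_procs, min(max_procs, queued_jobs))


def workers_to_fork(queued_jobs, running, min_procs, max_procs):
    """How many additional processes to fork right now (never negative)."""
    return max(0, target_workers(queued_jobs, min_procs, max_procs) - running)


def poll_result(login, values, now=None):
    """Stamp the data when it is polled; the INSERT happens later, from the
    queue, so the database clock at insert time would be too late."""
    stamp = time.time() if now is None else now
    return {"login": login, "time": stamp, "values": values}


# Hypothetical limits: at least 1 and at most 40 poller processes.
print(workers_to_fork(queued_jobs=10, running=4, min_procs=1, max_procs=40))
```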
Cache with structures
● store
   ○ all data from gearman calls (which are slow)
   ○ statistics from poll workers
● expire data after poll interval
   ○ fresh data for web interface
● name your keys in a sane way!
   ○   CPE.*, ZTEMSAN.*, ZTEDSLAM.* (poll stats)
   ○   CRM.login, LDAP.login
   ○   table.dslam.login (last row inserted)
   ○   columns.dslam (existence, needed for Web)
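The naming scheme and the "expire after poll interval" rule can be illustrated without a Redis server. The Python sketch below uses an in-memory dictionary as a stand-in for Redis key expiry. `cache_key` and `ExpiringCache` are hypothetical names, and reading `login` in the key patterns as a placeholder for the user's login is an assumption.

```python
import time

POLL_INTERVAL = 15 * 60  # seconds; the architecture diagram gives 15 min


def cache_key(*parts):
    """Join name parts with dots, e.g. cache_key("table", "dslam", login)."""
    return ".".join(str(part) for part in parts)


class ExpiringCache:
    """In-memory stand-in for a key-value store with per-key expiry."""

    def __init__(self, clock=time.time):
        self._clock = clock
        self._data = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl=POLL_INTERVAL):
        self._data[key] = (self._clock() + ttl, value)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: behave as if the key never existed
            return None
        return value


cache = ExpiringCache()
key = cache_key("table", "dslam", "user1")  # "user1" is a made-up login
cache.set(key, {"snr_down": 12.5})
print(key, cache.get(key))
```

Because entries disappear after one poll interval, anything the web interface reads from the cache is at most one interval old.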
hstore
● store key-value pairs (single-level) in a single
  column
● additional columns to support indexes
   ○ GiST and GIN indexes on hstore are not enough
● table inheritance
   ○ partitioning of tables by date
   ○ DELETE and VACUUM can take a long time
   ○ set sql_inheritance = false
● using PostgreSQL 8.4 (nothing new!)
● PostgreSQL 9.2 will have JSON type
  support and v8!
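As a rough illustration of the two techniques on this slide, the Python sketch below builds an hstore text literal from a flat dictionary and generates the DDL for per-day child tables, so old data can be expired by dropping a table instead of running DELETE and VACUUM. All names here (`hstore_literal`, `partition_name`, the `polled_at` column, the `dslam` parent table) are assumptions for the example, and the functions only produce strings. Real code would pass values as bound query parameters.

```python
import datetime


def hstore_literal(pairs):
    """Serialize a flat dict to hstore text format: "key"=>"value", ..."""
    def quote(text):
        escaped = str(text).replace("\\", "\\\\").replace('"', '\\"')
        return '"' + escaped + '"'

    items = []
    for key, value in sorted(pairs.items()):
        rendered = "NULL" if value is None else quote(value)
        items.append(quote(key) + "=>" + rendered)
    return ", ".join(items)


def partition_name(parent, day):
    """Child table name for one day, e.g. dslam_20120501."""
    return "%s_%s" % (parent, day.strftime("%Y%m%d"))


def create_partition_sql(parent, day):
    """DDL for a child table that inherits from parent and holds one day."""
    next_day = day + datetime.timedelta(days=1)
    return (
        "CREATE TABLE %s (CHECK (polled_at >= DATE '%s' AND polled_at < DATE '%s')) "
        "INHERITS (%s);"
    ) % (partition_name(parent, day), day.isoformat(), next_day.isoformat(), parent)


def drop_partition_sql(parent, day):
    """Expiring a whole day is a single DROP TABLE."""
    return "DROP TABLE %s;" % partition_name(parent, day)


day = datetime.date(2012, 5, 1)
print(hstore_literal({"snr_down": 12.5, "profile": None}))
print(create_partition_sql("dslam", day))
print(drop_partition_sql("dslam", day))
```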
Web interface
● mojolicious
  ○ web server and JSON provider
  ○ MojoX::Gearman
● gnuplot graphs from a huge amount of data
  ○ JavaScript doesn't cut it!
  ○ get textual data from gearman
  ○ generate graphs on-the-fly
● Angular.js as a nice way to generate HTML
  from JSON $resources
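Generating graphs on the fly from textual data can be as simple as building a gnuplot script with the data inlined and piping it to the `gnuplot` binary. The Python sketch below only builds the script text. The function name, the PNG size and the sample points are assumptions for illustration, not details from the talk.

```python
def gnuplot_script(title, points, width=800, height=300):
    """Build a gnuplot script plotting (epoch_seconds, value) pairs as PNG."""
    lines = [
        "set terminal png size %d,%d" % (width, height),
        "set xdata time",
        'set timefmt "%s"',       # x values are Unix timestamps
        'set format x "%H:%M"',
        'set title "%s"' % title.replace('"', "'"),
        "plot '-' using 1:2 with lines notitle",
    ]
    lines += ["%d %s" % (ts, value) for ts, value in points]
    lines.append("e")  # terminates the inline data block
    return "\n".join(lines) + "\n"


# Made-up samples, 15 minutes apart.
script = gnuplot_script("snr_down", [(1336000000, 12.5), (1336000900, 12.1)])
print(script)
```

On a machine with gnuplot installed, feeding this script to the `gnuplot` process on standard input should write the PNG to standard output, which a web handler could return directly.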
