Making Sense of Your Data
Building a Custom Datasource for Grafana with Vert.x
Gerald Mücke
DevCon5 GmbH
@gmuecke

About me
• IT Consultant & Java Specialist at DevCon5 (CH)
• Focal Areas
  • Tool-assisted quality assurance
  • Performance (testing, analysis, tooling)
  • Operational Topics (APM, Monitoring)
• Twitter: @gmuecke

The Starting Point
• Customer stored and kept response time measurements of test runs in a MongoDB
• Lots of Data
• Timestamp & Value
• No Proper Visualization

What is time series data?
• A set of data points, each with a timestamp and a value
[Chart: value plotted over time]

What is MongoDB?
• NoSQL database with focus on scale
• JSON as data representation
• No HTTP endpoint (TCP-based Wire Protocol)
• Aggregation framework for complex queries

What is Grafana?
• A Service for Visualizing Time Series Data
• Open Source
• Backend written in Go
• Frontend based on Angular
• Dashboards & Alerts

Grafana Architecture
Grafana Server
• Implemented in Go
• Persistence for Settings and Dashboards
• Offers Proxy for Service Calls
[Diagram: Browser (Angular UI with Data Source Plugins) → Grafana Server (Proxy) → DBs]

Datasources for Grafana
Data Source Plugin
• Angular
• JavaScript
[Diagram: Browser (Angular UI with Data Source Plugin) → HTTP → Datasource; the Grafana Server sits alongside as in the architecture diagram]

Connect Angular Directly to Mongo?

From 2 Tier to 3 Tier
[Diagram, 2 tier: Grafana (Angular) → MongoDB]
[Diagram, 3 tier: Grafana (Angular) → HTTP → Datasource Service → Mongo Wire Protocol → MongoDB]

Start Simple
SimpleJsonDatasource (Plugin)
3 Service Endpoints
• /search → Labels: names of available time series
• /annotations → Annotations: textual markers
• /query → Query: actual time series data
https://github.com/grafana/simple-json-datasource

/search Format
Request
{
  "target" : "select metric",
  "refId" : "E"
}
Response (an array of strings)
[
  "Metric Name 1",
  "Metric Name 2"
]
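A minimal sketch of producing the /search response body, assuming the series names are already known to the service. The class and method names (SearchResponse, toJson, quote) are illustrative and not taken from the talk's source code; a real service would more likely use a JSON library.

```java
import java.util.List;
import java.util.stream.Collectors;

public class SearchResponse {

    /** Wraps a name in double quotes, escaping backslashes and double quotes only. */
    static String quote(String s) {
        return "\"" + s.replace("\\", "\\\\").replace("\"", "\\\"") + "\"";
    }

    /** Renders the series names as the JSON array of strings that /search returns. */
    static String toJson(List<String> seriesNames) {
        return seriesNames.stream()
                .map(SearchResponse::quote)
                .collect(Collectors.joining(",", "[", "]"));
    }

    public static void main(String[] args) {
        System.out.println(toJson(List.of("Metric Name 1", "Metric Name 2")));
    }
}
```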

/annotations Format
Request
{
  "annotation" : {
    "name" : "Test",
    "iconColor" : "rgba(255, 96, 96, 1)",
    "datasource" : "Simple Example DS",
    "enable" : true,
    "query" : "{\"name\":\"Timeseries A\"}"
  },
  "range" : {
    "from" : "2016-06-13T12:23:47.387Z",
    "to" : "2016-06-13T12:24:19.217Z"
  },
  "rangeRaw" : {
    "from" : "2016-06-13T12:23:47.387Z",
    "to" : "2016-06-13T12:24:19.217Z"
  }
}
Response
[
  {
    "annotation" : {
      "name" : "Test",
      "iconColor" : "rgba(255, 96, 96, 1)",
      "datasource" : "Simple Example DS",
      "enable" : true,
      "query" : "{\"name\":\"Timeseries A\"}"
    },
    "time" : 1465820629774,
    "title" : "Marker",
    "tags" : [ "Tag 1", "Tag 2" ]
  }
]

/query Format
Request
{
  "panelId" : 1,
  "maxDataPoints" : 1904,
  "format" : "json",
  "range" : {
    "from" : "2016-06-13T12:23:47.387Z",
    "to" : "2016-06-13T12:24:19.217Z"
  },
  "rangeRaw" : {
    "from" : "2016-06-13T12:23:47.387Z",
    "to" : "2016-06-13T12:24:19.217Z"
  },
  "interval" : "20ms",
  "targets" : [
    { "target" : "Timeseries A", "refId" : "A" }
  ]
}
Response
[
  {
    "target" : "Timeseries A",
    "datapoints" : [
      [1936, 1465820629774],
      [2105, 1465820632673],
      [4187, 1465820635570],
      [30001, 1465820645243]
    ]
  },
  {
    "target" : "Timeseries B",
    "datapoints" : [ ]
  }
]
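A minimal sketch of rendering one series of the /query response. It shows the detail that is easy to get wrong: each data point is a pair with the value first and the timestamp (epoch milliseconds) second. The names QueryResponse, DataPoint and seriesToJson are hypothetical, and the target name is assumed to contain no characters that need JSON escaping.

```java
import java.util.ArrayList;
import java.util.List;

public class QueryResponse {

    /** One measurement: a value and its timestamp in epoch milliseconds. */
    static final class DataPoint {
        final long value;
        final long timestampMillis;

        DataPoint(long value, long timestampMillis) {
            this.value = value;
            this.timestampMillis = timestampMillis;
        }
    }

    /** Renders one series object; every pair is [value, timestamp], in that order. */
    static String seriesToJson(String target, List<DataPoint> points) {
        StringBuilder sb = new StringBuilder();
        sb.append("{\"target\":\"").append(target).append("\",\"datapoints\":[");
        for (int i = 0; i < points.size(); i++) {
            if (i > 0) {
                sb.append(',');
            }
            DataPoint p = points.get(i);
            sb.append('[').append(p.value).append(',').append(p.timestampMillis).append(']');
        }
        return sb.append("]}").toString();
    }

    public static void main(String[] args) {
        List<DataPoint> points = new ArrayList<>();
        points.add(new DataPoint(1936L, 1465820629774L));
        points.add(new DataPoint(2105L, 1465820632673L));
        // The full response is a JSON array of such series objects.
        System.out.println("[" + seriesToJson("Timeseries A", points) + "]");
    }
}
```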

Structure of the Source Data
{
"_id" : ObjectId("56375bc54f3c4caedfe68aca"),
"t" : {
"eDesc" : "Some more verbose description of the datapoint than the name",
"eId" : "56375ae24f3c4caedfe68a07",
"name" : "a name of the datapoint series",
"profile" : "P01",
"rnId" : "56375b694f3c4caedfe68aa0",
"rnStatus" : "OK",
"uId" : "anonymous"
},
"n" : {
"begin" : NumberLong("1446468494689"),
"value" : NumberLong(283)
}
}
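A sketch of how one such document could be mapped to a Grafana data point. It assumes that "n.begin" is the timestamp in epoch milliseconds, "n.value" the measured value, and "t.name" the series name; that reading fits the sample document but is not stated on the slide. The document is modelled as nested maps so the example needs no MongoDB driver; DocumentMapper and its methods are hypothetical names.

```java
import java.util.Map;

public class DocumentMapper {

    /** Returns {value, timestampMillis} read from the "n" sub-document. */
    static long[] toDataPoint(Map<String, Object> doc) {
        @SuppressWarnings("unchecked")
        Map<String, Object> n = (Map<String, Object>) doc.get("n");
        long begin = ((Number) n.get("begin")).longValue();
        long value = ((Number) n.get("value")).longValue();
        return new long[] {value, begin};
    }

    /** Returns the series name read from the "t" sub-document. */
    static String seriesName(Map<String, Object> doc) {
        @SuppressWarnings("unchecked")
        Map<String, Object> t = (Map<String, Object>) doc.get("t");
        return (String) t.get("name");
    }

    public static void main(String[] args) {
        Map<String, Object> doc = Map.of(
                "t", Map.of("name", "a name of the datapoint series", "profile", "P01"),
                "n", Map.of("begin", 1446468494689L, "value", 283L));
        long[] point = toDataPoint(doc);
        System.out.println(seriesName(doc) + ": [" + point[0] + "," + point[1] + "]");
    }
}
```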

Custom Datasource
• Should be
  • Lightweight
  • Fast / Performant
  • Simple

Microservice?
• Options for implementation
  • Java EE Microservice (e.g. WildFly Swarm)
  • Spring Boot Microservice
  • Vert.x Microservice
  • Node.js
  • ...

The Alternative Options
Node.js
• Single Threaded
• Child Worker Processes
• JavaScript Only
• Not the best choice for heavy computation
Spring / Java EE
• Multithreaded
• Clusterable
• Java Only
• Solid workhorses, cumbersome at times

Why Vert.x?
• High Performance, Low Footprint
• Asynchronous, Non-Blocking
• Actor-like Concurrency
  • Event & Worker Verticles
  • Message Driven
• Polyglot
  • Java, Groovy, JavaScript, Scala …
• Scalable
  • Distributed Eventbus
  • Multi-threaded Event Loops

Asynchronous non-blocking vs Synchronous blocking

Event Loop
[Photo: Andreas Praefcke]

Event Loop and Verticles
[Photo: RokerHRO]
3rd Floor, Verticle A
2nd Floor, Verticle B
1st Floor, Verticle C

Event Loop
[Diagram: Event and I/O feeding an event loop that serves three Verticles]

Event Bus
[Diagram: three Verticles exchanging a Message over the Eventbus]

Multi-Reactor
[Diagram: a CPU with four cores running Verticles; the Eventbus also connects to a Browser and to another Vert.x instance]

Event & Worker Verticles
[Diagram: Event Driven Verticles on one side, Worker Verticles on the other; the diagram also shows two Thread Pools]

Implementing the datasource
• HTTP Routes
• Pre-Processing
  • Optional, for example: Split Request
• Query Database for Time Data Points
• Post-Processing
  • Optional, for example: Merge, Calculations

Step 1 – The naive approach
• Find all data points within range
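The naive lookup can be sketched as a single range filter. The field names "t.name" and "n.begin" come from the sample document shown earlier; using "n.begin" as the timestamp and "t.name" as the series key is an assumption. The filter is built as plain nested maps mirroring MongoDB's $gte / $lte query operators, so no driver is needed; RangeFilter and between are hypothetical names.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RangeFilter {

    /** Builds { "t.name": name, "n.begin": { "$gte": from, "$lte": to } } as nested maps. */
    static Map<String, Object> between(String seriesName, long fromMillis, long toMillis) {
        Map<String, Object> range = new LinkedHashMap<>();
        range.put("$gte", fromMillis);
        range.put("$lte", toMillis);

        Map<String, Object> filter = new LinkedHashMap<>();
        filter.put("t.name", seriesName);
        filter.put("n.begin", range);
        return filter;
    }

    public static void main(String[] args) {
        // Prints the map's toString form, which is not JSON but shows the structure.
        System.out.println(between("Timeseries A", 1465820627387L, 1465820659217L));
    }
}
```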

Step 2 – Split Request
• Split request into chunks (#chunks = #cores)
• Use multiple Verticle instances in parallel (#instances = #cores)?
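A sketch of the splitting step, assuming the request's time range is cut into equally sized sub-ranges, one per core. How the original service chose chunk boundaries is not stated on the slide; RangeSplitter and split are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

public class RangeSplitter {

    /**
     * Splits [fromMillis, toMillis] into the given number of consecutive chunks.
     * Each chunk is {start, end}, meant as half-open [start, end) so that no data
     * point is queried twice; the last chunk ends exactly at toMillis.
     */
    static List<long[]> split(long fromMillis, long toMillis, int chunks) {
        if (chunks < 1 || toMillis < fromMillis) {
            throw new IllegalArgumentException("need chunks >= 1 and from <= to");
        }
        long span = toMillis - fromMillis;
        List<long[]> result = new ArrayList<>();
        for (int i = 0; i < chunks; i++) {
            long start = fromMillis + span * i / chunks;
            long end = fromMillis + span * (i + 1) / chunks;
            result.add(new long[] {start, end});
        }
        return result;
    }

    public static void main(String[] args) {
        int cores = Runtime.getRuntime().availableProcessors();
        for (long[] chunk : split(1465820627387L, 1465820659217L, cores)) {
            System.out.println(chunk[0] + " .. " + chunk[1]);
        }
    }
}
```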

Step 4 – Aggregate Datapoints
• Use Mongo Aggregation Pipeline
• Reduce the number of data points returned to the service
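One way such a pipeline could look, sketched as a JSON string: match the series and range, group the points into fixed-width time buckets, and return one averaged value per bucket. The bucket key (timestamp minus timestamp modulo interval) and the choice of $avg are assumptions for illustration; the slide does not say which aggregate the original pipeline used. $match, $group, $subtract, $mod, $avg and $sort are standard aggregation operators. BucketPipeline and build are hypothetical names, and the series name is assumed to need no JSON escaping.

```java
public class BucketPipeline {

    /** Builds an aggregation pipeline (as JSON text) averaging n.value per time bucket. */
    static String build(String seriesName, long fromMillis, long toMillis, long intervalMillis) {
        if (intervalMillis <= 0) {
            throw new IllegalArgumentException("interval must be positive");
        }
        // Bucket start = n.begin - (n.begin mod interval)
        String bucketStart = "{\"$subtract\":[\"$n.begin\",{\"$mod\":[\"$n.begin\","
                + intervalMillis + "]}]}";
        return "["
                + "{\"$match\":{\"t.name\":\"" + seriesName + "\",\"n.begin\":{\"$gte\":"
                + fromMillis + ",\"$lt\":" + toMillis + "}}},"
                + "{\"$group\":{\"_id\":" + bucketStart + ",\"value\":{\"$avg\":\"$n.value\"}}},"
                + "{\"$sort\":{\"_id\":1}}"
                + "]";
    }

    public static void main(String[] args) {
        System.out.println(build("Timeseries A", 1465820627387L, 1465820659217L, 1000L));
    }
}
```

With a bucket width derived from the requested range and maxDataPoints, the database returns at most roughly as many points as the panel can draw.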

Step 5 – Percentiles (Service)
• Fetch all data
• Calculate percentiles in service
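A sketch of a service-side percentile calculation. It uses one common textbook rule: sort the values, compute index = k/100 × n; if the index is not whole, round it up and take that value; if it is whole, average that value and the next one. Whether the original service used exactly this rule is an assumption. Percentiles and percentile are hypothetical names.

```java
import java.util.Arrays;

public class Percentiles {

    /** Returns the k-th percentile (0 <= k <= 100) of the given values. */
    static double percentile(long[] values, double k) {
        if (values.length == 0 || k < 0 || k > 100) {
            throw new IllegalArgumentException("need at least one value and 0 <= k <= 100");
        }
        long[] sorted = values.clone();
        Arrays.sort(sorted);

        double index = k * sorted.length / 100.0;
        int rank = (int) Math.ceil(index);
        if (index != rank) {
            // Not a whole number: round up and take that value (rank is 1-based).
            return sorted[rank - 1];
        }
        if (rank == 0) {
            return sorted[0];
        }
        if (rank == sorted.length) {
            return sorted[sorted.length - 1];
        }
        // Whole number: average the value at that rank and the one following it.
        return (sorted[rank - 1] + sorted[rank]) / 2.0;
    }

    public static void main(String[] args) {
        long[] responseTimes = {1936, 2105, 4187, 30001};
        System.out.println(percentile(responseTimes, 50));
        System.out.println(percentile(responseTimes, 90));
    }
}
```

Because every value in the range has to be fetched and sorted in the service, the cost of this step grows with the size of the query range.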

Step 6 – Percentiles (DB)
• Build aggregation pipeline to calculate percentiles
• Algorithm, see http://www.dummies.com/education/math/statistics/how-to-calculate-percentiles-in-statistics/

Datasource Architecture
[Diagram: HTTP Request → HTTP Service → Split Request → Eventbus → 4 × (Query Database → Post Process), each querying the DB → Eventbus → Merge Result → HTTP Response]
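The split / query / merge flow can be sketched with the JDK alone. This example swaps in CompletableFuture for the Vert.x event bus and verticles, so it shows only the scatter-gather shape, not the Vert.x API. It also blocks on join() while merging, which a real verticle must not do on an event loop. ScatterGather and queryAll are hypothetical names, and the query function in main is a stand-in for a database call.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.function.Function;
import java.util.stream.Collectors;

public class ScatterGather {

    /** Runs the query for every chunk in parallel and concatenates results in chunk order. */
    static <C, R> List<R> queryAll(List<C> chunks, Function<C, List<R>> queryChunk) {
        // Scatter: start one asynchronous query per chunk.
        List<CompletableFuture<List<R>>> futures = chunks.stream()
                .map(chunk -> CompletableFuture.supplyAsync(() -> queryChunk.apply(chunk)))
                .collect(Collectors.toList());

        // Gather: wait for each partial result and merge, preserving chunk order.
        List<R> merged = new ArrayList<>();
        for (CompletableFuture<List<R>> future : futures) {
            merged.addAll(future.join());
        }
        return merged;
    }

    public static void main(String[] args) {
        List<long[]> chunks = List.of(new long[] {0, 25}, new long[] {25, 50}, new long[] {50, 100});
        // Stand-in query: pretends each chunk contains one point at its start time.
        List<Long> points = queryAll(chunks, chunk -> List.of(chunk[0]));
        System.out.println(points);
    }
}
```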

Remember the Cell BE?
• CPU of the PS3
• Central Bus (EIB)
• 1 General Purpose CPU
  • Runs the Game (Event) Loop
• 8 SPUs
  • Special Processing
    • Sound
    • Physics
    • AI
    • ...

Adding more stats & calculation
• Push calculation to the DB if possible
• Add more workers / nodes for complex (post-)processing
• Aggregate results before post-processing
• DB performance is king

Takeaways
• Grafana is a powerful tool for visualizing data
  • Information becomes consumable through visualization
  • Information is essential for decision making
• Vert.x is
  • Reactive, Non-Blocking, Asynchronous, Scalable
  • Running on the JVM
  • Polyglot
  • Fun
Source code on:
https://github.com/gmuecke/grafana-vertx-datasource

Gerald Mücke www.devcon5.ch
DevCon5 @gmuecke
