Chronix as long term storage for Prometheus
Florian Lautenschlager, Moritz Kammerer
@flolaut, @phxql
[Figure: Prometheus monitoring several Cloud Native Applications]
Real-time monitoring and alerting for cloud native apps to detect
anomalies close to their occurrence and to initiate measures.
[Time axis: the last 14 days up to NOW]
Beyond real-time monitoring of cloud native apps?
Nothing more to do?
[Figure: Prometheus monitoring several Cloud Native Applications in real time; the time axis now reaches from NOW back to THEN]
Chronix: lossless long-term storage that stores data forever and allows
analyses beyond real-time monitoring!
Agenda
■ Some words about Chronix, its Architecture, its Features, and its Performance.
■ How we built the integration with Prometheus.
■ Showcase: Prometheus, Chronix Ingester, Chronix, and Grafana
Chronix is more than just a simple time series database. It’s a
time series processing tool stack for all purposes.
Time Series Database: What’s that?
■ Definition 1: “A sample s is a tuple of {timestamp, value}, where the
value could be any kind of object.”
■ Definition 2: “A time series T is an arbitrary list of chronologically
ordered samples of one value type”.
■ Definition 3: “A chunk C is a chronologically ordered part of a time
series.”
■ Definition 4: “A time series database TSDB is a specialized database
for storing and retrieving time series in an efficient and optimized
way”.
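[Figure: a sample s = {t,v}; a time series T = {s1,s2}; a time series T1 split into chunks C1,1 and C1,2; a TSDB holding chunks of several time series (T1, T3, C2,1, C2,2)]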
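The definitions above can be sketched in Java as follows. This is only an illustration: the names `TimeSeriesSketch`, `Sample`, and `chunk` are made up for this example and are not part of Chronix, and the value type is fixed to `double` for brevity although Definition 1 allows any kind of object.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Illustrative only: these types are hypothetical and not part of the Chronix API.
final class TimeSeriesSketch {

    /** Definition 1: a sample is a tuple of {timestamp, value}. */
    static final class Sample {
        final long timestamp;
        final double value;

        Sample(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /**
     * Definition 3: splits a chronologically ordered time series (Definition 2)
     * into chunks of at most chunkSize samples each, preserving the order.
     */
    static List<List<Sample>> chunk(List<Sample> series, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        List<List<Sample>> chunks = new ArrayList<>();
        for (int i = 0; i < series.size(); i += chunkSize) {
            int end = Math.min(i + chunkSize, series.size());
            chunks.add(Collections.unmodifiableList(new ArrayList<>(series.subList(i, end))));
        }
        return chunks;
    }
}
```

A series of five samples chunked with `chunkSize` 2 yields three chunks, the last one holding a single sample.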
Chronix’ architecture enables both efficient storage of time
series and millisecond range queries.
(1) Semantic Transformation
(2) Attributes and Chunks
(3) Basic Compression
(4) Multi-Dimensional Storage
[Figure: a record (data:<chunk>, attributes) becomes a record (data:compressed <chunk>, attributes) and is written to the record storage. Example: 68 billion points = 1 million chunks * 68,000 points, ~ 96% compression. One element of the figure is marked "Optional".]
The key data type of Chronix is called a record.
It stores a compressed time series chunk and its attributes.
record{
  data:compressed{<chunk>}
  //technical fields
  id: 3dce1de0-...-93fb2e806d19
  version: 1501692859622883300
  start: 1427457011238
  end: 1427471159292
  //optional attributes
  host: prodI5
  process: scheduler
  group: jmx
  metric: heapMemory.Usage.Used
  max: 896.571
}
Data:compressed{<chunk of time series data>}
■ Time Series: timestamp, numeric value
■ Traces: calls, exceptions, …
■ Logs: access, method runtimes
■ Complex data: models, test coverage,
anything else…
Optional attributes
■ Arbitrary attributes for the time series
■ Attributes are indexed
■ Make the chunk searchable
■ Can contain pre-calculated values
Chronix provides specialized aggregations, transformations,
and analyses for time series that are commonly used.
Aggregations
■ Min / Max / Average / Sum / Count
■ Percentile
■ Standard Deviation
■ First / Last
■ Range
Analyses
■ Trend Analysis: using a linear regression model
■ Outlier Analysis: using the IQR
■ Frequency Analysis: check occurrence within a time range
■ Fast Dynamic Time Warping: time series similarity search
■ Symbolic Aggregate Approximation: similarity and pattern search
Transformations
■ Bottom/Top n-values
■ Moving average
■ Divide / Scale
■ Downsampling
Many more
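To illustrate the trend analysis, the sketch below fits a least-squares line through the samples and reports a trend when the slope is positive. The slides only say that a linear regression model is used, so the positive-slope rule and all names here (`TrendSketch`, `slope`, `hasUpwardTrend`) are assumptions, not the Chronix implementation.

```java
// Illustrative only: a plain least-squares fit, not the Chronix implementation.
final class TrendSketch {

    /** Returns the least-squares slope of values over timestamps, or 0 if undefined. */
    static double slope(long[] timestamps, double[] values) {
        int n = timestamps.length;
        if (n != values.length) {
            throw new IllegalArgumentException("timestamps and values must have the same length");
        }
        if (n < 2) {
            return 0.0;
        }
        double meanT = 0.0;
        double meanV = 0.0;
        for (int i = 0; i < n; i++) {
            meanT += timestamps[i];
            meanV += values[i];
        }
        meanT /= n;
        meanV /= n;

        double covariance = 0.0;
        double variance = 0.0;
        for (int i = 0; i < n; i++) {
            double dt = timestamps[i] - meanT;
            covariance += dt * (values[i] - meanV);
            variance += dt * dt;
        }
        // All timestamps identical: no line can be fitted.
        return variance == 0.0 ? 0.0 : covariance / variance;
    }

    /** Assumed decision rule: a positive slope counts as an upward trend. */
    static boolean hasUpwardTrend(long[] timestamps, double[] values) {
        return slope(timestamps, values) > 0.0;
    }
}
```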
Only scalar values? One size fits all? No! What about logs,
traces, and others? No problem – Just do it yourself!
■ Chronix Time Series
  ■ Time series framework that is used by Chronix.
  ■ Time series types:
    ■ Numeric: doubles (the default time series type)
    ■ More to come.
public interface TimeSeriesConverter<T> {
    /**
     * Shall create an object of type T from the given binary time series.
     */
    T from(BinaryTimeSeries binaryTimeSeriesChunk, long queryStart, long queryEnd);

    /**
     * Shall convert the custom time series T into the binary time series that is stored.
     */
    BinaryTimeSeries to(T timeSeriesChunk);
}
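A converter for a custom type could look roughly like the sketch below, which turns log lines into a binary chunk and back. To stay self-contained it does not use the Chronix classes: `BinaryChunk` and `Converter` are hypothetical stand-ins that only mirror the shape of `BinaryTimeSeries` and `TimeSeriesConverter`, and the tab-separated text encoding is an arbitrary choice for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Illustrative only: BinaryChunk and Converter are hypothetical stand-ins,
// not the Chronix BinaryTimeSeries or TimeSeriesConverter types.
final class ConverterSketch {

    /** Stand-in for a stored binary chunk: payload plus its time range. */
    static final class BinaryChunk {
        final byte[] data;
        final long start;
        final long end;

        BinaryChunk(byte[] data, long start, long end) {
            this.data = data;
            this.start = start;
            this.end = end;
        }
    }

    /** Same shape as the interface on the slide, but using the stand-in type. */
    interface Converter<T> {
        T from(BinaryChunk chunk, long queryStart, long queryEnd);

        BinaryChunk to(T timeSeriesChunk);
    }

    /** A custom time series type: chronologically ordered log lines. */
    static final class LogChunk {
        final List<Long> timestamps = new ArrayList<>();
        final List<String> lines = new ArrayList<>();

        void add(long timestamp, String line) {
            timestamps.add(timestamp);
            lines.add(line);
        }
    }

    /** Stores each entry as "timestamp TAB line NEWLINE"; lines must not contain newlines. */
    static final class LogConverter implements Converter<LogChunk> {

        @Override
        public LogChunk from(BinaryChunk chunk, long queryStart, long queryEnd) {
            LogChunk result = new LogChunk();
            String text = new String(chunk.data, StandardCharsets.UTF_8);
            if (text.isEmpty()) {
                return result;
            }
            for (String row : text.split("\n")) {
                int tab = row.indexOf('\t');
                long timestamp = Long.parseLong(row.substring(0, tab));
                // Keep only entries inside the queried time range.
                if (timestamp >= queryStart && timestamp <= queryEnd) {
                    result.add(timestamp, row.substring(tab + 1));
                }
            }
            return result;
        }

        @Override
        public BinaryChunk to(LogChunk logs) {
            StringBuilder text = new StringBuilder();
            for (int i = 0; i < logs.lines.size(); i++) {
                text.append(logs.timestamps.get(i)).append('\t').append(logs.lines.get(i)).append('\n');
            }
            int n = logs.timestamps.size();
            long start = n == 0 ? 0L : logs.timestamps.get(0);
            long end = n == 0 ? 0L : logs.timestamps.get(n - 1);
            return new BinaryChunk(text.toString().getBytes(StandardCharsets.UTF_8), start, end);
        }
    }
}
```

The `from` side applies the query range while decoding, which matches the role of the `queryStart` and `queryEnd` parameters in the interface above.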
That’s the easiest way to play with Chronix. A single instance of
Chronix on a single node.
[Figure: on your computer, Chronix 0.4 runs as Solr plugins (Chronix-Query-Handler, Chronix-Ingestion-Handler, Chronix-Compaction-Handler, Chronix-Retention) in Solr 6.2.1 (Lucene) on Java 8 (JRE), reachable via HTTP on port 8983. Also shown: Chronix Client (Java, Go) and OpenTSDB, Prometheus, KairosDB, InfluxDB, Graphite.]
Code-Slide: How to set up Chronix, ask for time series data, and
call some server-side aggregations in Java.
■ Create a connection to Solr and set up Chronix
■ Define a range query and stream its results
■ Call some aggregations
// Create a connection to Solr and set up Chronix.
// groupBy and reduce (defined elsewhere) group chunks on a combination
// of attributes and reduce them to a time series.
SolrClient solr = new HttpSolrClient("http://localhost:8983/solr/chronix/");
ChronixClient<MetricTimeSeries, SolrClient, SolrQuery> chronix = new ChronixClient<>(
        new MetricTimeSeriesConverter<>(),
        new ChronixSolrStorage<>(200, groupBy, reduce));
// Get all time series whose metric contains Load.
SolrQuery query = new SolrQuery("metric:*Load*");
chronix.stream(solr, query);
// Call some server-side aggregations on the matching time series.
query.addFilterQuery("function=max,min,count,sdiff");
Stream<MetricTimeSeries> stream = chronix.stream(solr, query);
// sdiff is the signed difference: First=20, Last=-100 → -80
Compared to other time series databases, Chronix’ results for
our use case are outstanding.
■ We compared Chronix with:
  ■ InfluxDB, OpenTSDB, and KairosDB
  ■ All databases are configured as single node
■ Storage demand for 108 GB of raw CSV time series data.
  ■ Chronix (8.7 GB) needs 20% – 84% less space than the other time series databases.
■ Query times on imported data.
  ■ 73% – 92% faster on data retrieval.
  ■ 80% – 97% faster on a mix of analyses.
■ Memory footprint: after start, max during import, max during query mix
  ■ Chronix takes 1.6 times less memory than the best alternative.
The hard facts. For more details, I suggest reading our
research paper about Chronix.
Florian Lautenschlager, Michael Philippsen, Andreas Kumlehn, Josef Adersberger
Chronix: Long Term Storage and Retrieval Technology for Anomaly Detection in
Operational Data
FAST 2017 (submitted)
Let’s dig into the Chronix Ingester’s internals.
Image Credit: http://www.taringa.net/posts/ciencia-educacion/12656540/La-Filosofia-del-Dr-House-2.html
Big Picture. It’s a simple and scalable architecture.
Prometheus (standard Prometheus installation)
• Collects metrics from various services.
• Writes them to its default storage.
• Writes them using the standard remote write interface to the Chronix Ingester.
Chronix Ingester
• Collects samples in batches and writes them later to Chronix with an ideal batch size.
• Writes checkpoints to disk to avoid loss of data.
• Scales easily.
Chronix Server
• Lossless long-term storage.
• Data distribution (Apache Solr).
• Rich set of analysis functions for data analytics beyond real-time monitoring.
Everything runs on a single machine. Small. Simple. Beautiful.
[Figure: on a single host, Prometheus passes samples (S) to the Chronix Ingester (in-memory), which passes batches (B) to the Chronix Server.]
S Sample: {t,v}
B Batch: [{t,v},{t,v},{t,v}]
One Chronix Ingester per Prometheus on a single host.
[Figure: a single host with two Prometheus servers, each with its own Chronix Ingester (in-memory), and one Chronix Server; samples (S) go in, batches (B) come out.]
Chronix Ingester Singleton ;-)
[Figure: a single host with two Prometheus servers, one shared Chronix Ingester (in-memory), and one Chronix Server receiving batches (B).]
Chronix Ingester Cloud behind a proxy to serve multiple
Prometheus servers.
[Figure: a single host with several Prometheus servers, an NGINX proxy, two Chronix Ingesters (in-memory), and one Chronix Server; samples (S) go in, batches (B) come out.]
Cloud Mode: Multiple Prometheus Servers, One Chronix Ingester
per Host, A Chronix Server Cloud
[Figure: several single hosts, each with a Prometheus server and a Chronix Ingester (in-memory); an NGINX proxy; a Chronix Server Cloud with a master.]
Architectural Key Factor: The Chronix Ingester
■ Small Go Program
  ■ Binary Size: 8.5 MB
  ■ Lines of Code: ~ 720 LoC
  ■ Scales easily: Copy, Execute
■ Handles writes from Prometheus
  ■ Just a small configuration:
    remote_write:
      url: http://<host>:<port>/ingest
■ Batches samples in memory
  ■ Prometheus sends single samples.
  ■ Chronix needs large chunks (n single samples) to work well
  ■ Max Batch Age
    ■ 5M, 12H, ..
■ Crash and restart resilience
  ■ In-memory is dangerous. The Ingester holds some amount of transient state
  ■ Regularly writes checkpoints of the entire in-memory state to disk
  ■ Latest checkpoint is loaded on restart
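The batching idea can be sketched as follows: samples are collected per time series in memory, and a batch is handed over once it is older than the maximum batch age. The Ingester itself is a Go program; Java is used here only to match the other code on these slides. All names (`BatcherSketch`, `append`, `flushExpired`) are hypothetical, the current time is passed in explicitly to keep the example deterministic, and checkpointing to disk is left out.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

// Illustrative only: not the Chronix Ingester code (which is written in Go).
final class BatcherSketch {

    static final class Sample {
        final long timestamp;
        final double value;

        Sample(long timestamp, double value) {
            this.timestamp = timestamp;
            this.value = value;
        }
    }

    /** All samples collected for one time series since the batch was opened. */
    static final class Batch {
        final String series;
        final long createdAtMillis;
        final List<Sample> samples = new ArrayList<>();

        Batch(String series, long createdAtMillis) {
            this.series = series;
            this.createdAtMillis = createdAtMillis;
        }
    }

    private final long maxBatchAgeMillis;
    private final Map<String, Batch> openBatches = new HashMap<>();

    BatcherSketch(long maxBatchAgeMillis) {
        this.maxBatchAgeMillis = maxBatchAgeMillis;
    }

    /** Adds a single sample to the open batch of its series, opening one if needed. */
    void append(String series, Sample sample, long nowMillis) {
        Batch batch = openBatches.get(series);
        if (batch == null) {
            batch = new Batch(series, nowMillis);
            openBatches.put(series, batch);
        }
        batch.samples.add(sample);
    }

    /** Removes and returns every batch that has reached the maximum batch age. */
    List<Batch> flushExpired(long nowMillis) {
        List<Batch> ready = new ArrayList<>();
        Iterator<Batch> it = openBatches.values().iterator();
        while (it.hasNext()) {
            Batch batch = it.next();
            if (nowMillis - batch.createdAtMillis >= maxBatchAgeMillis) {
                ready.add(batch);
                it.remove();
            }
        }
        return ready;
    }
}
```

A caller would invoke `flushExpired` periodically and write each returned batch to the Chronix Server as one chunk.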
Chronix loves Chunks. Hence the Ingester batches samples.
The data models for Prometheus and Chronix are similar.
■ Prometheus
  ■ Uses so-called labels (key-value pairs) to store dimensional values
  ■ Labels are added dynamically
  ■ Stores samples (pairs of timestamp and scalar value)
■ Chronix
  ■ Uses attributes (key-value pairs) to store dimensional values
  ■ Schema, schemaless, dynamic fields, etc.
  ■ Stores samples of timestamp and any value type: scalar, trace, string, etc.
An example Chronix schema to define the available fields.
<?xml version="1.0" encoding="UTF-8" ?>
<schema name="Chronix" version="1.5">
  <!-- Definition of types -->
  <types>
    <fieldType name="int" class="solr.TrieIntField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="long" class="solr.TrieLongField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="double" class="solr.TrieDoubleField" precisionStep="0" positionIncrementGap="0"/>
    <fieldType name="string" class="solr.StrField" sortMissingLast="true" omitNorms="true"/>
    <fieldType name="binary" class="solr.BinaryField"/>
  </types>
  <!-- Available fields -->
  <fields>
    <!-- The required fields -->
    <field name="id" type="string" indexed="true" stored="true" required="true"/>
    <field name="_version_" type="long" indexed="true" stored="true"/>
    <field name="start" type="long" indexed="true" stored="true" required="true"/>
    <field name="end" type="long" indexed="true" stored="true" required="true"/>
    <field name="data" type="binary" indexed="true" stored="true" required="false"/>
    <field name="metric" type="string" indexed="true" stored="true" required="true"/>
    <!-- Dynamic field for tags -->
    <dynamicField name="*_s" type="string" indexed="true" stored="true"/>
  </fields>
  <uniqueKey>id</uniqueKey>
  <solrQueryParser defaultOperator="OR"/>
</schema>
Prometheus labels are strings. Chronix Ingester creates them in
Chronix Server dynamically using the dynamicField *_s.
Prometheus_Label -> Chronix_Label
host -> host_s
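The label mapping can be sketched in a few lines: every Prometheus label name gets the suffix `_s` so that it matches the dynamic field `*_s`. The class and method names are hypothetical, and the sketch covers only the renaming shown above; how the Ingester fills the required `metric` field is not shown here.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: not the Chronix Ingester code (which is written in Go).
final class LabelMappingSketch {

    /** Suffix matching the dynamicField pattern "*_s" in the schema. */
    static final String STRING_SUFFIX = "_s";

    /** Renames every label "name" to "name_s" and keeps its value unchanged. */
    static Map<String, String> toChronixAttributes(Map<String, String> prometheusLabels) {
        Map<String, String> attributes = new LinkedHashMap<>();
        for (Map.Entry<String, String> label : prometheusLabels.entrySet()) {
            attributes.put(label.getKey() + STRING_SUFFIX, label.getValue());
        }
        return attributes;
    }
}
```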
Showcase: Prometheus, Chronix Ingester, Chronix and Grafana
[Figure: Prometheus passes samples (S) to the Chronix Ingester (in-memory), which passes batches (B) to the Chronix Server; Grafana sits on top.]
A few words about performance in our showcase.
Disk usage: 11 days of data, 112,815,835 samples
■ Prometheus: ~ 786 MB (whole data directory)
■ Chronix: ~ 265 MB (without compaction)
Compaction Effects.
Compaction   Points per Chunk   Amount of Records   Disk Usage in MB   Compaction Time in Seconds
no           -1                 610355              265                0
yes          100                1422369             357                134
yes          500                284815              187                75
yes          1000               142573              160                93
yes          5000               28850               131                69
yes          10000              14797               126                61
yes          25000              6408                123                61
yes          100000             2051                121                60
yes          500000             920                 119                63
Contains about 112 points per chunk without compaction!
CPU usage: 4 cores available (= 400 % max)
Memory consumption (max. 8 GB)
[Charts not preserved; series shown: Ingester, Prometheus]
Prometheus Configuration
Chronix Default Web-UI
Using the data source plugins for Chronix and Prometheus.
Ingester Health: Everything Green!
Short Term Data in Prometheus.
Long Term Data in Chronix.
See the difference?
Everything is open source and free to everyone.
The code is the truth.
Chronix Website: www.chronix.io
Chronix GitHub: https://github.com/ChronixDB
- Ingester: https://github.com/ChronixDB/chronix.ingester
Questions?
- Twitter: @ChronixDB, @flolaut, @phxql
- Slack: https://qaware.slack.com/messages/chronix/
Now it’s your turn.