InfluxDB
The time series database
Modern Factory #workshops

Marcin Szepczyński, July 2016
What is time series data?
• Time series data is a sequence of data points
recorded from the same source over a time interval.
• If you plot time series data, one of the axes
will always be time.
Examples of time series data
What is not time series data?
Regular vs irregular time series
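As a rough illustration of the distinction, assuming the usual definition (a regular series is sampled at a fixed interval, an irregular one is recorded whenever an event happens), the Python sketch below checks whether all gaps between consecutive timestamps are equal. The function name and the sample data are made up for this example.

```python
from datetime import datetime, timedelta


def is_regular(timestamps):
    """Return True when all consecutive timestamps are the same distance apart."""
    if len(timestamps) < 3:
        return True  # fewer than two gaps, so there is nothing to compare
    gaps = {later - earlier for earlier, later in zip(timestamps, timestamps[1:])}
    return len(gaps) == 1


start = datetime(2016, 7, 4)
# A sensor sampled every 10 seconds: regular.
sensor = [start + timedelta(seconds=10 * i) for i in range(5)]
# Events logged whenever they happen: irregular.
events = [start + timedelta(seconds=s) for s in (0, 3, 47, 48, 120)]

print(is_regular(sensor), is_regular(events))
```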
Time series data is good for
• Internet of Things (e.g. sensor data)
• Alerting
• Monitoring
• Real Time Analytics
InfluxDB is the I in the TICK stack
• Telegraf - time series data collector
• InfluxDB - time series database
• Chronograf - time series data visualization
• Kapacitor - time series data processing and alerting
InfluxDB features
• SQL-like query language
• Schemaless
• Case sensitive
• Data types: string, float64, int64, boolean
Measurement
• A measurement is the InfluxDB counterpart of a
table; a point is a single record (row) in a measurement
• Each point has a time (acting as the primary key),
tags (indexed columns) and fields (non-indexed
columns)
Inserting
INSERT table_name,tag1=value1,tag2=value2 temp=30.5,value=1.5
• table_name is the measurement name ("table")
• tag1=value1,tag2=value2 are the tags (tag1 = value1, tag2 = value2)
• temp=30.5,value=1.5 are the fields (temp = 30.5, value = 1.5)
• A comma separates the measurement name from the tags
• A comma separates one tag from the next and one field from the next
• A space separates the tags from the fields
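The separator rules above can be sketched as a small Python helper that assembles the text following INSERT. The function name `to_line` is hypothetical, and the sketch is deliberately minimal: it does no escaping and does not handle string or integer field values, which need their own formatting.

```python
def to_line(measurement, tags, fields):
    """Build 'measurement,tag=value,... field=value,...' from two dicts."""
    # Each tag is introduced by a comma, including the first one.
    tag_part = "".join(f",{key}={value}" for key, value in tags.items())
    # Fields are separated from each other by commas.
    field_part = ",".join(f"{key}={value}" for key, value in fields.items())
    # A single space separates the tag set from the field set.
    return f"{measurement}{tag_part} {field_part}"


line = to_line(
    "table_name",
    {"tag1": "value1", "tag2": "value2"},
    {"temp": 30.5, "value": 1.5},
)
print(line)
```

With the inputs shown, the result matches the text after INSERT on the slide.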
Querying
• Show databases:
> SHOW DATABASES
• Select database:
> USE workshop
• Show measurements ("tables"):
> SHOW MEASUREMENTS
• Simple select all:
> SELECT * FROM measurement_name
Querying (2)
• Select with limit:
> SELECT * FROM measure LIMIT 10
• Select with offset (used together with LIMIT):
> SELECT * FROM measure LIMIT 10 OFFSET 10
• Select with where clause:
> SELECT * FROM measure WHERE tag1 = 'value1'
• Select with order clause (InfluxQL orders by time only):
> SELECT * FROM measure ORDER BY time DESC
Querying (3)
• Operators:
= equal to
<>, != not equal to
> greater than
< less than
=~ matches against (REGEX)
!~ does not match against (REGEX)
Aggregations - COUNT()
Returns the number of non-null values.
> SELECT count(<field>) FROM measure
> SELECT count(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Aggregations - MEAN()
Returns the mean (average) value of a single field
(calculated only over non-null values).
> SELECT mean(<field>) FROM measure
> SELECT mean(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
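What GROUP BY time(1h) does to count() and mean() can be sketched in Python: points are placed in one bucket per hour, null values are skipped, and the aggregate is computed per bucket. This is an illustration of the semantics only, not of how InfluxDB executes the query; `group_by_hour` and the sample points are invented for the example.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean


def group_by_hour(points):
    """Map the start of each hour to the non-null values recorded in it."""
    buckets = defaultdict(list)
    for ts, value in points:
        if value is None:
            continue  # aggregations only consider non-null values
        hour_start = ts.replace(minute=0, second=0, microsecond=0)
        buckets[hour_start].append(value)
    return buckets


points = [
    (datetime(2016, 7, 4, 0, 5), 40.0),
    (datetime(2016, 7, 4, 0, 35), 44.0),
    (datetime(2016, 7, 4, 0, 50), None),
    (datetime(2016, 7, 4, 1, 10), 50.0),
]

hourly = group_by_hour(points)
counts = {hour: len(values) for hour, values in hourly.items()}
means = {hour: mean(values) for hour, values in hourly.items()}
```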
Aggregations - MEDIAN()
Returns the middle value from the sorted values in a
single field. It is similar to PERCENTILE(field, 50),
except that for an even number of values MEDIAN
returns the average of the two middle values.
> SELECT median(<field>) FROM measure
> SELECT median(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Aggregations - SPREAD()
Returns the difference between the minimum and
maximum values of a field.
> SELECT spread(<field>) FROM measure
> SELECT spread(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Aggregations - SUM()
Returns the sum of all values in a single field.
> SELECT sum(<field>) FROM measure
> SELECT sum(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - BOTTOM(N)
Returns the smallest N values in a single field.
> SELECT bottom(<field>, <N>) FROM measure
> SELECT bottom(cpu, 5) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - FIRST()
Returns the oldest value of a single field.
> SELECT first(<field>) FROM measure
> SELECT first(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - LAST()
Returns the newest value of a single field.
> SELECT last(<field>) FROM measure
> SELECT last(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - MAX()
Returns the highest value in a single field.
> SELECT max(<field>) FROM measure
> SELECT max(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - MIN()
Returns the lowest value in a single field.
> SELECT min(<field>) FROM measure
> SELECT min(cpu) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
Selectors - PERCENTILE(N)
Returns the Nth percentile value from the sorted values
of a single field.
> SELECT percentile(<field>, <N>) FROM measure
> SELECT percentile(cpu, 95) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
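To make the idea of a percentile selector concrete, here is a Python sketch that uses the nearest-rank method, so the result is always one of the stored values. The nearest-rank rule is an assumption chosen for illustration; InfluxDB's exact rounding may differ. The comparison with `statistics.median` shows why a 50th percentile and a median can disagree when the number of values is even.

```python
import math
from statistics import median


def percentile(values, n):
    """Nearest-rank percentile: returns one of the input values, or None if empty."""
    if not values:
        return None
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so that n = 0 still selects the smallest value.
    rank = max(1, math.ceil(n / 100 * len(ordered)))
    return ordered[rank - 1]


cpu = [40.0, 42.0, 44.0, 50.0]
p50 = percentile(cpu, 50)   # an actual data point
p95 = percentile(cpu, 95)
mid = median(cpu)           # average of the two middle values
```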
Selectors - TOP(N)
Returns the largest N values in a single field.
> SELECT top(<field>, <N>) FROM measure
> SELECT top(cpu, 5) FROM cpu_temp
  WHERE time > '2016-07-04'
  AND time < '2016-07-05'
  GROUP BY time(1h)
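The TOP and BOTTOM selectors can be mimicked in plain Python with the standard `heapq` module. This only illustrates which values are selected; the helper names and the sample readings are invented, and the sketch ignores timestamps and tags.

```python
import heapq


def top(values, n):
    """Return the n largest values, largest first."""
    return heapq.nlargest(n, values)


def bottom(values, n):
    """Return the n smallest values, smallest first."""
    return heapq.nsmallest(n, values)


cpu = [55.0, 61.5, 48.2, 70.1, 66.3, 52.9]
hottest = top(cpu, 3)
coolest = bottom(cpu, 3)
```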
GROUP BY clause
InfluxDB supports GROUP BY on tag values, on time
intervals, and on both combined. With fill() you choose
the value reported for intervals that contain no data.
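The role of fill() can be sketched in Python: after grouping by time, some intervals may contain no data, and a fill rule decides what to report for them. The sketch below supports a constant and a "previous" rule; the function name, the integer timestamps (seconds) and the sample values are assumptions made for the example.

```python
def fill_intervals(buckets, start, stop, step, fill=None):
    """Return one (interval_start, value) pair per interval in [start, stop).

    `buckets` maps interval start to an aggregated value. Empty intervals
    get `fill`; pass fill="previous" to repeat the last seen value.
    """
    rows = []
    previous = None
    for t in range(start, stop, step):
        if t in buckets:
            value = buckets[t]
            previous = value
        elif fill == "previous":
            value = previous
        else:
            value = fill
        rows.append((t, value))
    return rows


hourly_mean = {0: 41.0, 7200: 45.0}  # the hour starting at t=3600 has no data
with_null = fill_intervals(hourly_mean, 0, 10800, 3600)
with_zero = fill_intervals(hourly_mean, 0, 10800, 3600, fill=0)
with_previous = fill_intervals(hourly_mean, 0, 10800, 3600, fill="previous")
```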
Downsampling
InfluxDB can handle hundreds of thousands of data
points per second. Working with that much data over
a long period of time can create storage concerns. A
natural solution is to downsample the data: keep the
high-precision raw data for only a limited time, and
store the lower-precision, summarized data for much
longer or forever.
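As a minimal sketch of the idea, the Python function below reduces raw points to one mean value per fixed interval. It assumes integer timestamps in seconds and uses the mean as the summary; both choices, like the names, are illustrative rather than anything prescribed by InfluxDB.

```python
from collections import defaultdict
from statistics import mean


def downsample(points, interval):
    """Average (epoch_seconds, value) points into buckets of `interval` seconds."""
    buckets = defaultdict(list)
    for ts, value in points:
        # Round the timestamp down to the start of its interval.
        buckets[ts - ts % interval].append(value)
    return [(start, mean(values)) for start, values in sorted(buckets.items())]


# Raw readings every 10 seconds, summarized into 5-minute (300 s) means.
raw = [(0, 10.0), (10, 12.0), (20, 14.0), (300, 20.0), (310, 22.0)]
summary = downsample(raw, 300)
```

Five raw points become two summary points here; over weeks of data the reduction is what keeps long-term storage manageable.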
Data retention
A retention policy (RP) is the part of InfluxDB's data
structure that describes how long InfluxDB keeps
data and how many copies of the data are stored
in the cluster. A database can have several RPs, and
RPs are unique per database.
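The "how long to keep data" half of a retention policy can be pictured with a short Python sketch that discards everything older than a given duration. This is a simplification for illustration only: it filters point by point in memory and says nothing about how InfluxDB organizes or expires data on disk. The function name and sample data are hypothetical.

```python
from datetime import datetime, timedelta


def enforce_retention(points, duration, now):
    """Keep only the points whose timestamp is not older than now - duration."""
    cutoff = now - duration
    return [(ts, value) for ts, value in points if ts >= cutoff]


now = datetime(2016, 7, 5)
points = [
    (datetime(2016, 7, 1), 2.0),       # older than one day: dropped
    (datetime(2016, 7, 4, 12), 1.0),   # within the last day: kept
]
kept = enforce_retention(points, timedelta(days=1), now)
```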
More
https://influxdata.com/videos/
https://docs.influxdata.com/influxdb

