2.4 and 3.0 Update
Who Am I?
Chris Larsen
● Maintainer and author for OpenTSDB since 2013
● Software Engineer @ Yahoo
● Central Monitoring Team
Who I’m not:
● A marketer
● A sales person
What Is OpenTSDB?
● Open Source Time Series Database
● Scales to 10s of millions of writes per second
● Sucks up all data and keeps going
● Never lose precision (if you have space)
● Scales using HBase or Bigtable
What are Time Series?
● Time Series: A sequence of discrete data points (values), ordered and indexed by time and associated with an identity.
E.g.:
web01.sys.cpu.busy.pct 45% 1/1/2017 12:01:00
web01.sys.cpu.busy.pct 52% 1/1/2017 12:02:00
web01.sys.cpu.busy.pct 35% 1/1/2017 12:03:00
^ Identity             ^ Value ^ Timestamp
What are Time Series?
Data Point:
● Metric + Tags
● + Value: 42
● + Timestamp: 1234567890
sys.cpu.user 1234567890 42 host=web01 cpu=0
● Payload could also be a string, a blob, a histogram, etc.
^ a data point ^
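The data point layout above can be sketched in a few lines of Python. This is only an illustration of the metric + timestamp + value + tags model; `DataPoint` and `parse_data_point` are hypothetical names, not part of OpenTSDB.

```python
from dataclasses import dataclass
from typing import Dict, Union


@dataclass(frozen=True)
class DataPoint:
    metric: str
    timestamp: int              # Unix epoch seconds
    value: Union[int, float]
    tags: Dict[str, str]


def parse_data_point(line: str) -> DataPoint:
    """Parse '<metric> <timestamp> <value> <tagk=tagv> ...' into a DataPoint."""
    metric, timestamp, value, *tag_pairs = line.split()
    tags = dict(pair.split("=", 1) for pair in tag_pairs)
    # Keep integers as integers; fall back to float for values such as 42.5.
    number = int(value) if value.lstrip("-").isdigit() else float(value)
    return DataPoint(metric, int(timestamp), number, tags)


point = parse_data_point("sys.cpu.user 1234567890 42 host=web01 cpu=0")
print(point)
```

The identity of the series is the metric plus the full set of tags; the timestamp and value are what change from point to point.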
What are HBase and Bigtable?
● HBase is an OSS distributed LSM backed hash table based on Google’s Bigtable.
● Key value, row based column store.
● Sorted by row, columns and cell versions.
● Supports:
○ Scans across rows with filters.
○ Get specific row and/or columns.
○ Atomic operations.
● CP from CAP theorem.
OpenTSDB Schema
● Row key is a concatenation of UIDs and time:
○ salt + metric + timestamp + tagk1 + tagv1… + tagkN + tagvN
● sys.cpu.user 1234567890 42 host=web01 cpu=0
\x01\x00\x00\x01\x49\x95\xFB\x70\x00\x00\x01\x00\x00\x01\x00\x00\x02\x00\x00\x02
● Timestamp normalized on hour or daily boundaries.
● All data points for an hour or day are stored in one row.
● Data: VLE 64 bit signed integers or single/double precision signed floats, Strings and raw histograms.
● Saves storage space but requires UID conversion.
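As a rough sketch of the row key layout above, the following Python rebuilds the example key for `sys.cpu.user 1234567890 42 host=web01 cpu=0`. The UID assignments, the salt value passed in, and the function names are assumptions made for illustration; in a real deployment the UIDs come from the UID table and the salt is derived by the TSD.

```python
import struct

SALT_WIDTH = 1      # bytes; assumed from the 1-byte salt in the example key
UID_WIDTH = 3       # bytes per metric/tagk/tagv UID, as in the example key
ROW_SPAN = 3600     # seconds per row (hourly rows)

# Hypothetical UID assignments chosen to reproduce the example key.
METRIC_UIDS = {"sys.cpu.user": 1}
TAGK_UIDS = {"host": 1, "cpu": 2}
TAGV_UIDS = {"web01": 1, "0": 2}


def uid_bytes(uid: int) -> bytes:
    return uid.to_bytes(UID_WIDTH, "big")


def row_key(salt: int, metric: str, timestamp: int, tags: dict) -> bytes:
    base_time = timestamp - (timestamp % ROW_SPAN)   # normalize to the hour
    key = salt.to_bytes(SALT_WIDTH, "big")
    key += uid_bytes(METRIC_UIDS[metric])
    key += struct.pack(">I", base_time)              # 4-byte big-endian seconds
    # Tag pairs are ordered by tag key UID here (an assumption of this sketch).
    for tagk, tagv in sorted(tags.items(), key=lambda kv: TAGK_UIDS[kv[0]]):
        key += uid_bytes(TAGK_UIDS[tagk]) + uid_bytes(TAGV_UIDS[tagv])
    return key


key = row_key(1, "sys.cpu.user", 1234567890, {"host": "web01", "cpu": "0"})
print(key.hex())
```

The timestamp bytes `49 95 FB 70` are 1234566000, the hour boundary below 1234567890; the remaining 1890 seconds are what the column qualifier has to encode.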
OpenTSDB Schema
Row Key                          Columns (qualifier/value)
m t1 tagk1 tagv1                 o1/v1  o2/v2  o3/v3
m t1 tagk1 tagv2                 o1/v1  o2/v2
m t1 tagk1 tagv1 tagk2 tagv3     o1/v1  o2/v2  o3/v3
m t1 tagk1 tagv2 tagk2 tagv4     o1/v1         o3/v3
m t1 tagk3 tagv5                 o1/v1  o2/v2  o3/v3
m t1 tagk3 tagv6                        o2/v2
m t2 tagk1 tagv1                 o1/v1         o3/v3
m t2 tagk1 tagv2                 o1/v1  o2/v2
OpenTSDB Use Cases
● Backing store for Argus: Open source monitoring and alerting system.
● 50M writes per minute.
● ~4M writes per TSD per minute.
● 23k queries per minute.
● https://github.com/salesforce/Argus
OpenTSDB Use Cases
● Monitoring system, network and application performance and statistics.
● Single cluster: 10M to 18M writes/s ~ 3PB.
● Multi-tenant and Kerberos secure HBase.
● ~200k writes per second per TSD.
● Central monitoring for all Yahoo properties.
● Over 1 billion active time series served.
● Leading committer to OpenTSDB.
New for OpenTSDB 2.4
● Rollup / Pre-Aggregated storage and querying
○ Improves query speed
○ Allows for high-resolution data to be TTL’d out
● Histogram/Digests/Sketches
○ Accurate percentile calculations on distributed measurements such as latencies.
● Date Tiered Compaction support
● Authentication/Authorization plugin
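To illustrate the rollup idea from the list above, here is a minimal sketch that pre-aggregates raw points into hourly buckets. The `rollup` function and the choice of sum/count/min/max are assumptions for illustration and do not reflect the OpenTSDB 2.4 storage format.

```python
from collections import defaultdict


def rollup(points, interval=3600):
    """Group (timestamp, value) pairs into interval buckets of sum/count/min/max."""
    buckets = defaultdict(
        lambda: {"sum": 0.0, "count": 0, "min": float("inf"), "max": float("-inf")}
    )
    for ts, value in points:
        b = buckets[ts - ts % interval]   # bucket start time
        b["sum"] += value
        b["count"] += 1
        b["min"] = min(b["min"], value)
        b["max"] = max(b["max"], value)
    return dict(buckets)


raw = [(1234566000, 40.0), (1234566060, 50.0), (1234569600, 30.0)]
hourly = rollup(raw)
for base, agg in sorted(hourly.items()):
    print(base, agg, "avg =", agg["sum"] / agg["count"])
```

A query over a long time range can then read one aggregate per hour instead of every raw point, which is why the high-resolution data can be expired once the rollups exist.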
The Problem of Percentiles
● Aggregating percentiles is problematic.
● Averaging percentiles is inaccurate.
E.g. the average, 46.175, hides the bad host, web02
● Max is more useful for finding bad hosts
● But there are better ways...
latency.p99.9 42.50 host=web01
latency.p99.9 58.98 host=web02
latency.p99.9 41.28 host=web03
latency.p99.9 41.94 host=web04
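The arithmetic behind the 46.175 figure can be checked with a short script using the four values listed above. The variable names are illustrative only.

```python
p999_by_host = {"web01": 42.50, "web02": 58.98, "web03": 41.28, "web04": 41.94}

# Averaging the per-host percentiles smooths over the outlier.
average = sum(p999_by_host.values()) / len(p999_by_host)

# Taking the max at least points at the host that is misbehaving.
worst_host = max(p999_by_host, key=p999_by_host.get)

print(f"average of p99.9 values: {average:.3f}")
print(f"max p99.9: {p999_by_host[worst_host]} on {worst_host}")
```

Neither number is the true p99.9 of the combined traffic; that requires the underlying distributions, which is what motivates histograms.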
Histograms
● Distribution of frequency of measurements over a time period
● Simplest form: latency measurement buckets storing counts falling within those buckets. E.g.
latency.histogram 0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0 host=web01
latency.histogram 0,15.0=1:15.0,30.0=0:30.0,45.0=2:45.0,60.0=4 host=web02
latency.histogram 0,15.0=2:15.0,30.0=0:30.0,45.0=4:45.0,60.0=0 host=web03
latency.histogram 0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0 host=web04
Histograms
Histogram                                                                   p99   p85   p50
latency.histogram 0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0 host=web01   37.5  37.5  37.5
latency.histogram 0,15.0=1:15.0,30.0=0:30.0,45.0=2:45.0,60.0=4 host=web02   52.5  52.5  52.5
latency.histogram 0,15.0=2:15.0,30.0=0:30.0,45.0=4:45.0,60.0=0 host=web03   37.5  37.5  37.5
latency.histogram 0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0 host=web04   37.5  37.5  37.5
Averaged Percentiles:                                                       41.25 41.25 41.25
Summed Histograms:
latency.histogram 0,15.0=3:15.0,30.0=2:30.0,45.0=14:45.0,60.0=4             52.5  52.5  37.5
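The numbers in this table can be reproduced with a small sketch. It assumes a percentile is reported as the midpoint of the first bucket whose cumulative count reaches the requested rank, which matches the values shown; the function names are hypothetical, and the last web03 bucket is written as `45.0,60.0` to line up with the other hosts.

```python
from typing import Dict, Iterable, Tuple

Bucket = Tuple[float, float]          # (lower bound, upper bound)
Histogram = Dict[Bucket, int]         # bucket -> count


def parse_histogram(text: str) -> Histogram:
    """Parse '0,15.0=0:15.0,30.0=1:...' into a bucket -> count mapping."""
    hist: Histogram = {}
    for part in text.split(":"):
        bounds, count = part.split("=")
        lower, upper = (float(x) for x in bounds.split(","))
        hist[(lower, upper)] = int(count)
    return hist


def merge(histograms: Iterable[Histogram]) -> Histogram:
    """Sum the counts bucket by bucket."""
    merged: Histogram = {}
    for hist in histograms:
        for bucket, count in hist.items():
            merged[bucket] = merged.get(bucket, 0) + count
    return merged


def percentile(hist: Histogram, p: float) -> float:
    """Midpoint of the first non-empty bucket whose cumulative count reaches p percent."""
    target = p / 100.0 * sum(hist.values())
    seen = 0
    for (lower, upper), count in sorted(hist.items()):
        seen += count
        if count and seen >= target:
            return (lower + upper) / 2
    raise ValueError("empty histogram")


hosts = {
    "web01": "0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0",
    "web02": "0,15.0=1:15.0,30.0=0:30.0,45.0=2:45.0,60.0=4",
    "web03": "0,15.0=2:15.0,30.0=0:30.0,45.0=4:45.0,60.0=0",
    "web04": "0,15.0=0:15.0,30.0=1:30.0,45.0=4:45.0,60.0=0",
}
parsed = {host: parse_histogram(text) for host, text in hosts.items()}
summed = merge(parsed.values())

for p in (99, 85, 50):
    per_host = [percentile(h, p) for h in parsed.values()]
    averaged = sum(per_host) / len(per_host)
    print(f"p{p}: averaged={averaged} summed={percentile(summed, p)}")
```

Summing the bucket counts first and computing the percentile afterwards keeps web02's slow requests visible in p99 and p85, where averaging the per-host percentiles reports 41.25 for all three.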
Histograms
● Pros:
○ Fixed size (877 bytes for 97 buckets per data point)
○ Richer analysis (probability distribution, etc)
○ Mergeable via group by and downsampling
○ Fixed rank error, variable value error
● Cons:
○ Much more network/storage space required
○ Loss of accuracy (somewhere within the bucket) but precise
○ Common metrics libraries lack support
Pluggable Implementations
Yahoo’s Data Sketches
● Collection of approximation algorithms with mergeability and configurable accuracy vs. size (~26k for 2M measurements)
● Deterministic rank error
● Tapering log size with N measurements per sketch
● Good for median percentiles
● https://datasketches.github.io/
Pluggable Implementations
T-Digest
● Offshoot of Q-Digest using K-means clustering for quantile approximations
● Small error at top and bottom of the quantile range
● Mergeable
● Able to store floating point as well as integers
● https://github.com/tdunning/t-digest
The Problem of Appends
● 2.2 introduced appends to move away from TSD compactions.
● 1 second resolution = 3600 columns per row => compact into 1.
● But with appends, HBase:
○ Reads the column (from memstore or disk)
○ Appends the data and writes back to memstore (and possibly block cache)
○ Sends the full data back to the client
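The read-append-write cycle above can be put into numbers with a rough cost model. This is not HBase code: it only counts bytes, and the 4 bytes per data point is a hypothetical figure picked for the example.

```python
def append_cost(points_per_row: int, bytes_per_point: int) -> dict:
    """Model bytes read and written when each point is appended to one growing column."""
    column_size = 0
    bytes_read = 0
    bytes_written = 0
    for _ in range(points_per_row):
        bytes_read += column_size          # the existing column is read first
        column_size += bytes_per_point     # the new data is appended
        bytes_written += column_size       # the whole column is written back
    return {"read": bytes_read, "written": bytes_written, "final_size": column_size}


def put_cost(points_per_row: int, bytes_per_point: int) -> dict:
    """Model plain puts: one small column per point, nothing read back."""
    total = points_per_row * bytes_per_point
    return {"read": 0, "written": total, "final_size": total}


# 1 second resolution for one hour, 4 bytes per point (assumed).
print("appends:", append_cost(3600, 4))
print("puts:   ", put_cost(3600, 4))
```

In this model the bytes handled by appends grow with the square of the number of points per row, while plain puts grow linearly, which is consistent with the extra load on the region servers described on the slide.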
The Problem of Appends
● Negatives:
○ Possible disk thrashing if columns have been compacted out of the memstore
○ Higher CPU utilization on the region servers
○ Longer wait time on the client side
● Future Solution:
○ Yahoo’s HBase developers (Francis, Thiruvel) are working on an optimization using coprocessors.
○ Trials underway, details in August
The Problem of Compactions
● HBase compaction merges multiple store files into one, saving space.
● But if we assume the data is time series, with older data immutable and updates only to new data…
● ...we can avoid re-compacting old files that won’t change and skip them at scan time.
● HBASE-15181 from Yahoo and Flurry supports organizing store files by date and time.
● PR #990 from Karan at Salesforce allows TSDB to write HBase timestamps
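The benefit of organizing store files by time can be sketched as follows. This is a simplified illustration, not the HBase date tiered compaction policy: `StoreFile`, `files_for_scan`, and `files_to_compact` are hypothetical names, and a single fixed window stands in for the real tiering rules.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class StoreFile:
    name: str
    min_ts: int   # earliest cell timestamp in the file
    max_ts: int   # latest cell timestamp in the file


def files_for_scan(files: List[StoreFile], start: int, end: int) -> List[StoreFile]:
    """Return only the files whose time range overlaps [start, end]."""
    return [f for f in files if f.max_ts >= start and f.min_ts <= end]


def files_to_compact(files: List[StoreFile], now: int, window: int) -> List[StoreFile]:
    """Only files that reach into the most recent window are compaction candidates."""
    return [f for f in files if f.max_ts > now - window]


files = [
    StoreFile("day-1", 0, 86_399),
    StoreFile("day-2", 86_400, 172_799),
    StoreFile("day-3", 172_800, 259_199),
]
print([f.name for f in files_for_scan(files, 90_000, 100_000)])
print([f.name for f in files_to_compact(files, now=259_199, window=86_400)])
```

Both checks depend on the cell timestamps matching the data point times, which is why it matters that the TSD can write HBase timestamps.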
AsyncHBase 1.8
● AsyncHBase is a fully asynchronous, multi-threaded HBase client
● Supports HBase 0.90 to 1.x
● Faster and less resource intensive than the native HBase client
● Support for scanner filters, META prefetch, “fail-fast” RPCs
AsyncHBase 1.8
● Batched GetRequests thanks to Tian-Ying at Pinterest and Bizu at Yahoo
● Reverse scanning support thanks to Jiayun at Harvard
● HBase 1.3.x+ support thanks to Karan at Salesforce
● MultipleColumnPrefixFilter
● Skip WAL with increments
● AtomicIncrements with multiple columns per request
OpenTSDB on Bigtable
● Bigtable
○ Hosted Google Service
○ Client uses HTTP/2 and gRPC for communication
● OpenTSDB heads home
○ Based on a time series store on Bigtable at Google
○ Identical schema as HBase
○ Same filter support (fuzzy filters are coming)
OpenTSDB 3.0
● Problem: Queries are slow and the order of operations is immutable
● Solutions: (This part is ready for testing!)
○ New composable query layer allowing operations in any order
○ Support for querying multiple sources and merging the results (e.g. use Facebook’s Beringei as a write-cache and Redis as a query cache)
○ Support for multi-cluster queries for active-active, high-availability setups
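To show why the order of operations matters, here is a small sketch of a composable pipeline. It is not the OpenTSDB 3.0 API; `downsample_avg`, `group_by_sum`, and `pipeline` are hypothetical names, and the data is made up.

```python
from functools import reduce
from typing import Callable, Dict, List, Tuple

Series = List[Tuple[int, float]]              # (timestamp, value) pairs
SeriesSet = Dict[str, Series]                 # series id -> data
Operation = Callable[[SeriesSet], SeriesSet]


def downsample_avg(interval: int) -> Operation:
    """Average each series into fixed-width time buckets."""
    def op(data: SeriesSet) -> SeriesSet:
        out: SeriesSet = {}
        for sid, series in data.items():
            buckets: Dict[int, List[float]] = {}
            for ts, value in series:
                buckets.setdefault(ts - ts % interval, []).append(value)
            out[sid] = [(ts, sum(v) / len(v)) for ts, v in sorted(buckets.items())]
        return out
    return op


def group_by_sum(name: str) -> Operation:
    """Sum all series into one, timestamp by timestamp."""
    def op(data: SeriesSet) -> SeriesSet:
        totals: Dict[int, float] = {}
        for series in data.values():
            for ts, value in series:
                totals[ts] = totals.get(ts, 0.0) + value
        return {name: sorted(totals.items())}
    return op


def pipeline(*ops: Operation) -> Operation:
    """Compose operations left to right."""
    return lambda data: reduce(lambda acc, op: op(acc), ops, data)


data = {
    "web01": [(0, 1.0), (30, 3.0), (60, 5.0)],
    "web02": [(0, 2.0), (60, 4.0)],
}
downsample_first = pipeline(downsample_avg(60), group_by_sum("all"))(data)
group_first = pipeline(group_by_sum("all"), downsample_avg(60))(data)
print(downsample_first)
print(group_first)
```

With this data the first bucket comes out as 4.0 when downsampling first and 3.0 when grouping first, so a fixed order cannot serve every query.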
OpenTSDB 3.0
● Problem: Storing other types of data or using other backends is a pain.
● Solutions: (In progress)
○ Pluggable storage interface allowing for various schemas and implementations (e.g. native HTable client, AsyncHBase, native Bigtable client, etc)
○ Abstracted data types for pluggable implementations of time series (e.g. raw binary, histograms, SCADA data)
OpenTSDB 3.0
● Problem: What about anomaly detection, forecasting, etc?
● Solutions: (In progress)
○ Integration with Yahoo’s EGADS time series functions library
○ Period-over-period analysis with model caching
○ Clustering algorithms for detecting outliers
○ https://github.com/yahoo/egads
OpenTSDB 3.0
● New Java APIs
● Servlet for standard deployment using your favorite server
● Tracing with Zipkin and OpenTracing
● New debugging UI
● Improved Docker support
Alternative TSDBs
DalmatinerDB
https://misfra.me/2016/04/09/tsdb-list/
More Info and Credits
● Thanks to the Monitoring and HBase teams at Yahoo, Pythian for Bigtable support and our OSS contributors!
● Contribute at github.com/OpenTSDB/opentsdb
● Website: opentsdb.net
● Mailing List: groups.google.com/group/opentsdb
Images
● https://commons.wikimedia.org/wiki/File:Programmer_writing_code_with_Unit_Tests.jpg
● http://www.doncio.navy.mil/CHIPS/ArticleDetails.aspx?ID=8098
● https://i.imgflip.com/t96s8.jpg
● https://commons.wikimedia.org/wiki/File:Twemoji_1f626.svg
● https://xkcd.com/1425/
● https://commons.wikimedia.org/wiki/Emoji#/media/File:Twemoji_1f623.svg
● https://c1.staticflickr.com/1/508/32307332875_40e73bf750_b.jpg
● http://code.flickr.net/2008/10/27/counting-timing/
● http://3.bp.blogspot.com/-tTXEI5IiQh4/VQqaJz4LtSI/AAAAAAAAEL8/n5AwTVNI-Us/s1600/Introduction%2Bto%2BSQL.png

OpenTSDB: HBaseCon2017
