Dealing with an Upside Down Internet With High Performance Time Series Database
© 2014 MapR Technologies 1
© 2014 MapR Technologies 2
Agenda
• The Internet is turning upside down
• A story
• The last (mile) shall be first
• Time series on NoSQL
• Faster time series on NoSQL
• Summary
© 2014 MapR Technologies 3
How the Internet Works
• Big content servers feed data across the backbone to
• Regional caches and servers feed data across neighborhood
transport to
• The “last mile”
• Bits are nearly conserved, $ are concentrated centrally
– But total $ mass at the edge is much higher
© 2014 MapR Technologies 4
How The Internet Works
[Diagram: a central server feeds two caches; each cache feeds gateway → switch → firewall chains that end at consumer devices c1 and c2]
© 2014 MapR Technologies 5
Conservation of Bits Decreases Bandwidth
[The same server → cache → gateway/switch/firewall → client diagram, annotated to show that the roughly conserved volume of bits is spread across many more links toward the edge]
© 2014 MapR Technologies 6
Total Investment Dominated by Last Mile
[The same diagram, annotated to show that total hardware investment is concentrated in the many edge devices rather than in the central servers]
© 2014 MapR Technologies 7
The Rub
• What's the problem?
– Speed (end-to-end latency, backbone bandwidth)
– Feasibility (cost for consumer links)
– Caching
• What do we need?
– Cheap last-mile hardware
– Good caches
© 2014 MapR Technologies 8
First:
An apology for going
off-script
© 2014 MapR Technologies 9
Now, the story
© 2014 MapR Technologies 10
© 2014 MapR Technologies 11
By the 1840s, the New York-San Francisco
sailing time was down to
130-180 days
© 2014 MapR Technologies 12
© 2014 MapR Technologies 13
In 1851, the record was
set at 89 days by the
Flying Cloud
© 2014 MapR Technologies 14
The difference was due
(in part) to big data
and a primitive kind of
time-series database
© 2014 MapR Technologies 15
© 2014 MapR Technologies 16
© 2014 MapR Technologies 17
© 2014 MapR Technologies 18
These charts were free …
If you donated your data
© 2014 MapR Technologies 19
But how does this apply
today?
© 2014 MapR Technologies 20
What has changed?
Where will it lead?
© 2014 MapR Technologies 21
© 2014 MapR Technologies 22
© 2014 MapR Technologies 23
© 2014 MapR Technologies 24
© 2014 MapR Technologies 25
© 2014 MapR Technologies 26
© 2014 MapR Technologies 27
© 2014 MapR Technologies 28
© 2014 MapR Technologies 29
© 2014 MapR Technologies 30
© 2014 MapR Technologies 31
Things
© 2014 MapR Technologies 32
Emitting data
© 2014 MapR Technologies 33
How The Internet Works
[Repeat of the server → cache → gateway/switch/firewall → client diagram]
© 2014 MapR Technologies 34
How the Internet is Going to Work
[Diagram: the topology inverted: edge machines m1-m6 emit data up through controller → switch → gateway chains toward the caches and the central server]
© 2014 MapR Technologies 35
Where Will The $ Go?
[The same inverted diagram, asking where investment will concentrate as data flows from the edge inward]
© 2014 MapR Technologies 36
Sensors
© 2014 MapR Technologies 37
Controllers
© 2014 MapR Technologies 38
The Problems
• Sensors and controllers have little processing or space
– SIM cards = 20 MHz processor, 128 kb of space = 16 kB
– Arduino Mini = 15 kB RAM (more EEPROM)
– BeagleBone / Raspberry Pi = 500 MB RAM
• Sensors and controllers have little power
– Very common to power down 99% of the time
• Sensors and controllers often have very low bandwidth
– Mesh networks with base rates << 1Mb/s
– Power line networking
– Intermittent 3G/4G/LTE connectivity
© 2014 MapR Technologies 39
What Do We Need to Do With a Time Series
• Acquire
– Measurement, transmission, reception
– Mostly not our problem
• Store
– We own this
• Retrieve
– We have to allow this
• Analyze and visualize
– We facilitate this via retrieval
© 2014 MapR Technologies 40
Retrieval Requirements
• Retrieve by time series, time range, tags
– Possibly pull millions of data points at a time
– Possibly do on-the-fly windowed aggregations (see the sketch after this list)
• Search over unstructured data
– Typically requires time-windowed faceting after search
– Also need to dive into results with the first kind of retrieval
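A minimal sketch of that windowed aggregation, assuming points arrive as plain (timestamp, value) pairs from a range scan; the function name and the 60 s window are illustrative, not part of any particular product's API:

```python
from collections import defaultdict

def windowed_mean(points, window_s=60):
    """Aggregate (timestamp, value) pairs into fixed windows on the fly.

    points: iterable of (unix_timestamp, value) tuples, e.g. a scan result.
    Returns {window_start_timestamp: mean_value}.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for ts, value in points:
        bucket = ts - (ts % window_s)   # align to window boundary
        sums[bucket] += value
        counts[bucket] += 1
    return {b: sums[b] / counts[b] for b in sorted(sums)}

# Example: three samples, two of them in the same 60 s window.
print(windowed_mean([(1409497082, 10.0), (1409497099, 20.0), (1409497145, 30.0)]))
```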
© 2014 MapR Technologies 41
Storage choices and trade-offs
• Flat files
– Great for rapid ingest with massive data
– Handles essentially any data type
– Less good for data requiring frequent updates
– Harder to find specific ranges
• Traditional relational db
– Ingests up to 10,000s of rows/sec; prefers well-structured (numerical) data; expensive
• Non-relational db: Tables (such as MapR tables in M7 or HBase)
– Ingests up to 100,000 rows/sec
– Handles wide variety of data
– Good for frequent updates
– Easily scanned in a range
© 2014 MapR Technologies 42
Specific Example
• Consider a server farm
• Lots of system metrics
• Typically 100-300 stats / 30 s
• Loads, RPCs, packets, requests/s
• Common to have 100 – 10,000 machines
© 2014 MapR Technologies 43
The General Outline
• 10 samples / second / machine
x 1,000 machines
= 10,000 samples / second
• This is what OpenTSDB was designed to handle
• Install and go, but don’t test at scale
© 2014 MapR Technologies 44
Specific Example
• Consider oil drilling rigs
• When drilling wells, there are *lots* of moving parts
• Typically a drilling rig makes about 10K samples/s
• Temperatures, pressures, magnetics,
machine vibration levels, salinity, voltage,
currents, many others
• Typical project has 100 rigs
© 2014 MapR Technologies 45
The General Outline
• 10K samples / second / rig
x 100 rigs
= 1M samples / second
© 2014 MapR Technologies 46
The General Outline
• 10K samples / second / rig
x 100 rigs
= 1M samples / second
• But wait, there’s more
– Suppose you want to test your system
– Perhaps with a year of data
– And you want to load that data in << 1 year
• 100x real-time = 100M samples / second
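The arithmetic, spelled out (all values are from the slide; the comments are the only additions):

```python
rigs = 100
samples_per_rig = 10_000                # samples / second / rig
live_rate = rigs * samples_per_rig      # 1,000,000 samples / second

replay_speedup = 100                    # a year of history loads in 365/100 ≈ 3.65 days
test_rate = live_rate * replay_speedup  # 100,000,000 samples / second
print(live_rate, test_rate)
```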
© 2014 MapR Technologies 47
How Should That Work?
[Diagram: Samples → Message queue → Collector → MapR table → Web service → Users]
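A toy sketch of that pipeline, with a Python queue standing in for the message queue and a dict standing in for the MapR table; the batch size and flush interval are made-up values:

```python
import queue
import threading
import time

sample_queue = queue.Queue()   # stands in for the message queue
table = {}                     # stands in for the MapR table

def collector(batch_size=1000, flush_s=1.0):
    """Drain samples from the queue and write them to the table in batches."""
    batch, last_flush = [], time.time()
    while True:
        try:
            batch.append(sample_queue.get(timeout=flush_s))
        except queue.Empty:
            pass
        overdue = time.time() - last_flush >= flush_s
        if batch and (len(batch) >= batch_size or overdue):
            for series, ts, value in batch:
                table.setdefault(series, []).append((ts, value))
            batch, last_flush = [], time.time()

threading.Thread(target=collector, daemon=True).start()
sample_queue.put(("rig42.pressure", int(time.time()), 98.6))
time.sleep(1.5)
print(table)   # the web service layer would serve reads from here
```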
© 2014 MapR Technologies 48
A First Attempt
OpenTSDB is a distributed time series database built on top of
HBase, enabling you …
– to store & index, as well as
– to query & plot
… metrics at scale.
© 2014 MapR Technologies 49
Design Goals
• Distributed storage of metrics
• Fast, easy querying of metrics
• Scale out to thousands of machines and billions of data points
• No SPOF
© 2014 MapR Technologies 50
Key concepts
© 2014 MapR Technologies 51
Key concepts
(00:38, 56) mysql.com_delete schema=userdb
© 2014 MapR Technologies 52
Key concepts
data point: (timestamp, value)
+ metric
+ tag: key=value
→ time series
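The same model as a minimal sketch (the class and function names are illustrative, not OpenTSDB's types):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class DataPoint:
    metric: str          # e.g. "mysql.bytes_received"
    timestamp: int       # UNIX epoch seconds
    value: float
    tags: tuple = ()     # e.g. (("schema", "foo"), ("host", "db1"))

def series_key(p: DataPoint) -> tuple:
    """A time series is the set of points sharing one (metric, tags) identity."""
    return (p.metric, tuple(sorted(p.tags)))

p = DataPoint("mysql.bytes_received", 1409497082, 327810227706.0,
              (("schema", "foo"), ("host", "db1")))
print(series_key(p))
```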
© 2014 MapR Technologies 53
Example TS
...
1409497082 327810227706 mysql.bytes_received schema=foo host=db1
1409497099 6604859181710 mysql.bytes_sent schema=foo host=db1
1409497106 327812421706 mysql.bytes_received schema=foo host=db1
1409497113 6604901075387 mysql.bytes_sent schema=foo host=db
...
UNIX epoch timestamp: $(date +%s)
a metric (often hierarchical)
two tags
© 2014 MapR Technologies 54
Declare metric
$ tsdb mkmetric mysql.bytes_sent mysql.bytes_received
metrics mysql.bytes_sent: [0, 0, 1]
metrics mysql.bytes_received: [0, 0, 2]
… or use --auto-metric
© 2014 MapR Technologies 55
Collect metric
• tcollector: gathers data from local
collectors, pushes it to TSDs, and
provides deduplication
• lots of collectors bundled
– General: iostat, netstat, etc.
– Others: MySQL, HBase, etc.
• … or roll your own
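Rolling your own is simple: a tcollector-style collector is a long-running process that prints one sample per line to stdout in "metric timestamp value tag=value" form, and tcollector forwards the lines to a TSD. A sketch, assuming a Unix-like host; the load-average metric and 15 s interval are arbitrary choices:

```python
#!/usr/bin/env python
import os
import sys
import time

def main():
    while True:
        load1, _, _ = os.getloadavg()   # Unix-like systems only
        now = int(time.time())
        # One sample per line: metric, timestamp, value, tags.
        print("proc.loadavg.1min %d %f host=%s" % (now, load1, os.uname().nodename))
        sys.stdout.flush()              # tcollector reads line by line
        time.sleep(15)

if __name__ == "__main__":
    main()
```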
© 2014 MapR Technologies 56
The Whole Picture
[Architecture diagram: collectors feed independent TSDs, which store and retrieve the data in HBase or MapR-DB]
© 2014 MapR Technologies 57
Wide Table Design: Point-by-Point
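The essence of the wide-table layout (see also note #58 below): the row key combines a series id with the base time of a window, and each column qualifier is the sample's offset within that window, so one row holds 100-1,000 samples stored near each other on disk. A sketch of the idea only; OpenTSDB's real key layout (UID widths, flag bits) differs in detail:

```python
import struct

ROW_WINDOW_S = 3600   # one hour of samples per row

def row_key(series_id: int, timestamp: int) -> bytes:
    """Row key = series id + base time of the hour the sample falls in."""
    base = timestamp - (timestamp % ROW_WINDOW_S)
    return struct.pack(">IQ", series_id, base)   # big-endian keeps keys sorted

def column_qualifier(timestamp: int) -> bytes:
    """Column qualifier = offset of the sample within the row's hour."""
    return struct.pack(">H", timestamp % ROW_WINDOW_S)

ts = 1409497082
print(row_key(7, ts).hex(), column_qualifier(ts).hex())
```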
© 2014 MapR Technologies 58
Wide Table Design: Hybrid Point-by-Point + Blob
Inserting the data as a blob makes the original point-by-point columns redundant
Non-relational, but you can query these tables with Drill
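The blob step in miniature: all of a row's (offset, value) columns are collapsed into one compressed value, so a later read touches one column instead of hundreds. This packing format is invented for illustration; it is not OpenTSDB's on-disk encoding:

```python
import struct
import zlib

def compact_row(points):
    """Collapse a row's point-by-point columns into one compressed blob.

    points: list of (offset_seconds, value) within the row's time window.
    Returns a single bytes blob that replaces the individual columns.
    """
    payload = b"".join(struct.pack(">Hd", off, val) for off, val in sorted(points))
    return zlib.compress(payload)

def expand_blob(blob):
    payload = zlib.decompress(blob)
    return [struct.unpack(">Hd", payload[i:i + 10]) for i in range(0, len(payload), 10)]

blob = compact_row([(2, 327810227706.0), (19, 6604859181710.0)])
print(len(blob), expand_blob(blob))
```

Late samples can be merged by expanding the blob, appending, and compacting again, which is why compressed and uncompressed data can coexist in the same row (see note #59 below).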
© 2014 MapR Technologies 59
Status to This Point
• Each sample requires one insertion, compaction requires
another
• Typical performance on SE cluster
– 1 edge node + 4 cluster nodes
– 20,000 samples per second observed
– Would be faster on a performance cluster, though possibly not by a lot
• Suitable for server monitoring
• Not suitable for large scale history ingestion
• Bulk load helps a little, but not much
• Still 1000x too slow for industrial work
© 2014 MapR Technologies 60
Speeding up OpenTSDB
20,000 data points per second per node in the test cluster
Why can’t it be faster?
© 2014 MapR Technologies 61
Speeding up OpenTSDB: open source MapR extensions
Available on Github: https://github.com/mapr-demos/opentsdb
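The core idea of the extensions (detailed in note #62 below) is direct blob insertion: buffer incoming points in memory, keyed by series and time window, and write each window to the table once, as a blob, after it closes. A sketch under those assumptions; the restart log that protects the in-memory buffer is omitted, and write_row stands in for a single table put of a compact_row-style blob:

```python
import time
from collections import defaultdict

WINDOW_S = 3600
buffers = defaultdict(list)   # (series_id, window_base) -> [(offset, value), ...]

def ingest(series_id, timestamp, value):
    """Catcher side: append to an in-memory buffer instead of the database."""
    base = timestamp - (timestamp % WINDOW_S)
    buffers[(series_id, base)].append((timestamp - base, value))

def flush_closed_windows(write_row, now=None):
    """Once a window has closed, hand its whole buffer off as ONE table write.

    write_row(series_id, base, points) would pack the points into a single
    compressed blob (see compact_row above) and do one put, so thousands of
    samples cost one database operation instead of thousands."""
    now = time.time() if now is None else now
    for key in [k for k in buffers if k[1] + WINDOW_S <= now]:
        series_id, base = key
        write_row(series_id, base, buffers.pop(key))

ingest(7, 1409497082, 42.0)
ingest(7, 1409497099, 43.5)
flush_closed_windows(lambda s, b, pts: print(s, b, len(pts), "points in one put"),
                     now=1409497082 + 2 * WINDOW_S)
```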
© 2014 MapR Technologies 62
Status to This Point
• 3600 samples require one insertion
• Typical results on SE cluster
– 1 edge node + 4 cluster nodes
– 14 million samples per second observed
– ~700x faster ingestion
• Typical results on performance cluster
– 2-4 edge nodes + 4-9 cluster nodes
– 110 million samples/s (4 nodes) to >200 million samples/s (8 nodes)
• Suitable for large scale history ingestion
• 30 million data points retrieved in 20s
• Ready for industrial work
© 2014 MapR Technologies 63
Key Results
• Ingestion is network limited
– Edge nodes are the critical resource
– Number of edge nodes defines a limit to scaling
• With enough edge nodes scaling is near perfect
• Performance of raw OpenTSDB is limited by its stateless daemon
• Modified OpenTSDB can run 1000x faster
© 2014 MapR Technologies 64
Overall Ingestion Rate
[Chart: total ingestion rate, in millions of points/second (0-250), vs. cluster nodes (4, 5, 8, 9), for one ingestor and for two ingestors]
© 2014 MapR Technologies 65
Normalized Ingestion Rate
[Chart: ingestion rate per node, in millions of points/second (0-40), vs. cluster nodes (4, 5, 8, 9), for one ingestor and for two ingestors]
© 2014 MapR Technologies 66
Why MapR?
• MapR tables are inherently faster, safer
– Sustained > 1GB/s ingest rate in tests
• Mirror to M5 or M7 cluster to isolate analytics load
• Transaction logs involve frequent appends and many files
© 2014 MapR Technologies 67
When is this All Wrong?
• In some cases, retrieval by series-id + time range not sufficient
• May need very flexible retrieval of events based on text-like
criteria
• Search may be better than a classic time-series database
• Can scale Lucene-based search to > 1 million events / second
© 2014 MapR Technologies 68
When is it Even More Right?
• In many industrial settings, data rates from individual sensors are
relatively high
– Latency to view is still measured in seconds, not sample points
• This allows batching at source
• Common requirement for highly variable sample rates
– 1 sample/s baseline, switching to 10 k samples/s
– Small batches during slow times are just fine since the number of sensors is constant
– Requires variable window sizes (see the sketch below)
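One way to get those variable windows is to flush each sensor's batch on whichever limit trips first, count or age; a sketch with made-up limits:

```python
import time

class VariableBatcher:
    """Flush a sensor's batch on whichever comes first: size or age.

    At 1 sample/s the age limit produces small, timely batches; at
    10 k samples/s the size limit keeps batches bounded. Latency to
    view stays measured in seconds either way.
    """
    def __init__(self, max_points=5000, max_age_s=2.0):
        self.max_points, self.max_age_s = max_points, max_age_s
        self.batch, self.opened = [], time.time()

    def add(self, timestamp, value):
        """Add one sample; return a batch to ship if a limit tripped, else None."""
        self.batch.append((timestamp, value))
        if len(self.batch) >= self.max_points or time.time() - self.opened >= self.max_age_s:
            return self.flush()
        return None

    def flush(self):
        out, self.batch, self.opened = self.batch, [], time.time()
        return out
```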
© 2014 MapR Technologies 69
Summary
• The internet is turning upside down
• This will make time series ubiquitous
• Current open source systems are much too slow
• We can fix that with modern NoSQL systems
– (I wear a red hat for a reason)
© 2014 MapR Technologies 70
Questions
© 2014 MapR Technologies 71
Thank You
@mapr maprtech
tdunning@mapr.com
tdunning@apache.org
Ted Dunning, Chief Application Architect
MapR Technologies
maprtech
mapr-technologies
Editor's Notes
  • #55: In the HBase shell: scan 'tsdb-uid', {STARTROW => "\0\0\1"}
  • #56: Try out the MySQL collector or, for advanced users, write your own in Python
  • #57: Ted’s original talk notes: OpenTSDB consists of a Time Series Daemon (TSD) as well as a set of command-line utilities. Interaction with OpenTSDB is primarily achieved by running one or more of the TSDs. Each TSD is independent. There is no master and no shared state, so you can run as many TSDs as required to handle any load you throw at them. Each TSD uses the open source database HBase to store and retrieve time-series data. The HBase schema is highly optimized for fast aggregations of similar time series to minimize storage space. Users of the TSD never need to access HBase directly. You can communicate with the TSD via a simple telnet-style protocol, an HTTP API or a simple built-in GUI. All communications happen on the same port (the TSD figures out the protocol of the client by looking at the first few bytes it receives).
  • #58: Key ideas: Unique row key based on an id for each time series (looked up from a separate look-up table); an important part of the design’s efficiency is to have each column be a time offset from the start time shown in the row key. Note that data is stored point-by-point in this wide table design. Ted’s notes from his original slide: One technique for increasing the rate at which data can be retrieved from a time series database is to store many values in each row. Doing this allows data points to be retrieved at a higher speed. Because both HBase and MapR-DB store data ordered by the primary key, this design will cause rows containing data from a single time series to wind up near one another on disk. Retrieving data from a particular time series for a time range will involve largely sequential disk operations and therefore will be much faster than would be the case if the rows were widely scattered. Typically, the time window is adjusted so that 100–1,000 samples are in each row.
  • #59: Ted’s notes from original slide: The table design is improved by collapsing all of the data for a row into a single data structure known as a blob. This blob can be highly compressed so that less data needs to be read from disk. Also, having a single column per row decreases the per-column overhead incurred by the on-disk format that HBase uses, which further increases performance. Data can be progressively converted to the compressed format as soon as it is known that little or no new data is likely to arrive for that time series and time window. Commonly, once the time window ends, new data will only arrive for a few more seconds, and the compression of the data can begin. Since compressed and uncompressed data can coexist in the same row, if a few samples arrive after the row is compressed, the row can simply be compressed again to merge the blob and the late-arriving samples.
  • #61: Richard: This is based on a figure from Chapter 3 of our book. The point here is to show that with standard OpenTSDB, data is loaded into the wide table point-by-point, then pulled out and compressed to a blob, then reloaded to form the hybrid table. This is a fairly efficient arrangement. The next slide will show how this is sped up with the MapR open source extensions. Here are Ted’s original notes: Since data is inserted in the uncompressed format, the arrival of each data point requires a row update operation to insert the value into the database. The data is then read again by the blob maker, so reads are approximately equal to writes. Once data is compressed to blobs, it is again written to the database. This row update can limit the insertion rate for data to as little as 20,000 data points per second per node in the cluster.
  • #62: Richard: Also based on a figure from Chapter 3 of the book: This slide shows the increased performance using the open source code MapR published on GitHub. I’ve added the GitHub link. The key difference is that the blob production occurs upstream, before the data is ever loaded into the table. The restart logs are useful so that if there were ever a glitch with the process of compressing data to blobs and insertion, you would not lose the original data. Note that there is still the delay while blobs are made… see the explanation in the book, chapters 3 and 4. Richard: Please preserve the rest of the material on fast ingestion with MapR extensions (direct blob loading) for Ted’s talk on Saturday. Use this slide as a preview and mention that Ted will be talking about this on Friday. Ted’s original notes: the direct blob insertion data flow allows the insertion rate to be increased by as much as roughly 1,000-fold. How does the direct blob approach get this bump in performance? The essential difference is that the blob maker has been moved into the data flow between the catcher and the NoSQL time series database. This way, the blob maker can use incoming data from a memory cache rather than extracting its input from wide table rows already stored in the storage tier. The full data stream is only written to the memory cache, which is fast, rather than to the database. Data is not written to the storage tier until it’s compressed into blobs, so writing can be much faster. The number of database operations is decreased by the average number of data points in each of the compressed data blobs. This decrease can easily be a factor in the thousands.