2015 GHC Presentation - High Availability and High Frequency Big Data Analytics

High Availability and High Frequency Big Data Analytics
Esther Kundin
Bloomberg LP
10/15/2015
#GHC15

Outline
 The Problem Space
 High Availability
 High Frequency
 Takeaways
 Questions

The Problem Space
 Total data set: 2 TB – roughly 2 x 10^13 data points
− “medium data”
 Average write: 4 billion data points a day
 Average read: 140 trillion data points a day
 Read/Write latency: 50 ms
 Read throughput: 3 trillion points in the peak minute – 2000 bulk requests
 Allowable downtime < read latency
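
To put the figures above on a per-second footing, here is a rough back-of-the-envelope sketch. It assumes the 2000 bulk requests are the ones serving the 3 trillion points in the peak minute, which the slide implies but does not state outright; the variable names are illustrative only.

```python
# Figures quoted on the slide.
WRITES_PER_DAY = 4e9          # data points written per day
READS_PER_DAY = 140e12        # data points read per day
PEAK_MINUTE_READS = 3e12      # data points read in the peak minute
PEAK_MINUTE_REQUESTS = 2000   # bulk requests (assumed to be in that same minute)

SECONDS_PER_DAY = 24 * 60 * 60


def per_second(points_per_day: float) -> float:
    """Convert a daily volume into an average per-second rate."""
    return points_per_day / SECONDS_PER_DAY


avg_write_rate = per_second(WRITES_PER_DAY)
avg_read_rate = per_second(READS_PER_DAY)
peak_read_rate = PEAK_MINUTE_READS / 60
points_per_bulk_request = PEAK_MINUTE_READS / PEAK_MINUTE_REQUESTS
peak_to_average = peak_read_rate / avg_read_rate

print(f"average write rate: {avg_write_rate:,.0f} points/s")
print(f"average read rate:  {avg_read_rate:,.0f} points/s")
print(f"peak read rate:     {peak_read_rate:,.0f} points/s")
print(f"points per bulk request at peak: {points_per_bulk_request:,.0f}")
print(f"peak-to-average read ratio: {peak_to_average:.1f}")
```

Under these assumptions the workload is read-dominated by several orders of magnitude, and the peak minute runs well above the daily average.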

High Availability – Pain Points and Solutions

High Availability – Major Points of Failure
[Diagram: Client, Meta Region Server, three RegionServers, HDFS]

High Availability – Solution
HBASE-10070 (timeline-consistent region replicas)
[Diagram: Client, HDFS, RegionServers 1–3 and the Meta Region Server, shown alongside Secondary RegionServers 1–3 and a Secondary Meta Region Server]
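
The idea behind reading from a secondary can be sketched as a read with fallback: ask the primary first and, if it has not answered within a short deadline, also ask a secondary and take whichever answers first, flagging a secondary answer as possibly stale. The Python below is a simulation only and does not use HBase; `read_with_fallback`, `slow_primary` and `fast_secondary` are hypothetical names. In the HBase Java client the corresponding behaviour is requested with timeline-consistent reads (`Consistency.TIMELINE` on a `Get`, with `Result.isStale()` reporting a replica answer).

```python
import concurrent.futures as cf
import time


def read_with_fallback(primary, secondary, timeout_s):
    """Return (value, is_stale).

    Try the primary first. If it has not answered within timeout_s,
    also ask the secondary and return whichever finishes first.
    is_stale is True only when the secondary's answer is used.
    Errors raised by either callable are simply propagated.
    """
    pool = cf.ThreadPoolExecutor(max_workers=2)
    try:
        first = pool.submit(primary)
        try:
            return first.result(timeout=timeout_s), False
        except cf.TimeoutError:
            second = pool.submit(secondary)
            done, _ = cf.wait([first, second], return_when=cf.FIRST_COMPLETED)
            # Prefer the primary if both happen to be finished.
            winner = first if first in done else second
            return winner.result(), winner is second
    finally:
        # Do not block on the slow call; let it finish in the background.
        pool.shutdown(wait=False)


def slow_primary():
    time.sleep(0.3)  # stands in for a RegionServer that is pausing
    return "fresh"


def fast_secondary():
    return "replica"


value, is_stale = read_with_fallback(slow_primary, fast_secondary, timeout_s=0.05)
print(value, is_stale)
```

The trade-off is the one the slide's diagram suggests: the secondary bounds read latency, at the cost of possibly returning slightly older data.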

High Availability Across Data Centers
 3 Options
− HBASE-12259 – HydraBase integration – HBase + Raft – In Progress
− Cloudera BDR in Cloudera Enterprise 5 – Not Open Source
− Roll Your Own!

Replication Across Data Centers
[Diagram: two clusters, HBase 1 (Writer1, Reader1) and HBase 2 (Writer2, Reader2), with Replication between them and a Global ZK]

High Frequency – Pain Points and Solutions

HA to remove fat tails
[Chart: Avg Latency per-Get Distribution; x-axis: Percentile (50, 60, 80, 90, 95, 99); y-axis: Latency in ms (0 to 12)]
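
A latency-by-percentile view like the one in this chart can be computed from raw per-Get timings. The sketch below uses the nearest-rank definition of a percentile and a made-up workload in which a small share of requests hit a long pause; the 2% share and the 50 ms pause are assumptions for illustration, not measurements from the talk.

```python
import math
import random


def percentile(sorted_values, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are at or below it."""
    if not sorted_values:
        raise ValueError("no samples")
    rank = max(1, math.ceil(p * len(sorted_values) / 100))
    return sorted_values[rank - 1]


random.seed(15)
# Hypothetical workload: most Gets take 1-3 ms, about 2% also hit a 50 ms pause.
latencies_ms = sorted(
    random.uniform(1, 3) + (50 if random.random() < 0.02 else 0)
    for _ in range(10_000)
)

for p in (50, 60, 80, 90, 95, 99):
    print(f"p{p}: {percentile(latencies_ms, p):.1f} ms")
```

The median barely moves when a few requests stall, which is why the tail percentiles are the ones to watch.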

High Frequency – Pain Points
 Speed bounded by slowest responding region server
 Garbage Collection causes spikes in latency
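
Why the slowest region server dominates can be shown with a small calculation. Assuming each server is independently "slow" (for example, mid-GC) with probability p at any moment, a request that fans out to n servers waits on at least one slow server with probability 1 - (1 - p)^n. The independence assumption and the example numbers are illustrative, not taken from the talk.

```python
def p_request_sees_slow_server(p_slow: float, fan_out: int) -> float:
    """Probability that at least one of fan_out servers is slow,
    assuming each is slow independently with probability p_slow."""
    return 1.0 - (1.0 - p_slow) ** fan_out


# Hypothetical: each server is paused 1% of the time.
for fan_out in (1, 10, 100):
    share = p_request_sees_slow_server(0.01, fan_out)
    print(f"fan-out {fan_out:>3}: {share:.1%} of requests hit a slow server")
```

A pause that is rare for one server becomes common for a bulk request that touches many of them.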

The Art of Fine Tuning
 Use Data to set your heuristics
− Identify repeatable base-line tests
− Identify performance parameters
− Tweak one setting at a time
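
The procedure above can be sketched as a small harness: measure a repeatable baseline, then re-run the same test with exactly one setting changed per trial. Everything here is hypothetical, including the setting names and the `fake_benchmark` function, which stands in for a real load test and returns a made-up latency.

```python
from statistics import median


def measure(settings: dict, benchmark, repeats: int) -> float:
    """Run the same benchmark several times and keep the median result."""
    return median(benchmark(settings) for _ in range(repeats))


def one_at_a_time(baseline: dict, candidates: dict, benchmark, repeats: int = 5) -> dict:
    """Compare each candidate value against the baseline, changing
    only one setting per trial and leaving the baseline untouched."""
    results = {"baseline": measure(baseline, benchmark, repeats)}
    for name, values in candidates.items():
        for value in values:
            trial = {**baseline, name: value}  # copy with one setting changed
            results[f"{name}={value}"] = measure(trial, benchmark, repeats)
    return results


def fake_benchmark(settings: dict) -> float:
    """Stand-in for a real load test; returns an invented latency in ms."""
    return 10.0 - 0.1 * settings["heap_gb"] + 5.0 * abs(settings["cache_fraction"] - 0.4)


baseline = {"heap_gb": 4, "cache_fraction": 0.2}
candidates = {"heap_gb": [16, 28], "cache_fraction": [0.4]}
for label, latency in one_at_a_time(baseline, candidates, fake_benchmark).items():
    print(f"{label}: {latency:.2f} ms")
```

Changing one setting per trial keeps each difference from the baseline attributable to a single cause.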

Tuning Your DB – Garbage Collection
 What Did Not Work
− Stop The World
− Small Memory Footprint – 4GB
− Synchronized GC via coprocessors
 What worked for us:
− CMS (Concurrent Mark Sweep) collector – shorter pauses
− Very large memory footprint – 28GB
− Read from backup RS (RegionServer) when GC in progress

Takeaways
 High Availability can solve most availability and latency concerns
 Multiple Data Center Support Needed
 Tune those settings!

Questions?

Resources:
Tuning Your DB – What to Tweak
 Key Design
 Column Family Design
 hbase-site.xml - Lots of configuration to try!
 Bloom Filters
 Short-Circuit Reads
 Block Cache
 Scheduling Major Compactions Judiciously
