Valta: A Resource Management
Layer over Apache HBase
Lars George | Director EMEA Services
Andrew Wang | Software Engineer
June 13, 2013
Background on HBase
• Write-heavy processing pipelines
• Web crawling, personalization, time-series
• Storing a lot of data (many TBs)
• Random reads/writes
• Tight MapReduce and Hadoop integration
Workloads
• Very much a shared system
• One system, multiple workloads
• Frontend doing random reads/writes
• Analytical MR doing sequential scans
• Bulk import/export with MR
• Hard to isolate multitenant workloads
Example: Rolling RS failures
• Happened in production
• Bad bulk import wiped out entire cluster
• MR writes kill the RS
• Region gets reassigned
• Repeat until cluster is dead
• Applies to any high-load traffic
Current state of the art
• Run separate clusters, replicate between
  • $$$, poor utilization, more complex
• Namespace-based hardware partitioning
  • Same issues as above
• Delay big tasks until periods of low load
  • Ad-hoc, weak guarantees
Other Problems
• Long requests impact frontend latency
• I/O latency (HDFS, OS, disk)
• Unpredictable ops (compaction, cron, …)
• Some straightforward to fix, some not
Outline
• Project Valta (HBase)
• Resource limits
• Blueprint for further issues
• Request scheduling
• Auto-tuning scheduling for SLOs
• Multiple read replicas
Project Valta
• Need basic resource limits in HBase
• Single shared system
• Ill-behaved HBase clients are unrestricted
• Take resources from other clients
• Worst case: rolling RS failures
• Want to limit damage from bad clients
Resource Limits
• Collect RPC metrics
• Payload size and throughput
• Impose per-client throughput limits
• e.g. MR import limited to 100 1MB puts/s
• Limits are enforced per-regionserver
• Soft state
• Think of it as a firewall
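One common way to enforce a per-client throughput limit like the one on this slide is a token bucket. The sketch below is only an illustration and is not taken from the Valta code: the class names, the choice to count the budget in bytes per second, and the reading of "100 1MB puts/s" as a 100 MB/s budget are all assumptions. State lives in memory on one region server, which matches the "soft state" bullet.

```python
import time

MB = 1024 * 1024


class TokenBucket:
    """Refills at `rate` tokens per second and holds at most `burst` tokens."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.clock = clock
        self.tokens = self.burst
        self.last = clock()

    def try_acquire(self, cost):
        # Credit the tokens earned since the last call, capped at the burst size.
        now = self.clock()
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if cost <= self.tokens:
            self.tokens -= cost
            return True
        return False


class ClientLimiter:
    """Hypothetical per-region-server limiter; limits are bytes/s per client name."""

    def __init__(self, bytes_per_sec_by_client, clock=time.monotonic):
        self.buckets = {
            client: TokenBucket(rate, rate, clock)  # burst = one second of budget
            for client, rate in bytes_per_sec_by_client.items()
        }

    def admit(self, client, payload_bytes):
        # Clients without a configured limit pass through, like a firewall
        # with a default-allow rule.
        bucket = self.buckets.get(client)
        return True if bucket is None else bucket.try_acquire(payload_bytes)
```

With a limit of `100 * MB` for a client named `"mr-import"`, the first hundred 1 MB puts in a burst are admitted and further puts are rejected until the bucket refills. What the server does with a rejected request (delay it, or return an error so the client backs off) is a separate design choice.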
Implementation
• Client-side table wrapper
• Server-side coprocessor
• Github
• https://github.com/larsgeorge/Valta
• Follow HBASE-8481
• https://issues.apache.org/jira/browse/HBASE-8481
Limitations
• Important first steps, still more to do
• Static limits need baby-sitting
  • Workloads and the set of clients are dynamic
• Doesn’t fix some parts of HBase
  • Compactions
• Doesn’t fix the rest of the stack
  • HDFS, OS, disk
Blueprint for further issues
Blueprint
• Ideas on other QoS issues
• Full-stack request scheduling
• HBase, HDFS, OS, disk
• Auto-tuning to meet high-level SLOs
• Random latency (compaction, cron, …)
• Let’s file some JIRAs
Full-stack request scheduling
• Need scheduling in all layers
• HBase, HDFS, OS, disk
• Run high-priority requests first
• Preemption of long operations
• Some pieces already available
• RPC priority field (HADOOP-9194)
• Client names in MR/HBase/HDFS
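"Run high-priority requests first" can be pictured as a priority queue in front of the request handlers. The following is a minimal sketch under assumptions: the class, the integer priority values, and the client names are invented for illustration, and the priority is a plain integer rather than the RPC field from HADOOP-9194. Lower numbers run first, and requests with equal priority keep their arrival order.

```python
import heapq
import itertools

FRONTEND = 0   # assumed priority for latency-sensitive clients
BULK = 10      # assumed priority for MR import/export


class PriorityScheduler:
    """Hands out queued requests by priority, FIFO within one priority."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # tie-breaker that preserves arrival order

    def submit(self, priority, client, request):
        heapq.heappush(self._heap, (priority, next(self._seq), client, request))

    def next_request(self):
        """Return (client, request) for the most urgent entry, or None if idle."""
        if not self._heap:
            return None
        _, _, client, request = heapq.heappop(self._heap)
        return client, request
```

A strict priority queue like this can starve bulk clients under sustained frontend load, so a real scheduler would probably combine it with limits or weighted shares.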
HBase request scheduling
• Add more HBase scheduling hooks
• RPC handling
• Between HDFS I/Os
• During long coprocessors or scans
• Expose hooks to coprocessors
• Could be used by Valta
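A scheduling hook "during long scans" could be as simple as a point between batches where waiting high-priority work is served. The simulation below is a sketch only, not HBase code: `run_with_hooks` and its `arrivals` argument (gets that show up while a given batch is running) are hypothetical names used to make the interleaving visible.

```python
from collections import deque


def run_with_hooks(rows, batch_size, arrivals):
    """Simulate a long scan that yields to queued gets between batches.

    `arrivals` maps a batch index to the gets that arrive while that batch
    runs. Returns the order in which work was executed.
    """
    pending = deque()
    log = []
    for index, start in enumerate(range(0, len(rows), batch_size)):
        # Scheduling hook: serve waiting high-priority gets before the next batch.
        while pending:
            log.append(("get", pending.popleft()))
        log.append(("scan", rows[start:start + batch_size]))
        pending.extend(arrivals.get(index, []))
    # Serve anything that arrived during the final batch.
    while pending:
        log.append(("get", pending.popleft()))
    return log
```

Without the hook, a get that arrives during the first batch would wait for the whole scan. With it, the wait is bounded by one batch, so the batch size trades scan throughput against frontend latency.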
HDFS request scheduling
• Same scheduling hooks as in HBase
• RPC layer, between I/Os
• Bound # of requests per disk
• Reduces queue length and contention
• Preempt queues in OS and disk
• OS block layer (CFQ, ioprio_set)
• Disk controller (SATA NCQ, ???)
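Bounding the number of requests per disk can be sketched with one counting semaphore per disk. This is an illustration under assumptions, not HDFS code: `PerDiskLimiter`, the disk path, and the `simulate` helper are hypothetical, and a sleep stands in for the actual I/O.

```python
import threading
import time
from collections import defaultdict
from contextlib import contextmanager


class PerDiskLimiter:
    """Allows at most `max_in_flight` concurrent requests per disk."""

    def __init__(self, max_in_flight):
        self._max = max_in_flight
        self._lock = threading.Lock()
        self._gates = {}
        self.in_flight = defaultdict(int)
        self.peak = defaultdict(int)  # highest concurrency observed per disk

    def _gate(self, disk):
        with self._lock:
            if disk not in self._gates:
                self._gates[disk] = threading.BoundedSemaphore(self._max)
            return self._gates[disk]

    @contextmanager
    def io(self, disk):
        gate = self._gate(disk)
        gate.acquire()  # blocks while the disk already has max_in_flight requests
        try:
            with self._lock:
                self.in_flight[disk] += 1
                self.peak[disk] = max(self.peak[disk], self.in_flight[disk])
            yield
        finally:
            with self._lock:
                self.in_flight[disk] -= 1
            gate.release()


def simulate(limiter, disk, n_requests, io_seconds=0.01):
    """Issue n_requests concurrent reads against one disk; return peak concurrency."""
    def read_block():
        with limiter.io(disk):
            time.sleep(io_seconds)  # stand-in for the real disk read

    threads = [threading.Thread(target=read_block) for _ in range(n_requests)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return limiter.peak[disk]
```

Requests that cannot get a slot wait in the application, where a scheduler can still reorder them, instead of in OS or device queues where it cannot.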
High-level SLO enforcement
• Research work I did at Berkeley (Cake)
• Specify high-level SLOs directly to HBase
• “100ms 99th percentile latency for gets”
• Added hooks to HBase and HDFS
• System auto-tunes to satisfy SLOs
• Read the paper or hit me up!
• http://www.umbrant.com/papers/socc12-cake.pdf
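To make an SLO such as "100ms 99th percentile latency for gets" concrete, the sketch below computes a nearest-rank percentile over latency samples and applies a naive feedback step. It is not the algorithm from the Cake paper: `tune_share`, the idea of a single "frontend share" knob expressed in percent, and the step, bounds, and slack values are all assumptions chosen for illustration.

```python
def percentile(samples, pct):
    """Nearest-rank percentile for an integer `pct` between 1 and 100."""
    if not samples:
        raise ValueError("no samples")
    ordered = sorted(samples)
    # Integer ceiling of pct * n / 100 avoids floating-point rounding surprises.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def tune_share(share_pct, observed_ms, target_ms, step=5, lo=10, hi=90, slack_pct=80):
    """Return the frontend's new resource share (in percent) after one interval.

    Raise the share when the SLO is missed, lower it when there is ample
    headroom, and leave it unchanged otherwise.
    """
    if observed_ms > target_ms:
        return min(hi, share_pct + step)
    if observed_ms * 100 < target_ms * slack_pct:
        return max(lo, share_pct - step)
    return share_pct
```

For example, 98 gets at 10 ms plus two slow ones at 150 ms and 200 ms give a 99th percentile of 150 ms, which misses a 100 ms target, so a share of 50 would be raised to 55 for the next interval.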
Multiple read replicas
• Also proposed for MTTR, availability
• Many unpredictable sources of latency
• Compactions
• Also: cron, MR spill, shared caches, network, …
• Sidestep the problem!
• Read from 3 RS, return the fastest result
• Unlikely all three will be slow
• Weaker consistency, better latency
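"Read from 3 RS, return the fastest result" can be sketched with a thread pool that issues the same read to every replica and returns the first successful answer. The replicas here are plain Python callables with made-up names and delays, not HBase clients, so this shows the pattern only. As a rough intuition for the "unlikely all three will be slow" bullet: if each replica were independently slow 1% of the time, all three would be slow together about one time in a million (0.01 cubed), though independence is an assumption that shared causes such as network trouble can break.

```python
import concurrent.futures as cf
import time


def read_fastest(replicas, key):
    """Send the read to every replica and return the first successful result."""
    pool = cf.ThreadPoolExecutor(max_workers=len(replicas))
    futures = [pool.submit(replica, key) for replica in replicas]
    try:
        errors = []
        for future in cf.as_completed(futures):
            try:
                return future.result()
            except Exception as exc:  # a failed replica should not fail the read
                errors.append(exc)
        raise RuntimeError(f"all {len(replicas)} replicas failed: {errors}")
    finally:
        # Do not wait for the slower replicas; their results are discarded.
        pool.shutdown(wait=False)


def make_replica(name, delay_seconds):
    """Hypothetical replica that answers after a fixed delay."""
    def read(key):
        time.sleep(delay_seconds)
        return name, f"value-of-{key}"
    return read
```

The cost is roughly three times the read load, and the fastest replica may not hold the newest write, which is the "weaker consistency, better latency" trade-off named on the slide.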
Conclusion
• HBase is a great system!
• Let’s make it multitenant
• Request limits
• Full-stack request scheduling
• High-level SLO enforcement
• Multiple read replicas
Thanks!
lars@cloudera.com
andrew.wang@cloudera.com
