Keynote: Apache HBase at Yahoo! Scale
PUSHING THE LIMITS
Francis Liu
HBase Yahoo
HBase @ Y!
• Hosted multi-tenant clusters
• 3 Production
• 3 Sandbox
• HBase-only
• Off-stage Use Cases
• Internal 0.98 releases
• Security
[Architecture diagram: a Compute Cluster (Resource Mgr, Namenode, TaskTracker/Node Mgr and DataNode hosts running M/R tasks) next to an HBase Cluster (Namenode, HBase Master, Zookeeper Quorum, RegionServer and DataNode hosts); clients shown are HBase Client, MR Client, Gateway/Launcher, and HTTP Client via Rest Proxy]
Workload Jungle
Multi-tenancy
Multi-tenancy at Scale
• 35 Tenants
• 800 RegionServers
• 300k regions
• RS Peak 115k requests/sec
Divide and Conquer
Group A: RS RS … RS
Group B: RS RS … RS
Group C: RS RS … RS
Group D: RS RS … RS
Group E: RS RS … RS
RegionServer Groups
• Group Membership
• Table
• RegionServer
• Coarse Isolation
• Group customization
• Namespace integration
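The group membership idea above can be sketched as a small model: every table and every RegionServer belongs to one group, and a table's regions are only placed on servers of the same group. This is a minimal illustration, not the HBase implementation; the class `GroupManager`, its method names, and the server, group, and table names are all hypothetical.

```python
from collections import defaultdict


class GroupManager:
    """Toy model of RegionServer groups (all names here are hypothetical)."""

    def __init__(self):
        self.server_group = {}  # server name -> group name
        self.table_group = {}   # table name -> group name

    def move_server(self, server, group):
        self.server_group[server] = group

    def move_table(self, table, group):
        self.table_group[table] = group

    def candidate_servers(self, table):
        """Servers allowed to host the table's regions (coarse isolation)."""
        group = self.table_group.get(table, "default")
        return sorted(s for s, g in self.server_group.items() if g == group)

    def assign(self, table, regions):
        """Round-robin the table's regions over its own group's servers only."""
        servers = self.candidate_servers(table)
        if not servers:
            raise ValueError(f"no servers in the group of table {table!r}")
        plan = defaultdict(list)
        for i, region in enumerate(regions):
            plan[servers[i % len(servers)]].append(region)
        return dict(plan)


gm = GroupManager()
gm.move_server("rs1", "group_a")
gm.move_server("rs2", "group_a")
gm.move_server("rs3", "group_b")
gm.move_table("mail:inbox", "group_a")  # "namespace:table" naming, assumed
plan = gm.assign("mail:inbox", ["r1", "r2", "r3"])
print(plan)
```

In this sketch `rs3` never receives a region of `mail:inbox`, which is the isolation property the slide describes: one tenant's load stays inside its own group of servers.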
Multi-tenancy at Scale
• 800 RegionServers
• 40 namespaces
• 40 Region server groups
• 4 to 100s of servers
• Up to 2000+ regions per server
• ~1 week rolling upgrade
Scaling to 10s of PBs (and Beyond)
• Scale to Millions of Regions (and Beyond)
• Avoid large regions
• Data Locality
• Network utilization
• Datanode load
• Performance
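A quick calculation shows why keeping regions small at this data volume leads to millions of regions. The 10 GB region size below is an assumption chosen for illustration; the slides do not state a target region size.

```python
def regions_needed(total_bytes, region_bytes):
    """Ceiling division: number of regions required to hold total_bytes."""
    return -(-total_bytes // region_bytes)


PB = 10**15
GB = 10**9
REGION_SIZE = 10 * GB  # assumed region size, not taken from the slides

for total_pb in (10, 50):
    n = regions_needed(total_pb * PB, REGION_SIZE)
    print(f"{total_pb} PB at 10 GB per region -> {n:,} regions")
```

Under this assumption, 10 PB already needs one million regions and 50 PB needs five million.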
Filesystem Layout
• Region directories under table directory
• HDFS data structure bottleneck
• Namenode Hard Limit of ~6.7 Million
[Chart: Create file ops for 5M Region Table]
• Hierarchical Table Layout
Performance Comparison: Region directory creation time
Test          1M Regions       5M Regions        10M Regions
Normal Table  20 mins          4 hours 23 mins   DNF
Humongous     15 mins 48 secs  1 hour 27 mins    2 hours 53 mins
ZK Region Assignment
▪ Lock Thrashing
▪ ZK bottlenecks
› List/Mutate Millions of Znodes
› Notification firehose
▪ State is kept in 3 places
› Cached in master
› Zookeeper
› Meta
[Diagram labels: RS, Master, Zookeeper, Meta, Region 1, Region 2, RS]
ZKLess Region Assignment
▪ ZK no longer involved
▪ Master approves all assignment
▪ State is persisted only in Meta
▪ State is updated by the Master
[Diagram labels: Meta region, RS, Master, Region 1, Region 2, RS]
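The ZK-less flow can be sketched as a tiny state machine in which RegionServers report transitions to the master, the master approves or rejects them, and the only persisted copy of the state lives in meta. This is a simplified illustration under stated assumptions: the `Master` class, the reduced set of states, and the dict standing in for the meta table are hypothetical, not the actual HBase code.

```python
from enum import Enum


class State(Enum):
    OFFLINE = "OFFLINE"
    OPENING = "OPENING"
    OPEN = "OPEN"
    CLOSING = "CLOSING"


# Simplified transition table (an assumption for this sketch).
ALLOWED = {
    State.OFFLINE: {State.OPENING},
    State.OPENING: {State.OPEN, State.OFFLINE},
    State.OPEN: {State.CLOSING},
    State.CLOSING: {State.OFFLINE},
}


class Master:
    def __init__(self, meta):
        # meta maps region -> (State, server) and stands in for the meta table.
        self.meta = meta

    def assign(self, region, server):
        """The master starts an assignment by moving the region to OPENING."""
        return self._transition(region, State.OPENING, server)

    def report_transition(self, region, new_state, server):
        """Called by a RegionServer; the master approves or rejects the change."""
        _, owner = self.meta.get(region, (State.OFFLINE, None))
        if owner != server:
            return False  # only the server the master chose may report
        return self._transition(region, new_state, server)

    def _transition(self, region, new_state, server):
        state, _ = self.meta.get(region, (State.OFFLINE, None))
        if new_state not in ALLOWED[state]:
            return False
        self.meta[region] = (new_state, server)  # single persisted copy of state
        return True


meta = {}
master = Master(meta)
started = master.assign("region-1", "rs1")
rejected = master.report_transition("region-1", State.OPEN, "rs2")
approved = master.report_transition("region-1", State.OPEN, "rs1")
print(started, rejected, approved, meta["region-1"])
```

Because every change goes through one code path in the master, there is no second or third copy of the state to reconcile, which is the point of the "State is persisted only in Meta" bullet.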
Performance Comparison: Assignment Time for 1 Million Regions
Test               Latency
ZK                 1hr 16mins
ZK w/o force-sync  11mins
ZKLess             11mins
Single Meta Region
▪ Meta not splittable
▪ Large compactions
▪ Longer failover times
Splittable Meta Table
▪ Scale Horizontally
› I/O load
› Caching
› RPC Load
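Horizontal scaling of meta can be illustrated with a lookup that first finds which meta region covers a row key and then which server hosts that meta region. The sketch below assumes meta is partitioned by sorted split keys and spread over servers round-robin; the class `SplitMeta`, the split keys, and the server names are hypothetical.

```python
import bisect


class SplitMeta:
    """Toy model of a meta table split into several regions (hypothetical)."""

    def __init__(self, split_keys, servers):
        # split_keys are the start keys of every meta region after the first one.
        self.split_keys = sorted(split_keys)
        self.servers = list(servers)

    def meta_region_for(self, row_key):
        """Index of the meta region whose key range contains row_key."""
        return bisect.bisect_right(self.split_keys, row_key)

    def server_for(self, row_key):
        """Server hosting that meta region (round-robin placement, assumed)."""
        return self.servers[self.meta_region_for(row_key) % len(self.servers)]


meta = SplitMeta(split_keys=["g", "n", "t"], servers=["rs1", "rs2", "rs3"])
for key in ("apple", "grape", "zebra"):
    print(key, "-> meta region", meta.meta_region_for(key),
          "on", meta.server_for(key))
```

Lookups for different key ranges land on different servers, so I/O, cache, and RPC load for meta are spread out instead of concentrating on a single RegionServer.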
Performance Comparison: Assignment Time for 3 Million Regions
Configuration   Scan Meta  Assignment  Total
1 Meta / 1 RS   56min      19.79min    75.79min
1 Meta / 1 RS   58.63min   28.16min    86.79min
32 Meta / 3 RS  2.91min    12.56min    15.47min
32 Meta / 3 RS  3.6min     12.54min    16.4min
Data Locality
▪ HDFS
› Hadoop Distributed Filesystem
▪ Region Server
› Serves Regions
› Locality of a Region’s Data blocks
Favored Nodes
▪ HDFS
› Dictate block placement on file creation
▪ HBase
› Partially completed in Apache HBase
› Select 3 favored nodes per Region
› 1 Node on-rack, 2 Nodes off-rack
› Restrict Region Assignment
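The selection rule above can be sketched as follows, reading "1 on-rack, 2 off-rack" as: the first favored node is the region's primary server on its own rack, and the other two are chosen from servers on other racks. That reading, the function names, and the rack map are assumptions for illustration only.

```python
import random


def pick_favored_nodes(primary, rack_of, rng=None):
    """Return [primary, secondary, tertiary] with the last two off the primary's rack."""
    rng = rng or random.Random()
    primary_rack = rack_of[primary]
    off_rack = sorted(n for n, r in rack_of.items() if r != primary_rack)
    if len(off_rack) < 2:
        raise ValueError("need at least two servers outside the primary's rack")
    return [primary] + rng.sample(off_rack, 2)


def assignable_servers(favored, live_servers):
    """Restrict region assignment to favored nodes that are currently alive."""
    return [n for n in favored if n in live_servers]


# Hypothetical topology: server name -> rack name.
rack_of = {
    "rs1": "rack1", "rs2": "rack1",
    "rs3": "rack2", "rs4": "rack2",
    "rs5": "rack3", "rs6": "rack3",
}
favored = pick_favored_nodes("rs1", rack_of, random.Random(0))
print("favored nodes:", favored)
print("assignable now:", assignable_servers(favored, set(rack_of) - {"rs1"}))
```

If the primary fails, the region can only move to one of the two remaining favored nodes, which are the servers where HDFS was asked to place replicas of its blocks, so locality is preserved after failover.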
Favored Nodes – Fault Testing
Control Favored Nodes
THANK YOU
Icon Courtesy – iconfinder.com (under Creative Commons)
