Harmonizing Multi-tenant HBase Clusters for
Managing Workload Diversity
PRESENTED BY Dheeraj Kapur, Rajiv Chittajallu, Anish Mathew | May 5, 2014
Agenda
Topic | Speaker(s)
Overview of Hadoop stack and Grid Infrastructure at Yahoo | Rajiv Chittajallu
Application onboarding on Multi-Tenant HBase | Dheeraj Kapur
Automation for Compaction/Splits and Monitoring | Anish Mathew
Q&A | All Presenters
Hadoop at Yahoo
Hadoop Usage at Yahoo
HBaseCon 2014
Browsers
Mobile Devices
Web Crawl
Knowledge Graph
3rd Party
Yahoo Grid
Business Intelligence Tools (e.g. Tableau, MicroStrategy)
Data Collection
Asynchronous Data Processing
Synchronous Serving
User Events
WCC
Entity Feeds
Content Feeds
Source of truth for data*
Serving Systems: Home Run, Search, Mail, Mobile, Flickr, Media, Stream Ads, Native Ads, Display Ads, Content systems, Y! NoSql, …
Grid Infrastructure at Yahoo
A multi-tenant, secure, distributed compute and storage
environment, based on Hadoop stack for large scale data
processing
Grid Stack
Deployment Model
DataNode, NodeManager
NameNode, RM
DataNodes, RegionServers
NameNode, HBase Master, Nimbus
Supervisor
Administration, Management and Monitoring
ZooKeeper Pools
HTTP/HDFS/GDM Load Proxies
Applications and Data
Data Feeds
Data Stores
Oozie Server
HS2/HCat
Network Architecture – 1G to node
Network Architecture – 10G Node
[Diagram: ToR switches; two groups labeled VC0 and VC1, each containing spine0 to spine7 and leaf0 to leaf31; hosts. Link labels: 40G (or 4 x 10G) and 10G.]
HBase @ Yahoo
• 7 clusters, 1500 region servers, 6 PB of data
• Diverse use cases, 500+ Tables, 100k regions
• Rolling Major compaction & Split and Group Rebalancing
• RegionServer groups, namespaces and multi region config System
Challenges
• Customer onboarding and provisioning
• Access management and Table provisioning
• Deployments
• Customizing group configs
• Rolling Major Compaction and Splits
• Group Balancing
Use Cases
Search
▪ Web Cache
▪ Query Analysis
▪ Local Listings
▪ Analytics
Y! Mail
▪ Anti-spam
▪ Log Analytics
▪ Metadata Mgmt.
Cloud Platforms
▪ Performance
▪ Monitoring
▪ OpenStack
Consumer Platforms
▪ CMS
▪ Social Data
Online Ads
▪ Traffic Protection
▪ Ads Data Mgmt.
P13N
▪ Content P13N
▪ Ad targeting
Mobile
▪ Notifications
▪ Flickr
Sales
▪ eCommerce
Yahoo’s Global Business
Web Crawl Cache
[Diagram: Web Crawl Cache]
Actors and components: Developers/Scientists, Poller, Fetcher, Ingestor, Extruder, Processing
Labeled operations: poll, fetch, launch, write, random read, r/w, insert, scan
Compute Clusters: NM/DN nodes, HDFS NN, YARN RM
Clusters: RS/DN nodes, HDFS NN, HBase HM
Customer Onboarding & Multi Tenancy
Customer Onboarding & Provisioning
• Two identical environments (Prod and Non-Prod)
• Applications are onboarded to Non-Prod for performance and integration testing
• Once ready, provisioned on prod
• Performance results help in production onboarding
Namespaces
• Allow tenants to create/drop/modify their own tables
• Previously, only a super admin could do this
• Quota Management
• Security administration
• Commands : alter_namespace, create_namespace, describe_namespace,
drop_namespace, list_namespace, list_namespace_tables
RegionServer Groups
• QoS is missing in HBase 0.94
• Isolation is required in Multi-tenant env
• Multi configs are required for different apps
• Commands : group_add, group_balance, group_get, group_list,
group_list_tables, group_list_transitions, group_move_servers,
group_move_tables, group_of_server, group_of_table, group_remove
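The isolation idea behind RegionServer groups can be sketched as a small in-memory model. This is only an illustration, assuming that a table's regions may be placed only on servers in the table's group; the class `GroupAssigner` and its methods are hypothetical names that loosely mirror the `group_move_servers` and `group_move_tables` shell commands, and are not an HBase API.

```python
from collections import defaultdict


class GroupAssigner:
    """Toy model of RegionServer groups (hypothetical, not the HBase API)."""

    def __init__(self):
        self.server_group = {}  # server name -> group name
        self.table_group = {}   # table name -> group name

    def move_servers(self, group, servers):
        # Put each server into the given group.
        for server in servers:
            self.server_group[server] = group

    def move_tables(self, group, tables):
        # Put each table into the given group.
        for table in tables:
            self.table_group[table] = group

    def candidates(self, table):
        # Only servers in the table's own group may host its regions.
        group = self.table_group.get(table, "default")
        return sorted(s for s, g in self.server_group.items() if g == group)

    def assign(self, table, regions):
        # Spread the regions round-robin across the group's servers.
        servers = self.candidates(table)
        if not servers:
            raise ValueError(f"no servers available for table {table!r}")
        plan = defaultdict(list)
        for i, region in enumerate(regions):
            plan[servers[i % len(servers)]].append(region)
        return dict(plan)
```

With servers `rs1` and `rs2` in a `search` group and `rs3` in a `mail` group, regions of a table in the `search` group are never placed on `rs3`, so one tenant's load stays off another tenant's servers.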
Multi Region Configs
SVN
Jenkins Build Farm
Master Repository
Slave Repository Colo B
Slave Repository Colo A
HBase Cluster A
HBase Cluster B
Fetch Group List
Generate Multi Configs
Merge Default Config & Push multi config
Sync Configs
Download Host Maps and Multi Region Config
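The "merge default config and generate multi configs" step in this flow could look roughly like the sketch below. It assumes each group carries a small set of overrides that are layered on top of a shared default and rendered as an `hbase-site.xml` style document. The function names and the property values are illustrative assumptions, not the tooling used at Yahoo.

```python
import xml.etree.ElementTree as ET


def merge_config(default, overrides):
    """Return the default properties with group-specific overrides applied."""
    merged = dict(default)
    merged.update(overrides)
    return merged


def render_hbase_site(props):
    """Render a property dict as an hbase-site.xml style string."""
    root = ET.Element("configuration")
    for name in sorted(props):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(props[name])
    return ET.tostring(root, encoding="unicode")


def generate_multi_configs(default, group_overrides):
    """Build one rendered config per group (group name -> XML string)."""
    return {
        group: render_hbase_site(merge_config(default, overrides))
        for group, overrides in group_overrides.items()
    }


# Illustrative values only; real defaults depend on the deployment.
DEFAULT = {
    "hbase.hregion.max.filesize": 10737418240,
    "hbase.hstore.compaction.ratio": 1.2,
}
CONFIGS = generate_multi_configs(
    DEFAULT,
    {"search": {"hbase.hstore.compaction.ratio": 1.4}, "mail": {}},
)
```

Keeping the overrides small and the default shared means a change to the default reaches every group, while a group with special needs only records what differs.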
Compaction
• Minor & Major
• Minor: picks up a few smaller files and rewrites them as one
• Major: drops deleted and expired cells, picks up all files, and rewrites them as one
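The difference between the two compaction types can be shown with a toy model. This is a simplified sketch, not HBase code: a cell is a `(row, timestamp, value)` tuple, a value of `None` stands for a delete marker, a delete marker is assumed to mask every cell of the same row with an equal or older timestamp, and `ttl` is a single expiry age. All names are hypothetical.

```python
import heapq


def _sort_key(cell):
    # Order by row, then newest timestamp first.
    row, ts, _value = cell
    return (row, -ts)


def minor_compact(files):
    """Merge the given sorted files into one; delete markers are kept."""
    return list(heapq.merge(*files, key=_sort_key))


def major_compact(files, now, ttl):
    """Merge all files, dropping deleted cells, markers and expired cells."""
    cells = list(heapq.merge(*files, key=_sort_key))

    # Find the newest delete marker for each row.
    newest_delete = {}
    for row, ts, value in cells:
        if value is None:
            newest_delete[row] = max(ts, newest_delete.get(row, ts))

    kept = []
    for row, ts, value in cells:
        if value is None:
            continue  # the marker itself is no longer needed
        if ts <= newest_delete.get(row, float("-inf")):
            continue  # masked by a delete marker
        if now - ts > ttl:
            continue  # expired
        kept.append((row, ts, value))
    return kept
```

The minor merge can only combine files, because a delete marker may still need to mask cells in files that were not part of the merge. The major merge sees every file, so it can safely discard both the markers and the cells they mask.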
Compaction file selection
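[Figure: store files plotted by file size (vertical axis) against file age, older to younger (horizontal axis), with a minCompactSize threshold and files marked Excluded or Included]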
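A ratio-based selection consistent with this picture can be sketched as follows. This is a simplified approximation rather than the exact HBase implementation: walking from the oldest file, a file is skipped while it is larger than both `min_compact_size` and `ratio` times the combined size of the newer files in the window. The `min_files` and `max_files` parameters are illustrative stand-ins for the minimum and maximum number of files per compaction.

```python
def select_files(sizes, min_compact_size, ratio=1.2, min_files=3, max_files=10):
    """Pick store files for a minor compaction.

    sizes: file sizes ordered oldest -> youngest.
    Returns the indices of the selected files, or [] if too few qualify.
    """
    n = len(sizes)
    start = 0
    while n - start >= min_files:
        window_end = min(start + max_files, n)
        newer_total = sum(sizes[start + 1:window_end])
        # Small files always qualify; larger ones must be within the ratio.
        if sizes[start] <= max(min_compact_size, ratio * newer_total):
            break
        start += 1  # exclude this older, oversized file
    if n - start < min_files:
        return []
    return list(range(start, min(start + max_files, n)))
```

For sizes `[1000, 100, 60, 30, 20]` with `min_compact_size=10`, the oldest file is excluded because it dwarfs the rest, and the remaining four are selected. Files at or below `min_compact_size` are included regardless of the ratio.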
Compaction/Split Management
Config Parameters
• hbase.hstore.blockingStoreFiles
• hbase.hstore.compaction.max.size
• hbase.hstore.compaction.min.size
• hbase.hstore.compaction.ratio
• hbase.hregion.max.filesize
• hbase.hregion.memstore.flush.size
• hbase.master.wait.on.regionservers.mintostart
Managed Compactions and Splits
• Flexible Scheduling
• Custom Logic per table and workload
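One way to picture flexible scheduling with per-table logic is the sketch below. It assumes each table has a policy with an allowed time window and a store-file threshold, and that the scheduler limits how many regions it compacts on one server at a time. The `Region` and `TablePolicy` types and the `next_batch` function are hypothetical illustrations, not the scheduler described in the talk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Region:
    name: str
    table: str
    server: str
    store_files: int


@dataclass(frozen=True)
class TablePolicy:
    start_hour: int       # window start (inclusive)
    end_hour: int         # window end (exclusive); may wrap past midnight
    min_store_files: int  # only compact regions with at least this many files


def in_window(hour, policy):
    """True if the hour falls inside the policy's compaction window."""
    if policy.start_hour <= policy.end_hour:
        return policy.start_hour <= hour < policy.end_hour
    return hour >= policy.start_hour or hour < policy.end_hour


def next_batch(regions, policies, hour, per_server=1):
    """Choose region names to compact now, at most per_server per server."""
    eligible = [
        r for r in regions
        if r.table in policies
        and in_window(hour, policies[r.table])
        and r.store_files >= policies[r.table].min_store_files
    ]
    # Regions with the most store files go first; name breaks ties.
    eligible.sort(key=lambda r: (-r.store_files, r.name))
    used = {}
    batch = []
    for r in eligible:
        if used.get(r.server, 0) < per_server:
            batch.append(r.name)
            used[r.server] = used.get(r.server, 0) + 1
    return batch
```

Capping the work per server is what makes the compaction "rolling": only part of the cluster is rewriting files at any moment, so serving latency is affected on a few servers rather than all of them.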
Compaction and Splits Scheduler
Metrics
MySQL
Metrics
Analyze
Region Specific Metrics
Server Metrics
Scheduling Parameters
HBaseCtl: Scheduler
HDFS
Publish
HBase Cluster A
HBase Cluster B
HBase Cluster C
Update Compaction/Split Statistics
ZooKeeper: Coordination & Intermediate Store
Group Balancing
• Scheduled group balance followed by rolling major compaction
• Based on Data Locality
– Find data locality of each block of store files
– Move region to server where the maximum blocks are located
• Helps after cluster upgrades and restarts
• After config changes for a region group
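The locality rule in the bullets above can be sketched like this. It is a minimal illustration, assuming the block locations of each region's store files are already known as lists of host names and that a region may only move to a server in its own group. The function names are hypothetical.

```python
from collections import Counter


def block_counts(block_hosts, group_servers):
    """Count, per group server, how many of the region's blocks it holds.

    block_hosts: one list of replica host names per block.
    """
    counts = Counter()
    for hosts in block_hosts:
        for host in set(hosts):
            if host in group_servers:
                counts[host] += 1
    return counts


def plan_moves(region_blocks, current, group_servers):
    """Return {region: (from_server, to_server)} for regions worth moving."""
    moves = {}
    for region in sorted(region_blocks):
        counts = block_counts(region_blocks[region], group_servers)
        if not counts:
            continue  # no block is local to any server in the group
        # Server holding the most blocks; name breaks ties deterministically.
        target = min(counts, key=lambda h: (-counts[h], h))
        here = current[region]
        # Move only when the target holds strictly more blocks.
        if counts[target] > counts[here]:
            moves[region] = (here, target)
    return moves
```

A region stays put when its current server already holds as many blocks as any other, which avoids churn from moves that would not improve locality.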
Monitoring
• Simon Metrics & Yahoo Monitoring As a Service (YMS)
• OpenTSDB at Yahoo, replacing MySQL as backend for YMS
Monitoring (cont.)
Monitoring (cont.): Metrics for Customers
Simon System
Other Systems for Analysis & Reporting
Jenkins Job: Merges and Formats Metrics
HBase
HBase
Master
HDFS
Master
Grid Snodes
Customer Dashboards
Upload data to HDFS
Memory Dump from Master
Region Server Metrics
Push compiled metrics to snodes
Fetch metrics
Monitoring (cont.): OpenTSDB
• Under evaluation
• Work is required to make it production-ready at Yahoo
Thank You
