hbaseconasia2019 HBase at Tencent
HBase At Tencent
Andrew Cheng | 程广旭
Tencent | HBase Committer
Content
01. HBase Service In Tencent
02. Applications
03. Practices & Optimization
01. HBase Service In Tencent
HBase Story in Tencent
• In use since 2013
• Versions used
  • 0.94.17 -> 0.98.6 -> 1.2.5 -> 2.2.0 (in progress)
• Largest cluster: more than 500 nodes
90+ clusters | 4000+ nodes | 10PB+ data | 3 trillion+ RPD
Overview
HBase users come from 6 groups and run more than 100 different applications
Architecture
[Diagram labels: Tencent HBase, Zookeeper, OpenTSDB, S2Graph, Spark Toolkit, HBase API, TDBank, Lhotse, RestServer, ThriftServer, Kylin, Phoenix, Doss, monitoring, TNM2, Deploy Center, Tenpay, Wepay, Game, Advertisement, …]
02. Applications
Tencent Ads – Real-Time Logjoin System
[Diagram layers: Data Source, Transport, Logical, Storage, Consumer]
[Diagram labels: Mixer, Exposure, Click, …; TDBank; LogJoin (x4); Tencent HBase with Association Table and Flow Table; Model learning, Freshness, Budget control, Report]
Tenpay - Transaction record
[Diagram labels: Data Source: MySQL; Binlog Parser: DBSync; Cache: Hippo; Storage: Tencent HBase; Thrift Server; Application (C++): Read, Read, Write; Application (Java): Read, Write; TDSort]
03. Practices & Optimization
Practices–Data migration
Data migration that is transparent to the business
[Diagram: Cluster A and Cluster B. Operations shown: add_peer, disable_peer, set REPLICATION_SCOPE => '1', snapshot, ExportSnapshot, clone_snapshot, enable_peer, check data, switch clients to the new cluster, set REPLICATION_SCOPE => '0', delete_snapshot]
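One plausible ordering of the operations named on the data-migration slide, sketched as a small Python helper that only builds HBase shell and command-line strings. The step order, the peer id, the ZooKeeper cluster key, the table, column family, snapshot name, and HDFS URL are all assumptions for illustration; the slide does not spell them out.

```python
def migration_plan(table, family, peer_id, cluster_b_key, snapshot, cluster_b_root):
    """Return (where, command) steps for a replication-plus-snapshot migration.

    The ordering is an assumption reconstructed from the slide labels,
    not a procedure confirmed by the talk.
    """
    scope = "alter '{t}', {{NAME => '{f}', REPLICATION_SCOPE => '{s}'}}"
    return [
        # Register Cluster B as a peer, but hold shipping until B has the base data.
        ("A shell", f"add_peer '{peer_id}', CLUSTER_KEY => '{cluster_b_key}'"),
        ("A shell", f"disable_peer '{peer_id}'"),
        # From here on, new edits are queued for the (disabled) peer.
        ("A shell", scope.format(t=table, f=family, s=1)),
        # Copy the existing data as a snapshot and materialize it on B.
        ("A shell", f"snapshot '{table}', '{snapshot}'"),
        ("A cli", "hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot "
                  f"-snapshot {snapshot} -copy-to {cluster_b_root}"),
        ("B shell", f"clone_snapshot '{snapshot}', '{table}'"),
        # Replay the edits that accumulated while the snapshot was copied.
        ("A shell", f"enable_peer '{peer_id}'"),
        ("manual", "check data on Cluster B, then switch clients to Cluster B"),
        # Clean up once clients write to B only.
        ("A shell", scope.format(t=table, f=family, s=0)),
        ("A shell", f"delete_snapshot '{snapshot}'"),
    ]


if __name__ == "__main__":
    # Hypothetical names throughout.
    for where, cmd in migration_plan("t1", "cf", "1", "zk1,zk2,zk3:2181:/hbase",
                                     "t1_snap", "hdfs://cluster-b/hbase"):
        print(f"[{where}] {cmd}")
```

Disabling the peer before turning on REPLICATION_SCOPE means writes made during the snapshot copy are queued rather than lost, which is presumably what makes the switch invisible to clients.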
Practices–Table
• Create a table per day
  • Large amount of data
  • Short TTL
• Benefits
  • Reduces the amount of data in each compaction
  • Makes expired data easy to delete
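The table-per-day practice can be sketched as follows. The naming scheme (`<base>_YYYYMMDD`), the function names, and the retention parameters are hypothetical; the slide only states that one table is created per day and that expired data becomes easy to delete.

```python
from datetime import date, timedelta


def daily_table_name(base, day):
    """Name of the table that holds one day of data, e.g. log_20190720."""
    return f"{base}_{day:%Y%m%d}"


def expired_tables(base, today, retention_days, lookback_days=30):
    """Names of daily tables that fall outside the retention window.

    Dropping a whole table replaces per-cell TTL expiry, so no compaction
    is needed to reclaim the space.
    """
    return [
        daily_table_name(base, today - timedelta(days=age))
        for age in range(retention_days, retention_days + lookback_days)
    ]


if __name__ == "__main__":
    today = date(2019, 7, 20)
    print(daily_table_name("log", today))
    print(expired_tables("log", today, retention_days=3, lookback_days=2))
```

With a retention of three days, the tables for today and the two previous days are kept, and everything older is a candidate for `disable` and `drop`.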
Optimization - Bandwidth
Traffic sources shown in the diagram:
① Input data
② RS2 and RS3: WAL data
③ RS2 and RS3: flush data
④ RS2 and RS3: small compaction data
⑤ RS2 and RS3: large compaction data
[Diagram: RS1, RS2, RS3, each receiving input data; on RS1 the data passes through WAL, flush, small compaction, and large compaction]
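A rough back-of-the-envelope model of the five traffic sources in the bandwidth diagram. It assumes that each of steps ② to ⑤ sends one copy of the written bytes to each of two other nodes (RS2 and RS3), and that compactions rewrite a configurable multiple of the input; these ratios and the function name are illustrative assumptions, not figures from the talk.

```python
def replica_traffic(input_bytes, small_compact_ratio, large_compact_ratio,
                    remote_replicas=2):
    """Estimate network bytes caused by one batch of writes to a RegionServer.

    Ignores compression and protocol overhead; ratios are assumed values.
    """
    stages = {
        "input": input_bytes,                                   # (1) input data
        "wal": input_bytes * remote_replicas,                   # (2) WAL copies
        "flush": input_bytes * remote_replicas,                 # (3) flushed HFiles
        "small_compact": input_bytes * small_compact_ratio * remote_replicas,  # (4)
        "large_compact": input_bytes * large_compact_ratio * remote_replicas,  # (5)
    }
    stages["total"] = sum(stages.values())
    return stages


if __name__ == "__main__":
    for stage, size in replica_traffic(100, 1, 1).items():
        print(f"{stage:>14}: {size}")
```

Under these assumptions most of the traffic comes from steps ② to ⑤ rather than from the input itself, which is why reducing WAL, flush, and compaction volume is where the bandwidth savings would come from.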
Optimization - Bandwidth
• Enable CellBlock compression
• Enable WAL compression
• Increase the memstore size
• Reduce the number of compaction threads
• Turn off major compaction
• Create tables by day
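The bandwidth settings listed above could map to HBase configuration properties roughly as follows. Which property each bullet refers to is an assumption (the slide names none), and every value is a placeholder rather than a recommendation. The helper renders the dictionary as an `hbase-site.xml` fragment.

```python
from xml.sax.saxutils import escape

# Assumed mapping from the slide's bullets to HBase properties; values are placeholders.
SETTINGS = {
    "hbase.client.rpc.compressor": "org.apache.hadoop.io.compress.GzipCodec",  # CellBlock compression
    "hbase.regionserver.wal.enablecompression": "true",                        # WAL compression
    "hbase.hregion.memstore.flush.size": str(256 * 1024 * 1024),               # larger memstore
    "hbase.regionserver.thread.compaction.small": "1",                         # fewer compaction threads
    "hbase.regionserver.thread.compaction.large": "1",
    "hbase.hregion.majorcompaction": "0",                                      # no periodic major compaction
}


def to_hbase_site(settings):
    """Render a name/value mapping as an hbase-site.xml <configuration> block."""
    lines = ["<configuration>"]
    for name, value in settings.items():
        lines.append("  <property>")
        lines.append(f"    <name>{escape(name)}</name>")
        lines.append(f"    <value>{escape(value)}</value>")
        lines.append("  </property>")
    lines.append("</configuration>")
    return "\n".join(lines)


if __name__ == "__main__":
    print(to_hbase_site(SETTINGS))
```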
Optimization - Online filtering of dirty data
• Problem: a large amount of data shares the same rowkey
• How to find the rowkeys to filter?
  • ResponseTooSlow
• How to set the rowkeys to filter?
  • hbase.hregion.filter.rowkeys
• How to refresh the rowkeys to filter?
  • update_config
[Flowchart labels: Input Data; Filter Enable (Yes/No); Filter (Yes/No); Write]
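A minimal sketch of the write-path decision in the dirty-data flowchart: if filtering is enabled and the incoming rowkey is on the filter list, the write is dropped; otherwise it proceeds. The class, its method names, and the comma-separated value format are hypothetical; the slide only names the `hbase.hregion.filter.rowkeys` setting and the `update_config` refresh.

```python
class RowkeyFilter:
    """Drops writes whose rowkey is on a configurable block list."""

    def __init__(self, rowkeys=()):
        self._blocked = frozenset(rowkeys)

    @property
    def enabled(self):
        # Filtering counts as enabled only when at least one rowkey is configured.
        return bool(self._blocked)

    def refresh(self, config_value):
        """Reload the block list from a comma-separated string (assumed format)."""
        parts = (part.strip() for part in config_value.split(","))
        self._blocked = frozenset(part for part in parts if part)

    def should_write(self, rowkey):
        """Return False for rowkeys that must be filtered out."""
        return not (self.enabled and rowkey in self._blocked)


if __name__ == "__main__":
    flt = RowkeyFilter()
    flt.refresh("hot_key_1, hot_key_2")   # stands in for an online config refresh
    for key in ("hot_key_1", "user_42"):
        print(key, "write" if flt.should_write(key) else "filtered")
```

Keeping the block list in an immutable set that is swapped on refresh means in-flight writes never see a half-updated list.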
Optimization - Prefix Bloom Filter(HBASE-20636)
• ROWPREFIX_FIXED_LENGTH
• ROWPREFIX_DELIMITED
[Diagram: rowkey made of uin, ts, action; the Bloom filter is built on the rowkey prefix]
[Screenshots not recoverable: "Create Table:" and "File info:"]
Optimization - Prefix Bloom Filter(HBASE-20636)
[Flowchart, read path: Scan {StartKey, EndKey} -> same prefix, with prefix length >= prefix_length? -> No: do not filter the StoreFile; Yes: get prefix key by prefix_length -> compute hash value -> hit BloomFilter? -> Yes: do not filter the StoreFile; No: filter the StoreFile]
[Flowchart, write path: Input Data -> read rowkey -> get prefix key by prefix_length -> compute hash value -> set BloomFilter -> last line? -> No: read the next rowkey; Yes: write BloomFilter information to StoreFile metadata]
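The two prefix Bloom filter flowcharts (write path and scan path) can be illustrated with a toy fixed-length prefix filter. This is a simplified sketch, not the HBASE-20636 implementation: the class name, the SHA-256 based hashing, the bit-array size, and the sample rowkeys are all assumptions.

```python
import hashlib


class PrefixBloom:
    """Toy Bloom filter keyed on the first prefix_length bytes of each rowkey."""

    def __init__(self, prefix_length, num_bits=1024, num_hashes=3):
        self.prefix_length = prefix_length
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits // 8 + 1)

    def _positions(self, prefix):
        # Derive num_hashes bit positions from salted SHA-256 digests.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(bytes([i]) + prefix).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add_rowkey(self, rowkey):
        """Write path: take the prefix, hash it, set the bits."""
        for pos in self._positions(rowkey[:self.prefix_length]):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def may_contain_scan(self, start_key, stop_key):
        """Scan path: False means the StoreFile can be skipped."""
        n = self.prefix_length
        if len(start_key) < n or len(stop_key) < n or start_key[:n] != stop_key[:n]:
            return True  # the range spans several prefixes, so the filter cannot help
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(start_key[:n]))


if __name__ == "__main__":
    bloom = PrefixBloom(prefix_length=4)
    for key in (b"1001|1563600000|click", b"1001|1563600100|view"):
        bloom.add_rowkey(key)
    print(bloom.may_contain_scan(b"1001|", b"1001~"))
    print(bloom.may_contain_scan(b"9999|", b"9999~"))
```

A scan whose start and stop keys share the fixed-length prefix can only match rowkeys with that prefix, so a miss in the filter is enough to skip the whole StoreFile; a hit may still be a false positive and the file is simply read.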
Optimization - RestServer
[Diagram labels: User; Nginx; RestServer A, RestServer B, RestServer C, RestServer D; Cluster A, Cluster B, Cluster C]
Optimization - RestServer
[Diagram labels: User; Nginx; RestServer A, RestServer B, RestServer C; MySQL; Cluster A, Cluster B, Cluster C]
Optimization - RestServer
• Only one configuration to maintain
• Resources are used effectively
• User-friendly access
HBase Community
• 1 committer, 2 contributors
• Total commits: 80+
• Features
  • HBASE-20636 Introduce two bloom filter type : ROWPREFIX_FIXED_LENGTH and ROWPREFIX_DELIMITED
  • HBASE-19799 Add web UI to rsgroup
  • HBASE-20243 [Shell] Add shell command to create a new table by cloning the existent table
  • HBASE-19483 Add proper privilege check for rsgroup commands
  • …
Join Us
[QR codes: Personal WeChat, Dept. WeChat]
Thanks!
