www.huawei.com
Security Level:
HUAWEI TECHNOLOGIES CO., LTD.
HBase Replication
Replication of Bulk Loaded Data
Ashish Singhi (@ashishsinghi89)
ashish.singhi@huawei.com
HBase Developer
08 October 2015
HUAWEI TECHNOLOGIES CO., LTD. Huawei Confidential 2
Agenda
1. Existing Replication Design
2. Enhanced Replication Design
3. Future Scope
Replication
• Keep data synchronized between the clusters
  ◦ Supports multiple destinations
  ◦ Supports cyclic replication
  ◦ Configurable at table/column family level
• Push-based architecture
• Uses WAL shipping to propagate data
• Used for disaster recovery, geo-distributed serving, and more
Replication State
• Maintains its state in ZooKeeper
• Default: /hbase/replication
• The Peers znode:
  ◦ Default: /hbase/replication/peers

/hbase/replication/peers:
    /peer1: zk1.host.com,…:2181:/hbase    (cluster key = ZK quorum:ZK client port:HBase root znode)
        /peer-state: ENABLED              (replication status)
        /tableCFs: table1;table2:cf1      (only replicate these tables/column families)
    /peerN: ….
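The two string formats on this slide can be illustrated with a small parser. This is a minimal sketch, not HBase code: the function names, the `ClusterKey` type, and the example host names are hypothetical, and it assumes that multiple column families would be comma-separated and that table names carry no namespace prefix.

```python
from typing import Dict, List, NamedTuple, Optional


class ClusterKey(NamedTuple):
    quorum: List[str]   # ZooKeeper quorum hosts
    client_port: int    # ZooKeeper client port
    znode_parent: str   # HBase root znode


def parse_cluster_key(key: str) -> ClusterKey:
    """Split 'ZK quorum:ZK client port:HBase root znode' into its three parts."""
    quorum, port, znode = key.split(":", 2)
    return ClusterKey(quorum.split(","), int(port), znode)


def parse_table_cfs(spec: str) -> Dict[str, Optional[List[str]]]:
    """Map each table to its column families; None means the whole table."""
    result: Dict[str, Optional[List[str]]] = {}
    for entry in spec.split(";"):
        entry = entry.strip()
        if not entry:
            continue
        table, _, cfs = entry.partition(":")
        # Assumption: several families for one table are comma-separated.
        families = [cf.strip() for cf in cfs.split(",") if cf.strip()]
        result[table.strip()] = families or None
    return result


# Hypothetical host names; the port and root znode match the slide.
key = parse_cluster_key("zk1.example.com,zk2.example.com:2181:/hbase")
table_cfs = parse_table_cfs("table1;table2:cf1")
print(key)
print(table_cfs)
```

Here `table1` maps to `None` (replicate every column family) while `table2` is restricted to `cf1`.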
Replication State
• The RS znode:
  ◦ Default: /hbase/replication/rs

/hbase/replication/rs:
    /rs1,16201,1234:       (server name = hostname,port,startcode)
        /peer1:            (peer ID, where logs should be replicated)
            /wal1:9086     (WAL log name & read offset)
            /walN:0
        /peerN: …
    /rsN,16201,1234: …
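The layout above can be read as a nested map from server name to peer ID to WAL name to read offset. The sketch below models it with plain Python dictionaries; it does not talk to ZooKeeper, and `parse_server_name`, `wal_offsets`, and the `rs_znode` snapshot are hypothetical names used only for illustration.

```python
from typing import Dict, List, NamedTuple, Tuple


class ServerName(NamedTuple):
    hostname: str
    port: int
    startcode: int


def parse_server_name(znode: str) -> ServerName:
    """Split 'hostname,port,startcode' from the right so the host part stays whole."""
    hostname, port, startcode = znode.rsplit(",", 2)
    return ServerName(hostname, int(port), int(startcode))


# Hypothetical snapshot mirroring the slide: server -> peer ID -> WAL name -> read offset.
rs_znode: Dict[str, Dict[str, Dict[str, int]]] = {
    "rs1,16201,1234": {"peer1": {"wal1": 9086, "walN": 0}},
}


def wal_offsets(server: str, peer_id: str) -> List[Tuple[str, int]]:
    """Return (WAL name, read offset) pairs queued on `server` for `peer_id`."""
    return sorted(rs_znode.get(server, {}).get(peer_id, {}).items())


server = parse_server_name("rs1,16201,1234")
offsets = wal_offsets("rs1,16201,1234", "peer1")
print(server)
print(offsets)
```

An offset of 0, as for `walN`, would mean that nothing from that WAL has been read for the peer yet.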
Existing Replication Design
[Diagram, built up step by step over slides 7 to 12]
• Source Cluster: Region Servers, each with a WAL, a Replication Source Manager, and Replication Source/End Points
• ZooKeeper: …/peers/ and …/rs/
• Peer Cluster 1 [tableCfs - 1] and Peer Cluster 2 [tableCfs - ]: Region Servers, each with a Replication Sink writing to a Table
• Writes reach the source cluster as Batch and Bulk load; the final step shows only Batch arriving at the Replication Sinks
• Limitation: Does not support bulk loaded data
Agenda
1. Existing Replication Design
2. Enhanced Replication Design
3. Future Scope
Enhanced Replication Design (HBASE-13153)
[Diagram, built up step by step over slides 14 to 19]
• Source Cluster: Region Servers, each with a WAL, a Replication Source Manager, and Replication Source/End Points
• ZooKeeper: …/peers/, …/rs/, and the new …/hfile-refs/
• Peer Cluster 1 [tableCfs - 1] and Peer Cluster 2 [tableCfs - ]: Region Servers, each with a Replication Sink writing to a Table
• Writes reach the source cluster as Batch and Bulk load; the final step shows both Batch and Bulk load arriving at the Replication Sinks
• Configuration: hbase.replication.bulkload.enabled [Default: false]
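Since the feature is off by default, it has to be switched on explicitly. The sketch below renders the property in the usual `hbase-site.xml` layout, assuming the flag is set there like other HBase properties; the helper name `hbase_property` is hypothetical, and any further settings the feature may need are outside what the slide states.

```python
import xml.etree.ElementTree as ET


def hbase_property(name: str, value: str) -> ET.Element:
    """Build one <property> element in the hbase-site.xml style."""
    prop = ET.Element("property")
    ET.SubElement(prop, "name").text = name
    ET.SubElement(prop, "value").text = value
    return prop


config = ET.Element("configuration")
# The property name comes from the slide; "true" enables bulk load replication.
config.append(hbase_property("hbase.replication.bulkload.enabled", "true"))

xml_text = ET.tostring(config, encoding="unicode")
print(xml_text)
```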
Replication State
• The HFile References znode:
  ◦ Default: /hbase/replication/hfile-refs

/hbase/replication/hfile-refs:
    /peer1:        (peer ID, where hfiles should be replicated)
        /hfile1    (names of the hfiles that need to be replicated)
        /hfileN
    /peerN: …
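The hfile-refs znode can be pictured as one list of pending hfile names per peer. The following is a simplified in-memory model, not the HBase implementation: the `HFileRefs` class and its methods are hypothetical, and it assumes that references are added for every peer when a bulk load is observed and removed once the files have been shipped to that peer.

```python
from collections import defaultdict
from typing import Dict, List


class HFileRefs:
    """In-memory stand-in for the hfile-refs znode: peer ID -> pending hfile names."""

    def __init__(self) -> None:
        self._refs: Dict[str, List[str]] = defaultdict(list)

    def add(self, peer_id: str, hfiles: List[str]) -> None:
        """Record hfiles that still have to be replicated to `peer_id`."""
        self._refs[peer_id].extend(hfiles)

    def remove(self, peer_id: str, hfiles: List[str]) -> None:
        """Drop references once the hfiles were shipped to `peer_id` (assumed trigger)."""
        shipped = set(hfiles)
        self._refs[peer_id] = [f for f in self._refs[peer_id] if f not in shipped]

    def pending(self, peer_id: str) -> List[str]:
        """Return a copy of the hfile names still queued for `peer_id`."""
        return list(self._refs[peer_id])


refs = HFileRefs()
for peer in ("peer1", "peerN"):
    refs.add(peer, ["hfile1", "hfileN"])
refs.remove("peer1", ["hfile1"])
print(refs.pending("peer1"))
print(refs.pending("peerN"))
```

Tracking references per peer means a slow or disabled peer keeps its own backlog without affecting what is still owed to the other peers.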
Metrics
• Source
  ◦ sizeOfHFileQueue: Number of bulk loaded hfiles pending
  ◦ shippedHFiles: Number of bulk loaded hfile entries shipped
• Sink
  ◦ appliedHFiles: Number of bulk loaded hfile entries applied
Limitations & Constraints
• If the visibility labels table holds different data in the source and peer clusters, scans will fail.
• The peer cluster requires read permission on the active cluster's HDFS.
• The peer cluster must have the compression codec libraries that the source cluster uses for hfile compression.
Agenda
1. Existing Replication Design
2. Enhanced Replication Design
3. Future Scope
Replication V2 (HBASE-14379)
• Eliminate permanent ZooKeeper node (HBASE-10295)
• Admin requests to be routed through master (HBASE-11392)
• Hbck support for fixing corrupt and stuck queues (HBASE-14014)
Thank you
www.huawei.com
Copyright©2011 Huawei Technologies Co., Ltd. All Rights Reserved.
The information in this document may contain predictive statements including, without
limitation, statements regarding the future financial and operating results, future product
portfolio, new technology, etc. There are a number of factors that could cause actual results
and developments to differ materially from those expressed or implied in the predictive
statements. Therefore, such information is provided for reference purpose only and
constitutes neither an offer nor an acceptance. Huawei may change the information at any
time without notice.
