1
Ecosystems built with HBase and
CloudTable Service at Huawei
Jieshan Bi, Yanhui Zhong
2
Agenda
CTBase: A lightweight HBase client for structured data
Tagram: Distributed bitmap index implementation with HBase
CloudTable Service (HBase on Huawei Cloud)
3
CTBase Design Motivation
 Most of our customer scenarios involve structured data
 HBase secondary indexing is a basic requirement
 Each new application used to mean new secondary development on HBase
 Simple cross-table join queries are common
 Full-text indexing is also required for some customer scenarios
4
CTBase Features
 Schematized table
 Global secondary index
 Cluster table for simple cross-table join queries
 Online schema changes
 JSON based query DSL
5
Schematized Table
UserTable: the service-level conceptual user table that stores user data.
Column: a user table column; each column indicates one attribute of the service data.
Index:
• Primary index: the rowkey of the table that stores the user data, representing the search scenario with the highest probability.
• Secondary index: stores the mapping from the index back to the primary index.
Qualifier: an HBase column; each qualifier holds one KeyValue. User columns are mapped onto qualifiers.
(Diagram: a UserTable contains Columns and Indexes; Columns map to Qualifiers; an index row is laid out as Index RowKey | Column 1 | Column 2 | Column 3.)
Schematized tables are a better fit for storing structured user data. Many modern NewSQL databases, such as Megastore, Spanner, F1, and Kudu, are designed around schematized tables.
6
CTBase provides a schema definition API. A schema definition includes:
 Table Creation
A user table exists in either simple or cluster table mode.
 Column Definition
A column is a concept similar to an RDBMS column. Each column has a specific type and length limit.
 Qualifier Definition
The column to ColumnFamily:Qualifier mapping. CTBase supports composite columns: multiple columns can be stored in one ColumnFamily:Qualifier.
 Index Definition
An index is either primary or secondary. The major part of an index definition is the index rowkey definition. Hot columns can also be stored in the secondary index row.
A sketch of the underlying table provisioning follows below.
Schema Manager
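The CTBase schema API itself is not shown here. As context only, the sketch below uses the plain HBase 1.x admin API to provision the kind of tables a schema manager must create behind the scenes (one user table plus one secondary-index table); the table and family names are hypothetical.

import java.io.IOException;
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

// Sketch: provision the user table and its secondary-index table with the
// stock HBase 1.x admin API (roughly what a schema manager automates).
public class CreateTablesSketch {
    public static void main(String[] args) throws IOException {
        Configuration conf = HBaseConfiguration.create();
        try (Connection conn = ConnectionFactory.createConnection(conf);
             Admin admin = conn.getAdmin()) {
            HTableDescriptor user = new HTableDescriptor(TableName.valueOf("UserInfo"));
            user.addFamily(new HColumnDescriptor("I"));   // data family (hypothetical name)
            admin.createTable(user);
            HTableDescriptor index = new HTableDescriptor(TableName.valueOf("UserInfo_IDX"));
            index.addFamily(new HColumnDescriptor("I"));  // index family (hypothetical name)
            admin.createTable(index);
        }
    }
}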
7
 Meta Cache
Each client keeps the schema locally in memory for fast data conversion.
 Meta Backup/Recovery Tool
Schema data can be exported as a data file for fast recovery.
 Schema Changes
• Column changes
• Qualifier changes
• Index changes
Some changes are lightweight because they can take advantage of the schema-less nature of HBase, but others may require rebuilding the existing data.
Schema Manager Cont.
8
HBase Global Secondary Index
NAME ID
Ariya I0000005
Bai I0000006
He I0000004
Lily I0000001
Lina I0000003
Lina I9999999
Lisa I0000008
Wang I0000002
Wang I0000007
……. ………….
Xiao I0000009
ID NAME PROVINCE GENDER PHONE AGE
I0000001 Lily Shandong MALE 13322221111 20
I0000002 Wang Guangdong FEMALE 13222221111 15
I0000003 Lina Shanxi FEMALE 13522221111 13
I0000004 He Henan MALE 13333331111 18
I0000005 Ariya Hebei FEMALE 13344441111 28
I0000006 Bai Hunan MALE 15822221111 30
I0000007 Wang Hubei FEMALE 15922221111 35
I0000008 Lisa Heilongjiang MALE 15844448888 38
I0000009 Xiao Jilin MALE 13802514000 38
…………. ……. …… ………. ………………….. ….
I9999999 Lina Liaoning MALE 13955225522 70
Query: NAME = 'Lina'
A secondary index serves queries on non-key columns. A global secondary index is better for OLTP-like queries with small result batches.
(Diagram: user regions Region1–Region4 store the data table; index regions IndexRegionA and IndexRegionB store the NAME index, which maps NAME back to the ID rowkey.)
9
HBase Global Secondary Index Cont.
Index RowKey Format
An index rowkey is composed of sections. A section normally maps to one user column, but can also be a constant or a random number.
Suppose table UserInfo includes these 5 columns: ID, NAME, ADDRESS, PHONE, DATE.
The primary key is composed of 3 sections:
Section 1: ID
Section 2: NAME
Section 3: truncate(DATE, 8)
So the primary rowkey is: ID | NAME | truncate(DATE, 8)
Secondary index key for the NAME index: NAME | ID (H) | truncate(DATE, 8) (H)
Secondary index key for the PHONE index: PHONE | ID (H) | NAME (H) | truncate(DATE, 8) (H)
NOTE: sections marked (H) also exist in the primary key, so the primary rowkey can be rebuilt from the index row.
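A minimal, self-contained sketch of the section composition described above (plain Java, not the CTBase implementation; the '|' separator is a hypothetical choice):

import java.nio.charset.StandardCharsets;

// Sketch: compose the primary rowkey and the NAME index rowkey from sections.
public class RowKeySketch {
    private static final char SEP = '|'; // hypothetical section separator

    static String truncate(String value, int len) {
        return value.length() <= len ? value : value.substring(0, len);
    }

    static byte[] concat(String... sections) {
        return String.join(String.valueOf(SEP), sections)
                     .getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        String id = "I0000003", name = "Lina", date = "20170804183000";
        // Primary rowkey: ID | NAME | truncate(DATE, 8)
        byte[] primaryKey = concat(id, name, truncate(date, 8));
        // NAME index rowkey: NAME | ID | truncate(DATE, 8). The trailing
        // sections also exist in the primary key, so the primary rowkey can
        // be rebuilt from the index row without an extra lookup.
        byte[] nameIndexKey = concat(name, id, truncate(date, 8));
        System.out.println(new String(primaryKey, StandardCharsets.UTF_8));   // I0000003|Lina|20170804
        System.out.println(new String(nameIndexKey, StandardCharsets.UTF_8)); // Lina|I0000003|20170804
    }
}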
10
Example: select a.account_id, a.amount, b.account_name, b.account_balance from Transactions a
left join AccountInfo b on a.account_id = b.account_id where a.account_id = 'xxxxxxx'
account_id amount time
A0001 $100 12/12/2014 18:00:02
A0001 $1020 10/12/2014 15:30:05
A0001 $89 09/12/2014 13:00:07
A0002 $105 11/12/2014 20:15:00
account_id account_name account_balance
A0001 Andy $100232
A0002 Lily $902323
A0003 Selina $90000
A0004 Anna $102320
A0001 Andy $100232
A0001 $100 12/12/2014 18:00:02
A0001 $1020 10/12/2014 15:30:05
A0001 $89 09/12/2014 13:00:07
A0002 Lily $902323
A0002 $105 11/12/2014 20:15:00
A0002 $129 11/11/2014 18:15:00
Records from different business-level user tables are stored together (Transaction records interleaved with AccountInfo records per account).
Pre-Joining with Keys: a better solution for cross-table joins in HBase. Records that come from different tables but share the same leading primary key columns can be stored adjacent to each other, so the cross-table join turns into a sequential scan (see the sketch below).
Cluster Table
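A toy sketch of the pre-joining idea, with a TreeMap standing in for one HBase region's sorted rowkey space; the "A"/"T" record-type markers are hypothetical:

import java.util.TreeMap;

// Sketch: rows from two logical tables share an account_id key prefix, so
// lexicographic rowkey order stores them adjacently, and one sequential
// prefix scan returns the joined view with no second-table lookup.
public class PreJoinSketch {
    public static void main(String[] args) {
        TreeMap<String, String> region = new TreeMap<>(); // stands in for one region
        region.put("A0001|A", "Andy, $100232");            // AccountInfo record
        region.put("A0001|T|20141212180002", "$100");      // Transaction records
        region.put("A0001|T|20141210153005", "$1020");
        region.put("A0002|A", "Lily, $902323");
        region.put("A0002|T|20141211201500", "$105");
        // The "join" for account A0001 is a prefix scan:
        region.subMap("A0001", "A0002")
              .forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}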
11
Table table = null;
try {
    table = conn.getTable(TABLE_NAME);
    // Generate RowKey.
    String rowKey = record.getId() + SEPERATOR + record.getName();
    Put put = new Put(Bytes.toBytes(rowKey));
    // Add name.
    put.add(FAMILY, Bytes.toBytes("N"), Bytes.toBytes(record.getName()));
    // Add phone.
    put.add(FAMILY, Bytes.toBytes("P"), Bytes.toBytes(record.getPhone()));
    // Add composite columns.
    String compositeColumn = record.getAddress() + SEPERATOR
            + record.getAge() + SEPERATOR + record.getGender();
    put.add(FAMILY, Bytes.toBytes("Z"), Bytes.toBytes(compositeColumn));
    table.put(put);
} catch (IOException e) {
    // Handle exception.
} finally {
    // ……..
}
ClusterTableInterface table = null;
try {
    table = new ClusterTable(conf, CLUSTER_TABLE);
    CTRow row = new CTRow();
    // Add all columns.
    row.addColumn("ID", record.getId());
    row.addColumn("NAME", record.getName());
    row.addColumn("Address", record.getAddress());
    row.addColumn("Phone", record.getPhone());
    row.addColumn("Age", record.getAge());
    row.addColumn("Gender", record.getGender());
    table.put(USER_TABLE, row);
} catch (IOException e) {
    // Handle exception.
} finally {
    // ………….
}
RowKey/Put/KeyValue are not directly visible to the application. The secondary index row is auto-generated by CTBase.
HBase Write vs. ClusterTable Write
12
JSON Based Query DSL
{
  table: "TableA",
  conditions: ["ID": "23470%", "CarNo": "A1?234",
               "Color": "Yellow || Black || White"],
  columns: ["ID", "Time", "CarNo", "Color"],
  caching: 100
}
 Flexible and powerful query API.
 Supports the following operators:
Range operators: >, >=, <, <=
Logic operators: &&, ||
Fuzzy operators: ?, *, %
 An index name can be specified, or the embedded rule-based optimizer (RBO) chooses the best index.
 Existing or customized filters push queries down to the server side to decrease query latency (see the sketch below).
(Query execution pipeline: JSON → JSON Analyzer → Rule-Based Optimizer → Query Plan → Result Scanner → Result.)
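The deck does not show how CTBase maps conditions to filters; as an assumption-laden sketch, here is how conditions like the example above could be pushed down with stock HBase 1.x filters (family/qualifier names are hypothetical, and the ?/% fuzzy operators are translated into a regex):

import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.filter.CompareFilter.CompareOp;
import org.apache.hadoop.hbase.filter.FilterList;
import org.apache.hadoop.hbase.filter.RegexStringComparator;
import org.apache.hadoop.hbase.filter.SingleColumnValueFilter;
import org.apache.hadoop.hbase.util.Bytes;

// Sketch: translate DSL conditions into server-side HBase filters.
public class PushdownSketch {
    static Scan buildScan() {
        byte[] family = Bytes.toBytes("I"); // hypothetical column family
        // CarNo = "A1?234": '?' matches exactly one character, so use a regex.
        SingleColumnValueFilter carNo = new SingleColumnValueFilter(
                family, Bytes.toBytes("CarNo"), CompareOp.EQUAL,
                new RegexStringComparator("^A1.234$"));
        // Color = "Yellow || Black || White": OR of exact matches.
        FilterList color = new FilterList(FilterList.Operator.MUST_PASS_ONE);
        for (String c : new String[] {"Yellow", "Black", "White"}) {
            color.addFilter(new SingleColumnValueFilter(
                    family, Bytes.toBytes("Color"), CompareOp.EQUAL, Bytes.toBytes(c)));
        }
        // All top-level conditions must hold (implicit &&).
        FilterList all = new FilterList(FilterList.Operator.MUST_PASS_ALL);
        all.addFilter(carNo);
        all.addFilter(color);
        Scan scan = new Scan();
        scan.setFilter(all);
        scan.setCaching(100); // the "caching" field of the DSL query
        return scan;
    }
}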
13
Bulk Load
(Flow: structured data is converted, using the local schema, into user-data KeyValues and index-data KeyValues, each written to its own HFiles.)
 The schema has been defined in advance, including columns, column-to-qualifier mappings, index rowkey format, etc. The only configuration a bulk load task still needs is the column order of the data file.
 Secondary-index HFiles can be generated together with the data HFiles in one bulk load task (see the sketch below).
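A sketch of the idea that one bulk-load task emits both user-data and index-data rows. The rowkey layout and column names are hypothetical simplifications, and the wiring that routes the two row families into per-table HFiles (e.g. via HFileOutputFormat2) is omitted:

import java.io.IOException;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.util.Bytes;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

// Sketch: one mapper emits both the user row and its secondary-index row,
// so user and index HFiles come out of the same bulk-load task. The column
// order of the CSV (ID,NAME,PHONE) is the task's only configuration.
public class UserAndIndexMapper
        extends Mapper<LongWritable, Text, ImmutableBytesWritable, Put> {
    private static final byte[] FAMILY = Bytes.toBytes("I");

    @Override
    protected void map(LongWritable offset, Text line, Context context)
            throws IOException, InterruptedException {
        String[] cols = line.toString().split(","); // ID, NAME, PHONE
        // User row, keyed by the primary rowkey (here simply ID|NAME).
        byte[] userKey = Bytes.toBytes(cols[0] + "|" + cols[1]);
        Put userPut = new Put(userKey);
        userPut.addColumn(FAMILY, Bytes.toBytes("P"), Bytes.toBytes(cols[2]));
        context.write(new ImmutableBytesWritable(userKey), userPut);
        // Index row for the NAME index, pointing back at the primary key.
        byte[] idxKey = Bytes.toBytes(cols[1] + "|" + cols[0]);
        Put idxPut = new Put(idxKey);
        idxPut.addColumn(FAMILY, Bytes.toBytes("K"), userKey);
        context.write(new ImmutableBytesWritable(idxKey), idxPut);
    }
}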
14
Future Work For CTBase
1. Better Full-Text index support.
2. Active-Active Clusters Client.
3. Better HFile format for structured data.
15
Agenda
CTBase: A lightweight HBase client for structured data
Tagram: Distributed bitmap index implementation with HBase
CloudTable Service (HBase on Huawei Cloud)
16
 Low-cardinality attributes are widely used in the personas area; they describe typical user/entity characteristics, behavior patterns, and motivations. E.g. attributes describing buyer personas can help identify where your best customers spend time on the internet.
 Ad-hoc queries must be supported, like:
“How many male customers are under age 30?”
“How many customers have these specific attributes?”
“Which people appeared in Area-A, Area-B and Area-C between 9:00 and 12:00?”
 Solr/Elasticsearch-based solutions are not fast enough for ad-hoc queries over low-cardinality attributes.
Tagram Design Motivation
17
Tagram Introduction
 A distributed bitmap index implementation that uses HBase as backend storage.
 Millisecond-level latency for attribute-based ad-hoc queries.
 Each attribute value is called a Tag; an entity is called a TagHost. Each Tag relates to an independent bitmap, and the bitmaps of hot tags are memory-resident.
 A Tag is either static or dynamic. Static tags must be defined in advance; dynamic tags have no such restriction, e.g. time-space related tags.
Example condition:
GENDER:Male AND MARRIAGE:Married AND AGE:25-30 AND BLOOD_TYPE:A AND CAROWNER
(Diagram: the Tagram client parses the conditions into an AST, query optimization produces a query plan, and each TagZone executes it by ANDing the per-tag bitmaps. Each attribute value relates to a bitmap, and each bit represents whether an entity has that attribute. A sketch follows below.)
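A minimal sketch of the bitmap evaluation pictured above, with java.util.BitSet standing in for Tagram's bitmaps: one bitmap per tag, bit i set when the TagHost with TID i carries the tag, and an AND condition evaluated by intersection.

import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

// Sketch: per-tag bitmaps intersected to evaluate an AND condition.
public class TagBitmapSketch {
    public static void main(String[] args) {
        Map<String, BitSet> tags = new HashMap<>();
        tags.computeIfAbsent("GENDER:Male", t -> new BitSet()).set(0);
        tags.get("GENDER:Male").set(2);
        tags.computeIfAbsent("MARRIAGE:Married", t -> new BitSet()).set(2);
        tags.get("MARRIAGE:Married").set(3);
        tags.computeIfAbsent("CAROWNER", t -> new BitSet()).set(2);

        // GENDER:Male AND MARRIAGE:Married AND CAROWNER
        BitSet result = (BitSet) tags.get("GENDER:Male").clone();
        result.and(tags.get("MARRIAGE:Married"));
        result.and(tags.get("CAROWNER"));
        System.out.println("Matching TIDs: " + result); // {2}
    }
}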
18
(Diagram: each TagZone holds a Bitmap Container, a Dynamic Tag Loader, a Query Cache, and service threads, and checkpoints its bitmaps to HDFS. HBase stores the TagHostGroup and TagSource tables, the StaticTag ChangeLog, and the DynamicTag PostingList; a bitmap's latest data view = checkpoint (base) + changes (delta).)
Tagram Architecture
 The TagZone service is initialized by an HBase coprocessor.
 Each TagZone is an independent bitmap computing unit.
 All real-time writes and logs are stored in HBase.
 Bitmap checkpoints enable fast recovery during service initialization.
19
Data Model
(Diagram: TagHostGroup maps each TagHost to its tags, keyed by TagHostID (any type) with an integer TID; TagHostGroup_TAGZONE is the inverted index of Tag to TagHosts; TagSource holds the tag metadata.)
 TagSource: metadata storage for static tags, including per-tag configuration.
 TagHostGroup: uses TagHostID as the key and stores all the tags as columns.
 TagZone: the inverted index from Tag to TagHost list. Bitmap-related data is also stored in this table. Partitions are decided during table creation and cannot split afterwards.
 Each table is an independent HBase table.
20
Query
Query grammar in BNF:
Query ::= ( Clause )+
Clause ::= ["AND", "OR", "NOT"] ( [TagName:]TagValue | "(" Query ")" )
 A Query is a series of Clauses, and each Clause can itself be a nested query.
 Supports AND/OR/NOT operators. AND means the clause is required, NOT means the clause is prohibited, and OR means the clause should appear in the matching results. The default operator is OR if no operator is specified (see the sketch below).
 Parentheses “(” “)” can be used to raise the priority of a sub-query.
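A sketch of these clause semantics over tag bitmaps, again with java.util.BitSet as a stand-in rather than Tagram's engine: AND clauses are intersected, NOT clauses are subtracted, OR clauses are unioned.

import java.util.BitSet;
import java.util.List;

// Sketch: evaluate a flat list of AND/OR/NOT clauses over tag bitmaps.
public class ClauseEvalSketch {
    enum Op { AND, OR, NOT }

    static class Clause {
        final Op op; final BitSet bits;
        Clause(Op op, BitSet bits) { this.op = op; this.bits = bits; }
    }

    static BitSet evaluate(List<Clause> clauses) {
        BitSet required = null;          // intersection of AND clauses
        BitSet optional = new BitSet();  // union of OR clauses
        BitSet prohibited = new BitSet(); // union of NOT clauses
        for (Clause c : clauses) {
            if (c.op == Op.AND) {
                if (required == null) required = (BitSet) c.bits.clone();
                else required.and(c.bits);
            } else if (c.op == Op.OR) {
                optional.or(c.bits);
            } else {
                prohibited.or(c.bits);
            }
        }
        // Required clauses dominate; otherwise fall back to the OR union.
        BitSet result = (BitSet) ((required != null) ? required : optional).clone();
        result.andNot(prohibited);
        return result;
    }
}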
21
Query Example
 Normal query:
GENDER:Male AND MARRIAGE:Married AND AGE:25-30 AND BLOOD_TYPE:A
 Use parentheses “(” “)” to raise the priority of a sub-query:
GENDER:Male AND MARRIAGE:Married AND (AGE:25-30 OR AGE:30-35) AND BLOOD_TYPE:A
 Minimum-number-should-match query:
At least 2 of the 4 groups of conditions below must be satisfied:
(A1 B1 C1 D1 E1 F1 G1 H1) (A2 B2 C2 D2 E2 F2 G2 H2) (A3 B3 C3 D3 E3 F3 G3 H3) (A4 B4 C4 D4 E4 F4 G4 H4)
 Complex query with static and dynamic tags:
GENDER:Male AND MARRIAGE:Married AND AGE:25-30 AND CAROWNER AND $D:DTag1 AND $D:DTag2
22
Evaluation
Bitmap in-memory and on-disk sizes (reproduced in the sketch below):
Bitmap Cardinality | In-Memory Size (bytes) | On-Disk Size (bytes)
5,000,000          | 15,426,632             | 10,387,402
10,000,000         | 29,042,504             | 20,370,176
50,000,000         | 140,155,632            | 99,812,920
100,000,000        | 226,915,200            | 198,083,304
Test results on a small cluster:
3 Huawei 2288 servers (256 GB memory, 2 x Intel Xeon E5-2618L v3 @ 2.30 GHz, 14 x 4 TB SATA disks)
1.5 billion TagHosts, ~60 static tags per TagHost.
Queries with 10 random tags (hundreds of thousands of matching results), counting and returning only the first screen of results. Average query latency: 60 ms.
NOTE: 1. Bitmap cardinality is the number of 1 bits in the bitmap's binary form.
2. The 1-bit positions are random integers between 0 and Integer.MAX_VALUE.
3. The distribution and range of the 1 bits may affect the bitmap size.
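The deck does not name its bitmap implementation; assuming a compressed bitmap such as the open-source RoaringBitmap library, the measurement methodology in the notes can be reproduced like this:

import java.util.Random;
import org.roaringbitmap.RoaringBitmap;

// Sketch: set N random bit positions in [0, Integer.MAX_VALUE) and report
// in-memory and serialized sizes. Duplicate random values mean the final
// cardinality can be slightly below N.
public class BitmapSizeSketch {
    public static void main(String[] args) {
        long n = 5_000_000L;
        Random rnd = new Random(42);
        RoaringBitmap rb = new RoaringBitmap();
        for (long i = 0; i < n; i++) {
            rb.add(rnd.nextInt(Integer.MAX_VALUE)); // random 1-bit positions
        }
        rb.runOptimize(); // let dense chunks use run-length containers
        System.out.println("cardinality      = " + rb.getLongCardinality());
        System.out.println("in-memory bytes  = " + rb.getLongSizeInBytes());
        System.out.println("serialized bytes = " + rb.serializedSizeInBytes());
    }
}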
23
Future Work For Tagram
1. Multiple TagZone replicas.
2. Async Tagram/HBase client.
3. Better bitmap memory management.
4. Integration with graph/full-text indexes.
24
Acknowledgment
• Chaoqiang Zhong (zhongchaoqiang@huawei.com)
• Bene Guo (guoyijun@huawei.com)
• Daicheng Li (lidaicheng@huawei.com)
25
Agenda
CTBase: A lightweight HBase client for structured data
Tagram: Distributed bitmap index implementation with HBase
CloudTable Service (HBase on Huawei Cloud)
26
 Easy Maintenance
 Security
 High Performance
 SLA
 High Availability
 Low Cost
CloudTable Service Features
27
(Diagram: three tenant VPCs — VPC1, VPC2, VPC3 — each with its own HBase instance: HMaster, ZooKeeper, and RegionServers hosting HRegions with MemStores and HFiles, all sharing one HDFS storage layer.)
 Isolation by VPC
 Shared Storage
CloudTable Service On Huawei Cloud
28
Native HBase IO stack: HBase → HDFS → FileSystem → Block Device → Disks (a local file system and block device per node).
CloudTable IO stack: HBase → HDFS Interface → Distributed Pool (append only) → Disks.
• A low-latency IO stack
• Deep optimization with hardware
CloudTable – IO Optimization
29
(Diagram, native: the Region Server serves reads and writes and also runs compaction over its Region's HFiles on the HDFS DataNode. Diagram, offload: after "CMD: compactionOffload", compaction runs off the Region Server, which keeps serving reads and writes.)
Smooth performance. (Chart: TPS over time, 0 to 25,000, comparing the "normal" and "offload" runs.)
CloudTable – Offload Compaction
30
(Diagram: HBase clusters in AZ1 and AZ2 kept in sync via replication; a three-node arbitration cluster monitors both AZs through heartbeats.)
 Cross AZ Replication
 Write: Strong Consistency
 Read: Timeline Consistency
 99.99% Availability
 99.999999999% Durability
 Auto Failover
CloudTable – High Availability
31
(Diagram: HBase, Solr, and other services share one distributed, append-only disk pool behind the HDFS interface instead of dedicated per-service disks.)
 40% resource savings
Source: "Flash Storage Disaggregation"
CloudTable – Low Cost
32
Thank You!
bijieshan@huawei.com zhongyanhui@huawei.com