HBase - Just The Basics
The Basics of HBase
NoSQL Datastore built on top of the HDFS filesystem
HBase is a column family oriented database
Based on the Google BigTable paper
Uses HDFS for storage
Data can be retrieved quickly or batch processed with MapReduce
What Is Apache HBase?
Need Big Data TB/PB
High throughput
Variable columns
Need random reads and writes
HBase Use Cases
HBase Architecture
HBase Daemons
NoSQL Table Architecture
Column Families
NoSQL Data
Regions
Write Path
Read Path
HBase API
HBase has a Java API
It is the only first-class citizen
There are other programmatic interfaces
A REST interface allows HTTP access
A Thrift gateway allows non-Java programmatic access
There are non-native SQL interfaces
Apache Phoenix, Impala, Presto, Hive
Accessing HBase
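import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Assumes the enclosing class extends Configured (for example a Tool),
// which supplies setConf() and getConf()
setConf(HBaseConfiguration.create(getConf()));
// Define the table and column family for the data
TableName TABLE_NAME = TableName.valueOf("hbasetable");
byte[] CF = Bytes.toBytes("colfamily");
// Connect to the table; try-with-resources closes the table, then the connection
try (Connection connection = ConnectionFactory.createConnection(getConf());
     Table table = connection.getTable(TABLE_NAME)) {
    // Create a put for the row key and add a column to it
    Put p = new Put(Bytes.toBytes("rowkey"));
    p.addColumn(CF, Bytes.toBytes("columnqual"), Bytes.toBytes(42.0d));
    // Add the new column to the row
    table.put(p);
}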
Puts
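import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

// Define the table and column family for the data
TableName TABLE_NAME = TableName.valueOf("hbasetable");
byte[] CF = Bytes.toBytes("colfamily");
// Connect to the table; try-with-resources closes the table, then the connection
try (Connection connection = ConnectionFactory.createConnection(getConf());
     Table table = connection.getTable(TABLE_NAME)) {
    // Create a get with the row key you want
    Get g = new Get(Bytes.toBytes("rowkey"));
    // Get the row and the bytes for the cell (value is null if the cell does not exist)
    Result result = table.get(g);
    byte[] value = result.getValue(CF, Bytes.toBytes("columnqual"));
    // Yes, your client will need to know the type of data in the cell
    double doubleValue = Bytes.toDouble(value);
}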
Gets
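The point that the client must know the cell's type can be shown without a cluster. The sketch below is plain Java, not HBase code: it assumes the cell holds the 8-byte big-endian encoding of a double, which is the layout Bytes.toBytes(double) is expected to produce, and the class and method names are hypothetical. Reading the same bytes as a long gives a value that has nothing to do with 42.0.

```java
import java.nio.ByteBuffer;

// Hypothetical stand-in for a cell value; no HBase classes are used here.
class CellEncodingSketch {
    // Encode a double as 8 big-endian bytes (its IEEE 754 bit pattern).
    static byte[] encodeDouble(double d) {
        return ByteBuffer.allocate(Double.BYTES).putDouble(d).array();
    }

    // Decode the bytes as a double; correct only if the writer stored a double.
    static double decodeDouble(byte[] cell) {
        return ByteBuffer.wrap(cell).getDouble();
    }

    // Decode the same bytes as a long, as a client guessing the wrong type would.
    static long decodeLong(byte[] cell) {
        return ByteBuffer.wrap(cell).getLong();
    }

    public static void main(String[] args) {
        byte[] cell = encodeDouble(42.0d);
        System.out.println(cell.length);                          // prints 8
        System.out.println(decodeDouble(cell));                   // prints 42.0
        System.out.println(Long.toHexString(decodeLong(cell)));   // prints 4045000000000000
    }
}
```

Nothing in the stored bytes records which interpretation is right, so the type has to be agreed on by every reader and writer of the column.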
Architecting HBase Solutions
Architecting for an RDBMS is about relationships or normalizing data
Architecting for HBase is about access patterns or denormalizing data
Questions to ask:
How is data being accessed?
What is the fastest way to read/write data?
What is the optimal way to organize data?
Differences With RDBMS
Treating HBase like a relational database will lead to abject failure
Abject Failure
Actual engineering goes into row key design
You only have one index or primary key
Getting this primary key right takes time and effort
You don't just use an auto-incrementing number
Multiple pieces of data are often in the row key
Row Keys
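As a rough illustration of putting multiple pieces of data in the row key, here is a minimal sketch in plain Java. The layout (a user ID, a 0x00 separator, then a reversed timestamp) and every name in it are assumptions for illustration, not something the slides prescribe. It relies on HBase ordering rows lexicographically by the bytes of the row key, so the comparison below mimics that ordering with an unsigned byte-by-byte compare.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical composite key layout: <userId bytes><0x00><Long.MAX_VALUE - timestamp>
class RowKeySketch {
    // Build the composite key; the reversed timestamp makes newer events sort first.
    static byte[] rowKey(String userId, long timestampMillis) {
        byte[] user = userId.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(user.length + 1 + Long.BYTES)
                .put(user)
                .put((byte) 0)
                .putLong(Long.MAX_VALUE - timestampMillis)
                .array();
    }

    // Compare two keys as unsigned bytes, left to right (lexicographic order).
    static int compare(byte[] a, byte[] b) {
        return Arrays.compareUnsigned(a, b);
    }

    public static void main(String[] args) {
        byte[] newer = rowKey("alice", 2000L);
        byte[] older = rowKey("alice", 1000L);
        byte[] other = rowKey("bob", 5000L);
        System.out.println(compare(newer, older) < 0);  // prints true: newest alice event first
        System.out.println(compare(older, other) < 0);  // prints true: all alice rows before bob
    }
}
```

With a key like this, one user's events sit next to each other and the most recent one comes first, which suits an access pattern such as "latest activity for a user". A different access pattern would call for a different key.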
Table schemas require design and thought
The access pattern should be known ahead of time
General best practices:
Fewer, bigger (denormalized) tables
Spend more time designing up front
Use bulk loading for incremental or time series data
Schema Design
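To make "fewer, bigger (denormalized) tables" concrete, the sketch below models a table as nested sorted maps (row key, then column family, then column qualifier, then value) in plain Java. It is a simplified mental model rather than the HBase API: cell versions and timestamps are left out, and the customer/orders layout and all names are hypothetical. A customer's profile and orders live in one row, so a single row lookup replaces what would be a join in an RDBMS.

```java
import java.util.Collections;
import java.util.Map;
import java.util.TreeMap;

// Simplified in-memory model of one denormalized table (no versions or timestamps).
class DenormalizedRowSketch {
    // row key -> column family -> column qualifier -> value, all kept sorted
    private final TreeMap<String, TreeMap<String, TreeMap<String, String>>> rows = new TreeMap<>();

    // Store one cell, creating the row and family on first use.
    void put(String rowKey, String family, String qualifier, String value) {
        rows.computeIfAbsent(rowKey, k -> new TreeMap<>())
            .computeIfAbsent(family, k -> new TreeMap<>())
            .put(qualifier, value);
    }

    // Return all qualifier/value pairs in one family of one row, or an empty map.
    Map<String, String> getFamily(String rowKey, String family) {
        TreeMap<String, TreeMap<String, String>> families = rows.get(rowKey);
        if (families == null) {
            return Collections.emptyMap();
        }
        TreeMap<String, String> cells = families.get(family);
        if (cells == null) {
            return Collections.emptyMap();
        }
        return cells;
    }

    // Hypothetical sample data: each order is its own column in the customer's row.
    static DenormalizedRowSketch sample() {
        DenormalizedRowSketch t = new DenormalizedRowSketch();
        t.put("customer#42", "profile", "name", "Ada");
        t.put("customer#42", "orders", "2024-01-15#1001", "keyboard");
        t.put("customer#42", "orders", "2024-02-03#1007", "monitor");
        t.put("customer#77", "profile", "name", "Grace");
        return t;
    }

    public static void main(String[] args) {
        DenormalizedRowSketch t = sample();
        // prints {2024-01-15#1001=keyboard, 2024-02-03#1007=monitor}
        System.out.println(t.getFamily("customer#42", "orders"));
        // prints {}
        System.out.println(t.getFamily("customer#77", "orders"));
    }
}
```

The two customers end up with different sets of columns, and the row with no orders simply has no cells in that family.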
Jesse Anderson (Smoking Hand)
jesse@smokinghand.com
@jessetanderson
Conclusion
