HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.

HBase, Meet Ops.
Ops, Meet HBase
Kevin O'Dell
Jean-Daniel Cryans

Kevin O'Dell
Systems Engineer
Extensive experience supporting customers

Jean-Daniel Cryans
Software Engineer
Builds and runs HBase

Agenda
Leveraging previous knowledge (Kevin)
Getting to production (JD)

Goals
Help audience members understand how to
operate HBase.
Empower audience members when talking to
their own ops organization.

Leveraging previous knowledge
Distributed Filesystem
Distributed Database
Java
OS
Network
Hardware

Machines
•Industry Standard
•No RAID controller (JBOD on the slaves)
•Homogeneous environment is not necessary
•Cores, Spindles, and RAM
•Different configurations for different uses

Network
•Leverage the existing infrastructure
•No fancy equipment, no InfiniBand
•Redundancy is key, no SPOF
•TOR vs Core
•1Gb, 10Gb, and 40Gb
•Bonding, VIPs, other such complexities

Operating system
•Production ready Linux
•Swap vs. Swappiness
•Basic FS -> Ext3/4
•Cgroups
•Recommended packages (sysstat, mcelog, iperf)

Java
•User space
•Programs run in contained JVM
•JVM requires tuning
•No leaks (usually), but overcommitting is easy
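
Overcommitting is easy to check with arithmetic: the maximum heap sizes of all the JVMs on a node, plus some headroom, should fit in physical RAM. A minimal sketch follows. The daemon names, heap sizes, and the headroom figure are assumptions for illustration, not recommendations.

```python
def is_overcommitted(heaps_gb, ram_gb, headroom_gb=4):
    """Return True if the JVM heaps plus headroom exceed physical RAM.

    heaps_gb maps a daemon name to its maximum heap (-Xmx) in GB.
    headroom_gb is an assumed allowance for the OS, the page cache and
    JVM off-heap memory; pick a value that fits your machines.
    """
    return sum(heaps_gb.values()) + headroom_gb > ram_gb


# Hypothetical slave node: the numbers are for illustration only.
node_heaps = {"datanode": 1, "regionserver": 16, "tasktracker": 1, "map_reduce_tasks": 12}
print(is_overcommitted(node_heaps, ram_gb=32))
```

Running the same check whenever a heap setting changes catches the case where each daemon looks reasonable on its own but the total does not fit.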

Distributed Filesystem (HDFS)
•Shared nothing
•User-level filesystem
•Not POSIX compliant
•Immutable
•Built in Redundancy
•Linear Scalability

Distributed Database (HBase)
•Distributed hash table
•Get, put, delete, scan, and CAS
•Denormalization is necessary
•Not a parallel database, just distributed
•Write-ahead log / data durability
•Master/slave replication
•ACID compliance

Getting to Production
Things HBase doesn't come with:
•Metrics
•Automation
•Alerting

Metrics
Tony was really excited to try his
new cluster

Metrics
You have no excuses:
•Ganglia
•Cacti
•OpenTSDB
•Hannibal

Metrics
Metrics you want in your dashboards:
•Call queues
•IO wait
•Compaction queues
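
Breaking one of these metrics down per node is mostly useful for spotting the machine that behaves differently from the rest. The sketch below flags nodes whose value is far above the cluster median. The hostnames, readings, and both thresholds are hypothetical, and the data is assumed to come from whatever metrics system you already run.

```python
from statistics import median


def find_outliers(per_node, factor=3.0, floor=1.0):
    """Return nodes whose value is more than `factor` times the cluster median.

    per_node maps a hostname to the latest value of one metric, such as
    call queue length. `floor` keeps a near-zero median from flagging
    every node; both thresholds are assumptions to tune.
    """
    baseline = max(median(per_node.values()), floor)
    return sorted(host for host, value in per_node.items() if value > factor * baseline)


# Hypothetical snapshot of call queue lengths, one value per region server.
call_queues = {"rs1": 4, "rs2": 6, "rs3": 5, "rs4": 48, "rs5": 3}
print(find_outliers(call_queues))
```

The median is used instead of the mean so that a single overloaded node does not drag the baseline up and hide itself.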

Metrics - Call Queues
[Graph] View of all the machines together
[Graph] Ceiling
[Graph] Breaking it down per node
[Graph] What’s up with this one?

Metrics - IO Wait
[Graph] Same time, breaking it down per node
[Graph] Our machine is somewhere here...
[Graph] Showing the previous machine (used to be yellow, sorry)

Metrics - Compaction Queues
[Graph] View of all the machines together, different time
[Graph] Nice slope! Load is well distributed
[Graph] Oh...
[Graph] What is going on here?

Metrics
Want to learn more about metrics?
See:
“Using Metrics to Monitor and Debug Apache
HBase” (5:00pm-5:20pm) with Elliott Clark

Automation
How fast can you:
•Change an OS configuration on 100 machines?
•Kill one process on said machines?
•Reboot all your machines?
•Reboot all your machines one by one, with
some added configuration changes?
•Add 10 new fully configured nodes?
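
The "one by one" case is the one worth scripting first. Below is a minimal sketch of a rolling restart loop: it restarts one host, waits until the host reports healthy, and stops at the first failure. The `run` and `is_healthy` callables and the `restart-regionserver` command name are placeholders supplied by the caller, not part of HBase.

```python
import time


def rolling_restart(hosts, run, is_healthy, pause_s=5, retries=30):
    """Restart one host at a time and wait for it to recover before moving on.

    run(host, command) and is_healthy(host) are supplied by the caller
    (for example wrappers around ssh or a configuration management tool).
    Stops at the first host that does not come back, so a bad change
    cannot take down the whole cluster.
    """
    done = []
    for host in hosts:
        run(host, "restart-regionserver")  # placeholder command name
        for _ in range(retries):
            if is_healthy(host):
                break
            time.sleep(pause_s)
        else:
            raise RuntimeError(f"{host} did not recover; stopped after {done}")
        done.append(host)
    return done


# Dry run with stand-ins that only record what would happen.
log = []
print(rolling_restart(["node1", "node2"], lambda h, c: log.append((h, c)), lambda h: True))
```

Passing the command runner in as a function keeps the loop testable without touching a real machine, and lets the same loop drive a configuration push followed by a restart.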

"Automation" - CSSH
Are you blind yet?

Automation
Common automations:
•Rolling restart
•Adding/removing nodes
•Deploying new configurations
•Finer re-balancing

Alerting
HBase is just like any other system you are
running, so maybe you've heard of...

Alerting
What to alert on:
•Previous metrics (call/compaction queues, IO).
•Network bandwidth
•Disk space
•Number of regions
•SMARTD
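
Whatever alerting system you use, the rules for this list boil down to a metric, a direction, and a limit. A small sketch of that evaluation is shown here. The metric names and every threshold are hypothetical and need to be set from your own cluster's baseline.

```python
def evaluate_alerts(sample, limits):
    """Return a sorted list of metric names whose value crosses its limit.

    limits maps a metric name to (comparison, threshold), where comparison
    is "above" or "below". Metrics missing from the sample are reported
    too, since a silent node is itself worth an alert.
    """
    fired = []
    for name, (comparison, threshold) in limits.items():
        value = sample.get(name)
        if value is None:
            fired.append(name)
        elif comparison == "above" and value > threshold:
            fired.append(name)
        elif comparison == "below" and value < threshold:
            fired.append(name)
    return sorted(fired)


# Hypothetical limits and readings for one region server.
limits = {
    "compaction_queue": ("above", 20),
    "io_wait_pct": ("above", 30),
    "disk_free_pct": ("below", 15),
    "region_count": ("above", 200),
}
sample = {"compaction_queue": 3, "io_wait_pct": 41, "disk_free_pct": 9, "region_count": 120}
print(evaluate_alerts(sample, limits))
```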

Backup
No, you’re not the only one.
Now drop that gun.

Backup - Offline
If you can manage to take your cluster offline for
possibly an hour:
1. Shut down HBase
2. distcp to another cluster/separate folder
3. Restart HBase
* It's possible to run a distcp before shutting down; if you do, make sure you
run distcp -update -delete for the second step.
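
The offline procedure, including the optional copy before shutdown, can be written down as an ordered command list. This sketch only builds the commands and does not run them. The cluster URIs are hypothetical, and it assumes HBase is started and stopped with the `start-hbase.sh` and `stop-hbase.sh` scripts from its `bin` directory, which may not match a cluster run by a management tool.

```python
def offline_backup_plan(src, dst, presync=True):
    """Return the commands for an offline backup, in order, without running them.

    src and dst are HDFS URIs for the HBase root directory and the backup
    location. stop-hbase.sh and start-hbase.sh are the scripts in HBase's
    bin directory; adjust if a management tool controls your services.
    """
    plan = []
    if presync:
        # Bulk copy while HBase is still serving, to shorten the outage.
        # -update is used here too so both copies lay files out the same way.
        plan.append(["hadoop", "distcp", "-update", src, dst])
    plan.append(["stop-hbase.sh"])
    # -update -delete brings an earlier copy in line with the final on-disk
    # state, removing files that compactions deleted in the meantime.
    plan.append(["hadoop", "distcp", "-update", "-delete", src, dst])
    plan.append(["start-hbase.sh"])
    return plan


# Hypothetical cluster addresses.
for cmd in offline_backup_plan("hdfs://prod-nn:8020/hbase", "hdfs://backup-nn:8020/hbase-backup"):
    print(" ".join(cmd))
```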

Backup - Replication
1. Create another HBase cluster (can be remote)
2. Alter the families that need replication
3. Make sure the same tables exist on the slave
cluster
* Replication is asynchronous: it isn't done inline with the inserts in the
master cluster
* See "Apache HBase Replication" with Chris Trezzo at 5:20PM

Backup - Snapshot
•Doesn't require copying data
•Runs in less than 60 seconds
•Minimal impact on performance
* See the slides from "Apache HBase Table Snapshots" with Jonathan Hsieh
& pals
