HBase, Meet OPs.
OPs, Meet HBase
Kevin O'Dell
Jean-Daniel Cryans
Kevin O'Dell
Systems Engineer
Extensive experience supporting customers
2
Jean-Daniel Cryans
Software Engineer
Builds and runs HBase
3
Agenda
Leveraging previous knowledge (Kevin)
Getting to production (JD)
4
Goals
Help audience members understand how to
operate HBase.
Empower audience members when talking to
their own ops organization.
5
Leveraging previous knowledge
Distributed Filesystem
Distributed Database
6
Java
OS
Network
Hardware
Machines
•Industry Standard
•No RAID controller (JBOD on the slaves)
•Homogeneous environment is not necessary
•Cores, Spindles, and RAM
• Different configurations for different uses
7
Network
•Leverage the existing infrastructure
•No fancy equipment, no Infiniband
•Redundancy is key, no SPOF
•TOR vs Core
•1Gb, 10Gb, and 40Gb
•Bonding, VIPs, other such complexities
8
Network
9
Operating system
•Production ready Linux
•Swap vs. Swappiness
•Basic FS -> Ext3/4
•Cgroups
•Recommended packages (sysstat, mcelog, iperf)
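The "Swap vs. Swappiness" point can be checked per node with a few lines of Python. This is a minimal sketch: the procfs path is the standard Linux location, but the threshold of 10 is an illustrative assumption (the slides do not give a value) and the function names are hypothetical.

```python
from pathlib import Path

# Standard Linux procfs location for the vm.swappiness setting.
SWAPPINESS_PATH = Path("/proc/sys/vm/swappiness")


def read_swappiness(path=SWAPPINESS_PATH):
    """Return the current vm.swappiness value as an int."""
    return int(Path(path).read_text().strip())


def swappiness_ok(value, max_allowed=10):
    """True if swappiness is at or below the chosen threshold.

    max_allowed=10 is an assumption for illustration, not a value from the talk.
    """
    return 0 <= value <= max_allowed
```

On a node, `swappiness_ok(read_swappiness())` gives a yes/no answer that can feed a configuration audit.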
10
Java
•User space
•Programs run in contained JVM
•JVM requires tuning
•No leaks (usually), but overcommitting is easy
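One way to catch overcommitting early is to add up the configured heaps on a node and compare the total with physical RAM. The sketch below assumes heap sizes are written in the usual `-Xmx` style ("8g", "512m"); the 2 GB reserve for the OS and the daemon names in the usage note are hypothetical.

```python
def parse_size_mb(text):
    """Convert a JVM-style size such as '8g' or '512m' to megabytes."""
    units = {"k": 1 / 1024, "m": 1, "g": 1024}
    text = text.strip().lower()
    if text[-1] in units:
        return float(text[:-1]) * units[text[-1]]
    return float(text) / (1024 * 1024)  # a plain number means bytes


def overcommitted(heaps, physical_ram_mb, os_reserve_mb=2048):
    """True if the summed heaps plus an OS reserve exceed physical RAM.

    heaps maps a daemon name to its max heap setting. Heap is only part of
    a JVM's footprint, so treat a False result as necessary, not sufficient.
    os_reserve_mb is an illustrative assumption.
    """
    total = sum(parse_size_mb(h) for h in heaps.values())
    return total + os_reserve_mb > physical_ram_mb
```

For example, `overcommitted({"datanode": "1g", "regionserver": "16g"}, 16384)` is true on a 16 GB machine, because the heaps alone already exceed RAM.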
11
Distributed Filesystem(HDFS)
•Shared nothing
•User-level filesystem
•Not POSIX compliant
•Immutable
•Built in Redundancy
•Linear Scalability
12
Distributed Filesystem
13
Distributed Database(HBase)
•Distributed Hash table
•Get, put, delete, scan, and CaS
•Denormalization is necessary
•Not a parallel database, just distributed
•Write-ahead log / data durability
•Master/slave replication
•ACID compliance
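For readers coming from other systems, the five operations can be modelled with a toy in-memory table. This is not the HBase client API; `ToyTable` and its methods are hypothetical names. It only shows the semantics: rows are kept sorted by key (as HBase does), so a scan is a range read, and check-and-set (CaS) writes only when the current value matches what the caller expects.

```python
import bisect


class ToyTable:
    """In-memory stand-in for a sorted key-value table. Not the HBase API."""

    def __init__(self):
        self._keys = []  # row keys kept sorted so scans are range reads
        self._rows = {}

    def put(self, key, value):
        if key not in self._rows:
            bisect.insort(self._keys, key)
        self._rows[key] = value

    def get(self, key):
        return self._rows.get(key)

    def delete(self, key):
        if key in self._rows:
            del self._rows[key]
            self._keys.remove(key)

    def scan(self, start, stop):
        """Yield (key, value) for start <= key < stop, in key order."""
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, stop)
        for key in self._keys[lo:hi]:
            yield key, self._rows[key]

    def check_and_put(self, key, expected, value):
        """Write value only if the current value equals expected (CaS)."""
        if self._rows.get(key) != expected:
            return False
        self.put(key, value)
        return True
```

Passing `expected=None` to `check_and_put` behaves as "put if absent", which is a common use of CaS.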
14
Distributed Database
15
Getting to Production
Things HBase doesn't come with:
•Metrics
•Automation
•Alerting
16
Metrics
Tony was really excited to try his
new cluster
17
Metrics
You have no excuses:
•Ganglia
•Cacti
•OpenTSDB
•Hannibal
18
Metrics - Ganglia
19
Metrics - Cacti
20
Metrics - OpenTSDB
21
Metrics - Hannibal
22
Metrics
Metrics you want in your dashboards:
•Call queues
•IO wait
•Compaction queues
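Queue metrics can be pulled out of the JSON that Hadoop-style daemons serve on their `/jmx` HTTP endpoint, which wraps everything in a top-level `"beans"` list. The sketch below assumes that layout; the metric names `callQueueLen` and `compactionQueueSize` are assumptions that vary by HBase version, and IO wait comes from the OS (for example via sysstat), not from JMX.

```python
import json

# Assumed metric names; check what your HBase version actually exposes.
WATCHED = ("callQueueLen", "compactionQueueSize")


def extract_metrics(jmx_json_text, wanted=WATCHED):
    """Return {metric_name: value} for the watched metrics found in any bean."""
    beans = json.loads(jmx_json_text).get("beans", [])
    found = {}
    for bean in beans:
        for name in wanted:
            if name in bean:
                found[name] = bean[name]
    return found


def over_ceiling(per_node, ceiling):
    """Return the sorted node names whose metric has reached the ceiling."""
    return sorted(node for node, value in per_node.items() if value >= ceiling)
```

Collecting `extract_metrics` output per region server and passing one metric to `over_ceiling` is a simple way to go from the cluster-wide view to the per-node breakdown.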
23
Metrics - Call Queues
24
View of all the machines together
Metrics - Call Queues
25
Ceiling
Metrics - Call Queues
26
Breaking it down per node
Metrics - Call Queues
27
What’s up with this one?
Metrics - IO Wait
28
Same time, breaking it down per node
Metrics - IO Wait
29
Our machine is somewhere here...
Metrics - IO Wait
30
Showing the previous machine (used to be yellow, sorry)
Metrics - Compaction Queues
31
View of all the machines together, different time
Metrics - Compaction Queues
32
Nice slope! Load is well distributed
Metrics - Compaction Queues
33
Oh...
Metrics - Compaction Queues
34
What is going on here?
Metrics
Want to learn more about metrics?
See:
“Using Metrics to Monitor and Debug Apache
HBase” (5:00pm-5:20pm) with Elliott Clark
35
23:59:60
36
Automation
How fast can you:
•Change an OS configuration on 100 machines?
•Kill one process on said machines?
•Reboot all your machines?
•Reboot all your machines one by one, with
some added configuration changes?
•Add 10 new fully configured nodes?
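The first two questions come down to "run one command on every machine and collect the results". A minimal sketch using only the standard library and the `ssh` client follows; it assumes key-based SSH access is already set up, and `run_everywhere` and its `runner` hook are hypothetical names. A configuration management tool would normally replace this.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def ssh_command(host, remote_cmd):
    """Build the argv for running remote_cmd on host over ssh."""
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    return ["ssh", "-o", "BatchMode=yes", host, remote_cmd]


def run_everywhere(hosts, remote_cmd, runner=subprocess.call, workers=20):
    """Run remote_cmd on every host in parallel; return {host: exit_code}.

    runner takes an argv list and returns an exit code; it is injectable so
    the function can be dry-run or tested without touching the network.
    """
    hosts = list(hosts)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        codes = pool.map(lambda h: runner(ssh_command(h, remote_cmd)), hosts)
        return dict(zip(hosts, codes))
```

Any host with a non-zero exit code in the result needs a second look before moving on.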
37
"Automation" - CSSH
Are you blind yet?
38
Automation - Puppet
39
Automation - Chef
40
Automation - Fabric
$ fab
41
Automation
Common automations:
•Rolling restart
•Adding/removing nodes
•Deploying new configurations
•Finer re-balancing
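A rolling restart is the one automation where order and pacing matter: restart one node, wait until it is healthy, then move on, and stop at the first node that does not come back. The sketch below shows only that control flow. `restart` and `is_healthy` are hypothetical caller-supplied hooks, and the timeout values are assumptions.

```python
import time


def rolling_restart(hosts, restart, is_healthy,
                    timeout_s=300, poll_s=5, sleep=time.sleep):
    """Restart hosts one by one; abort at the first that fails to recover.

    restart(host) triggers the restart; is_healthy(host) returns True once
    the node is serving again. Returns the hosts restarted successfully.
    """
    done = []
    for host in hosts:
        restart(host)
        waited = 0
        while not is_healthy(host):
            if waited >= timeout_s:
                raise RuntimeError(
                    f"{host} did not come back within {timeout_s}s; "
                    f"restarted so far: {done}"
                )
            sleep(poll_s)
            waited += poll_s
        done.append(host)
    return done
```

Aborting rather than continuing keeps a bad configuration from being rolled across the whole cluster.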
42
Alerting
HBase is just like any other system you are
running, so maybe you've heard of...
43
Alerting - Nagios
44
Alerting - Zabbix
45
Alerting
What to alert on:
•Previous metrics (call/compaction queues, IO).
•Network bandwidth
•Disk space
•Number of regions
•SMARTD
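Most of these alerts are threshold checks, which map directly onto the Nagios plugin convention of exit codes 0 (OK), 1 (WARNING), 2 (CRITICAL) and 3 (UNKNOWN). A minimal sketch, with hypothetical function names and thresholds left to the caller:

```python
import shutil

OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3  # Nagios plugin exit codes


def check_threshold(label, value, warn, crit):
    """Return (exit_code, message) for a metric where higher is worse."""
    if value is None:
        return UNKNOWN, f"UNKNOWN - {label} unavailable"
    if value >= crit:
        return CRITICAL, f"CRITICAL - {label}={value} (>= {crit})"
    if value >= warn:
        return WARNING, f"WARNING - {label}={value} (>= {warn})"
    return OK, f"OK - {label}={value}"


def disk_used_percent(path="/"):
    """Percentage of the filesystem holding path that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total
```

A plugin script would print the message and call `sys.exit` with the code, for example using `check_threshold("disk_used_pct", disk_used_percent("/"), 80, 90)`; the 80 and 90 here are placeholders, not recommendations from the talk.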
46
Backup
47
Backup
48
No, you’re not the only one.
Now drop that gun.
If you can manage to take your cluster offline for
possibly an hour:
1. Shut down HBase
2. distcp to another cluster/separate folder
3. Restart HBase
* It's possible to run a distcp before shutting down; if you do, make sure you
run distcp -update -delete for the second step.
Backup - Offline
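The three offline steps can be written down as an ordered command plan. This sketch only builds the commands; it assumes the stock `stop-hbase.sh` and `start-hbase.sh` scripts from the HBase `bin` directory are on the PATH (managed distributions may use their own service controls), and the function name is hypothetical.

```python
def offline_backup_plan(src, dst, pre_copied=False):
    """Return the commands for the offline backup, in execution order.

    src and dst are the HDFS locations to copy from and to. Set
    pre_copied=True if a first distcp already ran while HBase was up.
    """
    distcp = ["hadoop", "distcp"]
    if pre_copied:
        # The earlier copy may be stale: sync changes and prune deleted files.
        distcp += ["-update", "-delete"]
    return [
        ["stop-hbase.sh"],
        distcp + [src, dst],
        ["start-hbase.sh"],
    ]
```

Each entry can be passed to `subprocess.check_call` in order, so a failed copy stops the sequence before HBase is restarted and the failure is noticed.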
49
1. Create another HBase cluster (can be remote)
2. Alter the families that need replication
3. Make sure the same tables exist on the slave
cluster
* Replication is asynchronous: it isn't done inline with the inserts in the master cluster
* See "Apache HBase Replication" with Chris Trezzo at 5:20PM
Backup - Replication
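Step 2 is done from the HBase shell by setting `REPLICATION_SCOPE` on each column family. The sketch below generates those shell statements for a table. It is hedged on several points: the `add_peer` line (which registers the slave cluster and is not listed on the slide) uses the older positional syntax, the disable/enable around `alter` reflects older releases, and the table, family and ZooKeeper names in the usage note are hypothetical. Check the shell help for your version before running any of it.

```python
def replication_statements(table, families, peer_id="1", cluster_key=None):
    """Return HBase shell statements that enable replication for a table.

    cluster_key, if given, is the slave cluster's ZooKeeper address in the
    form 'quorum:port:znode' and produces an add_peer statement first.
    """
    lines = []
    if cluster_key:
        lines.append(f"add_peer '{peer_id}', '{cluster_key}'")
    lines.append(f"disable '{table}'")
    for family in families:
        lines.append(
            f"alter '{table}', {{NAME => '{family}', REPLICATION_SCOPE => 1}}"
        )
    lines.append(f"enable '{table}'")
    return lines
```

For example, `replication_statements("usertable", ["cf"], cluster_key="zk1:2181:/hbase")` yields four lines that can be reviewed and then pasted into `hbase shell`.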
50
•Doesn't require copying data
•Runs in less than 60 seconds
•Minimal impact on performance
* See the slides from "Apache HBase Table Snapshots" with Jonathan Hsieh
& pals
Backup - Snapshot
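Taking a snapshot is a single HBase shell statement, and the `ExportSnapshot` tool can copy it to another cluster afterwards. The sketch below builds both, using a timestamped snapshot name. The naming scheme and function names are assumptions for illustration, and exporting does copy data, unlike taking the snapshot itself.

```python
from datetime import datetime, timezone


def snapshot_name(table, now=None):
    """Build a timestamped snapshot name such as 'usertable-20130613-120000'."""
    now = now or datetime.now(timezone.utc)
    # Namespaced tables contain ':'; replace it to keep the name simple.
    return f"{table.replace(':', '_')}-{now:%Y%m%d-%H%M%S}"


def snapshot_commands(table, export_to=None, now=None):
    """Return (shell_statement, export_argv_or_None) for one snapshot.

    export_to is the destination HDFS URL for ExportSnapshot, if wanted.
    """
    name = snapshot_name(table, now)
    shell_stmt = f"snapshot '{table}', '{name}'"
    export = None
    if export_to:
        export = [
            "hbase", "org.apache.hadoop.hbase.snapshot.ExportSnapshot",
            "-snapshot", name, "-copy-to", export_to,
        ]
    return shell_stmt, export
```

The first element goes into `hbase shell`; the second, when present, is a command line to run once the snapshot exists.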
51
Thank You!
52

HBaseCon 2013: Apache HBase, Meet Ops. Ops, Meet Apache HBase.