Apache HBase Internals You Hoped You Never Needed to Understand
Josh Elser
Future of Data, NYC
2016/10/11
2 © Hortonworks Inc. 2011 – 2016. All Rights Reserved
Engineer at Hortonworks, Member of the Apache Software Foundation
Top-Level Projects
• Apache Accumulo®
• Apache Calcite™
• Apache Commons™
• Apache HBase®
• Apache Phoenix™
ASF Incubator
• Apache Fluo™
• Apache Gossip™
• Apache Pirk™
• Apache Rya™
• Apache Slider™
These Apache project names are trademarks or registered
trademarks of the Apache Software Foundation.
Apache HBase for storing your data!
CC BY 3.0 US: http://hbase.apache.org/
What happens when things go wrong?
CC BY-ND 2.0: https://www.flickr.com/photos/widnr/6588151679
The BigTable Architecture
• BigTable’s architecture is simple
• Debugging a distributed system is not simple
• How can we break down a complex system?
• How do we write resilient software?
• Log-Structured Merge Tree
• Write-Ahead Logs
• Distributed Coordination
• Row-based, Auto-Sharding
• Strong Consistency
• Read Isolation
• Coprocessors
• Security (AuthN/AuthZ)
• Backups
Naming Conventions
• Servers
– Hostname, Port, and Start Timestamp
– RegionServer: r01n01.domain.com,16201,1475691463147
– Master: r02n01.domain.com,16000,1475691462616
• Regions
– Table, Start RowKey, Region ID (timestamp), Replica ID (present only for non-default replicas), Encoded name
– T1,\x04\x00\x00,1470324608597.c04d94cd4ee9797da2fb906b4dcd2e3c.
– Or simply c04d94cd4ee9797da2fb906b4dcd2e3c
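Since these names are what you grep for in logs, it can help to pull them apart programmatically. The following is a minimal sketch, not HBase's own parser: the class and function names are made up for illustration, and the region parser assumes the layout `<table>,<start key>,<region id>.<encoded name>.` with no replica ID suffix.

```python
from typing import NamedTuple


class ServerName(NamedTuple):
    host: str
    port: int
    start_ms: int  # third field of the server name, an epoch-millisecond timestamp


class RegionName(NamedTuple):
    table: str
    start_key: str
    region_id: int
    encoded: str


def parse_server_name(name: str) -> ServerName:
    # The three fields are comma separated: hostname,port,timestamp.
    host, port, start_ms = name.split(",")
    return ServerName(host, int(port), int(start_ms))


def parse_region_name(name: str) -> RegionName:
    # Assumed layout: <table>,<start key>,<region id>.<encoded name>.
    # The start key may itself contain commas, so split from both ends.
    # A replica ID suffix on the region id is not handled by this sketch.
    table, rest = name.split(",", 1)
    start_key, tail = rest.rsplit(",", 1)
    region_id, encoded = tail.rstrip(".").split(".", 1)
    return RegionName(table, start_key, int(region_id), encoded)


if __name__ == "__main__":
    print(parse_server_name("r01n01.domain.com,16201,1475691463147"))
    # The start key is kept in its printable \xNN escape form.
    print(parse_region_name(
        r"T1,\x04\x00\x00,1470324608597.c04d94cd4ee9797da2fb906b4dcd2e3c."))
```

The encoded name is the field to search for when a log line only carries the short form of the region name.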
Regions
• A sorted “shard” of a table
• At least one “column family”
– Physical partitions
• Each family can have zero to many files
• Hosted by at most one RegionServer
– Can have many hosting RegionServers for reads (read replicas)
• In-memory locks for certain intra-row operations
Region Assignment
• Coordinated by the HBase Master
• A Region must be hosted by only one RegionServer
• State tracked in hbase:meta
– hbck to fix issues
• Region splits/merges make a hard problem even harder
• Moving towards ProcedureV2
Figure (Normal Region Assignment States): Closed, Offline, Pending Open, Opening, Open
The File System
• HDFS “Compatible”
– Distributed, durable, “write leases”
• Physical storage of HBase Tables (HFiles)
• Write-ahead logs
• A parent directory in that FileSystem (hbase.rootdir)
The File System
Physical Separation by HBase Namespace
/hbase/data/
/hbase/data/default/<table1>
/hbase/data/default/<table1>/.tabledesc/.tableinfo…
/hbase/data/default/<table2>/<region_id1>
/hbase/data/default/<table2>/<region_id2>
/hbase/data/my_custom_ns/<table3>/…
/hbase/data/hbase/meta/…
/hbase/archive/…
/hbase/WALs/<regionserver_name>/…
/hbase/oldWALs/…
/hbase/corrupt/…
The File System for one Region
/hbase/data/default/<table2>/<region_id1>
…/.regioninfo
…/.tmp
…/<family1>/<hfile>
…/<family1>/<hfile>
…/<family2>/<hfile>
…/<family3>/<hfile>
…/recovered.edits/<number>.seqid
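When chasing a problem on disk, it is handy to go from a table and encoded region name to the directory that holds its files. The helpers below are a small sketch of the layout shown on these two slides; the function names are hypothetical and the root directory is whatever `hbase.rootdir` points to (`/hbase` here, as on the slides).

```python
from pathlib import PurePosixPath


def region_dir(root: str, namespace: str, table: str,
               encoded_region: str) -> PurePosixPath:
    """<root>/data/<namespace>/<table>/<encoded region name>"""
    return PurePosixPath(root) / "data" / namespace / table / encoded_region


def family_dir(root: str, namespace: str, table: str,
               encoded_region: str, family: str) -> PurePosixPath:
    """Directory holding the HFiles of one column family of one Region."""
    return region_dir(root, namespace, table, encoded_region) / family


def wal_dir(root: str, regionserver_name: str) -> PurePosixPath:
    """Directory holding the live write-ahead logs of one RegionServer."""
    return PurePosixPath(root) / "WALs" / regionserver_name


if __name__ == "__main__":
    print(family_dir("/hbase", "default", "T1",
                     "c04d94cd4ee9797da2fb906b4dcd2e3c", "family1"))
    print(wal_dir("/hbase", "r01n01.domain.com,16201,1475691463147"))
```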
Writes into HBase
• Mutations inserted into a sorted in-memory structure and the WAL
– Fast lookups of recent data
– Append-only log for durability and speed
• Mutations are collected by destination Region
• Beware of hot-spotting
• Data in memory is eventually flushed into sorted (H)files
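The write path above can be modelled in a few lines. This is a toy sketch under loud assumptions: one column family, one value per row, plain Python lists standing in for the WAL and for HFiles, and a made-up `flush_threshold` counted in rows rather than bytes. It is not how HBase implements any of this, only the shape of the idea.

```python
import bisect


class MiniRegion:
    """Toy model of one Region's write path (single column family)."""

    def __init__(self, flush_threshold: int = 3):
        self.wal = []      # append-only log of every mutation, in arrival order
        self.rows = []     # row keys currently in memory, kept sorted
        self.values = {}   # row key -> latest value held in memory
        self.hfiles = []   # each flush appends one immutable, sorted file
        self.flush_threshold = flush_threshold

    def put(self, row: str, value: str) -> None:
        self.wal.append((row, value))       # log first, for durability
        if row not in self.values:
            bisect.insort(self.rows, row)   # keep the in-memory keys sorted
        self.values[row] = value
        if len(self.rows) >= self.flush_threshold:
            self.flush()

    def flush(self) -> None:
        if not self.rows:
            return
        # The in-memory structure is already sorted, so the file is too.
        self.hfiles.append([(row, self.values[row]) for row in self.rows])
        self.rows, self.values = [], {}


if __name__ == "__main__":
    region = MiniRegion(flush_threshold=2)
    region.put("b", "1")
    region.put("a", "2")   # second distinct row triggers a flush
    print(region.hfiles)
```

Because every mutation reaches the log before it reaches memory, the in-memory state can be rebuilt by replaying the log after a crash, which is the point of pairing the two.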
Compactions and Flushes
• Flush: Taking Key-Values from the in-memory map and creating an HFile
• Minor Compaction: Rewriting a subset of HFiles for a Region into one HFile
• Major Compaction: Rewriting all HFiles for a Region into one HFile (per column family)
• Compactions balance improved query performance against the cost of rewriting data
– Compactions are good!
– Must understand SLAs to properly tune compactions
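A compaction is essentially a k-way merge of sorted files in which the newest value for a key wins. The sketch below shows that idea with `heapq.merge`; the file representation (lists of `(row, value)` pairs, ordered oldest file first) and the function names are assumptions for illustration, and real compactions also deal with versions, deletes, and file selection policies that are ignored here.

```python
import heapq


def compact(hfiles):
    """Merge sorted files (oldest first) into one sorted file; newest value wins."""
    # Tag every entry with the negated age of its file so that, for equal
    # row keys, the entry from the newest file sorts first.
    streams = [
        [(row, -age, value) for row, value in hfile]
        for age, hfile in enumerate(hfiles)
    ]
    merged = []
    for row, _, value in heapq.merge(*streams):
        if not merged or merged[-1][0] != row:
            merged.append((row, value))   # first hit per row is the newest
    return merged


def minor_compaction(hfiles, start, end):
    """Rewrite the contiguous slice hfiles[start:end] into a single file."""
    return hfiles[:start] + [compact(hfiles[start:end])] + hfiles[end:]


def major_compaction(hfiles):
    """Rewrite every file into a single file."""
    return [compact(hfiles)]


if __name__ == "__main__":
    older = [("a", "1"), ("c", "1")]
    newer = [("a", "2"), ("b", "2")]
    print(major_compaction([older, newer]))
```

Fewer files means fewer streams to merge on every read, which is where the query-performance benefit comes from; the price is rewriting data that was already on disk.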
Reads into HBase
• Merge-Sort over multiple streams of data
– Memory
– Disk (many files)
• hbase:meta is the definitive source of where to find Regions
Figure: RowKey-to-Region lookup involving hbase:meta, the RegionServer, and ZooKeeper
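Because Regions are sorted, non-overlapping ranges of row keys, finding the Region for a row key is a search for the greatest start key that is less than or equal to it. The class below is a toy stand-in for that lookup, not the hbase:meta schema or the client API; the class name, the `(start_key, server)` tuples, and the server labels are all made up, and the first Region is assumed to start at the empty key.

```python
import bisect


class MetaTable:
    """Toy stand-in for hbase:meta: sorted Region start keys -> hosting server."""

    def __init__(self, regions):
        # regions: iterable of (start_key, server); one Region must start at b"".
        self.entries = sorted(regions)
        self.start_keys = [start for start, _ in self.entries]

    def locate(self, row_key: bytes):
        # The owning Region has the greatest start key <= row_key.
        index = bisect.bisect_right(self.start_keys, row_key) - 1
        return self.entries[index]


if __name__ == "__main__":
    meta = MetaTable([(b"", "rs1"), (b"g", "rs2"), (b"p", "rs3")])
    for key in (b"apple", b"g", b"zebra"):
        print(key, "->", meta.locate(key))
```

A client can cache the answer and only go back to the lookup table when a request lands on a server that no longer hosts the Region.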
Apache ZooKeeper™
• Distributed coordination is really hard
• Obvious use cases
– Service Discovery
– Cluster Membership
– “Root Table”
• Non-obvious use cases
– Assignment (sometimes)
– Region Recovery
– WAL Splitting
– Cluster Replication
– Distributed Procedures
– HBase Snapshots
Apache ZooKeeper is a trademark of the Apache Software Foundation
Apache ZooKeeper™
• Discovery/Leader ZNodes
– /hbase/rs/…
– /hbase/master/…
– /hbase/backup-masters/…
• Consensus
– /hbase/splitWAL/…
– /hbase/flush-table-proc/...
– /hbase/table-lock/...
– /hbase/region-in-transition/...
– /hbase/recovering-regions/...
Distributed Procedures
• Resiliency in an unreliable system
– How do we create a table?
• “Procedure V2”
– Resilient, finite state machine
• HBase operations represented as “procedures”
• Clients are agnostic of Master state
– Clients track procedure state
https://issues.apache.org/jira/secure/attachment/12679960/ProcedureV2.pdf
Distributed Procedures
• Procedures are durable via Write-Ahead Log
– /hbase/MasterProcWALs/…
• Procedures only executed by the active HBase Master
• Reusable framework for the future
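The core idea, a state machine whose progress is written to a durable log so that a new Master can resume it, can be sketched briefly. Everything named here is hypothetical: the step names only paraphrase a table creation, `ProcedureStore` is an in-memory list standing in for a log on durable storage, and `fail_before` exists purely to simulate a crash. It is not the Procedure V2 API.

```python
CREATE_TABLE_STEPS = [
    "check_unique_name",
    "create_directories",
    "create_initial_region",
    "update_meta",
    "enable_table",
]


class ProcedureStore:
    """Toy durable log recording the last completed step of each procedure."""

    def __init__(self):
        self.log = []   # stands in for a write-ahead log on durable storage

    def record(self, proc_id: int, step_index: int) -> None:
        self.log.append((proc_id, step_index))

    def last_completed(self, proc_id: int) -> int:
        done = [index for pid, index in self.log if pid == proc_id]
        return max(done) if done else -1


def run_procedure(proc_id, steps, store, execute, fail_before=None):
    """Run steps in order, resuming after the last step recorded in the store."""
    start = store.last_completed(proc_id) + 1
    for index in range(start, len(steps)):
        if index == fail_before:
            raise RuntimeError(f"simulated crash before {steps[index]}")
        execute(steps[index])
        store.record(proc_id, index)   # persist progress before moving on


if __name__ == "__main__":
    store, executed = ProcedureStore(), []
    try:
        run_procedure(1, CREATE_TABLE_STEPS, store, executed.append, fail_before=2)
    except RuntimeError as crash:
        print(crash)
    # A "new Master" picks up the same procedure id and finishes the work.
    run_procedure(1, CREATE_TABLE_STEPS, store, executed.append)
    print(executed)
```

In this sketch a crash between running a step and recording it would cause the step to run again on resume, so each step needs to be safe to repeat.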
HBase RPCs
Overview
• Internal and external HBase communication
• Half-Sync/Half-Async model
• Many knobs to tweak
Components
• Listener
• Readers
• Scheduler
• Call Queues
• Call Runners/Handlers
HBase RPCs
Figure (Request to Execution): Listener → Readers → Scheduler → Call Queues and Handlers (Priority, Read, Write, Replication)
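The back half of that pipeline, a scheduler placing calls on typed queues that dedicated handler threads drain, can be sketched with the standard library. This is an illustration under assumptions, not HBase's RPC server: the Listener and Reader stages are left out, the `Scheduler` class and its methods are invented for the example, and a "call" is just a string.

```python
import queue
import threading

QUEUE_NAMES = ("priority", "read", "write", "replication")


class Scheduler:
    """Routes each call to a named queue; handler threads drain the queues."""

    def __init__(self, handlers_per_queue: int = 2):
        self.handlers_per_queue = handlers_per_queue
        self.queues = {name: queue.Queue() for name in QUEUE_NAMES}
        self.results = queue.Queue()
        self.threads = []
        for name in QUEUE_NAMES:
            for _ in range(handlers_per_queue):
                thread = threading.Thread(target=self._handle, args=(name,))
                thread.start()
                self.threads.append(thread)

    def dispatch(self, kind: str, payload: str) -> None:
        # In the full pipeline a Reader would call this after decoding a request.
        self.queues[kind].put(payload)

    def _handle(self, name: str) -> None:
        calls = self.queues[name]
        while True:
            payload = calls.get()
            if payload is None:   # sentinel: stop this handler
                break
            self.results.put((name, payload))   # "execute" the call

    def shutdown(self) -> None:
        # One sentinel per handler, queued behind any pending calls.
        for name in QUEUE_NAMES:
            for _ in range(self.handlers_per_queue):
                self.queues[name].put(None)
        for thread in self.threads:
            thread.join()


if __name__ == "__main__":
    scheduler = Scheduler()
    scheduler.dispatch("read", "get row1")
    scheduler.dispatch("write", "put row2")
    scheduler.dispatch("priority", "meta scan")
    scheduler.shutdown()
    while not scheduler.results.empty():
        print(scheduler.results.get())
```

Giving each kind of call its own queue and handlers means a flood of one kind cannot occupy every handler; how many handlers and queues to use is one of the knobs mentioned on the previous slide.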
Disaster Recovery
• Multiple tools to ensure copies of data in the face of catastrophic failure
• CopyTable
– MapReduce job which reads all data from a source, writing to a destination
• Snapshots
– A collection of Regions, their HFiles, and metadata
• Backup & Restore
– HBASE-7912, currently targeted for HBase 2.0.0
– Incremental and full backup/restore
Kerberos
• Strong authentication for untrusted networks
• “Standard” across Apache Hadoop and friends
• Requirements:
– Forward/Reverse DNS
– Unlimited Strength Java Cryptography Extension
• SASL used to build RPC systems
• “Practical Kerberos with Apache HBase” https://goo.gl/y0d9ZO
Finding a Hypothesis
• Logs logs logs
• Application and System
• Metrics exposed by JMX
• Graphing solutions
– Ambari Metrics Server + Grafana
Thank You
jelser@hortonworks.com / elserj@apache.org


Editor's Notes

  • #6: Architecture-wise, BigTable as a system is well understood and simple; it has been a decade since the paper. Distributed systems are complex! They are easier to reason about if we consider them as smaller units.
  • #7: Important to be able to grep! Know what to look for. DNS is important for making sure naming is consistent across all nodes.
  • #10: HBase needs a distributed and resilient filesystem (see also Azure tech). Data that is written and sync’ed must be present! HBase relies on one writer per file (HDFS leases). HBase Tables: not just Key-Values (HFiles) but also serialized table metadata. WAL durability is key here.
  • #11: /hbase/data = all table data; /hbase/archive = HFiles before deletion; /hbase/WALs = write-ahead logs; /hbase/oldWALs = WALs before deletion; /hbase/corrupt = corrupt WALs
  • #12: .regioninfo = metadata about this region; .tmp = general temporary space (compactions); recovered.edits = artifact of WAL recovery
  • #14: Compactions == fewer files, more efficient lookups
  • #15: “What happens when meta is unassigned?”
  • #16: ZooKeeper provides authentication and authorization as well (for HBase, no auth or Kerberos auth via SASL). ACLs are used to prevent users from changing sensitive data in ZK – only HBase nodes can change them.
  • #18: Resilience is hard. How do we make sure that an operation will succeed if servers fail? How do we distinguish between previous failed attempts and users trying to concurrently perform the same operation? Table creation: unique name, directories in HDFS, create initial region in HDFS, update meta, enable the table, etc.
  • #19: The ProcV2 implementation is tricky/complicated, but it provides an internal API that makes operations easy to implement and reason about in the future. Easy to inspect state. The model is proven in Accumulo’s FATE.
  • #20: Lots of knobs because we want to be able to optimize things like throughput, latency, and fairness, which are often mutually exclusive.
  • #21: The Listener does the socket accept and dispatches to Readers. Readers read a number of bytes off the wire (the Selector channel). A Reader sends the deserialized request to the Scheduler, which places it on a call queue, from which a handler will eventually process it.
  • #22: Aka “you dun goofed up”. CopyTable: slow, requires source and destination to be up. Not really desirable. Snapshots: great for one-offs. Can grow DFS usage, though. Requires coordination of a flush for a full backup. B&R: Snapshots with the ability to track WALs for incremental backups since the last full backup.
  • #23: Brutally-sparse Kerberos talk
  • #24: JMX: JVisualVM, HBase web UIs, Hadoop Metrics2 sink (AMS)