Hadoop World 2011: Building Realtime Big Data Services at Facebook with Hadoop and HBase - Jonathan Gray, Facebook

Realtime Big Data at Facebook
with Hadoop and HBase

Jonathan Gray
November ,
Hadoop World NYC

Agenda

Why Hadoop and HBase?

Applications of HBase at Facebook

Future of HBase at Facebook

About Me Jonathan Gray
▪ Previous life as Co-Founder of Streamy.com
▪ Realtime Social News Aggregator

▪ Big Data problems led us to Hadoop/HBase

▪ HBase committer and Hadoop user/complainer

▪ Software Engineer at Facebook
▪ Develop, support, and evangelize HBase across teams

▪ Recently joined Database Infrastructure Engineering
MySQL and HBase together at last!

Why Hadoop and HBase?
For Realtime Data?

[Diagram: the existing stack, by layer: Cache, Data analysis, OS, Web server, Database, Language]

Problems with existing stack
▪ MySQL is stable, but...
▪ Limited throughput

▪ Not inherently distributed

▪ Table size limits

▪ Inflexible schema

▪ Memcached is fast, but...
▪ Only key-value so data is opaque

▪ No write-through

Problems with existing stack
▪ Hadoop is scalable, but...
▪ MapReduce is slow

▪ Writing MapReduce is difficult

▪ Does not support random writes

▪ Poor support for random reads

Specialized solutions
▪ Inbox Search
▪ Cassandra

▪ High-throughput, persistent key-value
▪ Tokyo Cabinet

▪ Large scale data warehousing
▪ Hive

▪ Custom C++ servers for lots of other stuff

Finding a new online data store
▪ Consistent patterns emerge
▪ Massive datasets, often largely inactive

▪ Lots of writes

▪ Fewer reads

▪ Dictionaries and lists

▪ Entity-centric schemas
▪ per-user, per-domain, per-app

▪ Other requirements laid out
▪ Elasticity

▪ High availability

▪ Strong consistency within a datacenter

▪ Fault isolation

▪ Some non-requirements
▪ Network partitions within a single datacenter

▪ Active-active serving from multiple datacenters

▪ In , engineers at FB compared DBs
▪ Apache Cassandra, Apache HBase, Sharded MySQL

▪ Compared performance, scalability, and features
▪ HBase gave excellent write performance, good reads

▪ HBase already included many nice-to-have features
▪ Atomic read-modify-write operations
▪ Multiple shards per server
▪ Bulk importing
▪ Range scans

HBase uses HDFS
We get the benefits of HDFS as a storage system for free
▪ Fault tolerance

▪ Scalability

▪ Checksums fix corruptions

▪ MapReduce

▪ Fault isolation of disks

▪ HDFS battle tested at petabyte scale at Facebook

▪ Lots of existing operational experience

HBase in a nutshell
▪ Sorted and column-oriented

▪ High write throughput

▪ Horizontal scalability

▪ Automatic failover

▪ Regions sharded dynamically

Applications of HBase at Facebook

Use Case
Titan
(Facebook Messages)

The New Facebook Messages

Messages IM/Chat email SMS

Facebook Messaging
▪ Largest engineering effort in the history of FB
▪ engineers over more than a year
▪ Incorporates over infrastructure technologies
▪ Hadoop, HBase, Haystack, ZooKeeper, etc...

▪ A product at massive scale on day one

▪ Hundreds of millions of active users

▪ + billion messages a month
▪ k instant messages a second on average

Messaging Challenges
▪ High write throughput
▪ Every message, instant message, SMS, and e-mail

▪ Search indexes and metadata for all of the above

▪ Denormalized schema

▪ Massive clusters
▪ So much data and usage requires a large server footprint

▪ Do not want outages to impact availability

▪ Must be able to easily scale out

High Write Throughput
[Diagram: each write (key, value) goes to the Commit Log as a sequential write and to the Memstore, where keys and values are kept sorted in memory.]
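The write path on this slide can be sketched in a few lines of Python. This is a toy model under stated assumptions, not the HBase API: the class name `ToyRegionStore` and its methods are hypothetical, the commit log is a plain in-memory list standing in for an append-only file, and flushing the memstore to disk is left out.

```python
import bisect


class ToyRegionStore:
    """Toy model of a commit log plus sorted memstore (hypothetical, not HBase code)."""

    def __init__(self):
        self.commit_log = []  # append-only list standing in for the on-disk log
        self.keys = []        # memstore keys, kept in sorted order
        self.values = {}      # latest value for each key

    def put(self, key, value):
        # Step 1: sequential write. The edit is appended to the end of the log.
        self.commit_log.append((key, value))
        # Step 2: the edit is placed into the sorted in-memory structure.
        if key not in self.values:
            bisect.insort(self.keys, key)
        self.values[key] = value

    def scan(self, start, stop):
        """Return (key, value) pairs with start <= key < stop, in key order."""
        lo = bisect.bisect_left(self.keys, start)
        hi = bisect.bisect_left(self.keys, stop)
        return [(k, self.values[k]) for k in self.keys[lo:hi]]


store = ToyRegionStore()
for k, v in [("user3", "c"), ("user1", "a"), ("user2", "b")]:
    store.put(k, v)

print(store.scan("user1", "user3"))
```

Writes arrive in arbitrary key order but only ever append to the log, while the sorted memstore is what makes the range scan cheap.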

Horizontal Scalability
[Diagram: regions]

Automatic Failover
[Diagram: when a server dies, the HBase client finds the new server from META.]
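The failover behaviour named on this slide (a client finding the new server from META after a server dies) can be illustrated with a small simulation. Everything here is an assumption made for illustration: `ToyClient`, `ServerDied`, and the dictionaries standing in for META and the region servers are hypothetical and do not reflect the real HBase client.

```python
class ServerDied(Exception):
    """Raised when a request is sent to a server that is no longer alive."""


class ToyClient:
    def __init__(self, meta, servers):
        self.meta = meta        # region -> server name (stand-in for META)
        self.servers = servers  # server name -> {region: {key: value}}, or None if dead
        self.cache = {}         # client-side cache of region locations

    def _read_from(self, server, region, key):
        data = self.servers.get(server)
        if data is None:
            raise ServerDied(server)
        return data[region][key]

    def get(self, region, key):
        server = self.cache.get(region)
        if server is None:
            server = self.cache[region] = self.meta[region]
        try:
            return self._read_from(server, region, key)
        except ServerDied:
            # The cached location is stale: look the region up again and retry once.
            server = self.cache[region] = self.meta[region]
            return self._read_from(server, region, key)


servers = {"rs1": {"r1": {"k": "v"}}, "rs2": None}
meta = {"r1": "rs1"}
client = ToyClient(meta, servers)
first = client.get("r1", "k")

# Simulate rs1 dying and the region being reassigned to rs2.
servers["rs2"] = {"r1": {"k": "v"}}
servers["rs1"] = None
meta["r1"] = "rs2"
second = client.get("r1", "k")
print(first, second, client.cache["r1"])
```

The caller never sees the failure; the only visible effect is that the cached location changes.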

Facebook Messages Stats
▪ B+ messages per day
▪ B+ read/write ops to HBase per day
▪ . M ops/sec at peak

▪ read, write
▪ ~ columns per operation across multiple families

▪ PB+ of online data in HBase
▪ LZO compressed and un-replicated ( PB replicated)
▪ Growing at TB/month

Use Case
Puma
(Facebook Insights)

Before Puma
Offline ETL
[Diagram: Web Tier -> (Scribe) -> HDFS -> (MR) -> Hive -> (SQL) -> MySQL, queried with SQL. End-to-end latency: 8-24 hours]

Puma
Realtime ETL
[Diagram: Web Tier -> (Scribe) -> HDFS -> (PTail) -> Puma -> (HTable) -> HBase, queried over Thrift. End-to-end latency: 2-30 seconds]

Puma as Realtime MapReduce
▪ Map phase with PTail
▪ Divide the input log stream into N shards

▪ First version supported random bucketing

▪ Now supports application-level bucketing

▪ Reduce phase with HBase
▪ Every row+column in HBase is an output key

▪ Aggregate key counts using atomic counters

▪ Can also maintain per-key lists or other structures
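The map/reduce analogy above can be sketched as follows. This is a minimal single-process illustration, not Puma or PTail code: the function names, the CRC32-based bucketing, and the sample events are all assumptions, and a `collections.Counter` stands in for HBase atomic counters.

```python
import zlib
from collections import Counter, defaultdict


def shard_for(key, num_shards):
    # Application-level bucketing: a stable hash sends a key to the same shard every time.
    return zlib.crc32(key.encode("utf-8")) % num_shards


def map_phase(events, num_shards):
    """Divide the input stream of (domain, action) events into shards."""
    shards = defaultdict(list)
    for domain, action in events:
        shards[shard_for(domain, num_shards)].append((domain, action))
    return shards


def reduce_phase(shards):
    """Treat each (row, column) pair as an output key and count its occurrences."""
    counters = Counter()  # single-threaded stand-in for atomic counter increments
    for shard_events in shards.values():
        for domain, action in shard_events:
            counters[(domain, action)] += 1
    return counters


events = [
    ("example.com", "like"),
    ("example.com", "click"),
    ("example.org", "like"),
    ("example.com", "like"),
]
shards = map_phase(events, num_shards=4)
counts = reduce_phase(shards)
print(counts[("example.com", "like")])
```

Because bucketing is by domain, every event for one domain is handled by one shard, which is what would allow per-key lists or other per-domain structures to be maintained alongside the counts.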

Puma for Facebook Insights
▪ Realtime URL/Domain Insights
▪ Domain owners can see deep analytics for their site

▪ Clicks, Likes, Shares, Comments, Impressions

▪ Detailed demographic breakdowns (anonymized)

▪ Top URLs calculated per-domain and globally

▪ Massive Throughput
▪ Billions of URLs

▪ > Million counter increments per second

Future of Puma
▪ Centrally managed service for many products

▪ Several other applications in production
▪ Commerce Tracking

▪ Ad Insights

▪ Making Puma generic
▪ Dynamically configured by product teams

▪ Custom query language

Use Case
ODS
(Facebook Internal Metrics)

ODS
▪ Operational Data Store
▪ System metrics (CPU, Memory, IO, Network)

▪ Application metrics (Web, DB, Caches)

▪ Facebook metrics (Usage, Revenue)
▪ Easily graph this data over time
▪ Supports complex aggregation, transformations, etc.

▪ Difficult to scale with MySQL

▪ Millions of unique time-series with billions of points

▪ Irregular data growth patterns

Dynamic sharding of regions
[Diagram: regions on a server that is overloaded]
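One way to picture dynamic sharding is a region that splits once it grows past a threshold, after which the resulting regions are spread over servers. The sketch below is only an illustration under assumptions: the midpoint split, the greedy least-loaded placement, and the names `split_if_needed` and `assign` are hypothetical and are not HBase's actual split policy or balancer.

```python
def split_if_needed(region, max_rows):
    """Split a sorted list of row keys in half once it exceeds max_rows."""
    if len(region) <= max_rows:
        return [region]
    mid = len(region) // 2
    return [region[:mid], region[mid:]]


def assign(regions, servers):
    """Greedy placement: give each region to the currently least-loaded server."""
    load = {s: [] for s in servers}
    for region in sorted(regions, key=len, reverse=True):
        target = min(servers, key=lambda s: sum(len(r) for r in load[s]))
        load[target].append(region)
    return load


hot = [f"row{i:02d}" for i in range(8)]  # a region that has grown too large
cold = ["a", "b"]                        # a small region that is left alone

regions = []
for r in (hot, cold):
    regions.extend(split_if_needed(r, max_rows=4))

placement = assign(regions, ["rs1", "rs2"])
print({s: sum(len(r) for r in rs) for s, rs in placement.items()})
```

Irregular growth, as in the ODS time-series case, only affects the regions that actually grow; the small region is never touched.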

User and Graph Data
in HBase

Why now?
▪ MySQL+Memcached hard to replace, but...
▪ Joins and other RDBMS functionality are gone

▪ From writing SQL to using APIs

▪ Next generation of services and caches makes the persistent storage engine transparent to www

▪ Primarily a financially motivated decision
▪ MySQL works, but can HBase save us money?

▪ Also, are there things we just couldn’t do before?

HBase vs. MySQL
▪ MySQL at Facebook
▪ Tier size determined solely by IOPS

▪ Heavy on random IO for reads and writes

▪ Rely on fast disks or flash to scale individual nodes

▪ HBase showing promise of cost savings
▪ Fewer IOPS on write-heavy workloads

▪ Larger tables on denser, cheaper nodes

▪ Simpler operations and replication “for free”

HBase vs. MySQL
▪ MySQL is not going anywhere soon
▪ It works!

▪ But HBase is a great addition to the tool belt
▪ Different set of trade-offs

▪ Great at storing key-values, dictionaries, and lists

▪ Products with heavy write requirements

▪ Generated data

▪ Potential capital and operational cost savings

UDB Challenges
▪ MySQL has a + year head start
▪ HBase is still a pre- . database system

▪ Insane Requirements
▪ Zero data loss, low latency, very high throughput

▪ Reads, writes, and atomic read-modify-writes

▪ WAN replication, backups w/ point-in-time recovery

▪ Live migration of critical user data w/ existing shards

▪ queryf() and other fun edge cases to deal with

Technical/Developer oriented talk tomorrow:

Apache HBase Road Map
A short history of nearly everything HBase. Past, present, and future.

Wednesday @ 1PM in the Met Ballroom

Check out the HBase at Facebook Page:

facebook.com/UsingHbase

Thanks! Questions?
