BigTable PreReading

BIGTABLE READING
2013.04.02 xielun.szd@alipay.com

Outline
 Introduction
 Data Model
 API
 Building Blocks
 Implementation
 Refinements
 Performance Evaluation
 Real Applications
 Lessons
 Related Work
 Conclusions

Introduction
 Why Does Google Need BigTable?
 Why Does BigTable Need an LSM Tree?

Data Model
 BigTable Is A Sparse Sorted Map
 Key: <Row, Column Family, Qualifier, Timestamp>
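
The sorted-map view above can be made concrete with a toy model. This is only a minimal in-memory sketch, not how BigTable stores data; the class `SparseSortedMap` and its method names are hypothetical, and negating the timestamp is simply one assumed way to make newer versions sort first.

```python
from bisect import bisect_left, insort


class SparseSortedMap:
    """Toy model: cells keyed by (row, column family, qualifier, timestamp)."""

    def __init__(self):
        self._keys = []   # keys kept in sorted order
        self._cells = {}  # key -> value; absent cells take no space (sparse)

    @staticmethod
    def _key(row, family, qualifier, ts):
        # Negate the timestamp so newer versions sort first within a column.
        return (row, family, qualifier, -ts)

    def put(self, row, family, qualifier, ts, value):
        key = self._key(row, family, qualifier, ts)
        if key not in self._cells:
            insort(self._keys, key)
        self._cells[key] = value

    def get_latest(self, row, family, qualifier):
        # The first key at or after (row, family, qualifier, -inf) is the newest version.
        i = bisect_left(self._keys, (row, family, qualifier, float("-inf")))
        if i < len(self._keys) and self._keys[i][:3] == (row, family, qualifier):
            return self._cells[self._keys[i]]
        return None

    def scan_row_range(self, start_row, end_row):
        # Rows are sorted, so a row-range scan is one contiguous run of keys.
        i = bisect_left(self._keys, (start_row,))
        while i < len(self._keys) and self._keys[i][0] < end_row:
            row, family, qualifier, neg_ts = self._keys[i]
            yield row, family, qualifier, -neg_ts, self._cells[self._keys[i]]
            i += 1
```

Because the keys are kept sorted by row first, rows with a common prefix (for example reversed host names such as `com.cnn.www`) end up adjacent, which is what makes short range scans cheap.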

API
 How to Support Single-Row Transactions?
 Atomic Read-Modify-Write
 Using the API with MapReduce
 Input source & Output target
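
As a starting point for the single-row transaction question above, here is a minimal single-process sketch of an atomic read-modify-write guarded by a per-row lock. It is not BigTable's client API: `RowStore`, `read_modify_write`, and the column name `cf:counter` are all hypothetical. The underlying assumption is that each row is served by exactly one server at a time, so row-level atomicity can be enforced locally.

```python
import threading
from collections import defaultdict


class RowStore:
    """Toy store that serializes all mutations to the same row."""

    def __init__(self):
        self._rows = defaultdict(dict)             # row key -> {column: value}
        self._locks = defaultdict(threading.Lock)  # row key -> lock
        self._locks_guard = threading.Lock()       # protects the lock table itself

    def _lock_for(self, row):
        with self._locks_guard:
            return self._locks[row]

    def read_modify_write(self, row, column, fn, default=0):
        # Read, compute, and write back while holding the row lock,
        # so no other writer to this row can interleave.
        with self._lock_for(row):
            new = fn(self._rows[row].get(column, default))
            self._rows[row][column] = new
            return new

    def read(self, row, column):
        with self._lock_for(row):
            return self._rows[row].get(column)


store = RowStore()


def worker():
    for _ in range(1000):
        store.read_modify_write("row1", "cf:counter", lambda v: v + 1)


threads = [threading.Thread(target=worker) for _ in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()
print(store.read("row1", "cf:counter"))  # 8000
```

Nothing here spans more than one row, which mirrors the scope of the slide: transactions across rows would need coordination that this sketch deliberately leaves out.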

Building Blocks
 GFS
 SSTable Data File
 Commit Log (WAL & Redo Log)
 Chubby
 Distributed Lock Service
 Root Tablet Location
 Tablet Server Management
 Schema & ACL Metadata
 Why Doesn't GFS Use Chubby for Its Master?
 If Chubby Did Not Exist, How Would You Design BigTable?

Implementations
 Client
 How to Keep the Master Lightly Loaded?
 Master
 Tablet Server Management
 Tablet Assignment
 Garbage Collection
 Schema Change
 Do We Need a Standby Master for Availability?
 Tablet Server
 Reader & Writer

Implementations
 Tablet Locations

Implementations
 Why Choose a Three-Level Hierarchy?
 Why Persist the Location Information?
 Why Is It Analogous to a B+ Tree?
 What Is the Maximum Number of RPCs for One Simple Query?
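
For the three-level hierarchy question, the capacity arithmetic is worth working through. The sketch below uses the figures given in the BigTable paper (about 1 KB per METADATA row and a 128 MB limit per METADATA tablet); the variable names are my own, and 128 MB user tablets are assumed for the byte total.

```python
METADATA_TABLET_BYTES = 128 * 2**20  # 128 MB limit per METADATA tablet
METADATA_ROW_BYTES = 2**10           # roughly 1 KB per tablet-location row
USER_TABLET_BYTES = 128 * 2**20      # assumed size of each user tablet

# Level 1: the root tablet points at METADATA tablets.
# Level 2: each METADATA tablet points at user tablets.
rows_per_metadata_tablet = METADATA_TABLET_BYTES // METADATA_ROW_BYTES
addressable_tablets = rows_per_metadata_tablet ** 2
addressable_bytes = addressable_tablets * USER_TABLET_BYTES

print(rows_per_metadata_tablet == 2**17)  # True
print(addressable_tablets == 2**34)       # True
print(addressable_bytes == 2**61)         # True
```

Two levels of fan-out of 2^17 each already address 2^34 tablets, which suggests why a fixed, shallow hierarchy was considered sufficient.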

Implementations
 Tablet Assignment
 Consistency Guarantee?
 Tablet Server Offline Detection Procedure
 Master Recovery Procedure?
 Master Memory != Metadata Table
 Metadata Table Schema Design?

Implementations
 Tablet Serving
 Read & Write & Recovery Procedure
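
The read, write, and recovery procedures can be sketched with a toy tablet. This is a simplification under stated assumptions: the `Tablet` class is hypothetical, plain Python lists and dicts stand in for the commit log and SSTables on GFS, and the log here is per tablet, whereas the paper describes one commit log per tablet server.

```python
class Tablet:
    """Toy tablet: commit log + in-memory memtable + immutable SSTables."""

    def __init__(self, commit_log=None, sstables=None):
        self.commit_log = commit_log if commit_log is not None else []
        self.sstables = sstables if sstables is not None else []  # oldest first
        self.memtable = {}

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. log the mutation first (WAL)
        self.memtable[key] = value            # 2. then apply it in memory

    def read(self, key):
        # Reads see a merged view: memtable first, then SSTables newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            if key in sstable:
                return sstable[key]
        return None

    def minor_compaction(self):
        # Freeze the memtable into a new SSTable; the log entries it
        # covered are no longer needed for recovery.
        if self.memtable:
            self.sstables.append(dict(self.memtable))
            self.memtable = {}
            self.commit_log.clear()

    @classmethod
    def recover(cls, commit_log, sstables):
        # Recovery: reopen the SSTables and replay the log into a fresh memtable.
        tablet = cls(commit_log=list(commit_log), sstables=list(sstables))
        for key, value in tablet.commit_log:
            tablet.memtable[key] = value
        return tablet
```

The point of the ordering in `write` is that anything acknowledged to the client is already in the log, so a crash loses only the memtable, which `recover` rebuilds.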

Implementations
 Compactions
 Differences Among the Three Types of Compaction
 If a Client Writes 1 GB of Data, How Much Data Ends Up in GFS?
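
To compare the compaction types, the following sketch contrasts a merging compaction with a major compaction (a minor compaction simply flushes the memtable to a new SSTable). It is a toy model: SSTables are plain dicts listed oldest first, `TOMBSTONE` is a hypothetical deletion marker, and the memtable is assumed to have been flushed already.

```python
TOMBSTONE = object()  # hypothetical marker for a deleted key


def merge_sstables(sstables, drop_tombstones):
    """Merge SSTables given oldest first; the newest value for a key wins."""
    merged = {}
    for sstable in sstables:
        merged.update(sstable)
    if drop_tombstones:
        merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
    return dict(sorted(merged.items()))


def merging_compaction(sstables, how_many):
    """Fold the newest `how_many` SSTables into one.

    Deletion markers must be kept, because older SSTables that are not
    part of this merge may still hold the deleted data.
    """
    if how_many < 2:
        return list(sstables)
    old, recent = sstables[:-how_many], sstables[-how_many:]
    return old + [merge_sstables(recent, drop_tombstones=False)]


def major_compaction(sstables):
    """Rewrite all SSTables into exactly one, with no deletion entries."""
    return [merge_sstables(sstables, drop_tombstones=True)]
```

The key difference is visible in `drop_tombstones`: only a compaction that covers every SSTable can safely discard deletion markers and reclaim the space of deleted data.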

Refinements
 How To Reduce IO?
 Column Store
 Compression
 Block Size
 Caching
 Caching Consistency
 Bloom Filter
 Why Must an LSM Tree Use Bloom Filters?
 Group Commit
 Pros vs. Cons
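
For the Bloom filter item on this slide, a minimal sketch may help frame the discussion. In an LSM design a read may have to consult several SSTables, and a per-SSTable filter lets the server skip files that definitely do not contain the key. The class below is illustrative only: the sizes, the SHA-256 based hashing, and the names are assumptions, not BigTable's implementation.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: no false negatives, some false positives."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key):
        # Derive num_hashes bit positions by salting the key with an index.
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))
```

A read path would call `might_contain` for each SSTable and touch disk only for those that answer True, trading a small amount of memory for fewer seeks on lookups of nonexistent rows or columns.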

Refinements
 Log
 Can GFS Contain Duplicate Records?
 What Happens When a Tablet Server Crashes?
 Tablet Recovery
 Tablet Migration
 Comparison with Tair & OceanBase
 Tablet Split
 Tablet Merge
 How Can This Be Done Online?

Performance
 Comparison with Single-Server Performance
 Low Latency vs. High Throughput
 Why Does Per-Server Performance Decrease?

Lessons
 Vulnerable to many types of failure
 RPC CRC Checksum
 Removing Assumptions
 Say No to adding new features
 System-level Monitoring
 Simple Design

Related Work
 CAP in BigTable?
 Shared Nothing, Shared Disk, or Shared Memory?
 C-Store, BigTable, or OceanBase?

Assignment
 All Questions Above
 C-Store, Bloom Filter
 Chubby, Megastore
 Dynamo
 Cassandra
 Riak
 Voldemort
 Redis Cluster
 Thanks & Q&A
