BIGTABLE READING 
2013.04.02 xielun.szd@alipay.com
Outline
 Introduction 
 Data Model 
 API 
 Building Blocks 
 Implementation 
 Refinements 
 Performance Evaluation 
 Real Applications 
 Lessons 
 Related Work 
 Conclusions
Introduction 
 Why Does Google Need BigTable?
 Why Does BigTable Need an LSM Tree?
Data Model 
 BigTable Is A Sparse Sorted Map 
 Key: <Row, Column Family, Qualifier, Timestamp>
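A minimal sketch of the "sparse sorted map" idea, assuming a toy in-memory store. The class name `SparseSortedMap` and its methods are hypothetical and are not part of any BigTable API; only the key layout (row, column family, qualifier, timestamp) comes from the slide.

```python
from typing import Dict, List, Optional, Tuple

# Key layout taken from the slide: (row, column family, qualifier, timestamp).
Key = Tuple[str, str, str, int]


class SparseSortedMap:
    """Toy model: only cells that were actually written take up space."""

    def __init__(self) -> None:
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, family: str, qualifier: str,
            timestamp: int, value: bytes) -> None:
        self._cells[(row, family, qualifier, timestamp)] = value

    def get_latest(self, row: str, family: str,
                   qualifier: str) -> Optional[bytes]:
        """Return the value with the highest timestamp, or None if absent."""
        versions = [(k[3], v) for k, v in self._cells.items()
                    if k[:3] == (row, family, qualifier)]
        if not versions:
            return None
        return max(versions, key=lambda tv: tv[0])[1]

    def scan(self, start_row: str, end_row: str) -> List[Tuple[Key, bytes]]:
        """Cells with start_row <= row < end_row, in key order.

        Rows, families and qualifiers ascend; timestamps descend so the
        newest version of a cell comes first.
        """
        keys = sorted(
            (k for k in self._cells if start_row <= k[0] < end_row),
            key=lambda k: (k[0], k[1], k[2], -k[3]),
        )
        return [(k, self._cells[k]) for k in keys]
```

A real implementation would keep the keys in a sorted structure instead of sorting on every scan; the sketch only shows the ordering and the sparsity.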
API 
 How to Support Single-Row Transactions
 Atomic Read-modify-write 
 Using API for MapReduce 
 Input source & Output target
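One way to think about the single-row transaction question is a per-row lock around the read and the write. The sketch below assumes that approach; the slides do not say how BigTable implements it, and `RowStore` and `read_modify_write` are hypothetical names.

```python
import threading
from collections import defaultdict
from typing import Callable, Dict


class RowStore:
    """Toy store in which every row has its own lock (an assumption)."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, int]] = defaultdict(dict)
        self._locks: Dict[str, threading.Lock] = defaultdict(threading.Lock)
        self._locks_guard = threading.Lock()  # protects the lock table itself

    def _lock_for(self, row: str) -> threading.Lock:
        with self._locks_guard:
            return self._locks[row]

    def read_modify_write(self, row: str, column: str,
                          fn: Callable[[int], int], default: int = 0) -> int:
        """Apply fn to the current value atomically with respect to this row."""
        with self._lock_for(row):
            new_value = fn(self._rows[row].get(column, default))
            self._rows[row][column] = new_value
            return new_value

    def read(self, row: str, column: str, default: int = 0) -> int:
        with self._lock_for(row):
            return self._rows[row].get(column, default)
```

Because the lock is scoped to one row, updates to different rows never block each other, which is why the guarantee stops at the row boundary.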
Building Blocks 
 GFS 
 SSTable Data File 
 Commit Log (WAL & Redo Log) 
 Chubby 
 Distributed Lock Service 
 Root Tablet Location 
 Tablet server Manager 
 Schema & ACL Metadata 
 Why Does GFS Not Use Chubby for Its Master?
 If Chubby Did Not Exist, How Would You Design BigTable?
Implementations 
 Client 
 How to Keep the Master Lightly Loaded?
 Master 
 Tablet server Manager 
 Tablet Assignment 
 Garbage Collection 
 Schema Change 
 Do We Need a Slave Master for Availability?
 Tablet Server 
 Reader & Writer
Implementations 
 Tablet Locations
Implementations 
 Why Select a Three-Level Hierarchy?
 Why Persist the Location Info?
 Why Is It Analogous to a B+ Tree?
 What Is the Maximum Number of RPCs During One Simple Query?
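The capacity of the three-level hierarchy can be worked out with a few lines of arithmetic. The sketch below assumes the figures given in the Bigtable paper (roughly 1 KB per METADATA row and a 128 MB limit on METADATA tablets); the function name is hypothetical.

```python
ROW_BYTES = 1024                # ~1 KB per METADATA row (paper's estimate)
TABLET_BYTES = 128 * 1024 ** 2  # 128 MB METADATA tablet limit (paper's figure)


def location_capacity(row_bytes: int = ROW_BYTES,
                      tablet_bytes: int = TABLET_BYTES):
    """Return (rows per METADATA tablet, addressable tablets, addressable bytes)."""
    rows_per_tablet = tablet_bytes // row_bytes
    # Root tablet -> METADATA tablets -> user tablets: two levels of fan-out.
    addressable_tablets = rows_per_tablet ** 2
    addressable_bytes = addressable_tablets * tablet_bytes
    return rows_per_tablet, addressable_tablets, addressable_bytes
```

With these inputs the fan-out is 2^17 per level, giving 2^34 user tablets, or 2^61 bytes at 128 MB per tablet. On the RPC question, the paper reports three network round trips for a client with an empty location cache (including one read from Chubby) and up to six when the cache is stale.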
Implementations 
 Tablet Assignment 
 Consistency Guarantee?
 Tablet Server Offline Detection Procedure
 Master Recovery Procedure?
 Master Memory != Metadata Table 
 Metadata Table Schema Design?
Implementations 
 Tablet Serving 
 Read & Write & Recovery Procedure
Implementations 
 Compactions 
 Differences Among the Three Types of Compaction
 If a Client Writes 1 GB of Data, How Much Data Ends Up in GFS?
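To make the three compaction types concrete, here is a minimal in-memory sketch, assuming plain dictionaries stand in for the memtable and SSTables. `TinyLSM` and its method names are hypothetical; the commit log and GFS are left out entirely.

```python
from typing import Dict, List, Optional

TOMBSTONE = None  # deletion marker stored in place of a value

Table = Dict[str, Optional[str]]


class TinyLSM:
    def __init__(self, memtable_limit: int = 2) -> None:
        self.memtable: Table = {}
        self.sstables: List[Table] = []  # immutable runs, oldest first
        self.memtable_limit = memtable_limit

    def put(self, key: str, value: Optional[str]) -> None:
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.minor_compaction()

    def delete(self, key: str) -> None:
        self.put(key, TOMBSTONE)

    def get(self, key: str) -> Optional[str]:
        # Newest data wins: memtable first, then SSTables newest to oldest.
        for table in [self.memtable] + self.sstables[::-1]:
            if key in table:
                return table[key]  # None here means "deleted"
        return None

    def minor_compaction(self) -> None:
        """Freeze the memtable and write it out as a new sorted run."""
        if self.memtable:
            self.sstables.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    @staticmethod
    def _merge(tables: List[Table], drop_tombstones: bool) -> Table:
        merged: Table = {}
        for table in tables:  # oldest first, so newer values overwrite
            merged.update(table)
        if drop_tombstones:
            merged = {k: v for k, v in merged.items() if v is not TOMBSTONE}
        return dict(sorted(merged.items()))

    def merging_compaction(self, count: int = 2) -> None:
        """Merge the newest `count` runs plus the memtable; keep tombstones,
        because older runs may still hold the deleted keys."""
        self.minor_compaction()
        newest = self.sstables[-count:]
        self.sstables = self.sstables[:-count] + [self._merge(newest, False)]

    def major_compaction(self) -> None:
        """Rewrite everything into one run and drop deleted entries."""
        self.minor_compaction()
        self.sstables = [self._merge(self.sstables, True)]
```

The sketch also hints at the 1 GB question: the same bytes are written once to the log, again by the minor compaction, and again by each later merging or major compaction, before GFS replication multiplies every copy.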
Refinements 
 How To Reduce IO? 
 Column Store 
 Compression 
 Block Size 
 Caching 
 Caching Consistency 
 Bloom Filter 
 Why Must an LSM Tree Use Bloom Filters?
 Group Commit 
 Pros vs. Cons
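A read in an LSM design may have to consult every SSTable, so a filter that can answer "definitely not here" saves disk seeks. The following is a minimal Bloom filter sketch; the sizes, the SHA-256 double-hashing scheme, and the names are illustrative assumptions, not BigTable's actual implementation.

```python
import hashlib
from typing import List


class BloomFilter:
    def __init__(self, num_bits: int = 1024, num_hashes: int = 4) -> None:
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray((num_bits + 7) // 8)

    def _positions(self, key: str) -> List[int]:
        # Derive k bit positions from two 64-bit halves of one digest
        # (double hashing); forcing h2 odd avoids a zero stride.
        digest = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, key: str) -> None:
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key: str) -> bool:
        """False means definitely absent; True means possibly present."""
        return all(self.bits[p // 8] & (1 << (p % 8))
                   for p in self._positions(key))
```

A tablet server would keep one such filter per SSTable and skip any SSTable whose filter returns False. False positives only cost an unnecessary read; false negatives cannot occur.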
Refinements 
 Log 
 Can Duplicated Records Exist in GFS?
 What Happens When a Tablet Server Crashes?
 Tablet Recovery 
 Tablet Migration
 Compare with Tair & OB 
 Tablet Split 
 Tablet Merge 
 How Can This Be Done Online?
Performance 
 Compare With Single Server Performance 
 Low Latency vs. High Throughput
 Why Does Per-Server Performance Decrease?
Performance 
 Scalability
Lessons 
 Vulnerable to many types of failure 
 RPC CRC Checksum
 Removing Assumptions
 Say No to adding new features 
 System-level Monitoring 
 Simple Design
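The RPC checksum lesson can be illustrated with a small framing sketch that appends a CRC-32 to each message and verifies it on receipt. The frame layout and the function names `frame` and `unframe` are assumptions made for illustration; they do not describe BigTable's wire format.

```python
import struct
import zlib


def frame(payload: bytes) -> bytes:
    """Append a big-endian CRC-32 of the payload."""
    return payload + struct.pack(">I", zlib.crc32(payload) & 0xFFFFFFFF)


def unframe(message: bytes) -> bytes:
    """Verify the trailing CRC-32 and return the payload, or raise ValueError."""
    if len(message) < 4:
        raise ValueError("message too short to contain a checksum")
    payload, (expected,) = message[:-4], struct.unpack(">I", message[-4:])
    if zlib.crc32(payload) & 0xFFFFFFFF != expected:
        raise ValueError("checksum mismatch: payload corrupted in transit")
    return payload
```

Checking at the application level catches corruption that slips past lower layers, at the cost of four bytes and one pass over the payload per message.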
Related Work 
 CAP in BigTable? 
 Shared Nothing, Shared Disk, or Shared Memory?
 C-Store, BigTable, or OceanBase?
Assignment 
 ALL Questions Above 
 C-Store, Bloom Filter
 Chubby, Megastore 
 Dynamo 
 Cassandra 
 Riak 
 Voldemort
 Redis cluster 
 Thanks & QA

BigTable PreReading
