Bigtable: A Distributed Storage System
Presenter: Ku. Devyani B.Vaidya
Dec 8th, 2011
1. Introduction
2. What is a Bigtable?
3. Why not a DBMS?
4. Data model: Row, Column, Timestamps
5. APIs
6. Building Blocks
7. Real Applications
8. Conclusion
Introduction
• Bigtable is a distributed storage system for managing structured data.
• Designed to scale to a very large size
- Petabytes of data across thousands of servers
• Used for many Google projects
- Web indexing, Personalized Search, Google Earth, Google Analytics, Google Finance, …
• Flexible, high-performance solution for all of these Google products
What is a Bigtable?
• “A Bigtable is a sparse, distributed, persistent multidimensional sorted map. The map is indexed by a row key, a column key, and a timestamp; each value in the map is an uninterpreted array of bytes.”
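
The definition above can be sketched as a small in-memory map. This is only an illustration, not Bigtable's implementation: the class name `ToyTable`, its methods, and the sample row and column keys are all assumptions made for the example. It shows the three-part index, the sorted order, and why the map is "sparse" (cells that were never written take no space).

```python
from typing import Dict, Iterator, Optional, Tuple

# Hypothetical stand-in for the Bigtable map:
# (row key, column key, timestamp) -> uninterpreted bytes.
Key = Tuple[str, str, int]


class ToyTable:
    """Illustrative in-memory map; not distributed and not persistent."""

    def __init__(self) -> None:
        # Only cells that were written exist, which makes the map sparse.
        self._cells: Dict[Key, bytes] = {}

    def put(self, row: str, column: str, timestamp: int, value: bytes) -> None:
        self._cells[(row, column, timestamp)] = value

    def get(self, row: str, column: str, timestamp: int) -> Optional[bytes]:
        return self._cells.get((row, column, timestamp))

    def scan(self) -> Iterator[Tuple[Key, bytes]]:
        """Yield cells sorted by row, then column, then newest timestamp first."""
        for key in sorted(self._cells, key=lambda k: (k[0], k[1], -k[2])):
            yield key, self._cells[key]


# Sample data; the keys are illustrative values only.
table = ToyTable()
table.put("com.cnn.www", "contents:", 3, b"<html>v3")
table.put("com.cnn.www", "contents:", 5, b"<html>v5")
table.put("com.cnn.www", "anchor:cnnsi.com", 9, b"CNN")
first_key, first_value = next(table.scan())
```

Values are stored as `bytes` and never parsed, which mirrors the "uninterpreted array of bytes" in the quote.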
Why not a DBMS?
• Few DBMSs support the requisite scale
– Google required a database with wide scalability, wide applicability, high performance and high availability
• Google couldn’t afford one even if it existed
– Most DBMSs require very expensive infrastructure
• DBMSs provide more than Google needs
– E.g., full transactions, SQL
• Google has highly optimized lower-level systems that could be exploited
– GFS, Chubby, MapReduce, job scheduling
Data model: Row
• Row keys are arbitrary strings
• A row is the unit of transactional consistency: every read or write under a single row key is atomic
• Data is maintained in lexicographic order by row key
• Rows with consecutive keys (a row range) are grouped together into “tablets”, the unit of distribution and load balancing
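
A minimal sketch of the two ideas above, lexicographic ordering and grouping consecutive rows into tablets. The function `tablet_index`, the split keys, and the sample row keys are hypothetical; how Bigtable actually chooses tablet boundaries is not described in these slides.

```python
from bisect import bisect_right
from collections import defaultdict
from typing import Dict, List


def tablet_index(row_key: str, split_keys: List[str]) -> int:
    """Return the index of the row range (tablet) that holds row_key.

    split_keys must be sorted. Tablet 0 holds keys below split_keys[0],
    tablet i holds keys in [split_keys[i-1], split_keys[i]), and the last
    tablet holds everything from the final split key onwards.
    """
    return bisect_right(split_keys, row_key)


# Python compares str values lexicographically, matching the row ordering.
row_keys = sorted(["com.google.maps", "com.cnn.www", "org.example.a", "com.cnn.money"])

# Assumed tablet boundaries, chosen by hand for the example.
split_keys = ["com.google", "org"]

tablets: Dict[int, List[str]] = defaultdict(list)
for key in row_keys:
    tablets[tablet_index(key, split_keys)].append(key)
```

Because the keys are sorted, rows that share a prefix (here the two `com.cnn.` rows) land next to each other and usually in the same tablet, so a scan over a short row range touches few tablets.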
Data model: Column
• Column keys are grouped into sets called “column families”, which form the unit of access control.
• A column key is named using the syntax family:qualifier
• Access control and disk/memory accounting are both performed at the column-family level
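
The family:qualifier syntax and family-level access control can be illustrated as follows. This is a sketch under assumptions: `ColumnKey`, `parse_column_key` and `can_read` are invented names, and the permission model (a set of readable families per user) is a simplification for the example.

```python
from typing import NamedTuple, Set


class ColumnKey(NamedTuple):
    family: str
    qualifier: str


def parse_column_key(key: str) -> ColumnKey:
    """Split 'family:qualifier' at the first colon; the qualifier may be empty."""
    family, sep, qualifier = key.partition(":")
    if not sep or not family:
        raise ValueError(f"expected 'family:qualifier', got {key!r}")
    return ColumnKey(family, qualifier)


def can_read(readable_families: Set[str], key: str) -> bool:
    """Access is decided per column family, never per individual qualifier."""
    return parse_column_key(key).family in readable_families


anchor = parse_column_key("anchor:cnnsi.com")
allowed = can_read({"anchor"}, "anchor:cnnsi.com")
denied = can_read({"anchor"}, "contents:")
```

Checking only the family keeps the permission table small even when a family holds a very large number of qualifiers.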
Data model: Timestamps
• Each cell in Bigtable can contain multiple versions of data, each indexed by timestamp
• Timestamps are 64-bit integers
• Assigned by either:
– Bigtable
– The client application
• Versions are stored in decreasing timestamp order, so that the most recent data is read first
– The application specifies how many versions (n) of a data item are kept in a cell
– Bigtable garbage-collects older cell versions automatically.
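
The versioning rules above can be sketched with a toy cell that keeps only the newest n versions. The class `VersionedCell` is hypothetical, and trimming on every write is a simplification chosen for the example; the slides do not say when Bigtable performs its garbage collection.

```python
from typing import List, Optional, Tuple


class VersionedCell:
    """Toy cell that keeps at most max_versions values, newest first."""

    def __init__(self, max_versions: int) -> None:
        if max_versions < 1:
            raise ValueError("max_versions must be at least 1")
        self.max_versions = max_versions
        self._versions: List[Tuple[int, bytes]] = []  # (timestamp, value)

    def write(self, timestamp: int, value: bytes) -> None:
        # A write with an existing timestamp replaces that version.
        self._versions = [(t, v) for t, v in self._versions if t != timestamp]
        self._versions.append((timestamp, value))
        # Keep decreasing timestamp order so the newest version is first.
        self._versions.sort(key=lambda tv: tv[0], reverse=True)
        # "Garbage-collect" everything beyond the newest max_versions.
        del self._versions[self.max_versions:]

    def latest(self) -> Optional[Tuple[int, bytes]]:
        return self._versions[0] if self._versions else None

    def timestamps(self) -> List[int]:
        return [t for t, _ in self._versions]


cell = VersionedCell(max_versions=3)
for ts, val in [(10, b"a"), (30, b"c"), (20, b"b"), (40, b"d")]:
    cell.write(ts, val)
```

After the four writes the version with timestamp 10 has been dropped, and `cell.latest()` returns the value written at timestamp 40.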
Data Model
Example: Web Indexing
(Figure slides; images not recovered. Slide labels: Row, Columns, Cells, Timestamps, Column family, family:qualifier.)
APIs
• The Bigtable API provides:
- Functions for creating and deleting tables and column families
- Functions for changing cluster, table and column family metadata
- Support for single-row transactions
- The ability to use cells as integer counters
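
The last two API features can be illustrated together, since a counter is a read-modify-write on one row. This is a sketch, not the Bigtable client API: `ToyRow`, `transact` and `increment` are invented names, a per-row lock stands in for whatever mechanism Bigtable uses, and the 8-byte big-endian encoding of the counter is an assumption.

```python
import threading
from typing import Callable, Dict, Optional


class ToyRow:
    """Toy row whose updates are atomic with respect to each other."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._cells: Dict[str, bytes] = {}

    def transact(self, mutate: Callable[[Dict[str, bytes]], None]) -> None:
        """Apply mutate to a draft copy; commit only if it does not raise."""
        with self._lock:
            draft = dict(self._cells)
            mutate(draft)
            self._cells = draft

    def increment(self, column: str, delta: int = 1) -> int:
        """Treat the cell as a signed 64-bit counter and add delta to it."""
        new_value = 0

        def bump(cells: Dict[str, bytes]) -> None:
            nonlocal new_value
            current = int.from_bytes(cells.get(column, bytes(8)), "big", signed=True)
            new_value = current + delta
            cells[column] = new_value.to_bytes(8, "big", signed=True)

        self.transact(bump)
        return new_value

    def get(self, column: str) -> Optional[bytes]:
        with self._lock:
            return self._cells.get(column)


row = ToyRow()
row.increment("stats:clicks")
total = row.increment("stats:clicks", 4)
```

The transaction never spans more than one row, which is the restriction the slide states: atomicity is offered per row, not across rows.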
Building Blocks
• Bigtable uses the distributed Google File System (GFS) to store log and data files
• The Google SSTable file format is used internally to store Bigtable data
• An SSTable provides a persistent, ordered, immutable map from keys to values
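
The "ordered, immutable map" property of an SSTable can be sketched in memory as below. The class `ToySSTable` is hypothetical and deliberately omits persistence and the on-disk file layout, neither of which is described in these slides; it only shows lookups and range scans over sorted, read-only data.

```python
from bisect import bisect_left
from typing import Iterator, Mapping, Optional, Tuple


class ToySSTable:
    """Sorted key -> value map that is built once and never modified."""

    def __init__(self, items: Mapping[bytes, bytes]) -> None:
        pairs = sorted(items.items())
        # Tuples cannot be changed after construction, so the table is immutable.
        self._keys: Tuple[bytes, ...] = tuple(k for k, _ in pairs)
        self._values: Tuple[bytes, ...] = tuple(v for _, v in pairs)

    def get(self, key: bytes) -> Optional[bytes]:
        """Binary-search for an exact key."""
        i = bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]:
        """Yield pairs with start <= key < end in key order."""
        i = bisect_left(self._keys, start)
        while i < len(self._keys) and self._keys[i] < end:
            yield self._keys[i], self._values[i]
            i += 1


sst = ToySSTable({b"b": b"2", b"a": b"1", b"c": b"3"})
in_range = list(sst.scan(b"a", b"c"))
```

Since the table offers no update method, new data would have to go into a new table rather than into this one.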
Real Applications
• Google Analytics
http://analytics.google.com
• Google Earth & Google Maps
http://earth.google.com
• Personalized Search
www.google.com/psearch
• Web Indexing
• Google Finance
• Orkut
• Writely
Conclusion
• Bigtable has achieved its goals of high performance, data availability and scalability.
• It has been successfully deployed in real applications (Personalized Search, Orkut, Google Maps, …)
• Building its own storage system gave Google significant advantages: flexibility in designing the data model, and control over the implementation and the other infrastructure on which Bigtable relies.
©2007 The Board of Regents of the University of Nebraska. All rights reserved.
Thanks

