Popular algorithms
for storing data on disk
Konstantin Osipov,
kostja@tarantool.org
October 28th, 2013
Incident at Map Grid 36-80
• B-tree – most popular disk-based data structure
• B-tree balances INSERT, UPDATE and SELECT speed
• DELETEs can be slow
The DBMS is fast; you just have to know how to tune it
B-tree: internals
What does cache-oblivious mean?
What does cache-oblivious mean? (2)

BLOCK-MULT(A, B, C, n):
  for i = 1 to n/s do:
    for j = 1 to n/s do:
      for k = 1 to n/s do:
        ORD-MULT(A_ik, B_kj, C_ij, s)
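A minimal Python sketch of the blocked multiplication above, assuming square n×n matrices stored as lists of lists and a tile size s that divides n. The names `block_mult` and `ord_mult` mirror the pseudocode; passing tile offsets instead of sub-matrices is an assumption made for illustration. The tile size s has to be tuned to the cache, which is what makes this variant cache-aware rather than cache-oblivious.

```python
def ord_mult(A, B, C, i0, j0, k0, s):
    """Ordinary triple-loop multiply of one s-by-s tile: C_ij += A_ik * B_kj."""
    for i in range(i0, i0 + s):
        for j in range(j0, j0 + s):
            acc = C[i][j]
            for k in range(k0, k0 + s):
                acc += A[i][k] * B[k][j]
            C[i][j] = acc


def block_mult(A, B, C, n, s):
    """Accumulate A * B into C tile by tile; s is the (assumed) tile size."""
    if n % s != 0:
        raise ValueError("tile size s must divide n")
    for i in range(0, n, s):
        for j in range(0, n, s):
            for k in range(0, n, s):
                ord_mult(A, B, C, i, j, k, s)


if __name__ == "__main__":
    A = [[1, 2], [3, 4]]
    B = [[5, 6], [7, 8]]
    C = [[0, 0], [0, 0]]
    block_mult(A, B, C, n=2, s=1)
    print(C)
```

C must be zero-initialized by the caller, because each tile product is added to what is already there.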
LSM-tree: architecture
LSM-tree: architecture (2)
LevelDB: architecture
LevelDB: insert RPS
LSM-tree: use cases
● Data with varying degrees of relevance
  – Social network wall
  – Chats
  – Message feeds
  – Events
● Data segregation
  – Data in LSM, index in memory
COLA: architecture
O(logB(N)) vs. O(logB(N)/B)
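To get a feel for the gap between the two bounds, here is a small numeric sketch. It reads the left-hand side as the per-insert cost of a B-tree and the right-hand side as the amortized per-insert cost of a structure that batches writes; that reading is an assumption about the slide's intent. The values of N and B are hypothetical and constant factors are ignored.

```python
import math


def btree_insert_cost(n: int, b: int) -> float:
    """Left-hand bound from the slide: log_B(N) block transfers per insert."""
    return math.log(n, b)


def batched_insert_cost(n: int, b: int) -> float:
    """Right-hand bound from the slide: log_B(N) / B, amortized per insert."""
    return math.log(n, b) / b


if __name__ == "__main__":
    n, b = 10**9, 1024  # hypothetical: a billion keys, 1024 keys per block
    print(f"log_B(N)     = {btree_insert_cost(n, b):.3f}")
    print(f"log_B(N) / B = {batched_insert_cost(n, b):.5f}")
```

Under these formulas the two costs always differ by a factor of B, regardless of N.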
PUT(37), PUT(16)
Memory: Self-Balancing Tree (empty)
Disk: (empty)
WAL: (empty)

Memory: Self-Balancing Tree: 16 37
Disk: (empty)
WAL: 37, 16

Memory: Self-Balancing Tree: 7 41
Disk: Sorted String Table: 16 37
WAL: 41, 7, 37, 16
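The write path in these frames can be sketched in Python as follows. This is a toy model under stated assumptions: `TinyLSM`, `MEMTABLE_LIMIT` and `contains` are hypothetical names, a dict stands in for the self-balancing tree, plain lists stand in for the WAL and the Sorted String Tables, and only keys are stored, as in the frames.

```python
from bisect import bisect_left

MEMTABLE_LIMIT = 2  # assumed capacity, chosen to match the two-key frames


class TinyLSM:
    def __init__(self):
        self.wal = []       # stand-in for the on-disk write-ahead log
        self.memtable = {}  # stand-in for the in-memory self-balancing tree
        self.sstables = []  # newest first; each table is a sorted list of keys

    def put(self, key):
        # A full memtable is written out as a sorted table before the insert.
        if len(self.memtable) >= MEMTABLE_LIMIT and key not in self.memtable:
            self.flush()
        self.wal.append(key)  # log first, so the memtable can be rebuilt
        self.memtable[key] = True

    def flush(self):
        self.sstables.insert(0, sorted(self.memtable))
        self.memtable = {}

    def contains(self, key):
        if key in self.memtable:
            return True
        for table in self.sstables:  # newest table first
            i = bisect_left(table, key)
            if i < len(table) and table[i] == key:
                return True
        return False


if __name__ == "__main__":
    db = TinyLSM()
    for k in (37, 16, 7, 41):
        db.put(k)
    print(sorted(db.memtable), db.sstables, db.wal)
```

The sketch never truncates the log; a real engine would drop the WAL records that a flush has made redundant.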
Levels, Memory to Disk:
  7 37
  7 16 37 41
WAL: 41, 7, 37, 16

Levels, Memory to Disk:
  10 28
  7 37
  7 16 37 41
WAL: 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  10 28
  7 16 37 41
WAL: 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  2 47
  10 28
  7 16 37 41
WAL: 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  2 28
  2 10 28 41
  2 7 10 16 28 37 41 47
WAL: 47, 2, 10, 28, 41, 7, 37, 16
Levels, Memory to Disk:
  6 49
  2 28
  2 10 28 41
  2 7 10 16 28 37 41 47
WAL: 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  6 49
  2 10 28 41
  2 7 10 16 28 37 41 47
WAL: 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  23 32
  6 49
  2 10 28 41
  2 7 10 16 28 37 41 47
WAL: 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  6 32
  6 23 32 49
  2 7 10 16 28 37 41 47
WAL: 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16
Levels, Memory to Disk:
  30 45
  6 32
  6 23 32 49
  2 7 10 16 28 37 41 47
WAL: 30, 45, 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  14 38
  30 45
  6 23 32 49
  2 7 10 16 28 37 41 47
WAL: 38, 14, 30, 45, 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

Levels, Memory to Disk:
  6 10
  2 30
  2 14 30 41
  2 7 14 23 30 37 41 47
  2 6 7 10 14 16 23 28 30 32 37 38 41 45 47 49
WAL: 10, 6, 38, 14, 45, 30, 45, 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16
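The cascade of merges in these frames behaves like a binary counter: a run lands in the first empty level, and every full level it meets on the way is merged into it and emptied. Below is a minimal Python sketch of that idea. The name `cola_insert` and the list-of-lists layout are assumptions, and the sparse guide rows visible in the frames are left out.

```python
from heapq import merge


def cola_insert(levels, run):
    """Push a run of keys into the levels, merging downwards while levels are full.

    levels[k] is either [] (empty) or a sorted list holding len(run) * 2**k keys.
    """
    carry = sorted(run)
    for k in range(len(levels)):
        if not levels[k]:
            levels[k] = carry  # first empty level absorbs the carry
            return
        # Level k is full: merge it into the carry and leave it empty.
        carry = list(merge(levels[k], carry))
        levels[k] = []
    levels.append(carry)  # every level was full, so open a new one


if __name__ == "__main__":
    levels = []
    for run in ([16, 37], [7, 41], [10, 28], [2, 47]):
        cola_insert(levels, run)
    print(levels)
```

Each merge is a sequential pass over sorted data, which is why this layout favours disks that are much faster at sequential I/O than at random I/O.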
Levels, Memory to Disk:
  22 37
  10 25 36 42
  3 8 15 26 35 40 45 48
  2 6 7 10 14 16 23 28 30 32 37 38 41 45 47 49
WAL: 37, 22, 36, 10, 25, 42, 10, 6, 38, 14, 45, 30, 45, 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16

GET(16)
Levels, Memory to Disk:
  22 37
  10 25 36 42
  3 8 15 26 35 40 45 48
  2 6 7 10 14 16 23 28 30 32 37 38 41 45 47 49
WAL: 37, 22, 36, 10, 25, 42, 10, 6, 38, 14, 45, 30, 45, 32, 23, 49, 6, 47, 2, 10, 28, 41, 7, 37, 16
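A lookup such as GET(16) has to consult the levels from newest to oldest. The sketch below does a plain binary search in each level; `cola_get` and `sample` are hypothetical names. The sparse rows in the earlier frames (for example 2 7 14 23 30 37 41 47 above the sixteen-key level) look like every second key of the level beneath them, and reading them as guide rows that narrow the search in the next level is an assumption.

```python
from bisect import bisect_left


def cola_get(levels, key):
    """Return True if key is present in any level; levels are sorted lists, newest first."""
    for level in levels:
        i = bisect_left(level, key)
        if i < len(level) and level[i] == key:
            return True
    return False


def sample(level, step=2):
    """Every step-th key of a level: a sparse guide row for the level below."""
    return level[::step]


if __name__ == "__main__":
    bottom = [2, 6, 7, 10, 14, 16, 23, 28, 30, 32, 37, 38, 41, 45, 47, 49]
    levels = [[6, 10], [], [], bottom]
    print(cola_get(levels, 16), sample(bottom), sample(sample(bottom)))
```

Without guide rows every level costs a full binary search; with them, the position found in one level bounds the range to scan in the next.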
BitCask: AOF architecture
BitCask: keydir architecture
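The two BitCask pieces named above, an append-only file and an in-memory keydir, fit together roughly as in the following Python sketch. It is a simplification under stated assumptions: `TinyBitcask` is a hypothetical name, an in-memory `BytesIO` stands in for the data file, and the record layout (two 4-byte lengths, then key and value) is invented for illustration and omits the checksum, timestamp and file id of a real record.

```python
import io
import struct

HEADER = struct.Struct(">II")  # assumed layout: key length, value length


class TinyBitcask:
    def __init__(self):
        self.log = io.BytesIO()  # stand-in for the append-only file (AOF)
        self.keydir = {}         # key -> (value offset, value size) of the latest write

    def put(self, key: bytes, value: bytes) -> None:
        self.log.seek(0, io.SEEK_END)  # records are only ever appended
        start = self.log.tell()
        self.log.write(HEADER.pack(len(key), len(value)) + key + value)
        self.keydir[key] = (start + HEADER.size + len(key), len(value))

    def get(self, key: bytes):
        entry = self.keydir.get(key)
        if entry is None:
            return None
        offset, size = entry
        self.log.seek(offset)  # one seek and one read per lookup
        return self.log.read(size)


if __name__ == "__main__":
    db = TinyBitcask()
    db.put(b"a", b"1")
    db.put(b"a", b"2")  # the old record stays in the log as garbage
    print(db.get(b"a"), len(db.log.getvalue()))
```

Overwritten records are not reclaimed here; reclaiming them is the job of a separate merge (compaction) pass.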
Sophia: architecture
Memory, Key-value index: 15, 26, 40, 84
Page index: [10, 25] [26, 31] [39, 85] [86, 96]
split
Disk: [10, 15, 16, 25] [26, 31] [39, 40, 84, 85] [86, 96]
Insert: 39, 16, 85, 96
Memory, Key-value index: (empty)
Page index: (empty)
Disk: WAL

Insert
Memory, Key-value index: 16, 39, 85, 96
Page index: (empty)
Disk: (no pages)
Insert: 31, 25, 10, 86
Memory, Key-value index: (empty)
Page index: [16, 96]
Disk: WAL, [16, 39, 85, 96]

Memory, Key-value index: 10, 25, 31, 86
Page index: [16, 96]
merge
Disk: [16, 39, 85, 96]
Memory, Key-value index: 10, 25, 31, 86
Page index: [10, 31] [39, 96]
split
Disk: [10, 16, 25, 31] [39, 85, 86, 96]
Insert: 15, 26, 40, 84
Memory, Key-value index: (empty)
Page index: [10, 31] [39, 96]
Disk: WAL, [10, 16, 25, 31] [39, 85, 86, 96]

Memory, Key-value index: 15, 26, 40, 84
Page index: [10, 31] [39, 96]
merge
Disk: [10, 16, 25, 31] [39, 85, 86, 96]
Memory, Key-value index: 15, 26, 40, 84
Page index: [10, 25] [26, 31] [39, 85] [86, 96]
split
Disk: [10, 15, 16, 25] [26, 31] [39, 40, 84, 85] [86, 96]
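The merge and split steps in these frames can be reproduced with a short Python sketch. It is an illustration only, not Sophia's actual code: `merge_into_pages`, `page_index` and `PAGE_SIZE` are hypothetical names, the four-key page capacity is read off the frames, and routing each key to the first page whose maximum is not smaller than the key is an assumption.

```python
from bisect import bisect_left
from heapq import merge

PAGE_SIZE = 4  # assumed capacity; pages in the frames hold at most four keys


def merge_into_pages(pages, mem_keys, page_size=PAGE_SIZE):
    """Merge in-memory keys into sorted on-disk pages, splitting oversized pages."""
    extra = sorted(mem_keys)
    if not pages:
        groups = [extra]
    else:
        maxes = [page[-1] for page in pages]
        buckets = [[] for _ in pages]
        for key in extra:
            # Route to the first page whose max >= key, else to the last page.
            buckets[min(bisect_left(maxes, key), len(pages) - 1)].append(key)
        groups = [list(merge(p, b)) for p, b in zip(pages, buckets)]
    # Split: cut every merged group into chunks of at most page_size keys.
    return [g[s:s + page_size] for g in groups for s in range(0, len(g), page_size)]


def page_index(pages):
    """(min, max) key of each page, as kept in the page index."""
    return [(page[0], page[-1]) for page in pages]


if __name__ == "__main__":
    pages = []
    for batch in ([39, 16, 85, 96], [31, 25, 10, 86], [15, 26, 40, 84]):
        pages = merge_into_pages(pages, batch)
        print(page_index(pages), pages)
```

Because only the pages that receive keys are rewritten, the cost of a merge depends on how many pages the in-memory batch touches, not on the total size of the store.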
Epilogue: choose your db wisely
?
Links
● Bitcask: A Log-Structured Hash Table for Fast Key/Value Data, Justin Sheehy and David Smith, with inspiration from Eric Brewer
● The Log-Structured Merge-Tree (LSM-Tree), Patrick O'Neil, Edward Cheng, Dieter Gawlick, Elizabeth O'Neil
● Cache-Oblivious Algorithms, Harald Prokop (Master's thesis)
● Space/Time Trade-offs in Hash Coding with Allowable Errors, Burton H. Bloom
● Data Structures and Algorithms for Big Databases, Michael A. Bender (Stony Brook & Tokutek), Bradley C. Kuszmaul (XLDB tutorial)
● http://github.com/pmwkaa/sophia, http://sphia.org
● http://codecapsule.com/2012/12/30/implementing-a-key-value-store-part-3-comparative-analysis-of-the-ar
● http://stackoverflow.com/questions/6079890/cache-oblivious-lookahead-array
● http://www.youtube.com/watch?v=88NaRUdoWZM (Tim Callaghan: Fractal Tree indexes)
● http://code.google.com/p/leveldb/downloads/list
