@ANDY_PAVLO
WHAT NON-VOLATILE MEMORY MEANS FOR THE FUTURE OF DATABASE SYSTEMS
See all the presentations from the In-Memory Computing
Summit at http://imcsummit.org
1973
1974
1978
1986
1994
2010
IMC Summit 2016 Breakout - Andy Pavlo - What Non-Volatile Memory Means for the Future of Database Management Systems
2016
The
Non-Volatile Memory
• Persistent storage with byte-addressable operations.
• Fast read/write latencies.
• No difference between random and sequential access.
What does NVM mean for DBMSs?
• Thinking of NVM as just a faster
SSD is not interesting.
• We want to use NVM as
permanent storage for the
database, but this has major
implications.
–Operating System Support
Existing Systems
NVM-Only Storage
Hybrid DBMS
Chapter I – Existing Systems
• Investigate how existing systems
perform with NVM for write-heavy
transaction processing (OLTP)
workloads.
• Evaluate two types of DBMS
architectures.
–Disk-oriented (MySQL)
–In-memory (H-Store)
A PROLEGOMENON ON OLTP DATABASE
SYSTEMS FOR NON-VOLATILE MEMORY
ADMS@VLDB 2015
DISK-ORIENTED: Buffer Pool, Table Heap, Log, Snapshots
IN-MEMORY: Table Heap, Log, Snapshots
Intel Labs NVM Emulator
• Instrumented motherboard that
slows down access to the memory
controller with tunable latencies.
• Special assembly to emulate
upcoming Xeon instructions for
flushing cache lines.
[Figure: STORE, STORE, L1 Cache, L2 Cache, PCOMMIT]
Experimental Evaluation
• Compare architectures on Intel
Labs NVM emulator.
• Yahoo! Cloud Serving Benchmark:
–10 million records (~10GB)
–Database is 8x larger than available memory
–Variable skew
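The sizing behind this setup can be checked with a little arithmetic. The sketch below assumes the roughly 1KB record size mentioned in the editor's notes; the constant and function names are hypothetical and exist only for illustration.

```python
# Sizing sketch for the YCSB setup. The ~1 KB record size is taken from the
# editor's notes; every name below is an illustrative assumption.
RECORD_COUNT = 10_000_000
RECORD_SIZE_BYTES = 1_000      # "about 1KB" per record (approximation)
DB_TO_MEMORY_RATIO = 8         # the database is 8x larger than memory


def database_size_bytes(records: int, record_size: int) -> int:
    """Total bytes needed to hold every record."""
    return records * record_size


def buffer_pool_bytes(db_bytes: int, ratio: int) -> int:
    """Memory budget when only 1/ratio of the database fits in DRAM."""
    return db_bytes // ratio


db_bytes = database_size_bytes(RECORD_COUNT, RECORD_SIZE_BYTES)
pool_bytes = buffer_pool_bytes(db_bytes, DB_TO_MEMORY_RATIO)

print(f"database: {db_bytes / 1e9:.1f} GB")       # database: 10.0 GB
print(f"buffer pool: {pool_bytes / 1e9:.2f} GB")  # buffer pool: 1.25 GB
```

With an eighth of the data in DRAM, most accesses under low skew have to go to the NVM-backed storage, which is presumably why skew is the variable in the experiment.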
YCSB
[Figure: throughput (TXN/S, 0 to 40,000) of MySQL and H-Store versus skew amount on the 50% reads / 50% writes workload, at 2x and 8x latency relative to DRAM]
LESSONS
1. NVM latency does not have a large impact.
2. Logging is a major performance bottleneck.
3. Legacy DBMSs are not …
What would
Larry Ellison
do?
Chapter II – NVM-only Storage
• Evaluate storage and recovery
methods for a system that only
has NVM.
• Testbed DBMS with pluggable
storage engines.
• We had to build our own NVM-aware
memory allocator.
LET'S TALK ABOUT STORAGE & RECOVERY
METHODS FOR NON-VOLATILE MEMORY
DATABASE SYSTEMS
SIGMOD 2015
DBMS Architectures
In-Place: Table Heap, Log + Snapshots
Copy-on-Write: Table Heap, No Logging
Log-Structured: No Table Heap, Log-only Storage
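To make the difference between the three architectures concrete, here is a toy sketch of how each one might handle the same update. It is not the testbed DBMS from the talk: the class names are hypothetical, plain Python dicts and lists stand in for durable storage, and snapshots, transactions, and compaction are left out.

```python
class InPlaceEngine:
    """Table heap overwritten in place, protected by a write-ahead log."""

    def __init__(self):
        self.heap = {}
        self.log = []  # (key, before, after) entries

    def update(self, key, value):
        self.log.append((key, self.heap.get(key), value))  # log first
        self.heap[key] = value                              # then overwrite

    def read(self, key):
        return self.heap.get(key)


class CopyOnWriteEngine:
    """Updates go to a shadow copy; switching the root publishes them. No log."""

    def __init__(self):
        self.current = {}

    def update(self, key, value):
        shadow = dict(self.current)  # a real engine would copy only touched pages
        shadow[key] = value
        self.current = shadow        # root switch makes the new version visible

    def read(self, key):
        return self.current.get(key)


class LogStructuredEngine:
    """No table heap: every write is appended, reads take the newest entry."""

    def __init__(self):
        self.log = []

    def update(self, key, value):
        self.log.append((key, value))

    def read(self, key):
        for logged_key, value in reversed(self.log):
            if logged_key == key:
                return value
        return None


engines = [InPlaceEngine(), CopyOnWriteEngine(), LogStructuredEngine()]
results = {}
for engine in engines:
    engine.update(123, "ABC")
    engine.update(123, "XYZ")
    results[type(engine).__name__] = engine.read(123)
print(results)
```

All three return the latest value; what differs is how many copies of the data each one writes and what it has to replay after a crash.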
In-Place Engine
UPDATE table SET val=ABC
WHERE id=123
[Figure: Table Heap, Log, and Snapshots on NVM; steps 1-3 write a Delta Record and two New Tuple copies]
NVM-Optimized Architectures
• Use non-volatile pointers to only
record what changed rather than
how it changed.
• Be careful about how & when
things get flushed from CPU
caches to NVM.
NVM-Aware In-Place Engine
[Figure: Table Heap (Tuple Pointers, New Tuple) and Log (Log Record with TxnId and Pointer); steps 1-2]
UPDATE table SET val=ABC
WHERE id=123
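One way to picture a log that records only what changed is a log record holding a transaction id and a pointer instead of a copy of the tuple. The sketch below is an assumption-laden illustration, not the engine from the talk: a dict of integer addresses stands in for the NVM allocator, and the class and method names are hypothetical. Because new tuples are written to (simulated) NVM before commit, recovery only has to undo uncommitted transactions; there is nothing to redo.

```python
class NvmInPlaceEngine:
    """Toy undo-only engine: the log stores pointers, never tuple contents."""

    def __init__(self):
        self.nvm = {}       # address -> tuple; stands in for an NVM allocator
        self.next_addr = 0
        self.index = {}     # tuple id -> address of the current version
        self.log = []       # (txn_id, tuple_id, old_addr) undo records

    def _alloc(self, value):
        addr = self.next_addr
        self.next_addr += 1
        self.nvm[addr] = value  # a real engine would flush cache lines here
        return addr

    def update(self, txn_id, tuple_id, value):
        new_addr = self._alloc(value)                  # write the new tuple
        old_addr = self.index.get(tuple_id)
        self.log.append((txn_id, tuple_id, old_addr))  # log a pointer only
        self.index[tuple_id] = new_addr                # swing the tuple pointer

    def commit(self, txn_id):
        # Changes are already durable, so commit just drops the undo records.
        self.log = [rec for rec in self.log if rec[0] != txn_id]

    def recover(self):
        # Undo uncommitted transactions in reverse order; no redo pass.
        for _txn_id, tuple_id, old_addr in reversed(self.log):
            if old_addr is None:
                self.index.pop(tuple_id, None)
            else:
                self.index[tuple_id] = old_addr
        self.log.clear()

    def read(self, tuple_id):
        addr = self.index.get(tuple_id)
        return None if addr is None else self.nvm[addr]


engine = NvmInPlaceEngine()
engine.update(txn_id=1, tuple_id=123, value="OLD")
engine.commit(1)
engine.update(txn_id=2, tuple_id=123, value="ABC")  # crash before commit
value_before_recovery = engine.read(123)
engine.recover()
value_after_recovery = engine.read(123)
print(value_before_recovery, value_after_recovery)
```

The undo record is a few integers regardless of tuple size, which is the intuition behind logging pointers instead of data.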
Evaluation
• Testbed system using the Intel
NVM hardware emulator.
• Yahoo! Cloud Serving Benchmark
–2 million records + 1 million
transactions
–High-skew setting
YCSB
[Figure: throughput (TXN/SEC, 0 to 1,200,000) of the In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized, on the 10% reads / 90% writes workload at 2x latency relative to DRAM. Annotations: ↑63%, ↑122%, ↑50%]
YCSB
[Figure: axis labeled NVM (0 to 350) for the In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized, on the 10% reads / 90% writes workload at 2x latency relative to DRAM. Annotations: ↓40%, ↓25%, ↓20%]
YCSB
[Figure: RECOVERY. Elapsed time to replay log with varying log s… (log scale, 0.01 to 1000) at 10^3, 10^4, and 10^5 for the In-Place, Copy-on-Write, and Log-Structured engines, Traditional vs. NVM-Optimized, at 2x latency relative to DRAM. Label: No Recovery Needed]
LESSONS
1. Using NVM correctly improves throughput & reduces wear-down.
2. Avoid block-oriented components.
3. NVM-only systems are …
What would
Nikita Kahn
do?
Chapter III – Hybrid DBMS
• Design and build a new in-memory
DBMS that will be ready
for NVM when it becomes
available.
• Hybrid Storage + Hybrid
Workloads
–DRAM + NVM oriented architecture
Adaptive Storage
Original
Data
Adapted
Data
SELECT AVG(B)
FROM myTable
WHERE C < 'yyy';
UPDATE myTable
SET A = 123,
B = 456,
C = 789
WHERE D = 'xxx';
[Figure: table with columns A B C D, with Hot and Cold regions]
BRIDGING THE ARCHIPELAGO BETWEEN ROW-STORES AND
COLUMN-STORES FOR HYBRID WORKLOADS
SIGMOD 2016
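The hot/cold split on this slide can be sketched as a table that keeps recently written tuples in a row layout and moves older tuples to a column layout. This is only an illustration under assumptions: the class and method names are hypothetical, column C holds integers here instead of the strings on the slide, and the real system's storage layout is more general than this two-region toy.

```python
from statistics import mean

COLUMNS = ("A", "B", "C", "D")


class AdaptiveTable:
    """Toy table: hot tuples in row layout, cold tuples in column layout."""

    def __init__(self):
        self.hot_rows = []                            # list of row dicts
        self.cold_columns = {c: [] for c in COLUMNS}  # one list per column

    def insert(self, row):
        self.hot_rows.append(dict(row))

    def cool_down(self):
        """Migrate every hot row into the column layout."""
        for row in self.hot_rows:
            for column in COLUMNS:
                self.cold_columns[column].append(row[column])
        self.hot_rows.clear()

    def update_where_d(self, d_value, **new_values):
        """UPDATE ... WHERE D = d_value, applied to both regions."""
        for row in self.hot_rows:
            if row["D"] == d_value:
                row.update(new_values)
        for i, d in enumerate(self.cold_columns["D"]):
            if d == d_value:
                for column, value in new_values.items():
                    self.cold_columns[column][i] = value

    def avg_b_where_c_below(self, bound):
        """SELECT AVG(B) WHERE C < bound; the cold scan reads only B and C."""
        cold = zip(self.cold_columns["B"], self.cold_columns["C"])
        values = [b for b, c in cold if c < bound]
        values += [row["B"] for row in self.hot_rows if row["C"] < bound]
        return mean(values) if values else None


table = AdaptiveTable()
table.insert({"A": 1, "B": 10, "C": 5, "D": "xxx"})
table.insert({"A": 2, "B": 20, "C": 50, "D": "zzz"})
table.cool_down()                                    # first two rows go cold
table.insert({"A": 3, "B": 60, "C": 7, "D": "xxx"})  # this row stays hot

avg_before = table.avg_b_where_c_below(10)
table.update_where_d("xxx", A=123, B=456, C=789)
avg_after = table.avg_b_where_c_below(100)
print(avg_before, avg_after)
```

The analytical query touches two of the four columns in the cold region, while the update rewrites most of a row, which is the tension between the two layouts that the hybrid workload exposes.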
LESSONS
Peloton
The Self-Driving
Anthony Tomasic
Todd Mowry
Joy Arulraj
Prashanth Menon
Michael Zhang
Lin Ma
Matthew Perron
Dana Van Aken
Yingjun Wu
Ran Xian
Runshen Zhu
Jiexi Lin
Jianhong Li
Ziqi Wang
http://pelotondb.org
@ANDY_PAVLO

Editor's Notes

  • #4: Adda Quinn - 1967 to 1974. Married for seven years. No children.
  • #5: Nancy Wheeler Jenkins. Married for six months. Nancy sold her stake in the Oracle Corporation to Larry for $500.
  • #6: Barbara Boothe - 1983 to 1986 Former receptionist at Oracle.
  • #7: Thinking Machines Corporation filed for bankruptcy, which ended the era of single-purpose database machines.
  • #8: Melanie Craft – 2003 to 2010 Romance novelist.
  • #18: What I want to share with you is two sets of experiments that we’ve done to evaluate the performance of this new version of H-Store. We’re going to compare the performance of H-Store with the MMAP storage manager against an installation of MySQL that we’ve tuned for OLTP workloads. We’re going to use the YCSB benchmark with 10 million records. Each record is about 1KB so that comes out to be about 10GB. For H-Store, we’re going to allow the system to allocate enough memory from PMFS to store the entire database. For MySQL, we’re going to set the buffer pool size such that only an eighth of the database fits in DRAM. This ensures that the systems are reading and writing to PMFS enough for their systems.
  • #24: Choice #1: In-place updates. Table heap with a write-ahead log + snapshots. Example: VoltDB. Choice #2: Copy-on-write. Create a shadow copy of the table when updated. No write-ahead log. Example: LMDB. Choice #3: Log-structured. All writes are appended to the log. No table heap. Example: RocksDB.
  • #26: Dirty cache lines from an uncommitted txn can be flushed by hardware to the memory controller. No REDO log because we flush all the changes to NVM at the time of commit.
  • #32: Using NVM correctly improves throughput by up to 5.5x and reduces writes by up to 2x.
  • #42: The allocator writes back CPU cache lines to NVM using the PCOMMIT instruction. It then issues an SFENCE instruction to wait for the data to become durable on NVM.
  • #43: If the DBMS restarts, we need to make sure that all of the pointers for in-memory data point to the same data. The allocator ensures that virtual memory addresses assigned to a memory-mapped region never change even after the OS or DBMS restarts.
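
The last two notes describe an allocator that makes writes durable and keeps references valid across restarts. The sketch below approximates that idea with a memory-mapped file, under clearly different assumptions: `mmap.flush()` stands in for the cache-line write-back and fence, and references are stored as offsets into the region instead of fixed virtual addresses, which is a substitute technique and not what the notes describe. `PersistentRegion` and its methods are hypothetical names.

```python
import mmap
import os
import struct
import tempfile

REGION_SIZE = 4096
HEADER = struct.Struct("<Q")  # first 8 bytes: offset of the next free byte


class PersistentRegion:
    """Toy bump allocator over a memory-mapped file (a stand-in for NVM)."""

    def __init__(self, path):
        is_new = not os.path.exists(path)
        self.fd = os.open(path, os.O_RDWR | os.O_CREAT)
        if is_new:
            os.ftruncate(self.fd, REGION_SIZE)
        self.map = mmap.mmap(self.fd, REGION_SIZE)
        if is_new:
            self._set_next_free(HEADER.size)
            self.map.flush()

    def _next_free(self):
        return HEADER.unpack_from(self.map, 0)[0]

    def _set_next_free(self, offset):
        HEADER.pack_into(self.map, 0, offset)

    def alloc_and_persist(self, payload):
        """Copy payload into the region and return its offset."""
        offset = self._next_free()
        end = offset + len(payload)
        if end > REGION_SIZE:
            raise MemoryError("region is full")
        self.map[offset:end] = payload
        self._set_next_free(end)
        self.map.flush()  # stand-in for cache-line write-back plus fence
        return offset

    def read(self, offset, length):
        return bytes(self.map[offset:offset + length])

    def close(self):
        self.map.close()
        os.close(self.fd)


path = os.path.join(tempfile.mkdtemp(), "region.nvm")

region = PersistentRegion(path)
off = region.alloc_and_persist(b"tuple-123")
region.close()                      # simulate a DBMS restart

reopened = PersistentRegion(path)
recovered = reopened.read(off, len(b"tuple-123"))
reopened.close()
print(off, recovered)
```

Offsets survive a restart no matter where the operating system maps the region, at the cost of an extra addition on every dereference; pinning the mapping to a fixed address avoids that cost but needs operating system support.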