Making Big Data Roar
Data Centers are expensive 
Company              | Location            | Data Center Cost | Data Center Size (MW)
NSA                  | Camp Williams, UT   | $2B              | 133
Apple                | Maiden, NC          | $1B              | 67
Internet Villages    | Annandale, Scotland | $1.6B            | 107
Lockerbie DC         | Lockerbie, Scotland | $1.5B            | 100
Social Security      | Baltimore, MD       | $400M            | 27
Next Generation Data | Wales, UK           | $300M            | 20
Facebook             | Prineville, OR      | $215M            | 15
WiredTiger Mission 
WiredTiger is rethinking data 
management for modern hardware 
with a focus on multi-core scalability 
and maximizing the value of every 
byte of RAM.
Database/Storage Ecosystem
A New Data Management Engine 
● Architected for modern computer systems 
● Scalable and able to handle big data 
● High throughput, consistent low latency 
● Row-store, column-store, log structured merge 
● ACID transactions, standard isolation levels 
● Checkpoint and fine-grained durability 
● Supporting columns, indices, projections 
● Production quality, fully supported 
● NoSQL, Open Source
Flexible Storage 
● Access methods tailored to workload 
o Row store (read mostly of all columns) 
o Column store (read mostly of some columns) 
o Log-structured merge trees (mostly random writes) 
● Compact storage format 
o RLE, key-prefix, dictionary and static compression 
o Stream compression 
● Adapt workload to storage (RAM, SSD, HDD)
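The difference between the row-store and column-store access methods can be illustrated with a toy sketch. This is only an in-memory illustration with made-up sample data and hypothetical names (`row_store`, `column_store`); it does not reflect WiredTiger's actual storage format.

```python
# Toy illustration only: not WiredTiger's on-disk format.
rows = [
    (1, "alpha", "UT", 133),
    (2, "beta", "NC", 67),
    (3, "gamma", "OR", 15),
]

# Row store: one record per key, with all columns stored together.
row_store = {key: (name, state, mw) for key, name, state, mw in rows}

# Column store: one sequence per column; positions line up across columns.
column_store = {
    "key": [r[0] for r in rows],
    "name": [r[1] for r in rows],
    "state": [r[2] for r in rows],
    "mw": [r[3] for r in rows],
}


def total_mw_row_store(store):
    # Every full record is visited even though only one column is needed.
    return sum(record[2] for record in store.values())


def total_mw_column_store(store):
    # Only the "mw" column is touched.
    return sum(store["mw"])
```

Both functions return the same total; the column layout simply reads less data when a query needs only some columns.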
Flexible Configuration 
● API offers a simple key/value store, or 
● A complete schema layer 
o Specify data types 
o Map columns to files 
o Automatically maintain indices 
o Queries only read required columns 
o Projections, index-only scans 
● Checkpoint or fine-grained durability
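As a rough sketch of what a schema layer on top of a key/value store involves, the following uses Python's standard `struct` module to give values typed columns and keeps one secondary index up to date on every insert. The class `TinyTable` and its methods are hypothetical names invented for this illustration; they are not WiredTiger's API.

```python
import struct


class TinyTable:
    """Hypothetical schema layer over a dict acting as the key/value store."""

    def __init__(self, columns, value_format, indexed_column):
        self.columns = columns                    # column names, in packing order
        self.packer = struct.Struct(value_format) # data types for the columns
        self.indexed_column = indexed_column
        self.kv = {}                              # primary: key -> packed bytes
        self.index = {}                           # secondary: column value -> keys

    def get(self, key):
        packed = self.kv.get(key)
        if packed is None:
            return None
        return dict(zip(self.columns, self.packer.unpack(packed)))

    def insert(self, key, **values):
        # Drop the stale index entry when overwriting an existing row.
        old = self.get(key)
        if old is not None:
            self.index[old[self.indexed_column]].discard(key)
        self.kv[key] = self.packer.pack(*(values[c] for c in self.columns))
        self.index.setdefault(values[self.indexed_column], set()).add(key)

    def keys_where(self, value):
        # Index-only scan: answered from the index without unpacking any row.
        return sorted(self.index.get(value, ()))
```

For example, `TinyTable(("state", "mw"), "<2sI", "state")` declares a two-byte string column and an unsigned integer column, indexed on `state`.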
Improved Efficiency 
● Higher CPU Utilization 
o Multi-core scalability 
o Minimize contention between threads
o Non-locking algorithms
o Hazard pointers
● Lower Power Costs
● Flash Optimized Block Layout
Consistent High Performance 
● In-cache or I/O bound 
● Workload Configuration 
o Efficient sparse data (column-store)
o Bounded queries and updates (row-store)
o Write-optimized (LSM)
● Data structures for access at RAM speed
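To make the "write-optimized (LSM)" idea concrete, here is a minimal sketch of a log-structured merge tree: writes go to an in-memory table that is flushed as an immutable sorted run, and reads may have to check several runs. `TinyLSM` and the `memtable_limit` parameter are hypothetical, and the sketch makes no claim about how WiredTiger implements its LSM trees.

```python
import bisect


class TinyLSM:
    """Toy LSM tree: an in-memory table plus immutable sorted runs."""

    def __init__(self, memtable_limit=2):
        self.memtable_limit = memtable_limit
        self.memtable = {}
        self.runs = []  # newest first; each run is a sorted list of (key, value)

    def put(self, key, value):
        # Writes only touch the in-memory table, so random inserts stay cheap.
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        if self.memtable:
            self.runs.insert(0, sorted(self.memtable.items()))
            self.memtable = {}

    def get(self, key):
        # Reads may consult several runs, which is the cost of cheap writes.
        if key in self.memtable:
            return self.memtable[key]
        for run in self.runs:  # newest to oldest
            i = bisect.bisect_left(run, (key,))
            if i < len(run) and run[i][0] == key:
                return run[i][1]
        return None

    def merge(self):
        # Compact all runs into one; newer values overwrite older ones.
        merged = {}
        for run in reversed(self.runs):
            merged.update(run)
        self.runs = [sorted(merged.items())] if merged else []
```

Merging in the background keeps the number of runs, and therefore the read cost, bounded.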
Consistent Low Latency 
● Non-locking algorithms 
● Multi-versioned data 
● Optimistic concurrency control
● Deadlock-free transactions
● I/O shifted to background threads
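The combination of multi-versioned data and optimistic concurrency control can be sketched as follows. This is a generic, single-threaded illustration of snapshot reads with a first-committer-wins check; the names `TinyMVCC`, `Txn`, and `Conflict` are hypothetical, and the sketch is not a description of WiredTiger's internal algorithm.

```python
class Conflict(Exception):
    """Raised when another transaction committed a newer version first."""


class TinyMVCC:
    def __init__(self):
        self.versions = {}  # key -> list of (commit_ts, value), oldest first
        self.clock = 0      # timestamp of the latest commit

    def begin(self):
        return Txn(self, self.clock)


class Txn:
    def __init__(self, store, snapshot_ts):
        self.store = store
        self.snapshot_ts = snapshot_ts
        self.writes = {}    # buffered updates; no locks are taken

    def read(self, key):
        if key in self.writes:
            return self.writes[key]
        # Return the newest version visible in this transaction's snapshot.
        for ts, value in reversed(self.store.versions.get(key, [])):
            if ts <= self.snapshot_ts:
                return value
        return None

    def write(self, key, value):
        self.writes[key] = value

    def commit(self):
        # Validate: fail if any written key changed after the snapshot.
        for key in self.writes:
            history = self.store.versions.get(key, [])
            if history and history[-1][0] > self.snapshot_ts:
                raise Conflict(key)
        self.store.clock += 1
        for key, value in self.writes.items():
            self.store.versions.setdefault(key, []).append(
                (self.store.clock, value)
            )
```

Because a transaction never waits on another, a conflict surfaces as an error to retry rather than as a deadlock, and readers keep seeing their snapshot while writers commit.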
Cost Effective 
Metric                    | WiredTiger | InnoDB
iiBench run cost          | $6.44      | $12.88
Cost per billion inserts* | $20.30     | $40.60
● WiredTiger provides a 50% cost savings for the same AWS workload 
● More details on this benchmark are available here.
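The 50% figure can be checked directly from the numbers on this slide. The sketch below assumes that the lower figure in each pair is WiredTiger's and the higher one belongs to the engine it is compared against, which is what the savings claim implies; the helper name `savings` is made up for this example.

```python
def savings(cheaper, dearer):
    """Fractional saving of the cheaper cost relative to the dearer one."""
    return 1 - cheaper / dearer


# Figures taken from the slide, in US dollars.
run_saving = savings(6.44, 12.88)
per_billion_saving = savings(20.30, 40.60)

# Run cost divided by cost per billion inserts gives the size of one run,
# in billions of inserts (assuming both rows describe the same run).
inserts_per_run_billions = 6.44 / 20.30

print(f"run cost saving: {run_saving:.0%}")
print(f"cost per billion inserts saving: {per_billion_saving:.0%}")
print(f"inserts per run: about {inserts_per_run_billions:.2f} billion")
```

Both rows give the same ratio, so the two metrics are consistent with each other.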
Customers
Management Team 
Keith Bostic is a founder and architect at WiredTiger. He was a founder of Sleepycat Software
(acquired by Oracle Corp. in 2006) and one of the architects of Berkeley DB, the most widely used
embedded data management software in the world.
Mr. Bostic was one of the architects of the University of California, Berkeley, 2.10BSD and 4BSD releases,
where he led the 4BSD release Open Source effort. He is the recipient of a USENIX Association
Lifetime Achievement Award (The Flame), which recognizes singular contributions to the UNIX 
community. 
Dr. Michael Cahill is a founder and architect at WiredTiger. He was an architect of Berkeley DB at 
Sleepycat Software and Oracle Corp., responsible for design and implementation of multiversion 
concurrency control, as well as SQL interfaces and programming language APIs. Previously, Dr. 
Cahill was CTO at Bullant Technology, which grew tenfold and raised over US$30 million from 
investors including Intel Capital and JP Morgan during his three-year tenure.
Dr. Cahill’s PhD from the University of Sydney is in the area of transaction processing and 
concurrency control. His work on a new algorithm for implementing serializable isolation received an 
ACM SIGMOD Best Paper award and was added to PostgreSQL 9.1.
Summary and Next Steps 
We’d like to discuss how we could help you 
with your solution. 
Thanks! Questions? info@wiredtiger.com

WiredTiger Overview

Editor's Notes

  • #3: The best number available to estimate the cost of a data center is the number of power supplies: that number determines heating and cooling costs, as well as hardware and software (license units) costs. While the number of CPUs per power supply continues to increase, CPUs are no longer getting faster, and at the data center level we need to look at software efficiencies to gain further scale beyond what the hardware can deliver. For the foreseeable future, multi-core scaling is key to better performance and increased efficiency. Common indexing technology in use today was written for the computer architectures of the early 1990s, so better software efficiency yields huge benefits.
  • #4: WiredTiger is focused on single-node data management in service of high-end applications, improving application scalability and efficiency via software innovation.
  • #5: WiredTiger is entirely focused on single-node resource cost per transaction. WiredTiger does not include data distribution or other horizontal scaling software. WiredTiger is intended for applications running on a single node which require the maximum possible performance from the indexing technology, or as a storage technology for applications supporting their own horizontal scaling solutions.
  • #7: Row-store is a traditional database object, where keys are byte strings and all columns of a row are stored together, best for read-mostly workloads where all columns are equally valuable. Column-store groups columns in storage and only the necessary columns are read to satisfy a query. Log-structured merge trees (LSM) support high-speed random inserts, at the cost of slower reads. WiredTiger supports all three access methods and the access methods can be combined (for example, a sparse, wide table configured with a column-store primary, where indexes are stored in an LSM tree). WiredTiger supports a large number of compression algorithms:
      o RLE: run-length encoding when columns repeat
      o Key-prefix: Btree key-prefix compression
      o Dictionary: unique columns only stored once per write block
      o Static: Huffman encoding
      o Stream: pluggable stream compression (for example, snappy or zlib); because WiredTiger supports variable-length blocks, stream compression can be applied in all cases, unlike engines where compression must operate in block-sized units.
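Two of the compression schemes named in this note, run-length encoding and key-prefix compression, are simple enough to sketch. The functions below are generic textbook versions with hypothetical names, written only to show the idea; they are not WiredTiger's encoders or its block format.

```python
def rle_encode(values):
    """Run-length encoding: collapse repeats into (value, count) pairs."""
    runs = []
    for v in values:
        if runs and runs[-1][0] == v:
            runs[-1][1] += 1
        else:
            runs.append([v, 1])
    return [(v, n) for v, n in runs]


def rle_decode(runs):
    return [v for v, n in runs for _ in range(n)]


def prefix_encode(sorted_keys):
    """Key-prefix compression: store (shared prefix length, remaining suffix)."""
    out, prev = [], ""
    for key in sorted_keys:
        shared = 0
        limit = min(len(prev), len(key))
        while shared < limit and prev[shared] == key[shared]:
            shared += 1
        out.append((shared, key[shared:]))
        prev = key
    return out


def prefix_decode(encoded):
    keys, prev = [], ""
    for shared, suffix in encoded:
        prev = prev[:shared] + suffix
        keys.append(prev)
    return keys
```

Prefix compression pays off on sorted keys, where neighbouring keys tend to share long prefixes, which is why it fits naturally into a Btree page.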
  • #9: Unlike other indexing technologies, for example LevelDB and InnoDB, WiredTiger scales linearly as additional cores are added.
  • #10: iiBench is a standard benchmark used to measure MySQL performance. Compared to InnoDB, WiredTiger showed consistently better query rates . . .
  • #11: . . . and much more consistent latency as you scale rows in the data-store.
  • #12: The ultimate benefit to the customer is reduced cost. This chart shows the cost of a billion inserts on an Amazon Web Services instance for the popular engine InnoDB versus WiredTiger: WiredTiger returns twice the performance on a typical AWS instance.