How Shit Works: Storage
Tomer Gabel, Wix
@ GeeCON Kraków 2016
Like all good stories…
• We’ll start with a question.
• “What’s wrong with this picture?”
MY, OH, MY.
WHAT COULD IT BE?
Axioms
• Not a trick question
– Servers are properly configured
– System architecture makes sense
– No obvious bugs
– No scheduled jobs
• So what else goes bump in the night?
PROLOGUE
“A LAUGHABLE CLAIM”
I/O is simple
• Just open a file, write, flush, close
• Nothing to it, right?
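A minimal sketch of the "open, write, flush, close" sequence, in Python. The helper name `write_durably` and the extra `os.fsync` call are assumptions added for illustration and are not from the talk: `flush()` only empties the process's own buffer, while `os.fsync` asks the kernel to push the data toward the device.

```python
import os
import tempfile


def write_durably(path: str, data: bytes) -> int:
    """Open, write, flush, fsync and close. Returns the number of bytes written."""
    with open(path, "wb") as f:      # open
        written = f.write(data)      # write (lands in a user-space buffer first)
        f.flush()                    # flush the user-space buffer to the kernel
        os.fsync(f.fileno())         # assumption: also ask the kernel to sync to the device
    return written                   # leaving the "with" block closes the file


if __name__ == "__main__":
    demo_path = os.path.join(tempfile.mkdtemp(), "demo.bin")
    print(write_durably(demo_path, b"hello, disk"))
```

Even this short function crosses every layer the following slides peel back.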
[Diagram: Application → File → HDD]
I/O is simple
• A little closer…
[Diagram: Application → File → Kernel (Virtual File System → File system (ext4) → Logical Volume Manager → I/O scheduler → SCSI driver stack) → HDD]
I/O is simple
• But really…
[Diagram: Application → File → Kernel (Storage Subsystem, System Bus Drivers) → Hardware (PCI Express Bus, SATA Controller) → HDD]
THE ONION OF ABSTRACTION
ACT I: THESE BOOTS ARE MADE FOR WALKIN’
Everybody knows...
• Sequential access is fast
• Random access is slow
• … so what?
Everybody knows…
“Disk seeks are a huge performance
bottleneck… When the amount of data
starts to grow so large that effective
caching becomes impossible… you
need at least one disk seek to read and
a couple of disk seeks to write things.”
-- MySQL Reference Manual (8.12.3)
But why?
Rotational Latency
Throughput
• So you understand latency…
• What about throughput?
• Depends on two factors:
– Areal density
– Newtonian physics
Areal Density
Interlude: Math
• Rotation is fixed
– Constant angular velocity (CAV)
• Newton tells us that…
v = ω ∙ r
• Throughput increases with radius!
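To put numbers on v = ω ∙ r, here is a small sketch. The platter radii are illustrative assumptions, not figures from the talk, and reading linear velocity as a proxy for throughput assumes bits are packed at roughly the same density along every track.

```python
import math

# Assumed, illustrative radii for the innermost and outermost tracks.
INNER_RADIUS_M = 0.020
OUTER_RADIUS_M = 0.046


def linear_velocity_m_s(rpm: float, radius_m: float) -> float:
    """v = omega * r, with omega converted from revolutions/minute to radians/second."""
    omega = rpm * 2 * math.pi / 60
    return omega * radius_m


if __name__ == "__main__":
    inner = linear_velocity_m_s(7200, INNER_RADIUS_M)
    outer = linear_velocity_m_s(7200, OUTER_RADIUS_M)
    print(f"inner track: {inner:.1f} m/s, outer track: {outer:.1f} m/s")
    print(f"outer/inner ratio: {outer / inner:.2f}")
```

Because ω is the same everywhere on the platter, the ratio between the two velocities is simply the ratio of the radii.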
Interlude: Math
• Commodity drives are available at:
– 5400-15000 RPM
– Usually 7200 RPM
• What does it mean for latency?
7200 RPM / 60 = 120 revolutions/second
1 / 120 = 0.00833 s ≈ 8.33 ms per revolution!
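The same arithmetic works for any spindle speed. A short sketch (function names are hypothetical): one full revolution is the worst case, and on average the head waits about half a revolution for the right sector to come around.

```python
def revolution_time_ms(rpm: float) -> float:
    """Time for one full platter revolution, in milliseconds."""
    return 60_000 / rpm


def average_rotational_latency_ms(rpm: float) -> float:
    """On average the target sector is half a revolution away."""
    return revolution_time_ms(rpm) / 2


if __name__ == "__main__":
    for rpm in (5400, 7200, 15000):
        print(f"{rpm:>5} RPM: {revolution_time_ms(rpm):.2f} ms per revolution, "
              f"{average_rotational_latency_ms(rpm):.2f} ms average wait")
```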
In practice?
• Modern drives give you:
– 200+ MB/s
– 300 IOPS
• Pure random access nets only 1.2 MB/s!
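Where the 1.2 MB/s figure plausibly comes from: IOPS multiplied by the size of each request. The 4 KB request size is an assumption (the slide does not state it), chosen because it reproduces the number.

```python
def random_throughput_mb_s(iops: float, io_size_bytes: int = 4096) -> float:
    """Throughput of purely random access: one small request per I/O operation."""
    return iops * io_size_bytes / 1_000_000


if __name__ == "__main__":
    random_mb_s = random_throughput_mb_s(300)   # assumed 4 KB per request
    sequential_mb_s = 200                        # "200+ MB/s" from the slide
    print(f"random: {random_mb_s:.2f} MB/s")
    print(f"sequential is roughly {sequential_mb_s / random_mb_s:.0f}x faster")
```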
RIGHT.
WHAT CAN WE DO ABOUT IT?
Fine-tuning
• Provision more RAM
• Careful index structure
– Represent IPs as UNSIGNED INT for 75% reduction
– Implement better UUIDs¹ for 30% reduction
¹ Store UUID in an optimized way, Percona blog
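A sketch of the IP-as-integer idea using Python's standard `ipaddress` module. The 16-byte text size used for comparison is an assumption (15 characters for a dotted quad plus one length byte); it is one way to arrive at the 75% figure, not necessarily the speaker's calculation.

```python
import ipaddress


def ip_to_uint32(ip: str) -> int:
    """Pack a dotted-quad IPv4 address into a 32-bit unsigned integer."""
    return int(ipaddress.IPv4Address(ip))


def uint32_to_ip(value: int) -> str:
    """Inverse of ip_to_uint32."""
    return str(ipaddress.IPv4Address(value))


def size_reduction(text_bytes: int = 16, int_bytes: int = 4) -> float:
    """Fraction of storage saved; text_bytes=16 is an assumed column size."""
    return 1 - int_bytes / text_bytes


if __name__ == "__main__":
    packed = ip_to_uint32("192.168.1.10")
    print(packed, uint32_to_ip(packed))
    print(f"{size_reduction():.0%} smaller")
```

A smaller key means more index entries per page, so more of the index stays in RAM and fewer seeks are needed.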
… or use a sledgehammer!
• RAID 0 (and variants) employ striping
• Data is distributed to multiple spindles
• If it sounds familiar…
– It is!
– We call it “sharding”
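Striping can be sketched as a pure address calculation, which is also why it resembles sharding. The function name and the stripe-size parameter are hypothetical; real RAID implementations add metadata, alignment and failure handling.

```python
from typing import Tuple


def locate(logical_block: int, num_disks: int, stripe_blocks: int = 1) -> Tuple[int, int]:
    """Map a logical block to (disk index, block index on that disk) for RAID 0."""
    stripe_index, offset_in_stripe = divmod(logical_block, stripe_blocks)
    disk = stripe_index % num_disks                       # stripes rotate across the disks
    block_on_disk = (stripe_index // num_disks) * stripe_blocks + offset_in_stripe
    return disk, block_on_disk


if __name__ == "__main__":
    # Eight consecutive logical blocks spread over four spindles.
    for lba in range(8):
        print(lba, locate(lba, num_disks=4))
```

Consecutive blocks land on different spindles, so a large sequential read keeps all of them busy at once.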
It’s turtles all the way down
• Don’t jump to conclusions!
– RAID 0 is impractical
– RAID 5 may be slow
– RAID 10 is expensive
– etc.
• Do your homework
• Benchmark!
ACT II: I’LL USE MY CREDIT CARD
Let’s talk SSDs
• Non-volatile RAM
• Lots of IOPS
• Expensive :-)
• Same caveats apply…
Let’s talk SSDs
• Value starts at “1”
• Electrons accrue in the floating gate
• After programming, value becomes “0”
• Electrons are drained to reset value to “1”
Surprise and Terror
• “Draining” is destructive!
• Limited erases
• Limited lifespan!
Wear Leveling
Caveats, remember?
• Addressing
– Cells (1 bit) – not addressable
– Pages (0.5-8KB)
– Blocks (32-64 pages)
• Why do you care?
– Reads/writes on a page
– But erasure on a block
Write Amplification
[Figure: changing a single bit forces the whole block to be rewritten. Δ = 1 bit → Δ = 1 block!]
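A back-of-the-envelope sketch of the worst case, using the page and block sizes quoted earlier (0.5-8 KB pages, 32-64 pages per block). The function names are hypothetical, and the model assumes the change fits inside one block that must be erased and rewritten in full; real controllers often avoid this by writing the updated page elsewhere and remapping it.

```python
def block_size_bytes(page_size_bytes: int, pages_per_block: int) -> int:
    """Erase-unit size: erasure works on whole blocks, not on pages."""
    return page_size_bytes * pages_per_block


def worst_case_write_amplification(changed_bytes: int,
                                   page_size_bytes: int,
                                   pages_per_block: int) -> float:
    """Bytes physically rewritten per byte logically changed (single-block change)."""
    if changed_bytes <= 0:
        raise ValueError("changed_bytes must be positive")
    return block_size_bytes(page_size_bytes, pages_per_block) / changed_bytes


if __name__ == "__main__":
    print(block_size_bytes(512, 32))    # smallest block in the quoted ranges
    print(block_size_bytes(8192, 64))   # largest block in the quoted ranges
    print(worst_case_write_amplification(1, 4096, 64))
```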
Surprising Results
• Defragmentation
– Relocates blocks
– Contiguous files
– Lower LBAs
– Background job
• Bad, bad, bad!
– No benefit with SSDs
– Major write load!
Background GC
[Figure: before GC, pages 7, 5, 6, 1 and 2 are spread across Blocks A-D; after GC, they are packed together as 1 2 5 / 6 7]
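A toy model of what background GC does, loosely following the figure: live pages are copied into as few blocks as possible and everything else is erased. The block layout, the `STALE`/`FREE` markers and the three-pages-per-block size are all assumptions for illustration; sorting the live pages is only there to make the output deterministic.

```python
from typing import Dict, List, Union

STALE = "stale"   # page holds data that is no longer needed
FREE = "free"     # page is erased and ready to be programmed

Page = Union[int, str]


def collect(blocks: Dict[str, List[Page]], pages_per_block: int) -> Dict[str, List[Page]]:
    """Pack live pages (ints) into the first blocks; the rest end up erased."""
    live = sorted(p for block in blocks.values() for p in block if isinstance(p, int))
    compacted: Dict[str, List[Page]] = {}
    for i, name in enumerate(sorted(blocks)):
        chunk: List[Page] = list(live[i * pages_per_block:(i + 1) * pages_per_block])
        compacted[name] = chunk + [FREE] * (pages_per_block - len(chunk))
    return compacted


if __name__ == "__main__":
    before = {
        "A": [7, STALE, 5],
        "B": [STALE, 6, STALE],
        "C": [1, STALE, STALE],
        "D": [STALE, 2, STALE],
    }
    for name, pages in collect(before, pages_per_block=3).items():
        print(name, pages)
```

Every live page copied this way is an extra write the application never asked for, which is why GC adds to write amplification.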
Surprising Results
• What happens when you delete a file?
– Not much
– Bit flip on file table
– Space is not reclaimed
• Result?
– SATA TRIM command
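Why TRIM matters, as a toy simulation. `ToySsd` and `ToyFileSystem` are hypothetical classes, not a real driver API: deleting a file only updates the file table, so unless the file system also sends TRIM, the drive still believes the pages are live and keeps carrying them through garbage collection.

```python
from typing import Dict, List, Set


class ToySsd:
    """Hypothetical drive model: remembers which logical pages it thinks are live."""

    def __init__(self) -> None:
        self.live_pages: Set[int] = set()

    def write(self, lba: int) -> None:
        self.live_pages.add(lba)

    def trim(self, lba: int) -> None:
        self.live_pages.discard(lba)   # the drive may now reclaim this page


class ToyFileSystem:
    """Hypothetical file system: a file table mapping names to logical pages."""

    def __init__(self, ssd: ToySsd, use_trim: bool) -> None:
        self.ssd = ssd
        self.use_trim = use_trim
        self.file_table: Dict[str, List[int]] = {}

    def create(self, name: str, lbas: List[int]) -> None:
        for lba in lbas:
            self.ssd.write(lba)
        self.file_table[name] = list(lbas)

    def delete(self, name: str) -> None:
        lbas = self.file_table.pop(name)   # the "bit flip on file table"
        if self.use_trim:
            for lba in lbas:
                self.ssd.trim(lba)


def live_pages_after_delete(use_trim: bool) -> int:
    ssd = ToySsd()
    fs = ToyFileSystem(ssd, use_trim)
    fs.create("a.log", [10, 11, 12])
    fs.delete("a.log")
    return len(ssd.live_pages)


if __name__ == "__main__":
    print("without TRIM:", live_pages_after_delete(False))
    print("with TRIM:   ", live_pages_after_delete(True))
```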
SSD Takeaways
• A moving target
–File systems
–Data structures
–Longevity
• As usual:
–Benchmark
–Monitor
EPILOGUE
“LET ME EMBRACE THEE, SOUR ADVERSITY,
FOR WISE MEN SAY IT IS THE WISEST COURSE.”
WE’RE DONE HERE!
… AND YES, WE’RE HIRING :-)
Thank you for listening
tomer@tomergabel.com
@tomerg
http://il.linkedin.com/in/tomergabel
Wix Engineering blog:
http://engineering.wix.com
