LEARNING TO BUILD
DISTRIBUTED SYSTEMS
THE HARD WAY
@iconara
BIG DATA
speakerdeck.com/u/iconara
(real time!)
Theo / @iconara
chief architect at BURT
let’s make online advertising a great experience
MAKING THIS
INTO THIS
HOW HARD CAN IT BE?
30K REQUESTS
PER SECOND
more than a billion requests per day,
over 1 TB raw data
ONE VISIT CAN
CHANGE UP TO
100K COUNTERS
hundreds of millions of individual counters per day,
plus counting uniques and visitor histories
IN REAL TIME
or near real time, if you want to be pedantic
HOW HARD CAN IT BE?
START WITH TWO
OF EVERYTHING
going from one to two is the hardest,
solve the scaling problem up front
START WITH THREE
OF EVERYTHING
you’ll solve the scaling problem,
and need less overcapacity
GIVE A LOT OF
THOUGHT TO
KEYS AND IDS
and think about your queries first
MEIHO0 JME57Z
a timestamp, then something random:
monotonically increasing, sorts nicely
JME57Z MEIHO0
something random, then a timestamp:
uniformly distributed, works nicely with sharding
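The two key layouts above can be sketched in Ruby. This is a minimal illustration, not code from the talk: the helper names (timestamp_part, random_part, sortable_id, shardable_id) and the use of SecureRandom for the random half are assumptions made for the example.

```ruby
require 'securerandom'

# Base-36 alphabet as used on the slides: digits 0-9 followed by A-Z.
ALPHABET = (('0'..'9').to_a + ('A'..'Z').to_a).freeze

# Six base-36 characters derived from a Unix timestamp in seconds.
def timestamp_part(time = Time.now)
  time.to_i.to_s(36).upcase.rjust(6, '0')
end

# Six random base-36 characters (hypothetical helper for this sketch).
def random_part(length = 6)
  Array.new(length) { ALPHABET[SecureRandom.random_number(ALPHABET.size)] }.join
end

# Timestamp first: keys created later sort after keys created earlier.
def sortable_id(time = Time.now)
  timestamp_part(time) + random_part
end

# Random first: keys are spread over the whole key space,
# which suits sharding by key prefix or key range.
def shardable_id(time = Time.now)
  random_part + timestamp_part(time)
end
```

Which layout fits depends on the queries: range scans by time favour the first, even write distribution favours the second.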
CONSISTENCY IS
OVERRATED
don’t fear R + W < N
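For reference, N is the number of replicas, W the number that must acknowledge a write, and R the number consulted on a read. A read is only guaranteed to see the latest write when R + W > N; below that you trade the guarantee for latency and availability. A small sketch, with a hypothetical method name:

```ruby
# True when every read quorum must share at least one replica
# with every write quorum, i.e. reads are guaranteed to see the
# most recent acknowledged write.
def overlapping_quorums?(n:, r:, w:)
  r + w > n
end

overlapping_quorums?(n: 3, r: 2, w: 2)  # true:  quorum reads and writes
overlapping_quorums?(n: 3, r: 1, w: 1)  # false: fast, eventually consistent
```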
PRECOMPUTE
ALL THE THINGS
your users most likely don’t know what they want,
so why let them do ad hoc queries?
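One way to picture precomputation, and why a single visit can touch so many counters: increment one counter for every combination of the visit's dimensions at write time, so that reads become plain lookups. The dimension names and key format below are made up for the example; the talk does not say how the counters are keyed.

```ruby
# Every non-empty combination of the visit's dimensions gets its own key,
# e.g. "browser=Firefox|country=SE".
def counter_keys(visit)
  dims = visit.sort_by { |name, _| name.to_s }
              .map { |name, value| "#{name}=#{value}" }
  (1..dims.size).flat_map do |k|
    dims.combination(k).map { |combo| combo.join('|') }
  end
end

# Bump every precomputed counter that this visit contributes to.
def record_visit(counters, visit)
  counter_keys(visit).each { |key| counters[key] += 1 }
  counters
end

counters = Hash.new(0)
record_visit(counters, { country: 'SE', browser: 'Firefox', campaign: 'c42' })
record_visit(counters, { country: 'SE', browser: 'Chrome',  campaign: 'c42' })
counters['country=SE']               # 2
counters['browser=Firefox']          # 1
```

With d dimensions a visit touches 2^d - 1 counters, so the number of dimensions has to be chosen with care.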
SEPARATE
PROCESSING
FROM STORAGE
that way you can scale each independently
PLAN HOW TO GET
RID OF YOUR DATA
deleting stuff is harder than you might think
NoDB
keep things streaming
DIVIDE THE LOAD
big data systems are all about
routing and partitioning
RANDOM
when you have no interdependencies
between things it’s easy to scale out
CONSISTENT
when there are interdependencies you need
to route using some property of the objects,
but make sure you get a uniform distribution
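The two routing styles might look like this in Ruby. The node names are placeholders, and hashing with CRC32 modulo the node count is a simplification chosen for the sketch: it keeps related items together but is not a consistent-hashing ring, so changing the number of nodes remaps most keys.

```ruby
require 'zlib'

NODES = %w[node-a node-b node-c].freeze # hypothetical node names

# Random routing: fine when items are independent of each other.
def route_randomly(nodes = NODES)
  nodes.sample
end

# Routing by a property of the item: the same key always reaches the
# same node, so interdependent items are processed together.
def route_by_key(key, nodes = NODES)
  nodes[Zlib.crc32(key.to_s) % nodes.size]
end

route_by_key('visitor-123') == route_by_key('visitor-123') # always true
```

Pick a routing property with many distinct values (a visitor ID rather than a country code), otherwise a few nodes end up with most of the load.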
NUMEROLOGY
12
2 | 12
3 | 12
4 | 12
6 | 12
8 | 24
5 | 60
A DIVERSION ABOUT
COUNTING TO 60
the reason why there’s 60 seconds to a minute,
and 360 degrees to a circle
3 SEGMENTS
ON EACH FINGER
= 12
FIVE FINGERS
ON OTHER HAND
= 60
12, 60, 120, 360
superior highly composite numbers
use multiples of 12 to scale
without always having to double
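The point about 12 is easy to check: a partition count with many divisors can be spread evenly over many different node counts. A small sketch (the divisors helper is written for this example):

```ruby
# All node counts that divide a given number of partitions evenly.
def divisors(n)
  (1..n).select { |d| (n % d).zero? }
end

divisors(12) # [1, 2, 3, 4, 6, 12]
divisors(16) # [1, 2, 4, 8, 16]
```

With 12 partitions you can grow from 2 to 3 to 4 to 6 nodes and keep the load even; with 16 every step is a doubling.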
BLAH BLAH BLAH
$ (ASCII code 36)
log2(36^6) ≈ 31
six characters 0-9, A-Z can represent 31 bits,
which is kind of almost very close to four bytes
MEIHO0
a timestamp
Time.now.to_i.to_s(36).upcase
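The arithmetic behind the six-character timestamp can be checked directly. The method names below are invented for the sketch; the encoding itself is the one-liner from the slide.

```ruby
# Six base-36 characters give 36**6 distinct values, a little more than 2**31.
CAPACITY = 36**6                 # 2_176_782_336
BITS     = Math.log2(CAPACITY)   # just over 31

# Encode a Unix timestamp (seconds) the way the slide does.
def encode_timestamp(time)
  time.to_i.to_s(36).upcase
end

# Decode it again; String#to_i(36) accepts upper-case digits.
def decode_timestamp(id)
  Time.at(id.to_i(36)).utc
end

decode_timestamp('MEIHO0').to_i  # 1354633200
```

Because the capacity is only slightly above 2^31, a seconds-based timestamp stays at six characters until 2038.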
YOU CAN’T SCALE
TO REAL TIME
and don’t trust code that doesn’t run continuously
DO YOU REALLY
NEED A BACKUP?
if you’ve got 3x replication across multiple
availability zones, is that backup really worth it?
PRODUCTION IS THE
ONLY REAL TEST
ENVIRONMENT
when thousands of things happen every second,
new, weird and unforeseen things happen all the time,
your tests can only cover the foreseeable
GÖTEBORG,
DISTRIBUTED
@gbgdistr
KTHXBAI
@iconara
github.com/iconara
architecturalatrocities.com
burtcorp.com
