From Zero
To Capacity Planning
@Randommood
INES

Sombra
Globally distributed and Highly available
Why capacity
planning?
Or a journey of discovery and ingenuity
The views reflected in this talk
are not to be considered a
reflection of the skills of my
coworkers who are extremely
nice human beings and way
better at capacity planning
than I am.
😜
NOT A monitoring
person
💀
🚨🚨
INSTRUMENT
MONITOR &
ALERT
PLAN
&
PREDICT
The Road to Capacity planning
?
Findings
Books
0
Day One
Some Learning
Our Discoveries
Rituals
&Myths
Asking Around
Bringing it Home
our Path today
Checking The
Edge
zero… Oh shit!
a convenient “situation”
Handles State
Many Clients
Other systems depend on this service to be: up, healthy, and available!
A bit F*cked
Our 

World
Edge Core✨ ✨
a Fastly POP
I Rule the
Edge!
Evaluates weekly global
POPs performance &
makes projections
Publishes capacity
performance report in
clear location
Plans for our physical
capacity & transit
capacity
Meet Catharine
Planning Our Capacity
Some metrics
- Network Capacity (Gb) 

- Ordered Network Capability (Gb) 

- Planned Network Capacity (Gb)

- RPS Capacity (k) 

- Network peak (Gb) 

- RPS peak (k) 

- Site CPU Peak (%) 

- Network Utilization (%)
Over 30%: flagged, Over 70%:
Red status
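A minimal sketch of how the thresholds above could be applied, assuming utilization is network peak divided by network capacity (the slides do not define the formula). The function name and the sample site figures are hypothetical, not the actual reporting tooling.

```python
def utilization_status(network_peak_gb, network_capacity_gb,
                       flag_at=30.0, red_at=70.0):
    """Classify a site by network utilization (%).

    Assumption: utilization = peak / capacity * 100.
    Thresholds follow the slide: over 30% is flagged, over 70% is red.
    """
    if network_capacity_gb <= 0:
        raise ValueError("capacity must be positive")
    utilization = network_peak_gb / network_capacity_gb * 100.0
    if utilization > red_at:
        status = "red"
    elif utilization > flag_at:
        status = "flagged"
    else:
        status = "ok"
    return utilization, status


# Hypothetical sites: (network peak Gb, network capacity Gb)
sites = {"LON": (25.0, 100.0), "DFW": (45.0, 100.0), "SYD": (80.0, 100.0)}
for name, (peak, capacity) in sites.items():
    pct, status = utilization_status(peak, capacity)
    print(f"{name}: {pct:.0f}% {status}")
```

Keeping the thresholds as parameters makes it cheap to revisit them as the fleet and traffic mix change.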
Edge Insights
Our ability to correctly plan for
capacity is critical to our
bottom line
Capacity doesn’t just involve
hardware; software
optimizations matter
People affect capacity
Hitting
The
Books
Defining Capacity planning
Measuring, planning, & managing system growth
Determines what your system needs & when
From the observation of actual traffic. Use current
performance as baseline.
Must happen regardless of what you might
optimize
ARE
WE RIGHT
NOW?
We have to be
this fast & reliable 

X per second & Y%
Uptime
MEASURE HOW FAST/RELIABLE WE ARE
HARDWARE
SOFTWARE
ARCHITECTURE
CHANGE / ADD / REMOVE
FIGURE OUT
HOW TO STAY
FAST/RELIABLE
ENOUGH
Yes!
No!
Allspaw's Wisdom
From The Art of Capacity Planning
👈
System’s Ceiling: critical level of a
resource that cannot be crossed
without failure. Find yours
Another form of Capacity Planning:
Controlled load testing
Predictions: ceilings + historical data
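One way to combine a ceiling with historical data, as a rough sketch: fit a straight line to past weekly peaks and estimate when that line would cross the ceiling. The linear model, the sample peaks, and the ceiling value are all assumptions for illustration; a projection like this is a guess to be re-checked against real traffic.

```python
def weeks_until_ceiling(weekly_peaks, ceiling):
    """Fit y = slope * week + intercept by least squares and return the
    estimated number of weeks from the last sample until the ceiling is hit.
    Returns None when the trend is flat or decreasing.
    """
    n = len(weekly_peaks)
    if n < 2:
        raise ValueError("need at least two samples")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(weekly_peaks) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, weekly_peaks))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    if slope <= 0:
        return None
    crossing_week = (ceiling - intercept) / slope
    # Week index of the last sample is n - 1; never report a negative wait.
    return max(0.0, crossing_week - (n - 1))


# Hypothetical weekly RAM peaks in GB, against a hypothetical 64 GB ceiling.
peaks = [40.0, 42.0, 44.0, 46.0, 48.0]
print(weeks_until_ceiling(peaks, ceiling=64.0))
```

With these made-up numbers the fitted slope is 2 GB per week, so the estimate is 8 weeks of headroom after the last sample.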
Allspaw's Wisdom
System architecture can affect your
ability to add capacity
Identify & track your application’s
metrics
Tying metrics to user behavior is helpful
If you don’t have ways to measure
your current capacity you can’t plan
Little’s Law & Capacity planning
L = λW
Capacity (L), Throughput (λ),
and Latency (W)
Applies to stable systems
Use this information to better
understand our workload and to
define constraints
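As a worked illustration of L = λW with hypothetical numbers: given throughput and mean latency, the law gives the average number of requests in flight, and rearranging it gives the throughput a fixed concurrency limit can sustain. The function names and figures are illustrative only, and the relation holds only for a stable system.

```python
def requests_in_flight(throughput_rps, latency_s):
    """Little's Law: L = lambda * W (average requests in the system)."""
    return throughput_rps * latency_s


def max_throughput(concurrency_limit, latency_s):
    """Rearranged: lambda = L / W, the throughput a fixed L can sustain."""
    if latency_s <= 0:
        raise ValueError("latency must be positive")
    return concurrency_limit / latency_s


# Hypothetical service: 2,000 requests/s at 250 ms mean latency.
in_flight = requests_in_flight(2000, 0.25)
print(in_flight)

# If the system can only hold that many requests and latency doubles,
# the sustainable throughput halves.
print(max_throughput(in_flight, 0.5))
```

Here 2,000 requests/s at 0.25 s means 500 requests in flight; at 0.5 s the same 500 slots sustain only 1,000 requests/s, which is one way to see why latency is a capacity constraint.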
Literature Insights
Possible to have plenty of capacity and
a slow site nonetheless
Projections & curve fitting are guesses
Keep track of API calls & their rate
Always gonna be spikes & hiccups.
Take the bad with the good & plan for it
Rituals
&
Myths
Crowdsourcing Capacity planning
Industry Insights
Hard to extrapolate general
advice into something
applicable for my situation
Simplicity & ability to reason are
the only things I could trust
Confusing community stance on
the ROI of capacity planning
& Putting things in practice
Findings
Step One Step Two
steps followed
Documented system
architecture &
request lifecycle
Formalized: clients,
SLAs, & operational
requirements
Discovery
Confirmed constraints
& determined strategy
Parallelized capacity
& optimization tasks
Organized a team
Gauging & Planning
Edge
Core APP / API APP / API
LB LB
COORDINATOR A COORDINATOR B COORDINATOR C
🐤
CACHE
LON
CACHE
DFW
CACHE
FRA
CACHE
LAX
CACHE
AMS
CACHE
SYD
REQUEST flow
📄 📄 📄👉
Step Four
steps followed
Start process again
Tons of tuning left to
do. We know we
have suboptimal
configs!
re-Evaluation
Step Three
Doubled RAM: our
constrained resource
Horizontally scaled to 3
servers + 1 canary
Capacity expansion
System Before
System After
Unexpected Challenges
Our goal when adding capacity
was no service disruption.
Localhost is the goddamn devil
Gap from metric/graph to
insight can be huge
Slowness is the nemesis of
distributed systems
The Oprah Problem
Developing operational
insights into non-owned
system under pressure is
not great
Use playbooks,
debug.md, rotations, &
rollout owners
Proactivity and clarity
are your best tools
Everyone
gets more
capacity!
Some Insights
Anything API driven ought to
carry a rate limit - We can
easily DDoS ourselves!
Monitor and alert on
expensive API actions
Mind your system
dependencies: practice
defensive system design &
architecture
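A minimal token-bucket sketch of what "carry a rate limit" could look like for an API-driven action. The class name, the rate and burst values, and the idea of charging expensive actions a higher cost are hypothetical; the talk does not say which limiting scheme was used.

```python
import time


class TokenBucket:
    """Allow up to `rate` calls per second with bursts of up to `burst`."""

    def __init__(self, rate, burst, clock=time.monotonic):
        self.rate = float(rate)
        self.burst = float(burst)
        self.tokens = float(burst)
        self.clock = clock
        self.last = clock()

    def allow(self, cost=1.0):
        """Return True and spend tokens if the call fits the budget."""
        now = self.clock()
        # Refill in proportion to elapsed time, capped at the burst size.
        self.tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Hypothetical limit: 5 calls/s with a burst of 2.
bucket = TokenBucket(rate=5, burst=2)
results = [bucket.allow() for _ in range(4)]
print(results)
```

Rejected calls are also a natural thing to count and alert on, and passing a larger `cost` for expensive API actions lets them drain the budget faster than cheap ones.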
CAPACITY
PLANNING
ALERTING
MONITORING
Some Findings
Capacity tied to murky
organizational structure
is both good & bad
(but mostly bad)
Mind your error
descriptions! Cheeky
today ⇒ misleading
tomorrow!
Finding my system’s ceiling is still tricky
Services owned by engineers means
you need to level up on Ops skills
Back to re-evaluate setup to get more
out of this new capacity
Performance testing ought to be done
on the core’s side (& edge)
My Insights
TL;DR
Is a process, not a
one-time event
Pushes you to better
understand your
system, its capacity &
its boundaries - that is
good!
Proactivity is best
Capacity planning
Request lifecycle gets
tricky
System boundaries,
dependencies & SLAs
must be discussed
Your system’s capacity
may bound other
systems’ capacity
Distributed systems
github.com/Randommood/ZerotoCapacityPlanning
Special Thanks to: Catharine Strauss,
Alan Kasindorf, Matt Whiteley,
Caitie McCaffrey, Thom Mahoney,
Mike O’Neill, Devon O’Dell,
Katherine Daniels, Nathan Taylor,
Bruce Spang, and Greg Bako
Thank you!
github.com/Randommood/ZerotoCapacityPlanning
