Building a Resilient Cloud Infrastructure. From Scratch.

by Jeremy Jarvis, Co-founder
“The Brightbox story.”
Overview

•   Preface: Who, what etc.
•   Chapter 1: Feeling the pain
•   Chapter 2: Realising an opportunity
•   Chapter 3: Design phase
•   Chapter 4: Let’s build this thing!
•   Chapter 5: It’s alive!
•   Epilogue: Some conclusions/lessons




    @jeremyjarvis
Preface.

•   Agile cloud infrastructure service
•   “Multi-zone” (datacentre) architecture
•   UK-based (HQ in Leeds, DCs in Manchester)
•   Small team (still only 10 in total)
•   Distributed team (Mainly around Leeds)
•   Developer/DevOps focused (it’s who we are)
•   Profitable (for 4 yrs, built from revenue)




Chapter 1: Feeling the Pain

• Launched Ruby-specific hosting service (Sep 2007)
• Acquisition interest (no thanks!)
• Built a good reputation, hosted apps for some large
  customers
• Systems stable, but became unwieldy and hard for us
  to manage + we wanted to expand offering
• Began looking at existing options, nothing fitted our
  needs (e.g. Eucalyptus)
• Decided to build our own “cloud” (Brightbox NG)


Chapter 2: Realising an opportunity

•   Limited options in EU/UK
•   If we’re building this for ourselves why not sell it?
•   Generated lots of internal debate
•   Lack of flexibility with PaaS services
•   We’re good at this stuff + we have an awesome team
•   Decided to shift development focus




Chapter 3: Design phase

• Requirements:
  – Geographic redundancy (separate datacentres)
  – Resilient network (multiple connections)
  – Agile network layer (Load Balancing, Cloud IP, Firewall)
  – Distributed (no SPOF)
  – Modular (easy to grow, JIT)
  – Future-proof (IPv6)
  – Programmable (API, CLI)
  – Open (easy to get stuff in and out)
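
As an illustration only, the "no SPOF" and "separate datacentres" requirements can be read as: any request should be servable from more than one zone. The sketch below is a minimal example under that assumption. The `Zone` class, the `pick_zone` function and the zone names are hypothetical and are not taken from the talk or from any Brightbox API.

```python
from dataclasses import dataclass


@dataclass
class Zone:
    """One datacentre ("zone"). Names and fields are hypothetical."""
    name: str
    healthy: bool


def pick_zone(zones, preferred):
    """Return the preferred zone if it is healthy, else the first healthy zone."""
    by_name = {z.name: z for z in zones}
    first = by_name.get(preferred)
    if first is not None and first.healthy:
        return first
    # Fall back to any other healthy zone, so one zone failing is not fatal.
    for zone in zones:
        if zone.healthy:
            return zone
    raise RuntimeError("no healthy zone available")


zones = [Zone("zone-a", healthy=False), Zone("zone-b", healthy=True)]
print(pick_zone(zones, preferred="zone-a").name)  # prints "zone-b"
```

A real system would also need the health information itself to be distributed; a single health checker would simply move the single point of failure.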




Chapter 3: Design phase

• Process:
  – Consulted network infrastructure gurus, at first.
  – We’re on our own!
  – Hands-on, R&D style, lots of testing, experimentation, iteration
  – Investigated competition (what’s good/bad)
  – Access to plenty of kit - set up mini-clouds to hack
  – John and Neil worked very closely on architecture (daily calls)
  – Don’t reinvent the wheel (Rubyists!)




Chapter 3: Design phase

• Elements:
  – Network architecture (proof-of-concept)
  – Application/software architecture (evolving, iterative)
  – Hardware selection:
    • Border routers (Cisco)
    • Switches (Cisco +)
    • Host servers (Dell)
    • Standardised rack design (modular)
  – Datacentre selection (Proximity, Independent, Competing)




Chapter 4: Let’s build this thing!

• Network provisioning (Transit, Metro links, RIPE
  membership)
• Buying (negotiating)
• Installing kit (Routers, Switches, Servers)
• Use datacentre staff where possible (racking/stacking)
• Software development (several applications, iterative)
• Writing configs (again, iterative - infra as code)
• Documentation (wiki, changelogs, code comments)



Chapter 5: It’s alive!

• Nov 2010 - Launched private beta (700 users)
• Iteration (fix bottlenecks, improve resilience)
• New features (Load balancing, Firewall etc.)
• Billing data (distributed stats collection)
• “Enterprise” billing software *sad face*
• Oct 2011 - General availability (no change really, just ££)
• Product development (more and better features)
• Marketing (getting the word out, communication)


Epilogue: What did we learn?

• Building stuff is hard (but can lead to competitive
  advantage)
• Be your own customers (understand market, use
  products)
• Don’t *over*estimate competition (look behind the
  mask)
• Learn good negotiation (clue: it’s not a battle)
• All about the launch (could have timed/co-ordinated
  things better for more “oomph”)
• Momentum is important (PR, morale)

Thanks.
Any questions?


"Building a Resilient Cloud Infrastructure. From Scratch." - Cloud East, 28 July 2012

  • 1. Building a Resilient Cloud Infrastructure. From Scratch. by Jeremy Jarvis Co-founder
  • 4. Overview • Preface: Who, what etc. @jeremyjarvis
  • 5. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain @jeremyjarvis
  • 6. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain • Chapter 2: Realising an opportunity @jeremyjarvis
  • 7. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain • Chapter 2: Realising an opportunity • Chapter 3: Design phase @jeremyjarvis
  • 8. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain • Chapter 2: Realising an opportunity • Chapter 3: Design phase • Chapter 4: Let’s build this thing! @jeremyjarvis
  • 9. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain • Chapter 2: Realising an opportunity • Chapter 3: Design phase • Chapter 4: Let’s build this thing! • Chapter 5: It’s alive! @jeremyjarvis
  • 10. Overview • Preface: Who, what etc. • Chapter 1: Feeling the pain • Chapter 2: Realising an opportunity • Chapter 3: Design phase • Chapter 4: Let’s build this thing! • Chapter 5: It’s alive! • Epilogue: Some conclusions/lessons @jeremyjarvis
  • 12. Preface. • Agile cloud infrastructure service @jeremyjarvis
  • 13. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture @jeremyjarvis
  • 14. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture • UK-based (HQ in Leeds, DCs in Manchester) @jeremyjarvis
  • 15. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture • UK-based (HQ in Leeds, DCs in Manchester) • Small team (still only 10 in total) @jeremyjarvis
  • 16. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture • UK-based (HQ in Leeds, DCs in Manchester) • Small team (still only 10 in total) • Distributed team (Mainly around Leeds) @jeremyjarvis
  • 17. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture • UK-based (HQ in Leeds, DCs in Manchester) • Small team (still only 10 in total) • Distributed team (Mainly around Leeds) • Developer/DevOps focused (it’s who we are) @jeremyjarvis
  • 18. Preface. • Agile cloud infrastructure service • “Multi-zone” (datacentre) architecture • UK-based (HQ in Leeds, DCs in Manchester) • Small team (still only 10 in total) • Distributed team (Mainly around Leeds) • Developer/DevOps focused (it’s who we are) • Profitable (for 4 yrs, built from revenue) @jeremyjarvis
  • 19. Chapter 1: Feeling the Pain @jeremyjarvis
  • 20. Chapter 1: Feeling the Pain @jeremyjarvis
  • 21. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) @jeremyjarvis
  • 22. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) • Acquisition interest (no thanks!) @jeremyjarvis
  • 23. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) • Acquisition interest (no thanks!) • Built a good reputation, hosted apps for some large customers @jeremyjarvis
  • 24. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) • Acquisition interest (no thanks!) • Built a good reputation, hosted apps for some large customers • Systems stable, but became unwieldy and hard for us to manage + we wanted to expand offering @jeremyjarvis
  • 25. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) • Acquisition interest (no thanks!) • Built a good reputation, hosted apps for some large customers • Systems stable, but became unwieldy and hard for us to manage + we wanted to expand offering • Began looking at existing options, nothing fitted our needs (e.g Eucalyptus) @jeremyjarvis
  • 26. Chapter 1: Feeling the Pain • Launched Ruby-specific hosting service (Sep 2007) • Acquisition interest (no thanks!) • Built a good reputation, hosted apps for some large customers • Systems stable, but became unwieldy and hard for us to manage + we wanted to expand offering • Began looking at existing options, nothing fitted our needs (e.g Eucalyptus) • Decided to build our own “cloud” (Brightbox NG) @jeremyjarvis
  • 27. Chapter 2: Realising an opportunity @jeremyjarvis
  • 28. Chapter 2: Realising an opportunity @jeremyjarvis
  • 29. Chapter 2: Realising an opportunity • Limited options in EU/UK @jeremyjarvis
  • 30. Chapter 2: Realising an opportunity • Limited options in EU/UK • If we’re building this for ourselves why not sell it? @jeremyjarvis
  • 31. Chapter 2: Realising an opportunity • Limited options in EU/UK • If we’re building this for ourselves why not sell it? • Generated lots of internal debate @jeremyjarvis
  • 32. Chapter 2: Realising an opportunity • Limited options in EU/UK • If we’re building this for ourselves why not sell it? • Generated lots of internal debate • Lack of flexibility with PaaS services @jeremyjarvis
  • 33. Chapter 2: Realising an opportunity • Limited options in EU/UK • If we’re building this for ourselves why not sell it? • Generated lots of internal debate • Lack of flexibility with PaaS services • We’re good at this stuff + we have awesome team @jeremyjarvis
  • 34. Chapter 2: Realising an opportunity • Limited options in EU/UK • If we’re building this for ourselves why not sell it? • Generated lots of internal debate • Lack of flexibility with PaaS services • We’re good at this stuff + we have awesome team • Decided to shift development focus @jeremyjarvis
  • 35. Chapter 3: Design phase @jeremyjarvis
  • 36. Chapter 3: Design phase @jeremyjarvis
  • 37. Chapter 3: Design phase • Requirements: @jeremyjarvis
  • 38. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) @jeremyjarvis
  • 39. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) @jeremyjarvis
  • 40. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) @jeremyjarvis
  • 41. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) – Distributed (no SPOF) @jeremyjarvis
  • 42. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) – Distributed (no SPOF) – Modular (easy to grow, JIT) @jeremyjarvis
  • 43. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) – Distributed (no SPOF) – Modular (easy to grow, JIT) – Future-proof (IPv6) @jeremyjarvis
  • 44. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) – Distributed (no SPOF) – Modular (easy to grow, JIT) – Future-proof (IPv6) – Programmable (API, CLI) @jeremyjarvis
  • 45. Chapter 3: Design phase • Requirements: – Geographic redundancy (separate datacentres) – Resilient network (multiple connections) – Agile network layer (Load Balancing, Cloud IP, Firewall) – Distributed (no SPOF) – Modular (easy to grow, JIT) – Future-proof (IPv6) – Programmable (API, CLI) – Open (easy to get stuff in and out) @jeremyjarvis
  • 54. Chapter 3: Design phase
    • Process:
      – Consulted, at first. Network infrastructure gurus.
      – We’re on our own!
      – Hands-on, R&D style, lots of testing, experimentation, iteration
      – Investigated competition (what’s good/bad)
      – Access to plenty of kit - set up mini-clouds to hack
      – John and Neil worked very closely on architecture (daily calls)
      – Don’t reinvent the wheel (Rubyists!)
  • 64. Chapter 3: Design phase
    • Elements:
      – Network architecture (proof-of-concept)
      – Application/software architecture (evolving, iterative)
      – Hardware selection:
        • Border routers (Cisco)
        • Switches (Cisco +)
        • Host servers (Dell)
        • Standardised rack design (modular)
      – Datacentre selection (Proximity, Independent, Competing)
  • 65. Chapter 4: Let’s build this thing!
  • 73. Chapter 4: Let’s build this thing!
    • Network provisioning (Transit, Metro links, RIPE membership)
    • Buying (negotiating)
    • Installing kit (Routers, Switches, Servers)
    • Use datacentre staff where possible (racking/stacking)
    • Software development (several applications, iterative)
    • Writing configs (again, iterative - infra as code)
    • Documentation (wiki, changelogs, code comments)
  • 74. Chapter 5: It’s alive!
  • 83. Chapter 5: It’s alive!
    • Nov 2010 - Launched private beta (700 users)
    • Iteration (fix bottlenecks, improve resilience)
    • New features (Load balancing, Firewall etc)
    • Billing data (distributed stats collection)
    • “Enterprise” billing software *sad face*
    • Oct 2011 - General availability (no change really, just £ £)
    • Product development (more and better features)
    • Marketing (getting the word out, communication)
  • 84. Epilogue: What did we learn?
  • 91. Epilogue: What did we learn?
    • Building stuff is hard (but can lead to competitive advantage)
    • Be your own customers (understand market, use products)
    • Don’t *over*estimate competition (look behind the mask)
    • Learn good negotiation (clue: it’s not a battle)
    • All about the launch (could have timed/co-ordinated things better for more “oomph”)
    • Momentum is important (PR, morale)