Tuesday, July 10, 12
Inside the Atlassian OnDemand private cloud

George Barnett
SAAS Platform Architect



In 2010 a team of engineers moved into our secret lair (above a pub) to re-imagine our hosted platform.

Launch - October 2011: 1,000 VMs
6 months later: 13,500 VMs




We have a cloud. So what?

We also had a cloud... and...
• VM sprawl
• Poor performance
• Over provisioning
• Slow deployments
• Low visibility into the full stack


Virtualisation often creates new challenges but does nothing about existing ones.

Focus



Be less flexible about what infrastructure you provide.

“You can use any database you like, as long as it’s PostgreSQL 8.4.”



                         #summit12




• Stop trying to be everything to everyone
    • (we have other clouds within Atlassian)
• Lower operational complexity
• Easier to provide a deeply integrated, well supported toolchain
• Small test surface matrix




Fail fast. Learn quickly.

Do as little as possible
deploy and use it



Block-1
A small-scale model of the initial proposed platform architecture: 4 desktop machines and a switch.

Purpose: Validate design, evaluate failure modes.

http://history.nasa.gov/Apollo204/blocks.html

Block-1
Applications do not fall over.
Network boot assumptions validated.
Creation of VMs over NFS is too resource- and time-intensive (more on this later).



Block-2
A large-scale model of the platform architecture.

Purpose: Validate hardware resource assumptions and compare CPU vendors.

http://history.nasa.gov/Apollo204/blocks.html

Block-2
Customers per GB of RAM metric validated.
VM distribution and failover tools work.
Initial specs of compute hardware too conservative. Decided to add 50% more RAM.



Hardware

Challenge
Existing platform hardware was a poor fit for our workload.
Memory and IO were heavily constrained, but CPU was not.




Monitoring
We took 6 months' worth of monitoring data from our existing platform.
We used this data to determine the right mix of hardware.

• 10 x Compute nodes (144G RAM, 12 cores, NO disks)
• 3 x Storage nodes (24 disks)
• Each rack delivered fully assembled
    • Unwrap, provide power, networking
    • Connected to customers in ~2 hours
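
The rack figures above imply some simple per-rack totals. The following is a minimal sketch of that arithmetic in Python. The node counts, RAM and core figures come from the slide; the `customers_per_gb` parameter is hypothetical, since the talk mentions a "customers per GB of RAM" metric but does not give its value.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rack:
    compute_nodes: int = 10     # from the slide
    ram_gb_per_node: int = 144  # from the slide
    cores_per_node: int = 12    # from the slide

    @property
    def total_ram_gb(self) -> int:
        """RAM across all compute nodes in one rack."""
        return self.compute_nodes * self.ram_gb_per_node

    @property
    def total_cores(self) -> int:
        """CPU cores across all compute nodes in one rack."""
        return self.compute_nodes * self.cores_per_node

    @property
    def ram_gb_per_core(self) -> float:
        """How memory-heavy each node is relative to its CPU."""
        return self.ram_gb_per_node / self.cores_per_node

    def estimated_customers(self, customers_per_gb: float) -> float:
        """Hypothetical: the talk does not state the customers-per-GB value."""
        return self.total_ram_gb * customers_per_gb


rack = Rack()
print(rack.total_ram_gb, rack.total_cores, rack.ram_gb_per_core)
```

With the slide's numbers this gives 1,440 GB of RAM and 120 cores per rack, or 12 GB of RAM per core, which is consistent with a workload constrained by memory rather than CPU.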




Advantage #1
Reliable.
Each machine goes through a 2-day burn-in before it goes into the rack.

Advantage #2
Neat.

Advantage #3
Consistent.

Advantage #4
Easy to deploy.




No disks.

Wait. What?

Challenge
Existing compute infrastructure used local disk for swap and hypervisor boot.
Once we got the memory density right, local disk was only needed for boot.




• No disks in compute infrastructure
    • Avoid spinning 20 more disks per rack for a hypervisor OS
• Evaluated booting from:
    • USB drives (unreliable and slow!)
    • NFS (what if the network goes away?)
    • Custom binary initrd image + kernel




• Image is a ~170 MB gzipped filesystem
    • Download on boot, extract into RAM - ~400 MB
• No external dependencies after boot
• All compute nodes boot from the same image
    • Reboot to known state




Compute Node    Exchange               Netboot Server
PXE             dhcp / response        DHCP
                gpxe                   TFTP
Etherboot       dhcp / response        DHCP
                bootscript             HTTP
                kernel & boot image    HTTP
Boot
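
The "bootscript" step in the sequence above is a small text file that gPXE fetches over HTTP and executes. As a rough sketch of what the netboot server might serve, the Python helper below renders such a script. The `kernel`, `initrd` and `boot` commands are standard gPXE script commands; the host name, file names and the helper itself are hypothetical and not taken from the talk.

```python
def render_gpxe_script(base_url: str, kernel: str, image: str,
                       kernel_args: str = "") -> str:
    """Render a gPXE boot script that loads a kernel and RAM image over HTTP.

    base_url, kernel and image are illustrative placeholders.
    """
    base = base_url.rstrip("/")
    # Kernel arguments are optional; strip the trailing space when absent.
    kernel_line = f"kernel {base}/{kernel} {kernel_args}".rstrip()
    lines = [
        "#!gpxe",                  # marks the file as a gPXE script
        kernel_line,               # fetch the kernel over HTTP
        f"initrd {base}/{image}",  # fetch the gzipped filesystem image
        "boot",                    # hand control to the downloaded kernel
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_gpxe_script("http://netboot.example/boot/",
                             "vmlinuz", "hypervisor.img.gz"))
```

Because every compute node receives the same script and the same image, a reboot always returns a node to the same known state.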


Sharp Edges.
• No swap == provision carefully
    • Not a problem if you automate provisioning
• Treat running hypervisor image like an appliance
    • Don’t change code - rebuild image and reboot
    • Doing this often? Too many services in the hypervisor
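
Without swap, a node cannot promise more memory than it physically has, so automated provisioning needs an admission check. The sketch below is one possible shape for that check, not the tooling described in the talk. The `ComputeNode` class and its field names are hypothetical, and the 400 MB reservation is an assumption based on the size of the extracted boot image.

```python
from dataclasses import dataclass


@dataclass
class ComputeNode:
    ram_mb: int
    reserved_mb: int = 400  # assumed: RAM occupied by the extracted boot image
    committed_mb: int = 0   # memory already promised to containers

    def free_mb(self) -> int:
        """Memory still available to promise to new containers."""
        return self.ram_mb - self.reserved_mb - self.committed_mb

    def can_place(self, container_mb: int) -> bool:
        """True if the container fits without relying on swap."""
        return container_mb <= self.free_mb()

    def place(self, container_mb: int) -> None:
        """Commit memory for a container, refusing to over-commit."""
        if not self.can_place(container_mb):
            raise MemoryError(
                f"need {container_mb} MB, only {self.free_mb()} MB free"
            )
        self.committed_mb += container_mb


if __name__ == "__main__":
    node = ComputeNode(ram_mb=144 * 1024)
    node.place(2048)
    print(node.free_mb())
```

A scheduler built on a check like this fails at provisioning time rather than at runtime, which is the trade-off the slide describes.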




Software

Challenge
Virtualisation is often inefficient.
There’s a memory and CPU penalty which is hard to avoid.




OpenVZ
• Linux containers
    • Basis for Parallels Virtuozzo Containers
    • LXC isn’t there yet
• No guest OS kernels
    • No performance hit
    • Better resource sharing


Performance

http://wiki.openvz.org/Performance/vConsolidate-SMP

http://wiki.openvz.org/Performance/LAMP


Resource de-duping

“Don’t load the same thing twice”

Challenge
Java VMs aren’t lightweight.




• Full virtualisation does a poor job of this
    • 50 VMs = 50 kernels + 50 caches + 50 shared libs!
    • Memory de-dupe combats this, but burns CPU.
• Memory de-dupe works across all OSes
    • We don’t use Windows.
    • By being less flexible, we can exploit Linux-specific features.
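
The "50 VMs = 50 kernels + 50 caches + 50 shared libs" point can be made concrete with a small calculation. The function below is an illustrative sketch only: the per-component sizes passed in are hypothetical placeholders, not measurements from the talk, and application memory is deliberately left out.

```python
def resident_os_memory_mb(instances: int, kernel_mb: int, cache_mb: int,
                          libs_mb: int, shared: bool) -> int:
    """Memory spent on OS-level state, excluding the applications themselves.

    shared=False models full virtualisation: every VM holds its own copy.
    shared=True models containers on one kernel: one copy serves all.
    """
    per_copy = kernel_mb + cache_mb + libs_mb
    copies = 1 if shared else instances
    return copies * per_copy


if __name__ == "__main__":
    # Hypothetical sizes, chosen only to show the shape of the comparison.
    full = resident_os_memory_mb(50, kernel_mb=50, cache_mb=200,
                                 libs_mb=100, shared=False)
    shared = resident_os_memory_mb(50, kernel_mb=50, cache_mb=200,
                                   libs_mb=100, shared=True)
    print(full, shared)
```

Whatever the real sizes are, the duplicated cost grows linearly with the number of instances, while the shared cost stays constant.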




OpenVZ containers all share the same kernel.

• Provide a single OS image to all - free benefits:
    • Shared libraries only load once.
    • OS is cached only once.
    • OS image is the same on every instance.




Challenge
If all containers share the same OS image, then managing state is a nightmare!
One bad change in one container would break them all!

• But managing state on multiple machines is a solved problem!
    • What if you have >10,000 machines?
• Why are you modifying the OS anyway?




Does your iPhone upgrade iOS when you install an app?

“Fix problems by removing them, not by adding systems to manage them.”




Read-only OS images

Data classes in a system
• OS and system daemon code
• Application code
• Application and user data




Container (running on the OpenVZ Kernel)
• / - Read Only: OS tools, system supplied code
• /sw - Read Only: Applications, JVMs, configs
• /data (R/W): Application and user data
    • /data/service/

How?
• Storage nodes export /e/ro/ & /e/rw
• Build an OS distro inside a chroot.
    • Use whatever tools you are comfortable with.
• Put this chroot tree in the RO location on storage nodes
• Make a “data” dir in the RW location for each container

How?
• On container start, bind mount:
    /net/storage-n/e/ro/os/linux-image-v1/ -> /vz/<ctid>/root
• Replace etc, var & tmp with a memfs
    • Linux expects to be able to write to these
• Mount the container’s data dir (RW) to /data
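
The container start steps above can be sketched as a list of `mount` commands. The Python function below only builds the commands and does not run them. It makes several assumptions that are not in the talk: "memfs" is read as `tmpfs`, the per-container data directory is placed at a hypothetical `/e/rw/<ctid>/data` path, and the function and parameter names are invented for illustration. The slides also do not say how the in-memory `etc` and `var` get populated, so that step is omitted.

```python
from typing import List


def container_mount_plan(ctid: int, storage_host: str,
                         os_image: str) -> List[List[str]]:
    """Build the mount commands for one container start (not executed here)."""
    root = f"/vz/{ctid}/root"
    image = f"/net/{storage_host}/e/ro/os/{os_image}"
    data = f"/net/{storage_host}/e/rw/{ctid}/data"  # hypothetical layout

    plan = [
        # Bind the shared OS image onto the container root...
        ["mount", "--bind", image, root],
        # ...and make that bind mount read-only.
        ["mount", "-o", "remount,ro,bind", root],
    ]
    # Writable, memory-backed replacements for paths Linux expects to write to.
    for name in ("etc", "var", "tmp"):
        plan.append(["mount", "-t", "tmpfs", "tmpfs", f"{root}/{name}"])
    # The only persistent, writable location inside the container.
    plan.append(["mount", "--bind", data, f"{root}/data"])
    return plan


if __name__ == "__main__":
    for command in container_mount_plan(101, "storage-1", "linux-image-v1"):
        print(" ".join(command))
```

In this sketch the OS version is just the `os_image` argument, so moving a container to a new image means changing one value and restarting it.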

More benefits
• Distribute OS images as a simple directory.
• Prove that environments (Dev, Stg, Prd) are identical using md5sum.
• Flip between OS versions by changing a variable
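
One way to compare OS image directories across environments is to reduce each tree to a single MD5 digest. The following is a minimal sketch of that idea using Python's `hashlib`, not the exact procedure used in the talk. It hashes relative paths and file contents only, so ownership, permissions and symlinked directories are ignored; the `tree_md5` name is hypothetical.

```python
import hashlib
import os


def tree_md5(root: str) -> str:
    """Return one MD5 hex digest covering file paths and contents under root."""
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Include the relative path so renames change the digest.
            digest.update(os.path.relpath(path, root).encode("utf-8"))
            digest.update(b"\0")
            if os.path.islink(path):
                # Hash the link target instead of following it.
                digest.update(os.readlink(path).encode("utf-8"))
                continue
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
    return digest.hexdigest()


if __name__ == "__main__":
    import sys
    print(tree_md5(sys.argv[1] if len(sys.argv) > 1 else "."))
```

Running this against the image directory in each environment and comparing the results gives a quick equality check; MD5 is adequate for detecting drift but is not a defence against deliberate tampering.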




The Swear Wall

The swear wall helps prevent death by a thousand cuts.
Your team has a gut feeling about what’s hurting them - this helps you quantify that feeling and act on the pain.




1. !@&*^# Solaris!
2. Solaris gets a mark
3. Repeat
4. Periodically throw out offensive technology
5. ...
6. PROFIT!! (swear less)
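
The steps above amount to keeping a tally per technology and reviewing it periodically. As a light-hearted sketch, the class below does the same with a `Counter`. The `SwearWall` class and the `threshold` parameter are hypothetical; the talk describes a physical wall, not software.

```python
from collections import Counter
from typing import List


class SwearWall:
    """Tally of how often each technology has caused someone to swear."""

    def __init__(self) -> None:
        self.marks: Counter = Counter()

    def swear(self, technology: str) -> None:
        """Steps 1-2: someone swears, the technology gets a mark."""
        self.marks[technology] += 1

    def worst_offenders(self, threshold: int) -> List[str]:
        """Step 4: candidates to throw out, most-marked first."""
        return [tech for tech, count in self.marks.most_common()
                if count >= threshold]


if __name__ == "__main__":
    wall = SwearWall()
    for tech in ["Solaris", "Solaris", "NFS", "Solaris"]:
        wall.swear(tech)
    print(wall.worst_offenders(threshold=2))
```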




Optimise for the task at hand.
Don’t layer solutions onto problems. Get rid of them.

Thank you!

