Availability, the Cloud and Everything
Joe Williams




Saturday, October 2, 2010
Me

                            • Joe Williams
                             • Infrastructure Engineer
                             • Cloudant
                             • @williamsjoe
                             • joeandmotorboat.com



• Distributed database built on CouchDB
                     • Real-time Search and Analytics
                     • Sign Up! (Free to 256MB)
                     • cloudant.com
                     • http://github.com/cloudant/bigcouch


Bias


                     • Distributed Databases (CouchDB)
                     • Amazon EC2
                     • Chef
                     • Erlang



Availability




Availability




                     • What is Availability?




Availability




Availability

        “System availability refers to the accessibility of
      system services to users. A system is available if it is
     operational for an overwhelming fraction of the time.
        Unlike reliability, availability is instantaneous.”




Availability


    “System reliability refers to the property of tolerating
    constituent component failures, for the longest time. A
          system is perfectly reliable if it never fails.”




Availability



                     • Reliability * Availability = Dependability




Availability

                     • Availability & Reliability
                            • Mean time to failure (MTTF)
                            • Mean time to repair (MTTR)
                            • Durability
                            • Fault isolation
                            • Fault tolerance
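
One common way to tie MTTF and MTTR together is the steady-state availability ratio MTTF / (MTTF + MTTR). The sketch below is an illustration of that standard formula, not something taken from the talk; the function names and the sample figures (999 hours between failures, 1 hour to repair) are assumptions chosen for the example.

```python
HOURS_PER_YEAR = 365 * 24


def availability(mttf_hours: float, mttr_hours: float) -> float:
    """Steady-state availability: the fraction of time the system is up."""
    if mttf_hours <= 0 or mttr_hours < 0:
        raise ValueError("MTTF must be positive and MTTR non-negative")
    return mttf_hours / (mttf_hours + mttr_hours)


def downtime_hours_per_year(avail: float) -> float:
    """Expected downtime per (non-leap) year for a given availability."""
    return (1.0 - avail) * HOURS_PER_YEAR


# Hypothetical figures: fails about every 999 hours, takes 1 hour to repair.
a = availability(mttf_hours=999.0, mttr_hours=1.0)
print(f"availability: {a:.4f}")
print(f"downtime per year: {downtime_hours_per_year(a):.2f} hours")
```

Shortening MTTR raises availability just as lengthening MTTF does, which is one reason fast repair and fault isolation appear on the same list.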


Availability



                     • Uptime / Downtime
                            • Perceived
                            • Actual




Availability



                     • Probabilistic Risk Assessment
                            • Event Tree Analysis
                            • Fault Tree Analysis



                                   Apthorpe (http://www.usenix.org/events/lisa01/tech/apthorpe/apthorpe.ps)
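
As a rough illustration of fault tree analysis, the sketch below combines basic-event probabilities through AND and OR gates to estimate the probability of a top-level failure. It assumes the basic events are independent, which is often untrue in shared infrastructure. The tree shape and the probabilities are hypothetical and are not taken from the Apthorpe paper.

```python
from math import prod


def and_gate(*probs: float) -> float:
    """Output fails only if every input fails (independent events)."""
    return prod(probs)


def or_gate(*probs: float) -> float:
    """Output fails if at least one input fails (independent events)."""
    return 1.0 - prod(1.0 - p for p in probs)


# Hypothetical tree: the service is down if the load balancer fails,
# OR if both database replicas fail at the same time.
p_load_balancer = 0.001
p_replica = 0.01

p_service_down = or_gate(p_load_balancer, and_gate(p_replica, p_replica))
print(f"P(service down) = {p_service_down:.7f}")
```

In this made-up tree the single load balancer dominates the result, which is the kind of single point of failure a fault tree is meant to expose.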



The Cloud




The Cloud


                      “It never gets easier, you just go faster.”
                                   - Greg LeMond




The Cloud


                     • Abstraction
                     • Commoditization
                     • Homogeneous
                     • Ephemeral



The Cloud

                     • Costs
                            • Loss of Control
                            • Single Points of Failure
                            • Network Partitions / Data Locality
                            • Unreliable
                            • Performance

The Cloud


                     • Benefits
                            • API to everything
                            • Fast and Flexible Resource Management
                            • “Unlimited” Resources



The Cloud



                     • Bootstrapping
                     • Time and Effort


           Adam Jacob and Ezra Zygmuntowicz (http://blip.tv/file/2285124/)




The Cloud




                     • Nodes are stateless and disposable.




The Cloud


           "Clouds are systems ... and with systems, you have to think hard and know how to deal with issues in that
         environment. The scale is so much bigger, and you don't have the physical control. But we think people should
           be optimistic about what we can do here. If we are clever about deploying cloud computing with a clear-eyed
                notion of what the risk models are, maybe we can actually save the economy through technology."

                            - "Security in the Ether" by David Talbot, MIT Technology Review, Jan/Feb 2010




What’s Next



                     • Distributed Systems
                     • Automation
                     • Data Driven Operations




Distributed Systems




                                Baran (http://www.rand.org/pubs/research_memoranda/RM3420/)




Distributed Systems




                     • RAID ain’t as redundant as it used to be.




                                  Leventhal (http://queue.acm.org/detail.cfm?id=1670144)




Distributed Systems



                     • Redundancy
                            • Duplication
                            • Distribution
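
To make the payoff of redundancy concrete, the sketch below estimates the availability of a service that stays up as long as at least one of n replicas is up. It assumes replica failures are independent, which is optimistic when replicas share a rack, zone, or provider. The function names and sample numbers are illustrative assumptions, not figures from the talk.

```python
def replicated_availability(single: float, replicas: int) -> float:
    """Availability when at least one of `replicas` independent copies is up."""
    if not 0.0 <= single <= 1.0 or replicas < 1:
        raise ValueError("need 0 <= single <= 1 and replicas >= 1")
    return 1.0 - (1.0 - single) ** replicas


def replicas_needed(single: float, target: float) -> int:
    """Smallest replica count whose combined availability reaches `target`."""
    if not 0.0 < single < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("single and target must be strictly between 0 and 1")
    count = 1
    while replicated_availability(single, count) < target:
        count += 1
    return count


# Hypothetical node that is up 99% of the time.
for n in (1, 2, 3):
    print(n, f"{replicated_availability(0.99, n):.6f}")
print("replicas for 99.999%:", replicas_needed(0.99, 0.99999))
```

Duplication on one machine does not satisfy the independence assumption; distribution across failure domains is what moves real systems closer to these numbers.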




Distributed Systems


                     • Alphabet Soup
                            • ACID, CAP, BASE, 2PC, MVCC
                            • Vector Clocks, Eventual Consistency
                            • Dynamo, Paxos, Chandra, Byzantine
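
Of the terms above, vector clocks are small enough to sketch. The example below is a minimal illustration of the general idea (one counter per node, merged by taking the maximum), not the implementation used by any particular database; the node names are hypothetical.

```python
def increment(clock: dict, node: str) -> dict:
    """Return a copy of `clock` with `node`'s counter advanced by one."""
    updated = dict(clock)
    updated[node] = updated.get(node, 0) + 1
    return updated


def merge(a: dict, b: dict) -> dict:
    """Combine two clocks by taking the per-node maximum."""
    return {node: max(a.get(node, 0), b.get(node, 0)) for node in a.keys() | b.keys()}


def descends(a: dict, b: dict) -> bool:
    """True if `a` has seen every event recorded in `b`."""
    return all(a.get(node, 0) >= count for node, count in b.items())


def concurrent(a: dict, b: dict) -> bool:
    """True if neither clock descends from the other (a conflict)."""
    return not descends(a, b) and not descends(b, a)


# Two hypothetical nodes update the same value starting from a shared version.
base = increment({}, "node1")
on_node1 = increment(base, "node1")
on_node2 = increment(base, "node2")

print(concurrent(on_node1, on_node2))  # the two updates conflict
print(merge(on_node1, on_node2))       # clock after reconciling them
```

Detecting that two updates are concurrent is what lets an eventually consistent store surface a conflict instead of silently dropping a write.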



Distributed Systems




                     • CAP == Availability




Distributed Systems


                     • Erlang
                            • Distributed
                            • Concurrent
                            • Fault Tolerant



Distributed Systems



                     • Erlang
                            • Supervision Trees
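
The core idea behind a supervisor is "let it crash, then restart, but give up if crashes come too fast". The sketch below imitates that restart-intensity rule for a single worker in Python. It is an analogy only: it is not Erlang/OTP code, it supervises one function rather than a tree of processes, and the names `supervise`, `max_restarts`, and `window_seconds` are assumptions made for this example.

```python
import time


class RestartLimitExceeded(Exception):
    """Raised when the worker crashes too often within the time window."""


def supervise(worker, max_restarts: int = 3, window_seconds: float = 5.0):
    """Run `worker`; restart it on a crash unless restarts exceed the limit."""
    restart_times = []
    while True:
        try:
            return worker()
        except Exception as exc:
            now = time.monotonic()
            # Only count restarts that happened inside the sliding window.
            restart_times = [t for t in restart_times if now - t < window_seconds]
            restart_times.append(now)
            if len(restart_times) > max_restarts:
                raise RestartLimitExceeded(
                    f"more than {max_restarts} restarts in {window_seconds}s"
                ) from exc


# Hypothetical worker that crashes twice before succeeding.
attempts = {"count": 0}


def flaky_worker():
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise RuntimeError("transient failure")
    return "ok"


print(supervise(flaky_worker), "after", attempts["count"], "attempts")
```

In a real supervision tree the supervisor that gives up is itself restarted by its parent, so failures escalate upward until some level can recover.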




Distributed Systems



                     • Erlang
                            • Hot Code Upgrades
                            • Distributed Upgrades are HARD




Distributed Systems
                     • Future Work
                            • Erlang Supervision Trees
                            • PRA / FTA / ETA




                                    Apthorpe (http://www.usenix.org/events/lisa01/tech/apthorpe/apthorpe.ps)

Automation




Automation




                     • Optimal use of the cloud.




Automation




                     • Frequent deployment.




Automation

                     • Tools
                            • Chef
                            • Puppet
                            • Cfengine
                            • Bcfg2


Automation

                    • Erlang + Chef (as of v0.8)
                            • erl_call Provider




Data Driven Operations




Data Driven Operations


                  “What gets measured, gets managed.”
                                -Peter Drucker




Data Driven Operations




                     • Instrumentation




Data Driven Operations




                     • Logging




Data Driven Operations




                     • Visualization




Data Driven Operations




                     • Demo!




Data Driven Operations


                     • Modeling
                     • Analysis
                     • Universal Law of Computational Scalability
                     • Amdahl’s Law
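
Both laws named above fit in a few lines. The sketch below writes Amdahl's Law as speedup for a given parallel fraction, and the Universal Scalability Law in its usual form with a contention term (sigma) and a coherency term (kappa). The parameter values are made up for illustration; in practice they would be fitted to measured throughput.

```python
from math import sqrt


def amdahl_speedup(n: int, parallel_fraction: float) -> float:
    """Amdahl's Law: speedup on n processors when only part of the work is parallel."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / n)


def usl_capacity(n: int, sigma: float, kappa: float) -> float:
    """Universal Scalability Law: relative capacity at concurrency n.

    sigma models contention (queueing on shared resources);
    kappa models coherency cost (keeping nodes in agreement).
    """
    return n / (1.0 + sigma * (n - 1) + kappa * n * (n - 1))


def usl_peak_concurrency(sigma: float, kappa: float) -> float:
    """Concurrency at which USL capacity peaks (requires kappa > 0)."""
    return sqrt((1.0 - sigma) / kappa)


# Hypothetical coefficients, for illustration only.
sigma, kappa = 0.1, 0.001
for n in (1, 10, 30, 100):
    print(n, f"{amdahl_speedup(n, 1.0 - sigma):.2f}", f"{usl_capacity(n, sigma, kappa):.2f}")
print("USL peak near n =", usl_peak_concurrency(sigma, kappa))
```

With kappa set to zero the USL reduces to Amdahl's Law. A nonzero kappa makes capacity fall again past the peak, so adding nodes can make a system slower.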




Data Driven Operations




                     • Modeling isn’t just for capacity planning.




                                   Montagne (http://queue.acm.org/detail.cfm?id=1862187)


The End




Questions?



                            Joe Williams - @williamsjoe




Availability, The Cloud and Everything (version 2, Surge2010)
