ENTERPRISE ARCHITECTURE




                                  THE 5 PRINCIPLES OF GOOGLE’S ”CLOUD”
                                  Patrik Svensson, 2011, ptrksvnssn@gmail.com




Thursday, 12 May 2011

                                  THE VISION OF GOOGLE

                                      THE 5 PRINCIPLES
                                  •   Everything is a service (or an application in
                                      Android)

                                  •   Relentless technical focus (thinking at nanoscale)

                                  •   Data centers are the foundation

                                  •   Code is king, Data is king kong

                                  •   Identify and keep track of your users



                                  #1 EVERYTHING IS A SERVICE (OR AN APPLICATION)

       #2 RELENTLESS TECHNICAL FOCUS

       •     Jedis build their own lightsabres

       •     Parallelize, Distribute, Cache, Compress, Redundantize everything

       •     Latency is VERY evil

       Source: http://www.flickr.com/photos/60994749@N07/5557591956/
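
Two of the verbs on this slide, cache and compress, can be illustrated in a few lines. The sketch below only uses Python's standard library and is not Google's implementation; `slow_lookup` and the payload helpers are hypothetical names chosen for the example.

```python
import functools
import zlib


@functools.lru_cache(maxsize=128)
def slow_lookup(key: str) -> str:
    """Hypothetical expensive lookup; repeated keys are served from the cache."""
    return key.upper() * 100


def compress_payload(text: str) -> bytes:
    """Compress a text payload before it is stored or sent over the network."""
    return zlib.compress(text.encode("utf-8"))


def decompress_payload(blob: bytes) -> str:
    """Reverse compress_payload."""
    return zlib.decompress(blob).decode("utf-8")


if __name__ == "__main__":
    first = slow_lookup("latency")   # computed
    second = slow_lookup("latency")  # answered from the cache
    blob = compress_payload(first)
    print(slow_lookup.cache_info())
    print(len(first), "characters compressed to", len(blob), "bytes")
```

Caching avoids repeating work and compression reduces the bytes that must travel, so both attack latency from different directions.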




                                  EXAMPLE: ”NUMBERS EVERYONE SHOULD KNOW”





                                  1,000,000 ns     = 1 ms
                                  1,000,000,000 ns = 1 s

                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
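
The two conversions on this slide are enough to reason about latency budgets. The following is a minimal sketch of that arithmetic; the function names and the 10,000,000 ns example latency are assumptions made for illustration, not figures taken from the slide.

```python
NS_PER_MS = 1_000_000        # 1,000,000 ns = 1 ms
NS_PER_S = 1_000_000_000     # 1,000,000,000 ns = 1 s


def ns_to_ms(ns: float) -> float:
    """Convert nanoseconds to milliseconds."""
    return ns / NS_PER_MS


def ns_to_s(ns: float) -> float:
    """Convert nanoseconds to seconds."""
    return ns / NS_PER_S


def operations_per_second(latency_ns: float) -> float:
    """How many strictly sequential operations of this latency fit in one second."""
    return NS_PER_S / latency_ns


if __name__ == "__main__":
    latency_ns = 10_000_000  # hypothetical operation taking 10 ms
    print(ns_to_ms(latency_ns), "ms per operation")
    print(operations_per_second(latency_ns), "sequential operations per second")
```

Thinking in nanoseconds makes it obvious how quickly sequential slow operations eat up a one-second budget.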



                                  #3 DATA CENTERS ARE THE FOUNDATION




                                  ECONOMIES OF SCALE

                                  •   ~40 data centers in 2009, 1,000,000 machines




                                  Source: http://techcrunch.com/2008/04/11/where-are-all-the-google-data-centers/




                                  #4 CODE IS KING, DATA IS KING KONG




                                  Enterprise Architecture

                                  Technical Architecture, i.e. which technologies do we use

                                  •   DATA CENTERS: "We need: cooling, power, perimeter networks, containers, racks,
                                      switches & hardware at low cost that scale"

                                  •   DATA: "We need: one distributed file system, one distributed shared memory
                                      & common data formats to get scale and low cost"

                                  •   CODE: "We need to build applications and services, application-, integration-
                                      & data platforms, parallel computing platforms & use an open source OS, upon
                                      our data center/data platform"

                                  •   CONTROL: "We need scheduling, synchronization, lock services, i.e. various
                                      forms of control mechanisms for data and code"

                                  •   USERS: "We need to identify our users to be able to interact, differentiate
                                      and customize the user experience"

                                  Implementation Architecture, i.e. how do we implement the technologies

                                  •   DATA CENTERS: Google container-based data centers

                                  •   DATA: GFS, BigTable, Protocol Buffers

                                  •   CODE: Android, Chrome; App Engine, Gmail, Search, Index; Python, Java, C++;
                                      Protocol Buffers, JSON; Sawzall, Dremel, Percolator; MapReduce; Linux

                                  •   CONTROL: GFS master, Google Work Queue, Chubby, NetScaler, Google HTTP Server,
                                      (Spanner)

                                  •   USERS: OpenID, OAuth, Google Accounts available for most services




                                  ABOUT DATA

                                  "Google's mission is to organize the world's information and make it
                                  available to all"

                                  [Bar chart: data volume by type of data]

                                  •   Structured, Numerical: ~2.5 Terabyte

                                  •   Unstructured, Textual: ~10 Terabyte/day

                                  •   Communication, Traffic: +20 Petabyte/day




                                  DATA CENTER ”ENTRY”

                                  •   The same entry to each Data Center

                                  •   ~50 caching (using Squid)

                                  •   Built their own HTTP servers/farms




                                             Source: Ed Austin, ”The Anatomy of the Google Architecture”




                                  INSIDE THE CONTAINERS

                                  •   Customized commodity servers in customized racks in
                                      containers (+1000 servers), organized into clusters

                                  •   All containers are ”cloned” and look the same




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”




                                  THE SAME HW, OS AND FILESYSTEM EVERYWHERE




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”




                                  BIGDATA AS DATABASE




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”




                                  BIGDATA IS COLUMN-BASED




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
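
What "column-based" means can be shown with a generic sketch. It assumes a plain column-oriented layout in memory and does not reproduce the data model of BigTable or any other Google system; `rows_to_columns`, `column_sum` and the sample rows are hypothetical.

```python
from typing import Any


def rows_to_columns(rows: list[dict[str, Any]]) -> dict[str, list[Any]]:
    """Turn a list of row records into one list per column.

    Columns missing from a row are stored as None, so sparse rows are allowed.
    """
    columns: dict[str, list[Any]] = {}
    for index, row in enumerate(rows):
        for name in row:
            if name not in columns:
                # Back-fill earlier rows that did not have this column.
                columns[name] = [None] * index
        for name, values in columns.items():
            values.append(row.get(name))
    return columns


def column_sum(columns: dict[str, list[Any]], name: str) -> float:
    """Aggregate one column without touching the others."""
    return sum(value for value in columns[name] if value is not None)


if __name__ == "__main__":
    rows = [
        {"url": "a.example", "hits": 3},
        {"url": "b.example", "hits": 5, "lang": "sv"},
    ]
    columns = rows_to_columns(rows)
    print(columns)
    print(column_sum(columns, "hits"))
```

Storing values per column means a query that needs only one attribute reads only that attribute, which matters when rows are wide and sparse.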



                                  BIGDATA NEEDS GFS

                                  •   Use GFS to store data and logs




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”




                                  MAPREDUCE - A PARALLEL COMPUTING PLATFORM




                                  Source: Jeff Dean, ”Designs, Lessons and Advice from Building Large Distributed Systems”
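
The programming model behind MapReduce can be sketched in a single process. The example below is a toy word count that shows the map, group-by-key and reduce phases; it runs on one machine and leaves out the distribution, scheduling and fault tolerance that make the real platform parallel. All function names are assumptions made for this example.

```python
from collections import defaultdict
from typing import Callable, Iterable, Iterator


def map_words(document: str) -> Iterator[tuple[str, int]]:
    """Map phase: emit (word, 1) for every word in one document."""
    for word in document.lower().split():
        yield word, 1


def reduce_counts(word: str, counts: Iterable[int]) -> tuple[str, int]:
    """Reduce phase: sum all counts emitted for one word."""
    return word, sum(counts)


def run_mapreduce(
    documents: Iterable[str],
    mapper: Callable[[str], Iterator[tuple[str, int]]],
    reducer: Callable[[str, Iterable[int]], tuple[str, int]],
) -> dict[str, int]:
    """Run map, group the intermediate pairs by key, then reduce each group."""
    groups: dict[str, list[int]] = defaultdict(list)
    for document in documents:
        for key, value in mapper(document):
            groups[key].append(value)
    return dict(reducer(key, values) for key, values in sorted(groups.items()))


if __name__ == "__main__":
    slides = ["code is king", "data is king kong"]
    print(run_mapreduce(slides, map_words, reduce_counts))
```

Because each document is mapped independently and each key is reduced independently, both phases can be spread over many machines without changing the two user-supplied functions.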




      ABOUT CODING AT GOOGLE

      •     Linux as operating system everywhere - it is open source and highly customized for this (Android is
            also a highly customized version of Linux)

      •     Serialization/Integration - Protocol Buffers (RPC) run at nano speed and are used internally for
            ”everything”; JSON and RESTful interfaces are used for external APIs

      •     Application-oriented programming languages - mainly Python, Java and C++

      •     Data-oriented programming languages - Percolator, Sawzall, Dremel for various data
            processing tasks (so specialised tools for data!)

      •     The business applications - Gmail, Search, App Engine etc. - built on top of the data center
            infrastructure, the data platform and the tools above
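
The split between a binary format internally and JSON externally can be motivated with a small size comparison. The sketch below uses Python's `struct` module as a stand-in for a compact binary encoding; it is not the Protocol Buffers wire format, and the record fields (`user_id`, `status`, `latency_ms`) are hypothetical.

```python
import json
import struct

# Little-endian, no padding: uint32 user id, uint16 status, float64 latency.
RECORD_FORMAT = "<IHd"


def encode_json(user_id: int, status: int, latency_ms: float) -> bytes:
    """Self-describing text encoding, convenient for external APIs."""
    record = {"user_id": user_id, "status": status, "latency_ms": latency_ms}
    return json.dumps(record).encode("utf-8")


def encode_binary(user_id: int, status: int, latency_ms: float) -> bytes:
    """Fixed-layout binary encoding; both sides must agree on the schema."""
    return struct.pack(RECORD_FORMAT, user_id, status, latency_ms)


def decode_binary(blob: bytes) -> tuple[int, int, float]:
    """Reverse encode_binary."""
    return struct.unpack(RECORD_FORMAT, blob)


if __name__ == "__main__":
    as_json = encode_json(42, 200, 1.5)
    as_binary = encode_binary(42, 200, 1.5)
    print(len(as_json), "bytes as JSON")
    print(len(as_binary), "bytes as binary")
    print(decode_binary(as_binary))
```

A schema-based binary encoding carries no field names and needs no text parsing, which is one reason such formats suit high-volume internal traffic, while JSON stays readable for external consumers.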




          #5 IDENTIFY AND KEEP TRACK OF YOUR USERS

          •       You need a Google account to start Android properly

          •       OpenSocial is a collaborative effort to compete against Facebook

          •       OpenID is an identity standard and OAuth is a standard for authorizing services

          •       Google is identifying and tracking every step you take within their domains


