Seminar: Cloud Computing




A Scalable Data Platform for a Large Number of Small Applications

                                                 Sabbir Ahmmed
Motivation




9 January 2013    Scaling Small Apps   2
Outline

 ➔
     The Article

 ➔
     Introduction

 ➔
     System Architecture

 ➔
     Failure Management in a Cluster

 ➔
     Enforcing Database SLAs

 ➔
     Experimental Evaluation

 ➔
     Conclusion

The Article

 Work done by: Fan Yang, Jayavel Shanmugasundaram, Ramana Yerneni
 Work done at: Yahoo! Research

 Published in: 2009 (CIDR)
 Published under: Creative Commons License Agreement
 This paper is NOT about:
       ➔
           File Systems/ Low Level Internals (Database vendors' wizardry)
       ➔
           Database Paradigms (Relational, NoSQL or any other kind)
 This paper is about:
       ➔
           Data Platform/Data Management Solution (SaaS, PaaS, IaaS)
       ➔
           Design Space/System Architecture (for Cloud service providers)
Introduction (I)

 ➔
     Small/Community Applications
       ➔
           Relatively small data size (tens to thousands of megabytes)
       ➔
         Small throughput requirement (tens to hundreds of concurrent user
       sessions)
       ➔
           Comfortably fit in a single machine

 ➔
     However
       ➔
           Large number of such applications in a large social network.
             ➔
                 Tens of thousands!!
       ➔
           Combined data size and workload are quite large.
             ➔
                 Petabytes of data and millions of concurrent users!!
Introduction (II)

 ➔
     Problems with existing data-management solutions
       ➔
           Commercial database systems
             ➔
                 Scale well
             ➔
                 Large monetary cost (Licensing with premium)

       ➔
           Open-source alternatives
             ➔
                 Free
             ➔
                 Do not scale well

       ➔
           Peer-to-Peer (DHTs, Ordered tables)
             ➔
                 Excellent scalability and throughput performance
             ➔
                 Only support very simple data-manipulation operations
Introduction (III)

 ➔
     Problems with existing data-management solutions
       ➔
           Other emerging data platforms (e.g., Bigtable, PNUTS, SimpleDB)
             ➔
                 Scale to a large number of data operations
             ➔
                 Lack rich query processing capabilities (crucial for most web apps!)
             ➔
                 Restrict the kinds of queries applications can issue!



     Finally, all of the above solutions lack multi-tenancy support!!



Introduction (IV)

 ➔
     The goal of this paper is to design a data management solution that is:
       ➔
         Low cost
       ➔
         Full-featured
       ➔
         Multi-tenancy capable
       ➔
         Uses commodity hardware and free software (MySQL)
       ➔
         Exploits two main properties of applications:
             ➔
                 They are “small”
             ➔
                 Can comfortably “fit in a single machine”
       ➔
         And finally addresses two main challenges
             ➔
                 Fault-tolerance
             ➔
                 Ensuring SLAs

System Architecture (I)




System Architecture (II)

 ➔
   A few more words about the proposed architecture
       ➔
           The system controller, colo controller and cluster controller need to be fault tolerant
       ➔
           Two main types of failures need to be handled
             ➔
                 Machine failures within a colo (handled by synchronous replication within a colo)
             ➔
                 Colo failures (handled by asynchronous replication across colos)




Failure Management in a Cluster (I)
 ➔
   Replication Architecture
       ➔
           Uses single-node DBMSs
       ➔
           Each database is hosted on two or more machines
       ➔
           Coordinated by a cluster controller
             ➔
                 Maintains a map of databases to machines
             ➔
                 Manages all DB connections to the client
             ➔
                 Maintains a DB connection to each machine
             ➔
                 Uses Read-one write-all replication protocol
             ➔
                 Uses 2-phase commit (2PC) protocol
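
The controller duties listed above can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the paper's implementation: the `Replica` and `ClusterController` classes, their method names, and the `healthy` flag are all hypothetical stand-ins for real single-node DBMS connections. It shows a read served by one replica and a write that becomes visible only if every replica votes yes in the prepare phase.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    """Hypothetical in-memory stand-in for one single-node DBMS instance."""
    name: str
    data: dict = field(default_factory=dict)
    staged: dict = field(default_factory=dict)
    healthy: bool = True

    def prepare(self, writes: dict) -> bool:
        # Phase 1 of 2PC: stage the writes and vote yes, or vote no.
        if not self.healthy:
            return False
        self.staged = dict(writes)
        return True

    def commit(self) -> None:
        # Phase 2 of 2PC: make the staged writes visible.
        self.data.update(self.staged)
        self.staged = {}

    def abort(self) -> None:
        # Phase 2 of 2PC (failure path): discard the staged writes.
        self.staged = {}


class ClusterController:
    """Keeps the database-to-machines map and coordinates replicas."""

    def __init__(self) -> None:
        self.db_map: dict[str, list[Replica]] = {}

    def host(self, db: str, replicas: list[Replica]) -> None:
        self.db_map[db] = list(replicas)

    def read(self, db: str, key: str):
        # Read-one: a single replica is enough to answer a read.
        return self.db_map[db][0].data.get(key)

    def write(self, db: str, writes: dict) -> bool:
        # Write-all: every replica must prepare before any replica commits.
        replicas = self.db_map[db]
        prepared = []
        for replica in replicas:
            if replica.prepare(writes):
                prepared.append(replica)
            else:
                for p in prepared:
                    p.abort()
                return False
        for replica in replicas:
            replica.commit()
        return True


m1, m2 = Replica("m1"), Replica("m2")
controller = ClusterController()
controller.host("app_db", [m1, m2])
controller.write("app_db", {"k": 1})   # applied on both replicas
m2.healthy = False
controller.write("app_db", {"k": 2})   # rejected: m2 cannot prepare
```

In this sketch a write fails as soon as one replica cannot prepare. What the real system does once a machine is known to have failed is covered on the later failure-management slides.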




Failure Management in a Cluster (II)




Failure Management in a Cluster (III)




Failure Management in a Cluster (IV)
 ➔
     Where to route the read operations ?
 ➔
     Performance vs Load-balancing
       ➔
           There are three options:
             ➔
                 All read operations => same physical machine (Option 1)
             ➔
              All read operations from a single transaction => single physical machine
             but read operations from different transactions => different physical
             machines (Option 2)
             ➔
              Read operations from same transaction => different physical machines
             (Option 3)
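
The three routing options can be compared with a small sketch. The `ReadRouter` class is hypothetical, and round-robin is only an assumed way of spreading reads; the slides name the options but not the selection policy.

```python
import itertools


class ReadRouter:
    """Hypothetical router that picks a machine for each read operation."""

    def __init__(self, machines, option):
        self.machines = list(machines)
        self.option = option
        self._round_robin = itertools.cycle(self.machines)
        self._txn_machine = {}  # transaction id -> pinned machine (Option 2)

    def route(self, txn_id):
        if self.option == 1:
            # Option 1: every read goes to the same physical machine.
            return self.machines[0]
        if self.option == 2:
            # Option 2: a transaction is pinned to one machine, but
            # different transactions may land on different machines.
            if txn_id not in self._txn_machine:
                self._txn_machine[txn_id] = next(self._round_robin)
            return self._txn_machine[txn_id]
        if self.option == 3:
            # Option 3: reads of the same transaction may be spread
            # across different machines.
            return next(self._round_robin)
        raise ValueError(f"unknown option: {self.option}")


router = ReadRouter(["m1", "m2"], option=2)
print([router.route("t1"), router.route("t1"), router.route("t2")])
```

Option 1 leaves the other replicas idle for reads, while Options 2 and 3 spread the read load at progressively finer granularity.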




Failure Management in a Cluster (V)
 ➔
     What about the serializability guarantee?
 ➔
   Reminder: “a transaction schedule is serializable if its outcome (e.g.,
 the resulting database state) is equal to the outcome of its transactions
 executed serially, i.e., sequentially without overlapping in time”.
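
The definition can be made concrete with a brute-force check: run a schedule, then compare its final state with the final state of every serial order. This sketch is illustrative only. The step format `(txn, op, key, delta)` and all function names are assumptions, and it tests outcome equality for one initial state rather than implementing a general serializability test.

```python
from itertools import permutations


def run(schedule, initial):
    """Execute steps of the form (txn, op, key, delta).

    'r' copies db[key] into the transaction's private buffer;
    'w' writes the buffered value plus delta back to db[key].
    """
    db = dict(initial)
    buffers = {}
    for txn, op, key, delta in schedule:
        if op == "r":
            buffers[(txn, key)] = db[key]
        else:
            db[key] = buffers[(txn, key)] + delta
    return db


def is_serializable(schedule, txns, initial):
    """True if the schedule's outcome equals the outcome of some serial order."""
    outcome = run(schedule, initial)
    for order in permutations(txns):
        serial = [step for name in order for step in txns[name]]
        if run(serial, initial) == outcome:
            return True
    return False


# Two read-modify-write transactions on the same item x.
txns = {
    "T1": [("T1", "r", "x", 0), ("T1", "w", "x", 10)],
    "T2": [("T2", "r", "x", 0), ("T2", "w", "x", 5)],
}
initial = {"x": 100}

# Both transactions read before either writes, so T1's update is lost.
interleaved = [txns["T1"][0], txns["T2"][0], txns["T1"][1], txns["T2"][1]]
print(run(interleaved, initial))                    # {'x': 105}
print(is_serializable(interleaved, txns, initial))  # False
```

Either serial order ends with x = 115, so the interleaved schedule, which ends with x = 105, is not serializable under this definition.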




Failure Management in a Cluster (VI)
 ➔
     What happens when a machine fails?
       ➔
           The cluster controller continues to process database requests!!
       ➔
           It also initiates a background database replication process.




Failure Management in a Cluster (VII)
 ➔
     Key technical challenge lies in designing the replication process.
       ➔
           So that replicas are transactionally consistent !
       ➔
           Using existing DBMS tools (mysqldump in MySQL) !!
       ➔
           With minimum downtime to the database !!!
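
As a rough illustration of copying a database with stock tools, the sketch below builds a `mysqldump` command and pipes its output into the `mysql` client on the new machine. The slides name only `mysqldump`, so the flag choices, host names, and function names here are assumptions: `--single-transaction` is used to get a consistent snapshot of transactional (InnoDB) tables, and credentials are assumed to come from an option file. It does not deal with writes that arrive while the copy is running, which is the challenge the slide points to.

```python
import subprocess


def copy_commands(db, source_host, target_host):
    """Build the dump and load commands (hypothetical hosts, no credentials)."""
    dump_cmd = [
        "mysqldump",
        f"--host={source_host}",
        "--single-transaction",  # consistent snapshot for transactional tables
        "--databases", db,       # emits CREATE DATABASE / USE statements
    ]
    load_cmd = ["mysql", f"--host={target_host}"]
    return dump_cmd, load_cmd


def copy_database(db, source_host, target_host):
    """Stream a dump from the surviving replica into the new machine."""
    dump_cmd, load_cmd = copy_commands(db, source_host, target_host)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    load = subprocess.run(load_cmd, stdin=dump.stdout)
    dump.stdout.close()
    return dump.wait() == 0 and load.returncode == 0


# Only the commands are built here; nothing is executed.
print(copy_commands("app_db", "m1", "m3"))
```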




Failure Management in a Cluster (VIII)
 ➔
     Problem scenario:




Failure Management in a Cluster (IX)
 ➔
     Solution:




Enforcing Database SLAs (I)
 ➔
  Problem definition: to allocate databases to the minimum number of
 machines satisfying all database SLAs.
 ➔
     SLA Definition:
       ➔
           The minimum throughput over a time period T
       ➔
           The maximum fraction of rejected transactions over a time period T
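
The two SLA terms can be checked against counters collected over one period T. The sketch below is a hedged illustration: the `SLA` fields, the counter names, and the choice to measure throughput as committed transactions per second are assumptions, and it presumes the offered load during T was high enough for the throughput floor to be meaningful.

```python
from dataclasses import dataclass


@dataclass
class SLA:
    min_throughput: float       # committed transactions per second over T
    max_reject_fraction: float  # rejected / submitted transactions over T


def sla_met(sla, committed, rejected, period_seconds):
    """Check both SLA terms for one observation period T."""
    submitted = committed + rejected
    throughput = committed / period_seconds
    reject_fraction = rejected / submitted if submitted else 0.0
    return (throughput >= sla.min_throughput
            and reject_fraction <= sla.max_reject_fraction)


sla = SLA(min_throughput=10.0, max_reject_fraction=0.01)
print(sla_met(sla, committed=995, rejected=5, period_seconds=60))   # True
print(sla_met(sla, committed=950, rejected=50, period_seconds=60))  # False
```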




Enforcing Database SLAs (II)
 ➔
     Solution: Adopted First-Fit algorithms
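
A generic first-fit placement looks roughly like the sketch below. It is a simplification under stated assumptions: each database's SLA is reduced to a single integer demand (for example, a percentage of one machine's capacity), all machines are identical, and replication is ignored. The function name and inputs are hypothetical, and the paper's own variant may differ.

```python
def first_fit(demands, capacity):
    """Place each database on the first machine with enough free capacity.

    demands:  dict mapping database name -> required capacity units
    capacity: capacity units available on every machine
    Returns a list of machines, each a list of database names.
    """
    machines = []  # each entry: {"free": remaining units, "dbs": [names]}
    for db, need in demands.items():
        if need > capacity:
            raise ValueError(f"{db} does not fit on a single machine")
        for machine in machines:
            if machine["free"] >= need:
                machine["free"] -= need
                machine["dbs"].append(db)
                break
        else:
            # No existing machine has room, so open a new one.
            machines.append({"free": capacity - need, "dbs": [db]})
    return [machine["dbs"] for machine in machines]


print(first_fit({"a": 50, "b": 70, "c": 30, "d": 20}, capacity=100))
# [['a', 'c', 'd'], ['b']]
```

First-fit is a heuristic: it is fast and simple, but it does not guarantee the true minimum number of machines.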




Experimental Evaluation (I)
 ➔
     Synchronous Replication:




Experimental Evaluation (II)
 ➔
     Failure Recovery:




Conclusion

 ➔
     Opinion!
 ➔
     Critical assessment !!




Questions





