Auto-Scaling to Minimize Cost and Meet Application Deadlines in Cloud Workflows

SC 11 (Nov 16, TCC 305)

Ming Mao, Marty Humphrey
CS Department, University of Virginia

Introduction


       Resource provisioning questions are not trivial
           Under-provisioning → hurts performance
           Over-provisioning → pay more than necessary

         How many resources?
         What types of resources?

         When to acquire or release?

         How to use them?

       A performance-resource mapping problem
Auto-Scaling


       Schedule-based and rule-based auto-scaling
           E.g. “run 10 instances between 8 AM and 6 PM every day and
            2 instances at all other times.”
           E.g. “add (remove) 2 instances when the average CPU
            utilization is above 70% (below 20%) for 5 minutes.”
           Simple and convenient, works well for simple applications
           What if the relationship between performance and
            resource utilization indicators is complex?
           The resource utilization indicators are low-level and may
            not be expressive enough
           They do not consider the user budgets well
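
The CPU-threshold rule quoted above can be sketched in a few lines of Python. This is a minimal illustration, not any provider's API: the class name `RuleBasedScaler`, the one-sample-per-minute cadence, and the reading of "for 5 minutes" as "every one of the last 5 samples" are all assumptions made here.

```python
from collections import deque


class RuleBasedScaler:
    """Toy rule-based scaler (hypothetical class, not a real cloud API)."""

    def __init__(self, instances, high=70.0, low=20.0, window=5, step=2,
                 min_instances=1):
        self.instances = instances
        self.high = high                      # scale-out threshold (% CPU)
        self.low = low                        # scale-in threshold (% CPU)
        self.step = step                      # instances added or removed per action
        self.min_instances = min_instances
        self.samples = deque(maxlen=window)   # assumed: one sample per minute

    def observe(self, cpu_percent):
        """Record one average-CPU sample and return the current instance count."""
        self.samples.append(cpu_percent)
        if len(self.samples) == self.samples.maxlen:
            if all(s > self.high for s in self.samples):
                self.instances += self.step
                self.samples.clear()          # start a fresh observation window
            elif all(s < self.low for s in self.samples):
                self.instances = max(self.min_instances,
                                     self.instances - self.step)
                self.samples.clear()
        return self.instances
```

Nothing in this rule refers to deadlines, VM prices, or budgets, which is the limitation the bullets above point out.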
Auto-Scaling


       Goals of auto-scaling mechanisms
             Balance performance and cost
                   E.g. meet performance goals at minimum cost, or maximize
                    utility within a limited budget
             Reflect different options for computing resources
                   E.g. VMs have different processing power and price
             Be aware of practical considerations
                   E.g. a VM may take several minutes to be ready to use
             Be aware of the cloud billing model
                   E.g. billed by instance-hours
             Support specific application performance requirements
                   E.g. deadlines, the number of concurrent users, communication
                    latency
Cloud application model

       [Figure: cloud application model. Non-member, silver-member, and gold-member jobs enter the cloud application at the Entry Point (1). Service units: Authentication (2), Data Validation (3), Loading Profile (4), Credit History (5), Health Record (6), Base Model (7), Third Party Evaluation (8), Advanced Model (9), Complete Model (10), Response (11). An auto-scaling component manages the cloud VMs.]

       App consists of service units
       Job consists of tasks
       Jobs are categorized into classes (deadline and processing flow)
       Cloud offers multiple VM types (price and processing power)
       App has no advance knowledge of the workload
       VMs take time to start up (VM acquisition delay) and are billed by the hour
Problem definition

        Cloud application
            $app = \{S_i\}$
        Job class
            $J = \{DAG(S_i),\ deadline \mid S_i \in app\}$
        Cloud VM
            $VM_v = \{[J_j^{S_i}]_v,\ c_v,\ lag_v\}$
        Workload
            $W_t = \sum_{S_i} \sum_{J} J_j^{S_i}$
        Scaling plan
            $Scaling_t = \{VM_v,\ N_v\}$
        Scheduling plan
            $Schedule_t = \{J_j^{S_i} \to VM_v\}$
        Goal
            $Min(C) = Min(\sum_v c_v N_v)$
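
As a small worked instance of the goal, the sketch below evaluates C = sum over v of c_v * N_v for two candidate scaling plans and keeps the cheaper one. The function name `scaling_cost` and the two plans are hypothetical; the hourly prices are the Standard and High-CPU figures from the evaluation slide, and a single billing hour is assumed.

```python
def scaling_cost(scaling_plan, hourly_price):
    """Return sum_v c_v * N_v for one billing hour.

    scaling_plan maps a VM type v to its instance count N_v;
    hourly_price maps a VM type v to its hourly price c_v.
    """
    return sum(hourly_price[vm_type] * count
               for vm_type, count in scaling_plan.items())


hourly_price = {"Standard": 0.085, "High-CPU": 0.68}   # c_v in $/hour

# Two hypothetical plans, assumed here to satisfy the same deadlines.
plan_a = {"Standard": 4, "High-CPU": 0}
plan_b = {"Standard": 0, "High-CPU": 1}

cheapest = min([plan_a, plan_b], key=lambda p: scaling_cost(p, hourly_price))
```

Whether four Standard instances really can replace one High-CPU instance depends on the task running times on each VM type, which this sketch does not model.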
Solution


       SCS (Scaling-Consolidation-Scheduling)
         Task bundling
         Deadline assignment
         Scaling
         Instance consolidation
         Scheduling
Solution – Step 1


       Task bundling
         Idea – force tasks to run on the same instance to improve
          performance and save data transfer cost

         Example
           Before: T6 runs on Server 1 and T8 runs on Server 2
           After: T6 and T8 are bundled as T6' and both run on Server 1
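
One way to picture bundling is the sketch below. It assumes a linear chain of tasks and merges neighbours that prefer the same VM type; that merge criterion, the function name `bundle_chain`, and the VM types and running times in the usage example are illustrative assumptions, not details taken from the slide.

```python
def bundle_chain(tasks):
    """Merge consecutive tasks in a linear chain that prefer the same VM type.

    tasks: list of (name, preferred_vm_type, runtime_minutes) in execution order.
    Returns a list of bundles with the same tuple shape; a bundle's runtime is
    the sum of its members' runtimes.
    """
    bundles = []
    for name, vm_type, runtime in tasks:
        if bundles and bundles[-1][1] == vm_type:
            prev_name, _, prev_runtime = bundles[-1]
            bundles[-1] = (prev_name + "+" + name, vm_type, prev_runtime + runtime)
        else:
            bundles.append((name, vm_type, runtime))
    return bundles


# Hypothetical chain: T6 and T8 prefer the same type, T10 does not.
chain = [("T6", "Standard", 10), ("T8", "Standard", 20), ("T10", "High-CPU", 5)]
bundled = bundle_chain(chain)
```

Here T6 and T8 collapse into one unit that is later placed on a single instance, so the data passed between them never leaves that machine.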
Solution – Step 2


       Deadline assignment
         Idea – to break task dependencies, assign deadlines
          proportionally based on task running time (on their most
          cost-efficient machines)
         Example
           [Figure: workflow of tasks T1 to T13. Before: only the job window is known, from 3:00 PM to 4:30. After: each task has its own execution interval, with boundaries at 3:00, 3:10, 3:20, 3:50, 4:00, 4:20, and 4:30.]
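
For a purely sequential chain, proportional deadline assignment reduces to splitting the job window by running time. The sketch below shows only that special case; the function name `assign_deadlines` and the running times are assumptions, and a real workflow with parallel branches (as in the figure) would need the split applied along its paths rather than a single list.

```python
def assign_deadlines(start, deadline, runtimes):
    """Split [start, deadline] across a chain of tasks in proportion to runtime.

    start, deadline: times in minutes; runtimes: per-task running times on each
    task's most cost-efficient VM type. Returns one (sub_start, sub_deadline)
    pair per task, in chain order.
    """
    total = sum(runtimes)
    window = deadline - start
    intervals = []
    cursor = start
    for runtime in runtimes:
        share = window * runtime / total
        intervals.append((cursor, cursor + share))
        cursor += share
    return intervals


# A 90-minute window (e.g. 3:00 to 4:30) and three hypothetical tasks.
intervals = assign_deadlines(0, 90, [10, 20, 30])
```

Once every task has its own interval, each can be provisioned for independently, which is what "breaking task dependencies" buys.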
           Task upgrading
             $rank = \frac{makespan_{before} - makespan_{after}}{cost_{after} - cost_{before}}$
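
The rank is the makespan saved per extra dollar spent, so a natural use is to upgrade the candidate with the highest rank first. The sketch below assumes every upgrade strictly increases cost; the function name `upgrade_rank` and the candidate numbers are hypothetical.

```python
def upgrade_rank(makespan_before, makespan_after, cost_before, cost_after):
    """Makespan reduction per unit of added cost for one candidate upgrade."""
    if cost_after <= cost_before:
        raise ValueError("assumed: an upgrade always costs more than before")
    return (makespan_before - makespan_after) / (cost_after - cost_before)


# Hypothetical candidates: (task, makespan after upgrade in min, cost after in $).
makespan_now, cost_now = 90, 1.00
candidates = [("T3", 70, 1.50), ("T8", 80, 1.10)]

best = max(candidates,
           key=lambda c: upgrade_rank(makespan_now, c[1], cost_now, c[2]))
```

With these numbers T8 saves less time than T3 but is far cheaper, so it ranks higher.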
Solution – Step 3


        Determine the number of instances
          From deadline assignment, we have
            Task running time – t_m
            Task execution interval – [T_0, T_1]

          Load vector
            LV_m = [t_m / (T_1 – T_0)]
            # of instances = [LV_m]

          Example (VM1)

                    Earlier   3:00–3:15   3:15–3:45   3:45–4:00   Later
            T1      0         0.25        0.25        0.25        0
            T2      0         0           0.5         0           0
            All     0         0.25        0.75        0.25        0
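
The example's numbers can be reproduced with a short sketch. Times are minutes after 3:00; the 15-minute running times are assumed values chosen so that the loads come out as 0.25 and 0.5, and reading the instance count as the ceiling of the summed load is also an assumption. The function name `load_vector` is hypothetical.

```python
import math


def load_vector(tasks, boundaries):
    """Sum each task's load t / (T1 - T0) over the intervals it covers.

    tasks: list of (runtime, t0, t1); boundaries: sorted time points that
    include every task's t0 and t1. Returns one load per interval
    [boundaries[k], boundaries[k + 1]].
    """
    loads = [0.0] * (len(boundaries) - 1)
    for runtime, t0, t1 in tasks:
        density = runtime / (t1 - t0)
        for k in range(len(loads)):
            if t0 <= boundaries[k] and boundaries[k + 1] <= t1:
                loads[k] += density
    return loads


boundaries = [0, 15, 45, 60]            # 3:00, 3:15, 3:45, 4:00
t1_task = (15, 0, 60)                   # 15 min of work due within 3:00-4:00
t2_task = (15, 15, 45)                  # 15 min of work due within 3:15-3:45

loads = load_vector([t1_task, t2_task], boundaries)
instances = [math.ceil(load) for load in loads]
```

The summed load never exceeds 1, so a single VM suffices in every interval, consistent with the example showing only VM1.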
Solution – Step 5


        Instance consolidation
          Idea – put tasks on the same instance even if some
           tasks may not run most cost-efficiently on that
           machine

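          Example (3:00 PM to 4:00 PM)
            Before: T11 runs on a High-CPU instance and T12 on a Standard instance; each instance is idle for the rest of the hour
            After: T11 and T12 both run on the Standard instance, which is idle for the rest of the hour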
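
Consolidation pays off because a partly used instance-hour costs the same as a full one. The sketch below compares the two placements under the assumption that partial hours are billed as full hours; the running times are hypothetical (T11 is assumed to run slower on Standard than on High-CPU), and the prices are the Standard and High-CPU figures from the evaluation slide.

```python
import math

PRICE = {"Standard": 0.085, "High-CPU": 0.68}   # $/hour


def billed_cost(instances):
    """Total cost of a placement.

    instances: list of (vm_type, busy_minutes), one entry per running VM.
    Assumes each VM is billed for whole hours, rounding partial hours up.
    """
    return sum(PRICE[vm_type] * math.ceil(minutes / 60)
               for vm_type, minutes in instances)


# Before: T11 takes 20 min on High-CPU, T12 takes 25 min on Standard.
before = billed_cost([("High-CPU", 20), ("Standard", 25)])

# After: T11 takes 30 min on Standard, so both tasks fit in one Standard hour.
after = billed_cost([("Standard", 30 + 25)])
```

The move is only valid if both tasks still finish within their assigned intervals on the slower machine, which has to be checked separately.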
Solution – Step 6


        Scheduling – Earliest Deadline First
          Dynamic scaling makes sure that tasks at risk of
           missing their deadlines are found in time

             $\sum_i \frac{t_i}{T_{end\_i} - T_{start\_i}} < 1$
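
A minimal sketch of how the inequality and Earliest Deadline First could fit together on one VM: the load test flags a queue whose summed load reaches 1, and the remaining tasks run in order of their end times. Treating the inequality as a per-VM admission test is an interpretation made here, and the function names and task values are hypothetical.

```python
def is_overloaded(tasks):
    """True if sum_i t_i / (T_end_i - T_start_i) is not below 1.

    tasks: list of (runtime, t_start, t_end), all in minutes.
    """
    return sum(t / (end - start) for t, start, end in tasks) >= 1


def edf_order(tasks):
    """Return the tasks sorted by deadline (t_end), earliest first."""
    return sorted(tasks, key=lambda task: task[2])


queue = [(15, 0, 60), (15, 15, 45)]          # summed load 0.25 + 0.5 = 0.75
fits = not is_overloaded(queue)

crowded = queue + [(20, 0, 40)]              # adds 0.5, summed load 1.25
needs_more_vms = is_overloaded(crowded)
order = edf_order(crowded)
```

When the test fails, the scaling step would have to supply another instance before the queue is scheduled.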
Solution – Overview


       Parallelism reduction
Evaluation

        Workload patterns




        Application models
        VM types
            VM Type         Price
            Micro           $0.02/hour
            Standard        $0.085/hour
            High-CPU        $0.68/hour
            High-Memory     $0.50/hour

        Baselines: Greedy, GAIN
        Time: 72 hours
        Task execution: randomly generated
        VM lag: 8 min
Evaluation




      SCS cost savings range from 6.8% to 40.4%
      The performance difference is larger with longer deadlines
Evaluation – High volume vs. low volume


        High workload (10X) vs. low workload (X)
          Pipeline, 1-hour deadline
          [Figure: High Volume vs. Low Volume. Cost ($), on a 0 to 120 scale, of Greedy, GAIN, and SCS at high and low volume for the Stable, Growing, Cycle, and OnOff workload patterns.]
Evaluation – Imprecise parameters

        [Figure: Deadline (0.5 hour) Non-Miss Rate for Imprecise Task Execution Estimation. Non-miss rate (%) of Greedy, GAIN, and SCS for the Stable, Growing, Cycle, and OnOff workload patterns.]
          Pipeline application, 20% variance in estimated execution time, 0.5-hour deadline
          SCS finishes more than 90% of jobs before their deadlines, much better than Greedy (40%) and GAIN (50%)

        [Figure: Deadline (1 hour) Non-Miss Rate for Imprecise Instance Acquisition Lag. Non-miss rate (%) of Greedy, GAIN, and SCS for the Stable, Growing, Cycle, and OnOff workload patterns.]
          Pipeline application, 20% variance in the estimated VM acquisition time, 1-hour deadline
          SCS beats Greedy and GAIN
          The performance is more affected by the VM acquisition time
Related work


        Dynamic resource provisioning in virtualized
         environments
              Multi-tier web applications, queuing theory, control theory
        Workflow scheduling in Grid environments with
         deadline and budget constraints
            Single workflow instance
            Resource pool is limited
        Cloud economics
              Cloud provider side vs. cloud user side
        Current cloud auto-scaling mechanisms
              E.g. AWS auto-scaling, RightScale, enStratus, Scalr, AzureScale
               project, etc.
Conclusion and future work

        Conclusions
            SCS cost savings range from 6.8% to 40.4%
            SCS handles different workload volumes and imprecise
             parameters better than Greedy and GAIN
            Choosing proper VM types based on the workload saves cost
            Instance consolidation can help save partial instance hours
            VM acquisition time plays a very important role

        Future work
            Different scheduling approaches
            Real scientific applications
            Insufficient budget cases - maximize cloud user benefits/utilities
             under budget constraints
            Data-intensive applications
     Thank you!

