Coordinating CPU and Memory Elasticity Controllers to
Meet Service Response Time Constraints
Soodeh Farokhi¹, Ewnetu Bayuh Lakew², Cristian Klein³, Ivona Brandic¹, and Erik Elmroth²
¹ Vienna University of Technology, Austria
² Umeå University, Sweden
³ SimScale GmbH, Germany
IEEE International Conference on Cloud and Autonomic Computing (ICCAC)
Cambridge, MA, USA
September 21-24, 2015
Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015
Introduction: Elasticity in Cloud Computing
Horizontal Elasticity
 Coarse-grained
 Slow (minutes)
 Needs application support
• State synchronization
• Load-balancing
Vertical Elasticity
 Fine-grained
 Fast (sub-second)
 Little application support
• Multi-threaded
 Needs Hypervisor support
Fig. 1: Vertical vs. horizontal elasticity [1]
[1] E. Lakew et al., UCC 2014
Vertical Elasticity
 key enabler for efficient resource utilization
 little support from the commercial cloud providers
• hypervisors (e.g., Xen and KVM) have started to support it
 few research efforts
• single resource => either CPU or memory (mostly CPU)
o applications need different resources at different execution stages
o combining existing single-resource approaches without proper orchestration => inconsistent decisions!
• resource utilization taken as the decision-making criterion
o capacity-based approaches
o response time as a key indicator => performance-based approaches
Motivation Scenario
Fig. 3: Resource utilization of RUBiS (eBay-like e-commerce app.) under different workload patterns.
[Two plots: memory utilization [%] and CPU utilization [%] over time [seconds]. Each 250-second interval uses a different workload setting (cc, tt): (100, 1), (1500, 5), (100, 0.1), (100, 1), (1000, 1).]
 RUBiS as a benchmark application
• CPU-intensive behavior
• memory-intensive behavior
 Can an application be both?!
• based on the nature of the workload
 How to support CPU & Memory vertical elasticity?
Outline
 Goals & Challenges
 Autonomic Resource Controller
• CPU controller
• Memory controller
• Fuzzy controller (coordinator)
 Experimental Evaluation
 Conclusion
Research Goals & Challenges
 Goal: designing an autonomic resource elasticity controller
• Meeting the application performance requirements
• Supporting vertical elasticity of both CPU & memory
 Challenges:
• Performance modeling for vertical scaling of both CPU & memory
• Elasticity reasoning for multiple resources under uncertainty
 F: average response time -> CPU & memory capacity
Solution Overview: Autonomic Resource Controller
 Follows the MAPE (Monitor, Analyze, Plan, Execute) loop
• Monitoring
• Analysis & Planning:
o Step 1: the fuzzy controller infers how much CPU and memory each contribute to the change in application performance
o Step 2: the CPU & memory controllers determine the required CPU & memory capacity
• Execution
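The two-step Analysis & Planning phase above can be sketched as a single control iteration. This is only an illustration under stated assumptions: the names `Sample` and `control_step`, the callable signatures, and the rule that the coefficient's magnitude scales how much of each controller's proposed change is applied are all hypothetical and are not taken from the slides.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class Sample:
    """One monitoring sample (hypothetical field names)."""
    rt: float        # measured mean response time [s]
    cpu_util: float  # CPU utilization in [0, 1]
    mem_util: float  # memory utilization in [0, 1]


def control_step(
    sample: Sample,
    cores: float,
    mem_gb: float,
    coordinator: Callable[[Sample], Tuple[float, float]],
    cpu_ctrl: Callable[[Sample, float], float],
    mem_ctrl: Callable[[Sample, float], float],
) -> Tuple[float, float]:
    """Run one Analysis & Planning pass and return the new (cores, mem_gb)."""
    # Step 1: the coordinator returns one coefficient per resource in [-1, +1].
    cpu_coef, mem_coef = coordinator(sample)
    # Step 2: each controller proposes a capacity; the coefficient's magnitude
    # decides how much of the proposed change is applied (assumed scheme).
    new_cores = cores + abs(cpu_coef) * (cpu_ctrl(sample, cores) - cores)
    new_mem = mem_gb + abs(mem_coef) * (mem_ctrl(sample, mem_gb) - mem_gb)
    return new_cores, new_mem
```

Passing the coordinator and the two controllers in as callables keeps the sketch independent of how each one is actually implemented.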
Fig. 4: Solution overview
[Diagram components: App/VM sensors; Autonomic Resource Controller containing the fuzzy controller (with knowledge base), the CPU controller, and the memory controller; CPU coefficient and mem coefficient passed from the fuzzy controller to the two controllers; # CPU and # MEM applied to the application VM through the hypervisor.]
CPU & Memory Controllers
 Performance-based, adaptive & reactive controllers
… α and β are system parameters
 CPU Controller [2]
• follows an inverse model
• uses recursive least squares (RLS) filtering to estimate β adaptively
o based on past measurements
 Memory Controller [3]
• follows a feedback control loop
• uses linear regression (LR) to calculate α
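The slide does not show the model in which β appears, so the following is only a generic sketch of how RLS filtering can track a single parameter from past measurements. It assumes a scalar linear relation y ~ beta * x; the class name `ScalarRLS`, the initial values, and the forgetting factor are hypothetical, and what x and y stand for in the CPU controller is defined in [2], not here.

```python
class ScalarRLS:
    """Recursive least squares for y ~ beta * x with a forgetting factor."""

    def __init__(self, beta0: float = 1.0, p0: float = 1000.0, forgetting: float = 0.95):
        self.beta = beta0      # current estimate of beta
        self.p = p0            # estimate covariance (large = low confidence)
        self.lam = forgetting  # 0 < lam <= 1; smaller values forget old data faster

    def update(self, x: float, y: float) -> float:
        """Fold in one (x, y) measurement and return the updated estimate."""
        gain = self.p * x / (self.lam + x * self.p * x)
        self.beta += gain * (y - self.beta * x)  # correct by the prediction error
        self.p = (self.p - gain * x * self.p) / self.lam
        return self.beta
```

Calling `update` once per control interval lets the estimate follow slow drifts in the parameter without refitting on the full measurement history.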
Fig. 5: CPU controller [2]
Fig. 6: The feedback loop used for the memory controller [3]
[Diagram: the error e_i = rt~ - rt_i, where rt~ is the desired RT and rt_i the measured RT, feeds the memory controller; the controller sets the memory size (labelled ctl_i and mem_i) of the application VM, which is driven by the workload.]
[2] E. Lakew et al., UCC 2014
[3] S. Farokhi et al., ICAC 2015
Fuzzy Controller
as a Coordination Technique
[Diagram: solution overview, as in Fig. 4]
Fuzzy Controller Design
A fuzzy logic control approach is defined by:
• membership functions (MFs): each maps a value to a degree of truth between 0 and 1
• fuzzy rules: a collection of “IF THEN ELSE” rules
 Design process
• Step 1: MF Construction
• Step 2: Fuzzy Rule Elicitation
• Step 3: Fuzzy Reasoning
 MIMO fuzzy logic system (FLS)
• three input variables (next slide)
• two output variables
o CPU and memory coefficients, with values from -1 (over-provisioning) to +1 (under-provisioning)
o -1 or +1 => fully allocate the resource; -1 < value < +1 => partially allocate the resource
Step 1: MFs Construction
 Input variables
1. Response time
2. CPU utilization
3. Memory utilization
 linguistic terms represent the value of input variables
1. Slow (RT), Low (utilizations)
2. Medium (RT), Medium (utilizations)
3. Fast (RT), High (utilizations)
 We need to define an MF for each linguistic term
• 9 MFs in total, some triangular, some trapezoidal
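Triangular and trapezoidal MFs of the kind mentioned above can be sketched as follows. The function names and all breakpoints in `fuzzify_utilization` are made up for illustration; the actual MF shapes are those in Fig. 8 and Fig. 9.

```python
def triangular(x, a, b, c):
    """Triangle rising from a to a peak at b, then falling to c (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def trapezoidal(x, a, b, c, d):
    """Trapezoid: rises a..b, flat at 1 over b..c, falls c..d (a <= b <= c <= d)."""
    if b <= x <= c:
        return 1.0
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def fuzzify_utilization(u):
    """Degrees of Low/Medium/High for a utilization u in percent.

    The breakpoints below are hypothetical, chosen only for this example.
    """
    return {
        "low": trapezoidal(u, 0, 0, 30, 50),
        "medium": triangular(u, 30, 50, 70),
        "high": trapezoidal(u, 50, 70, 100, 100),
    }
```

With these example breakpoints, a utilization of 40% is half "low" and half "medium", which is the kind of overlap that lets the controller blend neighbouring rules.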
Fig. 8: MFs of response time
Fig. 9: MFs of CPU utilization
Step 2: Fuzzy Rule Elicitation
3 input variables with 3 linguistic terms each => 3^3 = 27 combinations => 27 fuzzy rules
… elicited from expert knowledge, then updated empirically at run time
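The count of 27 follows from taking one linguistic term per input variable. A short sketch that enumerates the rule antecedents; the constant names and lower-case term spellings are assumptions for illustration, and the rule consequents (the coefficient values) are not given on the slide and are not filled in here.

```python
from itertools import product

# Linguistic terms per input variable, as listed on the MF slide.
RT_TERMS = ("slow", "medium", "fast")
UTIL_TERMS = ("low", "medium", "high")

# One antecedent per (response time, CPU utilization, memory utilization) triple.
antecedents = list(product(RT_TERMS, UTIL_TERMS, UTIL_TERMS))
print(len(antecedents))  # 27
```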
Step 2: Output Variables (Hyper-surfaces)
 Hyper-surfaces of the output variables as functions of the input variables
[Plots: memory coefficient surface; CPU coefficient surface]
Step 3: Fuzzy Elasticity Reasoning
The elasticity reasoning process (given a fuzzy knowledge base)
1. Measured values of the input variables are fuzzified, using the MFs
2. The fuzzy controller reasons & produces the output variables
o using the fuzzified input variables & the fuzzy rules
3. The output variables are fed into the CPU & memory controllers
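The reasoning steps above can be sketched with a very small rule base. The slides do not state which inference or defuzzification method is used, so this example assumes a zero-order Sugeno-style scheme (minimum for AND, firing-strength-weighted average of constant outputs). The function name `infer`, the dictionary layout, and the example rules are hypothetical.

```python
def infer(fuzzified, rules):
    """Zero-order Sugeno-style inference (assumed method, for illustration).

    fuzzified: {"rt": {"slow": 0.8, ...}, "cpu": {...}, "mem": {...}}
    rules: list of ((rt_term, cpu_term, mem_term), (cpu_coef, mem_coef))
    Returns (cpu_coef, mem_coef) as firing-strength-weighted averages.
    """
    total = cpu_sum = mem_sum = 0.0
    for (rt_t, cpu_t, mem_t), (cpu_out, mem_out) in rules:
        # AND of the three antecedents, taken as the minimum degree.
        strength = min(
            fuzzified["rt"].get(rt_t, 0.0),
            fuzzified["cpu"].get(cpu_t, 0.0),
            fuzzified["mem"].get(mem_t, 0.0),
        )
        total += strength
        cpu_sum += strength * cpu_out
        mem_sum += strength * mem_out
    if total == 0.0:
        return 0.0, 0.0  # no rule fired: leave both resources unchanged
    return cpu_sum / total, mem_sum / total


# Two made-up rules: a slow, CPU-bound app needs CPU; a slow, memory-bound app needs memory.
example_rules = [
    (("slow", "high", "low"), (1.0, 0.0)),
    (("slow", "low", "high"), (0.0, 1.0)),
]
example_input = {
    "rt": {"slow": 0.8},
    "cpu": {"high": 0.6, "low": 0.2},
    "mem": {"low": 0.5, "high": 0.2},
}
cpu_coef, mem_coef = infer(example_input, example_rules)
```

In this example the first rule fires with strength 0.5 and the second with 0.2, so the CPU coefficient comes out larger than the memory coefficient.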
Experimental Evaluation
Experimental Setup
 Xen hypervisor
• focusing on the VM which hosts the BL tier (Apache Web Server)
 3 interactive benchmark applications
• Olio (Amazon-like, online bookstore)
• RUBBoS (Slashdot-like bulletin board)
• RUBiS (eBay-like e-commerce)
 Workload (synthetic traces)
• Open- and closed-loop user models [4]
• httpmon [5] load generator
 Control interval = 5 sec
[Diagram: experimental setup. Client side: workload generator. Server side: a physical machine (PM; 56 GB memory, 32 processors) running the Xen hypervisor, with VM1 hosting the benchmark application's BL tier (Apache 2.0) on elastic CPU and elastic RAM, and VM2 hosting the DS tier (MySQL) on fixed CPU and memory. Control side: the Autonomic Resource Controller, fed by the App/VM sensors.]
[4] B. Schroeder et al., NSDI 2006
[5] https://github.com/cloud-control/httpmon
Evaluation Process
 The baseline approach
• Comparing the fuzzy controller (FC) with a non-fuzzy controller (NFC)
o NFC: two separate, uncoordinated CPU & memory controllers
• better controller? => meets the desired RT without over-provisioning any resource
 Evaluation metrics
• Control-theoretic metrics: integral of squared error (ISE) and integral of absolute error (IAE), both summarizing the control error
• CPU usage (mean)
• Memory usage (mean)
• Response time (mean & SD)
 Process (run 10 different experiments for both FC & NFC)
• Time-series analysis
• Aggregated results analysis
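The two control-theoretic metrics listed above, ISE (integral of squared error) and IAE (integral of absolute error), can be computed from a response-time trace as in this sketch. It assumes the control error is desired RT minus measured RT and approximates the integrals by a rectangle rule with one sample per control interval; the function name and the example values are hypothetical.

```python
def ise_iae(desired_rt, measured_rts, dt):
    """Return (ISE, IAE) for errors e_i = desired_rt - measured_rt_i.

    The integrals are approximated by a rectangle rule with step dt seconds.
    """
    errors = [desired_rt - rt for rt in measured_rts]
    ise = sum(e * e for e in errors) * dt
    iae = sum(abs(e) for e in errors) * dt
    return ise, iae


# Made-up trace: target RT of 1 s, three samples, 5-second control interval.
print(ise_iae(1.0, [1.5, 0.5, 1.0], 5.0))  # (2.5, 5.0)
```

ISE penalizes large deviations more heavily than IAE, so comparing both shows whether a controller's error comes from a few large spikes or from many small ones.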
Evaluation Results (1) - Olio (target RT: 1 sec)
[Time-series plots: response time [sec], CPU usage [# cores], memory usage [GB]]
Evaluation Results (2) – Aggregated
control error = desired RT - measured RT
Experimental Results (3) – Aggregated
 FC improves on NFC in the following metrics:
(1) allocated memory (2) allocated CPU (3) RT stability (standard deviation)
 Generally better results under the open-loop user model
 For some applications (e.g., RUBBoS) the improvement is lower
[Bar chart: improvement [%] of FC over NFC in allocated memory, allocated CPU, and RT stability, for RUBiS (0.5 sec), RUBBoS (1 sec), and Olio (0.5 sec) under the open and the closed system model; y-axis from 0 to 80.]
Bar labels in extraction order: 60.76, 20.08, 18.98, 47.85, 3.97, 15.44, 2.86, 0, 56.51, 7.30, 2.45, 64.78, 26.23, 29.68, 79.08, 9.38, 22.12
Summary of Results (4)
 Without coordination of the elasticity controllers (NFC), system behavior is unpredictable
• due to conflicting decisions
 Under FC & NFC the application performance may be met
• at the expense of over-provisioning one of the resources (mostly memory)
• the application may crash from severe resource shortage caused by conflicting decisions
 NFC uses more CPU and memory on average than FC
• under similar configurations
• with comparable mean RT
• with more control error
 With careful coordination of auto-scaling controllers
• application performance is met with an optimal amount of resources
• resource utilization is high, and both over- and under-provisioning are prevented
Conclusion & Future work
 A generic coordination technique for distributed controllers
• reasoning under uncertainty (using fuzzy logic)
 An autonomic resource controller consisting of three sub-controllers
• fuzzy controller, CPU controller, and memory controller
• supporting both CPU & memory elasticity
 Observing the application RT as well as CPU & memory utilization
• meeting the application response time constraints
• maintaining high utilization of resources
 Ongoing work
• online learning for self-adaptation of the fuzzy rules and MFs
• using tail response time: 95th and 99th percentiles
soodeh.farokhi@tuwien.ac.at
Vienna University of Technology
at.linkedin.com/in/soodehfa
What is the State-of-the-art?
 CPU Vertical Scaling
• Capacity-based: [E. Kalyvianaki et al., 2009], [S. Spinner et al., 2014]
• Performance-based: [E. Lakew et al., 2014]
 Memory Vertical Scaling
• Capacity-based: [A. Baruchi et al., 2011], [G. Molto et al., 2013], [W. Wangusing et al., 2014]
• Performance-based: [S. Farokhi et al., 2015], [S. Spinner et al., 2015]
 CPU & Memory Vertical Scaling
• Capacity-based: [Y. Diao et al., 2002], [L. Lu et al., 2014]
• Performance-based: ?

Coordinating CPU and Memory Elasticity Controllers to Meet Service Response Time Constraints

  • 1. Coordinating CPU and Memory Elasticity Controllers to Meet Service Response Time Constraints Soodeh Farokhi1, Ewnetu Bayuh Lakew2, Cristian Klein3, Ivona Brandic1, and Erik Elmroth2 1 Vienna University of Technology, Austria 2 Umeå University, Sweden 3 SimScale GmbH, Germany IEEE International Conference on Cloud and Autonomic Computing (ICCAC) Cambridge, MA, USA September 21-24, 2015
  • 2. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Introduction: Elasticity on Cloud Computing Horizontal Elasticity  Course grained  Slow (minutes)  Needs application support • State synchronization • Load-balancing Vertical Elasticity  Fine grained  Fast (sub-second)  Little application support • Multi-threaded  Needs Hypervisor support 2 horizontal elasticity verticalelasticity [1] [E. Lakew, et. al. UCC 2014] Fig. 1: vertical vs. horizontal elasticity [1] Introduction Approach Evaluation
  • 3. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Vertical Elasticity  key enabler for efficient resource utilization  little support from the commercial cloud providers • started to be supported by Hypervisors (e.g., Xen and KVM)  few research efforts • single resource => either CPU or memory (mostly CPU) o need for multiple resources at different application execution stages o need proper orchestration for using the existing approaches => inconsistent decisions! • taking resource utilization as a decision making criterion o capacity-based approaches o response time as a key indicator => performance-based approaches 3
  • 4. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 0 20 40 60 0 250 500 750 1000 1250 memutilization [%] Time [seconds] every 250 sec is an interval 100, 1 1500, 5 100, 0.1 100, 1 1000, 1 Motivation Scenario 4 Fig.3: Resource utilization of RUBiS (eBay-like e-commerce app.) under different workload patterns. 0 10 20 30 40 50 60 CPUutilization [%] 100 (cc) , 1 (tt) 1500, 5 100, 0.1 100, 1 1000, 1 1 2 3 4 Time [seconds] every 250 sec in an interval Introduction Approach Evaluation  RUBiS as a benchmark application • CPU-intensive behavior • memory-intensive behavior  Can an application be both?! • based on the nature of the workload  How to support CPU & Memory vertical elasticity?
  • 5. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Outline  Goals & Challenges  Autonomic Resource Controller • CPU controller • Memory controller • Fuzzy controller (coordinator)  Experimental Evaluation  Conclusion 5Introduction Approach Evaluation
  • 6. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Research Goals & Challenges  Goal: designing an autonomic resource elasticity controller • Meet the application performance requirements • Supporting vertical elasticity of both CPU & memory  Challenges: • Performance modeling for vertical scaling of both CPU & memory • Elasticity reasoning for multiple resources under uncertainty  F (average response time) -> CPU & memory capacity 6Introduction Approach Evaluation
  • 7. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Solution Overview: Autonomic Resource Controller  following the MAPE loop • Monitoring • Analysis & Planning: o Step 1: fuzzy controller infers the contributions degree of CPU & memory to app performance change o Step 2: CPU & memory controllers determine the required CPU & memory capacity • Execution 7Introduction Approach Evaluation CPU controller memory controller Application VM # CPU # MEM CPU coefficient fuzzy controller mem coefficient Hypervisor Autonomic Resource Controller App/VM Sensors Fig.4: Solution overview knowledge base
  • 8. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 CPU & Memory Controllers  Performance-based, adaptive & reactive controllers …α and ß are system parameters  CPU Controller [2] • following an inverse model • adopting RLS filtering to adaptively measure ß o based on the past measurements  Memory Controller [3] • following a feedback control loop • adopting LR to calculate α 8 Fig. 6. The feedback loop used for memory controller [3] memory controller Application VM memory size (ctli  memi) workload measured RT (rti) + - desired RT (rt)~ ei = rt - rti ~ Fig. 5. CPU controller [2] [2] [E. Lakew, et. al. UCC 2014] [3] [S. Farokhi, et. al. ICAC 2015] Introduction Approach Evaluation
  • 9. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Fuzzy Controller as a Coordination Technique 9Introduction Approach Evaluation CPU controller memory controller Application VM # CPU # MEM CPU coefficient fuzzy controller mem coefficient Hypervisor Autonomic Resource Controller App/VM Sensors knowledge base
  • 10. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Fuzzy Controller Design A fuzzy logic control approach is defined by: • membership functions (MF): degree of truth with a value between 0 to 1 • fuzzy rules: a collection of “IF THEN ELSE” rules  Design process • Step 1: MF Construction • Step 2: Fuzzy Rule Elicitation • Step 3: Fuzzy Reasoning  MIMO fuzzy logic system (FLS) • three Input variables (next slide) • two output variables o CPU and memory coefficients, values between -1 (over-provisioning) to +1 (under-provisioning) o -1 & +1 => fully allocate the resource ; -1 < value < +1 => partially allocate the resource 10Introduction Approach Evaluation
  • 11. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Step 1: MFs Construction  Input variables 1. Response time 2. CPU utilization 3. Memory utilization  linguistic terms represent the value of input variables 1. Slow (RT), Low (utilizations) 2. Medium (RT), Medium (utilizations) 3. Fast (RT), High (utilizations)  We need to define a MF for each linguistic term • 9 MFs in total, some triangular, some trapezoidal 11 Fig.8. MFs of Response time Fig.9. MFs of CPU utilization Introduction Approach Evaluation
  • 12. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Step 2: Fuzzy Rule Elicitation 3 input variables => 33 combinations => 27 fuzzy rules … using expert knowledge to extract them & then empirically update them at run time 12Introduction Approach Evaluation
  • 13. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Step 2: Output Variables (Hyper-surfaces)  Output variables’ hyper-surfaces in accordance to input variables 13 Memory coefficient surfaceCPU coefficient surface Introduction Approach Evaluation
  • 14. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Step 3: Fuzzy Elasticity Reasoning The process of elasticity reasoning (having “fuzzy knowledge-based”) 1. Measured values of input variables are fuzzified, using MFs 2. Fuzzy controller reasons & produces the output variables o using the fuzzified input variables & fuzzy rules 3. Feeding the output variables into CPU & memory controller 14Introduction Approach Evaluation
  • 15. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Experimental Evaluation 15
  • 16. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Experimental Setup  Xen hypervisor • focusing on the VM which hosts the BL tier (Apache Web Server)  3 interactive benchmark applications • Olio (Amazon-like, online bookstore) • RUBBoS (Slashdot-like bulletin board) • RUBiS (eBay-like e-commerce)  Workload (synthetic traces) • Open- & closed loop user model [4] • Httpmon[5] load generator  Control interval = 5sec 16Introduction Approach Evaluation Xen hypervisor server-side VM1 workload generator PM (56 GB memory, 32 processors) elastic RAM Benchmark Application BL tier (Apache 2.0) Autonomic Resource Controller control-side VM2 memory (fixed) Benchmark Application DS tier (MySQL) CPU (fixed) elastic CPU App/VMsensors client-side [4] [Schroeder B., et al, NSDI, 2006] [5] https://github.com/cloud-control/httpmon
  • 17. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Evaluation Process  The baseline approach • Comparing fuzzy Controller (FC) with a Non-fuzzy Controller (NFC) o NFC: an approach with two separated CPU & memory controllers • better controller? => meets the desired RT without over-provisioning any of resources  Evaluation metrics • Control theoretical metrics: ISE and IAE (present control error) • CPU usage (mean) • Memory usage (mean) • Response time (mean & SD)  Process (run 10 different experiments for both FC & NFC) • Time-series analysis • Aggregated results analysis 17Introduction Approach Evaluation
  • 18. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Evaluation Results (1) - Olio (target RT: 1 sec) 18Introduction Approach Evaluation response time [sec] CPU usage [#core] memory usage [GB]
  • 19. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Evaluation Results (2) – Aggregated 19Introduction Approach Evaluation control error = desired RT – measured RT
  • 20. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Experimental Results (3) – Aggregated  FC improvement in the following metrics compared to NFC (1) allocated memory (2) allocated CPU (3) RT stability (standard deviation)  Achieving almost better results under open loop user model in general  For some applications (e.g., RUBBoS) the improvement is lower 20Introduction Approach Evaluation 60.76 20.08 18.98 47.85 3.97 15.44 2.86 0 56.51 7.30 2.45 64.78 26.23 29.68 79.08 9.38 22.12 0 20 40 60 80 RUBiS (0.5sec) RUBBoS (1sec) Olio (0.5sec) RUBiS (0.5sec) RUBBoS (1sec) Olio (0.5sec) open system model closed system model Improvement[%] allocated memory allocated CPU stability (RT)
  • 21. Soodeh Farokhi, IEEE International Conference on Cloud and Autonomic Computing (ICCAC’15), 21-24 Sep, 2015 Summary of Results (4)  Unpredictable system behavior without coordinating elasticity controllers (using NFC) • due to conflicting decision  under FC & NFC the application performance may be met • at the expense of over-provision one of the resources (mostly memory) • application crashing due to severe resource shortage as a result of conflicting decisions  NFC uses more CPU and memory on average than FC • under similar configurations • with comparable mean RT • more control error  With careful coordination of auto-scaling controllers • application performance is met with optimal amount of resources • achieving high resource utilization & preventing both resource over- and under-provisioning 21Introduction Approach Evaluation
  • 22. Conclusion & Future work  A generic coordination technique for distributed controllers • reasoning under uncertainty (using fuzzy logic)  An autonomic resource controller consisting of three sub-controllers: • fuzzy controller, CPU controller, and memory controller • supporting both CPU & memory elasticity  Observing the application RT as well as CPU & memory utilization • meeting the application response time constraints • maintaining high utilization of resources  Ongoing work • online learning for self-adaptation of the fuzzy rules and MFs • using tail response time: 95th and 99th percentile
  • 23. Coordinating CPU and Memory Elasticity Controllers to Meet Service Response Time Constraints. Soodeh Farokhi, Ewnetu Bayuh Lakew, Cristian Klein, Ivona Brandic, and Erik Elmroth. Contact: soodeh.farokhi@tuwien.ac.at, Vienna University of Technology, at.linkedin.com/in/soodehfa
  • 24. What is the State-of-the-art? CPU Vertical Scaling: Capacity-based [E. Kalyvianaki, et al. 2009] [S. Spinner, et al. 2014]; Performance-based [E. Lakew, et al. 2014]. Memory Vertical Scaling: Capacity-based [A. Baruchi, et al. 2011] [G. Molto, et al. 2013] [W. Wangusing, et al. 2014]; Performance-based [S. Farokhi, et al. 2015] [S. Spinner, et al. 2015]. CPU & Memory Vertical Scaling: Capacity-based [Y. Diao, et al. 2002] [L. Lu, et al. 2014]; Performance-based: ?
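
Slide 19 defines the control error as desired RT minus measured RT, and the notes name the Integral of Squared Error (ISE) and the Integral of the Absolute Error (IAE) as the aggregate metrics. The sketch below shows one way such metrics could be computed from sampled response times. The rectangle-rule summation, the sampling interval dt, the function names, and the sample values are assumptions made for illustration; the slides do not state how the authors computed them.

```python
def control_errors(desired_rt, measured_rts):
    """Control error per sample: desired RT minus measured RT (seconds)."""
    return [desired_rt - rt for rt in measured_rts]


def ise(errors, dt=1.0):
    """Discrete (rectangle-rule) approximation of the Integral of Squared Error."""
    return sum(e * e for e in errors) * dt


def iae(errors, dt=1.0):
    """Discrete (rectangle-rule) approximation of the Integral of the Absolute Error."""
    return sum(abs(e) for e in errors) * dt


# Hypothetical measured response times (seconds) for a target RT of 1 sec.
measured = [0.5, 1.5, 1.0, 2.0]
errors = control_errors(1.0, measured)  # [0.5, -0.5, 0.0, -1.0]

print("ISE:", ise(errors))  # ISE: 1.5
print("IAE:", iae(errors))  # IAE: 2.0
```

ISE penalizes large deviations more heavily than IAE because each error is squared, so a controller with occasional large RT violations scores worse on ISE than one with many small ones.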

Editor's Notes

  • #3: Elasticity, a main selling point of cloud computing, is defined as the ability of the cloud infrastructure to rapidly decide the right amount of resources (VMs or CPU and memory) needed by each application. Two types of elasticity are defined: horizontal and vertical. Horizontal = committing to a fixed size of resources on a long-term basis; vertical = rapid elasticity, letting customers quickly adjust their level of resource consumption as their applications’ requirements change.
  • #4: since they may lead to either under- or over-provisioning of resources and consequently result in undesirable behaviors such as performance disparity. The underlying assumption made in these works is that the application is either CPU-intensive or memory-intensive; moreover, they do not consider application performance (e.g., response time) at all.
  • #5: An application may require different resource configurations under variable workload dynamics: vary the number of concurrent users; vary the think time between requests. E.g., a chat application needs the “long-polling” technique to immediately notify the user when a new message arrives (# connections = memory usage), while its “search in chat history” functionality is CPU-intensive. To motivate vertical elasticity of multiple resources, we conducted an experiment on a benchmark application (RUBiS [15]) by injecting variable workloads, which induce the intended behavior, at different time intervals. We deployed the BL and DS tiers of RUBiS on different VMs and over-provisioned both VMs (the VM hosting the BL tier: 8 CPU cores and 4 GB memory; the VM hosting the DS tier: 6 CPU cores and 10 GB memory). Then, by configuring a workload generator tool, httpmon, we defined variable workload dynamics which stress the BL tier for either CPU, memory, or both resources during its life span.
  • #7: Due to the non-deterministic behavior of software systems, it is almost impossible to know with a high degree of confidence the extent to which different resources contribute to the performance degradation of a software application, and how much of each resource should be provisioned to alleviate the performance problem.
  • #8: The components in blue are the contributions. The degree of contribution of both CPU and memory to the application’s performance change: the fuzzy controller infers the CPU & memory coefficients. Application with dynamic memory & CPU requirements.
  • #10: Fuzzy controller as the core of the “Autonomic Resource Controller”
  • #11: -1 shows the over-provisioning condition > decreasing actions; +1 shows the under-provisioning condition > increasing actions
  • #12: Linguistic terms represent the values of input variables > we define an MF for each linguistic term > in total 9 MFs. We initially asked the experts to locate an interval in [0,100] for each linguistic term used as an input variable, and then we tuned the extracted intervals empirically. Trapezoidal vs. triangular: we use both trapezoidal and triangular MFs, as shown. We used trapezoidal MFs to represent "Low" ("Fast") and "High" ("Slow"), and triangular MFs to represent "Medium". In order to do so, we normalize the measured response time with respect to a reference value, expressed as a coefficient of the target RT. Measured RT values close to this reference, which lie well above the target value, are set to a value close to 100.
  • #14: The 3 input variables were modeled in two separate diagrams, each with 2 inputs, for the sake of visibility. Each diagram is a hyper-surface corresponding to all possible normalized values. These diagrams reveal a more conservative behavior of the fuzzy controller for memory allocation in comparison with CPU, because applications may crash as a result of memory shortage. For memory, we defined rules more conservatively, in the sense that allocating enough memory to the application is more important than achieving high utilization; that is why the memory surface tends to lie above the CPU surface. CPU, by contrast, can be allocated and revoked very quickly, so we aim for high utilization and add CPU sparingly, whereas for memory we are more generous.
  • #17: Workload generators may be classified as based on a closed system model, where new job arrivals are only triggered by job completions (followed by think time) [we change the number of concurrent users and the think time], or on an open system model, where new jobs arrive independently of job completions [we change the arrival rate and inter-arrival time]. For open clients, we changed the arrival rate and inter-arrival time during the course of the experiments as required to stress the system. For the closed model, the think time of each client as well as the number of concurrent users were varied. The change in arrival rate or number of users was made instantly. This made it possible to meaningfully compare the system’s behavior under the two client models.
  • #18: These metrics are the Integral of Squared Error (ISE) and the Integral of the Absolute Error (IAE).
  • #19: Solid lines = FC behavior over time; dashed lines = NFC behavior over time. In the 2nd interval (CPU-intensive workload), both started to over-provision! We ran the experiments with all three interactive benchmark applications, under different workload patterns and various RT targets, under both the open and closed system models; here we show the results for one of the applications (under the closed system model) for a target RT of 1 sec.
  • #20: Integral Square Error (ISE), Integral Absolute Error (IAE). Both errors were relatively lower for FC compared to NFC under both user loop models.
  • #22: In general, in all scenarios under NFC, more CPU and memory were allocated on average during the experiment than with FC under similar configurations, despite the fact that the aggregate mean RTs are comparable. … discussion …
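
The note for slide 12 describes trapezoidal membership functions (MFs) for "Low" and "High" and a triangular MF for "Medium" over a normalized [0,100] input range. A minimal sketch of that fuzzification step follows. The breakpoints (0, 20, 40, and so on) and the function names are hypothetical, chosen only for illustration; the actual intervals were elicited from experts and tuned empirically, and they are not given in the slides.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal MF: 0 outside [a, d], 1 on [b, c], linear in between.

    Allows a == b or c == d, which gives the flat "shoulder" shapes
    needed at the edges of the input range.
    """
    if x < a or x > d:
        return 0.0
    if b <= x <= c:
        return 1.0
    if x < b:
        return (x - a) / (b - a)
    return (d - x) / (d - c)


def triangle(x, a, b, c):
    """Triangular MF with peak at b, as a trapezoid with a zero-width top."""
    return trapezoid(x, a, b, b, c)


def fuzzify(x):
    """Degree of membership of a normalized input x in [0, 100] for each term.

    Breakpoints are hypothetical placeholders, not the authors' tuned values.
    """
    return {
        "Low": trapezoid(x, 0, 0, 20, 40),
        "Medium": triangle(x, 30, 50, 70),
        "High": trapezoid(x, 60, 80, 100, 100),
    }


print(fuzzify(35))  # {'Low': 0.25, 'Medium': 0.25, 'High': 0.0}
```

With overlapping MFs like these, a single measurement can belong partly to two linguistic terms at once, which is what lets the fuzzy rules blend their outputs instead of switching abruptly between them.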