S-Cube Learning Package

  Quality Assurance and Quality Prediction:
Online Testing for Proactive Adaptation



    University of Duisburg-Essen (UniDue)
  Universitat Politècnica de Catalunya (UPC)
South East European Research Centre (SEERC)

         Osama Sammodi (UniDue)
               www.s-cube-network.eu           © UniDue
Learning Package Categorization

                         S-Cube



              Quality Definition, Negotiation
                      and Assurance



         Quality Assurance and Quality Prediction




          Online Testing for Proactive Adaptation

                                                    © UniDue
Learning Package Overview




  Motivation
   – Failure Prediction and Proactive Adaptation

  Failure Prediction through Online Testing (OT)

  Discussions

  Summary



                                                   © UniDue
Service-based Applications
Current Situation

                  shared ownership and adaptive systems



[Figure: changing requirements and dynamic context aspects (context, development process, system/application) drive the need for self-adaptation]



                                                                 © UniDue
Service-based Applications
Example (eGovernment Application)
Assume a citizen wants to renew a vehicle’s registration online (the process crosses the organization boundary):
1. The citizen provides a renewal identification number or the license plate number for identification
2. The citizen will have to pay the renewal fee (for example, using an ePay service)
3. The application renews the registration of the vehicle and updates its record to reflect the registration renewal
4. Finally, a confirmation of the renewal process is e-mailed to the citizen (for example, using Yahoo). In parallel to that, a validation sticker is mailed to the citizen



                                                           © UniDue
Service-based Applications
The need for Adaptation

 The previous slides showed that Service-Based Applications (SBAs)
  run in highly dynamic settings wrt.
   – 3rd party services, service providers, …
   – requirements, user types, end-user devices, network connectivity, …


 Difference from traditional software systems
   – Unprecedented level of change
   – No guarantee that 3rd party service fulfils its contract
   – Hard to assess behaviour of infrastructure (e.g., Internet, Cloud, …) at design time

 SBAs cannot be specified, realized and analyzed completely in
 advance (i.e., during design-time)
 Decisions and checks during the operation of the SBA are needed
 (i.e., at run-time)
                                                                                     © UniDue
The need for Adaptation
 The S-Cube SBA Lifecycle

[Figure: The S-Cube SBA lifecycle. A run-time „MAPE“ loop (Identify Adaptation Need / Analyse, Identify Adaptation Strategy / Plan, Enact Adaptation / Execute, Operation & Management incl. Monitor) is coupled with the design-time cycle of Requirements Engineering, Design, Realization, Deployment & Provisioning, and Evolution]

Background: S-Cube Service Life-Cycle („MAPE“ loop)
 A life cycle model is a process model that covers the activities related to the entire
  life cycle of a service, a service-based application, or a software component or
  system [S-Cube KM]

                                                                                           © UniDue
Types of Adaptation
General differences
 Reactive Adaptation
   – Repair/compensate for external failure visible to the end-user
   – Drawbacks: execution of faulty services, reduction of performance, inconsistent end-states, ...
 Preventive Adaptation
   – An internal failure/deviation occurs
      Will it lead to an external failure?
   – If “yes”: Repair/compensate internal failure/deviation to prevent external failure
 Proactive Adaptation
    Is internal failure/deviation imminent (but did not occur)?
   – If “yes”: Modify system before internal failure actually occurs

Key enabler: Online Failure Prediction
[Figure: reactive adaptation reacts to an external failure; preventive adaptation reacts to an internal failure before it becomes external; proactive adaptation acts before the internal failure occurs]
                                                                            © UniDue
Need for Accuracy
 Requirements on Online Prediction Techniques

 Prediction must be efficient
  – Time available for prediction
    and repairs/changes is limited
  – If prediction is too slow, not enough time to adapt

 Prediction must be accurate
  – Unnecessary adaptations can lead to
     - higher costs (e.g., use of expensive alternatives)
     - delays (possibly leaving less time to address real faults)
     - follow-up failures (e.g., if alternative service has severe bugs)
  – Missed proactive adaptation opportunities diminish the
    benefit of proactive adaptation
    (e.g., because reactive compensation actions are needed)
                                                                           © UniDue
Learning Package Overview




  Motivation
   – Failure Prediction and Proactive Adaptation

  Failure Prediction through Online Testing (OT)

  Discussions

  Summary



                                                   © UniDue
Quality Assurance Techniques
Background: Two Important Dynamic Checks
   Testing (prominent for traditional software)
         Systematically execute the software
           1. Software is fed with concrete pre-determined inputs (test cases)
           2. Produced outputs* are observed
           3. Deviation = failure

   Monitoring (prominent for SBAs)
     – Observe the software during its current execution (i.e., actual use / operation)
           1. End-user interacts with the system
           2. Produced outputs* are observed
           3. Deviation = failure

[Figure: a tester feeds inputs to the software and observes its outputs; an end-user interacts with the running system while its inputs and outputs are observed]
* incl. internal data collected for QA purposes
[for more details, see deliverable JRA-1.3.1; S-Cube KM]
                                                                                         © UniDue
Online Failure Prediction through OT
Motivation
• Problem: Monitoring only (passively) observes services or SBAs during
  their actual use in the field
       cannot guarantee comprehensive / timely coverage of the ’test object’
       can reduce the accuracy of failure prediction

• Solution: Online Testing = Extend testing to the operation phase
      “Actively (& systematically) execute services in parallel to their normal use in the SBA”

[Figure: the S-Cube lifecycle with a Testing activity added to Operation & Management at run-time]
                                                                                                      © UniDue
Online Failure Prediction through OT
Two S-Cube Approaches

     PROSA: Predict violation of QoS
     – For stateless services (i.e., services that don't
       persist any state between requests)
     – E.g., predict that “response time” of “stock
       quote” service is slower than 1000 ms
     – See [Sammodi et al. 2011, Metzger 2011, Metzger et al. 2010,
       Hielscher et al. 2008]


     JITO: Predict violation of protocol
       – For conversational services (i.e., services that only accept specific
         sequences of operation invocations)
       – E.g., predict that “checkout” of “shopping basket” service fails after
         all products have been selected
       – See [Dranidis et al. 2010]

     Note: Both approaches support the “Service Integrator”, who integrates
      in-house and 3rd party services to compose an SBA

     In this learning package we focus on PROSA
                                                                          © UniDue
PROSA
Online Testing of QoS
Idea of the PROSA approach
Inverse usage-based testing:
   – Assume: A service has seldom been “used” in a given time period
   – This implies that not enough “monitoring data” (i.e., data
     collected from monitoring its usage) has been collected
   – If we want to predict the service’s QoS, only this limited monitoring
     data is available, so the prediction accuracy might be poor
   – To improve the prediction accuracy, dedicated online tests are
     performed to collect additional evidence of the service’s quality
     (this evidence is called “test data”)
      - But how much to test?  see next slides!
   – Both “monitoring data” and “test data” are used for prediction
                                                                  © UniDue
Usage-based Testing
Background
 Usage-based (aka. operational profile) testing is a technique
  aimed at testing software from the users’ perspective [Musa 1993,
  Trammell 1995]
 It drives the allocation of test cases in accordance with use,
  and ensures that the most-used operations will be the most
  tested (a small allocation sketch follows this slide)
 The approach was proposed for assuring reliability
 Typically, either flat operational profiles or Markov chain
  based models are used to represent usage models
   – Markov chains represent the system states and transitions between those
     states, together with probabilities for those state transitions (thus they
     capture structure) [Trammell 1995]
   – Operational profiles are defined as a set of operations and their
     probabilities [Musa 1993]
                                                                      © UniDue
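Note (added for illustration): a minimal sketch of such usage-proportional allocation, assuming a flat operational profile; the operation names, probabilities, and test budget are made up.

# Minimal sketch: distribute a fixed test budget over a flat operational profile
# (operations and their usage probabilities), so the most-used operations get
# the most test cases. Operation names and numbers are illustrative only.

def allocate_tests(profile, budget):
    """profile: dict operation -> usage probability (should sum to ~1.0)
       budget:  total number of test cases to distribute"""
    return {op: int(round(prob * budget)) for op, prob in profile.items()}

if __name__ == "__main__":
    profile = {"identify": 0.50, "pay_fee": 0.30, "renew": 0.15, "confirm": 0.05}
    print(allocate_tests(profile, budget=100))
    # -> {'identify': 50, 'pay_fee': 30, 'renew': 15, 'confirm': 5}

PROSA (next slides) inverts this idea: it schedules online tests for the seldom-used services, for which monitoring alone yields too little data.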
PROSA
Online Testing of QoS: General Framework

[Figure: PROSA framework, consisting of a testing loop and a monitoring loop.
 Testing loop: 1. Test Initiation (test cases taken from a Test Case Repository), 2. Test Case Selection (driven by the Usage Model), 3. Test Execution (test inputs sent to services s1 … sn, test outputs collected).
 Monitoring loop: 4. Aggregation of Monitoring Data (from monitoring events of the running SBA instances), 5. Usage Model Building/Updating (from usage frequencies), 6. Prediction (based on monitoring data and test outputs), 7. Adaptation (an adaptation trigger leads to adaptation enactment on the SBA instances)]
                                                                                                © UniDue
PROSA
Online Testing of QoS: Framework Activities (1)
The framework consists of two main loops: one for testing and
another for monitoring:
1) Test initiation: Includes all preparatory activities for online test
selection and execution, such as definition of potential test cases
2) Test case selection: Selects the test cases to be executed. This is
the central activity of our framework. The next slides provide further
details about our usage-based test case selection approach
3) Test execution: Executes the test cases that have been selected
by the previous activity
4) Aggregation of monitoring data: Collects monitoring data during
the operation of the SBA; this data is used both for updating the “usage
model” as the SBA operates (usage frequencies) and for making
predictions

                                                                    © UniDue
PROSA
Online Testing of QoS: Framework Activities (2)


5) Usage-model building/updating: The initial usage model can be built
from the results of requirements engineering. During operation of the
SBA, usage frequencies computed from monitoring events are used
to automatically update the “usage model”
6) Prediction: Augments testing data with monitoring data and
makes the actual QoS prediction for the services in the SBA
7) Adaptation: Based on the prediction results, adaptation requests
are issued if the expected quality is predicted to be violated. We focus on
adaptation by dynamic service binding (services are selected and
dynamically substituted at runtime); a small sketch of such
prediction-triggered rebinding follows this slide




                                                                © UniDue
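Note (added for illustration): a minimal sketch of activities 6 and 7 under simplifying assumptions; the averaging predictor, service names, and threshold are made up, and adaptation is reduced to rebinding to an alternative service when the predicted response time violates the threshold.

# Hedged sketch of activities 6 (Prediction) and 7 (Adaptation by dynamic binding).
# Predictor, thresholds, and service names are illustrative assumptions only.

def predict_response_time(monitoring_data, test_data):
    """Activity 6: combine monitoring data and test data; here simply the
    arithmetic average over all available data points (in milliseconds)."""
    samples = list(monitoring_data) + list(test_data)
    return sum(samples) / len(samples) if samples else None

def adapt_if_needed(service, alternative, predicted_ms, threshold_ms):
    """Activity 7: issue an adaptation (rebinding) if the predicted QoS
    violates the agreed threshold."""
    if predicted_ms is not None and predicted_ms > threshold_ms:
        print(f"Predicted {predicted_ms:.0f} ms > {threshold_ms} ms: rebind {service} -> {alternative}")
        return alternative
    return service

if __name__ == "__main__":
    monitored = [850, 920, 1100]   # response times observed during normal use
    tested = [1050, 1200]          # additional data points from online tests
    prediction = predict_response_time(monitored, tested)
    bound = adapt_if_needed("StockQuoteA", "StockQuoteB", prediction, threshold_ms=1000)
    print("Bound service:", bound)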
PROSA
 Online Testing of QoS: Technical Solution
Steps of the approach:
1. Build Usage Model
   – We divide the execution of the SBA into periods Pi
      - Between periods, the usage model is updated
      - Let ψk,i denote the usage probability for a service Sk in period Pi

2. Exploit Usage Model for Testing
   – For simplification, let:
      – m = number of time points within a period
      – qk = maximum number of tests allowed for service Sk per period
   – We compute the number of data points expected from monitoring in Pi:
         mmonitoring,k,i = ψk,i * m
   – Based on the above, we compute the number of additional data points to be
     collected by testing in Pi (a small sketch follows this slide):
         mtesting,k,i = max(0; qk – mmonitoring,k,i)

Note: For 3rd party services, the number of allowable tests can be limited due to
economical (e.g., pay per service invocation) and technical considerations
(testing can impact the availability of a service)

[Figure: timeline of periods P1, P2, …, Pi with time points t1,1 … t1,m, …, ti,j; at each time point a service may be monitored or tested (“Test?”), and the usage model for a period is derived from the preceding period]
                                                                                           © UniDue
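Note (added for illustration): the test-budget computation from this slide as a small sketch; the usage probabilities, period length m, and quota qk used in the example are assumptions.

# Sketch of the PROSA test-budget computation from this slide:
#   m_monitoring(k,i) = psi(k,i) * m
#   m_testing(k,i)    = max(0, q_k - m_monitoring(k,i))
# Usage probabilities and quotas below are illustrative assumptions.

def testing_budget(psi_k_i, m, q_k):
    """psi_k_i: usage probability of service S_k in period P_i
       m:       number of time points within the period
       q_k:     maximum number of data points wanted/allowed for S_k per period"""
    m_monitoring = psi_k_i * m                 # data points expected from monitoring
    m_testing = max(0.0, q_k - m_monitoring)   # additional data points from online tests
    return m_monitoring, m_testing

if __name__ == "__main__":
    # A rarely used service (psi = 0.05) with 100 time points and a quota of 10:
    print(testing_budget(psi_k_i=0.05, m=100, q_k=10))   # (5.0, 5.0)  -> schedule ~5 tests
    # A frequently used service already yields enough monitoring data:
    print(testing_budget(psi_k_i=0.40, m=100, q_k=10))   # (40.0, 0.0) -> no extra tests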
Measuring Accuracy
Introducing TP, FP, FN and TN

To measure the accuracy of failure prediction, we take into account the
following four cases, arranged in a contingency table:

                        | Actual Failure | Actual Non-Failure
Predicted Failure       | TP             | FP
Predicted Non-Failure   | FN             | TN

• True Positives (TP): when prediction predicts a failure and the service turns out
  to fail when invoked during the actual execution of the SBA (i.e., actual failure)

• False Positives (FP): when prediction predicts a failure although the service
  turns out to work as expected when invoked during the actual execution of the SBA
  (i.e., no actual failure)


• False Negatives (FN): when prediction doesn’t predict a failure although the
  service turns out to fail when invoked during the actual execution of the SBA (i.e.,
  actual failure)

• True Negatives (TN): when prediction doesn’t predict a failure and the service
  turns out to work as expected when invoked during the actual execution of the SBA
  (i.e., no actual failure)

                                                                                 © UniDue
Measuring Accuracy
Computing TP, FP, FN and TN

The four cases are computed by comparing the predicted behaviour with the
actually monitored behaviour of the running SBA (a small classification
sketch follows this slide):

                        | Actual Failure | Actual Non-Failure
Predicted Failure       | TP             | FP
Predicted Non-Failure   | FN             | TN

[Figure: a predictor for the response time of service S2 in the running SBA (services S1, S2, S3, …). Comparing predicted and monitored response times over time shows a missed adaptation (an actual violation that was not predicted, i.e., FN) and an unnecessary adaptation (a predicted violation that did not occur, i.e., FP)]
                                                                                        © UniDue
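Note (added for illustration): a minimal sketch of this classification, assuming that a "failure" means the response time exceeds an agreed threshold; the threshold and the (predicted, monitored) pairs are made up.

# Illustrative sketch: classify prediction/observation pairs into TP, FP, FN, TN.
# A "failure" here means the response time exceeds the agreed threshold.
from collections import Counter

def classify(predicted_ms, monitored_ms, threshold_ms):
    predicted_failure = predicted_ms > threshold_ms
    actual_failure = monitored_ms > threshold_ms
    if predicted_failure and actual_failure:
        return "TP"
    if predicted_failure and not actual_failure:
        return "FP"   # unnecessary adaptation
    if not predicted_failure and actual_failure:
        return "FN"   # missed adaptation
    return "TN"

if __name__ == "__main__":
    threshold = 1000  # ms, assumed SLA threshold
    pairs = [(1200, 1300), (1100, 900), (800, 1500), (700, 650)]  # (predicted, monitored)
    counts = Counter(classify(p, a, threshold) for p, a in pairs)
    print(counts)   # Counter({'TP': 1, 'FP': 1, 'FN': 1, 'TN': 1})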
Measuring Accuracy
  Contingency Table Metrics (see [Salfner et al. 2010])
Based on the previous cases, we compute the following metrics (formulas
sketched after this slide):

 Precision p: How many of the predicted failures were actual failures?
 Negative predictive value v: How many of the predicted non-failures were
  actual non-failures?
 Recall (true positive rate) r: How many of the actual failures have been
  correctly predicted as failures? (1 – r  missed adaptations)
 False positive rate f: How many of the actual non-failures have been
  incorrectly predicted as failures? (f  unnecessary adaptations;
  smaller f is preferable)
 Accuracy a: How many predictions were correct?

Note: Actual failures are rare  a prediction that always predicts
“non-failure” can achieve a high accuracy a.
                                                                                     © UniDue
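Note (added for illustration): the metrics above are the standard contingency-table measures of [Salfner et al. 2010]; a small sketch of their formulas, with made-up counts.

# Standard contingency-table metrics as used on this slide (cf. [Salfner et al. 2010]):
#   precision p = TP / (TP + FP)
#   recall    r = TP / (TP + FN)              (1 - r ~ missed adaptations)
#   neg. predictive value v = TN / (TN + FN)
#   false positive rate   f = FP / (FP + TN)  (f ~ unnecessary adaptations)
#   accuracy  a = (TP + TN) / (TP + FP + FN + TN)

def contingency_metrics(tp, fp, fn, tn):
    def ratio(num, den):
        return num / den if den else 0.0
    return {
        "precision": ratio(tp, tp + fp),
        "recall": ratio(tp, tp + fn),
        "neg_predictive_value": ratio(tn, tn + fn),
        "false_positive_rate": ratio(fp, fp + tn),
        "accuracy": ratio(tp + tn, tp + fp + fn + tn),
    }

if __name__ == "__main__":
    # Illustrative counts; note how a predictor that never predicts a failure
    # (tp = fp = 0) can still reach a high accuracy when failures are rare.
    print(contingency_metrics(tp=8, fp=4, fn=2, tn=86))
    print(contingency_metrics(tp=0, fp=0, fn=10, tn=90))  # accuracy = 0.9, recall = 0.0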
Learning Package Overview


   Motivation
     – Failure Prediction and Proactive Adaptation

   Failure Prediction through Online Testing (OT)
     – PROSA: Violation of Quality of Service (QoS)
     – JITO: Violation of Protocol

   Discussions

   Summary


                                                     © UniDue
PROSA
 Online Testing of QoS: Evaluation
To evaluate PROSA, we conducted an exploratory experiment with
the following setup:
   – Prototypical implementation of prediction approaches (see next slide)
   – Simulation of an example abstract service-based application
     (the workflow in the diagram)
     (100 runs, with 100 running applications each)
   – (Post-mortem) monitoring data from real Web services
     (e.g., Google; 2000 data points per service; QoS = performance)
     [Cavallo et al. 2010]
   – Measuring contingency table metrics (for S1 and S3)



                                                                             © UniDue
PROSA
Online Testing of QoS: Prediction Models
 Prediction model = arithmetic average of the last n data points:
      prediction_i = (1/n) * (x_i + x_{i-1} + … + x_{i-n+1})

 Initial exploratory experiments indicated that the number of past
  data points (n) impacts accuracy

 Thus, in the experiment three variations of the model were considered
  (a small sketch follows this slide):
    – n = 1, aka. “point prediction”  prediction value = current value
    – n = 5  prediction value = average of the last 5 data points
    – n = 10  prediction value = average of the last 10 data points



                                                                          © UniDue
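Note (added for illustration): a minimal sketch of the three model variants as moving averages over the last n data points; the response-time history is made up.

# Sketch of the prediction models compared in the experiment: the prediction
# for the next value is the arithmetic average of the last n data points
# (n = 1 is "point prediction"). Data values are illustrative.

def predict(data_points, n):
    """Average of the last n data points (uses fewer if less history exists)."""
    window = data_points[-n:]
    return sum(window) / len(window)

if __name__ == "__main__":
    history = [900, 950, 1200, 1100, 980, 1020, 990, 1300, 1010, 970, 1050]
    for n in (1, 5, 10):
        print(f"n = {n:2d}: predicted response time = {predict(history, n):.1f} ms")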
PROSA
Online Testing of QoS: Results

  S3




Considering the different prediction models:
• no significant difference in precision (p) & neg. predictive value (v)
• recall (r) ~ false positive rate (f)  “conflicting”!
• accuracy (a) best for “point prediction”

                                                                            © UniDue
PROSA
Online Testing of QoS: Results
  S1




 Considering the different prediction models:
 • no significant difference in precision (p) & neg. predictive value (v)
 • recall (r) ~ false positive rate (f)  “conflicting”!
 • accuracy (a) best for “point prediction”
 • difference from S3: “last 5” has highest recall for S1
                                                                             © UniDue
PROSA
Online Testing of QoS: Results
    S3




Comparing PROSA with Monitoring:
•    For S3, prediction based on online testing (ot) is improved along all
     metrics when compared with prediction based on monitoring (mon) only




                                                                     © UniDue
PROSA
Online Testing of QoS: Results
    S1




Comparing PROSA with Monitoring:
•    Improvement is not so high for S1 (already lots of monitoring data)




                                                                       © UniDue
PROSA
Online Testing of QoS: Discussions
 Pros:
   – Generally improves accuracy of failure prediction
   – Exploits available monitoring data
   – Beneficial in situations where prediction accuracy is critical while the available past
     monitoring data is not enough to achieve it
   – Can complement approaches that make predictions based on available monitoring data
     (e.g., approaches based on data mining) and require lots of data for accurate prediction
   – Can be combined with approaches for preventive adaptation, e.g.:
      - SLA violation prevention with machine learning based on predicted service failures
      - Run-time verification to check if an “internal” service failure leads to an “external”
        violation of an SLA

 Cons:
   – Assumes that testing a service doesn’t produce side effects
   – Can have associated costs due to testing:
       One can use the usage model to determine the need for the testing activities
       Requires further investigation into cost models that relate the costs of testing vs. the
        costs of compensating for wrong adaptations



                                                                                           © UniDue
Learning Package Overview




   Motivation
     – Failure Prediction and Proactive Adaptation

   Failure Prediction through Online Testing (OT)

   Discussions

   Summary


                                                  © UniDue
Summary
 2 complementary solutions for failure prediction
  based on Online Testing
   – PROSA: Prediction of QoS violation
   – JITO: Prediction of protocol violation

 Internal Failure does not necessarily imply external
  failure (i.e., violation of SLA / requirement of composed
  service)
    Combine “internal” failure prediction approaches with “external”
    failure prediction:
       - TUW & USTUTT: SLA violation prevention with machine learning based
         on predicted service failures
       - UniDue: Run-time verification to check if “internal” service failure leads to
         “external” violation of SLA


                                                                                 © UniDue
Further S-Cube Reading

•   [Sammodi et al. 2011] O. Sammodi, A. Metzger, X. Franch, M. Oriol, J. Marco, and K. Pohl. Usage-based
    online testing for proactive adaptation of service-based applications. In COMPSAC 2011
•   [Metzger 2011] A. Metzger. Towards Accurate Failure Prediction for the Proactive Adaptation of Service-
    oriented Systems (Invited Paper). In ASAS@ESEC 2011
•   [Metzger et al. 2010] A. Metzger, O. Sammodi, K. Pohl, and M. Rzepka. Towards pro-active adaptation
    with confidence: Augmenting service monitoring with online testing. In SEAMS@ICSE 2010
•   [Hielscher et al. 2008] J. Hielscher, R. Kazhamiakin, A. Metzger, and M. Pistore. A framework for proactive
    self-adaptation of service-based applications based on online testing. In ServiceWave 2008
•   [Dranidis et al. 2010] D. Dranidis, A. Metzger, and D. Kourtesis. Enabling proactive adaptation through
    just-in-time testing of conversational services. In ServiceWave 2010




                                                                                                    © UniDue
References
[Salehie et al. 2009] Salehie, M., Tahvildari, L.: Self-adaptive software: Landscape and research challenges. ACM Transactions on
Autonomous and Adaptive Systems 4(2), 14:1 – 14:42 (2009)
[Di Nitto et al. 2008] Di Nitto, E.; Ghezzi, C.; Metzger, A.; Papazoglou, M.; Pohl, K.: A Journey to Highly Dynamic, Self-adaptive
Service-based Applications. Automated Software Engineering (2008)
[PO-JRA-1.3.1] S-Cube deliverable # PO-JRA-1.3.1: Survey of Quality Related Aspects Relevant for Service-based Applications;
http://www.s-cube-network.eu/results/deliverables/wp-jra-1.3
[PO-JRA-1.3.5] S-Cube deliverable # PO-JRA-1.3.5: Integrated principles, techniques and methodologies for specifying end-to-end
quality and negotiating SLAs and for assuring end-to-end quality provision and SLA conformance;
http://www.s-cube-network.eu/results/deliverables/wp-jra-1.3
[S-Cube KM] S-Cube Knowledge Model: http://www.s-cube-network.eu/knowledge-model
[Trammell 1995] Trammell, C.: Quantifying the reliability of software: statistical testing based on a usage model. In ISESS’95.
Washington, DC: IEEE Computer Society, 1995, p. 208
[Musa 1993] Musa, J.: Operational profiles in software-reliability engineering. IEEE Software, vol. 10, no. 2, pp. 14–32, Mar. 1993
[Salfner et al. 2010] F. Salfner, M. Lenk, and M. Malek. A survey of online failure prediction methods. ACM Comput. Surv., 42(3),
2010
[Cavallo et al. 2010] B. Cavallo, M. Di Penta, and G. Canfora. An empirical comparison of methods to support QoS-aware service
selection. In PESOS@ICSE 2010




                                                                                                                                     © UniDue
Acknowledgment




     The research leading to these results has
     received funding from the European
     Community’s Seventh Framework
     Programme [FP7/2007-2013] under grant
     agreement 215483 (S-Cube).




                                                 © UniDue
