Keynote
On the Effectiveness of SBSE
Techniques through Instance
Space Analysis
Aldeida Aleti
Monash University, Australia
@AldeidaAleti aldeida.aleti@monash.edu
Effectiveness of SBSE - Status Quo
Much SBSE research focuses on introducing new SBSE approaches
As part of the evaluation process, a set of experiments is usually conducted
- A benchmark is selected, e.g., Defects4J
- The new approach is compared against the state of the art
- Averages/medians are reported
- Some statistical tests are conducted
Instance Space Analysis
1. to understand and visualise the strengths and weaknesses of different approaches
2. to help with the objective assessment of different approaches
a. Scrutinising how approaches perform under different conditions, and stress testing them
Motivation 1: Are the problem instances adequate?
Problem 1: How were the problem instances selected?
Common benchmark problems are important for fair comparison, but are they
- demonstrably diverse
- unbiased
- representative of a range of real-world contexts
- challenging
- discriminating
ICSE 2022 review criteria
Motivation 2: Reporting averages/medians obscures
important information
A. Perera, A. Aleti, M. Böhme and B. Turhan, "Defect Prediction Guided Search-Based
Software Testing," 2020 35th IEEE/ACM International Conference on Automated Software
Engineering (ASE), 2020, pp. 448-460.
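A minimal sketch of why a summary statistic can hide per-instance behaviour. The coverage numbers below are made up for illustration and do not come from the cited study; `summarise` and `compare` are hypothetical helper names.

```python
from statistics import mean, median

# Hypothetical branch coverage (%) of two approaches on the same six instances.
coverage_a = [90, 88, 10, 12, 50, 50]
coverage_b = [50, 50, 50, 50, 50, 50]

def summarise(scores):
    """Return the summary statistics that are usually reported."""
    return {"mean": mean(scores), "median": median(scores)}

def compare(scores_x, scores_y):
    """Count instances where x wins, y wins, and where they tie."""
    x_wins = sum(x > y for x, y in zip(scores_x, scores_y))
    y_wins = sum(y > x for x, y in zip(scores_x, scores_y))
    ties = len(scores_x) - x_wins - y_wins
    return x_wins, y_wins, ties

print(summarise(coverage_a), summarise(coverage_b))
print(compare(coverage_a, coverage_b))
```

Both approaches share the same mean and median, yet each one clearly wins on a different subset of instances. That per-instance view is what the aggregate hides.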
Problem 2: Performance is often problem dependent
(No Free Lunch theorem, NFT)
- What are the strengths and weaknesses of the approaches?
- Which are the problem instances where an approach performs really well and
why?
- Which are the problem instances where an approach struggles and why?
- How do features of the problem instances affect the performance of the
approaches?
- Which features give an algorithm competitive advantage?
- Given a problem instance with particular features, which approach should I use?
Which algorithm is suitable for future problems?
Example
Which approach is better? SF110
C. Oliveira, A. Aleti, L. Grunske and K. Smith-Miles, "Mapping the Effectiveness of Automated Test Suite Generation
Techniques," in IEEE Transactions on Reliability, vol. 67, no. 3, pp. 771-785, Sept. 2018, doi: 10.1109/TR.2018.2832072.
Instance Space Analysis for Search Based Software Engineering
Open Questions
● What impacts the effectiveness of SBSE techniques?
○ How can features of problem instances help us infer what are the strengths and weaknesses of
different SBSE approaches?
○ How can we objectively assess different SBSE techniques?
● How easy or hard are existing benchmarks? How diverse are they? Are they biased
towards a particular technique?
● Can we select the most suitable SBSE technique given a problem with particular
features?
Empirical Review of Program Repair Tools: A Large-Scale Experiment on 2,141 Bugs and 23,551
Repair Attempts. T. Durieux, F. Madeiral, M. Martinez, R. Abreu. ESEC/FSE Foundations of Software
Engineering (2019) doi: 10.1145/3338906.3338911.
ISA
K. Smith-Miles et al. / Computers & Operations Research 45 (2014) 12–24
Steps of ISA
1. Create the metadata
a. Features
b. SBSE performances
2. Create instance space
3. Visualise footprints
4. Explain strengths/weaknesses
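The four steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the procedure of the cited work: the metadata values are invented, a hand-picked matrix (`PROJECTION`) stands in for the projection step, and an axis-aligned bounding box stands in for a footprint.

```python
from statistics import mean, pstdev

# Step 1: metadata. Feature vectors and per-approach performance (all hypothetical).
metadata = [
    {"features": [1.0, 2.0, 3.0], "perf": {"A": 0.90, "B": 0.40}},
    {"features": [2.0, 2.0, 1.0], "perf": {"A": 0.80, "B": 0.50}},
    {"features": [5.0, 6.0, 2.0], "perf": {"A": 0.30, "B": 0.90}},
    {"features": [6.0, 7.0, 4.0], "perf": {"A": 0.20, "B": 0.85}},
]

def standardise(rows):
    """Scale each feature column to zero mean and unit variance."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    spreads = [pstdev(c) or 1.0 for c in columns]  # guard against constant columns
    return [[(v - m) / s for v, m, s in zip(row, means, spreads)] for row in rows]

def project(rows, matrix):
    """Step 2: map each feature vector to 2D with a fixed linear projection."""
    return [tuple(sum(w * v for w, v in zip(weights, row)) for weights in matrix)
            for row in rows]

def footprint(points, metadata, approach, threshold=0.75):
    """Step 3: bounding box (min_x, min_y, max_x, max_y) of the instances
    on which the approach reaches the performance threshold."""
    good = [p for p, m in zip(points, metadata) if m["perf"][approach] >= threshold]
    if not good:
        return None
    xs, ys = zip(*good)
    return (min(xs), min(ys), max(xs), max(ys))

PROJECTION = [[0.5, 0.5, 0.0], [0.0, 0.0, 1.0]]  # hypothetical 2x3 matrix
points = project(standardise([m["features"] for m in metadata]), PROJECTION)
footprints = {a: footprint(points, metadata, a) for a in ("A", "B")}

# Step 4: compare where each footprint lies to explain strengths and weaknesses.
for name, box in footprints.items():
    print(name, box)
```

In this toy data the two footprints do not overlap along the first axis, which is the kind of separation one would then try to explain through the underlying features.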
Features (56)
What makes the problem easy or hard?
Problem instances SF110
Performance measure
● Branch coverage.
● An approach is considered superior if its branch coverage is at least 1% higher than
the other techniques; otherwise, we use the label “Equal.”
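The labelling rule above could be written as a small function. This sketch assumes that "1% higher" means one percentage point of branch coverage, which the slide does not specify; `label_best` and its `margin` parameter are hypothetical names, and the coverage values are invented.

```python
def label_best(coverage, margin=1.0):
    """Label an instance with the superior approach, or "Equal".

    coverage maps approach name -> branch coverage in percent.
    The top approach must beat the runner-up by at least `margin`
    percentage points (an assumed reading of the 1% rule).
    """
    if len(coverage) < 2:
        raise ValueError("need at least two approaches to compare")
    ranked = sorted(coverage.items(), key=lambda item: item[1], reverse=True)
    (best, best_cov), (_, second_cov) = ranked[0], ranked[1]
    return best if best_cov - second_cov >= margin else "Equal"

print(label_best({"WSA": 80.0, "MOSA": 82.5, "RT": 60.0}))
print(label_best({"WSA": 80.0, "MOSA": 80.5, "RT": 60.0}))
```

Comparing against the runner-up is enough here: if the best approach clears the second best by the margin, it clears all the others too.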
Approaches
● Whole Test Suite with Archive (WSA)
● Many Objective Sorting Algorithm (MOSA)
● Random Testing (RT)
Significant features
● coupling between object classes
○ the number of classes coupled to a given class (method calls, field accesses, inheritance,
arguments, return types, and exceptions)
● response for a class
○ the number of different methods that can be executed when a method is invoked on an object
of that class
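To make the two metrics concrete, here is a toy computation. The `model` dictionary is a hypothetical stand-in for what a static analyser would extract, and it only captures call coupling (no field accesses, inheritance, argument types, return types, or exceptions), so the numbers are a simplification of the definitions above.

```python
# Toy model: each class maps its own methods to the (class, method) pairs they call.
model = {
    "Order": {
        "total": [("Item", "price"), ("Discount", "apply")],
        "add": [("Item", "price")],
    },
    "Item": {"price": []},
    "Discount": {"apply": [("Item", "price")]},
}

def cbo(model, cls):
    """Coupling between object classes: classes that `cls` calls into,
    plus classes that call into `cls` (call coupling only)."""
    uses = {c for calls in model[cls].values() for c, _ in calls if c != cls}
    used_by = {
        other
        for other, methods in model.items() if other != cls
        for calls in methods.values() if any(c == cls for c, _ in calls)
    }
    return len(uses | used_by)

def rfc(model, cls):
    """Response for a class: its own methods plus the distinct methods they call."""
    own = {(cls, m) for m in model[cls]}
    called = {call for calls in model[cls].values() for call in calls}
    return len(own | called)

for name in model:
    print(name, "CBO =", cbo(model, name), "RFC =", rfc(model, name))
```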
SBST Footprints
SBST selection
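One way to picture approach selection from instance features is a nearest-neighbour lookup over labelled metadata. This is only a sketch: the training rows are invented, the two features are assumed to be coupling between object classes and response for a class, and the nearest-neighbour rule stands in for whatever classifier a study would actually train.

```python
import math
from collections import Counter

# Hypothetical labelled metadata: (CBO, RFC) -> best approach on that instance.
training = [
    ((2.0, 5.0), "RT"),
    ((3.0, 8.0), "Equal"),
    ((12.0, 40.0), "MOSA"),
    ((15.0, 55.0), "WSA"),
]

def select_approach(features, training, k=1):
    """Recommend the majority label among the k closest training instances."""
    nearest = sorted(training, key=lambda row: math.dist(features, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]

print(select_approach((13.0, 42.0), training))
print(select_approach((2.5, 5.5), training))
```

In practice the features would be standardised first so that no single feature dominates the distance.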
E-APR
Metadata
Features (146)
Observation-based features (Yu et al. 2019)
Significant Features (9)
(F1) MOA: Measure of Aggregation.
(F2) CAM: Cohesion Among Methods
(F3) AMC: Average Method Complexity
(F4) PMC: Private Method Count
(F5) AECSL: Atomic Expression Comparison Same Left indicates the number of statements
with a binary expression that have more than an atomic expression (e.g., variable access).
(F6) SPTWNG: Similar Primitive Type With Normal Guard indicates the number of
statements that contain a variable (local or global) that is also used in another statement
contained inside a guard (i.e., an If condition).
(F7) CVNI: Compatible Variable Not Included is the number of local primitive type variables
within the scope of a statement that involves primitive variables that are not part of that
statement.
(F8) VCTC: Variable Compatible Type in Condition measures the number of variables within
an If condition that are compatible with another variable in the scope.
(F9) PUIA: Primitive Used In Assignment - the number of primitive variables in assignments.
● Little overlap between
IntroClassJava/Defects4J and the other
datasets
● Bugs.jar has the most diverse bugs
APR selection
For ISA to reveal useful insights
● Diverse features
● Diverse instances
● Diverse approaches
● A good performance measure
So what
We have a responsibility to find the weaknesses of the approaches we develop
We need to make sure that the chosen problem instances are demonstrably diverse,
unbiased, representative of a range of real-world contexts, challenging, and
discriminating of approach performance
To understand which approach is suitable for future problems, we must understand
which features impact its performance
