Annibale Panichella, Andy Zaidman, Andrea De Lucia
Adaptive User Feedback for
IR-based Traceability Recovery
Traceability Recovery
An Information Retrieval method (LSI, VSM, etc.) searches the software artefacts, with a source artefact such as Use Case 1 used as the query.
Searching with Use Case 1 as the query returns a list of candidate links, ranked by similarity:
Class1.java (nl.tudelft.package1.class): 90%
Class5.java (nl.tudelft.package2.class): 85%
Class2.java (nl.tudelft.package1.class): 81%
Class63.java (nl.tudelft.package5.class): 74%
Class27.java (nl.tudelft.package3.class): 62%
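The ranking step above can be sketched in a few lines. This is a minimal illustration, assuming raw term frequencies and cosine similarity as the Vector Space Model weighting; the slides do not state the weighting scheme, and the artefact names and texts below are hypothetical.

```python
import math
from collections import Counter


def tf_vector(text):
    """Term-frequency vector of a text (lower-cased, whitespace-tokenised)."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query_text, documents):
    """Return (name, similarity) pairs sorted by decreasing similarity."""
    q = tf_vector(query_text)
    scored = [(name, cosine(q, tf_vector(text))) for name, text in documents.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical toy artefacts, not taken from the studied projects.
use_case = "patient books visit"
classes = {
    "Booking.java": "patient visit booking",
    "Invoice.java": "invoice payment amount",
}
ranking = rank_candidates(use_case, classes)
```

The resulting list plays the role of the candidate links that the user inspects from the top down.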
The user then classifies each candidate link for Use Case 1 as correct (✓) or as a false positive (✗). In list order (Class1.java, Class5.java, Class2.java, Class63.java, Class27.java) the marks are ✓ ✗ ✗ ✓ ✓.
Relevance Feedback
Hayes et al., “Advancing Candidate Link Generation for Requirements Tracing: the Study of Methods”, IEEE Transactions on Software Engineering, 2006
For Use Case 1, the user classifies the first two links in the list, Class1.java and Class5.java, as ✓ and ✗. The remaining candidates are Class2.java (81%), Class63.java (74%) and Class27.java (62%).
Learning Process: the classifications (✓, ✗) are fed back into the search for the remaining candidates.
After the learning step the search is run again and the similarities of the remaining candidates change from 81%, 74% and 62% to:
Class2.java (nl.tudelft.package1.class): 77%
Class63.java (nl.tudelft.package5.class): 76%
Class27.java (nl.tudelft.package3.class): 43%
The process repeats on the re-ranked list. In list order (Class1.java, Class5.java, Class2.java, Class63.java, Class27.java) the marks are now ✓ ✗ ✓ ✓ ✗.
Hayes et al. (Standard Rocchio): “Analyst feedback improves the final trace results” for requirements tracing.
De Lucia et al., “Incremental Approach and User Feedback: a Silver Bullet for Traceability Recovery”, International Conference on Software Maintenance, 2006
De Lucia et al. (Standard Rocchio): relevance feedback does not improve and sometimes worsens the accuracy of an IR method when applied to different software artefacts.
Theory behind Relevance Feedback
Manning et al., “Introduction to Information Retrieval”, Cambridge University Press, 2008.
Assumption 1. Queries contain few words compared to the size of the documents to retrieve, e.g. a short query against Web pages with hundreds of words.
Use Case or Test Case: which one is the query?
In traceability, source artefacts (queries) can be more verbose than target artefacts (documents).
Assumption 2. Cluster hypothesis: relevant documents must be similar to each other, i.e., they should cluster in the vector space.
[Figure: relevant documents, non-relevant documents and the query plotted in a vector space with axes Term1 and Term2]
Language used in software artefacts is much more homogeneous than natural language.
Adaptive Relevance Feedback
Standard Rocchio:
List = initial ranked list of candidate links
while not (stopping criterion) {
    Get the link (source, target) on top of List
    The user classifies (source, target)
    Apply the standard Rocchio to source
}
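The "Apply the standard Rocchio" step can be sketched as the textbook query update q' = alpha*q + beta*centroid(relevant) - gamma*centroid(non-relevant), with negative weights dropped. This is an illustration under assumptions: vectors are plain term-to-weight dictionaries, and the default coefficients are commonly quoted values, not parameters reported in the slides.

```python
def rocchio(query, relevant, non_relevant, alpha=1.0, beta=0.75, gamma=0.15):
    """Return the Rocchio-updated query vector (dict: term -> weight).

    query        -- current query vector
    relevant     -- vectors of artefacts the user marked as correct links
    non_relevant -- vectors of artefacts the user marked as false positives
    alpha, beta, gamma -- illustrative defaults (assumption)
    """
    updated = {term: alpha * weight for term, weight in query.items()}
    for docs, coeff in ((relevant, beta), (non_relevant, -gamma)):
        if not docs:
            continue  # no feedback of this kind yet
        for doc in docs:
            for term, weight in doc.items():
                # Add (or subtract) the centroid contribution of this term.
                updated[term] = updated.get(term, 0.0) + coeff * weight / len(docs)
    # Terms pushed to zero or below are removed from the query.
    return {term: weight for term, weight in updated.items() if weight > 0}


# Hypothetical vectors for one feedback round.
query = {"visit": 1.0}
marked_correct = [{"visit": 1.0, "booking": 2.0}]
marked_wrong = [{"invoice": 1.0, "visit": 1.0}]
new_query = rocchio(query, marked_correct, marked_wrong)
```

After the update, the query gains the term "booking" from the correct link and does not pick up "invoice" from the false positive.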
Adaptive Standard Rocchio: apply the relevance feedback only to the shortest artefacts (Assumption 1)
List = initial ranked list of candidate links
while not (stopping criterion) {
    Get the link (source, target) on top of List
    The user classifies (source, target)
    if (length(source) < length(target))
        Apply the standard Rocchio to source
    if (length(target) < length(source))
        Apply the standard Rocchio to target
}
Adaptive Standard Rocchio:
Apply the relevance feedback only to the shortest artefacts (Assumption 1)
Apply the relevance feedback if and only if the number of correct links is >= the number of false positives (Assumption 2)
List = initial ranked list of candidate links
while not (stopping criterion) {
    Get the link (source, target) on top of List
    The user classifies (source, target)
    if (length(source) < length(target) && TruePositive(source) > FalsePositive(source))
        Apply the standard Rocchio to source
    if (length(target) < length(source) && TruePositive(target) > FalsePositive(target))
        Apply the standard Rocchio to target
}
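The two adaptive conditions can be sketched as a small decision function. This is a hedged illustration, not the authors' implementation: the function name and parameters are hypothetical, artefact length is assumed to be a word count, and the strict `>` follows the pseudocode, while the accompanying note words the condition as >=, so the boundary case is an open choice.

```python
def feedback_side(source_len, target_len,
                  source_tp, source_fp,
                  target_tp, target_fp):
    """Decide which artefact of a classified link receives Rocchio feedback.

    source_len, target_len -- artefact sizes (assumed: number of words)
    *_tp, *_fp             -- correct links and false positives classified
                              so far for that artefact
    Returns "source", "target", or None when no feedback is applied.
    """
    # Assumption 1: only the shorter artefact acts as the query.
    # Assumption 2: only if its correct links outnumber its false positives.
    if source_len < target_len and source_tp > source_fp:
        return "source"
    if target_len < source_len and target_tp > target_fp:
        return "target"
    return None


# Hypothetical examples.
short_use_case = feedback_side(10, 200, 3, 1, 0, 0)
short_class = feedback_side(300, 40, 0, 0, 2, 1)
too_many_false_positives = feedback_side(10, 200, 1, 2, 5, 0)
```

When the function returns None the query is left untouched for that link, which is how the adaptive variant avoids the cases where both assumptions fail.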
Empirical Evaluation
Context: three software projects
We investigate the following research questions:
RQ1: Does the adaptive relevance feedback improve the performance of the Vector Space Model?
RQ2: Does the adaptive relevance feedback outperform the standard relevance feedback?
Metrics:
Precision = TP / (TP + FP)
Recall = TP / (Tot Links)
Wilcoxon Test (non-parametric)
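The two metrics can be computed directly from the set of retrieved links and the set of correct links. The sketch below assumes links are represented as (source, target) pairs; the link names are hypothetical.

```python
def precision_recall(retrieved, correct):
    """Precision = TP / (TP + FP); Recall = TP / total number of correct links."""
    retrieved, correct = set(retrieved), set(correct)
    tp = len(retrieved & correct)   # retrieved links that are correct
    fp = len(retrieved - correct)   # retrieved links that are wrong
    precision = tp / (tp + fp) if retrieved else 0.0
    recall = tp / len(correct) if correct else 0.0
    return precision, recall


# Hypothetical links: three retrieved, four correct in total.
retrieved_links = [("UC1", "A.java"), ("UC1", "B.java"), ("UC1", "C.java")]
correct_links = [("UC1", "A.java"), ("UC1", "C.java"),
                 ("UC1", "D.java"), ("UC1", "E.java")]
precision, recall = precision_recall(retrieved_links, correct_links)
```

Here two of the three retrieved links are correct, and two of the four correct links are retrieved.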
Empirical Results
[Plots: Easy-Clinic, tracing UC onto CC; Easy-Clinic, tracing ID onto CC; Easy-Clinic, tracing TC onto CC; i-Trust, tracing UC onto JSP; Modis, tracing HLR onto LLR]
[Bar chart: Average Precision (axis from 0 to 90) of VSM, RF and Adaptive RF on UC-CC, ID-CC, TC-CC, i-Trust and Modis]
Statistical Significance (Wilcoxon test):
AdaptiveRF > VSM = 4/5
AdaptiveRF > RF = 4/5
RF > VSM = 1/5
In Summary
Future Works…
