SUPPORTING SOFTWARE CHANGE TASKS
USING AUTOMATED QUERY
REFORMULATIONS
Masud Rahman
PhD Candidate
Department of Computer Science
University of Saskatchewan, Canada
Email: masud.rahman@usask.ca
CMPT 470/816: Advanced Software Engineering
A TALE OF SOFTWARE CHANGE
2
Alex
Bob
Code base Customer
Code base
Bug repository
A TALE OF SOFTWARE CHANGE (CONTD.)
3
Alex
Bob
Customer
Buggy
files
Bug
report
Change
implementation
Keywords
Code search
Code
base
TALK OUTLINE
4
Automated Query
Reformulation
Part I: Suggest keywords from
the change request texts
Part II: Reformulate initial
query of developer using
codebase
STRICT: INFORMATION RETRIEVAL
BASED SEARCH TERM IDENTIFICATION
FOR CONCEPT LOCATION
Mohammad Masudur Rahman, Chanchal K. Roy
Department of Computer Science
University of Saskatchewan, Canada
International Conference on Software Analysis, Evolution and
Reengineering (SANER 2017), Klagenfurt, Austria
SOFTWARE CHANGE TASK
6
Task Summary
Task Description
Other Information
SOFTWARE CHANGE TASK:
DOMAIN CONCEPT--ARTIFACT MAPPING
IResource
element
Tree
Level
Provider
7
Domain concepts
Project artifacts
(e.g., classes, methods)
Our
contribution:
Identifying
such concepts
QUIZ TEST-I
8
ID   Query                                                       QE
1.   Custom search results view iresource                        1331
2.   Custom search results search results view                   636
3.   element iresource provider level tree                       01
4.   Custom search results hierarchically java search results    570
EXISTING WORKS
• Query reformulation & expansion
  – Haiduc et al, ICSE 2013
  – Gay et al, ICSM 2009
  – Shepherd et al, AOSD 2007
• Query quality analysis
  – Haiduc et al, ASE 2011
  – Haiduc et al, ICPC 2011
  – Haiduc et al, ICSE 2012
• Software artifact mining
  – Howard et al, MSR 2013
  – Kevic & Fritz, MSR 2014
• Heuristics
  – Kevic & Fritz, ICSE 2014
9
• Most studies expect the developer to provide an initial query
• Developers succeed only in 12.2% of cases (Kevic & Fritz, ICSE 2014)
Initial search query for a change task.
PAGERANK ALGORITHM: WEB LINK ANALYSIS
10
Size of a face ∝ size of the faces pointing to it
Most important face
in this crowd
SEARCH TERM IDENTIFICATION USING
TEXTRANK & POSRANK, TWO VARIANTS OF
PAGERANK
11
SCHEMATIC DIAGRAM: PROPOSED
APPROACH
12
Change
request
Preprocessing
TextRank
calculation
POSRank
calculation
Ranking
Search terms
Focus of this talk
TEXTRANK: TERM IMPORTANCE USING CO-OCCURRENCE (MIHALCEA ET AL, EMNLP 2004)
13
IResource-------IJavaElement, element-----reported
Node = Distinct word
Edge = Two words co-occurring in the same context
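As a rough illustration of this step (a minimal Python sketch; the function and variable names are hypothetical, not from the STRICT implementation), a text graph can be built by connecting words that co-occur within a small window inside each preprocessed sentence:

from collections import defaultdict

def build_text_graph(sentences, window=2):
    # Nodes are distinct words; an edge connects two words that appear
    # within `window` positions of each other in the same sentence.
    graph = defaultdict(set)
    for words in sentences:  # each sentence is a list of tokens
        for i, w in enumerate(words):
            for j in range(i + 1, min(i + window + 1, len(words))):
                if words[j] != w:
                    graph[w].add(words[j])
                    graph[words[j]].add(w)
    return graph

# Toy example with tokens from a change request
text_graph = build_text_graph([
    ["IResource", "IJavaElement", "element", "reported"],
    ["element", "IResource", "provider"],
])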
POSRANK: TERM IMPORTANCE USING SYNTACTIC
DEPENDENCE (BLANCO & LIOMA, INF. RETR. 2012)
14
Edge = Syntactic dependence between various parts of speech in the sentence
Verb-------Noun, Verb---Adjective
Jespersen’s Theory of 3 Ranks
Noun
Verb Adjective
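A comparable sketch for the POS graph, assuming NLTK's off-the-shelf tagger and a coarse reading of the verb–noun and verb–adjective dependencies named above (the edge rules here are a simplification, not the paper's exact implementation):

from collections import defaultdict
import nltk  # assumes the 'averaged_perceptron_tagger' data is installed

DEPENDS = {("VB", "NN"), ("VB", "JJ")}  # verb-noun, verb-adjective

def build_pos_graph(sentences):
    # Connect two words in a sentence when their coarse POS tags form
    # one of the dependence pairs above (Jespersen-style ranks).
    graph = defaultdict(set)
    for words in sentences:
        tagged = nltk.pos_tag(words)  # [(word, POS tag), ...]
        for w1, t1 in tagged:
            for w2, t2 in tagged:
                if w1 != w2 and (t1[:2], t2[:2]) in DEPENDS:
                    graph[w1].add(w2)
                    graph[w2].add(w1)
    return graph

pos_graph = build_pos_graph([["debugger", "reported", "missing", "source"]])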
TERM IMPORTANCE
(ADAPTED FROM PAGERANK)
15
S(V_i) = (1 - d) + d \sum_{j \in In(V_i)} \frac{S(V_j)}{|Out(V_j)|}, \quad 0 \le d \le 1

• V_i – node of interest
• V_j – node connected to V_i through incoming links
• d – damping factor (i.e., probability of choosing a node in the network)
• In(V_i) – incoming nodes to V_i
• Out(V_j) – outgoing nodes from V_j
TERM IMPORTANCE (EXPLAINED)
16
(Figure: node Vi with incoming neighbour nodes Vj1 … Vj6)
Term Score (Vi) = TextRank (Vi) + POSRank (Vi)
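The recursive scoring itself can be sketched as follows (illustrative Python, assuming the undirected graphs from the earlier sketches, the conventional damping factor d = 0.85, and a fixed number of iterations; the paper iterates until the scores converge):

def rank_terms(graph, d=0.85, iters=30):
    # S(Vi) = (1 - d) + d * sum over neighbours Vj of S(Vj) / |Out(Vj)|;
    # in an undirected graph, In(Vi) and Out(Vi) are both the neighbour set.
    scores = {v: 1.0 for v in graph}
    for _ in range(iters):
        scores = {v: (1 - d) + d * sum(scores[u] / len(graph[u])
                                       for u in graph[v])
                  for v in graph}
    return scores

text_rank = rank_terms(text_graph)
pos_rank = rank_terms(pos_graph)
terms = set(text_rank) | set(pos_rank)
total = {t: text_rank.get(t, 0.0) + pos_rank.get(t, 0.0) for t in terms}
suggested_query = sorted(total, key=total.get, reverse=True)[:10]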
EXPERIMENTAL DATASET
17
8 Projects (Apache + Eclipse)
GitHub commits &
Change set
BugZilla + JIRA issues
1,939 change tasks
EXPERIMENTAL SETUP
18
Change
request
Baseline
query
Suggested
query
Code search
Our ranks
Baseline
ranks
Compare
Query Effectiveness
Mean Average Precision
Mean Recall
Top-K Accuracy
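For concreteness, Query Effectiveness (QE) is the rank of the first correct result returned for a query, and Top-K accuracy is the fraction of queries whose first correct result appears within the top K; a toy sketch with hypothetical names:

def query_effectiveness(ranked_files, gold_files):
    # 1-based rank of the first ground-truth file in the result list.
    for rank, f in enumerate(ranked_files, start=1):
        if f in gold_files:
            return rank
    return None  # no relevant result retrieved

def top_k_accuracy(all_results, all_gold, k=10):
    hits = 0
    for ranked, gold in zip(all_results, all_gold):
        qe = query_effectiveness(ranked, gold)
        if qe is not None and qe <= k:
            hits += 1
    return hits / len(all_results)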
EXPERIMENTAL RESULTS
(QUERY EFFECTIVENESS)
19
Query Pairs                      Improved   Worsened   P-value    Preserved   MRD
STRICT vs. Title                 57.84%     34.94%     <0.001*    7.22%       -147
STRICT vs. Title (10 keywords)   62.49%     32.26%     <0.001*    5.25%       -201
STRICT vs. Description           53.84%     38.21%     <0.001*    7.95%       -329
STRICT vs. (Title + Desc.)       52.36%     39.94%     <0.001*    7.70%       -265
*= Significant Difference, MRD = Mean Rank Difference
EXPERIMENTAL RESULTS
(RETRIEVAL PERFORMANCE)
20
*Our performance is significantly higher for each metric
EXPERIMENTAL RESULTS
(RETRIEVAL PERFORMANCE)
21
Our Top-K accuracy is clearly higher for various K-values
COMPARISON WITH EXISTING METHODS
(QUERY EFFECTIVENESS)
22
Technique                     Improved   Worsened   Preserved   MRD
Kevic & Fritz, ICSE 2014      40.09%     53.95%     5.96%       +101
Rocchio's Method, ICSE 2013   37.59%     56.38%     6.03%       +45
STRICT                        57.84%*    34.94%*    7.22%       -147
*= Significant Difference, MRD = Mean Rank Difference
COMPARISON WITH EXISTING METHODS
(RETRIEVAL PERFORMANCE)
23
*Our performance is significantly higher for each metric than the state-of-the-art
COMPARISON WITH EXISTING METHODS
(RETRIEVAL PERFORMANCE)
24
Our Top-K accuracy is clearly higher for various K-values than the state-of-the-art
TAKE-HOME MESSAGES
• Identifying initial search terms is challenging.
• Only 12.20% of developers' search terms are relevant.
• We adapted the PageRank algorithm for term importance.
• We combined TextRank and POSRank to identify important terms.
• Experiments with 1,939 change tasks from 8 systems of Apache & Eclipse.
• 57.84% of queries were improved by STRICT.
• Comparison with the state-of-the-art approach validates our approach.
25
IMPROVED QUERY REFORMULATION FOR
CONCEPT LOCATION USING CODERANK AND
DOCUMENT STRUCTURES
Mohammad Masudur Rahman, Chanchal K. Roy
Department of Computer Science
University of Saskatchewan, Canada
International Conference on Automated Software Engineering
(ASE 2017), Urbana-Champaign, IL, USA
AN EXAMPLE CHANGE REQUEST
27
Field Content
Issue ID 31110
Product eclipse.jdt.debug
Title Debbugger Source Lookup does not work with variables
Description In the Debugger Source Lookup dialog I can also select
variables for source lookup. (Advanced... > Add
Variables). I selected the variable which points to the
archive containing the source file for the type, but the
debugger still claims that he cannot find the source
SEARCH KEYWORD SELECTION
28
Field Content
Issue ID 31110
Product eclipse.jdt.debug
Title Debbugger Source Lookup does not work with
variables
Description In the Debugger Source Lookup dialog I can also
select variables for source lookup. (Advanced... > Add
Variables). I selected the variable which points to the
archive containing the source file for the type, but the
debugger still claims that he cannot find the source.
CHANGE REQUEST TO CODE MAPPING
29
Field Content
Issue ID 31110
Product eclipse.jdt.debug
Title Debbugger Source Lookup does not work with
variables
Description In the Debugger Source Lookup dialog I can also
select variables for source lookup. (Advanced... > Add
Variables). I selected the variable which points to the
archive containing the source file for the type, but the
debugger still claims that
he cannot find the source
BASELINE SEARCH QUERIES
30
Technique   Query                                    QE
Baseline    debugger source lookup                   79
Baseline    debugger source lookup work variables    77
Baseline query → Code search → Top-K documents → Pseudo-relevance Feedback → Baseline + Expansion terms
TRADITIONAL QUERY REFORMULATIONS
31
Technique           Reformulated Query                                                                   QE
RSV 1990            debugger source lookup work variables + launch configuration jdt java debug          30
Sisman & Kak 2013   debugger source lookup work variables + test exception suite core code               51
Refoqus 2013        debugger source lookup work variables + launch jdt configuration classpath project   12

Technique   Query                                    QE
Baseline    debugger source lookup                   79
Baseline    debugger source lookup work variables    77
BIG PICTURE: TERM WEIGHTING
32
TFIDF(t) = \sum_{d \in RF} \left(1 + \log(tf_{t,d})\right) \times \log\frac{D}{n_t}

Baseline query → Baseline + Expansion terms
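Read concretely: run the baseline query, take the top-retrieved documents as the pseudo-relevance feedback set RF, and score every candidate term t by the formula above. A minimal sketch, assuming each feedback document is already reduced to a term-frequency map (the data structures are illustrative):

import math

def tfidf_expansion_weights(feedback_docs, doc_freq, num_docs):
    # Sum (1 + log tf) * log(D / n_t) over the feedback documents;
    # the highest-weighted terms become the expansion terms.
    weights = {}
    for doc in feedback_docs:  # doc: {term: frequency in that document}
        for t, tf in doc.items():
            idf = math.log(num_docs / doc_freq[t])
            weights[t] = weights.get(t, 0.0) + (1 + math.log(tf)) * idf
    return weights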
BIG PICTURE: TERM WEIGHTING
33
TFIDF(t) = \sum_{d \in RF} \left(1 + \log(tf_{t,d})\right) \times \log\frac{D}{n_t}
• Different semantics
• Different structures
OUR CONTRIBUTIONS (2)
• Novel term weighting method – CodeRank
• Novel query reformulation technique – ACER
34
CODERANK: TERM WEIGHTING FOR SOURCE
CODE TERMS
35
CODERANK CALCULATION: STEP I
36
CODERANK CALCULATION: STEP II
37
resolveRuntimeClasspathEntry
Resolve Runtime Classpath Entry
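Step II splits structured tokens such as resolveRuntimeClasspathEntry into their component words before the graph is built; a simple regex-based sketch of this splitting:

import re

def split_camel_case(token):
    # 'resolveRuntimeClasspathEntry' -> ['resolve', 'Runtime', 'Classpath', 'Entry']
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", token)

print(split_camel_case("resolveRuntimeClasspathEntry"))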
CODERANK CALCULATION: STEP III
38
S(V_i) = (1 - d) + d \sum_{j \in In(V_i)} \frac{S(V_j)}{|Out(V_j)|}, \quad 0 \le d \le 1
Most important face
in this crowd
1. resolve
2. required
3. launch
4. classpath
5. runtime
ACER: QUERY REFORMULATION USING
CODERANK & MACHINE LEARNING
39
ACER: SCHEMATIC DIAGRAM
40
SOURCE DOCUMENT STRUCTURES
41
Class signature
Method signature
Field signature
ACER: SELECTION OF THE BEST QUERY
REFORMULATION
42
Ref. candidate
(method sig.)
Ref. candidate
(field sig.)
Ref. candidate
(method + field sigs)
Data re-sampling
Machine learning (Ensemble learning)
Selection of the best reformulation
Reformulated
query
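The selection step can be pictured as: compute quality features for each reformulation candidate, then let a trained classifier pick the most promising one. A minimal sketch assuming a scikit-learn ensemble model trained offline on query-quality features; the featurize function and the model here are placeholders standing in for the paper's 20 quality metrics and its ensemble learner:

from sklearn.ensemble import RandomForestClassifier

def select_best_reformulation(candidates, featurize, model):
    # Score each candidate query by the classifier's confidence that
    # it is a 'good' query (class 1), and return the best one.
    features = [featurize(q) for q in candidates]
    probs = model.predict_proba(features)[:, 1]
    best = max(range(len(candidates)), key=lambda i: probs[i])
    return candidates[best]

# model = RandomForestClassifier().fit(X_train, y_train)  # trained offline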
ACER: QUERY REFORMULATIONS
43
Technique           Query                                                                                 QE
Baseline            debugger source lookup                                                                79
Baseline            debugger source lookup work variables                                                 77
Refoqus 2013        debugger source lookup work variables + launch jdt configuration classpath project    12
CodeRank (method)   debugger source lookup work variables + launch debug resolve required classpath       02
CodeRank (field)    debugger source lookup work variables + label classpath system resolution launch      06
CodeRank (both)     debugger source lookup work variables + java type launch classpath label              16
ACER (via ML)       debugger source lookup work variables + launch debug resolve required classpath       02
EXPERIMENTAL DATASET
44
8 Projects (Apache + Eclipse)
GitHub commits &
Change set
BugZilla + JIRA issues
1,675 change
requests
EXPERIMENTAL SETUP
45
Change
request
Baseline
query
Reformulated
query
Code search
Our ranks
Baseline
ranks
Compare
Query Effectiveness (QE)
Mean Reciprocal Rank (MRR)
Top-K Accuracy
RESEARCH QUESTIONS (5)
• RQ1: Does ACER improve baseline queries significantly?
• RQ2: Does CodeRank perform better than traditional term weights (e.g., TF-IDF)?
• RQ3: Does document structure make a difference in query reformulation?
• RQ4: How do stemming, query length, and relevance feedback size affect our performance?
• RQ5: Does ACER outperform the state-of-the-art in query reformulation for concept location?
46
ANSWERING RQ1: QUERY EFFECTIVENESS OVER
BASELINE
47
Query Pairs                      Improved (MRD)   Worsened (MRD)   P-value   Preserved
CodeRank (method) vs. Baseline   58.93% (-61)     37.99% (+131)    0.007*    3.08%
CodeRank (field) vs. Baseline    52.51% (-51)     44.57% (+151)    0.063     2.91%
CodeRank (both) vs. Baseline     58.62% (-51)     38.19% (+136)    0.018*    3.20%
ACER vs. Baseline                71.05% (-81)     2.51% (+104)     <0.001*   26.44%
*= Significant difference between improvements and worsening, MRD = Mean Rank Difference
48
TF-IDF
ANSWERING RQ2: CODERANK VS. TRADITIONAL
TERM WEIGHTS
49
ANSWERING RQ3: DO SOURCE DOCUMENT
STRUCTURES MATTER?
50
ANSWERING RQ3: DO SOURCE DOCUMENT
STRUCTURES MATTER?
51
ANSWERING RQ4: IMPACT OF
REFORMULATION LENGTH
52
RQ5: COMPARISON WITH EXISTING METHODS
53
*Our performance is significantly higher for each metric than the state-of-the-art
1. CodeRank
2. Document contexts
3. Data re-sampling
TAKE-HOME MESSAGES
• Reformulating a search query is highly challenging for developers and costs a lot of effort.
• Traditional term weights are not sufficient.
• We provide CodeRank, which exploits source term semantics and source document contexts.
• We provide ACER, which selects the best from a set of reformulation candidates prepared by CodeRank.
• Experiments with 1,675 change requests from 8 OSS systems of Apache & Eclipse.
• 71% of queries improved, only 3% worsened by ACER.
• Comparison with five methods including the state-of-the-art validates our approach.
54
THANK YOU !!! QUESTIONS?
55
More details on CodeRank & ACER:
http://www.usask.ca/~masud.rahman/acer/
Contact: masud.rahman@usask.ca
More details on STRICT:
http://homepage.usask.ca/~masud.rahman/strict/
RQ5: COMPARISON WITH EXISTING METHODS
56
Our Top-K accuracy is clearly higher for various K-values than the state-of-the-art
PROVOCATIVE STATEMENT
• We need better algorithms to overcome the "vocabulary mismatch" issue. Where do we start? Which source/repository is more appropriate besides the project source code?
57
PROBABLE QUESTIONS
• Did you do stemming?
  • No, we didn't, since many recent studies reported negative performance. Stemming especially does not help when the texts contain structured items like camel-case tokens.
• Which one is better, TextRank or POSRank?
  • They performed quite similarly, but we combined them since they convey two distinct aspects of connectivity.
• Which settings did you apply for the ranking algorithm?
  • Details are in the paper. These PageRank-based algorithms tend to converge to similar scores regardless of their initial settings, unlike simple VSM-based models.
• Can this be used for query reformulation?
  • Possibly, if you can convert the artifact into a text graph. We are currently working on that using source code.
58
PROBABLE QUESTIONS
• Recent studies show that IR-based methods are not effective if the bug report is not rich.
  • Yes, that's true. We need more techniques to help developers write better bug reports, plus better methods to address the vocabulary mismatch issue.
• Why didn't you consider anything from the source code?
  • We are suggesting the initial query; the source code will be used for query reformulation. We also showed that our initial query is better than the baselines that developers frequently use.
• What is the cost? How long does it take?
  • It runs in essentially real time. We are currently planning to develop an IDE plug-in.
59

Editor's Notes

  • #2: Introduce yourself and the affiliation. Today I am going to talk about query suggestion for Concept location where we used Information Retrieval methods.
  • #6: Introduce yourself and the affiliation. Today I am going to talk about query suggestion for Concept location where we used Information Retrieval methods.
  • #7: This is a software change request. It has different sections like title, description and others. Now a developer’s task is to identify the most important terms and then use them for finding the source code to change.
  • #8: To model the problem formally, this is a mapping problem. And the mapping is between concepts in the change request and the relevant source artifacts from the codebase. Our job is to identify the appropriate terms from the change request for the successful mapping.
  • #10: There have been some studies on similar problems. However, most of these studies reformulate a given query; that means the developer needs to provide an initial query first. But studies show that choosing that initial query itself is challenging: one study reported that only 12% of the search terms developers chose from the change request were useful. So, our focus is to choose the initial query from a change request rather than reformulate one. The most closely related work used a set of heuristics.
  • #11: While the earlier work used heuristics for the same problem, we used Google's PageRank algorithm to choose the important terms from a body of text. Here, the most important face in the crowd is the face everybody is looking at, right? The same holds for the World Wide Web: a page is reputable if it is referred to by other reputable pages. So, we model our search term identification after this idea.
  • #12: We identify search terms using two variants of PageRank--- They are called TextRank and POSRank in the information retrieval domain.
  • #13: So, these are the fairly straightforward steps of our approach. We take a change request and perform standard NLP (stop-word removal and splitting); we avoided stemming. From the preprocessed texts, we develop two types of graphs – a text graph and a POS graph. We then derive an importance score for each term from those two graphs, do a linear combination, perform ranking, and choose the top words as search terms based on their scores. Now we will zoom into these sections.
  • #14: The idea behind this text graph is word co-occurrence. For example, these two terms – IResource and IJavaElement – occur in the same context across multiple sentences. Another two terms – element and reported – also occur in the same context. Here we define context as a window of two words within a sentence. We encode each co-occurrence as an edge in the text graph. This way, the whole change request can be converted into a text graph.
  • #15: Similarly, we develop the second graph based on syntactic dependence among the parts of speech in a sentence. We apply Jespersen's Theory of Three Ranks (more details in the paper). That is, some parts of speech depend on other parts of speech for their complete meaning; for example, verbs modify nouns and adjectives within the same sentence. We encode such dependencies as connecting edges and develop another text graph. Thus, some terms are more connected than others.
  • #16: Now we have two graphs developed from the change request based on two different dimensions – word co-occurrence and syntactic dependence. We then apply the above algorithms, adapted from PageRank, for scoring. That is, a term's importance is determined by the importance of the surrounding terms, not just the connectivity. This is how Google beats spam pages. We apply the same idea to concept location. This is the first time it has been done for the concept location task, and this is our novelty.
  • #17: So, this is how the score of a term is determined, based on the scores of the surrounding terms. That means, the score of Vi is determined based on the scores of Vj1 to Vj5. We collect scores for the terms from both graphs which we call TextRank and POSRank. We combine them, rank them and collect the top ones as the search terms.
  • #18: For experiments, we select 8 subject systems from Apache and Eclipse. We collect 1939 change requests/bug reports from BugZilla and JIRA, and prepare the gold set by consulting the commit history of those projects from GitHub. For selecting bug fixing commits, we adopted the widely accepted approach. That is, we identify the Bug ID in the commit title, and then extract corresponding change set.
  • #19: For experiments, We collect our queries and the baseline queries (e.g., title or description from the change request), and feed them to a code search engine. Then we collect their results/ranks and compare. For evaluation/validation, we used these four performance metrics.
  • #20: Results show that our method can improve 52%--62% of the baseline queries, which is promising according to relevant literature. We consider various combinations as the baseline queries, and got similar performance. Our improvement and worsening ratios are significantly different according to statistical tests. The mean rank difference also shows that our mean ranks are closer to the top than the baseline.
  • #21: In terms of retrieval performance, precision and recall are not too high: precision is close to 30% and accuracy is close to 45% when the Top-10 results are considered. But I guess that has been the status quo for the last 15 years, so nothing very dramatic. Still, they are quite a bit higher than the baseline performance.
  • #22: When we extend the K-values, we found the accuracy is growing significantly. But, still, our performance remained higher than all the baselines. This shows the potential of our method.
  • #23: We compared with two parallel methods– Kevic & Fritz used heuristics and the second is a classic query reformulation technique. While they were promising, but still our method beat them in all aspects, and the performance is significantly higher as you see.
  • #24: Looking at the box plots, we can see that our median metrics are significantly higher. While they relied on a set of heuristics and term weighting, our PageRank-based model seems to perform better.
  • #25: When we consider various Top-K accuracy, we got similar findings. Our method located concepts correctly for 80% of the change requests whereas they did for 60% of them at best. This shows the potential of our technique.
  • #26: You can simply read out the texts I guess.
  • #27: Good morning, everyone. Introduce yourself. Today, I am going to talk about a query reformulation technique for concept location where we used an advanced term weighting method and performed machine learning.
  • #28: Now, this is a real software change request. Here these two sections are important, and they contain information about the requested change.
  • #29: Now when a request like this is submitted, a developer tries to find out important terms. Then they use those terms for finding the source code to change probably using a search engine like Lucene
  • #30: That is, they try to map the concepts discussed in the change request to appropriate source code sections like this. This is how, the term comes– “concept location” if you want me to define it.
  • #31: But concept location is NOT an easy task. For example, these two very reasonable queries from the change request do not perform well; the second one returns correct results at the 77th position, which is not acceptable of course. So what is needed here is reformulation of the query for the better. Now, there is traditional tool support for doing that: most tools throw the initial query at the search engine, collect the results, and then pick the most important terms from those results to reformulate the initial poor query.
  • #32: Now, these are the reformulated queries from three such existing methods. They made some improvements in the ranking and returned results a bit closer to the top, but as you can see, that is clearly not enough. Developers want the results at the top positions, so these queries are still costly for practical use.
  • #33: Now, we investigated this part of the reformulation process and found that most of the existing techniques use this equation to determine the importance of a term. That is, they select TF-IDF to find the words for query reformulation; in other words, they rely on the frequency of a term as a proxy for its importance.
  • #34: Now, this is a metric that has been in play since the last century; it was proposed in the 70s. It is a good metric, but it was actually proposed for regular texts such as news articles. Here, on the other hand, we are dealing with source code. Regular texts and source code have different semantics and different structures; they are not the same. So, metrics for regular texts are not appropriate for source code – this is our hypothesis.
  • #35: So, we made two contributions here. We propose CodeRank– a novel and appropriate term weighting method for source code. We propose ACER -- a novel query reformulation technique that uses this term weight.
  • #36: First comes CodeRank.
  • #37: Now, what did we do? We extract important artifacts from the source code, such as method signatures, formal parameters, and field signatures, mostly using AST parsing and regular expressions. The idea is that signatures capture richer intent than other texts; for example, a method signature states the intent, whereas the method body implements the intent with lots of noise.
  • #38: Now once such items are extracted, we split them. Now as we see, these single terms share some kind of semantics to convey a broader semantic. That is, they complement each other in this context. Now, we capture such semantic dependencies in the source code, and develop a term graph like this.
  • #39: Now, once the graph is developed, we use a popular graph-based algorithm called PageRank to determine node importance. OK, let's go visual. In a crowd, the most important person is the one everybody is looking at; it can also be seen as votes – the person voted for the most is the leader. We follow that concept in the context of our term graph: the term that is connected the most with other terms is an important term. Since this scoring is a recursive process, we finally get a ranked list of important terms which can be used as reformulation terms.
  • #40: Now comes the ACER, the second contribution.
  • #41: This is the schematic diagram of our approach. So far we talked about these parts of our approach. Now we will zoom in this part.
  • #43: Once the CodeRank is calculated, we collect multiple reformulation candidates for a given initial query. As we discussed, a source document has various contexts– method signature, field signature and so on. We make use of such contexts, and develop multiple reformulation candidates. Now, since we have multiple options, we have to choose the best reformulation. In order to do that, we apply machine learning. In particular, we determine the quality of each candidate using 20 quality metrics that mostly came from IR domain. Then we use a regression-tree based classifier and suggest the best reformulated query.
  • #44: Now let's see the outcome. Here, we have created three reformulation candidates using CodeRank and source document contexts. Our ML classifier then returns the best option, which returns the result at the 2nd position. If we look closely, our technique identifies two unique terms which made the real difference in performance.
  • #45: For experiments, we select 8 subject systems from Apache and Eclipse. We collect 1,675 change requests/bug reports from BugZilla and JIRA, use the report title as our query, and prepare the gold set by consulting the commit history of those projects from GitHub. This is the widely accepted way to do experiments in this area.
  • #46: For experiments, We collect our queries and the baseline queries , and feed them to a code search engine. Then we collect their results/ranks and compare. For evaluation/validation, we used these four performance metrics.
  • #47: Now, in our experiment, we answer these five research questions.
  • #48: In the first research question, we compare our queries with the baseline queries. As we see, the method signature based reformulation performs better than the other two options. However, machine learning selects the best of the three and provides the best performance. For example, our reformulation improves 71% of the queries, preserves 26%, and degrades only 3% of the queries. So, obviously, we improve far more queries than we degrade.
  • #50: In the second research question, we compare CodeRank with the traditional term weights – term frequency and TF-IDF. We see that TF performs better than TF-IDF, which is interesting. Anyway, when compared with our CodeRank, TF performs better initially but CodeRank outperforms it later, especially for 10-15 reformulation terms. That is, a few highly frequent terms are really important, but CodeRank is more reliable than term frequency for term importance.
  • #51: In the third research question, we show how document structures/contexts make a difference. These are the numbers of improved queries for the various reformulation candidates. We see that 19% of the total improvements are unique to individual contexts. That is, if we consider only method signatures for query reformulation, we miss the improvements made by field signature based reformulations. Again, if we consider the whole texts rather than the signatures, we also miss some query improvements. This holds not only for CodeRank but also when we employ term frequency in those contexts. Thus, document contexts matter for query reformulation.
  • #52: Now, when we compare the query improvements by ACER and term frequency in a Venn diagram, we find a 66% overlap, but ACER provides a unique set of improvements that is three times that of TF. ACER exploits document structures and TF does not, and we see the difference here.
  • #53: In the fourth research question, we calibrate the reformulation length. We found the best performance is achieved when the reformulation length is between 10 and 15; this is where CodeRank saturates.
  • #54: In the fifth research question, we compare our query improvement and worsening ratios with the existing methods. Our median improvement is much higher than the others', and more importantly, we degrade very few queries compared to the others. These measures are significantly higher. Thus, according to our investigation, ACER is the winner. But we must also admit that the ML-based approach is less scalable, and we are now working on the tool.
  • #55: These are the take-home messages. Query reformulation is a challenging task for developers; Google does not work on a local source code repository. Traditional term weights are clearly not sufficient or appropriate for source code. We provide CodeRank, a novel term weight for source code, and ACER, an improved reformulation technique. Our technique improves about 71% of the queries and degrades only a handful. Comparison with the state-of-the-art shows the promise of our method.
  • #56: Thanks for your time and attention. I am ready to have a few questions.
  • #57: When we consider various Top-K accuracy, we got similar findings. Our method located concepts correctly for 80% of the change requests whereas they did for 60% of them at best. This shows the potential of our technique.
  • #58: We tried with source code and Stack Overflow to look for semantically similar words. What’s next?