IMPROVING BUG LOCALIZATION WITH REPORT QUALITY DYNAMICS AND QUERY REFORMULATION
MD. MASUDUR RAHMAN AND CHANCHAL K. ROY, UNIVERSITY OF SASKATCHEWAN, COMPUTER SCIENCE
ABSTRACT
In this poster paper, we present a large empirical study that uses
5,500 bug reports from eight systems and replicates three existing
studies. Our findings empirically demonstrate how the quality dynamics
of bug reports affect the performance of contemporary IR-based bug
localization techniques. Existing techniques do not perform well if
(1) a bug report lacks rich structured information such as relevant
program entity names, or (2) the bug report contains excessive
structured information such as stack traces. Our preliminary findings
also suggest that context-aware query reformulation might help
overcome such limitations.
TERM WEIGHT CALCULATION
$$\text{TF-IDF}(t) = \sum_{\forall d \in D_{RF}} \bigl(1 + \log(f_{t,d})\bigr) \times \log\frac{|D|}{n_t}$$
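The term weight can be illustrated with a minimal sketch. It assumes that the operator over d ∈ D_RF is a summation (the operator is not legible in the extracted formula), that D_RF is a small set of feedback documents, and that n_t counts the corpus documents containing the term. The function name `tf_idf` and the toy documents are hypothetical.

```python
import math
from collections import Counter


def tf_idf(term, feedback_docs, corpus):
    """Sum (1 + log f_{t,d}) * log(|D| / n_t) over the feedback documents.

    feedback_docs and corpus are lists of token lists. Using a sum over
    the feedback documents is an assumption of this sketch.
    """
    n_t = sum(1 for doc in corpus if term in doc)  # documents containing the term
    if n_t == 0:
        return 0.0
    idf = math.log(len(corpus) / n_t)
    score = 0.0
    for doc in feedback_docs:
        f_td = Counter(doc)[term]  # frequency of the term in document d
        if f_td > 0:
            score += (1 + math.log(f_td)) * idf
    return score


# Toy corpus (hypothetical tokens, not data from the study).
corpus = [
    ["cast", "execute", "value"],
    ["interpreter", "execute", "run"],
    ["preference", "page", "color"],
    ["cast", "cast", "null"],
]
feedback = [corpus[0], corpus[3]]
print(tf_idf("cast", feedback, corpus))
```

Terms that are frequent in the feedback documents but rare in the corpus as a whole receive the highest weights under this formulation.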
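$$V_i = \{C_i, M_i\}, \quad E_i = \{C_i \leftrightarrow M_i\} \cup \{C_i \rightarrow C_j,\ M_i \rightarrow M_j\} \mid j = i - 1$$
$$V = \bigcup_{i=1}^{N} \{V_i\}, \quad E = \bigcup_{i=1}^{N} \{E_i\}, \quad G_{ST} = (V, E)$$
$$S(V_i) = (1 - \psi) + \psi \sum_{j \in In(V_i)} \frac{S(V_j)}{|Out(V_j)|} \quad (0 \le \psi \le 1)$$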
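The node score S(V_i) is a PageRank-style recurrence, so it can be computed by simple iteration. The sketch below is one possible way to do that, not the authors' implementation. The damping value ψ = 0.85, the iteration count, and the function name `score_nodes` are assumptions; the poster only states 0 ≤ ψ ≤ 1.

```python
def score_nodes(edges, psi=0.85, iterations=100):
    """Iterate S(V_i) = (1 - psi) + psi * sum_{j in In(V_i)} S(V_j) / |Out(V_j)|.

    edges is an iterable of (source, target) pairs. psi=0.85 and the fixed
    iteration count are assumptions of this sketch.
    """
    edges = set(edges)  # ignore duplicate edges
    nodes = sorted({node for edge in edges for node in edge})
    incoming = {node: [] for node in nodes}
    out_degree = {node: 0 for node in nodes}
    for src, dst in edges:
        incoming[dst].append(src)
        out_degree[src] += 1
    scores = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        scores = {
            node: (1 - psi)
            + psi * sum(scores[j] / out_degree[j] for j in incoming[node])
            for node in nodes
        }
    return scores


# Hand-built toy graph using class and method names from the example trace.
toy_edges = [
    ("JDIValue", "toString"), ("toString", "JDIValue"),
    ("Cast", "execute"), ("execute", "Cast"),
    ("Cast", "JDIValue"), ("execute", "toString"),
]
for node, score in sorted(score_nodes(toy_edges).items(), key=lambda kv: -kv[1]):
    print(f"{node}: {score:.3f}")
```

Nodes that many other nodes point to accumulate higher scores, which is what makes them candidates for query terms.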
SCHEMATIC DIAGRAM OF THE EMPIRICAL STUDY
[Figure: Bug report collection → Bug report clustering → Clustered bug reports; Clustered bug reports + Project codebase → BLUiR, BugLocator, LOBSTER → Result analysis + answering RQs → Findings and insights]
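The clustering step groups bug reports by the kind of content they carry: stack traces, program elements, or natural language only. The sketch below shows one way such a grouping could be done with regular expressions. The patterns, the function name `classify_report`, and the group labels are illustrative assumptions, not the rules used in the study.

```python
import re

# Heuristic patterns (assumptions of this sketch, not the study's rules).
STACK_FRAME = re.compile(r"^\s*at\s+[\w.$]+\([\w.]+:\d+\)", re.MULTILINE)
CAMEL_CASE = re.compile(r"\b[A-Za-z]+[a-z0-9][A-Z]\w*\b")  # e.g. postVisit
METHOD_CALL = re.compile(r"\b\w+\(\)")  # e.g. toString()


def classify_report(text):
    """Return 'BR_ST' (stack traces), 'BR_PE' (program elements) or
    'BR_NL' (natural language only) for a bug report text."""
    if STACK_FRAME.search(text):
        return "BR_ST"
    if CAMEL_CASE.search(text) or METHOD_CALL.search(text):
        return "BR_PE"
    return "BR_NL"


trace_report = (
    "java.lang.NullPointerException\n"
    " at org.eclipse.jdt.internal.debug.core.model."
    "JDIValue.toString(JDIValue.java:362)"
)
print(classify_report(trace_report))
print(classify_report("Calling postVisit on the node fails"))
print(classify_report("There should be a link to the pref page"))
```

Checking for stack frames first matters, because a stack trace almost always contains program element names as well.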
CONTACT INFORMATION
Web www.usask.ca/~masud.rahman
Email masud.rahman@usask.ca
Twitter @masud2336
Phone +1 (306) 241 9293
CONCLUSION & FUTURE RESEARCH
• A large-scale empirical study shows that state-of-the-art IR-based
techniques are not robust to various types of bug reports.
• Quality of the bug report is a major factor.
• Appropriate reformulation of the report contents is warranted
prior to bug localization.
• Future research can develop techniques that take bug report
quality into consideration.
REFERENCES
[1] M. M. Rahman and C. K. Roy. Poster: Improving bug localization with report quality dynamics
and query reformulation. In Proc. ICSE-C, page 02, Gothenburg, Sweden, May 2018.
RESEARCH QUESTIONS & ANSWERS
• RQ1: How do existing IR-based bug localization techniques perform
with bug reports that contain an excessive amount of structured
information (e.g., stack traces)?
• RQ2: How do existing IR-based techniques perform with bug reports
that contain neither program element names nor stack traces
(i.e., only unstructured regular texts)?
• RQ3: Does a single technique perform equally well with the bug
reports from both groups?
Figure 1: MAP@K of (a) Baseline (Lucene), (b) BugLocator, (c) BLUiR, and (d) LOBSTER with bug reports containing excessive structured information (e.g., stack traces)
Figure 2: Hit@10 of (a) Baseline (Lucene), (b) BugLocator, (c) BLUiR, and (d) LOBSTER with bug reports containing only regular texts
Figure 3: MAP@10 of all four techniques with bug reports containing (a) stack traces, (b) natural language texts only, (c) program elements, and (d) all bug reports
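The figures report Hit@K and MAP@K. A minimal sketch of how these metrics are commonly computed follows. It assumes average precision is normalized by the number of buggy files found within the top K, which is one of several conventions and may differ from the one used in the study. The function names and file names are hypothetical.

```python
def hit_at_k(ranked, relevant, k=10):
    """True if at least one buggy file appears in the top-k results."""
    return any(doc in relevant for doc in ranked[:k])


def average_precision_at_k(ranked, relevant, k=10):
    """Average of precision values at each rank where a buggy file occurs.

    Normalizing by the number of hits within the top k is an assumption.
    """
    hits = 0
    precision_sum = 0.0
    for rank, doc in enumerate(ranked[:k], start=1):
        if doc in relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def map_at_k(queries, k=10):
    """Mean of average precision over (ranked, relevant) pairs."""
    if not queries:
        return 0.0
    return sum(average_precision_at_k(r, rel, k) for r, rel in queries) / len(queries)


def hit_rate_at_k(queries, k=10):
    """Fraction of queries with at least one buggy file in the top k."""
    if not queries:
        return 0.0
    return sum(hit_at_k(r, rel, k) for r, rel in queries) / len(queries)


# Two toy queries with made-up file names.
queries = [
    (["A.java", "B.java", "C.java"], {"B.java", "C.java"}),
    (["X.java", "Y.java"], {"Z.java"}),
]
print(map_at_k(queries), hit_rate_at_k(queries))
```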
TRACE GRAPH DEVELOPMENT
Table 1: A Noisy Bug Report
Title: should be able to cast "null"
Bug ID: 31637, Project: eclipse.jdt.debug
Description: When trying to debug an application the variables
tab is empty. Also when I try to inspect or display a variable,
I get following error logged in the eclipse log file:
java.lang.NullPointerException
at org.eclipse.jdt.internal.debug.core.model.JDIValue.toString(JDIValue.java:362)
at org.eclipse.jdt.internal.debug.eval.ast.instructions.Cast.execute(Cast.java:88)
at org.eclipse.jdt.internal.debug.eval.ast.engine.Interpreter.execute(Interpreter.java:44)
at org.eclipse.jdt.internal.debug.eval.ast.engine.
... (8 more) ...
Figure 4: Trace graph of stack traces in Table 1 (node labels: Cast, access, Interpreter, JDIValue, toString, run, runEvaluation, doEvaluation, EvaluationThread, execute, JDIThread, Thread)
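A trace graph of this kind can be sketched from the definitions V_i = {C_i, M_i} and E_i: each stack frame contributes a class node and a method node linked in both directions, and each frame points to the frame before it. The regular expression, the function name `build_trace_graph`, and the choice to skip self-edges are assumptions of this sketch. Only the three complete frames from Table 1 are used.

```python
import re

# Matches "at <package>.<Class>.<method>(" at the start of a line.
FRAME = re.compile(r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(", re.MULTILINE)


def build_trace_graph(trace):
    """Return (nodes, edges) for a Java stack trace.

    Each frame i yields C_i <-> M_i, plus C_i -> C_{i-1} and M_i -> M_{i-1}.
    Self-edges (same name in consecutive frames) are skipped, which is an
    assumption of this sketch.
    """
    frames = [
        (match.group(1).split(".")[-1], match.group(2))
        for match in FRAME.finditer(trace)
    ]
    nodes, edges = set(), set()
    for i, (cls, method) in enumerate(frames):
        nodes.update({cls, method})
        edges.update({(cls, method), (method, cls)})
        if i > 0:
            prev_cls, prev_method = frames[i - 1]
            if cls != prev_cls:
                edges.add((cls, prev_cls))
            if method != prev_method:
                edges.add((method, prev_method))
    return nodes, edges


trace = """java.lang.NullPointerException
at org.eclipse.jdt.internal.debug.core.model.JDIValue.toString(JDIValue.java:362)
at org.eclipse.jdt.internal.debug.eval.ast.instructions.Cast.execute(Cast.java:88)
at org.eclipse.jdt.internal.debug.eval.ast.engine.Interpreter.execute(Interpreter.java:44)
"""
nodes, edges = build_trace_graph(trace)
print(sorted(nodes))
print(sorted(edges))
```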
QUERY REFORMULATION
Table 2: An Example of Query Reformulation
| Technique | Group | Query Terms | QE |
|---|---|---|---|
| Baseline | BR_ST | 127 terms from Table 1 after preprocessing, Bug ID# 31637, eclipse.jdt.debug | 53 |
| Proposed | BR_ST | NullPointerException + "Bug should be able to cast null" + {JDIValue toString execute EvaluationThread run} | 01 |
| Baseline | BR_PE | 195 terms (after preprocessing) from Bug ID# 15036, eclipse.jdt.core | 27 |
| Proposed | BR_PE | {astvisitor post postvisit previsit pre file post pre astnode visitor} | 01 |
| Baseline | BR_NL | 32 terms after preprocessing, Bug ID# 475855, eclipse.jdt.ui | 30 |
| Proposed | BR_NL | Preprocessed report texts + {compliance create preference add configuration field dialog annotation} | 01 |
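The proposed query for the stack-trace group combines the exception name, the report title, and a handful of highly ranked trace terms. The sketch below shows how such a query might be assembled. The function name, the `top_n` cutoff, and the term scores are made up for illustration and are not values from the study.

```python
def reformulate_stack_trace_query(exception, title, term_scores, top_n=5):
    """Build a query from the exception name, the title, and the top-n terms.

    term_scores maps a term to its score; ties are broken alphabetically.
    """
    ranked = sorted(term_scores.items(), key=lambda kv: (-kv[1], kv[0]))
    top_terms = [term for term, _ in ranked[:top_n]]
    return " ".join([exception, title] + top_terms)


# Hypothetical scores chosen only to reproduce the term order in Table 2.
scores = {
    "JDIValue": 0.9, "toString": 0.8, "execute": 0.7,
    "EvaluationThread": 0.6, "run": 0.5, "Cast": 0.4, "Interpreter": 0.3,
}
query = reformulate_stack_trace_query(
    "NullPointerException", "Bug should be able to cast null", scores
)
print(query)
```

The reformulated query is far shorter than the 127-term baseline, which is the point of the example in Table 2.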
Table 3: A Poor Bug Report
Title: [preferences] Mark Occurences Pref Page
Bug ID: 187316, Project: eclipse.jdt.ui
Description: There should be a link to the pref page
on which you can change the color. Namely:
General/Editors/Text Editors/Annotations. It's a pain in
the a** to find the pref if you do not know Eclipse's
preference structure well.
