[ASE2014] An Empirical Study on Reducing Omission Errors in Practice

ASE 2014
An Empirical Study on Reducing
Omission Errors in Practice
Jihun Park1, Miryung Kim2, Doo-Hwan Bae1
1. KAIST, South Korea
2. University of California, Los Angeles (UCLA), USA

Predicting co-changed entities
[Figure: version history; revision 7365 changes A.java, B.java, C.java, …]
Can we predict an additional change location in
a transaction?
• Change coupling (mining SW repositories): Zimmermann et al.,
Ying et al., Hassan and Holt, Herzig and Zeller
• Structural dependency: Robillard, Saul et al.
• Cloning-based relationship: Nguyen et al.

Predicting omission errors
[Figure: version history. Initial change (Revision 101, log "Fix bug #10000"): A.java, B.java, C.java. Supplementary change (Revision 125, log "Patch bug #10000"): A.java, D.java, E.java.]
A developer failed to update D.java and E.java in the initial change (an omission error).
Can we predict the supplementary change location,
given the initial change location?
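The example on this slide can be sketched in code. This is a minimal illustration, not the study's tooling: it assumes revisions are linked by a bug ID in the log message (as the two log lines in the figure suggest), and the regular expression, the tuple layout, and the name `find_omissions` are hypothetical.

```python
import re
from collections import defaultdict

# Assumed log convention: the bug ID appears as "bug #<digits>".
BUG_ID = re.compile(r"bug #(\d+)", re.IGNORECASE)

def find_omissions(revisions):
    """revisions: iterable of (revision_number, log_message, changed_files).

    Returns bug ID -> files touched only by later (supplementary) changes.
    """
    by_bug = defaultdict(list)
    for number, log, files in sorted(revisions, key=lambda r: r[0]):
        match = BUG_ID.search(log)
        if match:
            by_bug[match.group(1)].append((number, set(files)))

    omissions = {}
    for bug, changes in by_bug.items():
        if len(changes) < 2:
            continue  # no supplementary change for this bug
        _, initial_files = changes[0]
        later_files = set().union(*(files for _, files in changes[1:]))
        omissions[bug] = sorted(later_files - initial_files)
    return omissions

history = [
    (101, "Fix bug #10000", ["A.java", "B.java", "C.java"]),
    (125, "Patch bug #10000", ["A.java", "D.java", "E.java"]),
]
print(find_omissions(history))  # {'10000': ['D.java', 'E.java']}
```

The prediction task is then: given only the files of revision 101, suggest D.java and E.java.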

Key contributions
• To systematically investigate a real-world supplementary
patch data set, we propose a graph representation, the
change relationship graph (CRG).
1. While a single trait is inadequate, combining multiple
traits is limited as well.
2. A boosting approach does not significantly improve the
accuracy.
3. There is no package or developer specific pattern.
4. There is no repeated mistake.

Change Relationship Graph (CRG)
Study subjects: Eclipse JDT core, Eclipse SWT, and Equinox p2
• Graph nodes
  • Classes
  • Methods
• Graph edges
  • Extends
  • Contains
  • Method invocation (calls, called by)
  • Historical co-change
  • Code clone
  • Name similarity
[Figure: example CRG with class and method nodes linked by contains, code clone, and calls edges; one method is marked as an initial change location and another as the supplementary change location.]
* Saha et al. A graph-based framework for reasoning about
relationships among software modifications. TR 2014
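The node and edge types listed on this slide could be held in a small typed graph. The sketch below is only an illustration of that idea, not the CRG implementation used in the study: the class name, the method names, and the decision to store inverse edges automatically (calls/called by, and symmetric code clone, co-change, and name similarity edges) are all assumptions.

```python
from collections import defaultdict

NODE_KINDS = {"class", "method"}

# Assumption: these edge types are recorded in both directions.
INVERSE = {
    "calls": "called by",
    "called by": "calls",
    "code clone": "code clone",
    "co-change": "co-change",
    "name similarity": "name similarity",
}
EDGE_TYPES = {"extends", "contains"} | set(INVERSE)

class ChangeRelationshipGraph:
    def __init__(self):
        self.kind = {}                  # node name -> "class" or "method"
        self.edges = defaultdict(set)   # (node, edge type) -> neighbour names

    def add_node(self, name, kind):
        if kind not in NODE_KINDS:
            raise ValueError(f"unknown node kind: {kind}")
        self.kind[name] = kind

    def add_edge(self, src, edge_type, dst):
        if edge_type not in EDGE_TYPES:
            raise ValueError(f"unknown edge type: {edge_type}")
        self.edges[(src, edge_type)].add(dst)
        inverse = INVERSE.get(edge_type)
        if inverse is not None:
            self.edges[(dst, inverse)].add(src)

    def neighbours(self, node, edge_type):
        return set(self.edges.get((node, edge_type), ()))

# Hypothetical names mirroring the shape of the slide's figure.
crg = ChangeRelationshipGraph()
for name, kind in [("ClassA", "class"), ("ClassB", "class"),
                   ("m1", "method"), ("m2", "method"), ("m3", "method")]:
    crg.add_node(name, kind)
crg.add_edge("ClassA", "contains", "m1")
crg.add_edge("ClassB", "contains", "m2")
crg.add_edge("m1", "code clone", "m2")
crg.add_edge("m2", "calls", "m3")

print(crg.neighbours("m2", "code clone"))  # {'m1'}
print(crg.neighbours("m3", "called by"))   # {'m2'}
```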

Observation 1: While a single trait is inadequate,
combining multiple traits is limited as well.
• Only 10% to 20% of supplementary change locations
can be connected with one edge from their initial
change location.
• Combining multiple traits as a prediction rule achieves
at most 10% accuracy.
Example rule (code clone edge, then calls edge): suggest locations that
are called by some location X whose content is similar to the initial
change location.
[Figure: Initial --code clone--> X (arbitrary) --calls--> Supplementary]
Combining multiple traits does not predict
supplementary change locations accurately.
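A multi-trait rule such as the code clone, then calls example on this slide can be read as a path of edge types followed from the initial change location. The sketch below assumes the graph is a plain dictionary from (node, edge type) to neighbour sets; `apply_rule` and all node names are hypothetical.

```python
def apply_rule(edges, initial, rule):
    """Follow the edge types in `rule`, in order, starting from `initial`.

    edges: dict mapping (node, edge type) -> set of neighbour nodes.
    Returns the set of suggested locations, excluding `initial` itself.
    """
    frontier = {initial}
    for edge_type in rule:
        frontier = {
            dst
            for src in frontier
            for dst in edges.get((src, edge_type), ())
        }
    frontier.discard(initial)
    return frontier

# Hypothetical graph: the initial location is a clone of x, and x calls two methods.
edges = {
    ("initial_m", "code clone"): {"x"},
    ("x", "calls"): {"supplementary_m", "other_m"},
}
suggested = apply_rule(edges, "initial_m", ["code clone", "calls"])
print(sorted(suggested))  # ['other_m', 'supplementary_m']
```

Even when the true supplementary location is among the suggestions, a rule like this also returns unrelated locations (`other_m` here), which is one way low accuracy can arise.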

Observation 2: A boosting approach does not
improve the accuracy.
• We design a boosting approach that sums up the trained
accuracy of rules connecting initial and supplementary
change locations to calculate a prediction score
[Figure: an initial location (Method 1) and a candidate location (Method 2) connected by several rules (calls, called by, co-change); the trained accuracies of these rules are summed to give a prediction score for Method 2.]
• Suggest locations that have high prediction scores.
• This approach cannot accurately predict supplementary
change locations (at most 7% precision).
A boosting approach based on the past prediction accuracy
also cannot accurately predict supplementary change locations.
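The scoring step described on this slide (summing the trained accuracy of every rule that connects the initial location to a candidate) can be sketched as follows. The function names, the rule names used as dictionary keys, and the accuracy values are hypothetical; this is an illustration of the summation, not the study's implementation.

```python
def prediction_scores(suggestions_by_rule, trained_accuracy):
    """Sum, per candidate, the trained accuracy of each rule that reaches it.

    suggestions_by_rule: rule name -> set of candidates the rule reaches
                         from the initial change location.
    trained_accuracy:    rule name -> accuracy measured on past data.
    """
    scores = {}
    for rule, candidates in suggestions_by_rule.items():
        for candidate in candidates:
            scores[candidate] = scores.get(candidate, 0.0) + trained_accuracy.get(rule, 0.0)
    return scores

def top_candidates(scores, k):
    """Return the k highest-scoring candidates (ties broken by name)."""
    return sorted(scores, key=lambda c: (-scores[c], c))[:k]

# Hypothetical numbers for illustration only.
trained_accuracy = {"calls": 0.10, "called by": 0.05, "co-change": 0.08}
suggestions_by_rule = {
    "calls": {"method2", "method3"},
    "called by": {"method2"},
    "co-change": {"method2"},
}
scores = prediction_scores(suggestions_by_rule, trained_accuracy)
print(top_candidates(scores, 1))  # ['method2']
```

A candidate reached by several rules accumulates a higher score than one reached by a single rule, which is the intuition behind suggesting only high-scoring locations.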

Observation 3: There is no package or developer
specific pattern.
• We filter prediction rules based on package or
developer specific information.
Example (Package A): accuracy of code clone is 40%; accuracy of co-change is 10%.
• We build boosting approaches on the package- and
developer-specific prediction rules.
• The improvements are negligible; the highest
accuracy improvement is only 1.2%
No package or developer specific pattern between initial and
supplementary change locations exists.
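Filtering rules by package-specific (or developer-specific) accuracy might look like the sketch below. The Package A figures are the ones shown on this slide; the threshold value, the function name `filter_rules`, and the data layout are assumptions made for illustration.

```python
def filter_rules(accuracy_by_group, group, threshold):
    """Keep the rules whose trained accuracy within `group` reaches `threshold`.

    accuracy_by_group: group name (package or developer) -> {rule name: accuracy}.
    """
    rules = accuracy_by_group.get(group, {})
    return {rule: acc for rule, acc in rules.items() if acc >= threshold}

accuracy_by_package = {
    "Package A": {"code clone": 0.40, "co-change": 0.10},
}
# Hypothetical threshold of 25%.
kept = filter_rules(accuracy_by_package, "Package A", 0.25)
print(kept)  # {'code clone': 0.4}
```

Only the retained rules would then feed the per-package scoring step.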

Observation 4: There is no repeated mistake.
• We investigate whether repeated patterns between
initial and supplementary change locations exist.
[Figure: pattern between Initial and Supplementary change locations; labels: …, …, A.java]
• 78% to 96% of the discovered patterns appear only once.
• 69% to 84% of initial change locations appear only once.
Developers rarely make repeated
mistakes at the same location
[Figure: version history with A.java and B.java; labels: Initial, Rev. 100, Rev. 109, Rev. 200]
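The "appear only once" percentages on this slide are the kind of figure a simple frequency count produces. The sketch below assumes a pattern is an (initial location, supplementary location) pair, which may differ from the study's exact definition; the data and the function name are hypothetical.

```python
from collections import Counter

def fraction_appearing_once(items):
    """Fraction of distinct items that occur exactly once in `items`."""
    counts = Counter(items)
    if not counts:
        return 0.0
    once = sum(1 for n in counts.values() if n == 1)
    return once / len(counts)

# Hypothetical (initial, supplementary) pairs mined from a version history.
pairs = [
    ("A.java", "B.java"),
    ("A.java", "B.java"),
    ("A.java", "C.java"),
    ("D.java", "E.java"),
]
pattern_once = fraction_appearing_once(pairs)
initial_once = fraction_appearing_once(initial for initial, _ in pairs)
print(round(pattern_once, 2), initial_once)  # 0.67 0.5
```

High values for both fractions mean past mistakes give little signal about where the next omission will occur.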

Conclusion
• We systematically study omission errors using a real-world
supplementary patch data set.
• Version-history-based pattern mining cannot accurately
find supplementary change locations.
• Neither past prediction accuracy nor package- or
developer-specific information helps.
• We share our skepticism: reducing real-world
omission errors is inherently challenging.

Thank you for listening
