Explaining why methods change together
Angela Lozano, Carlos Noguera, Viviane Jonckers
Vrije Universiteit Brussel, Belgium
Why co-changes?
Reveal hidden dependencies [Gall, Hajek, Jazayeri ICSM 1998]
Identify restructuring candidates [Gall, Hajek, Jazayeri ICSM 1998; Girba, Ducasse, Lanza ICSM 2004]
Predict change propagation [Hassan & Holt ICSM 2004]
Validate the completeness of a change [Zimmermann, Weisgerber, Diehl, Zeller ICSE 2004]
Support some change tasks [Robillard, Dagenais JSME 2010]
Explaining co-changes
Find out the reason for the co-change
• Out-perform co-changes
• Useful for impact analysis of new entities
Reason: common properties (structural/semantic) of co-changing methods
For instance*
Commit message: "Fixes a bug in getRawMaterial and getManufactoredGoods."
Co-changing methods:
net.sf.freecol.common.model.Goods.getRawMaterial(int)
net.sf.freecol.common.model.Goods.getManufactoredGoods(int)
Reason:
CALLS_METHOD_NAME:goodsType,
LOCAL_VARIABLE_DECLARATION_NAME:good,
METHOD_JAVADOC_MENTIONS:manufactured,
METHOD_JAVADOC_MENTIONS:material,
METHOD_JAVADOC_MENTIONS:raw,
METHOD_JAVADOC_MENTIONS:type
* The typos you see in the examples come from the data collected
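Since a reason is defined as the common properties of co-changing methods, it can be read as a set intersection. The following is a minimal sketch under that reading, not the authors' implementation: the function name `common_properties` is hypothetical, and the two `METHOD_NAME:` entries are made-up extras added only so that the intersection has something to discard.

```python
def common_properties(descriptions):
    """Return the properties shared by every method description."""
    sets = [set(d) for d in descriptions]
    if not sets:
        return set()
    return set.intersection(*sets)


# Properties listed on the slide as the reason for the example commit.
shared = {
    "CALLS_METHOD_NAME:goodsType",
    "LOCAL_VARIABLE_DECLARATION_NAME:good",
    "METHOD_JAVADOC_MENTIONS:manufactured",
    "METHOD_JAVADOC_MENTIONS:material",
    "METHOD_JAVADOC_MENTIONS:raw",
    "METHOD_JAVADOC_MENTIONS:type",
}

# Hypothetical per-method extras (not from the slides), for illustration only.
get_raw_material = shared | {"METHOD_NAME:getRawMaterial"}
get_manufactured_goods = shared | {"METHOD_NAME:getManufactoredGoods"}

reason = common_properties([get_raw_material, get_manufactured_goods])
print(sorted(reason))
```

The properties that appear in only one of the two methods drop out, leaving the six shared properties shown in the example.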
Identifying co-change
[Figure: changes of entities (methods m1 to m9 and m0) over time (commit transactions), with two kinds of co-change highlighted: commit transaction and cluster relations.]
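Co-change at the level of commit transactions can be sketched as counting how often two methods appear in the same transaction. This is an illustrative sketch only: the function name `co_change_counts` and the toy transactions are assumptions, and the way cluster relations are derived is not specified in the slides, so it is not modelled here. Transactions that touch a single method are skipped, since they carry no co-change information.

```python
from collections import Counter
from itertools import combinations


def co_change_counts(transactions):
    """Count, for each pair of methods, the transactions that changed both."""
    counts = Counter()
    for methods in transactions:
        unique = sorted(set(methods))
        if len(unique) < 2:
            continue  # a single-method commit has no co-change
        for pair in combinations(unique, 2):
            counts[pair] += 1
    return counts


# Toy history (hypothetical): each inner list is one commit transaction.
transactions = [["m1"], ["m2", "m3"], ["m2", "m3", "m4"], ["m4"]]
counts = co_change_counts(transactions)
print(counts.most_common())
```

Pairs are stored in sorted order so that (m2, m3) and (m3, m2) are counted as the same relation.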
Structural properties
(a.k.a. Syntactic / Explicit)
…
RETURN_TYPE:
METHOD_PARAM_TYPE:
DECLARING_TYPE:
DECLARING_TYPE_EXTENDS:
DECLARING_TYPE_IMPLEMENTS:
LOCAL_VARIABLE_DECLARATION_TYPE:
Semantic properties
(a.k.a. Lexical / Implicit)
…
METHOD_NAME:
CALLS_METHOD_NAME:
LOCAL_VARIABLE_DECLARATION_NAME:
METHOD_PARAM_NAME:
METHOD_JAVADOC_MENTIONS:
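The two property lists suggest a simple way to separate a method's description into its structural and semantic parts, which is what the structural-only and semantic-only reasons later in the talk rely on. The sketch below assumes properties are stored as `KIND:value` strings, as in the example commit. The kind names are the ones on the slides (the "…" indicates the lists may be longer), while `split_properties` and the `RETURN_TYPE:int` value are hypothetical.

```python
STRUCTURAL_KINDS = {
    "RETURN_TYPE",
    "METHOD_PARAM_TYPE",
    "DECLARING_TYPE",
    "DECLARING_TYPE_EXTENDS",
    "DECLARING_TYPE_IMPLEMENTS",
    "LOCAL_VARIABLE_DECLARATION_TYPE",
}
SEMANTIC_KINDS = {
    "METHOD_NAME",
    "CALLS_METHOD_NAME",
    "LOCAL_VARIABLE_DECLARATION_NAME",
    "METHOD_PARAM_NAME",
    "METHOD_JAVADOC_MENTIONS",
}


def split_properties(properties):
    """Split 'KIND:value' strings into (structural, semantic) sets."""
    structural, semantic = set(), set()
    for prop in properties:
        kind, _, _value = prop.partition(":")
        if kind in STRUCTURAL_KINDS:
            structural.add(prop)
        elif kind in SEMANTIC_KINDS:
            semantic.add(prop)
        # Kinds not listed on the slides are ignored in this sketch.
    return structural, semantic


structural, semantic = split_properties(
    ["RETURN_TYPE:int", "CALLS_METHOD_NAME:goodsType", "UNKNOWN_KIND:x"]
)
print(structural, semantic)
```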
Are these good reasons?
What is a good reason?

Coverage: Describes most co-changes
The first research question, to what extent a reason can be found automatically, is analyzed by comparing the coverage of the reasons found. Coverage relates to the number of commit-transactions that have non-empty reasons. We define two types of coverage: coverage per commit, CovC, as the ratio of commits with a non-empty reason (CR) to the total number of commits in the system's history (Cs); and coverage per method, CovM, as the ratio of methods with a non-empty reason (MR, the methods modified in at least one commit with non-empty reasons) to the total number of methods in the system's history (Ms).

$\mathit{CovC} = \frac{CR}{Cs}, \quad \mathit{CovM} = \frac{MR}{Ms}$

Given that we eliminate commits in which only one method changes, and that commits in which many methods change are unlikely to have a single reason, the coverage of our approach will be low.

Idiosyncrasy: Describes only co-changing methods
RQ2: To what extent do the automatically detected reasons describe only the set of co-changing methods? We analyze this question by assessing the discriminating power of reasons.
Good reasons will contain properties that tend to occur only in methods that change together. If the properties found in a reason are also found in methods that did not change together, then those properties are likely found by coincidence and do not represent an explanation for the change. Therefore, we measure the idiosyncrasy Idios(RM) of a reason RM as one minus the ratio between the number of methods that are described by RM by coincidence (i.e., the number of methods that have properties in common with RM but that do not belong to the set of methods it describes, M) and the total number of methods in the system's history Ms.

$\mathit{Idios}(R_M) = 1 - \frac{\left|\bigcup_{m \in Ms} R_M \subset D_m\right| - |M|}{|Ms|}$

Here D_m is read as the set of properties describing method m, so the first term counts the methods whose properties include the reason. For example, the idiosyncrasy for the example commit is 1 − ((5 − 2)/14895) = 1 − 0.0002 = 0.9998, where, following the formula, 5 methods have properties that include the reason, 2 methods belong to the commit, and 14895 methods exist in the system's history.
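The coverage and idiosyncrasy measures are simple ratios, so they can be sketched in a few lines. This is a hedged illustration rather than the authors' tooling: all function names are hypothetical, the toy `descriptions` dictionary is invented, and the subset test uses "subset or equal" as an assumption about how the formula is meant to be applied.

```python
def coverage(with_reason, total):
    """Ratio of commits (or methods) with a non-empty reason to the total."""
    if total <= 0:
        raise ValueError("total must be positive")
    return with_reason / total


def idiosyncrasy_from_counts(matching, described, total_methods):
    """1 - (methods matching the reason - methods it describes) / all methods."""
    if total_methods <= 0:
        raise ValueError("total_methods must be positive")
    return 1 - (matching - described) / total_methods


def idiosyncrasy(reason, described, descriptions):
    """Compute idiosyncrasy from property sets.

    reason: set of properties; described: set of method ids the reason
    describes (M); descriptions: dict of method id -> property set for
    every method in the history (Ms).
    """
    matching = sum(1 for props in descriptions.values() if reason <= props)
    return idiosyncrasy_from_counts(matching, len(described), len(descriptions))


# Figures quoted for the example commit: 5 matching, 2 described, 14895 total.
slide_example = idiosyncrasy_from_counts(5, 2, 14895)

# Toy history (hypothetical): m3 shares the reason by coincidence.
descriptions = {
    "m1": {"a", "b", "c"},
    "m2": {"a", "b"},
    "m3": {"a", "b", "d"},
    "m4": {"a"},
    "m5": {"e"},
    "m6": set(),
}
toy = idiosyncrasy({"a", "b"}, {"m1", "m2"}, descriptions)
print(round(slide_example, 4), round(toy, 4))
```

In the toy history one of six methods matches the reason without belonging to the co-changing set, so the idiosyncrasy drops to 1 − 1/6.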
What is a good reason?

Uniqueness: Differs from other reasons
RQ3: To what extent do the automatically detected reasons for different sets of co-changing methods overlap with each other? We analyze this question by measuring the uniqueness of reasons.
It is also important to know whether reasons are sufficiently different from each other to serve as explanations only for the changes they describe. Thus, we measure the similarity Sim(R1, R2) between two reasons as their Jaccard index (i.e., the intersection over the union of their properties). The uniqueness of a reason Ri is the mean difference to the rest of the reasons found in the project (i.e., R).

$\mathit{Unq}(R_i) = 1 - \tilde{x}\left(\bigcup_{R_j \in R \,\wedge\, i \neq j} \mathit{Sim}(R_i, R_j)\right)$

Uniqueness tells us if different co-change relations are likely to be due to different reasons. Let's suppose that there are three methods (m1, m2, and m3) but only two co-change clusters (m1 and m2, m2 and m3). Even though m2 co-changes with both m1 and m3, it is likely that the reasons for m1 co-changing with m2 are different from the reasons for m2 co-changing with m3. Therefore we expect the reasons to be unique. For example, the uniqueness for the example commit is 0.985333.

Plausibility: Makes sense
RQ4: To what extent are the automatically detected reasons for sets of co-changing methods sound? We analyze this question by manually checking their plausibility: a 20% random sample of the reasons is manually compared to the commit message.
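The uniqueness measure combines a Jaccard similarity with an average over all other reasons. The sketch below follows the textual description (the mean of the similarities); the function names, the toy reasons, and the conventions for empty inputs are assumptions made for illustration.

```python
from statistics import mean


def jaccard(a, b):
    """Intersection over union of two property sets."""
    union = a | b
    if not union:
        return 0.0  # assumption: two empty reasons are treated as dissimilar
    return len(a & b) / len(union)


def uniqueness(index, reasons):
    """One minus the mean similarity of reasons[index] to every other reason."""
    others = [
        jaccard(reasons[index], other)
        for j, other in enumerate(reasons)
        if j != index
    ]
    if not others:
        return 1.0  # assumption: a lone reason is fully unique
    return 1 - mean(others)


# Toy reasons (hypothetical property names).
reasons = [{"a", "b"}, {"b", "c"}, {"d"}]
unq_first = uniqueness(0, reasons)
print(round(unq_first, 4))
```

The first reason shares one of three properties with the second and nothing with the third, so its mean similarity is 1/6 and its uniqueness is 1 − 1/6.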
Empirical study
CVS repositories: GanttProject and Freecol
• GanttProject: 1,087 commit transactions, 14 clusters, 478,312 semantic and structural properties in 4,099 methods
• Freecol: 2,701 commit transactions, 280 clusters, 547,394 semantic and structural properties in 14,895 methods
• Reasons from commit transactions: CmBoth, CmSm, CmSt (both kinds of properties, semantic only, structural only)
• Reasons from clusters: ClBoth, ClSm, ClSt (both kinds of properties, semantic only, structural only)
Results! 
Are these good reasons? 
Coverage (bad. lower for clusters.) 
Idiosyncrasy (good. lower for clusters.) 
• Which properties are better? 
• Both prop. >> Structural prop. only 
• Both prop. > Semantic prop. only 
Are these good reasons?
Uniqueness (good. cluster ? commit)
[Figure: uniqueness of the reasons per reason type (cmAll, cmSt, cmSm, clAll, clSt, clSm), one panel for Freecol (axis from 0.90 to 0.98) and one for GanttProject (axis from 0.85 to 0.95).]
The reasons tend to be unique (usual similarity < 5% & 10%, worst case < 10% & 20%).
• Commits:
• Both prop. is better. Structural ≃ Semantic
• Clusters:
• Semantic -> Both -> Structural
Are these good reasons? 
Plausibility (good. lower for commits) 
• Best properties? (depend on the project) 
example: CmSm - Freecol 
Conclusion
• Automatically finding the reason for co-changes IS POSSIBLE!
• Clusters provide better plausibility
• Commits provide better coverage
• Using both kinds of properties provides better reasons for commits; the picture is unclear for clusters
More info:
Angela Lozano
alozano@soft.vub.ac.be
