An Exploratory Study of Macro Co-changes

Fehmi Jaafar, Yann-Gaël Guéhéneuc, Sylvie Hamel, and Giuliano Antoniol
Université de Montréal, Québec, Canada

Thursday, October 20, 2011

Pattern Trace Identification, Detection, and Enhancement in Java
SOftware Cost-effective Change and Evolution Research Lab

Outline
  ◮ Introduction and context
  ◮ Problems and Motivation
  ◮ Macocha
  ◮ Empirical study
  ◮ Validation
  ◮ Conclusion and Ongoing Work

Introduction

Context
  ◮ Developers must continually change their software programs to meet
    new requirements and user needs.
  ◮ Many approaches extract and analyse the changes undergone by
    artefacts and infer change propagation. [a]
  ◮ Several of these approaches identify co-changes among artefacts. [b]
  [a] A. E. Hassan and R. C. Holt. ICSM 2004.
  [b] Z. Xing and E. Stroulia. TSE 2005.

Introduction

Co-change
  ◮ Two artefacts are co-changing if they were changed by the same
    author and with the same log message in a time-window of less than
    200 ms. [a]
  ◮ Mockus [b] defined the proximity in time of check-ins as the
    check-in times of adjacent files differing by less than three
    minutes.
  ◮ Other studies [c] described issues with identifying atomic change
    sets and reported that, in all cases, they differed by a few
    minutes.
  [a] T. Zimmermann et al. ICSE 2004.
  [b] A. Mockus et al. TSE 2004.
  [c] D. M. German. TSE 2006.

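The sliding-window definition of co-change can be sketched as follows. This is only an illustration under stated assumptions, not the implementation of the cited approaches: the `Commit` record, the `group_transactions` helper, and the sample log are hypothetical, and the window is a parameter whose default is the 200 ms quoted on the slide.

```python
from collections import namedtuple

# Hypothetical record for one file revision read from a version-control log.
Commit = namedtuple("Commit", ["file", "author", "message", "time"])  # time in seconds

def group_transactions(commits, window=0.2):
    """Group revisions that share author and log message and whose successive
    check-in times differ by less than `window` seconds."""
    transactions = []
    last = {}  # (author, message) -> (time of last revision, its transaction)
    for c in sorted(commits, key=lambda c: c.time):
        key = (c.author, c.message)
        if key in last and c.time - last[key][0] < window:
            txn = last[key][1]          # still inside the sliding window
        else:
            txn = set()                 # window expired: start a new transaction
            transactions.append(txn)
        txn.add(c.file)
        last[key] = (c.time, txn)
    return transactions

# Hypothetical log: A.java and B.java co-change; the later A.java revision does not.
commits = [
    Commit("A.java", "alice", "fix parser", 10.00),
    Commit("B.java", "alice", "fix parser", 10.10),
    Commit("C.java", "bob", "update docs", 10.15),
    Commit("A.java", "alice", "fix parser", 50.00),
]
transactions = group_transactions(commits)
```

Files that end up in the same transaction are the co-changing artefacts in the sense of the first bullet; widening `window` to three minutes would approximate the proximity criterion of the second bullet.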
Problems and Motivation

Missing Dependencies
  ◮ Previous co-change analyses miss files that were never changed by
    the same developer at the same time but were changed by different
    developers in two consecutive change periods (e.g., in ArgoUML,
    NotationUtilityJava.java and ModelElementNameNotationUml.java,
    changed by the developers mvw and tfmorris).
  ◮ Previous co-changes are intrinsically limited in time. They cannot
    express patterns of changes over long time intervals (e.g.,
    ArgoDiagram.java and ModeCreateAssociationClass.java were
    maintained by the same developer but a few hours apart).

Problems and Motivation

Macro co-change
A macro co-change is a set of two or more files that change together,
i.e., that were maintained in the same change periods.

Figure: Two changes performed by one developer are sequential in time
(a few hours apart); F1 and F2 are macro co-changing.

Problems and Motivation

Dephase macro co-change
A dephase macro co-change is a set of two or more files that have been
observed to change with the same shift s.

Figure: Files F1 and F2 are changed by different developers in two
consecutive periods of time. They are dephase macro co-changing.

Macocha (1/9)

We propose Macocha to:
  1. Mine version-control systems (CVS and SVN).
  2. Identify the change periods in a program.
  3. Group the program source files according to their stability
     through the change periods.
  4. Identify, among the changed files, those that have similar
     co-change patterns.

Macocha (2/9)

We draw inspiration from and extend the classical sliding-window
approach [a]: two subsequent changes are part of one change period if
they were committed
  ◮ by any author;
  ◮ with any log message;
  ◮ without an interrupt between the two changes.
  [a] T. Zimmermann et al. ICSE 2004.

Macocha (3/9)

  ◮ The duration of a change period is less than 40 hours. [a]
  ◮ If the interrupt between two subsequent changes is more than
    t = 5.17 hours, these two changes belong to two different change
    periods.
  ◮ An SMCC is a set of two or more files that have identical profiles
    during the life cycle of a program.
  ◮ An SDMCC is the set composed of F1 and one or more files,
    F2, ..., FM, such that F2, ..., FM always macro co-change with the
    same shift in time s with respect to F1.
  ◮ We use the Hamming distance D_H to measure the amount of
    differences between two change profiles.
  [a] L. Hatton. Computer 2007.

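The interrupt rule can be sketched as follows. This is a minimal illustration, not Macocha's actual code: the `change_periods` function and the sample timestamps are hypothetical, times are assumed to be expressed in hours, and only the t = 5.17 hours gap rule is applied (the 40-hour bound on the duration of a period is not enforced here).

```python
def change_periods(timestamps, max_gap_hours=5.17):
    """Split commit times (in hours) into change periods: an interrupt longer
    than `max_gap_hours` between two subsequent commits starts a new period."""
    periods = []
    for t in sorted(timestamps):
        if periods and t - periods[-1][-1] <= max_gap_hours:
            periods[-1].append(t)   # same change period as the previous commit
        else:
            periods.append([t])     # interrupt too long: open a new period
    return periods

# Hypothetical commit times, in hours since the first commit.
times = [0.0, 1.5, 4.0, 10.0, 12.0, 30.0]
periods = change_periods(times)
```

Author and log message are deliberately ignored, which is what distinguishes a change period from a classical sliding-window transaction.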
Macocha (4/9)

Figure: Analysis process of Macocha

Macocha (5/9)

Step 1: Detection of change periods.

Figure: Analysis of commits and creation of change periods

Macocha (6/9)

Step 2: Creation of file profiles.

Figure: From revision control systems to file profiles

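A file profile can be sketched as a bit vector. The sketch below assumes, as the bit-vector figures of the later steps suggest, that a profile holds one bit per change period, set to 1 when the file was changed in that period; `build_profiles` and the sample data are hypothetical names introduced for illustration.

```python
def build_profiles(periods):
    """Map each file to a bit vector with one bit per change period:
    1 if the file was changed in that period, 0 otherwise."""
    files = sorted({f for changed in periods for f in changed})
    return {f: [1 if f in changed else 0 for changed in periods] for f in files}

# Hypothetical input: the set of files changed in each of four change periods.
periods = [{"F1", "F2"}, {"F3"}, {"F1", "F2"}, {"F2", "F3"}]
profiles = build_profiles(periods)
```

With this representation, every profile has the same length (the number of change periods), so profiles can be compared position by position.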
Macocha (7/9)

Step 3: Stability analysis.

Figure: Profiles showing file stability

Macocha (8/9)

Step 4: Detection of macro co-changes.

Figure: Files F1 and F2 are in macro co-change

Figure: Three different bit vectors showing approximate macro
co-changes (D(F1,F2)=2; D(F1,F3)=4)

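The comparison of bit-vector profiles can be sketched as follows. This is an illustration under assumptions, not Macocha's implementation: the function names and the distance threshold are hypothetical, and the sample profiles are invented so that they reproduce the distances quoted in the caption (D(F1,F2)=2; D(F1,F3)=4), since the original bit vectors are not available in the text.

```python
from collections import defaultdict

def hamming(p, q):
    """Number of change periods in which two profiles differ."""
    if len(p) != len(q):
        raise ValueError("profiles must cover the same change periods")
    return sum(a != b for a, b in zip(p, q))

def exact_groups(profiles):
    """Sets of two or more files with identical profiles."""
    groups = defaultdict(list)
    for name, profile in profiles.items():
        groups[tuple(profile)].append(name)
    return [sorted(g) for g in groups.values() if len(g) >= 2]

def approximate_pairs(profiles, max_distance):
    """Pairs of files whose profiles differ in at most `max_distance` periods."""
    names = sorted(profiles)
    return [(a, b) for i, a in enumerate(names) for b in names[i + 1:]
            if hamming(profiles[a], profiles[b]) <= max_distance]

# Hypothetical profiles over eight change periods.
profiles = {
    "F1": [1, 0, 1, 1, 0, 1, 0, 1],
    "F2": [1, 0, 1, 0, 0, 1, 1, 1],
    "F3": [0, 1, 1, 1, 0, 0, 0, 0],
    "F4": [1, 0, 1, 1, 0, 1, 0, 1],
}
identical = exact_groups(profiles)
near = approximate_pairs(profiles, max_distance=2)
```

Here F1 and F4 have identical profiles, while F2 is only an approximate match for both; the threshold of 2 is an arbitrary choice for the example.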
Macocha (9/9)

Step 5: Detection of dephase macro co-changes.

Figure: Three different bit vectors showing dephase macro co-changes

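Detecting a shift between two profiles can be sketched as follows. The sketch assumes that the shift s is counted in whole change periods and that two files are dephase macro co-changing when one profile, moved s periods later, matches the other on the periods they share; `is_dephased`, `find_shift`, and the sample profiles are hypothetical and do not come from the slides.

```python
def is_dephased(p, q, s):
    """True if q repeats the changes of p exactly s change periods later.
    Only the overlapping periods are compared, and p must contain at least
    one change in that overlap."""
    if len(p) != len(q) or not 0 < s < len(p):
        raise ValueError("profiles must have equal length and 0 < s < length")
    return any(p[:-s]) and p[:-s] == q[s:]

def find_shift(p, q, max_shift):
    """Smallest shift s in 1..max_shift for which q follows p, or None."""
    for s in range(1, max_shift + 1):
        if is_dephased(p, q, s):
            return s
    return None

# Hypothetical profiles: F2 follows F1 by one period, F3 by two, F4 not at all.
f1 = [1, 0, 0, 1, 0, 1, 0, 0]
f2 = [0, 1, 0, 0, 1, 0, 1, 0]
f3 = [0, 0, 1, 0, 0, 1, 0, 1]
f4 = [1, 1, 1, 1, 0, 0, 0, 0]
shift_f2 = find_shift(f1, f2, max_shift=3)
shift_f3 = find_shift(f1, f3, max_shift=3)
shift_f4 = find_shift(f1, f4, max_shift=3)
```

An approximate variant could replace the equality test with a Hamming-distance threshold on the overlapping periods.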
Empirical study

Research questions
  1. How does Macocha compare to previous work in terms of precision
     and recall?
  2. Are there (approximate) dephase macro co-changes among files and
     what is their usefulness?

Objects
We now present the results of our empirical study. We apply Macocha to
four different programs: ArgoUML, FreeBSD, SIP, and XalanC, developed
with three different programming languages: C, C++, and Java.

Empirical study

Objects
                   ArgoUML    FreeBSD        SIP     XalanC
  Languages           Java          C       Java        C++
  Versions              30          8          2         21
  Files              3,148      3,603      2,790        529
  Changes           16,727    186,959      8,046    397,052
  Start Dates     98-01-26   94-05-25   05-07-21   99-12-18
  End Dates       09-01-29   09-02-11   10-12-09   09-01-17
  CPs                2,843      1,121      1,553        924

Table: Descriptive statistics of the object programs (CPs: numbers of
change periods)

Empirical study

Results
                     ArgoUML    FreeBSD      SIP    XalanC
  Idle files             202      1,856      963         7
  Changed files        2,946      1,747    1,827       522
  # of SMCC              166        121      142        36
       Max # files        35         24       15        17
       Min # files         2          2        2         2
  # of SMCCH             196        163      182        85
       Max # files        46         44       32        22
       Min # files         2          2        2         2
  # of SDMCC              11          1        6         1
       Max # files         4          2        3         2
       Min # files         2          2        2         2
  # of SDMCCH             53         63       36         4
       Max # files         6          8        5         2
       Min # files         2          2        2         2

Table: Cardinalities of the sets obtained in the empirical study

Validation (1/3)

We perform two types of validation:
  ◮ Quantitatively, we compare the stability analysis of Macocha with
    that of UMLDiff [a] and the co-change analysis of Macocha with
    association rules [b].
  ◮ Qualitatively, we use external information provided by bug
    reports, mailing lists, and requirement descriptions to validate
    the (dephase) macro co-changes not found using association rules.
  [a] Z. Xing et al. ICSE 2005.
  [b] T. Zimmermann et al. ICSE 2004.

Validation (2/3)

                                     Idle Groups   Changed Groups
  ArgoUML   Idle Clusters                    202                0
            Short-lived Clusters               0            1,390
            Active Clusters                    0            1,556
  SIP       Idle Clusters                    963                0
            Short-lived Clusters               0              997
            Active Clusters                    0              830
  XalanC    Idle Clusters                      7                0
            Short-lived Clusters               0              291
            Active Clusters                    0              231

Table: Cardinality of Macocha sets in comparison to UMLDiff

Validation (3/3)

              Association Rules         Macocha
              Precision   Recall    Precision   Recall
  ArgoUML        15%        66%        20%        75%
  FreeBSD        22%       100%        24%       100%
  SIP            18%        89%        24%        91%
  XalanC         16%       100%        22%       100%

Table: Association-rule approach vs. Macocha

              Association Rules     External Information
              Precision   Recall    Precision   Recall
  ArgoUML        86%        98%       100%        99%
  FreeBSD        98%       100%       100%       100%
  SIP            85%        96%       100%        98%
  XalanC         90%       100%       100%       100%

Table: External evaluation of Macocha when using the results of the
association-rule approach as oracle and after manual validation using
external information

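For readers unfamiliar with the two measures, the sketch below computes precision and recall of a set of detected co-changes against an oracle, using the standard definitions. How the figures in the tables were actually obtained is not described in the slides; the `precision_recall` helper and the sample pairs are hypothetical.

```python
def precision_recall(detected, oracle):
    """Standard precision and recall of `detected` items against an `oracle`."""
    detected, oracle = set(detected), set(oracle)
    true_positives = len(detected & oracle)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall

# Hypothetical co-changing file pairs reported by a tool, and a reference oracle.
detected = {("A", "B"), ("A", "C"), ("B", "D"), ("C", "D")}
oracle = {("A", "B"), ("C", "D"), ("E", "F")}
precision, recall = precision_recall(detected, oracle)
```

In this example two of the four detected pairs are in the oracle, and two of the three oracle pairs are detected.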
Conclusion and Ongoing Work

Conclusion
  1. We introduced the novel concepts of macro co-changes and dephase
     macro co-changes to describe that two files were changed by
     developers within the same change periods, with possible shifts
     in time.
  2. We described Macocha, an approach to detect (dephase) macro
     co-changes using file profiles and their stability in time.
  3. We performed two types of validation, quantitative and
     qualitative, and we showed that SMCC and SDMCC do exist and bring
     supplementary information.

Conclusion and Ongoing Work

Ongoing Work
  1. Performing a comprehensive study of the number of MCCs and DMCCs
     with varying values of t and s.
  2. Performing a comprehensive study of the different kinds of DMCCs.
  3. Relating MCCs and DMCCs with static analysis and external software
     characteristics, such as change proneness.


WCRE11a.ppt

  • 1. An Exploratory Study of Macro Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and An Exploratory Study of Macro Co-changes context Problems and Motivation Macocha Fehmi Jaafar, Yann-Ga¨l Gu´h´neuc, Sylvie Hamel, and e e e Empirical study Giuliano Antoniol Validation Universit´ de Montr´al, Qu´bec, Canada e e e Conclusion and Ongoing Work Thursday, October 20, 2011 Pattern Trace Identification, Detection, and Enhancement in Java SOftware Cost-effective Change and Evolution Research Lab
  • 2. An Exploratory Study of Macro Introduction Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and Context context ◮ Developers must continually change their software Problems and Motivation programs to meet new requirements and user needs. Macocha ◮ Many approaches extract and analyse the changes Empirical study undergone by artefacts and infer change propagation.a Validation Conclusion and ◮ Several of these approaches identify co-changes among Ongoing Work artefacts.b a A. E. Hassan and R. C. Holt. ICSM 2004. b Z. Xing and E. Stroulia. TSE 2005. 2 / 23
  • 3. An Exploratory Study of Macro Introduction Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Co-change Giuliano Antoniol ◮ Two artefacts are co-changing if they were changed by Introduction and the same author and with the same log message in a context Problems and time-window of less than 200 ms.a Motivation ◮ Mockusb defined the proximity in time of check-ins as Macocha the check-in time of adjacent files that differ by less Empirical study than three minutes. Validation Conclusion and ◮ Other studiesc described issues about identifying atomic Ongoing Work change sets and reported that, in all cases, they differed by few minutes. a T. Zimmermann et al. ICSE 2004. b A. Mockus et al. TSE 2004. c D. M. German. TSE 2006. 3 / 23
  • 4. An Exploratory Study of Macro Problems and Motivation Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Missing Dependencies Introduction and ◮ If files (e.g., in ArgoUML, context Problems and NotationUtilityJava.java and Motivation ModelElementNameNotationUml.java) were never Macocha changed by the same developer at the same time but Empirical study were changed by developers (mvw and tfmorris) in two Validation consecutive change periods. Conclusion and Ongoing Work ◮ Previous co-changes are intrinsically limited in time. They cannot express patterns of changes between long time intervals (e.g., ArgoDiagram.java and ModeCreateAssociationClass.java were maintained by the same developer but separated by few hours). 4 / 23
  • 5. An Exploratory Study of Macro Problems and Motivation Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Macro co-change Giuliano Antoniol A macro co-change is two or more files that change together, Introduction and i.e., they were maintained in the same change periods. context Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Two changes performed by one developer are sequential in time (after few hours), F1 and F2 are macro co-changing 5 / 23
  • 6. An Exploratory Study of Macro Problems and Motivation Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Dephase macro co-change Introduction and A dephase macro co-change is two or more files that have context been observed to change with the same shift s. Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Files F1 and F2 are changed by different developers in two 6 / 23 consecutive periods of time. They are dephase macro co-changing
  • 7. An Exploratory Study of Macro Macocha Co-changes (1/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and context Macocha Problems and Motivation We propose Macocha to: Macocha 1. Mine version-control systems (CVS and SVN). Empirical study 2. Identify the change periods in a program. Validation Conclusion and 3. Group the program source files according to their Ongoing Work stability through the change periods. 4. Identify among changed files those that have similar co-changes pattern. 7 / 23
  • 8. An Exploratory Study of Macro Macocha Co-changes (2/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and context Macocha Problems and We draw inspiration and extend the classical sliding window Motivation approacha to consider that two subsequent changes are part Macocha of one change period if they were committed by: Empirical study Validation ◮ any author; Conclusion and Ongoing Work ◮ with any log message; ◮ without an interrupt between two changes. a T. Zimmermann et al. ICSE 2004. 8 / 23
  • 9. An Exploratory Study of Macro Macocha Co-changes (3/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Macocha Giuliano Antoniol ◮ The duration of a change period is less than 40 hours.a Introduction and context ◮ If the interrupt between two subsequent changes is Problems and more than t = 5.17 hours, theses two changes belong to Motivation two different change periods. Macocha Empirical study ◮ A SMCC is two or more files that have identical profiles Validation during the life cycle of a program. Conclusion and Ongoing Work ◮ A SDMCC is the set composed of F1 and one or more files, F2...FM, such that F2...FM always macro co-change with the same shift in time s with respect to F1. ◮ We use the Hamming distance DH to measure the amount of differences between two change profiles. a L. Hatton. Computer 2007. 9 / 23
  • 10. An Exploratory Study of Macro Macocha Co-changes (4/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and context Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Analysis-process of Macocha 10 / 23
  • 11. An Exploratory Study of Macro Macocha Co-changes (5/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Macocha Introduction and Step 1: Detection of change periods. context Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Analysis of commits and creation of change periods 11 / 23
  • 12. An Exploratory Study of Macro Macocha Co-changes (6/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Macocha Introduction and context Step 2: Creation of files profiles. Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: From revision control systems to file profiles 12 / 23
  • 13. An Exploratory Study of Macro Macocha Co-changes (7/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Introduction and context Macocha Problems and Step 3: Stability analysis. Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Profiles showing file stability 13 / 23
  • 14. An Exploratory Study of Macro Macocha Co-changes (8/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Macocha Introduction and Step 4: Detection of macro co-changes. context Problems and Motivation Macocha Empirical study Validation Figure: Files F1 and F2 are in macro co-change Conclusion and Ongoing Work Figure: Three different bit vectors showing approximate macro co-changes (D(F1,F2)=2; D(F1,F3)=4) 14 / 23
  • 15. An Exploratory Study of Macro Macocha Co-changes (9/9) Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Macocha Introduction and context Step 5: Detection of dephase macro co-changes. Problems and Motivation Macocha Empirical study Validation Conclusion and Ongoing Work Figure: Three different bit vectors showing dephase macro co-changes 15 / 23
  • 16. An Exploratory Study of Macro Empirical study Co-changes Fehmi Jaafar, Yann-Ga¨l e Gu´h´neuc, Sylvie e e Hamel, and Giuliano Antoniol Research questions Introduction and context 1. How does Macocha compare to previous work in term Problems and of precision and recall? Motivation Macocha 2. Are there (approximate) dephase macro co-changes Empirical study among files and what is their usefulness? Validation Conclusion and Objects Ongoing Work We now present the results of our empirical study. We apply Macocha on four different programs: ArgoUML, FreeBSD, SIP, and XalanC, developed with three different programming languages, C, C++, and Java. 16 / 23
  • 17. Empirical study. Objects.
                    ArgoUML     FreeBSD     SIP         XalanC
    Languages       Java        C           Java        C++
    Versions        30          8           2           21
    Files           3,148       3,603       2,790       529
    Changes         16,727      186,959     8,046       397,052
    Start Dates     98-01-26    94-05-25    05-07-21    99-12-18
    End Dates       09-01-29    09-02-11    10-12-09    09-01-17
    CPs             2,843       1,121       1,553       924
    Table: Descriptive statistics of the object programs (CPs: numbers of change periods)
  • 18. Empirical study. Results.
                      ArgoUML   FreeBSD   SIP       XalanC
    Idle files        202       1,856     963       7
    Changed files     2,946     1,747     1,827     522
    # of SMCC         166       121       142       36
      Max # files     35        24        15        17
      Min # files     2         2         2         2
    # of SMCCH        196       163       182       85
      Max # files     46        44        32        22
      Min # files     2         2         2         2
    # of SDMCC        11        1         6         1
      Max # files     4         2         3         2
      Min # files     2         2         2         2
    # of SDMCCH       53        63        36        4
      Max # files     6         8         5         2
      Min # files     2         2         2         2
    Table: Cardinalities of the sets obtained in the empirical study
  • 19. Validation (1/3). We perform two types of validation. Quantitatively, we compare the stability analysis of Macocha with that of UMLDiff (Z. Xing et al., ICSE 2005) and the co-change analysis of Macocha with association rules (T. Zimmermann et al., ICSE 2004). Qualitatively, we use external information provided by bug reports, mailing lists, and requirement descriptions to validate the (dephase) macro co-changes not found using association rules.
  • 20. Validation (2/3).
                                      Idle Groups   Changed Groups
    ArgoUML   Idle Clusters           202           0
              Short-lived Clusters    0             1,390
              Active Clusters         0             1,556
    SIP       Idle Clusters           963           0
              Short-lived Clusters    0             997
              Active Clusters         0             830
    XalanC    Idle Clusters           7             0
              Short-lived Clusters    0             291
              Active Clusters         0             231
    Table: Cardinality of Macocha sets in comparison to UMLDiff
  • 21. Validation (3/3).
                Association Rules       Macocha
                Precision   Recall      Precision   Recall
    ArgoUML     15%         66%         20%         75%
    FreeBSD     22%         100%        24%         100%
    SIP         18%         89%         24%         91%
    XalanC      16%         100%        22%         100%
    Table: Association rules approach vs. Macocha

                Association Rules       External Information
                Precision   Recall      Precision   Recall
    ArgoUML     86%         98%         100%        99%
    FreeBSD     98%         100%        100%        100%
    SIP         85%         96%         100%        98%
    XalanC      90%         100%        100%        100%
    Table: External evaluation of Macocha when using the results of the association rules approach as oracle and after manual validation using external information
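The slides do not spell out how precision and recall are computed. The sketch below uses the standard definitions, applied to sets of co-changing file pairs compared against an oracle. The file names and both sets are hypothetical and unrelated to the percentages reported in the tables.

```python
def precision_recall(detected, oracle):
    """Standard precision and recall of a detected set against an oracle set."""
    true_positives = len(detected & oracle)
    precision = true_positives / len(detected) if detected else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall


def pair(x, y):
    """Unordered pair of file names, so (A, B) and (B, A) compare equal."""
    return frozenset((x, y))


# Hypothetical co-change pairs reported by a detector.
detected = {pair("A", "B"), pair("A", "C"), pair("B", "D"), pair("C", "D")}
# Hypothetical oracle, e.g. pairs confirmed by another approach or by hand.
oracle = {pair("B", "A"), pair("C", "D"), pair("E", "F")}

precision, recall = precision_recall(detected, oracle)
print(f"precision={precision:.2f} recall={recall:.2f}")  # precision=0.50 recall=0.67
```

Changing the oracle changes both figures, which is why the second table reports separate values for the association rules oracle and for the manual validation with external information.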
  • 22. Conclusion and Ongoing Work. Conclusion: (1) We introduced the novel concepts of macro co-changes and dephase macro co-changes to describe two files that were changed by developers within the same change periods, with possible shifts in time. (2) We described Macocha, an approach to detect (dephase) macro co-changes using file profiles and their stability in time. (3) We performed two types of validation, quantitative and qualitative, and we showed that SMCC and SDMCC do exist and bring supplementary information.
  • 23. Conclusion and Ongoing Work. Ongoing work: (1) Performing a comprehensive study of the number of MCCs and DMCCs with varying values of t and s. (2) Performing a comprehensive study of the different kinds of DMCCs. (3) Relating MCCs and DMCCs with static analysis and external software characteristics, such as change proneness.