Requirements Traceability for Object Oriented
    Systems by Partitioning Source Code

                                             WCRE 2011, Limerick, Ireland




     Nasir Ali, Yann-Gaël Guéhéneuc, and Giuliano Antoniol
Requirements Traceability

Requirements traceability is defined as “the
ability to describe and follow the life of a
requirement, in both a forwards and backwards
direction” [Gotel, 1994]




What’s Requirements Traceability Good For?

Program Comprehension

Discover what code must change to handle a
new requirement

Aid in determining whether a specification is
completely implemented


IR-based Approaches
• Vector Space Model (Antoniol et al. 2002)

• Latent Semantic Indexing (Marcus and Maletic, 2003)

• Jensen Shannon Divergence (Abadi et al. 2008)

• Latent Dirichlet Allocation (Asuncion, 2010)

Problem in IR-based Approaches
                         Requirement




Goal
• Reduce the manual effort required to verify false-positive links

• Increase F-measure




Coparvo - COde PARtitioning and VOting

 1. Partitioning source code

 2. Defining experts

 3. Link recovery and expert voting




Partitioning Source Code



                      Class Name


                      Method Name


                      Variable Name


                      Comments
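
The partitioning step can be sketched roughly as follows. This is a minimal illustration, assuming Java source and simple regular expressions instead of a real parser; the function name `partition_class` and the patterns are hypothetical, not the tool's implementation, and they will misclassify constructs such as `if (...) {` or `return x;`.

```python
import re

# Simplistic, illustrative patterns (assumptions); a real tool would use a Java parser.
COMMENT_RE = re.compile(r"/\*.*?\*/|//[^\n]*", re.DOTALL)
CLASS_RE = re.compile(r"\bclass\s+(\w+)")
METHOD_RE = re.compile(r"\b[\w<>\[\]]+\s+(\w+)\s*\([^)]*\)\s*\{")
VARIABLE_RE = re.compile(r"\b[\w<>\[\]]+\s+(\w+)\s*(?:=[^;]*)?;")


def partition_class(source: str) -> dict:
    """Split one class's source text into the four partitions named on the slide."""
    comments = COMMENT_RE.findall(source)
    # Remove comments first so that words inside them are not taken for identifiers.
    code = COMMENT_RE.sub(" ", source)
    return {
        "class_names": CLASS_RE.findall(code),
        "method_names": METHOD_RE.findall(code),
        "variable_names": VARIABLE_RE.findall(code),
        "comments": [c.strip("/* \n") for c in comments],
    }


# Hypothetical input, loosely inspired by an email client.
SAMPLE = """
/** Sends mail over POP3. */
public class MailSender {
    private String serverHost;
    // connect to the server
    public void connect(int port) {
        int retries = 3;
    }
}
"""

if __name__ == "__main__":
    for name, items in partition_class(SAMPLE).items():
        print(name, items)
```

Each partition is kept as a separate list so that it can later be treated as an independent source of information.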




Defining Experts

 Class Name A
 Class Name B
 Class Name C        ->   Merged Class Names
 Class Name D

The same step is performed for method names, variable names, comments, and requirements.



Defining Experts (Cont.)

Each merged partition is compared with the merged requirements (Requirement 1, …, Requirement N):

 Merged Class Names       20%
 Merged Method Names      70%
 Merged Variable Names    40%
 Merged Comments          60%




Defining Experts (Cont.)
Experts ranked from highest to lowest:

 Method Names       70%
 Comments           60%
 Variable Names     40%
 Class Names        20%

Extreme Cases:
   • 5% difference in two experts
   • 95% difference in two experts
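
The expert-definition step above can be sketched as follows. This is a minimal illustration under stated assumptions: each partition is already tokenized, similarity is plain cosine over raw term counts, and the names `cosine` and `rank_experts` are hypothetical. The slides do not specify how the percentages are computed, so the scoring function here is a stand-in.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts (an assumption)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def rank_experts(partitions_per_class, requirements):
    """Merge each partition over all classes, merge all requirements,
    and rank the partitions by similarity to the merged requirements."""
    merged_requirements = [token for req in requirements for token in req]
    merged = {}
    for parts in partitions_per_class.values():
        for name, tokens in parts.items():
            merged.setdefault(name, []).extend(tokens)
    scores = {name: cosine(tokens, merged_requirements) for name, tokens in merged.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Hypothetical toy data.
CLASSES = {
    "A": {"method_names": ["send", "email"], "class_names": ["mail", "sender"]},
    "B": {"method_names": ["fetch", "email"], "class_names": ["pop", "client"]},
}
REQUIREMENTS = [["send", "email"], ["fetch", "email", "pop"]]

if __name__ == "__main__":
    for name, score in rank_experts(CLASSES, REQUIREMENTS):
        print(f"{name}: {score:.2f}")
```

The partition with the highest score is treated as the most trustworthy expert, mirroring the ranking shown on the slide.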

Link Recovery and Expert Voting

 Class A                              Requirements
                                      ------------------------------------
                                      Email client must
                                      support pop3
Comments of Class A                   integration……….


Method Names of Class A
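
A possible reading of the voting step is sketched below. The voting rule is an assumption, since the slide does not state it: here each expert (partition of Class A) votes for a link when its similarity with the requirement exceeds `threshold`, and the link is kept when at least `min_votes` experts agree. The names `vote_on_link`, `threshold`, and `min_votes` are hypothetical.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two token lists, using raw term counts (an assumption)."""
    ca, cb = Counter(a), Counter(b)
    dot = sum(ca[t] * cb[t] for t in ca.keys() & cb.keys())
    norm = math.sqrt(sum(v * v for v in ca.values())) * math.sqrt(
        sum(v * v for v in cb.values())
    )
    return dot / norm if norm else 0.0


def vote_on_link(requirement_tokens, expert_texts, threshold=0.0, min_votes=2):
    """Return (accepted, voters) for one candidate requirement-to-class link.

    expert_texts maps an expert name (partition) to that partition's tokens
    for the class under consideration.
    """
    voters = [
        name
        for name, tokens in expert_texts.items()
        if cosine(requirement_tokens, tokens) > threshold
    ]
    return len(voters) >= min_votes, voters


# Hypothetical, already preprocessed data for the slide's example requirement.
REQUIREMENT = ["email", "client", "support", "pop", "integr"]
CLASS_A = {
    "comments": ["connect", "pop", "server"],
    "method_names": ["fetch", "email"],
    "variable_names": ["buffer", "size"],
}

if __name__ == "__main__":
    print(vote_on_link(REQUIREMENT, CLASS_A))
```

Requiring agreement between experts is what allows a link supported by only one weak source of information to be discarded without manual inspection.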




Case Studies
• Goal: Investigate the effectiveness of Coparvo in
  improving the accuracy of VSM and reducing the
  effort required to manually discard false-positive
  links

• Quality focus: Ability to recover traceability links
  between requirements and source code

• Context: Recovering requirements traceability
  links of three open-source programs, Pooka, SIP,
  and iTrust

Research Questions
RQ1: How does Coparvo help to find valuable partitions of
source code that help in recovering traceability links?


RQ2: How much does Coparvo help to reduce the effort required
to manually verify recovered traceability links?


RQ3: How does the F-measure value of the traceability links
recovered by Coparvo compare with a traditional VSM-based
approach?


Datasets
SIP Communicator: Voice over IP and instant messenger
Pooka: An email client
iTrust: Medical application

                    Pooka           SIP Communicator   iTrust
Version             2.0             1.0                10
Number of Classes   298             1,771              526
Number of Methods   20,868          31,502             3,404
LOC                 244K            487K               19K



IR Quality Measures




$$F = 2 \times \frac{\text{Precision} \times \text{Recall}}{\text{Precision} + \text{Recall}}$$
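
As a small worked example of the F-measure, the sketch below computes precision, recall, and F from a set of recovered links and an oracle. The function name `precision_recall_f` and the sample links are hypothetical; precision and recall use their standard IR definitions.

```python
def precision_recall_f(recovered, oracle):
    """Compute precision, recall, and F-measure for recovered traceability links.

    Both arguments are collections of (requirement, class) pairs.
    """
    recovered, oracle = set(recovered), set(oracle)
    correct = recovered & oracle
    precision = len(correct) / len(recovered) if recovered else 0.0
    recall = len(correct) / len(oracle) if oracle else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure


# Hypothetical links: 4 recovered, 3 in the oracle, 2 in common.
RECOVERED = {("R1", "A"), ("R1", "B"), ("R2", "C"), ("R2", "D")}
ORACLE = {("R1", "A"), ("R2", "C"), ("R3", "E")}

if __name__ == "__main__":
    p, r, f = precision_recall_f(RECOVERED, ORACLE)
    print(f"precision={p:.2f} recall={r:.2f} F={f:.2f}")
```

With these sample links, precision is 2/4, recall is 2/3, and F is 4/7.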

Source Code Partitions
  1. Class name


  2. Method name


  3. Variable name


  4. Comments

Text Preprocessing

• Filter (#43@$)


• Stop words (the, is, an….)


• Stemmer
  (attachment, attached -> attach)
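
The three preprocessing steps can be sketched as follows. This is a minimal illustration: the stop-word list is a tiny sample, and `naive_stem` is a crude suffix stripper standing in for a real stemmer (the slides do not name the stemmer used). All names are hypothetical.

```python
import re

# Tiny illustrative stop-word list (assumption); real lists are much longer.
STOP_WORDS = {"the", "is", "an", "a", "of", "to", "and"}


def naive_stem(word):
    """Crude suffix stripping; a stand-in for a real stemming algorithm."""
    for suffix in ("ment", "ed", "ing", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Filter non-letters, drop stop words, and stem the remaining tokens."""
    text = re.sub(r"[^A-Za-z]+", " ", text)  # filter digits and special characters
    tokens = text.lower().split()
    tokens = [t for t in tokens if t not in STOP_WORDS]
    return [naive_stem(t) for t in tokens]


if __name__ == "__main__":
    print(preprocess("The attachment #43@$ is attached"))
```

On the slide's own example, both "attachment" and "attached" reduce to "attach", so the two words count as the same term during retrieval.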


Information Retrieval (IR) Methods
• Vector Space Model (VSM)
   – Each document, d, is represented by a vector of ranks of
     the terms in the vocabulary:
                      $v_d = [r_d(w_1), r_d(w_2), \ldots, r_d(w_{|V|})]$
   – The query is similarly represented by a vector
   – The similarity between the query and document is the
     cosine of the angle between their respective vectors
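
The cosine computation can be sketched as below. The term weighting is an assumption: raw term counts are used here for simplicity, whereas VSM-based approaches commonly use TF-IDF weights. The function name `cosine_similarity` is hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(query_tokens, doc_tokens):
    """Cosine of the angle between the term-count vectors of a query and a document."""
    q, d = Counter(query_tokens), Counter(doc_tokens)
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(q[t] * d[t] for t in q.keys() & d.keys())
    norm = math.sqrt(sum(v * v for v in q.values())) * math.sqrt(
        sum(v * v for v in d.values())
    )
    return dot / norm if norm else 0.0


if __name__ == "__main__":
    requirement = ["email", "client", "support", "pop"]
    source_doc = ["pop", "client", "connect"]
    print(round(cosine_similarity(requirement, source_doc), 3))
```

A similarity of 0 means the requirement and the source document share no terms; values closer to 1 indicate more similar term distributions.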




Defining Expert
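[Bar chart: one bar each for class names (CN), method names (MN), variable names (VN), and comments (Cmt) per system (Pooka, SIP, iTrust); vertical axis from 0 to 60]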
Pooka Results




SIP Comm. Results




iTrust Results




Voting vs. Combination
• Can we use only combinations of source
  code partitions to create requirements
  traceability links?

• How much does a combination of source code
  partitions improve the F-measure?



Pooka Results




SIP Comm. Results




iTrust Results




Statistical Tests
  Non-parametric test: Mann-Whitney test

F-measure   Pooka            SIP Comm.   iTrust
P-value     p<0.01           p<0.01      p<0.01




Effort Analysis
[Bar chart: VSM vs. Coparvo for Pooka, SIP Comm., and iTrust; vertical axis from 0 to 90,000]
Effort Analysis (F-Measure)
[Bar chart: VSM vs. Coparvo for Pooka, SIP Comm., and iTrust; vertical axis from 0 to 14]
RQ Answers
RQ1: Combinations of source code partitions, or single
  partitions, sometimes also provide better results than Coparvo

RQ2: Using different sources of information reduces
  experts’ effort by up to 83%

RQ3: Partitioning source code and using the partitions as
  experts for voting yields better accuracy



Threats to Validity
• External validity:
         • We analyzed only three systems
         • Different source code size

• Construct validity:
         • The two researchers built both oracles
         • Oracles were validated by the other two experts
         • iTrust oracle was developed by developer(s)


• Conclusion validity: Non-parametric test

• Tool is online at www.factrace.net


Ongoing work
More IR approaches

Empirical study

Threshold




Questions?




