Dr. Rakesh Rana,
Research Scientist,
Lero – The Irish Software Research Centre, Ireland
When Do Software Issues and Bugs get Reported in
Large Open Source Software Project?
Objectives/ Research Questions
We examine the reporting patterns of more than 7000 issue reports from five
large open source software projects to evaluate two main characteristics:
(1) When do defects get reported: are there any distinct patterns? and
(2) Is there any difference between reported defect inflow and actual defect
inflow for these projects?
Why bother?
• Detailed knowledge of specific patterns in defect reporting can be useful
for planning purposes,
• Differences between reported and actual defect inflow can affect the
accuracy of dynamic software reliability growth models (SRGMs) used for
defect prediction/reliability assessment.
Results
Our results suggest that:
• While there is distinct variation in when defects are reported,
• the ratio of reported to actual defects remains fairly stable over time.
• These results increase our confidence in applying SRGMs to reported
defect inflow. (Test with a logistic growth model: asymptotes predicted from
reported bugs deviate on average by ~4.8% from those predicted from actual bugs.)
• The reporting patterns can also provide some insight into the
groups of people who contribute to OSS projects.
Software is Everywhere
Image source: http://itsallaboutembedded.blogspot.com/2013/03/what-makes-embedded-system-called-as.html
Open Source
Image source: https://en.wikipedia.org/wiki/Open-source_software; http://www.webzeee.com/page_osp.html
Software Defect – Definition
Defect: An imperfection or deficiency in a work product where that work
product does not meet its requirements or specifications and needs to be
either repaired or replaced.
Error: A human action that produces an incorrect result.
Failure: (A) Termination of the ability of a product to perform a required
function or its inability to perform within previously specified limits, or
(B) an event in which a system or system component does not perform a
required function within specified limits.
Fault: A manifestation of an error in software.
Problem: (A) Difficulty or uncertainty experienced by one or more
persons, resulting from an unsatisfactory encounter with a system in use
or (B) a negative situation to overcome.
IEEE standard 1044, Classification for Software Anomalies
We distinguish between issues and defects as follows:
Software Issue: a report filed by users or developers in the issue
database of the given OSS project. Issues can be defects/bugs, enhancement
requests, improvement requests, documentation requests, refactoring requests, etc.
Defect/Bug: the two terms are used interchangeably in this paper for issues
that require a corrective maintenance task, usually achieved by making
semantic changes to the source code.
The Data
• Five open source Java projects with active development,
• Developed and maintained by Apache and Mozilla, which tend to follow a
strict bug reporting and fixing process,
• K. Herzig, S. Just, and A. Zeller, “It’s not a bug, it’s a feature: how misclassification
impacts bug prediction,” in Proceedings of the 2013 International Conference on
Software Engineering. IEEE Press, 2013, pp. 392–401.

OSS Project   Time Period         Maintainer   No. of Issues
HttpClient    11/2001 – 04/2012   Apache       746
Jackrabbit    09/2004 – 04/2012   Apache       2402
Lucene-Java   03/2004 – 03/2012   Apache       2443
Rhino         09/1999 – 02/2012   Mozilla      584
Tomcat5       05/2002 – 12/2011   Apache       1226
Steps:
• We mined all software issues for these five projects over the given time periods.
• We then mapped each issue to the manual classification of K. Herzig, S. Just, and A. Zeller, “It’s not a bug, it’s
a feature: how misclassification impacts bug prediction,” in Proceedings of ICSE 2013.
• We then used the time stamp information to analyze the trends of total reported issues, reported bugs, and actual bugs.
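The mapping step can be sketched roughly as follows. This is a minimal illustration, not the tooling used in the study: the `Issue` record, the `label_issues` helper, the `DEMO-n` identifiers and the label strings are all hypothetical, and the sketch assumes the manual classification is available as a lookup from issue id to category.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Issue:
    issue_id: str       # hypothetical tracker key
    created: datetime   # time stamp of the report
    reported_type: str  # type chosen by the reporter in the tracker


def label_issues(issues, manual_labels):
    """Attach the manual classification to each mined issue.

    Returns (issue, is_reported_bug, is_actual_bug) triples. Issues without
    a manual label are skipped rather than guessed.
    """
    labelled = []
    for issue in issues:
        manual = manual_labels.get(issue.issue_id)
        if manual is None:
            continue
        labelled.append((issue, issue.reported_type == "bug", manual == "bug"))
    return labelled


# Hypothetical toy data, not records from the study.
issues = [
    Issue("DEMO-1", datetime(2010, 1, 4, 9, 30), "bug"),
    Issue("DEMO-2", datetime(2010, 1, 5, 14, 0), "bug"),
    Issue("DEMO-3", datetime(2010, 1, 9, 11, 15), "improvement"),
]
manual_labels = {"DEMO-1": "bug", "DEMO-2": "feature", "DEMO-3": "improvement"}

labelled = label_issues(issues, manual_labels)
total_issues = len(labelled)
reported_bugs = sum(1 for _, reported, _ in labelled if reported)
actual_bugs = sum(1 for _, _, actual in labelled if actual)
print(total_issues, reported_bugs, actual_bugs)
```

Keeping the reporter's type and the manual label side by side is what allows total issues, reported bugs and actual bugs to be counted from the same records.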
The Data
• HttpClient & Rhino: roughly linear cumulative issues profile
• Jackrabbit: S-shaped
• Lucene-Java: convex
• Tomcat5: concave issues inflow profile
• These are total issues reported. What about actual bugs?
• Some studies suggest that ~40% of issues reported as bugs are not real bugs!
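A normalized cumulative profile of this kind could be computed along these lines. The sketch assumes that week 1 starts at the first day of the observation period and that each issue carries a creation time stamp; the function names and the toy dates are hypothetical.

```python
from collections import Counter
from datetime import datetime
from itertools import accumulate


def weekly_inflow(timestamps, start):
    """Count issues per project week, where week 1 begins at `start`."""
    if not timestamps:
        return []
    counts = Counter((ts - start).days // 7 + 1 for ts in timestamps)
    last_week = max(counts)
    return [counts.get(week, 0) for week in range(1, last_week + 1)]


def normalized_cumulative(weekly_counts):
    """Running total divided by the final total, so every curve ends at 1."""
    total = sum(weekly_counts)
    if total == 0:
        return []
    return [running / total for running in accumulate(weekly_counts)]


# Hypothetical toy time stamps, not data from the study.
start = datetime(2010, 1, 4)
timestamps = [
    datetime(2010, 1, 4),
    datetime(2010, 1, 6),
    datetime(2010, 1, 12),
    datetime(2010, 1, 26),
]
weekly = weekly_inflow(timestamps, start)
profile = normalized_cumulative(weekly)
print(weekly)
print(profile)
```

Normalizing by the final total puts projects of very different sizes on the same 0 to 1 scale, which is what makes the linear, S-shaped, convex and concave shapes comparable.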
[Figure: Issues inflow per week. X-axis: Time in Weeks; Y-axis: Number of Issues; series: HttpClient, Jackrabbit, Lucene-Java, Rhino, Tomcat5.]
[Figure: Cumulative issues inflow. X-axis: Time in Weeks; Y-axis: Total Number of Issues (normalized); series: HttpClient, Jackrabbit, Lucene-Java, Rhino, Tomcat5.]
Results: Issues reported on monthly basis
[Figure: Issues Reported per Month (Jan to Dec). Series: Total Issues, Reported Bugs, Actual Bugs.]
[Figure: Ratio of Bugs and Issues Reported per Month (Jan to Dec). Series: Reported Bugs/Total Issues, Actual Bugs/Total Issues, Actual Bugs/Reported Bugs.]
• Dec: holiday season, with fewer issues & bugs reported than average,
• Jan – Mar: increasing trend, higher than average,
• Apr – July: far fewer issues & bugs reported than average
o Busy (exam) periods at universities,
o End of the financial year/quarter in many countries
• The ratios/proportions nevertheless remain fairly stable over time (with some exceptions)
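The three ratios shown per month could be derived as in the following sketch. It assumes each issue has already been reduced to a month number plus two flags (reported as a bug, manually classified as a bug); the function name, the dictionary keys and the toy records are hypothetical.

```python
from collections import defaultdict

MONTH_NAMES = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]


def monthly_ratios(records):
    """records: iterable of (month 1..12, is_reported_bug, is_actual_bug)."""
    # Per month: [total issues, reported bugs, actual bugs]
    counts = defaultdict(lambda: [0, 0, 0])
    for month, is_reported_bug, is_actual_bug in records:
        row = counts[month]
        row[0] += 1
        row[1] += int(is_reported_bug)
        row[2] += int(is_actual_bug)

    ratios = {}
    for month in sorted(counts):
        total, reported, actual = counts[month]
        ratios[MONTH_NAMES[month - 1]] = {
            "reported_bugs/total_issues": reported / total,
            "actual_bugs/total_issues": actual / total,
            # Undefined when no bugs were reported in that month.
            "actual_bugs/reported_bugs": actual / reported if reported else None,
        }
    return ratios


# Hypothetical toy records, not data from the study.
records = [
    (1, True, True), (1, True, False), (1, False, False), (1, True, True),
    (12, True, True), (12, False, False),
]
ratios = monthly_ratios(records)
for month, values in ratios.items():
    print(month, values)
```

Comparing these ratios across months, rather than the raw counts, is what separates a seasonal change in volume from a change in the share of real bugs.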
Results: Issues reported on weekly basis
[Figure: Issues Reported per Week (weeks 1 to 53). Series: Actual Bugs, Total Issues, Reported Bugs.]
[Figure: Ratio of Bugs and Issues Reported per Week (weeks 1 to 53). Series: Reported Bugs/Total Issues, Actual Bugs/Total Issues, Actual Bugs/Reported Bugs.]
Results: Issues reported by week day
[Figure: Issues Reported by Week Day (Mon to Sun). Series: Total Issues, Reported Bugs, Actual Bugs.]
[Figure: Ratio of Bugs and Issues Reported by Week Day (Mon to Sun). Series: Reported Bugs/Total Issues, Actual Bugs/Total Issues, Actual Bugs/Reported Bugs.]
• On the first day of the week (Mon), most contributors are busy with their primary task (education/job), so there are
fewer contributions to the OSS project,
• From Tuesday to Friday, the number of total issues and bugs reported increases,
• On Saturday, the absolute number of issues drops compared to the Thursday/Friday peak, but a
large number of issues and bugs are still registered,
• Another interesting observation: the proportion of actual bugs to reported bugs, and to total reported issues, is
highest on Saturdays and lowest on the first working day of the week,
• On Sunday, it seems most contributors take time off before the next week starts
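A weekday breakdown of the actual-to-reported share might look like the sketch below. It assumes the time stamps are already expressed in the time zone of interest, since the weekday of a report depends on it; the function name and the toy reports are hypothetical.

```python
from collections import Counter
from datetime import datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def actual_share_by_weekday(reports):
    """reports: iterable of (timestamp, is_actual_bug) for issues reported as bugs.

    Returns the share of reported bugs that are actual bugs for each weekday
    that has at least one report.
    """
    reported = Counter()
    actual = Counter()
    for ts, is_actual in reports:
        day = DAY_NAMES[ts.weekday()]  # datetime.weekday(): Monday is 0
        reported[day] += 1
        if is_actual:
            actual[day] += 1
    return {day: actual[day] / reported[day] for day in DAY_NAMES if reported[day]}


# Hypothetical toy reports, not data from the study.
reports = [
    (datetime(2010, 1, 4, 10, 0), True),   # Monday
    (datetime(2010, 1, 4, 15, 0), False),  # Monday
    (datetime(2010, 1, 5, 9, 0), True),    # Tuesday
    (datetime(2010, 1, 5, 9, 30), False),  # Tuesday
    (datetime(2010, 1, 5, 16, 0), True),   # Tuesday
    (datetime(2010, 1, 9, 11, 0), True),   # Saturday
]
shares = actual_share_by_weekday(reports)
highest = max(shares, key=shares.get)
lowest = min(shares, key=shares.get)
print(shares, highest, lowest)
```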
Impact on SRGM prediction models
SRGMs: Software Reliability Growth Models
Can be used for
• Making asymptote predictions, and
• Predicting the shape of defect inflow
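As an illustration of both uses, the sketch below evaluates a three-parameter logistic curve. The functional form a / (1 + exp(-b (t - c))) is an assumption, as is reading c as the inflection time; the slides name the logistic model but do not give its parametrization. The parameter values are hypothetical.

```python
import math


def logistic(t, a, b, c):
    """Cumulative defects at time t under an assumed logistic growth curve.

    a: asymptote (total defects expected in the long run)
    b: growth rate
    c: time of the inflection point (assumed meaning of the constant term)
    """
    return a / (1.0 + math.exp(-b * (t - c)))


# Hypothetical parameters, chosen only for illustration.
a, b, c = 300.0, 0.05, 40.0

at_inflection = logistic(c, a, b, c)  # half of the asymptote
far_future = logistic(400, a, b, c)   # approaches the asymptote a
inflow_week_41 = logistic(41, a, b, c) - logistic(40, a, b, c)  # weekly inflow
print(at_inflection, far_future, inflow_week_41)
```

The asymptote `a` gives the predicted total number of defects, while differences between consecutive time points give the predicted shape of the weekly inflow.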
[Figure: Issues Reported per Week (weeks 1 to 53). Series: Actual Bugs, Total Issues, Reported Bugs.]
Tested with:
o Logistic model
o 90%/10% data split
o Actual bug inflow (requires manual classification) vs.
reported bug inflow (data readily available)
             Asymptote (a)                                     Growth rate (b)             Constant term (c)
OSS Project  Reported bugs (Adj)  Actual bugs  Relative error  Reported bugs  Actual bugs  Reported bugs  Actual bugs
HttpClient   261                  260          0.4%            0.051          0.050        39.7           41.3
Jackrabbit   901                  923          -2.4%           0.069          0.065        42.3           44.6
Lucene-Java  741                  685          8.2%            0.049          0.054        67.6           65.7
Rhino        338                  301          12.3%           0.035          0.040        76.4           64.5
Tomcat5      644                  640          0.6%            0.082          0.084        36.7           33.3
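The relative errors in the table can be checked with a few lines of arithmetic. The slides do not say how the ~4.8% average is computed. The sketch below assumes it is the mean of the absolute relative errors, which reproduces that figure from the tabulated asymptotes.

```python
# (project, asymptote from reported bugs (adjusted), asymptote from actual bugs),
# copied from the table above.
ASYMPTOTES = [
    ("HttpClient", 261, 260),
    ("Jackrabbit", 901, 923),
    ("Lucene-Java", 741, 685),
    ("Rhino", 338, 301),
    ("Tomcat5", 644, 640),
]


def relative_error(predicted, reference):
    """Signed deviation of `predicted` from `reference`, as a fraction."""
    return (predicted - reference) / reference


errors = {name: relative_error(rep, act) for name, rep, act in ASYMPTOTES}
mean_abs_error = sum(abs(e) for e in errors.values()) / len(errors)

for name, err in errors.items():
    print(f"{name}: {err:+.1%}")
print(f"mean absolute relative error: {mean_abs_error:.1%}")
```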
Next Steps
• Analyzing the local time at which commits and issue reports are made could help
us build a better profile of who actually contributes to OSS projects and when
these contributions occur.
• How much more can we learn about the patterns within reported issues, and
how much better can we understand OSS contributors, by applying defect classification
techniques (e.g., Orthogonal Defect Classification) to the issues in
OSS bug repositories?
• How do the patterns of issue and bug reporting differ when OSS
projects are managed by government or commercial organizations, etc.?
For more details
Contact: Rakesh Rana
rakesh.rana@lero.ie

