Characterizing and Predicting Which Bugs Get Reopened

        Thomas Zimmermann
        Nachiappan Nagappan
        Microsoft Research


        Philip J. Guo
        Stanford University


        Brendan Murphy
        Microsoft Research




© Microsoft Corporation
A bug’s life




Picture on the right via http://www.bugzilla.org/docs/2.18/html/lifecycle.html
Final part of a trilogy…
     Which bugs are fixed?     Bug reassignments   Bug reopens (this paper)
     ICSE 2010                 CSCW 2011           ICSE 2012 SEIP




…and partly a remake




                                Emad Shihab, Akinori Ihara, Yasutaka Kamei,
                                Walid M. Ibrahim, Masao Ohira, Bram Adams,
                                Ahmed E. Hassan, Ken-ichi Matsumoto:
                                Predicting Re-opened Bugs: A Case Study on
                                the Eclipse Project. WCRE 2010: 249-258




Shihab et al.: Predicting Reopened Bugs (WCRE 2010)
   Four dimensions: work habits, bug report, bug fix and team
   Predicted reopened bugs with a precision of 62.9% and a recall of 84.5% (decision trees)
   Top node analysis found that the bug report dimension was most influential

This paper: Characterizing Reopened Bugs
   Partial replication of Shihab et al.
   New measurements: organizational and geographic distance, reputation, how found
   Qualitative component on the causes of bug reopens (identified with a survey)
   Descriptive models (logistic regression)

Methodology

    Qualitative survey
        • “In your experience, what are reasons why a bug would be reopened multiple times”
        • 358 out of 1,773 responded. Card sort.

    Quantitative analysis
        • All bug reports for Windows Vista and Windows 7
        • Logistic regression model for reopened bugs

    Manual inspection
        • Random sample of reopened bugs
        • 20 bug reports

What are reasons for bug report reopens?

Causes of bug reopens

    Not FIXED
        Related to Root Cause
            • Bugs difficult to reproduce
            • Developers misunderstood root cause
            • Bug had insufficient information
        Related to Priority
            • Priority of the bug increased

    FIXED
        • Regression bugs

    Process-related
        • Process

#1: Difficult to reproduce
        “The bug is hard to reproduce and so the fix was made without
        being able to fully verify it. A good example is a customer who
        reports something. We think we see the issue in house and fix
        that. It turns out we saw something different…”

                            “Bugs which are difficult to reproduce generally get
                            re-activated multiple times. At first, developers will
                            give a simple repro attempt before resolving bugs
                            'Not repro'. But if the bug opener is able to reproduce
                            the issue again, or perhaps comes up with better repro
                            instructions, then the developer will pay more
                            attention the second time the bug is activated.”

                             “Heisenbugs”

#2: Misunderstood root cause
        “The bug is tracking an unidentified symptom and it takes a
        while to fully root cause. This comes up a lot with memory
        leaks: there will be an unknown memory leak in a component
        and the owning team plays whack-a-mole with the code
        defects to remove memory issues one-by-one.”


                           “Not fixing the root cause and only
                           addressing symptoms. Without root
                           cause understood for the bug a
                           patch/hack can often be done that
                           will then be reactivated.”


#3: Insufficient information
        “Poor bug quality. If the bug wasn't described well enough, or
        not enough diagnostic info was there, the dev will guess and
        fix *something* in order to make the bug go away. What they
        fix isn't always what the person who filed the bug ran into.”

                           “If a bug report does not accurately convey
                           enough information about what is actually
                           wrong (i.e. it describes incorrect behavior
                           but neglects to mention data loss) or if the
                           bug does not convey a dependency (such as
                           another team relying on a fix), a bug may be
                           de-prioritized and resolved without fixing.”


#4: Increased priority
        “Bugs are closed because one person or triage team believes
        the bug is not worthy of fixing (i.e. too risky, don't care, etc.),
        but then a few days later a VP or external customer reports
        the same issue, then the bug has a higher priority.”

                             “Other reason is lack of business justification or too
                             late in product cycle; reopened when sufficient
                             justification exists or new cycle begins.”

                               “One team may feel an issue is critical while
                               the other does not see it as important enough,
                               and instead of carrying a discussion, the bug
                               is bounced around.”


#5: Regression bugs
        “First attempt at fix was flawed in some way, and wasn't
        caught because of lack of testing or unknown related
        scenario regression.”

                           “I've seen cases in the past where it was
                           thought that a bug was fixed only to find that
                           a corner case had been missed.”

                              “I've also seen cases where the bug was
                              only being hit due to a timing issue and
                              something changed that affected the
                              timing and the bug disappeared again.”

#6: Process-related bugs
        “Sometimes bugs are reopened due to a misunderstanding
        of process. e.g. dev resolves bug when fix is submitted, but
        tester reactivates because bug still repros (because fix has
        not yet reached tester).”



                                “Bug is verified fixed in a feature of
                                developer’s branch and the fix takes
                                too long to hit the main branch.”




#6: Process-related bugs

                               “First of all, I don’t like the model where we
                               reactivate bugs that were Fixed but the issue
                               was not resolved. Logically it makes sense,
                               but tracking the thread of the issue through
                               multiple checkins & reactivates can be hell if
                               it happens more than once or twice. I would
                               prefer a model where once a checkin has
                               been made for a bug, that bug is done! New
                               issues, or issues that linger despite a previous
                               fix, should/ would be tracked in a new bug.”



What factors correlate with bug reopens?

Does the source of a bug (how it was found) influence the likelihood of bug reopens?

Bug source                    Vista    Win7
Reopen rate for all bugs      P        Q
Code analysis tools           0.52P    0.73Q
Human review                  0.85P    0.66Q
Ad-hoc testing                0.87P    0.99Q
Internal user                 1.12P    0.97Q
Component testing             1.13P    0.81Q
System testing                1.21P    1.46Q
Customer                      1.33P    1.12Q

Rows are ordered by the Vista multiplier. A multiplier below 1 means less likely to be reopened than the average bug; above 1 means more likely to be reopened.
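
The table reports each source's reopen rate as a multiple of the overall rate (P for Vista, Q for Windows 7) rather than as an absolute number. A minimal sketch of how such multipliers could be derived from raw counts follows. The function name and all counts are hypothetical and chosen only for illustration; they are not data from the study.

```python
def relative_reopen_rates(counts):
    """Map each bug source to its reopen rate divided by the overall reopen rate.

    counts maps source -> (number of bugs, number of reopened bugs).
    """
    total_bugs = sum(n for n, _ in counts.values())
    total_reopened = sum(r for _, r in counts.values())
    overall_rate = total_reopened / total_bugs
    return {src: (r / n) / overall_rate for src, (n, r) in counts.items()}


# Made-up counts, for illustration only (not from the study).
example_counts = {
    "Code analysis tools": (1000, 40),
    "Ad-hoc testing": (2000, 160),
    "Customer": (1000, 120),
}

ratios = relative_reopen_rates(example_counts)
for source, ratio in sorted(ratios.items(), key=lambda kv: kv[1]):
    # A value below 1 means "less likely to be reopened than average".
    print(f"{source:22s} {ratio:.2f} x overall rate")
```

Reporting multiples of a baseline keeps the absolute reopen rates confidential while still allowing sources to be compared within a product.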

Does opener reputation influence the
                   likelihood of bug reopens?




                For each bug, calculate opener’s reputation by
                aggregating over all bugs in the past.


Hooimeijer and Weimer: Modeling bug report quality. ASE 2007.
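
The formula shown on the original slide did not survive extraction. The sketch below assumes that reputation is the number of the opener's earlier bugs that were fixed, divided by the number of earlier bugs opened plus one; this is an assumption about the metric, and the function and variable names are hypothetical. As a simplification, the outcome of every earlier bug is treated as already known when the next bug is opened.

```python
from collections import defaultdict


def opener_reputations(bugs):
    """Return the opener's reputation at the time each bug was opened.

    bugs is a chronologically ordered list of (opener, was_fixed) pairs.
    Assumed formula: fixed_so_far / (opened_so_far + 1).
    """
    opened = defaultdict(int)  # bugs opened so far, per opener
    fixed = defaultdict(int)   # of those, how many were fixed
    reputations = []
    for opener, was_fixed in bugs:
        # Only bugs strictly in the past contribute to the current reputation.
        reputations.append(fixed[opener] / (opened[opener] + 1))
        opened[opener] += 1
        if was_fixed:
            fixed[opener] += 1
    return reputations


# Hypothetical history, for illustration only.
history = [
    ("ann", True),
    ("ann", True),
    ("bob", False),
    ("ann", False),
    ("ann", True),
]
reps = opener_reputations(history)
```

The "+ 1" in the denominator keeps the value defined for first-time openers and damps the score of people who have filed only a few bugs.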


[Figure: likelihood of bug reopens by opener reputation, with regions annotated “more likely to be reopened” and “less likely to be reopened”.]

Does organizational and geographic distance influence the likelihood of bug reopens?

                                                                   Vista    Win7
Organizational distance: opened by and initially assigned to …
    … the same person                                              X        R
    … someone with the same manager                                1.13X    0.96R
    … someone with a different manager                             1.37X    1.07R
Geographic distance: opened by and initially assigned to …
    … the same person                                              X        R
    … someone in the same building                                 1.27X    0.93R
    … someone in a different building but in the same country      1.45X    1.00R
    … someone in a different country                               1.52X    1.14R
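
The row labels above amount to a small classification of each opener/assignee pair. A minimal sketch of that classification follows, assuming each person is described by a name, a manager, a building and a country. The `Employee` record and both function names are hypothetical and are not taken from the study's tooling.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    # Hypothetical record; the fields mirror the table's row labels.
    name: str
    manager: str
    building: str
    country: str


def organizational_distance(opener, assignee):
    if opener.name == assignee.name:
        return "same person"
    if opener.manager == assignee.manager:
        return "same manager"
    return "different manager"


def geographic_distance(opener, assignee):
    if opener.name == assignee.name:
        return "same person"
    # Check the country first so equal building names in different
    # countries are not mistaken for the same building.
    if opener.country != assignee.country:
        return "different country"
    if opener.building == assignee.building:
        return "same building"
    return "different building, same country"


# Made-up people, for illustration only.
ann = Employee("ann", "dana", "B1", "US")
bob = Employee("bob", "dana", "B2", "US")
eve = Employee("eve", "omar", "B1", "IN")
```

For example, `organizational_distance(ann, bob)` yields "same manager", while `geographic_distance(ann, eve)` yields "different country".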

Does organizational and geographic distance influence the likelihood of bug reopens?

                                                                   Vista    Win7
Assigned to opener at some point in time                           Y        S

Never assigned to opener, but assigned to someone with
  the same manager as opener                                       0.54Y    0.39S
Never assigned to anyone with same manager                         0.27Y    0.34S

Never assigned to opener, but assigned to someone in
  the same building                                                0.41Y    0.37S
Never assigned to anyone in same building, but assigned
  to someone in the same country                                   0.31Y    0.43S
Never assigned to anyone in the same country                       0.20Y    0.20S

Descriptive statistical analysis
    • All pre- and post-release bug reports for Windows
      Vista and Windows 7 until July 2009
    • Logistic regression model to characterize
           – Probability that a bug will be reopened

    • Logistic regression model to characterize
           – Probability that a bug will be fixed after the bug
             has been reopened
           – Probability that a bug will be fixed
             (Guo et al., ICSE 2010)

    • Same factors as in Guo et al., ICSE 2010
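
For readers unfamiliar with the model form, a logistic regression turns a weighted sum of factors into a probability through the logistic function. The sketch below shows only that mapping. The intercept, factor names and coefficient values are hypothetical placeholders, not the fitted values from the study.

```python
import math


def reopen_probability(intercept, coefficients, features):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    z = intercept + sum(coefficients[name] * value
                        for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical intercept and coefficients, for illustration only.
example_intercept = -2.0
example_coefficients = {
    "severity_upgraded": 0.3,   # positive: raises the reopen probability
    "opener_reputation": -0.3,  # negative: lowers the reopen probability
}
example_bug = {"severity_upgraded": 1, "opener_reputation": 0.5}

p = reopen_probability(example_intercept, example_coefficients, example_bug)
print(f"estimated reopen probability: {p:.3f}")
```

In a descriptive model like the one on this slide, the sign and size of each coefficient matter more than the predicted probability itself.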



Factor                                          Reopen (Vista)
Bug source (categorical):
    Human review                                not significant
    Code analysis tool                          decrease (-0.503)
    Component testing                           increase (0.238)
    Ad-hoc testing                              (baseline)
    System testing                              increase (0.204)
    Customer                                    increase (0.239)
    Internal user                               not significant
Reputation of bug opener                        decrease (-0.266)
Reputation of 1st assignee                      not significant
Opened by temporary employee                    increase (0.178)
Initial severity level                          increase (0.127)
Severity upgraded?                              increase (0.331)
Opener / any assignee same manager?             increase (0.721)
Opener / any assignee same building?            increase (0.468)
Num. editors                                    increase (0.236)
Num. assignee building                          increase (0.090)
Num. component path changes                     decrease (-0.160)
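
Logistic regression coefficients are expressed in log-odds, so exponentiating one gives the factor by which the odds of a reopen are multiplied per unit increase of that variable, with the other variables held fixed. The sketch below applies this to a few of the Vista coefficients from the table. The dictionary name is hypothetical, and the units or scaling of each variable are not stated on the slide, so the results are indicative only.

```python
import math

# Coefficients copied from the Vista reopen model in the table above.
vista_reopen_coefficients = {
    "Code analysis tool (vs. ad-hoc testing)": -0.503,
    "Customer (vs. ad-hoc testing)": 0.239,
    "Reputation of bug opener": -0.266,
    "Severity upgraded?": 0.331,
    "Opener / any assignee same manager?": 0.721,
}


def odds_ratio(coefficient):
    """Multiplicative change in the odds for a one-unit increase."""
    return math.exp(coefficient)


for factor, coef in vista_reopen_coefficients.items():
    direction = "increase" if coef > 0 else "decrease"
    print(f"{factor:42s} {coef:+.3f}  odds x {odds_ratio(coef):.2f}  ({direction})")
```

An odds ratio below 1 corresponds to a "decrease" entry in the table and a ratio above 1 to an "increase" entry.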


Which *reopened* bugs get fixed? vs. Which bugs get fixed?

Factor                                         Fixed when reopened (Vista)   Fixed (Vista) [Guo, ICSE 2010]
Bug source (categorical):
    Human review                                     0.377                        0.511
    Code analysis tool                          not significant                   0.357
    Component testing                               -0.160                        0.065
    Ad-hoc testing
    System testing                              not significant                  -0.129
    Customer                                        -0.498                       -0.347
    Internal user                                   -0.465                       -0.454
Reputation of bug opener                             1.632                        2.193
Reputation of 1st assignee                           1.651                        2.463
Opened by temporary employee                        -0.144                       -0.125
Initial severity level                          not significant                   0.033
Severity upgraded?                              not significant                   0.256
Opener / any assignee same manager?             not significant                   0.676
Opener / any assignee same building?            not significant                   0.270
Num. editors                                         0.127                        0.240
Num. assignee building                              -0.213                       -0.257
Num. component path changes                         -0.162                       -0.232
Num. re-opens                                         n/a                        -0.135

Lessons learned
    •    Improve reproducibility of bug reports
    •    Provide better tools to identify root cause
    •    Better estimate initial priorities
    •    Reduce the complexity of branching
         (bugs were “verified” in the wrong branch)




Thank you!

    Partial replication of Shihab et al.

    New measurements: organizational and geographic distance, reputation, how found

    Qualitative component on the causes of bug reopens (survey): root cause, priority, process

    Descriptive models based on logistic regression

    http://research.microsoft.com/ese

More Related Content

PDF
Predicting Method Crashes with Bytecode Operations
PDF
Analytics for smarter software development
PDF
Design for Testability: A Tutorial for Devs and Testers
PDF
Design for Testability: A Tutorial for Devs and Testers
PDF
Web Application Remediation - OWASP San Antonio March 2007
PDF
Peter Zimmerer - Evolve Design For Testability To The Next Level - EuroSTAR 2012
PPT
Software Design for Testability
PDF
Defect effort prediction models in software
Predicting Method Crashes with Bytecode Operations
Analytics for smarter software development
Design for Testability: A Tutorial for Devs and Testers
Design for Testability: A Tutorial for Devs and Testers
Web Application Remediation - OWASP San Antonio March 2007
Peter Zimmerer - Evolve Design For Testability To The Next Level - EuroSTAR 2012
Software Design for Testability
Defect effort prediction models in software

What's hot (19)

PPTX
Software testability slide share
PDF
AV-Comparatives Performance Test
PPT
SW Engineering Management
PDF
Dtl 2012 kl-app_ctl1.2
PDF
Protecting Enterprise - An examination of bugs, major vulnerabilities and exp...
PDF
Bill of-rights-white-paper-final-012312
PDF
Alm 4 Azure with screenshots
DOCX
Online exa-syste
PDF
Soffront Defect tracking
PDF
Automatic Proactive Troubleshooting with IBM Rational Build Forge
PDF
Cross-project defect prediction
PDF
Magenic-White-Paper-Continuous-Inegration-for-QA-Teams
PDF
RSA 2015 Blending the Automated and the Manual: Making Application Vulnerabil...
PPSX
Software Testing Basics
PDF
Open Source tools in Continuous Integration environment (case study for agil...
PDF
San Jose Selenium Meetup 22 Mar 2012: The Restless Are Getting Native
PDF
Real Cost of Software Remediation
PDF
Attacking antivirus
PDF
Quality Assurance 1: Why Quality Matters
Software testability slide share
AV-Comparatives Performance Test
SW Engineering Management
Dtl 2012 kl-app_ctl1.2
Protecting Enterprise - An examination of bugs, major vulnerabilities and exp...
Bill of-rights-white-paper-final-012312
Alm 4 Azure with screenshots
Online exa-syste
Soffront Defect tracking
Automatic Proactive Troubleshooting with IBM Rational Build Forge
Cross-project defect prediction
Magenic-White-Paper-Continuous-Inegration-for-QA-Teams
RSA 2015 Blending the Automated and the Manual: Making Application Vulnerabil...
Software Testing Basics
Open Source tools in Continuous Integration environment (case study for agil...
San Jose Selenium Meetup 22 Mar 2012: The Restless Are Getting Native
Real Cost of Software Remediation
Attacking antivirus
Quality Assurance 1: Why Quality Matters
Ad

Similar to Characterizing and Predicting Which Bugs Get Reopened (20)

PDF
Characterizing and predicting which bugs get fixed
PDF
Not my bug! Reasons for software bug report reassignments
DOC
Software Bugs A Software Architect Point Of View
PDF
Software Security Engineering (Learnings from the past to fix the future) - B...
PDF
Patterns for Cleaning Up Bug Data
PPTX
When do software issues get reported in large open source software
PDF
When do software issues get reported in large open source software - Rakesh Rana
PDF
The Bug Backlog - An Evergrowing Mountain
PDF
Practical use of defect detection and prediction
PPT
Bug Reporting
PPTX
SAD15 - Maintenance
PDF
Software bug prediction
PPTX
Wcre2010 shihab
PPTX
How to fix bug or defects in software
PPTX
Software Analytics: Towards Software Mining that Matters (2014)
PPTX
SYSNGS BUGS - definition, lifecycle and what can I do with them as a developer
PPTX
Automated bug localization
PPTX
ReLink: Recovering Links between Bugs and Changes (ESEC/FSE 2011)
PPTX
Reversing Microsoft patches to reveal vulnerable code
PDF
Open vs Closed - Which is more secure?
Characterizing and predicting which bugs get fixed
Not my bug! Reasons for software bug report reassignments
Software Bugs A Software Architect Point Of View
Software Security Engineering (Learnings from the past to fix the future) - B...
Patterns for Cleaning Up Bug Data
When do software issues get reported in large open source software
When do software issues get reported in large open source software - Rakesh Rana
The Bug Backlog - An Evergrowing Mountain
Practical use of defect detection and prediction
Bug Reporting
SAD15 - Maintenance
Software bug prediction
Wcre2010 shihab
How to fix bug or defects in software
Software Analytics: Towards Software Mining that Matters (2014)
SYSNGS BUGS - definition, lifecycle and what can I do with them as a developer
Automated bug localization
ReLink: Recovering Links between Bugs and Changes (ESEC/FSE 2011)
Reversing Microsoft patches to reveal vulnerable code
Open vs Closed - Which is more secure?
Ad

More from Thomas Zimmermann (20)

PDF
Software Analytics = Sharing Information
PDF
MSR 2013 Preview
PPT
Klingon Countdown Timer
PDF
Data driven games user research
PDF
Empirical Software Engineering at Microsoft Research
PDF
Security trend analysis with CVE topic models
PDF
Analytics for software development
PDF
Changes and Bugs: Mining and Predicting Development Activities
PDF
Changes and Bugs: Mining and Predicting Development Activities
PDF
Predicting Defects using Network Analysis on Dependency Graphs
PDF
Quality of Bug Reports in Open Source
PDF
Meet Tom and his Fish
PDF
Predicting Subsystem Defects using Dependency Graph Complexities
PDF
Got Myth? Myths in Software Engineering
PDF
Mining Workspace Updates in CVS
PDF
Mining Software Archives to Support Software Development
PDF
Unit testing with JUnit
PPT
esolang: Esoterische Programmiersprachen
PPT
TA-RE: An Exchange Language for Mining Software Repositories
PDF
Fine-grained Processing of CVS Archives with APFEL
Software Analytics = Sharing Information
MSR 2013 Preview
Klingon Countdown Timer
Data driven games user research
Empirical Software Engineering at Microsoft Research
Security trend analysis with CVE topic models
Analytics for software development
Changes and Bugs: Mining and Predicting Development Activities
Changes and Bugs: Mining and Predicting Development Activities
Predicting Defects using Network Analysis on Dependency Graphs
Quality of Bug Reports in Open Source
Meet Tom and his Fish
Predicting Subsystem Defects using Dependency Graph Complexities
Got Myth? Myths in Software Engineering
Mining Workspace Updates in CVS
Mining Software Archives to Support Software Development
Unit testing with JUnit
esolang: Esoterische Programmiersprachen
TA-RE: An Exchange Language for Mining Software Repositories
Fine-grained Processing of CVS Archives with APFEL

Recently uploaded (20)

PDF
Building Integrated photovoltaic BIPV_UPV.pdf
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PDF
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
PPTX
MYSQL Presentation for SQL database connectivity
PDF
KodekX | Application Modernization Development
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PDF
Empathic Computing: Creating Shared Understanding
PPT
Teaching material agriculture food technology
PDF
Review of recent advances in non-invasive hemoglobin estimation
PPTX
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
PPTX
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
PDF
Chapter 3 Spatial Domain Image Processing.pdf
PDF
Electronic commerce courselecture one. Pdf
PDF
Encapsulation_ Review paper, used for researhc scholars
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Big Data Technologies - Introduction.pptx
PDF
The Rise and Fall of 3GPP – Time for a Sabbatical?
PPTX
Cloud computing and distributed systems.
Building Integrated photovoltaic BIPV_UPV.pdf
Dropbox Q2 2025 Financial Results & Investor Presentation
“AI and Expert System Decision Support & Business Intelligence Systems”
TokAI - TikTok AI Agent : The First AI Application That Analyzes 10,000+ Vira...
MYSQL Presentation for SQL database connectivity
KodekX | Application Modernization Development
Digital-Transformation-Roadmap-for-Companies.pptx
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
Empathic Computing: Creating Shared Understanding
Teaching material agriculture food technology
Review of recent advances in non-invasive hemoglobin estimation
VMware vSphere Foundation How to Sell Presentation-Ver1.4-2-14-2024.pptx
Effective Security Operations Center (SOC) A Modern, Strategic, and Threat-In...
Chapter 3 Spatial Domain Image Processing.pdf
Electronic commerce courselecture one. Pdf
Encapsulation_ Review paper, used for researhc scholars
Per capita expenditure prediction using model stacking based on satellite ima...
Big Data Technologies - Introduction.pptx
The Rise and Fall of 3GPP – Time for a Sabbatical?
Cloud computing and distributed systems.

Characterizing and Predicting Which Bugs Get Reopened

  • 1. Characterizing and Predicting Which Bugs Get Reopened Thomas Zimmermann Nachiappan Nagappan Microsoft Research Philip J. Guo Stanford University Brendan Murphy Microsoft Research © Microsoft Corporation
  • 2. A bug’s life Picture on the right via http://www.bugzilla.org/docs/2.18/html/lifecycle.html © Microsoft Corporation
  • 3. Final part of a trilogy… Which bugs are fixed? Bug reassignments Bug reopens (this paper) ICSE 2010 CSCW 2011 ICSE 2012 SEIP © Microsoft Corporation
  • 4. …and partly a remake Emad Shihab, Akinori Ihara, Yasutaka Kamei, Walid M. Ibrahim, Masao Ohira, Bram Adams, Ahmed E. Hassan, Ken-ichi Matsumoto: Predicting Re-opened Bugs: A Case Study on the Eclipse Project. WCRE 2010: 249-258 © Microsoft Corporation
  • 5. Shihab et. al: Predicting Reopened Bugs This paper: Characterizing Reopened Bugs (WCRE 2010) Four dimensions: work habits, bug report, bug fix and team Predicted reopened bugs with a precision of 62.9% and a recall of 84.5% (d-trees) Top node analysis found that bug report dimension was most influential | © Microsoft Corporation
  • 6. Shihab et al.: Predicting Reopened Bugs (WCRE 2010)
        • Four dimensions: work habits, bug report, bug fix, and team
        • Predicted reopened bugs with a precision of 62.9% and a recall of 84.5% (decision trees)
        • Top node analysis found that the bug report dimension was most influential
      This paper: Characterizing Reopened Bugs
        • Partial replication of Shihab et al.
        • New measurements: organizational and geographic distance, reputation, how found
        • Qualitative component on the causes of bug reopens (identified with a survey)
        • Descriptive models (logistic regression)
  • 7. Methodology
      Qualitative survey
        • “In your experience, what are reasons why a bug would be reopened multiple times”
        • 358 out of 1,773 responded. Card sort.
      Quantitative analysis
        • All bug reports for Windows Vista and Windows 7
        • Logistic regression model for reopened bugs
      Manual inspection
        • Random sample of reopened bugs
        • 20 bug reports
  • 8. What are reasons for bug report reopens? © Microsoft Corporation
  • 9. Causes of bug reopens
      Not FIXED, related to root cause
        • Bugs difficult to reproduce
        • Developers misunderstood root cause
        • Bug had insufficient information
      Not FIXED, related to priority
        • Priority of the bug increased
      FIXED
        • Regression bugs
      Process-related
        • Process
  • 10. #1: Difficult to reproduce “The bug is hard to reproduce and so the fix was made without being able to fully verify it. A good example is a customer who reports something. We think we see the issue in house and fix that. It turns out we saw something different…” “Bugs which are difficult to reproduce generally get re-activated multiple times. At first, developers will give a simple repro attempt before resolving bugs 'Not repro'. But if the bug opener is able to reproduce the issue again, or perhaps comes up with better repro instructions, then the developer will pay more attention the second time the bug is activated.” “Heisenbugs” © Microsoft Corporation
  • 11. #2: Misunderstood root cause “The bug is tracking an unidentified symptom and it takes a while to fully root cause. This comes up a lot with memory leaks: there will be an unknown memory leak in a component and the owning team plays whack-a-mole with the code defects to remove memory issues one-by-one.” “Not fixing the root cause and only addressing symptoms. Without root cause understood for the bug a patch/hack can often be done that will then be reactivated.” © Microsoft Corporation
  • 12. #3: Insufficient information “Poor bug quality. If the bug wasn't described well enough, or not enough diagnostic info was there, the dev will guess and fix *something* in order to make the bug go away. What they fix isn't always what the person who filed the bug ran into.” “If a bug report does not accurately convey enough information about what is actually wrong (i.e. it describes incorrect behavior but neglects to mention data loss) or if the bug does not convey a dependency (such as another team relying on a fix), a bug may be de-prioritized and resolved without fixing.” © Microsoft Corporation
  • 13. #4: Increased priority “Bugs are closed because one person or triage team believes the bug is not worthy of fixing (i.e. too risky, don't care, etc.), but then a few days later a VP or external customer reports the same issue, then the bug has a higher priority.” “Other reason is lack of business justification or too late in product cycle; reopened when sufficient justification exists or new cycle begins.” “One team may feel an issue is critical while the other does not see it as important enough, and instead of carrying a discussion, the bug is bounced around.” © Microsoft Corporation
  • 14. #5: Regression bugs “First attempt at fix was flawed in some way, and wasn't caught because of lack of testing or unknown related scenario regression.” “I've seen cases in the past where it was thought that a bug was fixed only to find that a corner case had been missed.” “I've also seen cases where the bug was only being hit due to a timing issue and something changed that affected the timing and the bug disappeared again.” © Microsoft Corporation
  • 15. #6: Process-related bugs “Sometimes bugs are reopened due to a misunderstanding of process. e.g. dev resolves bug when fix is submitted, but tester reactivates because bug still repros (because fix has not yet reached tester).” “Bug is verified fixed in a feature of developer’s branch and the fix takes too long to hit the main branch.” © Microsoft Corporation
  • 16. #6: Process-related bugs “First of all, I don’t like the model where we reactivate bugs that were Fixed but the issue was not resolved. Logically it makes sense, but tracking the thread of the issue through multiple checkins & reactivates can be hell if it happens more than once or twice. I would prefer a model where once a checkin has been made for a bug, that bug is done! New issues, or issues that linger despite a previous fix, should/ would be tracked in a new bug.” © Microsoft Corporation
  • 17. What factors correlate with bug reopens? © Microsoft Corporation
  • 18. Does the source of a bug (how it was found) influence the likelihood of bug reopens?
      Reopen rates are given relative to the reopen rate for all bugs (P for Vista, Q for Win7).
      Rows run from less likely to be reopened (top) to more likely to be reopened (bottom).

      Bug source             Vista    Win7
      Code analysis tools    0.52P    0.73Q
      Human review           0.85P    0.66Q
      Ad-hoc testing         0.87P    0.99Q
      Internal user          1.12P    0.97Q
      Component testing      1.13P    0.81Q
      System testing         1.21P    1.46Q
      Customer               1.33P    1.12Q
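The table on slide 18 expresses each source's reopen rate as a multiple of the overall reopen rate (P or Q). A minimal sketch of that normalization follows, assuming a hypothetical list of `(source, was_reopened)` records. The function name and the toy data are illustrative only and do not come from the study.

```python
from collections import Counter

def relative_reopen_rates(bugs):
    """Return {source: reopen rate of that source / overall reopen rate}.

    `bugs` is an iterable of (source, was_reopened) pairs (hypothetical schema).
    """
    total = Counter()
    reopened = Counter()
    for source, was_reopened in bugs:
        total[source] += 1
        if was_reopened:
            reopened[source] += 1

    n_reopened = sum(reopened.values())
    if n_reopened == 0:
        raise ValueError("no reopened bugs: the overall rate is zero")
    overall = n_reopened / sum(total.values())

    return {s: (reopened[s] / total[s]) / overall for s in total}

# Made-up toy data: 1 of 10 tool-found bugs and 3 of 10 customer bugs reopened.
toy = (
    [("code analysis", True)] * 1 + [("code analysis", False)] * 9
    + [("customer", True)] * 3 + [("customer", False)] * 7
)
rates = relative_reopen_rates(toy)
for source, ratio in rates.items():
    print(f"{source}: {ratio:.2f} x overall rate")
```

A value below 1 means the source is reopened less often than bugs in general, which is how entries such as 0.52P can be read.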
  • 19. Does opener reputation influence the likelihood of bug reopens? For each bug, calculate opener’s reputation by aggregating over all bugs in the past. Hooimeijer and Weimer: Modeling bug report quality. ASE 2007. © Microsoft Corporation
  • 20. Does opener reputation influence the likelihood of bug reopens? [Figure, annotated with “more likely to be reopened” and “less likely to be reopened”]
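Slide 19 only says that reputation is calculated by aggregating over an opener's past bugs. One plausible reading, sketched here as an assumption, is the share of an opener's earlier bugs that were fixed, with one added to the denominator so that newcomers start at zero. The record layout and function name are hypothetical, and the sketch simplifies by treating each bug's outcome as known before the opener files the next one.

```python
from collections import defaultdict

def opener_reputations(bugs):
    """Reputation of the opener at the time each bug is filed.

    `bugs` is a chronologically ordered list of (opener, was_fixed) pairs.
    Assumed formula (not stated on the slides):
        reputation = past bugs fixed / (past bugs opened + 1)
    """
    opened = defaultdict(int)
    fixed = defaultdict(int)
    reputations = []
    for opener, was_fixed in bugs:
        # Use only bugs filed before this one.
        reputations.append(fixed[opener] / (opened[opener] + 1))
        opened[opener] += 1
        if was_fixed:
            fixed[opener] += 1
    return reputations

# Made-up history for illustration.
history = [
    ("alice", True),
    ("alice", True),
    ("bob", False),
    ("alice", False),
    ("alice", True),
]
reps = opener_reputations(history)
```

Computing the value from earlier bugs only keeps the measure usable as a predictor, since it never looks at the outcome of the bug being scored.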
  • 21. Does organizational and geographic distance influence the likelihood of bug reopens?

      Organizational distance: opened by and initially assigned to …      Vista    Win7
      … the same person                                                   X        R
      … someone with the same manager                                     1.13X    0.96R
      … someone with a different manager                                  1.37X    1.07R

      Geographic distance: opened by and initially assigned to …          Vista    Win7
      … the same person                                                   X        R
      … someone in the same building                                      1.27X    0.93R
      … someone in a different building but in the same country           1.45X    1.00R
      … someone in a different country                                    1.52X    1.14R
  • 22. Does organizational and geographic distance influence the likelihood of bug reopens?

                                                                                   Vista    Win7
      Assigned to opener at some point in time                                     Y        S
      Never assigned to opener, but assigned to someone with the same manager
        as opener                                                                  0.54Y    0.39S
      Never assigned to anyone with same manager                                   0.27Y    0.34S
      Never assigned to opener, but assigned to someone in the same building       0.41Y    0.37S
      Never assigned to anyone in same building, but assigned to someone in
        the same country                                                           0.31Y    0.43S
      Never assigned to anyone in the same country                                 0.20Y    0.20S
  • 23. Descriptive statistical analysis
        • All pre- and post-release bug reports for Windows Vista and Windows 7 until July 2009
        • Logistic regression model to characterize
            – Probability that a bug will be reopened
        • Logistic regression model to characterize
            – Probability that a bug will be fixed after the bug has been reopened
            – Probability that a bug will be fixed (Guo et al., ICSE 2010)
        • Same factors as in Guo et al., ICSE 2010
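To make the modeling step on slide 23 concrete, here is a small sketch of fitting a logistic regression for the probability that a bug is reopened. The slides do not say how the models were fitted, so this uses plain gradient ascent on the log-likelihood in standard-library Python rather than a statistics package. The single binary feature and the toy data are made up for illustration.

```python
import math

def fit_logistic(X, y, lr=1.0, epochs=5000):
    """Fit weights [intercept, w1, ...] by gradient ascent on the log-likelihood."""
    w = [0.0] * (len(X[0]) + 1)
    n = len(X)
    for _ in range(epochs):
        grad = [0.0] * len(w)
        for xi, yi in zip(X, y):
            z = w[0] + sum(wj * xj for wj, xj in zip(w[1:], xi))
            p = 1.0 / (1.0 + math.exp(-z))  # predicted reopen probability
            err = yi - p
            grad[0] += err
            for j, xj in enumerate(xi):
                grad[j + 1] += err * xj
        # Step along the average gradient.
        w = [wj + lr * g / n for wj, g in zip(w, grad)]
    return w

# Toy data: feature = 1 if the bug was found by a code analysis tool.
# 1 of 10 tool-found bugs is reopened, 4 of 10 other bugs are reopened.
X = [[1]] * 10 + [[0]] * 10
y = [1] + [0] * 9 + [1] * 4 + [0] * 6

w = fit_logistic(X, y)
print(f"intercept={w[0]:.3f}, coefficient={w[1]:.3f}")
```

The fitted coefficient is negative because the toy tool-found bugs are reopened less often. The sign is read the same way as the signs in the coefficient tables of the deck.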
  • 24. Logistic regression coefficients for bug reopens

      Factor                                     Reopen (Vista)
      Bug source (categorical)
        Human review                             not significant
        Code analysis tool                       -0.503
        Component testing                        0.238
        Ad-hoc testing                           (baseline)
        System testing                           0.204
        Customer                                 0.239
        Internal user                            not significant
      Reputation of bug opener                   -0.266
      Reputation of 1st assignee                 not significant
      Opened by temporary employee               0.178
      Initial severity level                     0.127
      Severity upgraded?                         0.331
      Opener / any assignee same manager?        0.721
      Opener / any assignee same building?       0.468
      Num. editors                               0.236
      Num. assignee building                     0.090
      Num. component path changes                -0.160
  • 25. Logistic regression coefficients for bug reopens, with the direction of the effect marked for bug sources

      Factor                                     Reopen (Vista)
      Bug source (categorical)
        Human review                             not significant
        Code analysis tool                       decrease (-0.503)
        Component testing                        increase (0.238)
        Ad-hoc testing                           (baseline)
        System testing                           increase (0.204)
        Customer                                 increase (0.239)
        Internal user                            not significant
      Reputation of bug opener                   -0.266
      Reputation of 1st assignee                 not significant
      Opened by temporary employee               0.178
      Initial severity level                     0.127
      Severity upgraded?                         0.331
      Opener / any assignee same manager?        0.721
      Opener / any assignee same building?       0.468
      Num. editors                               0.236
      Num. assignee building                     0.090
      Num. component path changes                -0.160
  • 26. Direction of each factor's effect on bug reopens

      Factor                                     Reopen (Vista)
      Bug source (categorical)
        Human review                             not significant
        Code analysis tool                       decrease (-0.503)
        Component testing                        increase (0.238)
        Ad-hoc testing                           (baseline)
        System testing                           increase (0.204)
        Customer                                 increase (0.239)
        Internal user                            not significant
      Reputation of bug opener                   decrease
      Reputation of 1st assignee                 not significant
      Opened by temporary employee               increase
      Initial severity level                     increase
      Severity upgraded?                         increase
      Opener / any assignee same manager?        increase
      Opener / any assignee same building?       increase
      Num. editors                               increase
      Num. assignee building                     increase
      Num. component path changes                decrease
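The "increase" and "decrease" labels on slides 24 to 26 follow from the sign of each coefficient. In a logistic regression, exponentiating a coefficient gives the factor by which the odds of the outcome are multiplied for a one-unit change in that input, with the other inputs held fixed. The sketch below applies this to three coefficients copied from slide 24. How the study scaled its numeric factors is not stated on the slides, so the results are best read as illustrative.

```python
import math

def odds_multiplier(coefficient):
    """Multiplicative change in the odds for a one-unit increase in the factor."""
    return math.exp(coefficient)

# Coefficients copied from the Reopen (Vista) column on slide 24.
vista_reopen = {
    "Code analysis tool": -0.503,
    "Customer": 0.239,
    "Opener / any assignee same manager?": 0.721,
}

for factor, coef in vista_reopen.items():
    direction = "increase" if coef > 0 else "decrease"
    print(f"{factor}: odds x {odds_multiplier(coef):.2f} ({direction})")
```

Bug-source coefficients are relative to the ad-hoc testing baseline, so the multiplier for "Code analysis tool" compares tool-found bugs with bugs found by ad-hoc testing. The multiplier applies to odds, not directly to probabilities.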
  • 27. Which *reopened* bugs get fixed? vs. Which bugs get fixed? © Microsoft Corporation
  • 28. Logistic regression coefficients: fixed when reopened vs. fixed

                                                 Fixed When            Fixed (Vista)
      Factor                                     Reopened (Vista)      [Guo, ICSE 2010]
      Bug source (categorical)
        Human review                             0.377                 0.511
        Code analysis tool                       not significant       0.357
        Component testing                        -0.160                0.065
        Ad-hoc testing                           (baseline)            (baseline)
        System testing                           not significant       -0.129
        Customer                                 -0.498                -0.347
        Internal user                            -0.465                -0.454
      Reputation of bug opener                   1.632                 2.193
      Reputation of 1st assignee                 1.651                 2.463
      Opened by temporary employee               -0.144                -0.125
      Initial severity level                     not significant       0.033
      Severity upgraded?                         not significant       0.256
      Opener / any assignee same manager?        not significant       0.676
      Opener / any assignee same building?       not significant       0.270
      Num. editors                               0.127                 0.240
      Num. assignee building                     -0.213                -0.257
      Num. component path changes                -0.162                -0.232
      Num. re-opens                              n/a                   -0.135
  • 31. Lessons learned • Improve reproducibility of bug reports • Provide better tools to identify root cause • Better estimate initial priorities • Reduce the complexity of branching (bugs were “verified” in the wrong branch) © Microsoft Corporation
  • 32. Thank you!
        • Partial replication of Shihab et al.
        • New measurements: organizational and geographic distance, reputation, how found
        • Qualitative component on the causes of bug reopens (survey): root cause, priority, process
        • Descriptive models based on logistic regression
      http://research.microsoft.com/ese