Evaluating SZZ Implementations
Through a Developer-informed Oracle
Giovanni Rosa, Luca Pascarella, Simone Scalabrino, Rosalia Tufano,
Gabriele Bavota, Michele Lanza, Rocco Oliveto
Where do bugs come from?
Understanding where bugs are introduced allows us to:
Identify changes that can lead to a problem and avoid them in the future
Estimate how error-prone a program is
Better allocate resources in testing activities
Śliwerski, Zimmermann, Zeller @ MSR 2005
SZZ in a nutshell (Steps 1 to 3)
bug report analysis -> (A) bug-fixing commit -> (B) git blame -> (C) buggy commit -> filtering of resulting commits -> bug-inducing commit
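As a rough illustration of the SZZ flow, the following Python sketch models the blame and filtering steps on toy data. The `Commit` record, the `blame` mapping and both function names are hypothetical and introduced only for illustration; a real implementation would obtain the blame information by running `git blame` on the version of the file that precedes the bug-fixing commit. The date filter (dropping candidates committed after the issue was reported) is one common heuristic, assumed here; SZZ variants differ in how they filter.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Commit:
    """Hypothetical minimal commit record used only in this sketch."""
    sha: str
    day: date


def blame_candidates(fixed_lines, blame):
    """Collect the commits that last touched the lines changed by the fix.

    `blame` maps a line number (in the pre-fix version of the file) to the
    Commit that last modified it, i.e. what `git blame` would report.
    """
    return {blame[n] for n in fixed_lines if n in blame}


def filter_candidates(candidates, issue_reported):
    """Keep only candidates committed before the issue was reported."""
    return {c for c in candidates if c.day < issue_reported}


# Toy history: three commits touching a four-line file.
c1 = Commit("aaa111", date(2019, 1, 10))
c2 = Commit("bbb222", date(2019, 3, 5))
c3 = Commit("ccc333", date(2019, 6, 20))
blame = {1: c1, 2: c2, 3: c2, 4: c3}

# The bug-fixing commit changed lines 2 and 4; the issue was reported in May.
candidates = blame_candidates([2, 4], blame)
inducing = filter_candidates(candidates, date(2019, 5, 1))
print(sorted(c.sha for c in inducing))
```

In this toy case the commit from June is discarded because it was made after the issue was reported, so it cannot have introduced the reported bug.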
Different SZZ variants proposed
There is a problem
Evaluating and
comparing the SZZ
variants
Da Costa et al. @ TSE 2016
Small datasets used for evaluation
Validation manually performed by
researchers
Define a dataset validated by
the developers
The way
fixes a search bug introduced by 2508e12 and fixes a typo in the README.md
Developer-informed dataset
Mining of commits (2011 to 2020)
Heuristic approach
keyword-based filter
AI-powered syntax analysis
duplicate commits removal
Manual validation (removal of false positives)
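A keyword-based filter of this kind could look like the following Python sketch. The regular expressions and the function name are assumptions made for illustration, not the authors' actual heuristic; the sketch only flags commit messages that mention a fix and explicitly point to the change that introduced the bug.

```python
import re

# Hypothetical patterns: a fix keyword anywhere in the message, plus
# "introduced by/in/with <ref>", where <ref> is a hex commit hash or a
# numeric reference optionally prefixed with '#'.
FIX_RE = re.compile(r"\bfix(?:es|ed)?\b", re.IGNORECASE)
INTRODUCED_RE = re.compile(
    r"\bintroduced\s+(?:by|in|with)\s+(?:commit\s+)?#?([0-9a-f]{6,40})\b",
    re.IGNORECASE,
)


def declared_inducing_refs(message):
    """Return the references a message names as bug-introducing.

    Returns an empty list when the message does not look like a bug fix
    or does not name the change that introduced the bug.
    """
    if not FIX_RE.search(message):
        return []
    return INTRODUCED_RE.findall(message)


messages = [
    "fixes a search bug introduced by 2508e12 and fixes a typo in the README.md",
    "fixes #1740 quote pov-ray binary on windows\n"
    "this fixes a bug introduced by #3523741",
    "Update README",
]
for msg in messages:
    print(declared_inducing_refs(msg))
```

A filter like this is deliberately permissive, which is why a later manual validation step is still needed to discard false positives.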
Bug report data
Commit message:
fixes #1740 quote pov-ray binary on windows
this fixes a bug introduced by #3523741
URL: https://tracker.freecadweb.org/view.php?id=1740
Date when the issue is reported
Analyzed commits: 19.6M
Extracted commits: 3.6k
After manual validation: 1.9k
Top programming languages (bar chart, axis from 0 to 370): C, Python, C++, JS, Java, PHP, Ruby, C#
Final number of commits: 1.1k
Commits with issue report: 129
How do different variants of SZZ
perform in identifying
bug-inducing changes?
B-SZZ: Śliwerski et al. @ MSR 2005
AG-SZZ: Kim et al. @ ASE 2006
DJ-SZZ: Williams and Spacco @ ISSTA 2008
R-SZZ and L-SZZ: Davies et al. @ JSE 2013
MA-SZZ: Da Costa et al. @ TSE 2016
RA-SZZ: Neto et al. @ SANER 2018
Open-source implementations
SZZ Unleashed (DJ-SZZ)
OpenSZZ (B-SZZ)
PyDriller (AG-SZZ)
RA-SZZ (RA-SZZ)
Our experiment (Steps 1 to 3)
bug report analysis -> (A) bug-fixing commit -> (B) git blame -> (C) buggy commit -> filtering of resulting commits -> bug-inducing commit
Results
Precision: highest 0.66 (R-SZZ), lowest 0.09 (SZZ@UNL)
Recall: highest 0.72 (SZZ@UNL), lowest 0.19 (SZZ@OPN, Java only)
F1-score: highest 0.61 (R-SZZ), lowest 0.16 (SZZ@UNL)
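For reference, the three metrics can be computed from the set of commits an SZZ implementation reports and the set of bug-inducing commits in the oracle. The Python sketch below is a minimal illustration: the function names and the toy commit identifiers are hypothetical, and how the study aggregates results across bug fixes is not shown on the slides. The final check assumes that the three SZZ@UNL values on the slide are its precision (0.09), recall (0.72) and F1-score (0.16).

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall_f1(predicted, oracle):
    """Compare predicted bug-inducing commits against the oracle."""
    predicted, oracle = set(predicted), set(oracle)
    true_positives = len(predicted & oracle)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(oracle) if oracle else 0.0
    return precision, recall, f1_score(precision, recall)


# Toy example: four commits reported, one of which is in the oracle.
p, r, f1 = precision_recall_f1({"a", "b", "c", "d"}, {"a", "e"})
print(p, r, round(f1, 2))

# Consistency check on the slide values assumed for SZZ@UNL.
print(round(f1_score(0.09, 0.72), 2))
```

The check illustrates why a very high recall does not rescue the F1-score when precision is low: an implementation that reports many candidate commits finds most true bug-inducing commits but buries them among false positives.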
Qualitative Analysis
What have we learned?
Lesson 1: "The buggy line is not always impacted in the bug-fix"
Lesson 2: "SZZ is sensitive to history rewritings"
Lesson 3: "Looking at the big picture in code changes"
Summary
Take a look at our SZZ implementation!
https://github.com/grosa1/pyszz

Evaluating SZZ Implementations Through a Developer-informed Oracle (ICSE 2021)