Bug prediction
based on your code history
● Developer @ Various companies (3y)
● Developer & Founder @ Startup (4y)
● Team Lead & Architect @ Yandex (5y)
● VP of Engineering @ WorldAPP (2y)
13 years in practical engineering
80 people at the department
10+ projects from initial commit to production
Backlog
DEV + QA
Automation engineers
Automate bug searching
What has already been automated
Develop | Test | Maintenance
● Unit and Integration tests run on every commit to MR branches
● Static code analysis on each push
● Cross references between GitLab and Jira
● HipChat notifications about created Merge Requests
What has already been automated
Develop | Test | Maintenance
● Deploy a successful build to the test environment
● UI and Performance tests run on every commit to a develop branch
● Check against different types of supported DBMS
What has already been automated
Develop | Test | Maintenance
● Deploy a successful build to the production environment
● Grafana alerting to HipChat
Issues with these opportunities
● static code analyzers find only non-conceptual issues
● automated tests cover only predefined scenarios
● code reviews are aimed at sharing and enforcing best practices, and fewer than 10% of all discussions uncover logical issues
● and, finally, QA has no idea which parts of the system could be affected by a code change… and neither does the programmer
20 bugs per week in the production environment
A guess: let's examine the human factor
● a tired engineer makes more mistakes
● the more an engineer knows about a certain module, the fewer bugs (s)he will produce
● small changes have fewer bugs than long listings
● some parts of the system are more complicated than others, so the risk of introducing a bug there is higher
● huge changes made in a short period of time contain more bugs (done in a hurry)
Hypothesis
If we know that a certain commit fixed a bug, then we know that the commit that introduced the changed lines contained the bug.
Commit A (original)
public int sum( int a, int b )
{
    return a + b;
}
Commit B (Author: Bob)
public int sum( int a, int b )
{
    return a * b;
}
Commit C (Author: John)
public int sum( int a, int b )
{
    return a + b;
}
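The hypothesis can be sketched in Java roughly as follows. This is a minimal illustration under stated assumptions, not the implementation from the talk: the `Commit` record, the `blameBefore` map (what running blame on the parent revision would report for each touched line), and the rule "a fixing commit has the word fixed in its message" are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical model for illustration; none of these names come from the talk.
final class CommitLabeler {
    enum Label { REGULAR, FIXING, BUGGY }

    // touchedLines holds "file:line" keys for the lines the commit changed.
    record Commit(String id, String author, String message, List<String> touchedLines) {}

    // blameBefore: commit id -> (line key -> id of the commit that last wrote
    // that line before this commit was applied).
    static Map<String, Label> label(List<Commit> history,
                                    Map<String, Map<String, String>> blameBefore) {
        Map<String, Label> labels = new LinkedHashMap<>();
        for (Commit c : history) {
            labels.put(c.id(), Label.REGULAR);
        }
        for (Commit c : history) {
            // Assumption: a fix is recognised by the word "fixed" in its message.
            if (!c.message().toLowerCase(Locale.ROOT).contains("fixed")) {
                continue;
            }
            // Keep BUGGY if a later fix already blamed this commit.
            labels.merge(c.id(), Label.FIXING, (old, fix) -> old == Label.BUGGY ? old : fix);
            Map<String, String> blame = blameBefore.getOrDefault(c.id(), Map.of());
            for (String line : c.touchedLines()) {
                String origin = blame.get(line);
                if (origin != null) {
                    labels.put(origin, Label.BUGGY); // the commit that introduced the fixed line
                }
            }
        }
        return labels;
    }

    // The sum() example: A is the original, Bob breaks it in B, John fixes it in C.
    static Map<String, Label> demo() {
        List<Commit> history = List.of(
            new Commit("A", "unknown", "add sum()", List.of("Calc.java:3")),
            new Commit("B", "Bob", "rework sum()", List.of("Calc.java:3")),
            new Commit("C", "John", "fixed sum()", List.of("Calc.java:3")));
        Map<String, Map<String, String>> blameBefore =
            Map.of("C", Map.of("Calc.java:3", "B"));
        return label(history, blameBefore);
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

In a real pipeline the blame data would come from the version control system and the "is this a fix" decision from the ticket type in the tracker rather than from the commit message.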
What tools can help us?
● ticket types
● action history
● exact code changes
● author of modifications
● class complexity
● code metrics
Our new team member. Overlord
● WebHooks
● java.util.concurrent.ScheduledExecutorService
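The scheduler entry point could look roughly like the sketch below. It only shows how `ScheduledExecutorService` runs a periodic job; the class name `OverlordScheduler`, the `runPeriodically` helper and the printed message are assumptions for illustration, and the webhook side is left out.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical helper; the real bot's scheduling code is not shown in the slides.
final class OverlordScheduler {

    // Runs `task` every `periodMillis` until it has run `runs` times,
    // then stops the scheduler and returns the number of completed runs.
    static int runPeriodically(Runnable task, long periodMillis, int runs) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        AtomicInteger completed = new AtomicInteger();
        CountDownLatch done = new CountDownLatch(runs);

        scheduler.scheduleAtFixedRate(() -> {
            if (done.getCount() == 0) {
                return; // ignore a tick that fires after the last requested run
            }
            try {
                task.run();
            } catch (RuntimeException e) {
                // An uncaught exception would silently cancel all future runs.
                System.err.println("periodic task failed: " + e);
            }
            completed.incrementAndGet();
            done.countDown();
        }, 0, periodMillis, TimeUnit.MILLISECONDS);

        try {
            done.await();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for the caller
        } finally {
            scheduler.shutdownNow();
        }
        return completed.get();
    }

    public static void main(String[] args) {
        runPeriodically(() -> System.out.println("checking for stale merge requests"), 100, 3);
    }
}
```

A long-running service would keep the scheduler alive instead of stopping after a fixed number of runs; the run limit here only keeps the example finite.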
Improve cross references between tools
● Notifies about a missing ticket key in the MR title
● Fills MR with information from Jira
● Fixes common mistakes in MR creation
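The missing-ticket-key check can be sketched as a regular-expression match on the MR title. The key shape used here (uppercase project prefix, dash, number, such as `APP-123`) is an assumption about how keys usually look in Jira, and `TicketKeyCheck` is a hypothetical name.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; the key pattern is an assumption, adjust it to your project keys.
final class TicketKeyCheck {
    // Uppercase letter, then uppercase letters or digits, a dash, then digits.
    private static final Pattern KEY = Pattern.compile("\\b[A-Z][A-Z0-9]+-\\d+\\b");

    // Returns the first ticket key found in the title, or empty if there is none.
    static Optional<String> findTicketKey(String mrTitle) {
        Matcher m = KEY.matcher(mrTitle);
        return m.find() ? Optional.of(m.group()) : Optional.empty();
    }

    public static void main(String[] args) {
        System.out.println(findTicketKey("APP-123 Fix login redirect"));
        System.out.println(findTicketKey("Fix login redirect"));
    }
}
```

When the result is empty, the bot would post the notification; when a key is found, it can be used to look up the ticket and fill in the MR description.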
Propose the best reviewers based on MR changeset
● Who has previously edited the touched lines of code
● Who has written more code than others in the touched files
● Who is the team lead / owner of the service / package
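One way to combine the three signals above is a weighted score per candidate. The sketch below is only an illustration: the weights (3.0, 0.1, 5.0), the class name `ReviewerProposer` and the input maps are assumptions, not values from the talk.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical scoring; the weights are arbitrary and would need tuning.
final class ReviewerProposer {

    // touchedLineAuthors: person -> number of touched lines they last edited
    // fileLineAuthors:    person -> number of lines they wrote in the touched files
    // owners:             team leads / owners of the affected service or package
    static List<String> propose(String mrAuthor,
                                Map<String, Integer> touchedLineAuthors,
                                Map<String, Integer> fileLineAuthors,
                                Set<String> owners,
                                int limit) {
        Map<String, Double> score = new HashMap<>();
        touchedLineAuthors.forEach((who, n) -> score.merge(who, 3.0 * n, Double::sum));
        fileLineAuthors.forEach((who, n) -> score.merge(who, 0.1 * n, Double::sum));
        owners.forEach(who -> score.merge(who, 5.0, Double::sum));
        score.remove(mrAuthor); // authors do not review their own MR

        List<String> ranked = new ArrayList<>(score.keySet());
        // Highest score first; ties are broken alphabetically for stable output.
        ranked.sort(Comparator.comparingDouble((String who) -> -score.get(who))
                              .thenComparing(Comparator.<String>naturalOrder()));
        return new ArrayList<>(ranked.subList(0, Math.min(limit, ranked.size())));
    }

    public static void main(String[] args) {
        System.out.println(propose("john",
            Map.of("bob", 4, "john", 1),
            Map.of("john", 30, "ann", 10),
            Set.of("ann"),
            2));
    }
}
```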
Task updates according to the workflow
● Transitions the task status
● Assigns the proper person for the next step
● Marks whether the task has SQL changes
● Adds a label with the branch the MR was merged into
Check that MR has 2 upvotes before merging
● Checks that the rules are followed
● Notifies the Team Lead / Dev manager about any violation
● Pushes the author to ask colleagues to look at their masterpiece
Other automated processes
● Notifies the author about an old MR without any reactions
● Notifies the assignee that the MR can be merged
● Notifies you if you have lots of “In Progress” tickets or none at all
● Provides a list of tasks merged into a particular branch
Now we have got all the data
Algorithm of metrics collection
● Export all tasks from Jira to an in-memory dictionary
● For each commit, run a backtrace to mark it as buggy, fixing or regular
● Collect all meaningful data about the commit:
○ Month of year, day of week, hour of day, who, how many lines and files, which classes and packages, class complexity and number of notices, how long the task has been in progress
● Write a line with the data to an Attribute-Relation File Format (ARFF) file
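The last step can be sketched as follows. ARFF is a plain-text format: a `@relation` line, one `@attribute` line per feature (numeric, or a nominal set in braces), then `@data` with one comma-separated row per instance. The attribute names and the `CommitRow` record are assumptions that cover only part of the feature list, and the sketch assumes author names need no quoting.

```java
import java.util.List;

// Hypothetical feature row; the real feature set in the talk is larger.
final class ArffSketch {
    record CommitRow(int month, String dayOfWeek, int hour, String author,
                     int linesChanged, int filesChanged, boolean buggy) {}

    // Builds an ARFF document: header with attribute declarations, then one data line per commit.
    static String toArff(List<CommitRow> rows, List<String> authors) {
        StringBuilder out = new StringBuilder();
        out.append("@relation commits\n\n");
        out.append("@attribute month numeric\n");
        out.append("@attribute day_of_week {Mon,Tue,Wed,Thu,Fri,Sat,Sun}\n");
        out.append("@attribute hour numeric\n");
        // Nominal attribute: every author that occurs in the data must be listed.
        out.append("@attribute author {").append(String.join(",", authors)).append("}\n");
        out.append("@attribute lines_changed numeric\n");
        out.append("@attribute files_changed numeric\n");
        out.append("@attribute buggy {true,false}\n\n");
        out.append("@data\n");
        for (CommitRow r : rows) {
            out.append(r.month()).append(',')
               .append(r.dayOfWeek()).append(',')
               .append(r.hour()).append(',')
               .append(r.author()).append(',')
               .append(r.linesChanged()).append(',')
               .append(r.filesChanged()).append(',')
               .append(r.buggy()).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        List<CommitRow> rows = List.of(
            new CommitRow(3, "Fri", 18, "Bob", 420, 12, true),
            new CommitRow(3, "Mon", 11, "John", 15, 1, false));
        System.out.print(toArff(rows, List.of("Bob", "John")));
    }
}
```

The class attribute (`buggy`) is placed last because that is the position most WEKA tools assume by default when no class index is given.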
Getting educated. WEKA
WEKA (Waikato Environment for Knowledge Analysis) is a suite of machine learning software written in Java, developed at the University of Waikato, New Zealand.
● Parsers
● Classifiers
● Training/test splits
WEKA challenges
● Convert your data to corresponding vectors
● Choose proper data transformers
● Select and tweak desired Classifiers
● Run experiments and adjust your settings
Good materials about WEKA for beginners:
● How to Run Your First Classifier in Weka
● Data mining with WEKA, Part 2. Classification and clustering
● Document Classification using WEKA
Decision Tree
Results are easy to interpret
Any data can be fed to the method
Works with scalars and intervals
Decision Tree
[Figure: example decision tree. Questions: "Author is John?", "Author is Bob?", "Changed less than 300 lines?", "Changed more than 50 lines?", "Is it Friday?". Leaves: "Has no bugs :)" and "Has a bug :(".]
● John never has bugs!
● Everybody except John and Bob has bugs on Friday.
● Bob has bugs only if he changed more than 300 lines of code.
Decision Tree
The simplest method for building a tree is ID3 (Iterative Dichotomiser 3*).
Build steps:
● Find the attribute whose split gives the lowest entropy (that is, the largest information gain)
● Split the data set by that attribute
● Recursively build a tree for each of the subsets
* the fates of ID1 and ID2 are lost to history
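The first build step can be sketched as below: compute the entropy of the bug/no-bug labels, then pick the attribute whose split leaves the least weighted entropy. This is a minimal illustration for a boolean class and nominal attributes; `Id3Sketch` and its method names are hypothetical, and the recursive tree construction is left out.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper showing only the attribute-selection step of ID3.
final class Id3Sketch {

    // Shannon entropy in bits of a two-class sample.
    static double entropy(int positives, int negatives) {
        int total = positives + negatives;
        if (total == 0) {
            return 0.0;
        }
        double h = 0.0;
        for (int count : new int[] {positives, negatives}) {
            if (count == 0) {
                continue; // 0 * log(0) is treated as 0
            }
            double p = (double) count / total;
            h -= p * Math.log(p) / Math.log(2);
        }
        return h;
    }

    // Entropy before the split minus the weighted entropy of the subsets after it.
    static double informationGain(List<String[]> rows, List<Boolean> buggy, int attr) {
        Map<String, int[]> counts = new HashMap<>(); // value -> {buggy, clean}
        int pos = 0;
        int neg = 0;
        for (int i = 0; i < rows.size(); i++) {
            int[] c = counts.computeIfAbsent(rows.get(i)[attr], k -> new int[2]);
            if (buggy.get(i)) {
                c[0]++;
                pos++;
            } else {
                c[1]++;
                neg++;
            }
        }
        double remainder = 0.0;
        for (int[] c : counts.values()) {
            remainder += (double) (c[0] + c[1]) / rows.size() * entropy(c[0], c[1]);
        }
        return entropy(pos, neg) - remainder;
    }

    // Index of the attribute with the largest information gain.
    static int bestAttribute(List<String[]> rows, List<Boolean> buggy, int attributeCount) {
        int best = 0;
        for (int a = 1; a < attributeCount; a++) {
            if (informationGain(rows, buggy, a) > informationGain(rows, buggy, best)) {
                best = a;
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Columns: author, day of week. In this toy data only the author predicts bugs.
        List<String[]> rows = List.of(
            new String[] {"John", "Fri"}, new String[] {"John", "Mon"},
            new String[] {"Bob", "Fri"}, new String[] {"Bob", "Mon"});
        List<Boolean> buggy = List.of(false, false, true, true);
        System.out.println("best attribute index: " + bestAttribute(rows, buggy, 2));
    }
}
```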
Naive Bayes classifier
≈80% accuracy*
Simple implementation
Easy to understand
Naive Bayes classifier
30% of all commits with bugs were made by Bob: P(Bob|bug) = 0.3
10% of all commits without bugs were made by Bob: P(Bob|~bug) = 0.1
40% of all commits have bugs: P(bug) = 0.4
60% of all commits have no bugs: P(~bug) = 0.6
What is the probability that the next commit from Bob will have a bug? P(bug|Bob) = ?
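The question can be answered directly with Bayes' theorem. The worked calculation below uses only the four numbers given on the slide; with a single feature (the author) no independence assumption is needed yet.

```latex
P(\text{bug}\mid\text{Bob})
  = \frac{P(\text{Bob}\mid\text{bug})\,P(\text{bug})}
         {P(\text{Bob}\mid\text{bug})\,P(\text{bug}) + P(\text{Bob}\mid\neg\text{bug})\,P(\neg\text{bug})}
  = \frac{0.3 \cdot 0.4}{0.3 \cdot 0.4 + 0.1 \cdot 0.6}
  = \frac{0.12}{0.18}
  \approx 0.67
```

So under these numbers roughly two out of three of Bob's commits would be expected to contain a bug, compared with the 40% base rate.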
Output results example (Bayes)
Correctly Classified Instances       14381      77.4755 %
Incorrectly Classified Instances      4181      22.5245 %
Kappa statistic                          0.3085
Mean absolute error                      0.2637
Root mean squared error                  0.3963

=== Detailed Accuracy By Class ===

               TP Rate   FP Rate   Precision   Recall   F-Measure   ROC Area   Class
                 0.856     0.544     0.861      0.856     0.858       0.761     false
                 0.456     0.144     0.444      0.456     0.45        0.761     true
Weighted Avg.    0.775     0.463     0.777      0.775     0.776       0.761

=== Confusion Matrix ===

     a     b   <-- classified as
 12670  2140 |     a = false
  2041  1711 |     b = true
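The summary figures follow from the confusion matrix, which the sketch below recomputes with the standard definitions of accuracy, precision, recall, F-measure and Cohen's kappa. The class `ConfusionMatrixMetrics` is a hypothetical helper, not part of WEKA; fed with the four counts above, it should reproduce the reported 77.4755 %, 0.444, 0.456, 0.45 and 0.3085 up to rounding.

```java
// Hypothetical helper; "positive" means class "true" (the commit has a bug).
final class ConfusionMatrixMetrics {

    static double accuracy(long tn, long fp, long fn, long tp) {
        return (double) (tn + tp) / (tn + fp + fn + tp);
    }

    // Share of commits predicted buggy that really were buggy.
    static double precision(long tp, long fp) {
        return (double) tp / (tp + fp);
    }

    // Share of really buggy commits that were predicted buggy.
    static double recall(long tp, long fn) {
        return (double) tp / (tp + fn);
    }

    // Harmonic mean of precision and recall.
    static double fMeasure(long tp, long fp, long fn) {
        return 2.0 * tp / (2.0 * tp + fp + fn);
    }

    // Cohen's kappa: agreement beyond what the class frequencies alone would give.
    static double kappa(long tn, long fp, long fn, long tp) {
        double total = tn + fp + fn + tp;
        double observed = (tn + tp) / total;
        double expected = ((double) (tn + fp) * (tn + fn) + (double) (fn + tp) * (fp + tp))
                          / (total * total);
        return (observed - expected) / (1.0 - expected);
    }

    public static void main(String[] args) {
        long tn = 12670, fp = 2140, fn = 2041, tp = 1711; // counts from the matrix above
        System.out.printf("accuracy=%.4f precision=%.3f recall=%.3f f=%.3f kappa=%.4f%n",
            accuracy(tn, fp, fn, tp), precision(tp, fp), recall(tp, fn),
            fMeasure(tp, fp, fn), kappa(tn, fp, fn, tp));
    }
}
```

The numbers show why accuracy alone is misleading here: about 77% of commits are classified correctly, but fewer than half of the buggy commits are caught.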
Output results example (RandomTree)
form < 1
| Registration < 1
| | alexey.tokar@worldapp.com < 1
| | | tpl < 1
| | | | filters < 1
| | | | | frontend@worldapp.com < 1
| | | | | | middlejava@worldapp.com < 1 : false
| | | | | | middlejava@worldapp.com >= 1 : true
| | | | | frontend@worldapp.com >= 1
| | | | | | ObjectDesign < 1 : true
| | | | | | ObjectDesign >= 1 : false
| | | | filters >= 1 : false
| | | tpl >= 1 : true
| | alexey.tokar@worldapp.com >= 1
| | | bundle < 1
| | | | xmail < 1
| | | | | general < 1
| | | | | | dataimport < 1
| | | | | | | oracle < 1 : false
| | | | | | | oracle >= 1 : true
| | | | | | dataimport >= 1 : false
| | | | | general >= 1
| | | | | | filesedited < 2 : false
| | | | | | filesedited >= 2 : true
| | | | xmail >= 1 : false
| | | bundle >= 1 : true
| Registration >= 1 : true
Summary
● we found that certain classes are too complex: almost every change in them ends up with a bug
● some engineers shouldn't open certain packages at all (or at least we should properly educate them)
● there is still plenty of room for improvement (commits that overlap and hide earlier ones, other meaningful features, more accurate code history, etc.)
● It does not show you where an error exists, but it lets you analyze a commit more carefully.
● It was fun! :)
Questions?
Alexey@Tokar.net.ua
VP of Engineering @ WorldAPP

Bug prediction + sdlc automation


Editor's Notes

  • #4: Today I would like to talk about automation engineers and how they can affect overall project progress and quality.
  • #10: Even with all these activities, we still get up to 20 bugs a week in production environments, reported by our customers.
  • #12: Bob changed the function. John sent a commit with the text "fixed". We mark that commit as Fixing. We find the changed lines. We run blame and find when they were introduced. We mark that commit as Buggy.
  • #13: (it is worth mentioning that accuracy is a must-have for future improvements) cross-references
  • #15: Entry points: webhooks and a scheduler. Application design: webhooks from GitLab, ScheduledExecutorService.
  • #28: A classifier based on applying Bayes' theorem with strong (naive) independence assumptions between the features.