IMPACT OF CONTINUOUS
INTEGRATION ON CODE REVIEWS
Mohammad Masudur Rahman, Chanchal K. Roy
Department of Computer Science
University of Saskatchewan, Canada
14th International Conference on Mining Software
Repositories (MSR 2017) (Challenge Track)
Buenos Aires, Argentina
RESEARCH PROBLEM: IMPACT OF
AUTOMATED BUILDS ON CODE REVIEWS
• Automated builds, an important part of CI for commit merging & consistency
• Exponential increase of automated builds over the years with Travis CI.
• Builds & code reviews as interleaving steps in pull-based development
RQ1: Does the status of automated builds influence the code review participation in open source projects?
RQ2: Do frequent automated builds help improve the overall quality of peer code reviews?
RQ3: Can we automatically predict whether an automated build would trigger new code reviews or not?
DATASET & EXPERIMENTAL SETUP
[Figure: dataset selection flow]
MSR Challenge Dataset (3,702K) → filter: is the build triggered by a PR, i.e., gh_is_pr == true? → Our dataset (578K)
• 346K builds (no reviews)
• 232K builds (with code reviews)
Used for RQ1, RQ2, RQ3
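The PR filter on this slide can be sketched as follows. This is a minimal illustration, not the authors' pipeline: only the column name gh_is_pr comes from the slide, while the other column names, the sample rows, and the assumption that the flag is stored as the text "true"/"false" are hypothetical.

```python
import csv
import io


def pr_triggered_builds(rows):
    """Keep only build rows whose gh_is_pr flag is true (case-insensitive text)."""
    return [r for r in rows if str(r.get("gh_is_pr", "")).strip().lower() == "true"]


# Tiny made-up sample standing in for the real dataset (hypothetical columns).
SAMPLE = """build_id,gh_is_pr,status
1,true,passed
2,false,failed
3,TRUE,errored
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
pr_builds = pr_triggered_builds(rows)
print(len(pr_builds), "of", len(rows), "builds were triggered by a PR")
```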
ANSWERING RQ1: BUILD STATUS &
CODE REVIEW PARTICIPATION
Build Status    Build Only    Builds + Reviews      Total
Canceled             2,616               1,368      3,984
Errored             51,729              27,262     78,991
Failed              55,546              39,025     94,571
Passed             236,573             164,174    400,747
All                346,464       231,829 (40%)    578,293
• 578K PR-based builds
• Four build statuses
• 232K (40%) build entries with code reviews.
• Chi-squared tests (p-value=2.2e-16<0.05)
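A chi-squared test of independence on the counts above can be sketched with the standard library alone. The slide does not say which tool or exact test variant was used, so this is an illustrative Pearson test without continuity correction; the function names are hypothetical, and the closed-form tail probability is valid only for 3 degrees of freedom.

```python
import math

# Counts from the table above: (build only, builds + reviews) per build status.
observed = {
    "Canceled": (2616, 1368),
    "Errored": (51729, 27262),
    "Failed": (55546, 39025),
    "Passed": (236573, 164174),
}


def chi_squared(table):
    """Pearson chi-squared statistic and degrees of freedom for an r x c table."""
    rows = list(table.values())
    row_totals = [sum(r) for r in rows]
    col_totals = [sum(c) for c in zip(*rows)]
    grand = sum(row_totals)
    stat = 0.0
    for row, row_total in zip(rows, row_totals):
        for obs, col_total in zip(row, col_totals):
            expected = row_total * col_total / grand
            stat += (obs - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(col_totals) - 1)
    return stat, dof


def chi2_sf_dof3(x):
    """Upper-tail probability of a chi-squared variable with exactly 3 d.o.f."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


stat, dof = chi_squared(observed)
p_value = chi2_sf_dof3(stat)  # may underflow to 0.0 for very large statistics
print(f"chi2 = {stat:.1f}, dof = {dof}, significant at 0.05: {p_value < 0.05}")
```

A tail probability that underflows to zero is consistent with a reported p-value at the lower limit of what statistics packages print.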
ANSWERING RQ1: BUILD STATUS &
CODE REVIEW PARTICIPATION
                         #PRs with Review Comments
Previous Build Status    Only Added↑    Only Removed↓    Total Changed↑↓
Canceled                          20               24                 65
Errored                          510              265                812
Failed                         1,542              826              2,316
Passed                         4,235            1,788              5,677
All                            6,307            2,903        8,870 (28%)
• 31,648 PRs for 232K entries from 1000+ projects
• For 28% of the PRs, the number of review comments changed.
• Passed builds triggered new reviews in 18% of the PRs.
• Errored + failed builds triggered new reviews in 10% of the PRs.
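The percentages above can be reproduced from the "Total Changed" column. The slide does not state the denominator; the sketch below assumes it is the 31,648 PRs, which matches the reported figures after rounding.

```python
# "Total Changed" counts per previous build status, taken from the table above.
changed = {"Canceled": 65, "Errored": 812, "Failed": 2316, "Passed": 5677}
total_prs = 31648  # assumed denominator: all PRs in the reviewed subset


def share(count, total):
    """Percentage of `total` represented by `count`."""
    return 100 * count / total


all_changed = sum(changed.values())
passed_share = share(changed["Passed"], total_prs)
err_fail_share = share(changed["Errored"] + changed["Failed"], total_prs)

print(f"changed: {share(all_changed, total_prs):.1f}%")
print(f"after passed builds: {passed_share:.1f}%")
print(f"after errored + failed builds: {err_fail_share:.1f}%")
```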
ANSWERING RQ2: BUILD FREQUENCY &
CODE REVIEW QUALITY
            Issue Comments           PR Comments              All Review Comments
Quantile    M      p-value   ∆       M      p-value   ∆       M      p-value   ∆
Q1          0.60   <0.001*   0.35    0.24   <0.001*   0.49    0.84   <0.001*   0.41
Q4          0.99                     0.52                     1.50
M = Mean #review comments, * = Statistically significant, ∆ = Cliff’s Delta (p-value and ∆ span both rows, Q1 vs. Q4)
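Cliff's Delta, the effect size reported above, measures how often values in one group exceed values in the other. A minimal sketch of the standard definition follows; the sample lists are made up for illustration and are not the study's data.

```python
def cliffs_delta(xs, ys):
    """Cliff's delta: (#pairs with x > y minus #pairs with x < y) / (|xs| * |ys|)."""
    greater = sum(1 for x in xs for y in ys if x > y)
    less = sum(1 for x in xs for y in ys if x < y)
    return (greater - less) / (len(xs) * len(ys))


# Hypothetical per-build comment counts for two groups of projects.
frequent = [2, 3, 4]
infrequent = [1, 2, 3]

delta = cliffs_delta(frequent, infrequent)
print(f"Cliff's delta = {delta:.3f}")
```

The value ranges from -1 to 1; 0 means neither group tends to exceed the other, and swapping the groups flips the sign.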
ANSWERING RQ2: BUILD FREQUENCY &
CODE REVIEW QUALITY
• 5 projects from Q1 and 5 from Q4, each 3-4 years old
• Cumulative #review comments per build over 48 months
• Code review quality (i.e., #comments) improved almost linearly for frequently built projects
• The less frequently built counterparts showed no such trend; their curves zigzag.
ANSWERING RQ3: PREDICTION OF NEW
CODE REVIEW TRIGGERING
                                          New Review Triggered?
Learning Algorithm     Overall Accuracy   Precision    Recall
Naïve Bayes            58.03%             68.70%       29.50%
Logistic Regression    60.56%             64.50%       47.00%
J48                    64.04%             69.50%       50.10%
• Features: build status, code change statistics, test change statistics, and code review comments. Response: new review triggered or unchanged.
• Three ML algorithms with 10-fold cross-validation.
• 26.5K build entries as balanced dataset.
• J48 performed the best: 64% accuracy, 69.50% precision & 50% recall.
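The three metrics in the table are linked when the dataset is balanced. The sketch below assumes an exact 50/50 class split and that precision and recall refer to the "new review triggered" class; under those assumptions, the accuracy implied by each precision/recall pair agrees with the reported accuracy to within rounding.

```python
def accuracy_from_pr(precision, recall):
    """Accuracy implied by precision/recall when both classes are equally sized."""
    tp = recall * 0.5                      # true positives, as a fraction of all rows
    fp = tp * (1 - precision) / precision  # false positives implied by the precision
    tn = 0.5 - fp                          # remaining negatives are true negatives
    return tp + tn


# (precision, recall, reported accuracy) from the table above.
reported = {
    "Naive Bayes": (0.6870, 0.2950, 0.5803),
    "Logistic Regression": (0.6450, 0.4700, 0.6056),
    "J48": (0.6950, 0.5010, 0.6404),
}

implied = {name: accuracy_from_pr(p, r) for name, (p, r, _) in reported.items()}
for name, (_, _, acc) in reported.items():
    print(f"{name}: implied {implied[name]:.4f}, reported {acc:.4f}")
```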
TAKE-HOME MESSAGES
• Automated builds might influence manual code reviews, since the two interleave in modern pull-based development.
• Passed builds are more strongly associated with review participation and with new code reviews.
• Frequently built projects received more review comments than less frequently built ones.
• Code review activity is steady over time for frequently built projects; this does not hold for their counterparts.
• Our prediction model can predict whether a build will trigger a new code review or not.
THANK YOU!! QUESTIONS?
Email: chanchal.roy@usask.ca or
masud.rahman@usask.ca
