MINE YOUR OWN CODE
vs
How do I know what to refactor?
BACKGROUND
•  Married for 25+ years
•  Working as software developer/architect for 25 years
•  Weighs 20+ kg more
•  Will work 15(+/-?) more years trying to answer that question
EXTENDA
HOW SOFTWARE EVOLVES OVER TIME
Time
Very often becomes
POTENTIAL INDICATORS
  Code Decay
  Lack of design patterns
  Architecture violations
Code Smell
  God Class
  Data Class
  Code Clones
  Tradition Breaker
  Intensive Coupling
  …
SHOW ME THE NUMBERS
I want to refactor the code!
Why? What’s the business value? Show me the numbers!
?
FIND THE HOTSPOTS
  Problems
  Degraded velocity
  Long and breaking builds
  Increasing size and complexity
  SLA issues
  Diagnose/Action
  Test
  Code Review
  Refactor
STATIC CODE ANALYSIS
WHERE TO START?
LARGE/GOD CLASS
ATFD > 5
Access to foreign data
WMC > 46
Weighted method count
TCC < 0.33
Tight class cohesion
God Class
RESEARCH BASED APPROACH
MINING SOFTWARE REPOSITORIES
  Uses other sources as well
  Version Control Systems (Git, Mercurial, Perforce, …)
  Incident Systems (Jira, Bugtracker, ALM, …)
  Communication Platforms (Stack Overflow, Intranets, …)
  Build Servers (Jenkins, TeamCity, GO, …)
  Review Tools (Swarm, …)
  Organization Schemas
  …
  Adds time aspect
Time
MINING SOFTWARE REPOSITORIES
  Mining software repositories gives us technical and social/organizational information that we can’t derive from a snapshot.
GOD CLASS
ATFD > 5
Access to foreign data
WMC > 46
Weighted method count
TCC < 0.33
Tight class cohesion
God Class
God classes are 4-17 times more defect prone
God classes are 5-7 times more change prone
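The detection rule on this slide can be sketched in a few lines. This is a minimal sketch, not the output of any particular analysis tool: the `ClassMetrics` type and the sample class names are hypothetical, the ATFD and WMC thresholds are taken from the slide, and the cohesion test assumes the Lanza/Marinescu reading in which a God Class has low cohesion (TCC below one third).

```python
from dataclasses import dataclass

# Thresholds from the slide. The cohesion check assumes "low cohesion",
# i.e. TCC below one third, as in Lanza/Marinescu.
ATFD_FEW = 5
WMC_VERY_HIGH = 46
TCC_ONE_THIRD = 1 / 3


@dataclass
class ClassMetrics:
    """Hypothetical container for metrics exported by a static analysis tool."""
    name: str
    atfd: int    # Access To Foreign Data
    wmc: int     # Weighted Method Count
    tcc: float   # Tight Class Cohesion, between 0.0 and 1.0


def is_god_class(m: ClassMetrics) -> bool:
    """All three conditions must hold for a class to be flagged."""
    return (
        m.atfd > ATFD_FEW
        and m.wmc > WMC_VERY_HIGH
        and m.tcc < TCC_ONE_THIRD
    )


if __name__ == "__main__":
    candidates = [
        ClassMetrics("OrderManager", atfd=12, wmc=80, tcc=0.10),
        ClassMetrics("Money", atfd=0, wmc=8, tcc=0.90),
    ]
    for c in candidates:
        print(c.name, is_god_class(c))
```

A class that is large and reaches into foreign data but is still cohesive is not flagged, which is why the three conditions are combined with AND.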
DESIGN/TECHNICAL DEBT
  Finding the sweetspot
  When does the cost of maintenance exceed the cost to refactor?
– Value of debt (how much is it going to cost to fix it?)
– Interest rate (how much does it slow down development?)
– Probability (what is the chance that the debt affects productivity?)
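One way to put numbers on the three factors above is a simple break-even estimate. The linear model below is an assumption made for illustration, not something the talk prescribes: it treats the expected cost of keeping the debt as interest times probability per period, and asks after how many periods that exceeds the one-off cost of fixing it. The function name and the units (hours, sprints) are hypothetical.

```python
from typing import Optional


def periods_to_break_even(
    value_of_debt: float,
    interest_per_period: float,
    probability: float,
) -> Optional[float]:
    """Number of periods after which keeping the debt costs more than fixing it.

    value_of_debt:        one-off cost to fix, e.g. in hours
    interest_per_period:  extra effort per period when the debt bites, same unit
    probability:          chance (0.0 to 1.0) that the debt bites in a period
    """
    expected_interest = interest_per_period * probability
    if expected_interest <= 0:
        return None  # the debt never pays back a refactoring
    return value_of_debt / expected_interest


if __name__ == "__main__":
    # 80 hours to fix, 10 hours lost per sprint, hit in every second sprint.
    print(periods_to_break_even(80, 10, 0.5))
```

With these sample figures the refactoring pays for itself after 16 sprints; if the product is expected to live shorter than that, the numbers argue against refactoring.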
CODE CHURN
CODE CHURN
  Research has shown that frequent changes to complex code
generally indicate declining quality
  The number of times code changes is a better predictor of defects
than pure size
  Modules that change frequently are linked to maintenance problems
and low quality (An Empirical Study on the Impact of Duplicate Code)
  Including a measure of change in the prior release is an essential
component of our fault prediction method. Individually, counts of adds
and modifications outperform counts of deletes, while the sum of all
three counts was most effective (Does Measuring Code Change
Improve Fault Prediction?)
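A rough churn count can be derived directly from a version control log. The sketch below assumes the text produced by `git log --numstat` with a one-line commit header, where each file line has the form `added<TAB>deleted<TAB>path`. Git reports only added and deleted lines, so a modification shows up in both columns. The function name and the sample paths are hypothetical.

```python
from collections import defaultdict


def churn_per_file(numstat_log: str) -> dict:
    """Sum added lines, deleted lines and revisions per file."""
    totals = defaultdict(lambda: {"added": 0, "deleted": 0, "revisions": 0})
    for line in numstat_log.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # commit header or blank separator line
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # git prints "-" for binary files
        entry = totals[path]
        entry["added"] += int(added)
        entry["deleted"] += int(deleted)
        entry["revisions"] += 1
    return dict(totals)


if __name__ == "__main__":
    sample = (
        "--a1b2c3--2015-03-01--Alice\n"
        "10\t2\tsrc/Pos.java\n"
        "3\t0\tsrc/Cart.java\n"
        "\n"
        "--d4e5f6--2015-03-02--Bob\n"
        "5\t5\tsrc/Pos.java\n"
        "-\t-\tlogo.png\n"
    )
    for path, stats in churn_per_file(sample).items():
        print(path, stats)
```

Sorting the result by `added + deleted` or by `revisions` gives a first list of files worth a closer look.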
CODE MAAT
Command line tool to analyse VCS (Git, Mercurial, Subversion,
Team Foundation Server, Perforce)
  Input : VCS log file for the last X days/months/year(s)
  Output :
File Statistics (Number of files, age, …)
Organizational Metrics (number of authors, code ownership, …)
Coupling
Code Churn
CODE MAAT
Code Maat
DEMO – CODE CHURN
Code Maat
https://github.com/adamtornhill/code-maat
Docker image https://github.com/peternorrhall/code-maat
DEMO EMPEAR
Code Maat +
Hotspots
  Settings/Filtering and Visualization
Performance
Only supports Git
EXTENDA - MSR
Time
Changelist/Files
Job
Defect
Requirement
Refactoring
Code Analysis
Categorisation
EXTENDA MSR
Pentaho
Data Integration
FINDINGS
  New module Self Checkout Client (device integration)
  A lot of development in 2014
  A lot of defects and refactorings in 2015 for the files with the highest code churn and complexity, in accordance with the result in Empear
  XML complexity as well
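Combining churn with complexity, as in the findings above, can be sketched as a simple ranking. This is an illustrative scoring scheme and not the algorithm used by Empear or Code Maat: it assumes a revision count per file and some complexity figure per file (lines of code is a common stand-in), normalizes both to the range 0 to 1, and multiplies them. The function name and file names are hypothetical.

```python
def rank_hotspots(revisions: dict, complexity: dict, top: int = 10) -> list:
    """Rank files that are both frequently changed and complex.

    revisions:  path -> number of commits touching the file
    complexity: path -> complexity figure, e.g. lines of code (assumption)
    """
    max_rev = max(revisions.values(), default=0) or 1
    max_cx = max(complexity.values(), default=0) or 1
    scored = []
    for path, revs in revisions.items():
        if path not in complexity:
            continue  # file deleted or not measured
        score = (revs / max_rev) * (complexity[path] / max_cx)
        scored.append((path, round(score, 3)))
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored[:top]


if __name__ == "__main__":
    revs = {"ScoClient.java": 10, "Device.java": 5, "Util.java": 1}
    loc = {"ScoClient.java": 100, "Device.java": 400, "Util.java": 50}
    for path, score in rank_hotspots(revs, loc):
        print(path, score)
```

A file only scores high when both factors are high, so a large but stable file and a small but busy file both drop down the list.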
STREAMS
  Task streams for larger work
Purpose stable main
main
dev
@
@
task
TEMPORAL COUPLING
Static code dependencies (Structure 101 on the Spring project)
TEMPORAL COUPLING
TestClassA
ClassA
ClassB
Research
•  Change coupling points to architectural weakness
•  Hotspots of refactoring candidates
•  Helps comprehension of system modularization
•  Spotting of misplaced components
•  Correlates with defects (in some cases)
Module A
Module B
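Temporal (change) coupling can be computed from nothing more than the list of files in each commit. The sketch below is one common formulation, stated here as an assumption: the degree of coupling is the number of shared commits divided by the average number of revisions of the two files, as a percentage. The function name, the `min_shared` cut-off and the sample commits are hypothetical.

```python
from collections import Counter
from itertools import combinations


def temporal_coupling(commits: list, min_shared: int = 2) -> list:
    """Return (file_a, file_b, shared_commits, degree_percent) tuples.

    commits: one list of changed file paths per commit.
    """
    revisions = Counter()
    shared = Counter()
    for files in commits:
        unique = sorted(set(files))
        revisions.update(unique)
        shared.update(combinations(unique, 2))

    result = []
    for (a, b), count in shared.items():
        if count < min_shared:
            continue  # too few co-changes to be more than coincidence
        average_revs = (revisions[a] + revisions[b]) / 2
        degree = round(100 * count / average_revs)
        result.append((a, b, count, degree))
    result.sort(key=lambda r: (-r[3], r[0], r[1]))
    return result


if __name__ == "__main__":
    history = [
        ["ClassA", "TestClassA"],
        ["ClassA", "TestClassA", "ClassB"],
        ["ClassA", "ClassB"],
        ["ClassB"],
    ]
    for row in temporal_coupling(history):
        print(row)
```

Coupling between a class and its test is expected; coupling between files in different modules with no static dependency is the kind of result worth investigating.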
DEMO - TEMPORAL COUPLING
EMPEAR
GRAPHVIZ
TEMPORAL COUPLING – USE CASES
Find patterns (.properties should be changed together)
Find hidden dependencies (modules)
  Lack of unit tests or too high velocity of unit tests
Interesting to see how it changes over time
ORGANIZATION AND OWNERSHIP
Time
Ownership where person is about to leave or has left + Age of code
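The ownership question above can be approximated from commit data. This sketch makes several assumptions that are not from the talk: the owner of a file is the author who added the most lines, the age of the code is the number of days since its last change, and the input is a list of `(path, author, lines_added, iso_date)` tuples. All names are hypothetical.

```python
from collections import defaultdict
from datetime import date


def ownership_risk(changes: list, leaving: set, today: date) -> list:
    """List files whose main author is leaving, oldest code first.

    Returns (path, owner, ownership_share, age_in_days) tuples.
    """
    added = defaultdict(lambda: defaultdict(int))
    last_change = {}
    for path, author, lines, day in changes:
        added[path][author] += lines
        changed_on = date.fromisoformat(day)
        if path not in last_change or changed_on > last_change[path]:
            last_change[path] = changed_on

    report = []
    for path, by_author in added.items():
        owner, owned = max(by_author.items(), key=lambda kv: kv[1])
        if owner not in leaving:
            continue
        total = sum(by_author.values()) or 1  # avoid division by zero
        age_days = (today - last_change[path]).days
        report.append((path, owner, round(owned / total, 2), age_days))
    report.sort(key=lambda r: -r[3])
    return report


if __name__ == "__main__":
    log = [
        ("Pos.java", "alice", 90, "2014-05-01"),
        ("Pos.java", "bob", 10, "2015-01-10"),
        ("Cart.java", "bob", 50, "2015-06-01"),
    ]
    print(ownership_risk(log, leaving={"alice"}, today=date(2016, 1, 10)))
```

Old code with a single dominant author who is about to leave is where knowledge transfer is most urgent.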
WHAT IS YOUR BUSINESS CASE?
  Do you need to care about it in the first place?
  How long will your product/system live?
  Extenda 20+ years for some of our products
  Data scientists spend most of their time cleaning data
Remove ”Build user”, Streams, …
What type of commit – defect/refactor/new feature (explicit labeling
works well)
Finding the False Positives
Use the metrics you have
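The cleaning steps above, dropping automated users and classifying commits by explicit labels, can be sketched as follows. The label convention (`defect:`, `refactor:`, `feature:` at the start of the commit message, optionally in brackets) is a hypothetical example, not the convention used at Extenda, and the function name is illustrative.

```python
import re

# Authors that are automation rather than people (from the slide).
BOT_AUTHORS = {"Build user"}

# Hypothetical labeling convention: "defect: ...", "[refactor] - ...", etc.
LABEL = re.compile(r"^\s*\[?(defect|refactor|feature)\]?\s*[:\-]", re.IGNORECASE)


def classify_commits(commits: list) -> list:
    """Drop bot commits and tag the rest with their explicit label.

    commits: list of (author, message) tuples.
    Returns (author, kind, message) tuples; kind is "unlabeled" when no
    label is found, so that false positives are not silently introduced.
    """
    cleaned = []
    for author, message in commits:
        if author in BOT_AUTHORS:
            continue
        match = LABEL.match(message)
        kind = match.group(1).lower() if match else "unlabeled"
        cleaned.append((author, kind, message))
    return cleaned


if __name__ == "__main__":
    history = [
        ("Build user", "Automated merge"),
        ("alice", "defect: fix null scanner id"),
        ("bob", "[Refactor] - split PaymentService"),
        ("carol", "tweak logging"),
    ]
    for row in classify_commits(history):
        print(row)
```

Keeping an explicit "unlabeled" bucket makes it visible how much of the history cannot be classified, instead of guessing from free text.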
USE AND VISUALIZE YOUR DATA
Free material from www.gapminder.org
THE GOAL
I want to refactor the code!
Why? Show me the numbers!
No problem, Boss!
QUESTIONS
THANK YOU FOR LISTENING!
Please ask or give feedback
  Email : peter.norrhall@extenda.com
LinkedIn : https://www.linkedin.com/in/peternorrhall
Twitter : https://twitter.com/peternorrhall
REFERENCES
  "Making Software, What Really Works, and Why We Believe
It", Oram/Wilson
  "Object-Oriented Metrics in Practice", Lanza/Marinescu
  "Your Code as a Crime Scene", Tornhill
  "Investigating the Impact of Design Debt on Software Quality",
Zazworka/Seaman/Shull/Shaw
  MSR International Conference - http://2016.msrconf.org/
TOOLS
Code Maat - https://github.com/adamtornhill/code-maat
Code Maat Docker Image - https://github.com/peternorrhall/code-maat
Docker - https://www.docker.com/
Empear - http://www.empear.com
Graphviz - http://www.graphviz.org
  Git - https://git-scm.com/
  Git-P4 - https://git-scm.com/docs/git-p4
  MS Excel - https://products.office.com/sv-se/excel
Pentaho - http://community.pentaho.com/
Perforce - https://www.perforce.com/
  R Studio - https://www.rstudio.com/
SonarQube - http://www.sonarqube.org/
  Structure101 - http://structure101.com/

