Mining software
repository with Pharo
03 July 2025
Nicolas Hlad
with Benoit Verhaeghe & Kilian Bauvent
ESUG2025 - Gdansk
Outline
Introduction
① What is Mining Software Repository?
② What are the specificities of Berger-Levrault regarding MSR?
③ How we develop GitProjectHealth?
④ What can you do with it? (demo)
⑤ What is next?
Section ①
What is Mining Software Repository (MSR)?
Definitions
• Today's collaborative development relies on Git Social Platforms (GSP) [Dabbish et al. 2012].
• They are server implementations of Git with built-in social features.
• They contain valuable historical information about the development of a software project.
Fig - Discussing a Pull Request in GitHub
Fig - Commit distribution over time (GitHub)
Definitions - 2
• Mining Software Repository (MSR) provides methods and tools to extract data from these platforms [Hassan 2008].
• Among other things, it allows us to:
• Study the impact of code smells [Steffen et al. 2010, Palomba et al. 2014]
• Explore developers' code reviews [Bacchelli et al. 2013]
• Predict classes prone to change and defects [Bacchelli et al. 2010]
• Retro-analyse an entire development process [Mockus et al. 2000]
Existing tools
• General data mining
• PyDriller: Python tool for mining commits
• Git-Miner: Pharo tool, based on CLI-GitMiner
• Specific data mining
• Javapers: Java library, mostly for Java file analysis in Git repositories (leveraging Spoon)
• ModelMine: retrieves UML models from a project's artefacts
• Data storage
• Pandora: focuses on long-term storage of Git social platform data
• Software Heritage: archive of Git repositories from GSPs
• Computing metrics
• LinearB: production and deployment metrics, positioned against data from other companies
Section ②
What are the specificities of Berger-Levrault regarding MSR?
Industrial context
A quick word on Berger-Levrault
• Berger-Levrault is
• an international group of software publishers
• with diverse sectors of activity (education, health, public administration, etc.)
• The group has acquired various software publishers over the past 30 years:
• from different countries (France, Spain, Canada, Morocco, etc.);
• working with different technologies (Java, C#, TypeScript, Dart, etc.);
• and different Git Social Platforms (GitLab, Bitbucket, Azure DevOps, etc.).
A quick word on Berger-Levrault - 2
• We use Atlassian's Jira as our project management system to manage:
• tickets (bugs, features, hotfixes, etc.)
• sprints (Agile development)
• releases (delivery dates, software testing, etc.).
Fig - Kanban view of a sprint in Jira
A quick word on Berger-Levrault - 3
• Our Jira and Git Social Platform environments are connected.
Fig - Linking Jira tickets to commit and merge activity in GitLab
The specificities of Berger-Levrault
Git Social Platforms and project management systems
How to mine from different Git Social Platforms (GSP)?
How to implement MSR metrics efficiently?
How to connect GSP data to Jira efficiently?
We use Model-Driven Engineering (MDE) with Pharo and Moose.
Section ③
How we develop our solution with MDE
The conception of GitProjectHealth
GitProjectHealth - MSR with MDE in Pharo
Our MSR solution
GitProjectHealth
GitProjectHealth (GPH) is a framework to extract and analyse data from Git Social Platforms using Model-Driven Engineering (MDE).
It is a tool for general data mining and metrics computing.
• GitProjectHealth's contributions are:
• A unified model for all Git Social Platforms
• A framework to build custom metrics from the model
• The use of a metamodel connector to extend any analysis to other platforms (e.g., Jira)
github.com/moosetechnology/GitProjectHealth
Main features
Key Features:
• Data extraction and import:
Extracts data from major Git Social Platforms.
Imports a model of specific Git entities.
• Visualization and metrics:
Visualizes data and computes metrics.
• Model connection:
Connects models (e.g., GitLab and Jira).
Git Model
Fig - Simplified metamodel of a Git Social Platform in GitProjectHealth
1. Naming decisions
Merge Request vs Pull Request
2. New relations
Commits link to a User, not an author
3. Concepts at fine granularity
Modeling down to the changed line of code
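The three modelling decisions above can be pictured with plain data classes. This is a minimal Python sketch, not the GitProjectHealth metamodel itself (which is defined in Pharo with Moose). Every class and field name below is an assumption made for the illustration, and reading "a User, not an author" as "a platform user entity rather than a free-text author name" is our interpretation of the slide.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class User:
    username: str


@dataclass
class LineChange:
    # Decision 3: the model goes down to a single changed line of code.
    line_number: int
    kind: str  # "added" or "deleted" (hypothetical encoding)
    content: str


@dataclass
class FileDiff:
    path: str
    changes: List[LineChange] = field(default_factory=list)


@dataclass
class Commit:
    sha: str
    message: str
    user: User  # Decision 2: a commit links to a User entity
    diffs: List[FileDiff] = field(default_factory=list)


@dataclass
class MergeRequest:
    # Decision 1: one name covers both merge requests and pull requests.
    title: str
    author: User
    commits: List[Commit] = field(default_factory=list)


def changed_lines(merge_request: MergeRequest) -> int:
    """Count every changed line reachable from a merge request."""
    return sum(
        len(diff.changes)
        for commit in merge_request.commits
        for diff in commit.diffs
    )


alice = User("alice")
commit = Commit(
    sha="abc123",
    message="fix typo",
    user=alice,
    diffs=[
        FileDiff(
            "README.md",
            [LineChange(3, "deleted", "teh"), LineChange(3, "added", "the")],
        )
    ],
)
mr = MergeRequest(title="Fix typo", author=alice, commits=[commit])
print(changed_lines(mr))
```

Because line changes are first-class entities, a metric such as "lines changed per merge request" becomes a simple traversal of the model.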
API and Importer
Fig - APIs and GSP importers related to our Git model
Our APIs are independent open-source projects in Pharo.
Anyone can access them via github/Evref-BL
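One way to picture the role of an importer: each platform API returns its own payload shape, and an importer maps it onto the shared model. The Python sketch below is only an illustration of that idea. The payload dictionaries are simplified, hypothetical shapes, and the function names are invented for the example; they are not the interfaces of the Pharo APIs listed above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class MergeRequest:
    """Platform-neutral entity shared by all importers (hypothetical)."""
    platform: str
    number: int
    title: str
    author: str


def import_gitlab_like(payload: Dict[str, Any]) -> MergeRequest:
    # Assumed shape: {"iid": ..., "title": ..., "author": {"username": ...}}
    return MergeRequest(
        "gitlab", payload["iid"], payload["title"], payload["author"]["username"]
    )


def import_github_like(payload: Dict[str, Any]) -> MergeRequest:
    # Assumed shape: {"number": ..., "title": ..., "user": {"login": ...}}
    return MergeRequest(
        "github", payload["number"], payload["title"], payload["user"]["login"]
    )


IMPORTERS: Dict[str, Callable[[Dict[str, Any]], MergeRequest]] = {
    "gitlab": import_gitlab_like,
    "github": import_github_like,
}


def import_merge_request(platform: str, payload: Dict[str, Any]) -> MergeRequest:
    """Dispatch to the importer registered for the given platform."""
    if platform not in IMPORTERS:
        raise ValueError(f"no importer registered for {platform!r}")
    return IMPORTERS[platform](payload)


from_gitlab = import_merge_request(
    "gitlab", {"iid": 7, "title": "Add metric", "author": {"username": "alice"}}
)
from_github = import_merge_request(
    "github", {"number": 12, "title": "Add metric", "user": {"login": "bob"}}
)
print(from_gitlab)
print(from_github)
```

Code that consumes `MergeRequest` never needs to know which platform produced it, which is the point of keeping the APIs and the model separate.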
Metamodel connection: Jira - Git Model
Fig - The Git model and the Jira model as sub-models of the GPH-Jira model connector (Moose Model)
The connector accesses all the entities and relations of its two sub-models (the Git and Jira models).
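Connection by attribute value
Git model, commits: aGPHCommit with message: "[AV1-5886] fix" and id: "1234567"
Jira model, issue: aJiraIssue with key: "AV1-5886" and timespent: "12 days"
The issue key found in the commit message connects the two entities.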
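Connection by attribute value can be sketched as a lookup of Jira-style keys inside commit messages. This is a minimal Python illustration using the values shown on the slide. The regular expression and the dictionary layout are assumptions made for the example, not the connector's actual Pharo implementation.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Assumed key format: uppercase project prefix, a dash, then digits.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]*-\d+\b")


def link_commits_to_issues(
    commits: List[Dict[str, str]], issues: List[Dict[str, str]]
) -> Dict[str, List[str]]:
    """Map each known issue key to the ids of commits whose message mentions it."""
    known_keys = {issue["key"] for issue in issues}
    links: Dict[str, List[str]] = defaultdict(list)
    for commit in commits:
        # dict.fromkeys removes repeated keys while keeping their order.
        for key in dict.fromkeys(ISSUE_KEY.findall(commit["message"])):
            if key in known_keys:
                links[key].append(commit["id"])
    return dict(links)


commits = [
    {"id": "1234567", "message": "[AV1-5886] fix"},
    {"id": "7654321", "message": "update README"},
]
issues = [{"key": "AV1-5886", "timespent": "12 days"}]

links = link_commits_to_issues(commits, issues)
print(links)
```

Commits without a recognisable key stay unlinked, so the coverage of such a connection depends on how consistently developers reference tickets in their messages.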
Quick demo
Section ④
Our usage of GitProjectHealth
Deploying GPH at Berger-Levrault
Using GPH at Berger-Levrault
Computing MSR Metrics
• We built a metric framework within GPH.
• Metrics are either project-centric or user-centric.
• For each metric,
• it loads entities from a time period (i.e., between two dates)
• it calculates the metric over a time window (i.e., a day, a week, a month, or a year).
• 47 metrics are implemented so far.
Fig - UML representation of the metrics in GitProjectHealth
Fig - Running all metrics in GitProjectHealth from a playground (simplified)
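The two steps described above (load entities from a period, then aggregate per time window) can be sketched as follows. This is a minimal Python illustration with a commit-count metric. The function names and the choice of Monday as the start of a week are assumptions, not the GPH metric framework's API.

```python
from collections import Counter
from datetime import date, timedelta
from typing import Dict, Iterable


def window_start(day: date, window: str) -> date:
    """Return the first day of the window that contains `day`."""
    if window == "day":
        return day
    if window == "week":
        return day - timedelta(days=day.weekday())  # weeks start on Monday
    if window == "month":
        return day.replace(day=1)
    if window == "year":
        return day.replace(month=1, day=1)
    raise ValueError(f"unknown window: {window!r}")


def commits_per_window(
    commit_dates: Iterable[date], since: date, until: date, window: str
) -> Dict[date, int]:
    """Count commits between two dates, grouped by time window."""
    # Step 1: keep only the entities inside the period.
    in_period = [d for d in commit_dates if since <= d <= until]
    # Step 2: aggregate over the chosen window.
    counts = Counter(window_start(d, window) for d in in_period)
    return dict(sorted(counts.items()))


commit_dates = [
    date(2024, 9, 2),
    date(2024, 9, 3),
    date(2024, 9, 10),
    date(2024, 10, 1),  # outside the period below, so it is ignored
]
weekly = commits_per_window(commit_dates, date(2024, 9, 1), date(2024, 9, 30), "week")
monthly = commits_per_window(commit_dates, date(2024, 9, 1), date(2024, 9, 30), "month")
print(weekly)
print(monthly)
```

Separating the period filter from the window aggregation lets the same metric be reported per day, week, month, or year without reloading the data.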
Metrics computed every week (since 2024)
Section ⑤
Conclusion & Perspectives, and end
Conclusion
Recap
Perspectives
Future work
• Addressing limitations
• Ease the difficulty of maintaining a global metamodel by investigating the generation of GSP sub-metamodels from their OpenAPI specifications
• Discuss the purpose of the measures and consider which measures correlate with a healthy project.
• Evaluate GPH against existing tools (PyDriller, Git-Miner, etc.)
• Evolution
• From the GPH model to the source code model: navigating from the repository to the Famix model
• Build usable knowledge maps by detecting parts of the repository that have become unknown to developers.
Resources
Links
GitProjectHealth: https://github.com/moosetechnology/GitProjectHealth
Pharo GitLab API: https://github.com/Evref-BL/Gitlab-Pharo-API
Pharo Bitbucket API: https://github.com/Evref-BL/Bitbucket-Pharo-API
Pharo Jira API: https://github.com/Evref-BL/Jira-Pharo-API
Example using GitProjectHealth:
Heatmaps: https://github.com/Marpioux/Gitlab-HeatMap
Bibliography
[Steffen et al. 2010] Steffen M. Olbrich, Daniela Cruzes, and Dag I. K. Sjøberg. 2010. Are all code smells harmful? A study of God Classes and Brain Classes in the evolution of three open source systems. In 26th IEEE International Conference on Software Maintenance (ICSM 2010), September 12-18, 2010, Timisoara, Romania. 1-10.
[Palomba et al. 2014] Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, and Andrea De Lucia. 2014. Do They Really Smell Bad? A Study on Developers' Perception of Bad Code Smells. In Proc. of the 30th International Conference on Software Maintenance and Evolution. 101-110.
[Bacchelli et al. 2013] Alberto Bacchelli and Christian Bird. 2013. Expectations, outcomes, and challenges of modern code review. In Proc. of the 35th International Conference on Software Engineering. 712-721.
[Bacchelli et al. 2010] Alberto Bacchelli, Marco D'Ambros, and Michele Lanza. 2010. Are popular classes more defect prone? In International Conference on Fundamental Approaches to Software Engineering. Springer, 59-73.
[Mockus et al. 2000] Audris Mockus, Roy T. Fielding, and James Herbsleb. 2000. A case study of open source software development: the Apache server. In Proc. of the 22nd International Conference on Software Engineering. ACM, 263-272.
[Hassan 2008] A. E. Hassan. 2008. The road ahead for Mining Software Repositories. In 2008 Frontiers of Software Maintenance, Beijing, China. 48-57. doi: 10.1109/FOSM.2008.4659248
[Dabbish et al. 2012] L. Dabbish, C. Stuart, J. Tsay, and J. Herbsleb. 2012. Social coding in GitHub: transparency and collaboration in an open software repository. In Proc. of the ACM 2012 Conference on Computer Supported Cooperative Work. 1277-1286.
Appendix
Additional content
Open-Source Analysis (GitHub)
• Analyzed 30 days of activity from Eclipse, Microsoft, and MooseTechnology.
• Analyzed ~4457 entities over 30 days.
• Visualizations: daily commit distributions, user activity
Industrial Case at Berger-Levrault (GitLab)
• Connected the Jira model with the Git model to analyze merge request distributions across different issue types.
• 27% of Merge Requests linked to bug-related Jira issues.
Fig - A user's commit activity by day, during the month of September 2024, on moosetechnology
Fig - Issue occurrences in September 2024 for WeHR
Fig - Commit activity on vscode, around September 2024 (x-axis: dates, from 08/24/2024 to 09/22/2024; y-axis: number of commits)
Example
Sunburst: last activity on the code base
Fig - Sunburst representation of a developer's activity in a project
