EMMM: A Unified Meta-Model for
Tracking Machine Learning Experiments
Samuel Idowu, Daniel Strüber, and Thorsten Berger
2021-01-20
Introduction
ML-based software systems vs. traditional software systems
ML experiments
F. Kumeno, “Software engineering challenges for machine learning applications: A literature review,” Intell. Decis. Technol., vol. 13, 2020
A. Arpteg, B. Brinne, L. Crnkovic-Friis, and J. Bosch, “Software Engineering Challenges of Deep Learning,” in 2018 44th Euromicro Conference on Software Engineering and Advanced Applications (SEAA), 2018
C. Hill, R. Bellamy, T. Erickson, and M. Burnett, “Trials and tribulations of developers of intelligent systems: A field study,” in 2016 IEEE Symposium on Visual Languages and Human-Centric Computing (VL/HCC), 2016
Introduction
ML experiments
Characteristics
★ Non-linear
★ Trial and error
★ Exploratory and intuition-based
★ Generates multiple asset versions
Asset Management Approaches
★ Level 1: Use of ad hoc approaches, e.g., dedicated naming conventions for folders and files
★ Level 2: Use of Git / VCSs and dedicated databases
★ Level 3: ML experiment management tools
Experiment management tools
Specialized tools for managing ML-specific assets such as features, hyperparameters, models, and evaluation metrics
★ Examples:
○ MLflow, Neptune, DVC
★ Systematic approach to managing ML asset versions
★ Support various ML experiment concerns
○ E.g., reproducibility, traceability, reusability
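To make the idea of tracking ML-specific assets concrete, here is a minimal sketch of what such a tool records per run. It is not the API of MLflow, Neptune, or DVC: the `Run` and `Tracker` classes, their fields, and the model paths are hypothetical, and the accuracy values are placeholders rather than measured results.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Run:
    """One execution of an experiment together with the assets it produced."""
    run_id: int
    hyperparameters: Dict[str, float]
    metrics: Dict[str, float] = field(default_factory=dict)
    model_path: str = ""


@dataclass
class Tracker:
    """Hypothetical in-memory tracker; real tools persist runs to disk or a server."""
    runs: List[Run] = field(default_factory=list)

    def start_run(self, hyperparameters: Dict[str, float]) -> Run:
        run = Run(run_id=len(self.runs) + 1, hyperparameters=dict(hyperparameters))
        self.runs.append(run)
        return run

    def best_run(self, metric: str) -> Run:
        # Traceability: recover the run (and its hyperparameters) behind the best score.
        candidates = [r for r in self.runs if metric in r.metrics]
        return max(candidates, key=lambda r: r.metrics[metric])


tracker = Tracker()
for lr in (0.1, 0.01):
    run = tracker.start_run({"learning_rate": lr})
    # Placeholder values for illustration only, not real results.
    run.metrics["accuracy"] = 0.80 if lr == 0.1 else 0.85
    run.model_path = f"models/run_{run.run_id}.pkl"

best = tracker.best_run("accuracy")
```

Because every run keeps its hyperparameters, metrics, and model location together, the best-scoring model can be traced back to the configuration that produced it.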
Motivation & Goals
Challenge
Existing tools are not yet mature enough to support large-scale ML-based software development
★ Most of the tools currently target data scientists
★ Less focus on collaboration
★ Current operations for tracked data and assets are very basic
★ Lack of interoperability among existing tools
★ Lack of integration with established SE tools
Goals
★ Establish a unified blueprint of the core structures and relationships in existing tools
★ Useful for tool developers and researchers
★ Work towards domain-specific operations for ML assets
Long-term Goal
Unified and effective ML experiment management tools integrated with traditional software engineering tools such as IDEs and VCSs
Methods
★ Explored the versioning support offered by a number of
experiment management tools.
★ Observed and extracted the ML asset types (structures) they
support and their versioning relationships.
★ We then unified their conceptual structures and relationships
using a meta-model
★ Domain modeling in three phases:
○ Initial design of the meta-model to establish classes and their relationships
○ Refinement of the structure and the class relationships through an iterative process
○ Validation: creating instances of concrete experiments with their revision histories to reveal design flaws and identify improvement opportunities
Idowu, S., Strüber, D., & Berger, T. (2021, May). Asset management in machine learning: a survey. In 2021 IEEE/ACM 43rd International Conference on Software Engineering:
Software Engineering in Practice (ICSE-SEIP) (pp. 51-60). IEEE.
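The three phases can be illustrated with a small sketch: a few classes and relationships, followed by a concrete instance with a revision history of the kind used for validation. The class names (`Asset`, `Revision`, `Experiment`), their fields, and the example experiment are assumptions made for illustration; they are not the classes defined in EMMM, and the history here is linear for brevity even though real experiments are often non-linear.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Asset:
    """A versioned ML asset, e.g. a dataset, model, or hyperparameter set."""
    kind: str
    name: str
    version: int = 1
    derived_from: Optional["Asset"] = None  # versioning relationship


@dataclass
class Revision:
    """One point in an experiment's history and the asset versions it used."""
    label: str
    assets: List[Asset] = field(default_factory=list)
    parent: Optional["Revision"] = None


@dataclass
class Experiment:
    name: str
    revisions: List[Revision] = field(default_factory=list)

    def commit(self, label: str, assets: List[Asset]) -> Revision:
        # The previous revision (if any) becomes the parent of the new one.
        parent = self.revisions[-1] if self.revisions else None
        revision = Revision(label, list(assets), parent)
        self.revisions.append(revision)
        return revision


def new_version(asset: Asset) -> Asset:
    """Create the successor of an asset, keeping a link to its predecessor."""
    return Asset(asset.kind, asset.name, asset.version + 1, derived_from=asset)


def history(revision: Revision) -> List[str]:
    """Walk parent links back to the first revision, oldest label first."""
    labels = []
    current: Optional[Revision] = revision
    while current is not None:
        labels.append(current.label)
        current = current.parent
    return labels[::-1]


# Validation-style instance: a concrete (made-up) experiment with two revisions.
exp = Experiment("churn-prediction")
model_v1 = Asset("model", "classifier")
first = exp.commit("baseline", [model_v1])
model_v2 = new_version(model_v1)
second = exp.commit("tuned", [model_v2])
```

Instantiating the classes this way is what exposes design gaps: if a real experiment needs a relationship the classes cannot express (for example, a revision with two parents), the model has to be refined.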
Result - EMMM
★ Ready-to-use software artifact, formalized in Ecore
★ Usable to facilitate tool development
★ New experiment instances can be created and manipulated
via the meta-model's EMF-generated code and its APIs
What’s next?
Use cases:
★ Enabling interoperability: Tool developers can write import
and export functions towards our meta-model
★ Blueprint for developing new tools: Developers of
tools or extensions could represent the ML-specific information of a
revision history as instances of our meta-model.
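The interoperability use case can be sketched as one import function and one export function that meet at a shared, tool-neutral record. Everything here is hypothetical: `UnifiedRun` merely stands in for a meta-model instance, and "tool A" and "tool B" with their storage formats are invented for the example.

```python
import json
from dataclasses import dataclass, asdict
from typing import Dict


@dataclass
class UnifiedRun:
    """Hypothetical tool-neutral record standing in for a meta-model instance."""
    experiment: str
    parameters: Dict[str, float]
    metrics: Dict[str, float]


def import_from_tool_a(record: dict) -> UnifiedRun:
    # Tool A (made up) stores everything flat with "param." / "metric." key prefixes.
    params = {k[len("param."):]: v for k, v in record.items() if k.startswith("param.")}
    metrics = {k[len("metric."):]: v for k, v in record.items() if k.startswith("metric.")}
    return UnifiedRun(record["experiment"], params, metrics)


def export_to_tool_b(run: UnifiedRun) -> str:
    # Tool B (made up) expects nested JSON with its own field names.
    data = asdict(run)
    return json.dumps(
        {"name": data["experiment"], "config": data["parameters"], "scores": data["metrics"]},
        sort_keys=True,
    )


tool_a_record = {"experiment": "demo", "param.learning_rate": 0.01, "metric.accuracy": 0.9}
unified = import_from_tool_a(tool_a_record)
exported = export_to_tool_b(unified)
```

With a shared intermediate model, each tool needs only one importer and one exporter, rather than a dedicated converter for every pair of tools.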
Future work:
★ Extend the meta-model to make it configurable
○ Not all valid uses require support for the meta-model
in its entirety. Hence, it might be desirable for new tools
to implement support for a subset of the meta-model
based on their specific needs.
★ Unify additional tools proposed in academic research
★ Connect to available MDE tools and services
○ This would make a plethora of MDE work applicable to a new
context in machine learning, e.g., tools for model
analysis, simulation, refactoring, quality assurance,
testing, and many others.
Summary
