Towards a framework for making
applications provenance-aware
Supervisors:
Dr. Beatriz Pérez Valle
Dr. Francisco José García Izquierdo
Author:
Carlos Sáenz Adán
30 October 2019
CONTENT
• Introduction
• Background
• State of the art: a systematic review of provenance systems
• Conceptual definition of UML2PROV
• Implementation of UML2PROV
• Evaluation
• Conclusions and future work
INTRODUCTION
Towards a framework for making applications provenance-aware
Provenance: all the information, comprising the elements and their relationships,
that contributes to the existence of a piece of data.
Some benefits of provenance at a glance:
• Trustworthiness / reliability
• Data quality
• Reproducibility
• Application data analysis
• …
Provenance-aware applications: those applications that have the
functionality to answer questions regarding the provenance they produce.
Examples:
• PASS: a storage system supporting the collection and maintenance of provenance
• PERM: a provenance-aware database middleware
• VisTrails: incorporates provenance into workflow systems
• …
INTRODUCTION – MAIN GOAL
Software engineering addresses the development of software in a systematic way.
Main goal: the definition of an overall framework, coined as UML2PROV, which allows software
engineers to bridge the gap between application design and provenance design.
[Figure: model of the Systems Development Life Cycle (Analysis, Design, Implementation, Testing, Evaluation), extended with provenance design to yield a provenance-aware application]
INTRODUCTION – CONTRIBUTIONS
1. A systematic review of provenance systems
Beatriz Pérez, Julio Rubio, Carlos Sáenz-Adán: A systematic review of provenance systems. Knowledge and Information Systems 57(3): 495-543 (2018)
2. A conceptual definition of UML2PROV
3. A UML2PROV reference implementation
Carlos Sáenz-Adán, Beatriz Pérez, Francisco J. García-Izquierdo, Luc Moreau: Integrating Provenance Capture and UML with UML2PROV: Principles and Experience. Submitted for publication in IEEE Transactions on Software Engineering
Carlos Sáenz-Adán, Beatriz Pérez, Trung Dong Huynh, Luc Moreau: UML2PROV: Automating Provenance Capture in Software Engineering. SOFSEM 2018: 667-681
Carlos Sáenz-Adán, Luc Moreau, Beatriz Pérez, Simon Miles, Francisco J. García-Izquierdo: Automating Provenance Capture in Software Engineering with UML2PROV. IPAW 2018: 58-70
BACKGROUND
Main Goal: The definition of an overall framework, coined as UML2PROV, which allows
software engineers to bridge the gap between application design and provenance design.
BACKGROUND. UML
• UML stands for Unified Modeling Language
• It is widely used for application design
• It is accepted as the de facto method for designing object-oriented software systems
• It allows us to model the system from different points of view
UML diagram types:
• Structure Diagram: Class Diagram, Object Diagram, Package Diagram, Component Diagram, Composite Structure Diagram, Deployment Diagram, Profile Diagram
• Behavior Diagram: Use Case Diagram, Activity Diagram, State Machine Diagram, Interaction Diagram (Sequence Diagram, Communication Diagram, Interaction Overview Diagram, Timing Diagram)
Highlighted: Class Diagram, State Machine Diagram, Sequence Diagram
UML Sequence Diagram (SqD): SqDs reflect how collaborating objects interact to execute operations, and the exchange of information between them.
UML State Machine Diagram (SMD): SMDs show how the objects' state evolves as a consequence of events, for example operation executions.
UML Class Diagram (CD): CDs show the system's classes, with their attributes and operations, and the relationships between those classes.
BACKGROUND. PROV STANDARD
• World Wide Web Consortium (W3C) standard.
• PROV aims to facilitate the publication and interchange of
provenance among applications.
• PROV is fully specified in a family of documents allowing
provenance to be modelled, serialised, exchanged, …
[Figure: the PROV family of documents: PROV-PRIMER, PROV-DM, PROV-CONSTRAINTS, PROV-SEM, PROV-AQ, PROV-DC, PROV-DICTIONARY, PROV-LINKS, and the serializations PROV-N, PROV-O and PROV-XML]
An entity is a physical, digital, conceptual, or other kind of
thing with some fixed aspects; entities may be real or
imaginary.
An activity is something that occurs over a period of time
and acts upon or with entities; it may include consuming,
processing, transforming, modifying, relocating, using, or
generating entities.
An agent is something that bears some form of
responsibility for an activity taking place, for the existence of
an entity, or for another agent's activity.
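To make the three core PROV types concrete, here is a minimal sketch that models an entity, an activity and an agent and renders a few statements in PROV-N style notation. The `ex:` identifiers, the `Element` class and the helper functions are illustrative assumptions, not something taken from the slides; a real application would normally rely on a PROV toolkit instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Element:
    """A PROV element; kind is "entity", "activity" or "agent"."""
    kind: str
    ident: str

    def to_provn(self) -> str:
        return f"{self.kind}({self.ident})"


def used(activity: Element, entity: Element) -> str:
    # An activity used an entity.
    return f"used({activity.ident}, {entity.ident})"


def was_generated_by(entity: Element, activity: Element) -> str:
    # An entity was generated by an activity.
    return f"wasGeneratedBy({entity.ident}, {activity.ident})"


def was_associated_with(activity: Element, agent: Element) -> str:
    # An agent bears responsibility for an activity.
    return f"wasAssociatedWith({activity.ident}, {agent.ident})"


def example_statements() -> list:
    """Hypothetical scenario: alice compiles a source file into a binary."""
    source = Element("entity", "ex:source")
    binary = Element("entity", "ex:binary")
    compile_step = Element("activity", "ex:compile")
    alice = Element("agent", "ex:alice")
    return [
        source.to_provn(),
        binary.to_provn(),
        compile_step.to_provn(),
        alice.to_provn(),
        used(compile_step, source),
        was_generated_by(binary, compile_step),
        was_associated_with(compile_step, alice),
    ]


if __name__ == "__main__":
    print("\n".join(example_statements()))
```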
BACKGROUND. THE PROV-TEMPLATE APPROACH
PROV-Template approach [MBHM18]: an expansion algorithm uses a template and a set of bindings, and generates a PROV document.
[MBHM18] Luc Moreau, Belfrit V. Batlajery, Trung Dong Huynh, Danius T. Michaelides, Heather S. Packer: A Templating
System to Generate Provenance. IEEE Trans. Software Eng. 44(2): 103-121 (2018)
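The idea of combining a template with bindings can be sketched as follows. This is a deliberately simplified illustration, not the expansion algorithm of [MBHM18]: it assumes a template is a list of PROV-N style strings whose variables are written `var:name`, and that each set of bindings is a plain dictionary with one value per variable. All names and values are hypothetical.

```python
import re

# Variables in the toy template are written as var:<name>.
VAR = re.compile(r"var:(\w+)")


def expand(template, bindings_sets):
    """Instantiate the template once per set of bindings.

    Statements that are produced more than once are kept only once.
    """
    document = []
    for bindings in bindings_sets:
        for statement in template:
            expanded = VAR.sub(lambda m: bindings[m.group(1)], statement)
            if expanded not in document:
                document.append(expanded)
    return document


# Hypothetical template: an operation uses an input.
TEMPLATE = [
    "activity(var:operation)",
    "entity(var:input)",
    "used(var:operation, var:input)",
]

# Two hypothetical sets of bindings collected at run time.
BINDINGS = [
    {"operation": "ex:op1", "input": "ex:doc1"},
    {"operation": "ex:op2", "input": "ex:doc1"},
]

if __name__ == "__main__":
    for line in expand(TEMPLATE, BINDINGS):
        print(line)
```

Keeping the template separate from the bindings means the shape of the provenance is fixed at design time, while only the values are collected while the application runs.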
A SYSTEMATIC REVIEW OF PROVENANCE SYSTEMS
Three main benefits to this thesis:
1. A unified taxonomy of provenance systems characteristics.
2. An exhaustive analysis and comparison of 25 systems.
3. Open research problems that motivated this thesis.
A SYSTEMATIC REVIEW OF PROVENANCE SYSTEMS. TAXONOMY
Our taxonomy is mainly based on the one proposed by Serra da Cruz et al. [SeMM09].
Provenance characteristics:
• General Aspects: Provenance definition, Data processing, Application domain, Intended / Extended Purpose, Availability
• Subject: Contents, Abstraction, Interoperability / Exchange, Phase, Orientation, Granularity
• Storage: Scalability, Coupling, Persistence, Archiving
• Data Capture: Tracing, Level, Technique, Mechanism
• Data Access: Accessing, Querying
• Non-functional requirements: Security / Privacy, Verification, Repeatability / Reproducibility / Replayability
[SeMM09] Sérgio Manuel Serra da Cruz, Maria Luiza Machado Campos, Marta Mattoso: Towards a Taxonomy of
Provenance in Scientific Workflow Management Systems. SERVICES I 2009: 259-266
General Aspects: general background regarding provenance systems.
Subject: the different subjects or levels of detail at which provenance data can be represented.
Storage: the different approaches used by provenance systems to register provenance information.
Data capture: the way in which provenance data can be captured in existing provenance systems.
Data access: how users can access provenance data repositories.
Non-functional requirements: the non-functional requirements of provenance systems.
A SYSTEMATIC REVIEW OF PROVENANCE SYSTEMS. ANALYSIS OF 25 SYSTEMS
This analysis has served to:
1. give us an idea of the currently available approaches.
2. uncover open research problems that have motivated this thesis.
A SYSTEMATIC REVIEW OF PROVENANCE SYSTEMS. OPEN PROBLEMS
We have identified the following four open problems:
• Computational overhead
• Querying
• Integration
• Interoperability
Computational overhead
Finding an appropriate mechanism that balances the capture of provenance against the
computational overhead it causes is still an open question.
Related provenance characteristics: Subject (Granularity); Data Capture (Tracing, Level, Technique).
Querying
There are two main approaches for querying provenance data:
• Exploratory: when users do not have an exact idea of the information they want to retrieve.
• Directed: when users know precisely the information they want to query.
Users are forced to learn the query language of the persistence system used.
We advocate provenance solutions that are agnostic of any storage system.
Related provenance characteristics: Storage (Persistence); Data Access (Accessing, Querying).
Integration
In our review, we uncovered two main approaches for capturing provenance data
in the least intrusive manner:
• Workflow-level: these approaches advocate declaring the entire workflow in advance using a Workflow Management System (e.g., VisTrails and ZOOM).
Drawback: users are required to use a Workflow Management System.
• Operating System-level: provenance is captured at the system API level. Applications might need to run on a special OS kernel (e.g., PASS and ES3).
Drawback: provenance is captured for all executions, so it may contain too much irrelevant information.
Related provenance characteristics: Data Capture (Level).
Another approach is to capture provenance at process-level.
Advantages:
• It is independent of Workflow Management Systems.
• It is able to capture the high-level meaning of the process.
Disadvantage:
• Pre-existing activities of the process must be adapted to incorporate provenance collection functionality.
Aiming at integrating provenance capture with applications, the
Provenance Incorporation Methodology (PrIMe) [MGMM11] was developed.
PrIMe is standalone and does not integrate with existing software
engineering methodologies, which makes it challenging to use in practice.
Integrating the capture of provenance is still a significant hurdle.
[MGMM11] Simon Miles, Paul T. Groth, Steve Munroe, Luc Moreau: PrIMe: A methodology for developing
provenance-aware applications. ACM Trans. Softw. Eng. Methodol. 20(3): 8:1-8:42 (2011)
Interoperability
The provenance community has made an enormous effort to develop a standard
model for provenance exchange.
Related provenance characteristics: Subject (Interoperability / Exchange).
• Some systems have made the effort to support PROV (e.g., myGrid/Taverna, ES3, PLUS).
• Additionally, there are several toolkits supporting PROV, which facilitate software engineers' tasks:
  • ProvToolbox is a Java toolbox for handling PROV
  • ProvPy is a Python implementation of the PROV data model
  • ProvExtract is a tool to extract PROV from web pages
  • ProvVis allows visualising PROV through different charts
  • …
• These toolkits do not help decide what information should be included in provenance, and how
applications should be designed to allow for its capture.
The ability to consider the intended use of provenance, especially during the design
phase, has become critically important to support software designers in making
provenance-aware applications.
On the basis of these open problems, we have seen the need for defining UML2PROV:
• Computational overhead
• Querying
• Integration
• Interoperability
CONCEPTUAL DEFINITION OF UML2PROV
• UML2PROV architecture
• From UML to PROV: the transformation patterns
• Towards the generation of bindings. BGM features and requirements
OVERVIEW OF THE UML2PROV ARCHITECTURE
Step 1
Software designers model the
application by means of UML
diagrams.
Step 2
Taking the UML diagrams as source, UML2PROV automatically produces:
Step 2.1
PROV templates with the design of the provenance to be generated.
Step 2.2
The specific bindings generation module (BGM) responsible for generating bindings.
Step 3.1
The BGM is integrated into
the existing application.
Step 3.2
While the application is
running, the BGM collects
bindings.
Step 4
The template expander
takes both the PROV
templates and the bindings,
and generates the PROV
documents.
FROM UML TO PROV: THE TRANSFORMATION PATTERNS
17 transformation patterns:
• 4 Sequence diagram patterns
• 3 State machine diagram patterns
• 10 Class diagram patterns
5 principles:
• Consistency: PROV templates must have common elements.
• Level: the provenance must be collected at process-level.
• Understandable: a description of the context.
• Self-explanatory: not needing further explanation.
• Systematic: uniformly structured.
Structure of a pattern
• Context: the explanation of the situation addressed by the UML representation identified in the pattern.
• UML diagram: the excerpt of the UML diagram whose translation into PROV is governed by the pattern.
• Mapping to PROV: the PROV template proposed as the translation of the previous UML diagram excerpt.
• Discussion: issues related to the transformation of UML to PROV.
FROM UML TO PROV: THE TRANSFORMATION PATTERNS. EXAMPLE
Sequence Diagram Pattern 2 (SeqP2)
TOWARDS THE GENERATION OF BINDINGS. BGM FEATURES AND REQUIREMENTS
6 requirements
The different nature of applications designed using UML prevented us
from using a generic BGM.
Set of Requirements: R1 to R4
Requirement 1 (R1): The instrumentation of the application to add the instructions
for generating bindings must be carried out automatically.
Requirements 2 (R2) and 3 (R3): The instructions for bindings generation must be
located in an independent module (R2), and this module must be able to identify
the moments at which we want to collect provenance information (R3).
Requirement 4 (R4): The BGM must provide software developers with mechanisms
for selecting the configuration that best suits their needs, allowing them to
decide when to compute provenance.
Set of Requirements: R5 and R6
Requirement 5 (R5): Each binding obtained from an application’s execution must be
associated with at least one PROV template automatically generated from the UML
diagrams.
Requirement 6 (R6): The variables included in a set of bindings must correspond with
the variables in their associated PROV templates.
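Requirements R5 and R6 can be read as two checks on every set of bindings. The sketch below is one possible interpretation, not the thesis implementation: it assumes templates are lists of PROV-N style strings with `var:name` variables, that each set of bindings records the name of its template, and that R6 means every bound variable must be declared in that template. The template name and variables are hypothetical.

```python
import re

VAR = re.compile(r"var:(\w+)")


def template_variables(template):
    """Collect the variable names declared in a toy template."""
    return {name for statement in template for name in VAR.findall(statement)}


def violations(templates, bindings_set):
    """Return a list of R5/R6 violations for one set of bindings."""
    problems = []
    name = bindings_set.get("template")
    # R5: the set of bindings must be associated with a known template.
    if name not in templates:
        problems.append(f"R5: unknown template {name!r}")
        return problems
    # R6: every bound variable must exist in the associated template.
    unknown = set(bindings_set["values"]) - template_variables(templates[name])
    for variable in sorted(unknown):
        problems.append(f"R6: variable {variable!r} not in template {name!r}")
    return problems


# Hypothetical template registry with a single template.
TEMPLATES = {
    "exampleTemplate": ["activity(var:operation)", "used(var:operation, var:input)"],
}

if __name__ == "__main__":
    ok = {"template": "exampleTemplate",
          "values": {"operation": "ex:op1", "input": "ex:doc1"}}
    print(violations(TEMPLATES, ok))
```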
IMPLEMENTATION OF UML2PROV
• Automatization of the transformation patterns
• Automatization of the generation of the BGM
Automatization of the transformation patterns
An MDD-based tool chain that comprises two transformations:
• T1. From UML diagram models to template models
• T2. From template models to PROV template files
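The two-step chain can be pictured with a toy example. The sketch below is only an illustration of splitting the work into T1 and T2: the input dictionary stands in for a UML model, the mapping rule is invented for the example and is not one of the 17 transformation patterns, and the output is a list of PROV-N style lines rather than a real PROV template file.

```python
def t1_uml_to_template_model(message):
    """T1 (toy): map one sequence-diagram message to a template model.

    The template model is an intermediate structure that is independent
    of any concrete serialization.
    """
    statements = [("activity", ["operation"])]
    for argument in message["arguments"]:
        statements.append(("entity", [argument]))
        statements.append(("used", ["operation", argument]))
    return {"name": message["operation"], "statements": statements}


def t2_template_model_to_text(template_model):
    """T2 (toy): serialize a template model as PROV-N style lines."""
    lines = []
    for relation, variables in template_model["statements"]:
        rendered = ", ".join(f"var:{variable}" for variable in variables)
        lines.append(f"{relation}({rendered})")
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical message taken from a sequence diagram.
    message = {"operation": "enroll", "arguments": ["student", "course"]}
    print(t2_template_model_to_text(t1_uml_to_template_model(message)))
```

One plausible benefit of the intermediate model is that a different serialization only needs a new T2, while T1 stays unchanged.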
This implementation:
• defines two transformations instead of a direct one.
• is a generic solution.
Automatization of the generation of the BGM
• Implementation of the BGM
• Generation of the BGM
Implementation of the BGM
We have defined an event-based approach to implement the BGMs, with two main components:
• Events: notable occurrences that happen while the application is running.
• Listeners: contain the behaviour for processing the events.
[Figure: events raised between the start and the end of an operation execution: OperationStart, OperationEnd, NewBinding, NewValueBinding]
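The event-based approach can be sketched with a small publish/subscribe example. The event names come from the slide; everything else is an assumption made for illustration, including the `Event` fields, the `EventBus` class, and the choice to treat NewBinding and NewValueBinding alike as "a variable received a value".

```python
from dataclasses import dataclass


@dataclass
class Event:
    # kind is one of: OperationStart, NewBinding, NewValueBinding, OperationEnd
    kind: str
    operation: str
    variable: str = ""
    value: str = ""


class RecordingListener:
    """Listener that groups the bindings raised during one operation."""

    def __init__(self):
        self.current = {}
        self.completed = []

    def on_event(self, event):
        if event.kind == "OperationStart":
            self.current = {}
        elif event.kind in ("NewBinding", "NewValueBinding"):
            self.current[event.variable] = event.value
        elif event.kind == "OperationEnd":
            self.completed.append(self.current)
            self.current = {}


class EventBus:
    """Delivers every published event to all subscribed listeners."""

    def __init__(self):
        self.listeners = []

    def subscribe(self, listener):
        self.listeners.append(listener)

    def publish(self, event):
        for listener in self.listeners:
            listener.on_event(event)


def demo():
    bus = EventBus()
    listener = RecordingListener()
    bus.subscribe(listener)
    bus.publish(Event("OperationStart", "enroll"))
    bus.publish(Event("NewBinding", "enroll", "operation", "ex:op1"))
    bus.publish(Event("NewValueBinding", "enroll", "input", "ex:doc1"))
    bus.publish(Event("OperationEnd", "enroll"))
    return listener.completed


if __name__ == "__main__":
    print(demo())
```

Separating events from listeners lets the code that detects what happened stay the same while the processing of bindings is swapped.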
Our reference implementation of BGM consists of four main components.
Implementation of the BGM. BGMEventInstrumenter
• The Aspect Oriented Programming (AOP) paradigm
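AOP lets cross-cutting behaviour, such as raising events, be attached to operations without editing their bodies. The slides do not show the aspect code, so the following is only a rough Python analogue that uses a decorator; the `instrument` function, the event tuples and the `enroll` operation are all hypothetical.

```python
import functools


def instrument(publish):
    """Build a decorator that reports start and end events around a call."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            publish(("OperationStart", func.__name__))
            try:
                return func(*args, **kwargs)
            finally:
                # Reported even if the operation raises an exception.
                publish(("OperationEnd", func.__name__))
        return wrapper
    return decorator


# The collected events; a real module would hand them to its listeners.
events = []


@instrument(events.append)
def enroll(student, course):
    # The business logic itself contains no provenance code.
    return f"{student} enrolled in {course}"


result = enroll("alice", "algebra")
```

The operation stays free of provenance instructions, which is in the spirit of requirements R1 and R2: the instrumentation is applied from outside and lives in its own module.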
Implementation of the BGM. BGMEventListener
Introductio
n
Backgroun
d
Systematic
Review
Conceptual
definition
Implementati
on
Evaluatio
n
Conclusions and further
work
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
[Figure: processing pipeline of a BGMEventListener. The listener receives events (OperationStart, OperationEnd, NewBinding, NewValueBinding) and builds bindings, which are grouped into sets of bindings. The template expander takes the sets of bindings and the PROV templates as input and produces PROV documents as output.]
1. BindingsBGMEventListener: stores the individual bindings.
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
2. SetBindingsBGMEventListener: stores sets of bindings.
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
3. ProvenanceBGMEventListener: stores the PROV documents produced by the template expander.
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
• How are PROV documents retrieved?
3. ProvenanceBGMEventListener: retrieves the stored PROV documents.
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
• How are PROV documents retrieved?
2. SetBindingsBGMEventListener: retrieves the stored sets of bindings, from which the template expander produces the PROV documents.
IMPLEMENTATION OF UML2PROV
Three proposed implementations of the BGMEventListener interface:
• What is stored?
• When is it stored?
• How are PROV documents retrieved?
1. BindingsBGMEventListener: retrieves the stored bindings, which are then grouped into sets and expanded into PROV documents.
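The difference between storing bindings and storing PROV documents can be shown with a toy example. This is a sketch under strong assumptions: the expand method below is a naive string substitution, not the PROV template expansion algorithm; the template line, the variable names, the ex: identifiers, and the two store classes are all hypothetical and do not reproduce the three listener classes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ListenerStrategiesSketch {

    /** Hypothetical one-line template with two variables. */
    static final String TEMPLATE = "wasGeneratedBy(var:output, var:operation, -)";

    /** Toy expander: substitutes each "var:<name>" placeholder with its bound value. */
    static String expand(String template, Map<String, String> bindings) {
        String out = template;
        for (Map.Entry<String, String> b : bindings.entrySet()) {
            out = out.replace("var:" + b.getKey(), b.getValue());
        }
        return out;
    }

    /** Stores bindings as received; documents are produced only on retrieval. */
    static final class BindingsStore {
        private final List<Map<String, String>> stored = new ArrayList<>();

        void store(Map<String, String> bindings) {
            stored.add(bindings);
        }

        List<String> retrieveDocuments(String template) {
            List<String> docs = new ArrayList<>();
            for (Map<String, String> b : stored) {
                docs.add(expand(template, b));
            }
            return docs;
        }
    }

    /** Expands at storage time and keeps only the resulting documents. */
    static final class DocumentStore {
        private final String template;
        private final List<String> docs = new ArrayList<>();

        DocumentStore(String template) {
            this.template = template;
        }

        void store(Map<String, String> bindings) {
            docs.add(expand(template, bindings));
        }

        List<String> retrieveDocuments() {
            return new ArrayList<>(docs);
        }
    }

    /** Hypothetical bindings for one operation execution. */
    static Map<String, String> sampleBindings() {
        Map<String, String> bindings = new LinkedHashMap<>();
        bindings.put("output", "ex:gel1");
        bindings.put("operation", "ex:analyse1");
        return bindings;
    }

    public static void main(String[] args) {
        BindingsStore lazy = new BindingsStore();
        lazy.store(sampleBindings());
        DocumentStore eager = new DocumentStore(TEMPLATE);
        eager.store(sampleBindings());
        System.out.println(lazy.retrieveDocuments(TEMPLATE));
        System.out.println(eager.retrieveDocuments());
    }
}
```

Both stores return the same documents; what changes is when the expansion cost is paid. Storing bindings keeps the work done at capture time small and defers expansion to retrieval, whereas storing documents does the expansion while the application is running.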
IMPLEMENTATION OF UML2PROV
Generation of the BGM
An MDD-based tool that defines one transformation:
→ T3. From UML diagram models to BGM
EVALUATION. GELJ
To show both the benefits and the trade-offs of using UML2PROV, we can apply it in different scenarios:
✓ an existing application that was developed using UML
✓ an existing application without a UML design
1. To show that UML2PROV can be applied in a wide range of situations, even when there is no UML design.
2. To show the trade-offs between the effort devoted to obtaining the UML diagrams and the overhead of provenance capture.
We have applied UML2PROV to GelJ [HDMP15].
[HDMP15] Jónathan Heras, César Domínguez, Eloy Mata, Vico Pascual, Carmen Lozano, Carmen Torres, Myriam Zarazaga: GelJ - a tool for analyzing DNA fingerprint gel images. BMC Bioinformatics 16: 270:1-270:8 (2015)
EVALUATION. UML DESIGN OF GELJ
We have identified three reverse-engineering strategies, each requiring a different effort from the software engineer, which lead to three different UML designs for GelJ:
• Strategy 1: Class + Sequence + State Machine diagrams
• Strategy 2: Class + Sequence diagrams
• Strategy 3: Class diagrams
Inspired by PrIMe provenance requirements (Q1-Q9)
EVALUATION. ANALYSING THE BENEFITS AND TRADE-OFFS OF USING UML2PROV
[Diagram: five aspects analysed (provenance design, instrumentation, maintenance, overhead, quality), with provenance design and overhead highlighted.]
EVALUATION. PROVENANCE DESIGN
Generation of provenance design
Without UML2PROV, software designers had to develop the PROV templates manually:
• a cumbersome, time-consuming, and error-prone task
• does not scale up
• requires the knowledge of a PROV expert
EVALUATION. PROVENANCE DESIGN
Generation of provenance design
With UML2PROV, the generation of templates is automatic:
• straightforward
• scales up
• does NOT require the knowledge of a PROV expert
EVALUATION. OVERHEAD
Run-time overhead and storage needs
The more UML elements are used, the more bindings are generated.
[Bar chart: number of bindings per strategy (axis from 0 to 1,000,000) for Strategy 1 (SqD, SMD, CD), Strategy 2 (SqD, CD), and Strategy 3 (CD). Data labels: 17977, 594433, 562833.]
EVALUATION. OVERHEAD
Run-time overhead
The more bindings are generated, the more time is needed.
[Bar chart: run-time overhead (%) per strategy (axis from 0 to 150) for Strategy 1 (SqD, SMD, CD), Strategy 2 (SqD, CD), and Strategy 3 (CD), grouped by listener implementation; legend entries: BindingsBGMEventListener, SetBindingsBGMEventListener. Data labels: 6.21, 115.77, 112.36; 1.26, 55.05, 48.45; 1.53, 68.67, 53.42.]
CONCLUSIONS AND FURTHER WORK
✓ A systematic review of provenance systems
1. A unified taxonomy of provenance systems characteristics.
2. An exhaustive analysis and comparison of 25 systems.
3. Open research problems that motivated this thesis.
CONCLUSIONS AND FURTHER WORK
✓ A conceptual definition of UML2PROV
1. We have defined a comprehensive and systematic set of 17 transformation patterns that associate UML elements with the PROV elements of a template.
2. We have defined a set of requirements that each BGM must fulfil so as to minimise the intrusion on software designers' and developers' modus operandi.
CONCLUSIONS AND FURTHER WORK
✓ A reference implementation of UML2PROV that automates:
1. the transformation patterns, and
2. the generation of a BGM that fulfils the stated requirements for Java.
CONCLUSIONS AND FURTHER WORK
1. Specify the elements in the UML diagrams for provenance capture.
2. Manage the level of detail of the provenance to be generated.
3. Include other kinds of UML diagrams and other elements.
4. Implement UML2PROV as a web service.
5. Offer support to other programming languages.
QUESTIONS
Towards a framework for making applications provenance-aware
Supervisors: Dr. Beatriz Pérez Valle, Dr. Francisco José García Izquierdo
Author: Carlos Sáenz Adán
30 October 2019


Editor's Notes

  • #5: Provenance from experiments → reproducibility
  • #6: Different contexts: storage system → PASS; database → PERM; workflow systems → VisTrails
  • #10: This thesis makes three contributions, as I will explain in the rest of the talk. The first is a systematic review, which has been published in a journal indexed in the Journal Citation Reports (JCR). Then come the conceptual definition and the implementation of UML2PROV, which were presented and published at conferences. This work has also led to an article that has been submitted for publication to the journal TSE, also indexed in the JCR.
  • #11: Used for understanding the rest of this presentation.
  • #14: As I said previously, provenance refers to the information explaining the existence of a piece of data. Concretely, we are interested in the information about the internal structure of a system, the evolution of the objects' states, and also how objects collaborate in order to execute operations. Taking this into account, we decided to address class diagrams, state machine diagrams, and sequence diagrams.
  • #15: SEQ: They are used to model the interaction among collaborating objects and the exchange of information between them. STM: They specify the discrete behaviour of individual elements of a system. CD: They model the static structure of a system, therefore describing the elements of the system and the relationships between them. Say what each type of diagram is for: the states an object goes through, the flow of information, the status of an object. The common link among all of them is the operation. 1) SqDs reflect how collaborating objects interact to execute operations, and the exchange of information between them. 2) CDs show (1) the objects' characteristics and (2) the operations. 3) SMDs show information about the evolution of the objects' state as a consequence of the operation executions that have taken place.
  • #18: This figure shows the organization of the family of documents that make up the PROV standard. We can see that the main document is called PROV-DM. This document is the conceptual data model, which defines the vocabulary used to represent the provenance information. In this picture, we can see that the provenance defined using the vocabulary stated in the PROV-DM document can be serialized in RDF (following the specification of PROV-O), XML (based on PROV-XML), and PROV-N. Concretely, PROV-N is a readable text notation focused on human consumption. In this thesis, we have used PROV to represent the provenance information and, concretely, PROV-N when we wanted to show readers concrete documents.
  • #20: Comment on some of the relations. Mention that there are more.
  • #24: Taxonomy. Which could be used to analyse any provenance system Analysis. Open research problems that motivated this thesis.
  • #25: 6 dimensions 25 categories
  • #26: In bold are those aspects that do not appear in the original taxonomy. It is made up of 6 dimensions, each one containing different categories. 
  • #27: Subject, which refers to the different subjects or levels of detail in which provenance data can be represented, also considering interoperability aspects.  
  • #28: Storage, which describes the different approaches used by provenance systems to register provenance information. 
  • #29: Data capture, which deals with the way in which provenance data can be captured on the existing provenance system. 
  • #30: In bold are those aspects that do not appear in the original taxonomy.
  • #31: In bold are those aspects that do not appear in the original taxonomy.
  • #32: This analysis
  • #34: Granularity: the amount and cost of provenance information can be inversely proportional to the granularity. In fact, the provenance information can grow larger than the data it describes if the data is fine-grained. Tracing: if we compute provenance immediately as the application is running, we need additional storage. If we compute provenance only when necessary, no additional space is required, but this is not applicable in all scenarios. Level: workflow-level and process-level depend on the amount of data per operation and the number of recorded operations. Technique: annotation, when we annotate existing data; it requires additional time and space. Inversion, when we leverage a property by which some derivations can be inverted to find the input data; it does not need additional time and space.
  • #40: These toolkits do not help us decide what information should be in the provenance. The sooner this is decided, the better. The sooner the provenance design is obtained, the better.
  • #41: Based on these open problems, we saw the need to define UML2PROV, which will address these problems from the design phase of the application.
  • #43: Stakeholders and roles. Elements.
  • #45: What the BGM is and what it is for.
  • #46: I could mention that the integration is non-intrusive.
  • #50: 1) SqDs reflect how collaborating objects interact to execute operations, and the exchange of information between them. 2) CDs show (1) the objects' characteristics at some point, i.e. the objects' status, and (2) the operations that have led the objects' status to be as it is. 3) SMDs show information about the evolution of the objects' state as a consequence of the operation executions that have taken place.
  • #52: Taking into account the previous principles, we have structured the definition of each pattern according to 4 items.
  • #60: SeqP2, which models the enrolment…
  • #62: What the requirements are for.
  • #63: R4 disappears.
  • #65: We have given a conceptual definition, and now we focus on the implementation, which is based on it.
  • #67: We have given a conceptual definition, and now we focus on the implementation, which is based on it.
  • #68: Our whole conceptual definition is based on models; concretely, we want to use the UML diagram models to obtain both the PROV template files and the code of the BGM. Thus, we have used tools from the context of Model-Driven Development in order to implement our proposal. Explain the two transformations. It is worth noting that we could have implemented a single transformation from UML to templates. However, we decided to implement an intermediate transformation in order to facilitate further implementation of PROV templates in different serialization formats. This implementation is a generic proposal that can be used by
  • #69: It is worth noting that we could have implemented a single transformation from UML to templates. However, we decided to implement an intermediate transformation in order to facilitate further implementation of PROV templates in different serialization formats. It is remarkable that our reference implementation for generating templates could be seen as a generic solution suitable for any final user of UML2PROV who has the UML design of their application. Such a reference implementation provides a completely automatic translation from any UML design to PROV templates, as established by our patterns.
  • #70: We have given a conceptual definition, and now we focus on the implementation, which is based on it.
  • #74: BGMEventListener: a Java interface that defines four operations, one for managing each type of event (operationStart, operationEnd, newBinding, and newValueBinding). These operations have an input parameter of type BGMEvent that contains the provenance data to be processed. The implementation of these operations constitutes the mechanism used by a class implementing this interface to generate, manage, and store bindings. BGMEvent: this component is used to carry information about the occurrence of an event. This information is the provenance data necessary for constructing the bindings. BGMEventManager: in some circumstances, having only one listener for managing bindings may not be enough, and the same applies to the mechanisms for generating and storing provenance.
  • #76: We have used AOP because it allows us to capture different situations during the execution of a program, such as operation invocations, and, on those captures, to collect provenance before and after them, in a non-intrusive way. Concretely, it allows us to capture…
  • #77: It is a very important element because the implementation of this component defines the strategy for managing and storing the provenance.
  • #87: Here, mention that users often wonder how a given experiment was carried out…
  • #88: GelJ reference.
  • #90: The more of one, the more of the other.
  • #91: The more of one, the more of the other.
  • #92: The more of one, the more of the other.
  • #93: The more of one, the more of the other.
  • #98: An alternative could be to examine how to adapt the transformation patterns by selectively discarding some PROV elements or relations, and consequently to obtain coarser-grained provenance data. We could manage the amount of provenance information to be generated. For example, a UML designer may selectively discard provenance regarding specific input parameters of an operation, something that is not possible with the current approach. Although our approach addresses three of the most widely used UML diagrams, an interesting line of further research is to explore how the information exposed by other types of UML diagrams could enrich the provenance to be generated.