Complete and Interpretable
Conformance Checking of
Business Processes
Luciano García-Bañuelos University of Tartu, Estonia
Nick van Beest Data61 | CSIRO, Australia
Marlon Dumas University of Tartu, Estonia
Marcello La Rosa Queensland University of Technology, Australia
Willem Mertens Queensland University of Technology, Australia
Conformance checking
1. Compliance auditing
– detect deviations with respect to a normative
model (unfitting behavior)
2. Model maintenance
– unfitting behavior
– additional model behavior
3. Automated process model discovery
– Iterative model improvement
Given a process model M and an event log L,
explain the differences between the behavior
allowed by M and the behavior observed in L
State of the art
Current approaches:
• Are designed to identify the number and exact
location of the differences
• Don’t provide a “high-level” diagnosis that
easily allows analysts to pinpoint differences:
– Are unable to identify differences across traces
– Are unable to fully characterize extra model
behavior not present in the log
An example
Desired conformance output:
• task C is optional in the log
• the cycle including IGDF is not observed in the log
Log traces:
ABCDEH
ACBDEH
ABCDFH
ACBDFH
ABDEH
ABDFH
Our approach
A method for business process conformance checking
that:
1. Identifies all differences between the behavior in the
model and the behavior in the log
2. Describes each difference via a natural language
statement
How does it work?
Difference
statements
Event log
Input model
PESM
unfold
PESL
merge
Partially
Synchronized
Product (PSP)
compare
extract
differences
Prime event structure (PES)
A Prime Event Structure (PES) is a graph of events, where
each event e represents the occurrence of a task in the modeled
system (e.g. a business process)
As such, multiple occurrences of the same task are represented
by different events
Pairs of events in a PES can have one of the following binary
relations:
• Causality: event e is a prerequisite for e'
• Conflict: e and e' cannot occur in the same execution
• Concurrency: no order can be established between e and e'
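The three relations can be made concrete with a small data structure. The sketch below is illustrative only: the class name `PES`, its methods, and the example events are assumptions, not part of the presented tool. It stores direct causality and direct conflict, closes causality transitively, lets conflict be inherited by causal successors, and treats every remaining pair of events as concurrent.

```python
class PES:
    """Minimal prime event structure (hypothetical helper, for illustration)."""

    def __init__(self, labels, direct_causality, direct_conflict):
        self.labels = dict(labels)                      # event id -> task label
        self.causes = self._transitive_closure(direct_causality)
        self.conflict = self._inherit_conflict(direct_conflict)

    @staticmethod
    def _transitive_closure(pairs):
        closure = set(pairs)
        changed = True
        while changed:
            changed = False
            for a, b in list(closure):
                for c, d in list(closure):
                    if b == c and (a, d) not in closure:
                        closure.add((a, d))
                        changed = True
        return closure

    def _inherit_conflict(self, pairs):
        # Conflict is symmetric and propagates to causal successors.
        conflict = set()
        for a, b in pairs:
            conflict |= {(a, b), (b, a)}
        changed = True
        while changed:
            changed = False
            for a, b in list(conflict):
                for c, d in self.causes:
                    if c == b and (a, d) not in conflict:
                        conflict |= {(a, d), (d, a)}
                        changed = True
        return conflict

    def relation(self, e, f):
        if (e, f) in self.causes:
            return "causality"
        if (f, e) in self.causes:
            return "inverse causality"
        if (e, f) in self.conflict:
            return "conflict"
        return "concurrency"


# Hypothetical example: A, then B and C in parallel followed by E,
# or alternatively D followed by another occurrence of E.
pes = PES(
    labels={"e0": "A", "e1": "B", "e2": "C", "e3": "D", "e4": "E", "e5": "E"},
    direct_causality=[("e0", "e1"), ("e0", "e2"), ("e0", "e3"),
                      ("e1", "e4"), ("e2", "e4"), ("e3", "e5")],
    direct_conflict=[("e1", "e3"), ("e2", "e3")],
)
print(pes.relation("e1", "e2"))  # B and C
print(pes.relation("e4", "e5"))  # the two occurrences of E
```

Note that the two occurrences of E are separate events, as the definition above requires, and that their conflict is not declared directly but inherited from the conflict between their predecessors.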
From event log to PES
Log:
Trace Ref N
A B C E t1 3
A C B E t2 2
A B E t3 2
A D E t4 3
e0:A
e1:B e2:C
e3:E
f0:A
f1:B
f2:E
g0:A
g1:D
g2:E
t1, t2 → p1 t3 → p2 t4 → p3
PO runs:
{e0,f0,g0}:A
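The grouping of traces into partial order (PO) runs can be sketched as follows. This is a simplified reading of the slide, not the tool's code: it assumes an alpha-style concurrency oracle (two tasks are concurrent when each is observed directly following the other) and assumes no task repeats within a trace. The function names are hypothetical.

```python
from itertools import combinations

LOG = {"t1": "ABCE", "t2": "ACBE", "t3": "ABE", "t4": "ADE"}


def concurrency_oracle(traces):
    """Tasks a, b are concurrent if both 'ab' and 'ba' occur as direct successions."""
    follows = {(t[i], t[i + 1]) for t in traces for i in range(len(t) - 1)}
    return {(a, b) for (a, b) in follows if a != b and (b, a) in follows}


def partial_order_run(trace, concurrent):
    """Keep the order between two tasks unless the oracle says they are concurrent."""
    order = frozenset(
        (trace[i], trace[j])
        for i, j in combinations(range(len(trace)), 2)
        if (trace[i], trace[j]) not in concurrent
    )
    return frozenset(trace), order


def group_runs(log):
    """Group trace references by the partial order run they induce."""
    concurrent = concurrency_oracle(log.values())
    runs = {}
    for ref, trace in log.items():
        runs.setdefault(partial_order_run(trace, concurrent), []).append(ref)
    return list(runs.values())


print(group_runs(LOG))  # [['t1', 't2'], ['t3'], ['t4']]
```

With B and C detected as concurrent, t1 and t2 collapse into one run, which corresponds to the mapping t1, t2 → p1, t3 → p2, t4 → p3 on the slide.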
From event log to PES
Log:
Trace Ref N
A B C E t1 3
A C B E t2 2
A B E t3 2
A D E t4 3
PO runs:
{e0,f0,g0}:A
{e1,f1}:B {e2}:C
e0:A
e1:B e2:C
e3:E
f0:A
f1:B
f2:E
g0:A
g1:D
g2:E
t1, t2 → p1 t3 → p2 t4 → p3
From event log to PES
Log:
Trace Ref N
A B C E t1 3
A C B E t2 2
A B E t3 2
A D E t4 3
PO runs:
{e0,f0,g0}:A
{e1,f1}:B {e2}:C {g1}:D
e0:A
e1:B e2:C
e3:E
f0:A
f1:B
f2:E
g0:A
g1:D
g2:E
t1, t2 → p1 t3 → p2 t4 → p3
From event log to PES
Log:
Trace Ref N
A B C E t1 3
A C B E t2 2
A B E t3 2
A D E t4 3
PO runs:
{e0,f0,g0}:A
{e1,f1}:B
{f2}:E {e3}:E {g2}:E
{e2}:C {g1}:D
e0:A
e1:B e2:C
e3:E
f0:A
f1:B
f2:E
g0:A
g1:D
g2:E
t1, t2 → p1 t3 → p2 t4 → p3
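The merge of the three runs into a single PES can be approximated by a prefix-style merge. The sketch below is one plausible reading of the slide rather than the authors' algorithm: it assumes that two events from different runs are merged when they carry the same label and have the same causal history. The `RUNS` encoding (event id mapped to a label and its direct predecessors) and the function name are hypothetical.

```python
# Each run maps an event id to (label, direct causal predecessors).
RUNS = {
    "p1": {"e0": ("A", []), "e1": ("B", ["e0"]), "e2": ("C", ["e0"]),
           "e3": ("E", ["e1", "e2"])},
    "p2": {"f0": ("A", []), "f1": ("B", ["f0"]), "f2": ("E", ["f1"])},
    "p3": {"g0": ("A", []), "g1": ("D", ["g0"]), "g2": ("E", ["g1"])},
}


def merge_runs(runs):
    """Merge events that share a label and an identical causal history."""
    merged = {}  # history key -> set of event ids
    for run in runs.values():
        keys = {}

        def key_of(event):
            # The key is the label plus the keys of all direct predecessors,
            # so it encodes the whole causal history recursively.
            if event not in keys:
                label, preds = run[event]
                keys[event] = (label, frozenset(key_of(p) for p in preds))
            return keys[event]

        for event in run:
            merged.setdefault(key_of(event), set()).add(event)
    return sorted((key[0], sorted(events)) for key, events in merged.items())


for label, events in merge_runs(RUNS):
    print("{" + ",".join(events) + "}:" + label)
```

Under this assumption the A events of all runs are merged, the B events of p1 and p2 are merged, and the three E events stay separate because each has a different history, which agrees with the sets shown on the slide.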
From model to PES
BPMN model
Petri net
From model to PES
Branching process
From model to PES
Complete prefix unfolding
Cutoff
event
Corresponding
event
Cutoff
event
Corresponding
event
PES prefix unfolding
Complete prefix unfolding
PES prefix unfolding
Cutoff
event
Corresponding
event
Corresponding
event
Cutoff
event
Loop relations
[Figure: loop relations, shown on events labeled A, B, C and D]
Comparing PESs
Log PES EL Model PES prefix unfolding EM
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
Trace Ref N
A B C E t1 3
A C B E t2 2
A B E t3 2
A D E t4 3
A
B
D
E
C
f0:A
f1:B f2:C f3:D
f4:E f5:E
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
f0:A
f1:B f2:C f3:D
f4:E f5:E
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
f0:A
f1:B f2:C f3:D
f4:E f5:E
match B
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B}
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
match D match C
f0:A
f1:B f2:C f3:D
f4:E f5:E
match B
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B}
match C
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
match D match C
f0:A
f1:B f2:C f3:D
f4:E f5:E
match B
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B}
match C
match E
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E}
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
match D match C
f0:A
f1:B f2:C f3:D
f4:E f5:E
match B
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B}
rhide C match C
lh = {}, rh = {f2:C}
m = {(e0,f0)A,(e1,f1)B}
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
match E
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
match D match C
f0:A
f1:B f2:C f3:D
f4:E f5:E
match B
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B}
rhide C match C
match E match E
lh = {}, rh = {f2:C}
m = {(e0,f0)A,(e1,f1)B}
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
lh = {}, rh = {}
m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E}
lh = {}, rh = {f2:C}
m = {(e0,f0)A,(e1,f1)B,(e4,f4)E}
lh = {}, rh = {}
m = {(e0,f0)A}
match A
lh = {}, rh = {}
m = {}
Comparing PESs (cont’d)
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
match D match C
In the log, C is
optional after {A,B},
whereas in the model
it is not
(task skipping)
f0:A
f1:B f2:C f3:D
f4:E f5:E
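The states shown in the walkthrough above are triples (lh, rh, m). A minimal sketch of how such states evolve under the operations is given below. The `State` type and the function names are assumptions made for illustration; `lhide` does not appear on the slides and is included only by symmetry with `rhide`. The real PSP construction also has to decide which operation to apply, which this sketch does not attempt.

```python
from typing import NamedTuple


class State(NamedTuple):
    lh: frozenset  # hidden log events
    rh: frozenset  # hidden model events
    m: frozenset   # matched (log event, model event, label) triples


EMPTY = State(frozenset(), frozenset(), frozenset())


def match(state, e, f, label):
    """Pair log event e with model event f carrying the same label."""
    return state._replace(m=state.m | {(e, f, label)})


def lhide(state, e, label):
    """Hide a log event that has no counterpart in the model (assumed operation)."""
    return state._replace(lh=state.lh | {(e, label)})


def rhide(state, f, label):
    """Hide a model event that has no counterpart in the log."""
    return state._replace(rh=state.rh | {(f, label)})


def replay(operations, state=EMPTY):
    """Apply a sequence of (operation, *arguments) tuples to a state."""
    for op, *args in operations:
        state = op(state, *args)
    return state


# The branch in which C is hidden on the model side.
PATH = [
    (match, "e0", "f0", "A"),
    (match, "e1", "f1", "B"),
    (rhide, "f2", "C"),
    (match, "e4", "f4", "E"),
]
final = replay(PATH)
print(sorted(final.rh), sorted(final.m))
```

The final state has rh = {f2:C} and m = {(e0,f0)A, (e1,f1)B, (e4,f4)E}, the state from which the task skipping statement about C is derived.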
Elementary mismatch patterns
Unfitting behavior patterns:
• Relation mismatch patterns
1. Causality-Concurrency
2. Conflict
• Event mismatch patterns
3. Task skipping
4. Task substitution
5. Unmatched repetition
6. Task relocation
7. Task insertion / absence
Additional model behavior patterns:
8. Unobserved acyclic interval
9. Unobserved cyclic interval
Example: Causality / Concurrency
Example: Task substitution
Unobserved cyclic interval:
PES and PES prefix unfolding
A
B
C
D
Log PES EL
Model PES prefix unfolding EM
Pomsets
(partially ordered multisets)
• A pomset is a Directed Acyclic Graph where:
– the nodes are configurations
– the edges represent direct causality relations between configurations
– an edge is labeled by an event
• Unlike an event structure, a pomset does not have any conflict
relation, since a pomset represents one possible execution
• The behavior of a PES can be characterized by the set of pomsets it
induces
• In the case of a PES prefix, the set of induced pomsets is infinite
when the PES prefix captures cyclic behavior via cc-pairs (pairs of
cutoff and corresponding events)
• We cannot enumerate all pomsets of a PES prefix to compare with
the PES of the log
• Therefore, we can extract a set of elementary pomsets (inspired by
the notion of elementary paths), which collectively cover all the
possible pomsets induced by a PES prefix
• Cyclic behavior is not required to be unfolded infinitely
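The idea that a finite set of elementary items can cover infinitely many executions can be illustrated on plain directed graphs, from which the notion of elementary paths is borrowed. The sketch below enumerates elementary paths and elementary cycles of a small hypothetical graph with one loop. It does not extract elementary pomsets from a PES prefix; the graph and function names are assumptions.

```python
# Hypothetical graph: A -> B -> C -> D with a loop from C back to B.
GRAPH = {"A": ["B"], "B": ["C"], "C": ["B", "D"]}


def elementary_paths(graph, source, target):
    """All paths from source to target that visit no node twice."""
    paths = []

    def visit(node, path):
        if node == target:
            paths.append(path)
            return
        for nxt in graph.get(node, []):
            if nxt not in path:
                visit(nxt, path + [nxt])

    visit(source, [source])
    return paths


def elementary_cycles(graph):
    """All cycles that visit no node twice, each reported from its smallest node."""
    cycles = []
    for start in sorted(graph):

        def visit(node, path):
            for nxt in graph.get(node, []):
                if nxt == start:
                    cycles.append(path)
                elif nxt not in path and nxt > start:
                    visit(nxt, path + [nxt])

        visit(start, [start])
    return cycles


print(elementary_paths(GRAPH, "A", "D"))  # [['A', 'B', 'C', 'D']]
print(elementary_cycles(GRAPH))           # [['B', 'C']]
```

The graph admits infinitely many walks from A to D (the loop can be taken any number of times), yet one elementary path and one elementary cycle describe all of them.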
Unobserved cyclic interval:
expanded prefix with elementary pomsets
Unobserved cyclic interval:
creating a PSP using the expanded prefix
Two unobserved elementary acyclic pomsets:
• s3 [a5, a9]
• s9 [a8, a9]
Two unobserved elementary cyclic pomsets:
• s5 [a3, a6, a4, a7]
• s11 [a4, a7, a3, a6]
Verbalization of elementary mismatch patterns
Change pattern, condition and verbalization:
• Causality / Concurrency
– if e' < e: In the log, after σ, λ(e') occurs before λ(e), while in the model they are concurrent
– else: In the model, after σ, λ(f') occurs before λ(f), while in the log they are concurrent
• Conflict
– if e' || e: In the log, after σ, λ(e') and λ(e) are concurrent, while in the model they are mutually exclusive
– else if f' || f: In the model, after σ, λ(f') and λ(f) are concurrent, while in the log they are mutually exclusive
– else if e' < e: In the log, after σ, λ(e') occurs before task λ(e), while in the model they are mutually exclusive after σ
– else: In the model, after σ, λ(f') occurs before λ(f), while in the log they are mutually exclusive
• Task skipping
– if e ≠ ⊥: In the log, after σ, λ(e) is optional
– else: In the model, after σ, λ(f) is optional
• Task substitution
– In the log, after σ, λ(f) is substituted by λ(e)
• Unmatched repetition
– In the log, λ(e) is repeated after σ
• Task relocation
– if e ≠ ⊥: In the log, λ(e) occurs after σ instead of σ'
– else: In the model, λ(f) occurs after σ instead of σ'
• Task insertion / absence
– if e ≠ ⊥: In the log, λ(e) occurs after σ and before σ'
– else: In the model, λ(f) occurs after σ and before σ'
• Unobserved acyclic interval
– In the log, interval ... does not occur after σ
• Unobserved cyclic interval
– In the log, the cycle involving interval ... does not occur after σ
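The verbalization rules above are essentially sentence templates selected by a condition. A minimal sketch for three of the patterns follows. The function names and the use of `None` to stand for ⊥ are assumptions, and how σ, λ(e) and λ(f) are obtained from the PSP is outside the scope of this sketch.

```python
def task_skipping(sigma, e_label=None, f_label=None):
    """Task skipping: e_label is None when the log event is absent (e = ⊥)."""
    if e_label is not None:
        return f"In the log, after {sigma}, {e_label} is optional"
    return f"In the model, after {sigma}, {f_label} is optional"


def task_substitution(sigma, e_label, f_label):
    """Task substitution: the model task f_label is replaced by e_label in the log."""
    return f"In the log, after {sigma}, {f_label} is substituted by {e_label}"


def unobserved_cyclic_interval(sigma, interval):
    """Additional model behavior: a cycle of the model never seen in the log."""
    return f"In the log, the cycle involving {interval} does not occur after {sigma}"


print(task_skipping("{A, B}", e_label="C"))
print(task_substitution("{A}", e_label="X", f_label="Y"))
print(unobserved_cyclic_interval("{A}", "[B, C]"))
```

Keeping one function per pattern mirrors the table: each pattern has its own fixed wording and, where needed, a condition that picks between the log-side and the model-side sentence.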
Implementation
Standalone Java tool: ProConformance
OSGi plugin for Apromore: Compare
– Input: BPMN process model and a log (MXML
or XES format). Also accepts:
• Two BPMN models for model comparison and
• Two logs for log delta analysis
– Output: set of difference statements
Evaluation
1. Qualitative evaluation on a real-life process:
– Traffic fines management process in Italy with
150,370 traces, 231 distinct traces
2. Quantitative evaluation on two large process
model collections:
– IBM Business Integration Unit (BIT): 735 models
– SAP R/3: 604 models
3. User evaluation (academics vs practitioners)
Qualitative evaluation:
traffic fines model
[Figure: traffic fines process model. Nodes: Start, Create Fine, Payment, Send Fine, Insert Fine Notification, Add Penalty, Appeal to Judge, Send for Credit Collection, Notify Result Appeal to Offender, Insert Date Appeal to Prefecture, Receive Result Appeal from Prefecture, Send Appeal to Prefecture, End, Tau10]
Qualitative evaluation:
trace alignment
• Replay a Log on Petri Net for Conformance Analysis:
205 misalignments out of 231 alignments
• Replay a Log on Petri Net for All Optimal Alignments:
406 misalignments out of 412 alignments
Qualitative evaluation:
verbalization
15 distinct statements in total, e.g.
1. In the log, “Send for credit collection” occurs after
“Payment” and before the end state
2. In the model, after “Insert fine notification”, “Add penalty”
occurs before “Appeal to judge”, while in the log they are
concurrent
3. In the log, after “Add penalty”, “Receive results appeal
from prefecture” is substituted by “Appeal to judge”
4. In the log, the cycle involving “Insert date appeal to
prefecture, Send appeal to prefecture, Receive result
appeal from prefecture, Notify result appeal to offender”
does not occur after “Insert fine notification”.
Qualitative evaluation:
verbalization
2. In the model, after “Insert fine notification”, “Add penalty” occurs
before “Appeal to judge”, while in the log they are concurrent
4. In the log, the cycle involving “Insert date appeal to prefecture,
Send appeal to prefecture, Receive result appeal from prefecture,
Notify result appeal to offender” does not occur after “Insert fine
notification”.
Cannot be detected by trace alignment,
as diagnostics are provided at the level
of individual traces
Cannot be entirely detected by trace
alignment, as this difference concerns
additional model behavior, while
alignment-based ETC conformance
only detects escaping edges
Qualitative evaluation: summary
Verbalization:
• produces a more compact yet more understandable
diagnosis
• exposes behavioral differences that are difficult or
impossible to identify using trace alignment
Quantitative evaluation
• For each model, we generated an event log using the
ProM plugin “Generate Event Log from Petri Net”
• This plugin generates a distinct log trace for each
possible execution sequence in the model
• The tool was only able to parse 274 models from the BIT
collection, and 438 models from the R/3 collection,
running into out-of-memory exceptions for the remaining
models
• Total models: 712 sound Workflow nets
Quantitative evaluation:
model complexity
Quantitative evaluation:
log size
Total log size (events)
Quantitative evaluation:
time performance
[Figure: execution time against log size (#events) for trace alignment and verbalization on the BIT and SAP collections. Series: no noise, 5%, 10%, 15% and 20% noise. Time axes are in ms or s depending on the panel.]
Quantitative evaluation:
results
Statements Misalignments
Escaping edges
Quantitative evaluation: summary
• Verbalization, although generally slower than trace
alignment, shows reasonable execution times
(within 10s)
• In extreme cases (logs with over 8,000 events in
distinct traces and a high number of differences), the
execution time is still below 2 minutes
• Verbalization consistently produces a more compact
difference diagnosis than trace alignment
User evaluation
• Online survey:
– a simple Petri net with 31 nodes (10 visible transitions),
created from a real-life claims handling process model
– assumed that this model was accompanied by a log with 53
traces
• Output of the alignment method (misalignments +
Petri net with alignment information overlaid)
vs
• Output of the verbalization method (list of statements)
User evaluation
• Respondents compared both methods using the Technology
Acceptance Model:
1. What is the easiest approach for checking the conformance of an
event log to a process model?
2. What is the easiest approach for identifying the differences between a
process model and an event log?
3. What is the most useful approach for checking the conformance of an
event log to a process model?
4. What is the most useful approach for identifying the differences
between a process model and an event log?
5. Which approach would you likely use for checking the conformance of
an event log to a process model?
6. Which approach would you likely use for identifying the differences
between a process model and an event log?
• Seven-point Likert scale: “Strongly prefer Alignment” to “Strongly prefer Verbalization”
• Background: academic vs professional
• Experience in process modelling
• Confidence in modelling with Petri nets
User evaluation: hypotheses
H1: respondents would have a preference for
verbalization
H2: respondents with less experience, familiarity,
confidence and competence in the use of
Petri nets would have a stronger preference
for verbalization
User evaluation: results
• Academics (38 responses)
– More familiar in working with Petri nets
– More competent in working with Petri nets
– Analysed and created more models in the
past 12 months
• Professionals (33 responses)
– Less familiar with Petri nets
– Mostly rely on professional training
User evaluation: results
• H1:
– Tested for the full sample and for the two cohorts separately
– For the full sample there is no general preference for our method: the
median was zero (“neutral”)
– Professionals did show a preference for verbalization (especially along
ease of use) while academics preferred alignment, so H1 is supported
for the professionals cohort only
• H2:
– Respondents with more experience, familiarity, confidence and
competence in working with Petri nets have a stronger preference for
alignments
– H2 is supported by the results
User evaluation: summary
• Academics prefer alignment
• Professionals prefer verbalization
• Overall, people with less expertise in the use of
Petri nets show a stronger preference for
verbalization
Limitations of the approach
• Input log is assumed to consist of sequences of event labels
– timestamps are ignored
– event payloads are ignored
• Simplicity of the concurrency oracle used (α+), leading to
occasional difficulties in the presence of
– short loops
– skipped and/or duplicated tasks
• Lack of visual representation of differences (text only)
• No option to use different levels of abstraction
• No statistical support for differences: all equally important even if
some may be very infrequent
Future work
• Employing a more accurate concurrency oracle (e.g. local α)*
• Group related statements to trade accuracy with greater interpretability
• Add statistical support to statements
• Visual representation of differences in addition to natural language
statements (e.g. via “representative” runs)
• Capturing non-control-flow deviance
– Analysis of underlying data
– Resources
– Temporal aspects
• Use differences as a basis for model repair
*Armas-Cervantes, A., Dumas, M., & La Rosa, M. (2016) Discovering Local Concurrency Relations in Business
Process Event Logs, https://eprints.qut.edu.au/97615
Thank you for your attention

Complete and Interpretable Conformance Checking of Business Processes

  • 1. Complete and Interpretable Conformance Checking of Business Processes Luciano García-Bañuelos University of Tartu, Estonia Nick van Beest Data61 | CSIRO, Australia Marlon Dumas University of Tartu, Estonia Marcello La Rosa Queensland University of Technology, Australia Willem Mertens Queensland University of Technology, Australia
  • 2. Conformance checking 1. Compliance auditing – detect deviations with respect to a normative model (unfitting behavior) 2. Model maintenance – unfitting behavior – additional model behavior 3. Automated process model discovery – Iterative model improvement
  • 3. Given a process model M and an event log L, explain the differences between the process behavior observed in M and L
  • 4. State of the art Current approaches: • Are designed to identify the number and exact location of the differences • Don’t provide a “high-level” diagnosis that easily allows analysts to pinpoint differences: – Are unable to identify differences across traces – Are unable to fully characterize extra model behavior not present in the log
  • 5. An example Desired conformance output: • task C is optional in the log • the cycle including IGDF is not observed in the log Log traces: ABCDEH ACBDEH ABCDFH ACBDFH ABDEH ABDFH
  • 6. Our approach A method for business process conformance checking that: 1. Identifies all differences between the behavior in the model and the behavior in the log 2. Describes each difference via a natural language statement
  • 7. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  • 8. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  • 9. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  • 10. How does it work? Difference statements Event log Input model PESM unfold PESL merge Partially Synchronized Product (PSP) compare extract differences
  • 11. Prime event structure (PES) A Prime Event Structure (PES) is a graph of events, where each event e represents the occurrence of a task in the modeled system (e.g. a business process) As such, multiple occurrences of the same task are represented by different events Pairs of events in a PES can have one of the following binary relations: • Causality: event e is a prerequisite for e' • Conflict: e and e' cannot occur in the same execution • Concurrency: no order can be established between e and e'
  • 12. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3 PO runs: {e0,f0,g0}:A
  • 13. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  • 14. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  • 15. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  • 16. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {f2}:E {e3}:E {g2}:E {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
  • 17. From event log to PES Log: Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 PO runs: {e0,f0,g0}:A {e1,f1}:B {f2}:E {e3}:E {g2}:E {e2}:C {g1}:D e0:A e1:B e2:C e3:E f0:A f1:B f2:E g0:A g1:D g2:E t1, t2 → p1 t3 → p2 t4 → p3
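The grouping of traces into partially ordered runs (t1, t2 → p1; t3 → p2; t4 → p3) can be sketched as follows. This is a simplified illustration, assuming that each label occurs at most once per trace and that the concurrency relation (here B and C) has already been supplied by a concurrency oracle. The function names are hypothetical.

```python
def partial_order_run(trace, concurrent):
    """Return the causal order of a trace as a frozenset of (label, label) pairs.

    Assumption: each label occurs at most once in the trace.
    `concurrent` is a set of frozenset label pairs given by a concurrency oracle.
    """
    order = set()
    for i, a in enumerate(trace):
        for b in trace[i + 1:]:
            if frozenset((a, b)) not in concurrent:
                order.add((a, b))
    # Transitive closure, so that two runs are equal exactly when their orders agree.
    changed = True
    while changed:
        changed = False
        for a, b in list(order):
            for c, d in list(order):
                if b == c and (a, d) not in order:
                    order.add((a, d))
                    changed = True
    return frozenset(order)


def group_runs(log, concurrent):
    """Group trace references that induce the same partially ordered run."""
    runs = {}
    for ref, trace in log.items():
        runs.setdefault(partial_order_run(trace, concurrent), []).append(ref)
    return sorted(runs.values())


log = {"t1": "ABCE", "t2": "ACBE", "t3": "ABE", "t4": "ADE"}
concurrent = {frozenset("BC")}
print(group_runs(log, concurrent))
```

Traces t1 and t2 differ only in the order of the concurrent tasks B and C, so they collapse into one run; the remaining step of prefix-merging the runs into a PES is not shown here.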
  • 18. From model to PES BPMN model Petri net
  • 19. From model to PES Branching process
  • 20. From model to PES Complete prefix unfolding Cutoff event Corresponding event Cutoff event Corresponding event
  • 21. PES prefix unfolding Complete prefix unfolding PES prefix unfolding Cutoff event Corresponding event Corresponding event Cutoff event
  • 23. Comparing PESs Log PES EL Model PES prefix unfolding EM e0:A e1:B e2:C e3:D e4:E e5:E e6:E Trace Ref N A B C E t1 3 A C B E t2 2 A B E t3 2 A D E t4 3 A B D E C f0:A f1:B f2:C f3:D f4:E f5:E
  • 24. lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E f0:A f1:B f2:C f3:D f4:E f5:E
  • 25. lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E f0:A f1:B f2:C f3:D f4:E f5:E
  • 26. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match D match C f0:A f1:B f2:C f3:D f4:E f5:E
  • 27. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} match C lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match D match C f0:A f1:B f2:C f3:D f4:E f5:E
  • 28. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} match C match E lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match D match C f0:A f1:B f2:C f3:D f4:E f5:E
  • 29. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} rhide C match C lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} match E lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match D match C f0:A f1:B f2:C f3:D f4:E f5:E
  • 30. match B lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B} rhide C match C match E match E lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C} lh = {}, rh = {} m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E} lh = {}, rh = {f2:C} m = {(e0,f0)A,(e1,f1)B,(e4,f4)E} lh = {}, rh = {} m = {(e0,f0)A} match A lh = {}, rh = {} m = {} Comparing PESs (cont’d) e0:A e1:B e2:C e3:D e4:E e5:E e6:E match D match C In the log, C is optional after {A,B}, whereas in the model it is not (task skipping) f0:A f1:B f2:C f3:D f4:E f5:E
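The match / lhide / rhide operations can be illustrated on plain sequences. The sketch below is a deliberate simplification and not the PSP construction itself: it synchronizes one log trace with one model run using dynamic programming, whereas the approach in the slides traverses two event structures. The function name `synchronize` is hypothetical. As in the slides, `lhide` hides a log event and `rhide` hides a model event.

```python
def synchronize(log_trace, model_trace):
    """Synchronize two label sequences with a minimum number of hide operations."""
    n, m = len(log_trace), len(model_trace)
    # cost[i][j]: minimum number of hides needed for log_trace[i:] and model_trace[j:]
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n, -1, -1):
        for j in range(m, -1, -1):
            if i == n and j == m:
                continue
            options = []
            if i < n and j < m and log_trace[i] == model_trace[j]:
                options.append(cost[i + 1][j + 1])      # match
            if i < n:
                options.append(1 + cost[i + 1][j])      # lhide (log event)
            if j < m:
                options.append(1 + cost[i][j + 1])      # rhide (model event)
            cost[i][j] = min(options)

    # Walk the table from the start to recover one optimal sequence of operations.
    ops, i, j = [], 0, 0
    while i < n or j < m:
        if (i < n and j < m and log_trace[i] == model_trace[j]
                and cost[i][j] == cost[i + 1][j + 1]):
            ops.append(("match", log_trace[i]))
            i, j = i + 1, j + 1
        elif i < n and cost[i][j] == 1 + cost[i + 1][j]:
            ops.append(("lhide", log_trace[i]))
            i += 1
        else:
            ops.append(("rhide", model_trace[j]))
            j += 1
    return ops


print(synchronize("ABE", "ABCE"))
```

For the log trace ABE and the model run ABCE, the only hide is an `rhide` of C after matching A and B, which corresponds to the task-skipping statement on the slide.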
  • 31. Elementary mismatch patterns Unfitting behavior patterns: • Relation mismatch patterns 1. Causality-Concurrency 2. Conflict • Event mismatch patterns 3. Task skipping 4. Task substitution 5. Unmatched repetition 6. Task relocation 7. Task insertion / absence Additional model behavior patterns: 8. Unobserved acyclic interval 9. Unobserved cyclic interval
  • 32. Example: Causality / Concurrency
  • 34. Unobserved cyclic interval: PES and PES prefix unfolding A B C D Log PES EL Model PES prefix unfolding EM
  • 35. Pomsets (partially ordered multisets) • A pomset is a Directed Acyclic Graph where: – the nodes are configurations – the edges represent direct causality relations between configurations – an edge is labeled by an event • Unlike an event structure, a pomset does not have any conflict relation, since a pomset represents one possible execution • The behavior of a PES can be characterized by the set of pomsets it induces • In the case of a PES prefix, the set of induced pomsets is infinite when the PES prefix captures cyclic behavior via cc-pairs • We cannot enumerate all pomsets of a PES prefix to compare with the PES of the log • Therefore, we can extract a set of elementary pomsets (inspired by the notion of elementary paths), which collectively cover all the possible pomsets induced by a PES prefix • Cyclic behavior is not required to be unfolded infinitely
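The idea borrowed from elementary paths can be sketched on an ordinary directed graph. This is only an analogy under stated assumptions: the approach in the slides works on pomsets of a PES prefix with cc-pairs, while the hypothetical `elementary_cycles` function below enumerates elementary cycles of a small graph given as an adjacency dictionary. It shows why a finite set of elementary cycles is enough to describe behavior that can repeat indefinitely.

```python
def elementary_cycles(graph):
    """Enumerate the elementary cycles of a small directed graph.

    `graph` maps each node to a list of successors. Each cycle is reported
    once, as the list of its nodes starting from its smallest node.
    """
    cycles = []

    def extend(start, node, path):
        for nxt in graph.get(node, []):
            if nxt == start:
                cycles.append(path[:])          # closed a cycle back to the start
            elif nxt > start and nxt not in path:
                extend(start, nxt, path + [nxt])

    for start in sorted(graph):
        extend(start, start, [start])
    return cycles


# A precedes B; B and C form a loop; C can exit to D.
graph = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}
print(elementary_cycles(graph))
```

The graph admits infinitely many paths from A to D, yet a single elementary cycle (B, C) characterizes all of the repetition.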
  • 36. Unobserved cyclic interval: expanded prefix with elementary pomsets
  • 37. Unobserved cyclic interval: creating a PSP using the expanded prefix Two unobserved elementary acyclic pomsets: • s3 [a5, a9] • s9 [a8, a9] Two unobserved elementary cyclic pomsets: • s5 [a3, a6, a4, a7] • s11 [a4, a7, a3, a6]
  • 38. Verbalization of elementary mismatch patterns (change pattern, condition, verbalization)
    – Causality / Concurrency
      • if e' < e: In the log, after σ, λ(e') occurs before λ(e), while in the model they are concurrent
      • else: In the model, after σ, λ(f') occurs before λ(f), while in the log they are concurrent
    – Conflict
      • if e' || e: In the log, after σ, λ(e') and λ(e) are concurrent, while in the model they are mutually exclusive
      • else if f' || f: In the model, after σ, λ(f') and λ(f) are concurrent, while in the log they are mutually exclusive
      • else if e' < e: In the log, after σ, λ(e') occurs before task λ(e), while in the model they are mutually exclusive after σ
      • else: In the model, after σ, λ(f') occurs before λ(f), while in the log they are mutually exclusive
    – Task skipping
      • if e ≠ ┴: In the log, after σ, λ(e) is optional
      • else: In the model, after σ, λ(f) is optional
    – Task substitution: In the log, after σ, λ(f) is substituted by λ(e)
    – Unmatched repetition: In the log, λ(e) is repeated after σ
    – Task relocation
      • if e ≠ ┴: In the log, λ(e) occurs after σ instead of σ'
      • else: In the model, λ(f) occurs after σ instead of σ'
  • 39. Verbalization of elementary mismatch patterns (change pattern, condition, verbalization)
    – Task insertion / absence
      • if e ≠ ┴: In the log, λ(e) occurs after σ and before σ'
      • else: In the model, λ(f) occurs after σ and before σ'
    – Unobserved acyclic interval: In the log, interval ... does not occur after σ
    – Unobserved cyclic interval: In the log, the cycle involving interval ... does not occur after σ
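The verbalization step amounts to filling a sentence template chosen by pattern and side. The sketch below covers only a few of the patterns and uses hypothetical names (`TEMPLATES`, `verbalize`, and the slot names `sigma`, `task`, `first`, `second`, `expected`, `observed`). The sentence wording follows the slides, while the data layout is an assumption.

```python
# Templates keyed by (pattern, side); wording taken from the verbalization slides.
TEMPLATES = {
    ("causality_concurrency", "log"):
        "In the log, after {sigma}, {first} occurs before {second}, "
        "while in the model they are concurrent",
    ("causality_concurrency", "model"):
        "In the model, after {sigma}, {first} occurs before {second}, "
        "while in the log they are concurrent",
    ("task_skipping", "log"): "In the log, after {sigma}, {task} is optional",
    ("task_skipping", "model"): "In the model, after {sigma}, {task} is optional",
    ("task_substitution", "log"):
        "In the log, after {sigma}, {expected} is substituted by {observed}",
    ("unmatched_repetition", "log"): "In the log, {task} is repeated after {sigma}",
}


def verbalize(pattern, side, **slots):
    """Render one difference statement from a mismatch pattern and its context."""
    return TEMPLATES[(pattern, side)].format(**slots)


print(verbalize("task_skipping", "log", sigma="{A, B}", task="C"))
```

Keeping the templates in a table separate from the detection logic would make it straightforward to reword or translate statements without touching the comparison algorithm.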
  • 40. Implementation Standalone Java tool: ProConformance OSGi plugin for Apromore: Compare – Input: BPMN process model and a log (MXML or XES format). Also accepts: • Two BPMN models for model comparison and • Two logs for log delta analysis – Output: set of difference statements
  • 41. Evaluation 1. Qualitative evaluation on real life process: – Traffic fines management process in Italy with 150,370 traces, 231 distinct traces 2. Quantitative evaluation on two large process model collections: – IBM Business Integration Unit (BIT): 735 models – SAP R/3: 604 models 3. User evaluation (academics vs practitioners)
  • 42. Qualitative evaluation: traffic fines model Start Create Fine Payment Send Fine Insert Fine Notification Add Penalty Appeal to Judge Send for Credit Collection Notify Result Appeal to Offender Insert Date Appeal to Prefecture Receive Result Appeal from Prefecture Send Appeal to Prefecture End Tau10
  • 43. Qualitative evaluation: trace alignment • Replay a Log on Petri Net for Conformance Analysis: 205 misalignments out of 231 alignments • Replay a Log on Petri Net for All Optimal Alignments: 406 misalignments out of 412 alignments
  • 44. Qualitative evaluation: verbalization 15 distinct statements in total, e.g. 1. In the log, “Send for credit collection” occurs after “Payment” and before the end state 2. In the model, after “Insert fine notification”, “Add penalty” occurs before “Appeal to judge”, while in the log they are concurrent 3. In the log, after “Add penalty”, “Receive results appeal from prefecture” is substituted by “Appeal to judge” 4. In the log, the cycle involving “Insert date appeal to prefecture, Send appeal to prefecture, Receive result appeal from prefecture, Notify result appeal to offender” does not occur after “Insert fine notification”.
  • 45. Qualitative evaluation: verbalization 2. In the model, after “Insert fine notification”, “Add penalty” occurs before “Appeal to judge”, while in the log they are concurrent. (Cannot be detected by trace alignment, as diagnostics are provided at the level of individual traces.) 4. In the log, the cycle involving “Insert date appeal to prefecture, Send appeal to prefecture, Receive result appeal from prefecture, Notify result appeal to offender” does not occur after “Insert fine notification”. (Cannot be entirely detected by trace alignment, as this difference concerns additional model behavior, while alignment-based ETC conformance only detects escaping edges.)
  • 46. Qualitative evaluation: summary Verbalization: • produces a more compact yet more understandable diagnosis • exposes behavioral differences that are difficult or impossible to identify using trace alignment
  • 47. Quantitative evaluation • For each model, we generated an event log using the ProM plugin “Generate Event Log from Petri Net” • This plugin generates a distinct log trace for each possible execution sequence in the model • The tool was only able to parse 274 models from the BIT collection, and 438 models from the R/3 collection, running into out-of-memory exceptions for the remaining models • Total models: 712 sound Workflow nets
  • 50. Quantitative evaluation: time performance [Figure: four plots, one per method (trace alignment, verbalization) and model collection (BIT, SAP). Axes: log size (#events) and time (ms or s). Series: no noise, 5% noise, 10% noise, 15% noise, 20% noise.]
  • 52. Quantitative evaluation: summary • Verbalization, although generally slower than trace alignment, shows reasonable execution times (within 10 s) • In extreme cases (logs with over 8,000 events in distinct traces and a high number of differences), the execution time is still below 2 minutes • Verbalization consistently produces a more compact difference diagnosis than trace alignment
  • 53. User evaluation • Online survey: – a simple Petri net with 31 nodes (10 visible transitions), created from a real-life claims handling process model – assumed that this model was accompanied by a log with 53 traces • Output of the alignment method (misalignments + Petri net with alignment information overlaid) vs • Output of the verbalization method (list of statements)
  • 54. User evaluation • Respondents compared both methods using the Technology Acceptance Model: 1. What is the easiest approach for checking the conformance of an event log to a process model? 2. What is the easiest approach for identifying the differences between a process model and an event log? 3. What is the most useful approach for checking the conformance of an event log to a process model? 4. What is the most useful approach for identifying the differences between a process model and an event log? 5. Which approach would you likely use for checking the conformance of an event log to a process model? 6. Which approach would you likely use for identifying the differences between a process model and an event log? • Seven point Likert-scale: “Strongly prefer Alignment” to “Strongly prefer Verbalization” • Background: academic vs professional • Experience in process modelling • Confidence in modelling with Petri nets
  • 55. User evaluation: hypotheses H1: respondents would have a preference for verbalization H2: respondents with less experience, familiarity, confidence and competence in the use of Petri nets would have a stronger preference for verbalization
  • 56. User evaluation: results • Academics (38 responses) – More familiar in working with Petri nets – More competent in working with Petri nets – Analysed and created more models in the past 12 months • Professionals (33 responses) – Less familiar with Petri nets – Mostly rely on professional training
  • 57. User evaluation: results • H1: – Tested for the full sample and for the two cohorts separately – For the full sample there is no general preference for our method: the median was zero (“neutral”) – Professionals did show a preference for verbalization (especially along ease of use) while academics preferred alignment, so H1 is supported for the professionals cohort only • H2: – Respondents with more experience, familiarity, confidence and competence in working with Petri nets have a stronger preference for alignments – H2 is supported by the results
  • 58. User evaluation: summary • Academics prefer alignment • Professionals prefer verbalization • Overall, people with less expertise in the use of Petri nets show a stronger preference for verbalization
  • 59. Limitations of the approach • Input log is assumed to consist of sequences of event labels – timestamps are ignored – event payloads are ignored • Simplicity of the used concurrency oracle (α+), leading to occasional difficulties in the presence of – short loops – skipped and/or duplicated tasks • Lack of visual representation of differences (text only) • No option to use different levels of abstraction • No statistical support for differences: all equally important even if some may be very infrequent
  • 60. Future work • Employing a more accurate concurrency oracle (e.g. local α)* • Group related statements to trade accuracy for greater interpretability • Add statistical support to statements • Visual representation of differences in addition to natural language statements (e.g. via “representative” runs) • Capturing non-control-flow deviance – Analysis of underlying data – Resources – Temporal aspects • Use differences as a basis for model repair *Armas-Cervantes, A., Dumas, M., & La Rosa, M. (2016) Discovering Local Concurrency Relations in Business Process Event Logs, https://eprints.qut.edu.au/97615
  • 61. Thank you for your attention

Editor's Notes

  • #3: Compliance auditing: detect deviations in the process execution with respect to the behavior stipulated by a normative model. Model maintenance: the first situation (unfitting behavior) suggests that the model needs to be extended to capture the unfitting behavior. The second situation (additional model behavior) suggests that there are paths in the model that have become spurious over time, meaning they are no longer used and need to be pruned if they are found to have lost relevance. Iterative improvement of the process model: first, an initial model is discovered using any discovery algorithm; then conformance is measured and, based on this result, the behavior of the model is refined to better align it to that of the log, e.g. it is restricted to remove paths in the model never observed, in order to avoid over-generalization, and extended to add paths observed in the log but not captured in the model.
  • #5: Related work. MEASURING UNFITTING BEHAVIOR. Replay fitness and ICS extension: replay a trace on the Petri net to detect missing tokens that need to be added to a place to replay the trace, and remaining tokens that are left in the Petri net once the trace has been fully replayed. The ICS extension trades accuracy for performance. Vanden Broucke et al. propose a SESE decomposition to improve performance and offer more localized feedback. General limitation: error recovery is performed locally each time an error is detected, thus these methods may not identify the minimum number of differences (errors) to explain the unfitting behavior. This limitation is solved by trace alignment, which however still provides feedback at the level of individual traces, rather than behavioral relations between events. MEASURING ADDITIONAL MODEL BEHAVIOR. Negative events (Negative Events Precision): additional “negative” events are added, and if the model can replay any of these negative events when replaying a trace, then this is a case of extra model behavior. However, this method is heuristic and cannot guarantee that all extra behavior is detected. While there are improvements on performance (the method is exponential, as the number of possible negative events is exponentially large), these still do not guarantee 100% accuracy in detecting extra behavior. Trace automata (ETC Conformance): a prefix automaton is built from the log, such that each state in the automaton corresponds to a given trace prefix. Then for each state, by replaying the prefix on the model, a corresponding state in the model is detected, and if the set of enabled transitions in this state includes events that cannot be taken at that state in the log, then this is detected as an “escaping edge” (i.e. a sink) in the automaton, pinpointing the beginning of extra model behavior, and the replay is continued.
Two limitations: i) unable to handle duplicate labels and silent transitions, and ii) assumes fully-fitting log. These limitations are solved by the: Alignment-based ETC Conformance: first, an optimal alignment is calculated, including silent moves on model. Then the model-projection of this alignment is used to compute the automaton, after which traces are replayed to detect escaping edges. General limitation of trace automata: they cannot fully characterize the extra behavior in the model, but only pinpoint where this extra behavior takes place and with what task it starts.
  • #6: The first statement characterizes the behavior observed in the log but not in the model: in the model, task C is compulsory, while in the log C is skippable. The second statement characterizes the behavior observed in the model but not in the log. Trace alignment would produce two optimal alignments: one between ABDEH of the log and ABCDEH of the model, the other between ABDFH of the log and ABCDFH of the model. From this one can infer that task C is optional in the log (move on log only). 1) However, the number of misaligned traces is often very large, rendering this inference quite hard in practice. Visualizations, e.g. on top of the Petri net and at an aggregate level, can help, but fundamentally the problem is that trace alignment provides feedback at the level of individual traces, not at the level of behavioral relations observed in the log but not captured in the model. 2) Moreover, trace alignment would detect that there is escaping behavior starting with “Request addition information” at a trace prefix finishing with “Notify rejection”, but it will not identify that the extra behavior includes tasks IG and that IGDF is behavior that can be repeated in the model but not in the log. For example, task “Assess application” can be repeated in the model but not in the log.
  • #8: Input model in BPMN and event log in XES or MXML. BPMN is converted to Petri nets using Remco, Chun and Marlon’s technique (Information and Software Technology).
  • #9: The Petri net is unfolded using McMillan’s complete prefix unfolding technique. From the log, first a set of partially-ordered runs is extracted over a concurrency relation discovered from the log. Then these runs are prefix-merged into a Prime Event Structure (PES).
  • #10: The PSP is a representation of a synchronized traversal of two input PESs, such that when a discrepancy is detected it is explicitly recorded and the traversal resumes from a “suitable” configuration in each of the two PESs. If the discrepancy is of type “unfitting log behavior”, this will be recorded by a node of the PSP, so we can enumerate all unfitting behavior. To expose additional model behavior, we define a notion of coverage of the PES extracted from the model by the PES extracted from the log. The parts of the model PES not covered by the log PES are the additional model behavior.
  • #11: Each discrepancy falls under one of a set of disjoint patterns. For each pattern, we have a verbalization of the difference.
  • #12: An event is an occurrence of a task in the business process, as represented by the model or the log. Note that conflict is “inherited” by causality. For example, if e # e’ and e’ <= e’’, then e # e’’. Reference: Nielsen et al., “Petri nets, event structures and domains, part I”, TCS, 1981.
  • #18: To simplify, in an event structure we don’t show: Transitive causality relations, e.g. A is causal to E Hereditary conflict, e.g. B and C are in conflict with {g2}E Concurrency: every pair of events that is neither directly nor transitively causally related, nor in conflict of course, is concurrent. E.g. B and C are concurrent Every state of a PES (called “configuration”) is represented by a set of events. A configuration is: causally closed, meaning that for each event e in C, C includes all causal predecessors of e, and conflict-free, meaning that there cannot be any pair of events in conflict within C. e is extension of C if {e} U C is also a configuration. A maximal configuration is a configuration that is maximal w.r.t. set inclusion Lossless representation: the set of maximal configurations of the PES is equivalent to the set of runs inferred from the log, modulo the inaccuracy of the concurrency oracle. The complexity of building such an event structure from a log is cubic on the length of the longest trace. We merge events with the same label (e.g. e0 and f0) and which have same history (same prefix)
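The two conditions that define a configuration can be checked directly. The sketch below is illustrative only: the event ids loosely mirror the slide (e0:A, e1:B, e2:C, g1:D), but the `causes` and `conflicts` tables and the function name `is_configuration` are assumptions, and the conflict set is taken to already include inherited conflicts.

```python
def is_configuration(events, causes, conflicts):
    """Check that a set of events is causally closed and conflict-free.

    causes:    dict mapping an event to the set of its direct causal predecessors
    conflicts: set of frozenset pairs, assumed to include inherited conflicts
    """
    events = set(events)
    # Causally closed: every direct cause of a member is a member. Because the
    # check applies to every member, transitive causes are covered as well.
    causally_closed = all(causes.get(e, set()) <= events for e in events)
    # Conflict-free: no two members are in conflict.
    conflict_free = all(
        frozenset((a, b)) not in conflicts
        for a in events for b in events if a != b
    )
    return causally_closed and conflict_free


# Hypothetical fragment: A precedes B, C and D; D conflicts with B and with C.
causes = {"e1": {"e0"}, "e2": {"e0"}, "g1": {"e0"}}
conflicts = {frozenset(("e1", "g1")), frozenset(("e2", "g1"))}

print(is_configuration({"e0", "e1", "e2"}, causes, conflicts))  # closed and conflict-free
print(is_configuration({"e1"}, causes, conflicts))              # misses its cause e0
print(is_configuration({"e0", "e1", "g1"}, causes, conflicts))  # e1 and g1 conflict
```

An event e extends a configuration C when `is_configuration(C | {e}, ...)` still holds, and a configuration with no such extension is maximal.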
  • #19: The model is a variation of what we showed before, with C being optional and without loopback after F. Requirements: The Petri net must be safe The silent transitions will be eliminated when constructing the PES
  • #20: First, we construct the branching process by prefix-merging all the partially ordered runs induced by the Petri net. We do so because branching processes explicitly represent the same set of behavioral relations as event structures. Transitions in the branching process represent events in the event structure, and the behavioral relations in the branching process can thus be used to generate an event structure. However, the event structure that we create by using a set of inductive rules, includes silent transitions. It has been shown (Abel’s IS paper) that silent transitions can be abstracted away in a behavioral-preserving manner, under the well-known notion of visible-pomset equivalence (which requires that sink events in the event structure are not silent, and we can always add fake ones). In the example, the future of t2 and t4 (C) is isomorphic, which is unfolded in the branching process separately for both t2 and t4. Furthermore, the future of E and F is isomorphic. Therefore, we can safely stop unfolding the branching process once we reach t4, provided that we continue unfolding from t2 and onwards.
  • #21: McMillan showed that for a safe net, a prefix of a branching process that unfolds each loop once fully encodes the behavior of the original net. Such a prefix is referred to as the complete prefix unfolding of the net. We use Esparza’s optimization, which has been shown to produce compact unfoldings. The trick is to find transitions which have isomorphic futures. When we stop unfolding after t4, we call this the cutoff event. The event with the isomorphic future, t2, is called the corresponding event. The resulting cc-pair (t4,t2) (cutoff-corresponding pair) is graphically depicted here with the red line. Similarly for E and F.
  • #22: We call the PES derived from a complete prefix unfolding a PES prefix unfolding, or simply a prefix PES. This translates to a PES prefix unfolding as shown in this slide. Reasoning about possible executions of a PES prefix unfolding is not convenient because some configurations are not explicitly represented. To make it more convenient to explore the configurations of a PES prefix unfolding, we use the “shift” operation on net unfoldings. Given a cc-pair (t4, t2), since the futures of [t4] and [t2] are isomorphic, we can “shift” from one configuration to the other. In other words, the shift operation is a “step” function that allows us to move from one configuration to another. Note that in a PES we retain those tau transitions that are involved in a cc-pair and safely get rid of the rest. The extraction of a complete prefix unfolding from a Petri net is, in the worst case, exponential in the size of the net.
  • #23: For loop relations, the advantage of the prefix unfolding and shift operations is clear, as it allows us to identify elementary pomsets (partially ordered multisets), so that elementary loops can be identified and such that the PES prefix is not required to unfold the cyclic behavior infinitely. This corresponds to unfolding every cycle so that it is traversed only once. We will talk about pomsets later, when illustrating how to identify additional (cyclic and acyclic) model behavior.
  • #24: Let’s take this log and model, and their corresponding PESs
  • #26: Event matching must be label preserving (the two events must have the same label) and order preserving (the two events must be consistent with the causal relation in the input PESs). For example, I cannot match two events enabled at a given synchronized state if one is causal to the current event while the other is concurrent, even if these have the same label.
  • #27: When matching, traversal of enabled events at a given configuration is done in lexicographical order to guarantee determinism.
  • #28: we can also match C and D
  • #31: In the example, note that C is optional in the log and mandatory in the model ONLY after state {A,B} and not always, e.g. in the model after {A} I can execute D and thus skip C. At each synchronized state, the set of enabled events in the two PESs is checked, to identify those that are label-preserving and order-preserving. Label preservation is a simple check, but order-preservation requires a backward traversal of the event structure to check that all causality relations are maintained between each of the enabled events and the configuration path in the event structure. If this is not the case, the algorithm returns the set of events that have a causality discrepancy, the cut-off events being traversed and the set of all causality relations that are violated. This is later used to characterize behavioral mismatches. The PSP construction aims at finding the optimal matchings (i.e. maximum number of matchings, meaning minimum number of hides) for every maximal configuration of the log PES. Hence, priority is given to the log. A PSP with no hide operations identifies a situation where the log is fully fitting into the model. --- Complexity of PSP construction: we use an A* heuristic to find the optimal number of matches, so the worst case is O(3^(nPES1 x nPES2)), where nPESx is the number of configurations of PESx. 3 is the branching factor, i.e. the average number of successors per state (match, lhide and rhide). Indeed, each configuration of PES1 is associated with a configuration of PES2 via 3 possible operations.
  • #32: We only report on immediate causality (not transitive causality) and direct conflict (not inherited conflict) because we want to report each mismatch once: 1. immediate causality vs concurrency; 2. direct conflict vs concurrency, and direct conflict vs immediate causality. Each mismatch occurs in a given context, i.e. a pair of configurations, one for each PES. Relation mismatch patterns are O(n), where n is the number of arcs of the PSP (via optimizations of O(n^3)). --- Task absence / insertion is a “catch all” pattern, essentially saying that there is a task at a given configuration in the PES of the log but not in the corresponding configuration in the PES of the model.
  • #33: The pattern observed is the lhide and rhide of the same label in two different event structures. This happens because each configuration is conflict-free, so two conflicting events cannot co-exist in the same configuration (in our example a2 and b2). Note that it’s not enough to check that we have an lhide and an rhide of the same label; we also need to make sure that the two events being hidden are not order-preserving w.r.t. the enabling state. In our example, a2 is concurrent with a1 while b2 is causally related to b1, with the enabling state being {(a0,b0)A, (a2,b1)B}. Given that there can be operations that are commutative, these two hides do not necessarily need to be contiguous. For example, here after matching A I can lhide C and then match B. Note that the algorithm for computing the PSP applies partial order reductions from the field of model checking to remove redundant paths which lead to the same maximal configuration due to the concurrent enablement of events.
  • #35: To characterize additional model behavior not observed in the log we define a notion of “coverage”: essentially every elementary path (no repetition) and elementary cycle (first and last event of the sequence being the same, and if I remove the last event the cycle becomes an elementary path) of the prefix PES must be covered by a maximal configuration of the log PES, after hide operations have been applied. If this is the case, then there isn’t any additional model behavior. Elementary paths and cycles are borrowed from Graph Theory. The nice thing is that in this way we do not detect a difference between the log and the model if in the log we observe a finite number of iterations of a cycle while in the model the cycle is clearly infinite. This is because we only need the elementary cycle in the prefix PES to be covered by the log PES. --- In the example, paths in the log PES are highlighted in the model PES We can see in this example that the log behavior is perfectly fitting the model, as each of the paths in the PES of the log exist in the PES of the model. However, the PES of the model has a loop with two entry points and two exit points, while the PES of the log doesn’t have any repetitive behavior (each entry point is in fact an exit point).
  • #36: In the case of a PES prefix, the set of induced pomsets is infinite when the PES prefix captures cyclic behavior via cc-pairs. Therefore, we cannot enumerate all pomsets of a PES prefix in order to check if each of them is observed in the PES of the log. The set of elementary pomsets collectively cover all the possible pomsets induced by a PES prefix, such that every cycle is traversed only once.
  • #39: e ≠ ┴ means that the event is defined in the log. σ is the state where the behavioral discrepancy is observed. --- Note: in the case of unmatched repetition in the log, the idea is very simple: if we have an lhide of a label that has already been observed in the history of the PSP, we are facing repeated log behavior that is not matched by the model.