Complete and Interpretable Conformance Checking of Business Processes

Luciano García-Bañuelos University of Tartu, Estonia
Nick van Beest Data61 | CSIRO, Australia
Marlon Dumas University of Tartu, Estonia
Marcello La Rosa Queensland University of Technology, Australia
Willem Mertens Queensland University of Technology, Australia

Conformance checking
1. Compliance auditing
– detect deviations with respect to a normative
model (unfitting behavior)
2. Model maintenance
– unfitting behavior
– additional model behavior
3. Automated process model discovery
– Iterative model improvement

Given a process model M and an event log L,
explain the differences between the process
behavior captured in M and the behavior observed in L

State of the art
Current approaches:
• Are designed to identify the number and exact
location of the differences
• Don’t provide a “high-level” diagnosis that
easily allows analysts to pinpoint differences:
– Are unable to identify differences across traces
– Are unable to fully characterize extra model
behavior not present in the log

An example
Desired conformance output:
• task C is optional in the log
• the cycle including IGDF is not observed in the log
Log traces:
ABCDEH
ACBDEH
ABCDFH
ACBDFH
ABDEH
ABDFH

Our approach
A method for business process conformance checking
that:
1. Identifies all differences between the behavior in the
model and the behavior in the log
2. Describes each difference via a natural language
statement

How does it work?
Input model → unfold → PESM (model PES)
Event log → merge → PESL (log PES)
PESM, PESL → compare → Partially Synchronized Product (PSP)
PSP → extract differences → Difference statements

Prime event structure (PES)
A Prime Event Structure (PES) is a graph of events, where
each event e represents the occurrence of a task in the modeled
system (e.g. a business process)
As such, multiple occurrences of the same task are represented
by different events
Pairs of events in a PES can have one of the following binary
relations:
• Causality: event e is a prerequisite for e'
• Conflict: e and e' cannot occur in the same execution
• Concurrency: no order can be established between e and e'
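The three relations can be made concrete with a small data structure. The sketch below is only an illustration: the class name `PES`, its fields and the `relation` method are hypothetical, and the example events reuse the labels A, B, C, D and E that appear in the log example later in the deck.

```python
from dataclasses import dataclass, field


@dataclass
class PES:
    """Minimal prime event structure: labeled events plus two stored relations."""
    labels: dict                                  # event id -> task label
    causality: set = field(default_factory=set)   # (e, e2): e is a prerequisite for e2
    conflict: set = field(default_factory=set)    # frozenset({e, e2}): never in the same execution

    def relation(self, e, e2):
        """Return the behavioral relation between two distinct events."""
        if (e, e2) in self.causality:
            return "causality"
        if (e2, e) in self.causality:
            return "inverse causality"
        if frozenset((e, e2)) in self.conflict:
            return "conflict"
        return "concurrency"  # neither ordered nor in conflict


# Hypothetical example: A, then B and C in any order, then E; D is an alternative branch.
pes = PES(
    labels={"e0": "A", "e1": "B", "e2": "C", "e3": "E", "g1": "D"},
    causality={("e0", "e1"), ("e0", "e2"), ("e0", "e3"),
               ("e1", "e3"), ("e2", "e3"), ("e0", "g1")},
    conflict={frozenset(("g1", "e1")), frozenset(("g1", "e2")), frozenset(("g1", "e3"))},
)
print(pes.relation("e1", "e2"))
```

Concurrency is not stored here: it is derived as the absence of both causality and conflict, which mirrors the way the three relations partition pairs of events.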

From event log to PES
Log:
Trace      Ref  N
A B C E    t1   3
A C B E    t2   2
A B E      t3   2
A D E      t4   3
PO runs:
• p1 (from t1, t2): e0:A, then e1:B and e2:C, then e3:E
• p2 (from t3): f0:A, f1:B, f2:E
• p3 (from t4): g0:A, g1:D, g2:E
Merging the PO runs, step by step:
1. {e0,f0,g0}:A
2. {e1,f1}:B, {e2}:C, {g1}:D
3. {f2}:E, {e3}:E, {g2}:E
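The step from traces to partially ordered (PO) runs can be sketched as follows. This is a simplified illustration, not the tool's implementation: it assumes a concurrency oracle has already reported that B and C are concurrent, and that each label occurs at most once per trace. The names `partial_order` and `group_into_runs` are hypothetical.

```python
from collections import defaultdict

# Log from the slides: trace reference -> sequence of task labels
log = {
    "t1": ["A", "B", "C", "E"],
    "t2": ["A", "C", "B", "E"],
    "t3": ["A", "B", "E"],
    "t4": ["A", "D", "E"],
}

# Assumed output of a concurrency oracle: B and C are concurrent
concurrent = {frozenset(("B", "C"))}


def partial_order(trace):
    """Ordered label pairs of a trace, dropping pairs marked as concurrent.

    Simplification: assumes each label occurs at most once per trace.
    """
    return frozenset(
        (a, b)
        for i, a in enumerate(trace)
        for b in trace[i + 1:]
        if frozenset((a, b)) not in concurrent
    )


def group_into_runs(log):
    """Group traces that induce the same partial order into one run."""
    runs = defaultdict(list)
    for ref, trace in log.items():
        runs[partial_order(trace)].append(ref)
    return sorted(runs.values())


runs = group_into_runs(log)
print(runs)  # [['t1', 't2'], ['t3'], ['t4']]
```

The three groups correspond to the runs p1, p2 and p3 on the slide. Merging the runs into a single PES by shared prefixes is a separate step that this sketch does not cover.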

From model to PES
BPMN model → Petri net → Branching process → Complete prefix unfolding
(Figures: the complete prefix unfolding marks each cutoff event and its corresponding event)

PES prefix unfolding
(Figure: the complete prefix unfolding and the PES prefix unfolding side by side, each marking cutoff events and their corresponding events)

Loop relations
(Figure: loop relations between events labeled A, B, C and D)

Comparing PESs
Log:
Trace      Ref  N
A B C E    t1   3
A C B E    t2   2
A B E      t3   2
A D E      t4   3
Log PES EL:
e0:A
e1:B e2:C e3:D
e4:E e5:E e6:E
Model PES prefix unfolding EM:
f0:A
f1:B f2:C f3:D
f4:E f5:E
PSP construction, step by step:
1. start: lh = {}, rh = {}, m = {}
2. match A: lh = {}, rh = {}, m = {(e0,f0)A}
3. match B: lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B}
4a. match C: lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B,(e2,f2)C}
5a. match E: lh = {}, rh = {}, m = {(e0,f0)A,(e1,f1)B,(e2,f2)C,(e5,f4)E}
4b. rhide C (alternative to 4a): lh = {}, rh = {f2:C}, m = {(e0,f0)A,(e1,f1)B}
5b. match E: lh = {}, rh = {f2:C}, m = {(e0,f0)A,(e1,f1)B,(e4,f4)E}
Other edges shown in the figure: match C, match D
Resulting difference statement:
In the log, C is optional after {A,B}, whereas in the model it is not
(task skipping)
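The state updates in this walkthrough can be sketched in a few lines. The code below is a hedged illustration: `State`, `match`, `lhide` and `rhide` are hypothetical names, `lhide` is assumed by symmetry with the `rhide` operation shown on the slides, and the search strategy that decides which operation to apply is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class State:
    """A PSP state: hidden log events (lh), hidden model events (rh), matched pairs (m)."""
    lh: frozenset = frozenset()
    rh: frozenset = frozenset()
    m: frozenset = frozenset()


def match(s, e, f):
    """Pair log event e with model event f (label agreement is assumed, not checked)."""
    return State(s.lh, s.rh, s.m | {(e, f)})


def lhide(s, e):
    """Hide log event e (assumed counterpart of rhide for the log side)."""
    return State(s.lh | {e}, s.rh, s.m)


def rhide(s, f):
    """Hide model event f."""
    return State(s.lh, s.rh | {f}, s.m)


# Replay of the branch in which C is hidden on the model side
s = State()
s = match(s, "e0", "f0")   # match A
s = match(s, "e1", "f1")   # match B
s = rhide(s, "f2")         # rhide C
s = match(s, "e4", "f4")   # match E
print(sorted(s.rh), sorted(s.m))
```

Keeping states immutable makes it easy to branch: the state after "match B" can be extended once with "match C" and once with "rhide C" without copying by hand.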

Elementary mismatch patterns
Unfitting behavior patterns:
• Relation mismatch patterns
1. Causality-Concurrency
2. Conflict
• Event mismatch patterns
3. Task skipping
4. Task substitution
5. Unmatched repetition
6. Task relocation
7. Task insertion / absence
Additional model behavior patterns:
8. Unobserved acyclic interval
9. Unobserved cyclic interval

Example: Causality / Concurrency

Unobserved cyclic interval:
PES and PES prefix unfolding
(Figure: log PES EL and model PES prefix unfolding EM over tasks A, B, C and D)

Pomsets
(partially ordered multisets)
• A pomset is a Directed Acyclic Graph where:
– the nodes are configurations
– the edges represent direct causality relations between configurations
– an edge is labeled by an event
• Unlike an event structure, a pomset does not have any conflict
relation, since a pomset represents one possible execution
• The behavior of a PES can be characterized by the set of pomsets it
induces
• In the case of a PES prefix, the set of induced pomsets is infinite
when the PES prefix captures cyclic behavior via cc-pairs
• We cannot enumerate all pomsets of a PES prefix to compare them with
the PES of the log
• Instead, we extract a set of elementary pomsets (inspired by
the notion of elementary paths), which collectively cover all the
possible pomsets induced by a PES prefix
• As a result, cyclic behavior does not need to be unfolded infinitely
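The notion of elementary paths that inspires elementary pomsets can be illustrated on an ordinary directed graph. The sketch below is only an analogy under stated assumptions: it enumerates paths that visit no node twice, on a hypothetical graph with a cycle between `a` and `b`. It does not compute pomsets.

```python
def elementary_paths(graph, source, target):
    """Yield every path from source to target that visits no node twice."""
    stack = [(source, [source])]
    while stack:
        node, path = stack.pop()
        if node == target:
            yield path
            continue
        for nxt in graph.get(node, []):
            if nxt not in path:  # never revisit a node: keeps the path elementary
                stack.append((nxt, path + [nxt]))


# Hypothetical graph with a cycle between a and b
graph = {"s": ["a"], "a": ["b", "t"], "b": ["a", "t"]}
paths = sorted(elementary_paths(graph, "s", "t"))
print(paths)  # [['s', 'a', 'b', 't'], ['s', 'a', 't']]
```

Although the graph admits infinitely many walks from `s` to `t`, the set of elementary paths is finite, which is the property the elementary pomsets rely on.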

expanded prefix with elementary pomsets

creating a PSP using the expanded prefix
Two unobserved elementary acyclic pomsets:
• s3 [a5, a9]
• s9 [a8, a9]
Two unobserved elementary cyclic pomsets:
• s5 [a3, a6, a4, a7]
• s11 [a4, a7, a3, a6]

Verbalization of elementary mismatch patterns
Each change pattern is listed with its condition (if any) and verbalization:
• Causality / Concurrency
– if e' < e: In the log, after σ, λ(e') occurs before λ(e), while in the
model they are concurrent
– else: In the model, after σ, λ(f') occurs before λ(f), while in the
log they are concurrent
• Conflict
– if e' || e: In the log, after σ, λ(e') and λ(e) are concurrent, while
in the model they are mutually exclusive
– else if f' || f: In the model, after σ, λ(f') and λ(f) are concurrent,
while in the log they are mutually exclusive
– else if e' < e: In the log, after σ, λ(e') occurs before λ(e), while
in the model they are mutually exclusive
– else: In the model, after σ, λ(f') occurs before λ(f), while in the
log they are mutually exclusive
• Task skipping
– if e ≠ ⊥: In the log, after σ, λ(e) is optional
– else: In the model, after σ, λ(f) is optional
• Task substitution
– In the log, after σ, λ(f) is substituted by λ(e)
• Unmatched repetition
– In the log, λ(e) is repeated after σ
• Task relocation
– if e ≠ ⊥: In the log, λ(e) occurs after σ instead of σ'
– else: In the model, λ(f) occurs after σ instead of σ'
• Task insertion / absence
– if e ≠ ⊥: In the log, λ(e) occurs after σ and before σ'
– else: In the model, λ(f) occurs after σ and before σ'
• Unobserved acyclic interval
– In the log, interval ... does not occur after σ
• Unobserved cyclic interval
– In the log, the cycle involving interval ... does not occur after σ
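Two of these templates can be sketched as plain string functions. This is a minimal illustration rather than the tool's code: the function names are hypothetical, `None` stands in for ⊥, and rendering the context σ as a brace-enclosed list is an assumption.

```python
def fmt_context(sigma):
    """Render the context σ as a brace-enclosed list of task labels (assumed format)."""
    return "{" + ", ".join(sigma) + "}"


def verbalize_task_skipping(sigma, log_label=None, model_label=None):
    """Task skipping: e ≠ ⊥ selects the log wording, otherwise the model wording."""
    if log_label is not None:  # e ≠ ⊥
        return f"In the log, after {fmt_context(sigma)}, {log_label} is optional"
    return f"In the model, after {fmt_context(sigma)}, {model_label} is optional"


def verbalize_task_substitution(sigma, log_label, model_label):
    """Task substitution: the model's task λ(f) is replaced by the log's task λ(e)."""
    return (f"In the log, after {fmt_context(sigma)}, "
            f"{model_label} is substituted by {log_label}")


print(verbalize_task_skipping(["A", "B"], log_label="C"))
# In the log, after {A, B}, C is optional
```

The remaining patterns follow the same shape: a condition picks the template, and the labels λ(e), λ(f) and the contexts σ, σ' fill its slots.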

Implementation
Standalone Java tool: ProConformance
OSGi plugin for Apromore: Compare
– Input: BPMN process model and a log (MXML
or XES format). Also accepts:
• Two BPMN models for model comparison and
• Two logs for log delta analysis
– Output: set of difference statements

Evaluation
1. Qualitative evaluation on a real-life process:
– Traffic fines management process in Italy with
150,370 traces, 231 distinct traces
2. Quantitative evaluation on two large process
model collections:
– IBM Business Integration Technology (BIT): 735 models
– SAP R/3: 604 models
3. User evaluation (academics vs practitioners)

Qualitative evaluation:
traffic fines model
(Figure: process model with nodes Start, Create Fine, Payment, Send Fine, Insert Fine Notification, Add Penalty, Appeal to Judge, Send for Credit Collection, Notify Result Appeal to Offender, Insert Date Appeal to Prefecture, Receive Result Appeal from Prefecture, Send Appeal to Prefecture, End, and Tau10)

trace alignment
• Replay a Log on Petri Net for Conformance Analysis:
205 misalignments out of 231 alignments
• Replay a Log on Petri Net for All Optimal Alignments:
406 misalignments out of 412 alignments

verbalization
15 distinct statements in total, e.g.
1. In the log, “Send for credit collection” occurs after
“Payment” and before the end state
2. In the model, after “Insert fine notification”, “Add penalty”
occurs before “Appeal to judge”, while in the log they are
concurrent
3. In the log, after “Add penalty”, “Receive result appeal
from prefecture” is substituted by “Appeal to judge”
4. In the log, the cycle involving “Insert date appeal to
prefecture, Send appeal to prefecture, Receive result
appeal from prefecture, Notify result appeal to offender”
does not occur after “Insert fine notification”.

verbalization
2. In the model, after “Insert fine notification”, “Add penalty” occurs
before “Appeal to judge”, while in the log they are concurrent
– Cannot be detected by trace alignment, as diagnostics are provided
at the level of individual traces
4. In the log, the cycle involving “Insert date appeal to prefecture,
Send appeal to prefecture, Receive result appeal from prefecture,
Notify result appeal to offender” does not occur after “Insert fine
notification”.
– Cannot be entirely detected by trace alignment, as this difference
concerns additional model behavior, while alignment-based ETC
conformance only detects escaping edges

Qualitative evaluation: summary
Verbalization:
• produces a more compact yet more understandable
diagnosis
• exposes behavioral differences that are difficult or
impossible to identify using trace alignment

Quantitative evaluation
• For each model, we generated an event log using the
ProM plugin “Generate Event Log from Petri Net”
• This plugin generates a distinct log trace for each
possible execution sequence in the model
• The tool was only able to parse 274 models from the BIT
collection, and 438 models from the R/3 collection,
running into out-of-memory exceptions for the remaining
models
• Total models: 712 sound Workflow nets

Quantitative evaluation:
model complexity

log size
Total log size (events)

time performance
(Figures: execution time (ms or s) against log size (#events) for trace alignment and for verbalization, on the BIT and SAP collections; one panel compares no noise with 5%, 10%, 15% and 20% noise)

results
(Figure: statements compared with misalignments and escaping edges)

Quantitative evaluation: summary
• Verbalization, although generally slower than trace
alignment, shows reasonable execution times
(within 10s)
• In extreme cases (logs with over 8,000 events in distinct
traces and a high number of differences), the
execution time is still below 2 minutes
• Verbalization consistently produces a more compact
difference diagnosis than trace alignment

User evaluation
• Online survey:
– a simple Petri net with 31 nodes (10 visible transitions),
created from a real-life claims handling process model
– assumed that this model was accompanied by a log with 53
traces
• Output of the alignment method (misalignments +
Petri net overlaid with alignment information)
vs
• Output of the verbalization method (list of statements)

User evaluation
• Respondents compared both methods using the Technology
Acceptance Model:
1. What is the easiest approach for checking the conformance of an
event log to a process model?
2. What is the easiest approach for identifying the differences between a
process model and an event log?
3. What is the most useful approach for checking the conformance of an
event log to a process model?
4. What is the most useful approach for identifying the differences
between a process model and an event log?
5. Which approach would you likely use for checking the conformance of
an event log to a process model?
6. Which approach would you likely use for identifying the differences
between a process model and an event log?
• Seven-point Likert scale: “Strongly prefer Alignment” to “Strongly prefer Verbalization”
• Background: academic vs professional
• Experience in process modelling
• Confidence in modelling with Petri nets

User evaluation: hypotheses
H1: respondents would have a preference for
verbalization
H2: respondents with less experience, familiarity,
confidence and competence in the use of
Petri nets would have a stronger preference
for verbalization

User evaluation: results
• Academics (38 responses)
– More familiar in working with Petri nets
– More competent in working with Petri nets
– Analysed and created more models in the
past 12 months
• Professionals (33 responses)
– Less familiar with Petri nets
– Mostly rely on professional training

User evaluation: results
• H1:
– Tested for the full sample and for the two cohorts separately
– For the full sample there is no general preference for our method: the
median was zero (“neutral”)
– Professionals did show a preference for verbalization (especially along
ease of use) while academics preferred alignment, so H1 is supported
for the professionals cohort only
• H2:
– Respondents with more experience, familiarity, confidence and
competence in working with Petri nets have a stronger preference for
alignments
– H2 is supported by the results

User evaluation: summary
• Academics prefer alignment
• Professionals prefer verbalization
• Overall, people with less expertise in the use of
Petri nets show a stronger preference for
verbalization

Limitations of the approach
• Input log is assumed to consist of sequences of event labels
– timestamps are ignored
– event payloads are ignored
• Simplicity of the concurrency oracle used (α+), leading to
occasional difficulties in the presence of
– short loops
– skipped and/or duplicated tasks
• Lack of visual representation of differences (text only)
• No option to use different levels of abstraction
• No statistical support for differences: all equally important even if
some may be very infrequent

Future work
• Employing a more accurate concurrency oracle (e.g. local α)*
• Group related statements to trade accuracy for greater interpretability
• Add statistical support to statements
• Visual representation of differences in addition to natural language
statements (e.g. via “representative” runs)
• Capturing non-control-flow deviance
– Analysis of underlying data
– Resources
– Temporal aspects
• Use differences as a basis for model repair
*Armas-Cervantes, A., Dumas, M., & La Rosa, M. (2016) Discovering Local Concurrency Relations in Business
Process Event Logs, https://eprints.qut.edu.au/97615
