Using Siamese Graph Neural Networks for
Similarity-Based Retrieval in Process-Oriented
Case-Based Reasoning
Maximilian Hoffmann, Lukas Malburg, Patrick Klein,
and Ralph Bergmann
Department of Business Information Systems II
University of Trier, Germany
www.wi2.uni-trier.de
hoffmannm@uni-trier.de
Funded under grants BE 1373/3-3 and 375342983
Outline
• Motivation
• Related Work
• Approach for Learning Similarities of Workflow Graphs via Siamese
Graph Neural Networks (GNNs)
• Experimental Evaluation
• Conclusion and Future Work
Motivation
• Semantic graphs as cases in Process-Oriented CBR:
– Similarity assessment: a kind of inexact subgraph matching
– Similarity considers the structure of nodes and edges as well as their semantic annotations
– High complexity leads to long retrieval times
• Two-phase MAC/FAC retrieval
– MAC: a computationally cheap similarity measure for a fast pre-selection of candidates
– FAC: the expensive graph matching is applied to the pre-selected cases (see the sketch below)
– Existing automatically-learned POCBR MAC/FAC approaches use embeddings …
• Advantage: Usage of fast and simple vector similarity measures
– … but complex semantics are handled insufficiently
• No consideration of graph structure and semantic annotations
• Decreased retrieval utility in more complex domains
• The advantages of the embedding representation should be carried over to other
approaches
Goal: Similarity assessment of complex graphs via Siamese GNNs
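A minimal Python sketch of the two-phase MAC/FAC scheme described above; fast_sim and exact_sim are hypothetical placeholders for the cheap MAC measure and the expensive FAC graph matching, not any particular implementation:

def mac_fac_retrieve(query, case_base, fast_sim, exact_sim, filter_size, k):
    """Two-phase retrieval: cheap pre-selection (MAC), expensive re-ranking (FAC)."""
    # MAC phase: rank all cases with the fast, approximate similarity
    # and keep only the filter_size best candidates.
    candidates = sorted(case_base, key=lambda c: fast_sim(query, c),
                        reverse=True)[:filter_size]
    # FAC phase: apply the costly graph-matching similarity only to the
    # pre-selected candidates and return the k most similar cases.
    return sorted(candidates, key=lambda c: exact_sim(query, c),
                  reverse=True)[:k]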
Related Work
1. Embedding-based approach by Klein et al. (2019)
• Automatically-learned low-dimensional embedding vectors for graph nodes
• Similarity of two graphs given by aggregating node vectors and applying a
vector similarity measure
• Very fast method, but lacks proper integration of graph structure and semantic
annotations into embedding vectors
2. Cluster-based approach by Müller and Bergmann (2014)
• Based on the cluster structure of the case base
• Retrieval searches for clusters that are similar to the query
• No additional modeling effort required
• Does not reach the performance of the feature-based approach
3. Domain-specific feature-based case representation by Bergmann and
Stromer (2013)
• Simplified version of the native graph representation
• Similarity computation uses only the feature representation, not the graph
• Requires additional modeling effort (features are modeled manually)
Graph Similarity Assessment with
Siamese GNNs
• Challenges:
1. Transforming all relevant information of semantic graphs into a
format interpretable by neural networks
2. Finding a Siamese GNN architecture that produces meaningful
graph similarities with short retrieval times
• Contributions:
– Encoding method for semantic graphs
• Supports complex semantic descriptions, including nested entries
• Supports the complete graph structure with all nodes and edges
– Two Siamese GNN architectures by Li et al. (2019) [4]
• Different levels of computational complexity
• Modified for retrieving semantic graphs
Encoding Semantic Graphs for
Similarity Learning
• Composition of node and edge encodings:
– Encoding of the node and edge type
– Encoding of semantic annotations based on ProCAKE data types such
as key-value pairs, lists, numeric values, strings, taxonomies, or sets
• The sequence of node and edge vectors is stacked into a matrix
• The resulting matrix can be processed by neural networks (see the sketch below)
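As an illustration only (the actual encoding of ProCAKE data types is more elaborate), the following sketch shows the basic idea: each node becomes a fixed-length vector combining a one-hot type encoding with a padded annotation vector, and the per-node vectors are stacked into a matrix. All field names below are hypothetical.

import numpy as np

def encode_graph(nodes, node_types, annotation_dim):
    """Stack per-node encodings (type one-hot + padded annotation) into a matrix."""
    rows = []
    for node in nodes:
        type_vec = np.zeros(len(node_types))
        type_vec[node_types.index(node["type"])] = 1.0   # node/edge type encoding
        sem_vec = np.zeros(annotation_dim)
        annotation = node.get("annotation", [])          # assumed already-numeric features
        n = min(len(annotation), annotation_dim)
        sem_vec[:n] = annotation[:n]                     # pad or truncate to fixed length
        rows.append(np.concatenate([type_vec, sem_vec]))
    return np.stack(rows)                                # one row per node

# Hypothetical workflow nodes:
nodes = [{"type": "task", "annotation": [0.2, 0.7]},
         {"type": "data", "annotation": [1.0]}]
print(encode_graph(nodes, ["task", "data", "control"], annotation_dim=4).shape)  # (2, 7)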
Graph Embedding Model (GEM)
• Encoder:
− Transforms raw encoded data
into node and edge embeddings
− Uses feed-forward and recurrent
neural networks
• Propagation Layer:
− Propagates embeddings of connected
nodes within each of the two graphs
− Captures information on the
neighborhood of individual nodes
• Aggregator:
− Merges the final node embeddings
into a whole-graph embedding
− Graph similarity via a pairwise vector similarity measure
Complexity O(n)
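A minimal NumPy sketch of the GEM idea (encode, propagate along the graph's own edges, aggregate, compare), not the authors' TensorFlow model; the weight matrices are assumed to be given:

import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def gem_embed(node_feats, edges, w_enc, w_prop, rounds=3):
    """Embed one graph: encoder -> propagation rounds -> sum aggregation."""
    h = relu(node_feats @ w_enc)              # encoder: per-node embeddings of dim d
    for _ in range(rounds):                   # propagation layers
        msgs = np.zeros_like(h)
        for src, dst in edges:                # messages only along the graph's own edges
            msgs[dst] += h[src]
        h = relu(np.concatenate([h, msgs], axis=1) @ w_prop)  # w_prop has shape (2d, d)
    return h.sum(axis=0)                      # aggregator: whole-graph embedding

def graph_similarity(g1, g2):
    """Pairwise vector similarity (here: cosine) between two graph embeddings."""
    return float(g1 @ g2 / (np.linalg.norm(g1) * np.linalg.norm(g2) + 1e-9))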
Graph Matching Network (GMN)
• Encoder:
− Transforms raw encoded data
into node and edge embeddings
− Uses feed-forward and recurrent
neural networks
• Propagation Layer:
− Same goal as propagation layer
in GEM but different scope
− Propagates information across both
graphs via a cross-graph attention mechanism
• Aggregator:
− Merges the final node embeddings
into a whole-graph embedding
− Pairwise similarity computed by a
feed-forward neural network
Complexity O(n²)
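The key difference to GEM is the cross-graph attention in the propagation step. A simplified NumPy sketch (not the exact formulation of Li et al.) of how each node of one graph attends over all nodes of the other graph, which is what makes the GMN quadratic in the number of nodes:

import numpy as np

def cross_graph_match(h1, h2):
    """For each node of graph 1, compute an attention-weighted difference to graph 2."""
    scores = h1 @ h2.T                           # pairwise node similarities (n1 x n2)
    scores -= scores.max(axis=1, keepdims=True)  # numerical stability
    attn = np.exp(scores)
    attn /= attn.sum(axis=1, keepdims=True)      # softmax over graph 2's nodes
    return h1 - attn @ h2                        # per-node cross-graph match vector

These match vectors enter the node update together with the within-graph messages, so every propagation round compares all node pairs of the two graphs.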
Graph Retrieval using GEM or GMN
• Retrieval utilizes one of the neural networks (GEM/GMN):
– Input data: Encoded semantic graphs
– Output data: Pairwise graph similarities
– Prediction of the similarity of each case with the query
– The k most similar cases are returned
– Both neural networks are also applicable in a MAC/FAC setup
• Implementation uses ProCAKE and TensorFlow (see the sketch below)
(https://procake.uni-trier.de)
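A sketch of the retrieval loop with a hypothetical model interface (predict_similarity is a placeholder, not the actual ProCAKE or TensorFlow API):

def retrieve(model, encoded_query, encoded_cases, k=10):
    """Score every case against the query with GEM/GMN and return the k best."""
    scored = [(case_id, model.predict_similarity(encoded_query, encoded_case))
              for case_id, encoded_case in encoded_cases.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)   # descending similarity
    return scored[:k]                                     # k most similar cases

In a MAC/FAC setup, the same loop can serve as the MAC stage by using the filter size instead of k and handing the result to the FAC graph matcher.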
Experimental Evaluation
• Setup:
– Comparison of GEM and GMN to the embedding-based retriever (EBR)
[1] and the feature-based retriever (FBR) [2] in two experiments
1. MAC/FAC retrieval with different filter sizes and values of k (similar to [1])
2. Approximation of the A-Star retriever (A*R) by Bergmann and Gil (2014) [3]
– Experiments for two workflow domains:
• CB-I: Simple cooking recipes (680 training and 120 testing cases)
• CB-II: Complex data mining workflows (529 training and 80 testing cases)
– Examination of retrieval quality and performance
• Hypotheses:
– H1: Using GEM and GMN as the MAC retriever in a MAC/FAC retrieval
leads to better retrieval results than using EBR as the MAC retriever.
– H2: The GMN retriever approximates the ground-truth graph
similarities better than A*R when both retrievers are parameterized
such that their retrieval times are comparable.
Experimental Results (MAC/FAC)
• CB-I:
– FBR achieves the best retrieval quality; GEM and EBR are the fastest retrievers
– GEM and GMN do not outperform EBR
• CB-II:
– FBR still performs very well; GEM and EBR remain the fastest retrievers
– GEM and GMN now outperform EBR
• H1 is partly confirmed: Rejected for CB-I and accepted for CB-II
Experimental Results (A-Star
Approximation)
• CB-I: GMN has the lowest MAE,
while A*R has the highest correctness
• CB-II: GMN has both the lowest MAE
and the highest correctness
• H2: Clearly accepted for CB-II
and partly accepted for CB-I
Conclusion and Future Work
Conclusion
• GEM and GMN show high potential in graph retrieval scenarios, especially in
more complex domains
– GEM as a MAC measure outperforms other machine-learned approaches
– GMN as a FAC measure outperforms an A-Star-based graph matching measure
Future Work
• Optimization of the presented neural networks
– Usage of a differentiable ranking loss function for GEM
– Optimized encoding scheme for more data types and other graph structures
• Evaluation on other domains, e.g., argument graphs, and other types of
complex graph similarity measures
• Investigation of suitable methods for explaining the results of the neural
networks in the context of Explainable Artificial Intelligence (e.g., see XAI
workshop of ICCBR 2019)
Thank you for your
attention!
References
[1] Klein, P., Malburg, L., Bergmann, R.: Learning Workflow Embeddings to Improve
the Performance of Similarity-Based Retrieval for Process-Oriented Case-Based
Reasoning. In: Case-Based Reasoning Research and Development: 27th International
Conference, ICCBR 2019, Germany, pp. 188–203. Springer (2019)
[2] Bergmann, R., Stromer, A.: MAC/FAC Retrieval of Semantic Workflows. In:
Boonthum-Denecke, C., Youngblood, G.M. (eds.) Proceedings of the Twenty-Sixth
International Florida Artificial Intelligence Research Society Conference,
FLAIRS 2013. AAAI Press (2013)
[3] Bergmann, R., Gil, Y.: Similarity Assessment and Efficient Retrieval of Semantic
Workflows. Information Systems 40, pp. 115–127 (2014)
[4] Li, Y., Gu, C., Dullien, T., Vinyals, O., Kohli, P.: Graph Matching Networks for
Learning the Similarity of Graph Structured Objects. In: Chaudhuri, K.,
Salakhutdinov, R. (eds.) Proc. of the 36th Int. Conf. on Machine Learning, ICML
2019, USA. Proc. of Machine Learning Research, vol. 97, pp. 3835–3845. PMLR (2019)