Intern © Siemens AG 2017
Replicated Siamese LSTM in Ticketing System
for Similarity Learning and Retrieval
in Asymmetric Texts
Author(s): Pankaj Gupta (1,2), Bernt Andrassy (2), Hinrich Schütze (1)
Presenter: Pankaj Gupta @COLING2018, Santa Fe, NM, USA. 20th August 2018
(1) CIS, University of Munich (LMU) | (2) Machine Intelligence, Siemens AG | August 2018
May 2017 | Corporate Technology
Contributors
PANKAJ GUPTA
PhD Candidate @LMU
Research Scientist @Siemens
Munich Germany
Dr. Bernt Andrassy
Senior Key Expert @Siemens
Munich Germany
Prof. Hinrich Schütze
@University of Munich (LMU)
(PhD advisor)
Outline
➢ Problem Statement
- Industrial Diagnostic Ticketing System
- Asymmetric Textual Similarity
- Complementary Semantics in Similarity
➢ Background
- Siamese Neural Networks (Siamese-LSTM) in NLP
- Neural Autoregressive Topic Model (DocNADE)
➢ Proposal
- The Replicated Siamese LSTM for STS and IR
- Multi-Channel Manhattan Metric for Complementary Semantics
➢ Evaluations: STS and IR
Problem Statement: STS and IR
Semantic Textual Similarity (STS):
➢ Task: determine whether two texts mean the same thing
➢ Information Retrieval (IR) and text understanding may be improved by
modeling the underlying semantic similarity between texts.
Problem Statement: Industrial Diagnostic Ticketing System
Goal: Retrieve a relevant solution for an input query
SUB DESC SOL
…… …… ……
…… …… ……
…… …… ……
SUB: Narrow Frequency Pulsations
DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar.
SOL: The recommended action is that the machine should not run at load over 28 MW until resolved.
KB of Industrial Tickets
Intelligent Ticketing System (ITS)
#TICKET
QUERY: SUBJECT text, DESCRIPTION text
Recommendation(s): Solution: Recommended Service Action text
Problem Statement: Industrial Diagnostic Ticketing System
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation.
#TICKET (KB):
SUB: Narrow Frequency Pulsations
DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar.
SOL: The recommended action is that the machine should not run at load over 28 MW until resolved.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity, Retrieval
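The query-to-recommendation flow on this slide can be sketched roughly as follows. This is a minimal illustration, not the system described in the talk: the `Query` and `Ticket` classes, the `retrieve` function and the token-overlap `jaccard` score are all hypothetical names introduced here, and the overlap score is only a placeholder for a learned similarity.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class Query:
    sub: str   # SUBJECT text
    desc: str  # DESCRIPTION text


@dataclass
class Ticket:
    sub: str   # SUBJECT text
    desc: str  # DESCRIPTION text
    sol: str   # SOLUTION text (recommended service action)


def jaccard(a: str, b: str) -> float:
    """Placeholder similarity (an assumption): token overlap of two texts."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    union = tokens_a | tokens_b
    return len(tokens_a & tokens_b) / len(union) if union else 0.0


def retrieve(query: Query, kb: List[Ticket],
             similarity: Callable[[str, str], float] = jaccard,
             top_k: int = 1) -> List[Tuple[float, str]]:
    """Score every KB ticket against the query and return the top-k solutions."""
    query_text = query.sub + " " + query.desc
    scored = []
    for ticket in kb:
        ticket_text = ticket.sub + " " + ticket.desc + " " + ticket.sol
        scored.append((similarity(query_text, ticket_text), ticket.sol))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


example_kb = [
    Ticket("Narrow Frequency Pulsations",
           "Low and Narrow frequency pulsations were detected.",
           "The recommended action is that the machine should not run "
           "at load over 28 MW until resolved."),
]
example_query = Query("GT Trip - Low Frequency Pulsations",
                      "GT Tripped due to a sudden increase in Low Frequency Pulsations.")
recommendations = retrieve(example_query, example_kb)
```

Swapping `similarity` for a learned model leaves the retrieval loop unchanged, which is why the sketch takes the scoring function as a parameter.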
cosine similarity (low, narrow) = 0.26
cosine similarity (frequency, load) = 0.14
cosine similarity (frequency, mbar) = 0.25
cosine similarity (frequency, MW) = 0.08
Problem Statement: Significant Term Mismatch
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
#TICKET (KB):
SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity (Significant Term Mismatch)
Problem Statement: Symmetric Textual Similarity
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
#TICKET (KB):
SUB: Narrow Frequency Pulsations
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity over symmetric text pairs (short vs. short texts)
Problem Statement: Symmetric Textual Similarity
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation.
#TICKET (KB):
DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity over symmetric text pairs (long vs. long texts)
Problem Statement: Asymmetric Textual Similarity
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
#TICKET (KB):
DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity over asymmetric text pairs (short vs. multi-sentence texts)
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
#TICKET (KB):
SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity over asymmetric text pairs (short vs. multi-sentence texts)
Complementary Semantic Learning and Similarity
Goal: Retrieve a relevant solution for an input query
QUERY (IN):
SUB: GT Trip - Low Frequency Pulsations
DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation.
#TICKET (KB):
SUB: Narrow Frequency Pulsations
DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar.
SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved.
Recommendation(s) (OUT):
The recommended action is that the machine should not run at loads over 28 MW until resolved.
ITS: similarity with complementary semantics (coarse vs. fine granularity)
similarity in latent topic semantics
Learn Complementary Text Pair Representations in a Highly Structured Space
Background: Siamese Networks in NLP
Reference: Mueller and Thyagarajan, 2016
- dual-branch networks with tied weights
- non-linear metric learning with similarity information
- Invariant and selective representation directly through the
use of similarity and dissimilarity information
- Limited to sentence pairs
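The dual-branch idea with tied weights can be sketched as below. This is a minimal sketch under stated assumptions: a plain tanh RNN stands in for the LSTM to keep the example short, the dimensions and random inputs are hypothetical, and the similarity `exp(-||h_a - h_b||_1)` is the Manhattan-based score used in the cited Mueller and Thyagarajan (2016) work.

```python
import numpy as np

rng = np.random.default_rng(0)
EMB_DIM, HID_DIM = 8, 4  # hypothetical sizes

# A single set of parameters, used by BOTH branches (tied weights).
W_in = rng.normal(scale=0.1, size=(HID_DIM, EMB_DIM))
W_rec = rng.normal(scale=0.1, size=(HID_DIM, HID_DIM))


def encode(word_vectors: np.ndarray) -> np.ndarray:
    """Run a tanh RNN (stand-in for an LSTM) over a (length, EMB_DIM) sequence."""
    h = np.zeros(HID_DIM)
    for x in word_vectors:
        h = np.tanh(W_in @ x + W_rec @ h)
    return h  # final state is the sentence representation


def manhattan_similarity(h_a: np.ndarray, h_b: np.ndarray) -> float:
    """exp(-L1 distance): 1.0 for identical states, approaching 0 as they diverge."""
    return float(np.exp(-np.abs(h_a - h_b).sum()))


sentence_a = rng.normal(size=(5, EMB_DIM))  # 5 random "word vectors"
sentence_b = rng.normal(size=(7, EMB_DIM))  # 7 random "word vectors"
score = manhattan_similarity(encode(sentence_a), encode(sentence_b))
```

Because both inputs pass through the same `encode` function, any update to `W_in` or `W_rec` during training would affect both branches equally.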
Background: Document Neural Autoregressive Topic Model (DocNADE)
Reference: Gupta et al. Document Informed Autoregressive Topic Models. Preprint: https://arxiv.org/pdf/1808.03793.pdf
- Learn word co-occurrences across documents
- Extract latent topics (coarse-grained representation)
- Optimise via a reconstruction loss, where the input should
be reproduced by the decoding part of the network.
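The reconstruction objective can be illustrated with a small sketch that follows the commonly described DocNADE formulation: each word is predicted from a hidden state built from the preceding words, and the loss is the summed negative log-likelihood. The parameter names `W`, `U`, `b`, `c`, the toy sizes and the full softmax output are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
VOCAB, HIDDEN = 6, 3  # hypothetical toy sizes

W = rng.normal(scale=0.1, size=(HIDDEN, VOCAB))  # word -> hidden (topic) weights
U = rng.normal(scale=0.1, size=(VOCAB, HIDDEN))  # hidden -> word scores
b = np.zeros(VOCAB)
c = np.zeros(HIDDEN)


def sigmoid(x: np.ndarray) -> np.ndarray:
    return 1.0 / (1.0 + np.exp(-x))


def docnade_nll(doc):
    """Return (negative log-likelihood, document representation) for word ids."""
    accumulated = np.zeros(HIDDEN)  # running sum of W columns for seen words
    nll = 0.0
    for word in doc:
        h = sigmoid(c + accumulated)       # depends only on preceding words
        logits = b + U @ h
        logits = logits - logits.max()     # numerical stability
        log_probs = logits - np.log(np.exp(logits).sum())
        nll -= log_probs[word]             # penalise failing to reproduce the input
        accumulated += W[:, word]
    return nll, sigmoid(c + accumulated)   # hidden state after the full document


loss, doc_vector = docnade_nll([0, 3, 3, 5])
```

The returned `doc_vector` is the kind of coarse-grained, topic-level representation the slide refers to; minimising `loss` over a corpus is what ties the columns of `W` to word co-occurrence patterns.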
Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese
Query component(s) are subscripted by 1, e.g., X1
Ticket component(s) are subscripted by 2, e.g., X2
[Figure: model diagram with DocNADE components]
Multi-channel for Complementary Semantics
Replicated (the weight label W is repeated across the diagram)
Multi-level Similarity, e.g. SUB1-SUB2, DESC1-DESC2, i.e. symmetric textual similarities
Cross-level Similarity, e.g. SUB1-DESC2, DESC1-SOL2, SUB1-SOL2, DESC1-SUB2, etc., i.e. asymmetric textual similarities
Overall Learning
Proposal: Multi-Channel Manhattan Metric with Complementary Semantics
Multi-Channel Manhattan Metric
Weighted semantic distances
+ Weighted ‘level’ similarity
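The equation itself is not recoverable from the slide text, so the following is only one plausible reading of "weighted semantic distances + weighted level similarity": per component pair, L1 distances from several channels are combined with channel weights, each pair is scaled by a pair weight, and the total is mapped through `exp(-x)`. The channel names (`h` for the LSTM state, `E` for summed embeddings, `T` for topic proportions), all weight values and the vector sizes are assumptions.

```python
from typing import Dict, Tuple

import numpy as np

QUERY_PARTS = ("SUB1", "DESC1")
TICKET_PARTS = ("SUB2", "DESC2", "SOL2")
CHANNELS = ("h", "E", "T")  # assumed channels: LSTM state, embeddings, topics

Reps = Dict[str, Dict[str, np.ndarray]]


def multi_channel_manhattan(reps: Reps,
                            channel_weights: Dict[str, float],
                            pair_weights: Dict[Tuple[str, str], float]) -> float:
    """exp(-sum over (p, q) pairs of pair_weight * weighted L1 distances)."""
    total = 0.0
    for p in QUERY_PARTS:
        for q in TICKET_PARTS:
            # Weighted semantic distance across channels for this pair.
            distance = sum(
                channel_weights[ch] * np.abs(reps[ch][p] - reps[ch][q]).sum()
                for ch in CHANNELS
            )
            # Weighted 'level' contribution of the (p, q) pair.
            total += pair_weights[(p, q)] * distance
    return float(np.exp(-total))


rng = np.random.default_rng(0)
parts = QUERY_PARTS + TICKET_PARTS
reps = {ch: {part: rng.normal(size=4) for part in parts} for ch in CHANNELS}
channel_weights = {"h": 0.7, "E": 0.1, "T": 0.2}          # hypothetical values
pair_weights = {(p, q): 0.1 for p in QUERY_PARTS for q in TICKET_PARTS}
score = multi_channel_manhattan(reps, channel_weights, pair_weights)
```

The pair loop covers both the multi-level pairs (such as SUB1-SUB2) and the cross-level pairs (such as SUB1-SOL2), so a single score reflects symmetric and asymmetric comparisons together.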
Evaluation
➢ Industrial Data Set
- 949 historical tickets in the KB
- 421 pairs labeled
- relatedness labels:
  YES (similar; provides the correct solution): 5.0
  REL (does not provide the correct solution, but is close to one): 3.0
  NO (not related, not relevant, provides no correct solution): 1.0
- average length (#words): SUB 4.6; DESC 65.0; SOL 74.2
➢ Baselines
- Unsupervised (topic and word embedding representations)
- Supervised (Siamese LSTM)
➢ Evaluation and analysis of the proposed Replicated Siamese LSTM
Evaluation: LDA Vs DocNADE
Evaluation: Unsupervised Baselines for STS and IR
SIMILARITY: Pearson correlation (r), Spearman's rank correlation coefficient (ρ), Mean Squared Error (MSE)
IR: Mean Average Precision@k (MAP@k) and Accuracy@k (Acc@k)
Topic (T) semantics from Query and Ticket components, e.g. T(SUB1-SUB2), etc.
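As a rough guide to how these metrics can be computed, here is a NumPy-only sketch. The similarity metrics follow their standard definitions. For the retrieval metrics, definitions vary between papers, so the ones below are assumptions: Acc@k counts a query as correct if any relevant ticket appears in the top k, and AP@k is normalised by min(k, number of relevant tickets). All function names are hypothetical.

```python
import numpy as np


def pearson(gold, pred) -> float:
    return float(np.corrcoef(gold, pred)[0, 1])


def average_ranks(values) -> np.ndarray:
    """1-based ranks; tied values share the mean of their rank positions."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="mergesort")
    ranks = np.empty(len(values))
    i = 0
    while i < len(values):
        j = i
        while j + 1 < len(values) and values[order[j + 1]] == values[order[i]]:
            j += 1
        ranks[order[i:j + 1]] = (i + j) / 2 + 1
        i = j + 1
    return ranks


def spearman(gold, pred) -> float:
    """Spearman's rho is the Pearson correlation of the rank vectors."""
    return pearson(average_ranks(gold), average_ranks(pred))


def mse(gold, pred) -> float:
    gold, pred = np.asarray(gold, dtype=float), np.asarray(pred, dtype=float)
    return float(np.mean((gold - pred) ** 2))


def average_precision_at_k(relevant: set, ranked: list, k: int) -> float:
    hits, precision_sum = 0, 0.0
    for position, item in enumerate(ranked[:k], start=1):
        if item in relevant:
            hits += 1
            precision_sum += hits / position
    denom = min(k, len(relevant))
    return precision_sum / denom if denom else 0.0


def accuracy_at_k(relevant: set, ranked: list, k: int) -> float:
    return 1.0 if any(item in relevant for item in ranked[:k]) else 0.0


def mean_over_queries(metric, runs, k: int) -> float:
    """runs: list of (relevant_set, ranked_list) pairs, one per query."""
    return float(np.mean([metric(rel, ranked, k) for rel, ranked in runs]))
```

MAP@k and Acc@k are then `mean_over_queries(average_precision_at_k, runs, k)` and `mean_over_queries(accuracy_at_k, runs, k)` over all evaluation queries.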
Distributional Semantics (SumEmbedding) (E) for Query and Ticket components, e.g.
E(SUB1-SUB2), E(SUB1+DESC1-SUB2+DESC2+SOL2), etc.
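A SumEmbedding baseline of this kind can be sketched as follows: each text is represented by the sum of its word vectors, and two texts are compared with cosine similarity. The toy three-dimensional vectors are made up for illustration, and the use of cosine as the comparison function is an assumption, since the slides do not state how the summed vectors are scored.

```python
import numpy as np

# Hypothetical toy embeddings; real ones would come from a trained model.
EMBEDDINGS = {
    "low": np.array([1.0, 0.0, 0.0]),
    "narrow": np.array([0.5, 0.5, 0.0]),
    "frequency": np.array([0.0, 1.0, 0.0]),
    "pulsations": np.array([0.0, 0.0, 1.0]),
}
DIM = 3


def sum_embedding(text: str) -> np.ndarray:
    """Sum the vectors of all in-vocabulary tokens; unknown tokens are skipped."""
    vector = np.zeros(DIM)
    for token in text.lower().split():
        if token in EMBEDDINGS:
            vector += EMBEDDINGS[token]
    return vector


def cosine(u: np.ndarray, v: np.ndarray) -> float:
    denom = np.linalg.norm(u) * np.linalg.norm(v)
    return float(u @ v / denom) if denom else 0.0


sub1 = "GT Trip - Low Frequency Pulsations"   # query subject
sub2 = "Narrow Frequency Pulsations"          # ticket subject
e_sub1_sub2 = cosine(sum_embedding(sub1), sum_embedding(sub2))
```

Combined variants such as E(SUB1+DESC1-SUB2+DESC2+SOL2) would follow the same pattern by concatenating the component texts on each side before summing.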
Evaluation: Supervised Baselines for STS and IR
S-LSTM: Compute similarity using the standard Siamese LSTM on query and ticket components
ML-LSTM: Multi-level similarity
CL-LSTM: Cross-level similarity
Multi-channel semantic similarity
7% gain (Acc@10) over the supervised baselines
Qualitative Inspections for STS and IR
topics
Conclusion
➢ Advancement of Siamese architecture for similarity learning in
asymmetric text pairs, instead of sentence pairs only
➢ Proposed Replicated Siamese LSTM with multi/cross level similarity
➢ Gains of 22% and 7% (Acc@10) in IR over the unsupervised and supervised baselines, respectively
➢ Demonstrated that complementary semantics improve similarity and retrieval performance
➢ Addresses and improves a real-world industrial application: the ticketing system
Thanks!

More Related Content

PDF
PowerArtist: RTL Design for Power Platform
PPTX
PAC 2020 Santorin - Joerek Van Gaalen
PPTX
PAC 2020 Santorin - Edoardo Varani
PDF
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language...
PDF
Neural NLP Models of Information Extraction
PDF
Poster: Neural Relation ExtractionWithin and Across Sentence Boundaries
PDF
Poster: Document Informed Neural Autoregressive Topic Models with Distributio...
PDF
Document Informed Neural Autoregressive Topic Models with Distributional Prior
PowerArtist: RTL Design for Power Platform
PAC 2020 Santorin - Joerek Van Gaalen
PAC 2020 Santorin - Edoardo Varani
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language...
Neural NLP Models of Information Extraction
Poster: Neural Relation ExtractionWithin and Across Sentence Boundaries
Poster: Document Informed Neural Autoregressive Topic Models with Distributio...
Document Informed Neural Autoregressive Topic Models with Distributional Prior

More from Pankaj Gupta, PhD (9)

PDF
Neural Relation ExtractionWithin and Across Sentence Boundaries
PDF
Deep Learning for Information Extraction in Natural Language Text
PDF
textTOvec: DEEP CONTEXTUALIZED NEURAL AUTOREGRESSIVE TOPIC MODELS OF LANGUAGE...
PDF
Pankaj Gupta CV / Resume
PDF
LISA: Explaining RNN Judgments via Layer-wIse Semantic Accumulation and Examp...
PDF
Joint Bootstrapping Machines for High Confidence Relation Extraction
PDF
RNN-RSM (Topics over Time) | NAACL2018 conference talk
PDF
Lecture 07: Representation and Distributional Learning by Pankaj Gupta
PDF
Lecture 05: Recurrent Neural Networks / Deep Learning by Pankaj Gupta
Neural Relation ExtractionWithin and Across Sentence Boundaries
Deep Learning for Information Extraction in Natural Language Text
textTOvec: DEEP CONTEXTUALIZED NEURAL AUTOREGRESSIVE TOPIC MODELS OF LANGUAGE...
Pankaj Gupta CV / Resume
LISA: Explaining RNN Judgments via Layer-wIse Semantic Accumulation and Examp...
Joint Bootstrapping Machines for High Confidence Relation Extraction
RNN-RSM (Topics over Time) | NAACL2018 conference talk
Lecture 07: Representation and Distributional Learning by Pankaj Gupta
Lecture 05: Recurrent Neural Networks / Deep Learning by Pankaj Gupta
Ad

Recently uploaded (20)

PPTX
C1 cut-Methane and it's Derivatives.pptx
PDF
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
PDF
Warm, water-depleted rocky exoplanets with surfaceionic liquids: A proposed c...
PPTX
Fluid dynamics vivavoce presentation of prakash
PPTX
Application of enzymes in medicine (2).pptx
PDF
. Radiology Case Scenariosssssssssssssss
PDF
Worlds Next Door: A Candidate Giant Planet Imaged in the Habitable Zone of ↵ ...
PPTX
Science Quipper for lesson in grade 8 Matatag Curriculum
PPTX
Pharmacology of Autonomic nervous system
PPT
1. INTRODUCTION TO EPIDEMIOLOGY.pptx for community medicine
PPTX
BODY FLUIDS AND CIRCULATION class 11 .pptx
PDF
The scientific heritage No 166 (166) (2025)
PDF
Phytochemical Investigation of Miliusa longipes.pdf
DOCX
Q1_LE_Mathematics 8_Lesson 5_Week 5.docx
PPTX
perinatal infections 2-171220190027.pptx
PDF
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
PPTX
Introcution to Microbes Burton's Biology for the Health
PDF
Biophysics 2.pdffffffffffffffffffffffffff
PDF
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
PPTX
Seminar Hypertension and Kidney diseases.pptx
C1 cut-Methane and it's Derivatives.pptx
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
Warm, water-depleted rocky exoplanets with surfaceionic liquids: A proposed c...
Fluid dynamics vivavoce presentation of prakash
Application of enzymes in medicine (2).pptx
. Radiology Case Scenariosssssssssssssss
Worlds Next Door: A Candidate Giant Planet Imaged in the Habitable Zone of ↵ ...
Science Quipper for lesson in grade 8 Matatag Curriculum
Pharmacology of Autonomic nervous system
1. INTRODUCTION TO EPIDEMIOLOGY.pptx for community medicine
BODY FLUIDS AND CIRCULATION class 11 .pptx
The scientific heritage No 166 (166) (2025)
Phytochemical Investigation of Miliusa longipes.pdf
Q1_LE_Mathematics 8_Lesson 5_Week 5.docx
perinatal infections 2-171220190027.pptx
Assessment of environmental effects of quarrying in Kitengela subcountyof Kaj...
Introcution to Microbes Burton's Biology for the Health
Biophysics 2.pdffffffffffffffffffffffffff
Is Earendel a Star Cluster?: Metal-poor Globular Cluster Progenitors at z ∼ 6
Seminar Hypertension and Kidney diseases.pptx
Ad

Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts

  • 1. Intern © Siemens AG 2017 Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retrieval in Asymmetric Texts Author(s): Pankaj Gupta1,2, Bernt Andrassy2, Hinrich Schütze1 Presenter: Pankaj Gupta @COLING2018 Santa Fe, NM USA. 20th August 2018 1CIS, University of Munich (LMU) | 2Machine Intelligence, Siemens AG | August 2018
  • 2. Intern © Siemens AG 2017 May 2017Seite 2 Corporate Technology Contributors PANKAJ GUPTA PhD Candidate @LMU Research Scientist @Siemens Munich Germany Dr. Bernt Andrassy Senior Key Expert @Siemens Munich Germany Prof. Hinrich Schütze @University of Munich (LMU) (PhD advisor)
  • 3. Intern © Siemens AG 2017 May 2017Seite 3 Corporate Technology Outline ➢ Problem Statement - Industrial Diagnostic Ticketing System - Asymmetric Textual Similarity - Complementary Semantics in Similarity ➢ Background - Siamese Neural Networks (Siamese-LSTM) in NLP - Neural Autoregressive Topic Model (DocNADE) ➢ Proposal - The Replicated Siamese LSTM for STS and IR - Multi-Channel Manhattan Metric for Complementary Semantics ➢ Evaluations: STS and IR
  • 4. Intern © Siemens AG 2017 May 2017Seite 4 Corporate Technology Problem Statement: STS and IR Semantic Textual Similarity (STS): ➢ Task to find out if the text pairs mean the same thing ➢ Information Retrieval (IR) and text understanding may be improved by modeling the underlying semantic similarity between texts.
  • 5. Intern © Siemens AG 2017 May 2017Seite 5 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB DESC SOL …… …… …… …… …… …… …… …… …… SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. KB of Industrial Tickets Intelligent Ticketing System (ITS)#TICKET
  • 6. Intern © Siemens AG 2017 May 2017Seite 6 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB DESC SOL …… …… …… …… …… …… …… …… …… SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. KB of Industrial Tickets Intelligent Ticketing System (ITS)#TICKET SUBJECT text DESCRIPTION text QUERY Solution: Recommended Service Action text Recommendation(s)
  • 7. Intern © Siemens AG 2017 May 2017Seite 7 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB: GT Trip - Low Frequency Pulsations DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s) QUERY similarity Retrieval #TICKET IN OUT KB ITS
  • 8. Intern © Siemens AG 2017 May 2017Seite 8 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB: GT Trip - Low Frequency Pulsations DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s) QUERY similarity #TICKET IN OUT KB ITS Retrieval
  • 9. Intern © Siemens AG 2017 May 2017Seite 9 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB: GT Trip - Low Frequency Pulsations DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s) QUERY similarity Retrieval #TICKET IN OUT KB ITS
  • 10. Intern © Siemens AG 2017 May 2017Seite 10 Corporate Technology Problem Statement: Industrial Diagnostic Ticketing System Goal: Retrieve a relevant solution for an input query SUB: GT Trip - Low Frequency Pulsations DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. SUB: Narrow Frequency Pulsations DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s) QUERY similarity #TICKET IN OUT KB ITS
  • 11. Problem Statement: Industrial Diagnostic Ticketing System. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: cosine similarity (low, narrow) = 0.26. [Diagram labels: IN, OUT, KB, ITS]
  • 12. Problem Statement: Industrial Diagnostic Ticketing System. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at load over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: cosine similarity (frequency, load) = 0.14. [Diagram labels: IN, OUT, KB, ITS]
  • 13. Problem Statement: Industrial Diagnostic Ticketing System. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: cosine similarity (frequency, mbar) = 0.25. [Diagram labels: IN, OUT, KB, ITS]
  • 14. Problem Statement: Industrial Diagnostic Ticketing System. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: cosine similarity (frequency, MW) = 0.08. [Diagram labels: IN, OUT, KB, ITS]
  • 15. Problem Statement: Significant Term Mismatch. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. #TICKET: SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: (Significant Term Mismatch). [Diagram labels: IN, OUT, KB, ITS]
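The word-level cosine similarities quoted on the slides above (for example 0.08 for "frequency" and "MW") follow the standard cosine formula. The sketch below shows that formula only; the three-dimensional vectors are invented for illustration and are not the embeddings behind the slide values.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings (assumption: NOT the vectors used on the slides).
toy_embeddings = {
    "frequency": [1.0, 0.0, 0.0],
    "pulsations": [1.0, 1.0, 0.0],
    "load": [0.0, 0.0, 1.0],
}

related = cosine_similarity(toy_embeddings["frequency"], toy_embeddings["pulsations"])
unrelated = cosine_similarity(toy_embeddings["frequency"], toy_embeddings["load"])
print(round(related, 4), unrelated)
```

A low score between a query term and a solution term, as in the toy "frequency" and "load" pair, is the kind of term mismatch that word-level matching alone cannot bridge.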
  • 16. Problem Statement: Symmetric Textual Similarity. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. #TICKET: SUB: Narrow Frequency Pulsations. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: symmetric text pairs (short vs. short texts). [Diagram labels: IN, OUT, KB, ITS]
  • 17. Problem Statement: Symmetric Textual Similarity. Goal: Retrieve a relevant solution for an input query. QUERY: DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: symmetric text pairs (long vs. long texts). [Diagram labels: IN, OUT, KB, ITS]
  • 18. Problem Statement: Asymmetric Textual Similarity. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. #TICKET: DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: asymmetric text pairs (short vs. multi-sentence texts). [Diagram labels: IN, OUT, KB, ITS]
  • 19. Problem Statement: Asymmetric Textual Similarity. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. #TICKET: SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: asymmetric text pairs (short vs. multi-sentence texts). [Diagram labels: IN, OUT, KB, ITS]
  • 20. Complementary Semantic Learning and Similarity. Goal: Retrieve a relevant solution for an input query. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: complementary semantics (coarse vs. fine granularity). [Diagram labels: IN, OUT, KB, ITS]
  • 21. Complementary Semantic Learning and Similarity. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: complementary semantics (coarse vs. fine granularity); similarity in latent topic semantics. [Diagram labels: IN, OUT, KB, ITS]
  • 22. Complementary Semantic Learning and Similarity. QUERY: SUB: GT Trip - Low Frequency Pulsations. DESC: GT Tripped due to a sudden increase in Low Frequency Pulsations. Machine is restarted and now operating normally. Alarm received was: GT ESD Low Frequency Pulsation. #TICKET: SUB: Narrow Frequency Pulsations. DESC: Low and Narrow frequency pulsations were detected. Peak value for the Low Frequency Pulsations is 60 mbar. SOL: The recommended action is that the machine should not run at loads over 28 MW until resolved. Recommendation(s): The recommended action is that the machine should not run at loads over 28 MW until resolved. Similarity: similarity in latent topic semantics. Takeaway: Learn complementary text pair representations in a highly structured space. [Diagram labels: IN, OUT, KB, ITS]
  • 23. Background: Siamese Networks in NLP. Reference: Mueller and Thyagarajan, 2016. - Dual-branch networks with tied weights - Non-linear metric learning with similarity information - Invariant and selective representations learned directly from similarity and dissimilarity information
  • 24. Background: Siamese Networks in NLP. Reference: Mueller and Thyagarajan, 2016. Limitation: limited to sentence pairs. - Dual-branch networks with tied weights - Non-linear metric learning with similarity information - Invariant and selective representations learned directly from similarity and dissimilarity information
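To make "dual-branch networks with tied weights" concrete, here is a minimal sketch. It assumes the Manhattan similarity exp(-||h1 - h2||_1) used by Mueller and Thyagarajan (2016), and it swaps the LSTM for a plain tanh recurrent encoder to stay short; the function names and weight values are hypothetical.

```python
import math


def encode(token_vectors, w_in, w_rec):
    """Plain tanh recurrent encoder (a stand-in for the LSTM branch).

    w_in is hidden x input, w_rec is hidden x hidden; returns the final state.
    """
    hidden = [0.0] * len(w_rec)
    for x in token_vectors:
        hidden = [
            math.tanh(
                sum(w_in[j][i] * x[i] for i in range(len(x)))
                + sum(w_rec[j][k] * hidden[k] for k in range(len(hidden)))
            )
            for j in range(len(w_rec))
        ]
    return hidden


def manhattan_similarity(h1, h2):
    """exp(-L1 distance): 1.0 for identical states, approaching 0 as they diverge."""
    return math.exp(-sum(abs(a - b) for a, b in zip(h1, h2)))


def siamese_similarity(seq1, seq2, w_in, w_rec):
    """Both branches share w_in and w_rec: this sharing is the 'tied weights'."""
    return manhattan_similarity(encode(seq1, w_in, w_rec), encode(seq2, w_in, w_rec))


# Hypothetical toy weights and two-token inputs, for illustration only.
w_in = [[0.5, -0.3], [0.1, 0.8]]
w_rec = [[0.1, 0.0], [0.0, 0.1]]
seq_a = [[1.0, 0.0], [0.0, 1.0]]
seq_b = [[0.0, 1.0], [1.0, 0.0]]
print(siamese_similarity(seq_a, seq_a, w_in, w_rec))
print(siamese_similarity(seq_a, seq_b, w_in, w_rec))
```

Because the two branches use the same parameters, the score is symmetric in its arguments and identical inputs always score 1.0.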
  • 25. Background: Document Neural Autoregressive Topic Model (DocNADE). Reference: Gupta et al. Document Informed Autoregressive Topic Models. Preprint: https://arxiv.org/pdf/1808.03793.pdf - Learns word co-occurrences across documents - Extracts latent topics (coarse-grained representation) - Optimised via a reconstruction loss, where the input should be reproduced by the decoding part of the network.
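As a rough illustration of the autoregressive idea behind DocNADE, the sketch below scores a document word by word: each word is predicted from a hidden state built from the words before it. It assumes a sigmoid hidden layer and a full softmax over the vocabulary (practical implementations often use a tree-structured softmax instead); the parameter names W, U, b, c and the function name are illustrative choices, not taken from the slides.

```python
import math


def docnade_log_likelihood(doc, W, U, b, c):
    """Sum over positions i of log p(v_i | v_<i).

    doc: list of word ids; W: H x V input weights; U: V x H output weights;
    b: V output biases; c: H hidden biases.
    """
    hidden_size = len(c)
    pre_activation = list(c)  # c plus the W columns of all preceding words
    log_prob = 0.0
    for word in doc:
        hidden = [1.0 / (1.0 + math.exp(-a)) for a in pre_activation]
        scores = [
            b[w] + sum(U[w][j] * hidden[j] for j in range(hidden_size))
            for w in range(len(b))
        ]
        # Numerically stable log-softmax.
        max_score = max(scores)
        log_norm = max_score + math.log(sum(math.exp(s - max_score) for s in scores))
        log_prob += scores[word] - log_norm
        # Only now does the current word become part of the context.
        for j in range(hidden_size):
            pre_activation[j] += W[j][word]
    return log_prob


# Sanity check with all-zero parameters: every conditional is uniform over V = 4.
V, H = 4, 2
zero_W = [[0.0] * V for _ in range(H)]
zero_U = [[0.0] * H for _ in range(V)]
uniform_ll = docnade_log_likelihood([0, 2, 1], zero_W, zero_U, [0.0] * V, [0.0] * H)
print(uniform_ll)
```

The hidden state after the last word can serve as the coarse-grained topic representation of the document.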
  • 26. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Query component(s) are subscripted by 1, e.g., X1; ticket component(s) are subscripted by 2, e.g., X2.
  • 27. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. [Figure: architecture diagram; label DocNADE (×5)]
  • 28. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Multi-channel for Complementary Semantics !!! [Figure: architecture diagram; label DocNADE (×5)]
  • 30. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Replicated !!! [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 31. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Multi-level Similarity !!! e.g. SUB1-SUB2, DESC1-DESC2, i.e. symmetric textual similarities. [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 32. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Cross-level Similarity !!! e.g. SUB1-DESC2, DESC1-SOL2, etc., i.e. asymmetric textual similarities. [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 33. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Cross-level Similarity !!! e.g. SUB1-DESC2, SUB1-SOL2, etc., i.e. asymmetric textual similarities. [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 34. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Cross-level Similarity !!! e.g. SUB1-DESC2, DESC1-SOL2, etc. [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 35. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Cross-level Similarity !!! e.g. SUB1-DESC2, DESC1-SUB2, etc., i.e. asymmetric textual similarities. [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 36. Proposal: Multi-Channel, Replicated and Multi-Cross Level Siamese. Overall Learning !!! [Figure: architecture diagram; labels W (×5), DocNADE (×5)]
  • 37. Proposal: Multi-Channel Manhattan Metric with Complementary Semantics. [Equation: Multi-Channel Manhattan Metric]
  • 38. Proposal: Multi-Channel Manhattan Metric with Complementary Semantics. Weighted semantic distances + weighted 'level' similarity. [Equation: Multi-Channel Manhattan Metric]
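The equation itself did not survive in this transcript, so the following is only one plausible reading of "weighted semantic distances + weighted 'level' similarity": for every query/ticket component pair, the L1 distances in each channel are combined with channel weights, the pair totals are combined with level weights, and the result is mapped through exp(-x). The channel names (h for LSTM state, E for summed embeddings, T for topics), all weights, and all vectors below are assumptions for illustration.

```python
import math


def l1_distance(u, v):
    return sum(abs(a - b) for a, b in zip(u, v))


def multi_channel_similarity(query, ticket, channel_weights, pair_weights):
    """exp(-sum over pairs of pair_weight * sum over channels of weight * L1).

    query, ticket: {component: {channel: vector}}
    pair_weights: {(query_component, ticket_component): weight}
    """
    total = 0.0
    for (q_name, t_name), pair_weight in pair_weights.items():
        channel_distance = sum(
            weight * l1_distance(query[q_name][channel], ticket[t_name][channel])
            for channel, weight in channel_weights.items()
        )
        total += pair_weight * channel_distance
    return math.exp(-total)


# Hypothetical two-dimensional representations per component and channel.
query = {
    "SUB1": {"h": [0.2, 0.1], "E": [1.0, 0.0], "T": [0.6, 0.4]},
    "DESC1": {"h": [0.3, 0.5], "E": [0.8, 0.3], "T": [0.5, 0.5]},
}
ticket = {
    "SUB2": {"h": [0.1, 0.1], "E": [0.9, 0.1], "T": [0.7, 0.3]},
    "DESC2": {"h": [0.4, 0.4], "E": [0.7, 0.2], "T": [0.5, 0.5]},
    "SOL2": {"h": [0.9, 0.0], "E": [0.1, 0.9], "T": [0.6, 0.4]},
}
channel_weights = {"h": 0.7, "E": 0.2, "T": 0.1}
pair_weights = {
    ("SUB1", "SUB2"): 1.0,   # multi-level (symmetric) pairs
    ("DESC1", "DESC2"): 1.0,
    ("SUB1", "DESC2"): 0.5,  # cross-level (asymmetric) pairs
    ("SUB1", "SOL2"): 0.5,
    ("DESC1", "SOL2"): 0.5,
}
score = multi_channel_similarity(query, ticket, channel_weights, pair_weights)
print(score)
```

In a trained model the weights would be learned jointly with the encoders; here they are fixed by hand so that the arithmetic is easy to follow.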
  • 39. Evaluation
  • 40. Evaluation ➢ Industrial Data Set - 949 historical tickets in the KB - 421 pairs labeled - relatedness labels: YES (similar; provides the correct solution): 5.0; REL (does not provide the correct solution, but is close to a solution): 3.0; NO (not related, not relevant, and provides no correct solution): 1.0 - average length (#words): SUB: 4.6; DESC: 65.0; SOL: 74.2 ➢ Baselines - Unsupervised (topic and word embedding representations) - Supervised (Siamese LSTM) ➢ Evaluation and analysis of the proposed Replicated Siamese LSTM
  • 41. Evaluation: LDA vs. DocNADE
  • 42. Evaluation: Unsupervised Baselines for STS and IR. Similarity metrics: Pearson correlation (r), Spearman's correlation coefficient (ρ), Mean Squared Error (MSE). IR metrics: Mean Average Precision@k (MAP@k) and Accuracy@k (Acc@k). Topic (T) semantics from query and ticket components, i.e. T(SUB1-SUB2), etc.
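The retrieval and error metrics named above can be sketched as follows. Definitions of Acc@k and MAP@k vary between papers; this sketch assumes Acc@k is the fraction of queries with at least one relevant ticket in the top k, and that average precision is normalised by the number of relevant tickets found in the top k. The ranked relevance lists are invented toy data.

```python
def accuracy_at_k(ranked_relevance, k):
    """Fraction of queries with at least one relevant ticket in the top k."""
    hits = sum(1 for relevance in ranked_relevance if any(relevance[:k]))
    return hits / len(ranked_relevance)


def average_precision_at_k(relevance, k):
    """Mean of precision values at the ranks (<= k) where a relevant ticket occurs."""
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(relevance[:k], start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank
    return precision_sum / hits if hits else 0.0


def map_at_k(ranked_relevance, k):
    """Mean of average_precision_at_k over all queries."""
    total = sum(average_precision_at_k(r, k) for r in ranked_relevance)
    return total / len(ranked_relevance)


def mean_squared_error(predicted, gold):
    return sum((p - g) ** 2 for p, g in zip(predicted, gold)) / len(gold)


# Toy data: one list per query, ordered by predicted similarity (best first).
ranked_relevance = [
    [True, False, True],
    [False, False, True],
    [False, False, False],
]
print(accuracy_at_k(ranked_relevance, 1), accuracy_at_k(ranked_relevance, 3))
print(map_at_k(ranked_relevance, 3))
# Gold labels use the 5.0 / 3.0 / 1.0 relatedness scale from the data set slide.
print(mean_squared_error([5.0, 3.0, 1.0], [5.0, 1.0, 1.0]))
```

Pearson and Spearman correlations would be computed between the predicted similarity scores and the gold relatedness labels over the labeled pairs.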
  • 45. Evaluation: Unsupervised Baselines for STS and IR. Topic (T) semantics from query and ticket components, i.e. T(SUB1-SUB2), etc. Distributional semantics (SumEmbedding) (E) for query and ticket components, i.e. E(SUB1-SUB2), E(SUB1+DESC1-SUB2+DESC2+SOL2), etc.
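A SumEmbedding baseline of the kind named above could look like the sketch below: each component is represented by the sum of its word vectors, components joined with "+" are pooled before summing, and the two sides are compared. The use of cosine as the comparison, the skipping of out-of-vocabulary tokens, and the toy vectors are assumptions, not details given on the slide.

```python
import math


def sum_embedding(tokens, embeddings, dim):
    """Sum the vectors of all in-vocabulary tokens (unknown tokens are skipped)."""
    total = [0.0] * dim
    for token in tokens:
        vector = embeddings.get(token.lower())
        if vector is not None:
            total = [t + x for t, x in zip(total, vector)]
    return total


def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norms = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norms if norms else 0.0


# Hypothetical 3-dimensional word vectors, invented for illustration.
toy_vectors = {
    "low": [1.0, 0.0, 0.0],
    "narrow": [0.5, 0.5, 0.0],
    "frequency": [0.0, 1.0, 0.0],
    "pulsations": [0.0, 1.0, 1.0],
}

# E(SUB1-SUB2): compare the query subject with the ticket subject.
sub1 = "GT Trip - Low Frequency Pulsations".split()
sub2 = "Narrow Frequency Pulsations".split()
e_sub1 = sum_embedding(sub1, toy_vectors, 3)
e_sub2 = sum_embedding(sub2, toy_vectors, 3)
print(e_sub1, e_sub2, cosine(e_sub1, e_sub2))
```

For a combined setting such as E(SUB1+DESC1-SUB2+DESC2+SOL2), the token lists of the joined components would simply be concatenated before calling sum_embedding.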
  • 46. Evaluation: Supervised Baselines for STS and IR. S-LSTM: computes similarity using a standard Siamese LSTM on query and ticket components.
  • 47. Evaluation: Supervised Baselines for STS and IR. ML-LSTM: Multi-level similarity.
  • 48. Evaluation: Supervised Baselines for STS and IR. CL-LSTM: Cross-level similarity.
  • 49. Evaluation: Supervised Baselines for STS and IR. Multi-channel semantic similarity.
  • 50. Evaluation: Supervised Baselines for STS and IR. Multi-channel semantic similarity. Annotation: 7%.
  • 51. Qualitative Inspections for STS and IR: topics
  • 53. Conclusion ➢ Advanced the Siamese architecture for similarity learning in asymmetric text pairs, instead of sentence pairs only ➢ Proposed the Replicated Siamese LSTM with multi-level and cross-level similarity ➢ Gains of 22% and 7% (Acc@10) over unsupervised and supervised baselines, respectively, in IR ➢ Demonstrated that complementary semantics improve similarity and retrieval performance ➢ Addressed and improved a real-world industrial ticketing-system application
  • 54. Thanks !!! [Figure: architecture diagram; labels W (×5), DocNADE (×5)]