Neural NLP Models of Information Extraction
Presenter: Pankaj Gupta | PhD with Prof. Hinrich Schütze | Research Scientist
University of Munich (LMU) | Siemens AG, Munich Germany
Venue: Google AI, New York City | 25 Mar, 2019
About Me: Affiliations (timeline)
2006-10: Bachelors (B.Tech-IT)
2009: Bachelor Internship
2010-13: Senior Software Developer
2013-15: Masters (MSc-CS) & Research Assistant; Working Student & Master Thesis
2015: Starting PhD
2016-17: PhD Research Intern (4 months)
2017-Now: Research Scientist - NLP/ML
2019: PhD Submission
Master Thesis: Deep Learning Methods for the Extraction of Relations in Natural Language Text
PhD Thesis Title (tentative): Neural Models of Information Extraction from Natural Language Text
Reach me: https://sites.google.com/view/gupta-pankaj/
About Me: Research
Neural Relation Extraction:
➢ Intra- and Inter-sentential RE
➢ Joint Entity & RE
➢ Weakly-supervised Bootstrapping RE
➢ Interpretable RE
Neural Topic Modeling:
➢ Autoregressive TMs
➢ Word Embeddings Aware TM
➢ Language Structure Aware TM (textTOvec)
➢ Multi-view Transfer Learning in TM
➢ Interpretable topics
Cross-cutting themes: Interpretability (explaining RNN predictions), Transfer Learning, Lifelong Learning
Outline
Two Tracks:
Track 1/2: Relation Extraction
Neural Relation Extraction Within and Across Sentence Boundaries
Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler. In AAAI-2019.
Track 2/2: Topic Modeling & Representation Learning (briefly)
Document Informed Neural Autoregressive Topic Models with Distributional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019.
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019.
Neural Relation Extraction Within and Across
Sentence Boundaries
Introduction: Relation Extraction spanning sentence boundaries
Proposed Methods
➢ Inter-sentential Dependency-Based Neural Networks (iDepNN)
→ Inter-sentential Shortest Dependency Path (iDepNN-SDP)
→ Inter-sentential Augmented Dependency Path (iDepNN-ADP)
Evaluation and Analysis
➢ State-of-the-art comparison
➢ Error analysis
Introduction: Relation Extraction (RE)
Binary Relation Extraction (RE):
- Identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S.
Example: Paul Allen has started a company and named [Vern Raburn]e1 its [president]e2 .
relation: per-post(e1,e2)
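To make the task interface concrete, here is a minimal sketch of binary RE as multi-class classification over a snippet S and a marked entity pair; the encoder and method names below are illustrative assumptions, not the system presented in this talk.

```python
from typing import Tuple

def extract_relation(model, S: str,
                     e1: Tuple[int, int], e2: Tuple[int, int]) -> str:
    """Sketch: return a relation label such as 'per-post' (or 'no_relation')
    for the entity pair marked by the character spans e1 and e2 in S.
    `model.encode` / `model.predict` are hypothetical placeholders for any
    snippet encoder and relation classifier."""
    features = model.encode(S, e1, e2)   # encode snippet + entity positions
    return model.predict(features)       # label from a fixed relation inventory
```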
Need for Relation Extraction (RE)
→ A large part of the world's information is expressed in free text, e.g., in web pages, blogs, social media, etc.
→ Need automatic systems that extract the relevant information into a structured KB
Information Extraction:
• Entity Extraction: detect entities such as person, organization, location, product, technology, sensor, etc.
• Relation Extraction: detect the relation between the given entities or nominals
• Structure the unstructured text
• Knowledge Graph Construction
• Used in web search, retrieval, Q&A, etc.
End-to-End Knowledge Base Population:
[Diagram: Text Documents → IE Engine → Knowledge Graph, with entities (e.g., Sensor) linked by relations (e.g., Competitor-of)]
Introduction: Relation Extraction (RE)
Relation Extraction, based on the location of entities:
→ intra-sentential (entities within the sentence boundary): most prior works
→ inter-sentential (entities across sentence boundaries): this work
Intra-sentential example:
Paul Allen has started a company and named [Vern Raburn]e1 its [president]e2 .
relation: per-post(e1,e2)
Inter-sentential example:
Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington.
relation: per-org(e1,e2)
Intra-sentential systems MISS such cross-sentence relationships, which impacts system performance and leads to POOR RECALL.
Goal: capture relationships between entities at a distance, across sentence boundaries.
Challenges in Inter-sentential Relation Extraction (RE)
Example (inter-sentential; entities across sentence boundaries):
Paul Allen has started a company and named [Vern Raburn]e1 its president. The company will coordinate the overall strategy for the group of high-tech companies that Mr. Allen owns or holds a significant stake in, will be based in Bellevue, Washington and called [Paul Allen Group]e2 .
relation: per-org(e1,e2)
Challenge: NOISY text in relationships spanning sentence boundaries → POOR PRECISION
Need: a robust system that tackles false positives in inter-sentential RE.
Motivation: Dependency Based Relation Extraction
1. Dependency parse trees are effective in extracting relationships
→ but limited to single sentences, i.e., intra-sentential relationships
2. The Shortest Dependency Path (SDP) between entities in parse trees is effective in RE
→ but limited to single sentences, and it ignores additional information relevant to relation identification
3. The Augmented Dependency Path (ADP) models relationships precisely
→ but limited to single sentences
4. Tree-RNNs are effective in modeling relations via recursive compositionality
→ but limited to single sentences
→ This work exploits these properties in inter-sentential RE via a unified neural framework: a bi-RNN modeling the SDP and a RecNN modeling the ADP.
Motivation: Dependency Based Relation Extraction (Example)
Sentences and their dependency graphs:
Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington.
→ Shortest Dependency Path (SDP) from the root of each sentence to entity e1 and to entity e2
→ Inter-sentential Shortest Dependency Path (iSDP) across the sentence boundary: connect the roots of adjacent sentences by a NEXTS link
→ Dependency subtrees hanging off the iSDP carry additional context for the relation (see the sketch below).
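The iSDP construction can be sketched as a small graph operation; the input format, token ids and use of networkx below are assumptions for illustration, while the NEXTS link between adjacent roots follows the slide.

```python
import networkx as nx

def build_isdp(sent_edges, roots, e1, e2):
    """Sketch: inter-sentential shortest dependency path. Each sentence's
    dependency tree is a list of (head, dependent) pairs over globally
    unique token ids; adjacent sentence roots are linked by a NEXTS edge,
    and the iSDP is the shortest path between the two entity tokens."""
    g = nx.Graph()
    for edges in sent_edges:                  # intra-sentential dependencies
        g.add_edges_from(edges)
    for r1, r2 in zip(roots, roots[1:]):      # NEXTS links across sentences
        g.add_edge(r1, r2, label="NEXTS")
    return nx.shortest_path(g, source=e1, target=e2)

# Toy usage: two 3-token sentences (ids 0-2 and 3-5), roots 1 and 4,
# entities at tokens 0 and 5.
path = build_isdp([[(1, 0), (1, 2)], [(4, 3), (4, 5)]], roots=[1, 4], e1=0, e2=5)
print(path)   # [0, 1, 4, 5] -- crosses the sentence boundary via NEXTS
```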
Contribution
Propose a novel neural approach for inter-sentential relation extraction:
1. A neural architecture based on dependency parse trees, named the inter-sentential Dependency-based Neural Network (iDepNN)
2. A unified neural framework of a bidirectional RNN (biRNN) and a Recursive NN (RecNN)
3. Extract relations within and across sentence boundaries by modeling:
➔ the shortest dependency path (SDP) using the biRNN
➔ the augmented dependency path (ADP) using the RecNN
Benefits:
1. Precisely extract relationships within and across sentence boundaries
2. A better balance of precision and recall, with an improved F1 score
Proposed Approach: Neural Intra- and Inter-sentential RE
Proposed Approach: Intra- and Inter-sentential RE
Inter-sentential Dependency-based Neural Network variants: iDepNN-SDP and iDepNN-ADP
1. iDepNN-SDP: model the inter-sentential shortest dependency path (iSDP) with a bidirectional RNN
2. Model the inter-sentential dependency subtrees with a recursive NN, computing an embedding for each subtree
1+2 = iDepNN-ADP: model the inter-sentential augmented dependency path
→ offers a precise structure
→ offers additional information for classifying the relation
A sketch of this wiring follows.
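A minimal PyTorch sketch, with assumed dimensions (not the authors' released code): the biRNN encodes word vectors along the iSDP, the RecNN composes each dependency subtree into an embedding, and in the ADP variant the subtree embeddings augment the path-word inputs.

```python
import torch
import torch.nn as nn

class IDepNNSketch(nn.Module):
    """Illustrative sketch of iDepNN-SDP/-ADP (assumed dims and wiring)."""
    def __init__(self, emb_dim=100, hid_dim=100, n_rel=5):
        super().__init__()
        self.birnn = nn.RNN(emb_dim, hid_dim, bidirectional=True,
                            batch_first=True)
        self.compose = nn.Linear(2 * emb_dim, emb_dim)  # RecNN composition
        self.clf = nn.Linear(2 * hid_dim, n_rel)

    def subtree_embedding(self, node_emb, child_embs):
        # RecNN: fold the children of a subtree node into its embedding
        h = node_emb
        for c in child_embs:
            h = torch.tanh(self.compose(torch.cat([h, c], dim=-1)))
        return h

    def forward(self, path_embs, subtree_embs=None):
        # path_embs: (1, path_len, emb_dim) word vectors along the iSDP;
        # iDepNN-ADP additionally adds each path word's subtree embedding
        if subtree_embs is not None:
            path_embs = path_embs + subtree_embs
        out, _ = self.birnn(path_embs)      # biRNN over the (augmented) path
        return self.clf(out[:, -1])         # relation scores
```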
Evaluation and Analysis
Evaluation and Analysis: Datasets
➢ Evaluate on four datasets from the medical and news domains
[Table: count of intra- and inter-sentential relationships in the datasets]
Results discussed in this talk: the BioNLP ST 2016 dataset.
Lives_In relation → two arguments: the bacterium and the location,
where location → a Habitat (e.g., microbial ecology such as hosts, environment, food, etc.) or a Geographical entity (e.g., geographical and organization places)
Data: http://2016.bionlp-st.org/tasks/bb2
Evaluation and Analysis: Baselines
→ SVM, graphLSTM, i-biRNN and i-biLSTM
graphLSTM: Peng et al., 2017. Cross-Sentence N-ary Relation Extraction with Graph LSTMs.
Results (Precision / Recall / F1)
Sentence range k:
k = 0 ➔ intra-sentential
k > 0 ➔ inter-sentential
Results (Precision / Recall / F1): Intra-sentential Training
iDepNN-ADP is more precise in inter-sentential RE than both SVM and graphLSTM.
iDepNN-ADP outperforms both SVM and graphLSTM in F1 for inter-sentential RE, due to a better balance of precision and recall.
Results (Precision / Recall / F1): Inter-sentential Training
iDepNN-ADP outperforms both SVM and graphLSTM in precision and F1 for inter-sentential RE.
Results (Precision / Recall / F1): Ensemble
Ensemble with thresholding on the prediction probability:
Ensemble scores at various thresholds (p: output probability; pr: the count of predictions) -- see the sketch below.
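A sketch of such thresholding; averaging the members' probabilities is an assumption about the combination rule, while the thresholding itself follows the slide.

```python
import numpy as np

def ensemble_with_threshold(member_probs, threshold):
    """Sketch: average the ensemble members' output probabilities p, predict
    the argmax class, and keep only predictions whose probability exceeds the
    threshold. pr, the count of retained predictions, shrinks as the
    threshold grows, trading recall for precision."""
    p = np.mean(member_probs, axis=0)     # (n_examples, n_classes)
    pred = p.argmax(axis=1)
    keep = p.max(axis=1) >= threshold     # confidence filter
    return pred, keep, int(keep.sum())    # predictions, mask, pr
```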
Official Scores: State-of-the-art Comparison
Official results on the test set: comparison with the published systems in the BioNLP ST 2016.
Error Analysis: BioNLP ST 2016 dataset
iDepNN-ADP produces few false positives, compared to both SVM and graphLSTM.
[Figure: false positives produced by iDepNN-ADP, SVM and graphLSTM]
Key Takeaways
➢ Propose a novel neural approach, iDepNN, for inter-sentential relation extraction
➢ Precisely extract relations within and across sentence boundaries by modeling:
➔ the shortest dependency path (SDP) using a biRNN, i.e., iDepNN-SDP
➔ the augmented dependency path (ADP) using a RecNN, i.e., iDepNN-ADP
➢ Demonstrate a better balance of precision and recall, with an improved F1 score
➢ Evaluate on 4 datasets from the news and medical domains
➢ Achieve a gain of 5.2% (0.587 vs 0.558) in F1 over the winning team (out of 11 teams) in the BioNLP Shared Task (ST) 2016
Code and Data: https://github.com/pgcool/Cross-sentence-Relation-Extraction-iDepNN
Outline
Track 1/2: Relation Extraction
Neural Relation Extraction Within and Across Sentence Boundaries
Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler. In AAAI-2019.
Active Research in Information Extraction:
→ Neural Models of Lifelong Learning for Information Extraction
→ Weakly-supervised Neural Bootstrapping for Relation Extraction
(PhD Student: Mr. Usama Yaseen)
Research Outline
Neural Relation Extraction:
➢ Intra- and Inter-sentential RE
➢ Joint Entity & RE
➢ Weakly-supervised Bootstrapping RE
➢ Interpretable RE
Neural Topic Modeling:
➢ Autoregressive TMs
➢ Word Embeddings Aware TM
➢ Language Structure Aware TM (textTOvec)
➢ Multi-view Transfer Learning in TM
➢ Interpretable topics
Cross-cutting themes: Interpretability (explaining RNN predictions), Transfer Learning, Lifelong Learning
Outline (Brief Introduction)
Track 2/2: Topic Modeling & Representation Learning
Document Informed Neural Autoregressive Topic Models with Distributional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019.
TL;DR → Improved topic modeling with full document contexts and pre-trained word embeddings
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019.
TL;DR → Improved topic modeling with language structures (e.g., word ordering, local syntax and semantic information); a composite model of a neural topic model and a neural language model
Multi-view and Multi-source Transfers in Neural Topic Modeling
Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review.
TL;DR → Improved topic modeling with knowledge transfer via local as well as global semantics
Topic Modeling
➢ Statistical modeling that examines how words co-occur across a collection of documents, and
➢ automatically discovers coherent groups of words (i.e., themes or topics) that best explain the corpus
➢ Each document is a mixture of topics, and each topic is a weighted collection of words
Source: http://www.cs.columbia.edu/~blei/papers/Blei2012.pdf
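For a concrete, much simpler illustration of this document-topic and topic-word structure, here is a classical LDA run with gensim on a toy corpus; the neural models in this talk replace LDA, so this is only to show the two kinds of output.

```python
from gensim import corpora, models

docs = [["shares", "price", "fall", "market"],
        ["profits", "rises", "earnings", "shares"],
        ["bacterium", "habitat", "host", "environment"]]
dct = corpora.Dictionary(docs)
bow = [dct.doc2bow(d) for d in docs]          # bag-of-words per document

lda = models.LdaModel(bow, num_topics=2, id2word=dct, random_state=0)
print(lda.print_topics())   # each topic: a weighted group of words
print(lda[bow[0]])          # each document: a mixture over the topics
```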
Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19)
Need for distributional semantics / prior knowledge:
➢ "Lack of context" in short-text documents, e.g., headlines, tweets, etc.
➢ "Lack of context" in a corpus of few documents
→ Few word co-occurrences → difficult to learn good representations → incoherent topics
Example topics for 'trading':
Topic1 (incoherent): price, wall, china, fall, shares
Topic2 (coherent): shares, price, profits, rises, earnings
TO THE RESCUE: use external/additional information, e.g., WORD EMBEDDINGS (they encode semantic and syntactic relatedness of words in a vector space)
→ Two documents of the same topic class ('trading') may have no word overlap under a 1-hot encoding; embeddings bridge this gap.
Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19)
Baseline model vs. proposed model:
➢ Introduce a weighted aggregation of pre-trained word embeddings at each autoregressive step k (mixture weight)
➢ E: pre-trained embedding matrix (e.g., GloVe), used as a fixed prior
➢ Generate topics informed by embeddings
➢ Learn a complementary textual representation
A minimal sketch of the autoregressive step is shown below.
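A minimal numpy sketch of this step, using assumed notation (W: learned matrix; E: fixed pre-trained embedding matrix, e.g., GloVe; lam: the mixture weight): at step k, the hidden state aggregates the W and lam*E columns of all preceding words.

```python
import numpy as np

def hidden_states(doc, W, E, c, lam=1.0):
    """doc: word ids v_1..v_n; W, E: (hid, vocab); c: (hid,) bias.
    Returns h_1..h_n, where
    h_k = tanh(c + sum_{i<k} (W[:, v_i] + lam * E[:, v_i]))
    conditions the prediction p(v_k | v_<k)."""
    acc = np.asarray(c, dtype=float).copy()
    H = [np.tanh(acc)]                        # h_1: no preceding words
    for v in doc[:-1]:                        # fold each word into later states
        acc = acc + W[:, v] + lam * E[:, v]   # fixed embedding prior E
        H.append(np.tanh(acc))
    return H
```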
Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19)
Evaluation: applicability to information retrieval
→ IR-precision on short-text datasets: precision at different retrieval fractions (higher is better)
Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19)
Take-aways of this work:
➢ Leverage full contextual information in a neural autoregressive topic model
➢ Introduce distributional priors via pre-trained word embeddings
➢ Gains, on average over 15 datasets: 5.2% (404 vs 426) in perplexity, 2.8% (.74 vs .72) in topic coherence, 11.1% (.60 vs .54) in precision at retrieval fraction 0.02, and 5.2% (.664 vs .631) in F1 for text categorization
➢ Learn better word/document representations for short and long texts
Try it out: the code and data are available at https://github.com/pgcool/iDocNADEe
@PankajGupta262
Local vs Global Semantics
Language models have a LOCAL view (semantics):
→ A vector-space representation for each word, based on local word-collocation patterns
→ Built from word-word co-occurrence, limited to a context window (e.g., word2vec) or a sentence (e.g., ELMo)
→ Information beyond the limited context is not exposed
→ Good at capturing local syntactic and semantic information
→ Difficulties in capturing long-range dependencies
Topic models have a GLOBAL view (semantics):
→ Built from document-word occurrences (i.e., words are similar if they appear similarly across documents)
→ Access to document-level context, not limited to a local window; each topic is learned by leveraging statistical information across documents
→ Good at capturing thematic structures and long-range dependencies in a document collection
→ But: no language structure (e.g., word ordering, local syntactic and semantic information)
→ Difficulties in capturing short-range dependencies
Example (source text → sense/topic):
"Market falls into bear territory" → 'trading'
"Bear falls into market territory" → 'trading'
Same unigram statistics, but different meanings: language structure helps in determining the actual meaning!
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior (ICLR-19)
Improving topic modeling for short-text and long-text documents via contextualized features and external knowledge:
1. Incorporate language structure into topic models
→ account for word ordering and latent syntactic and semantic features
→ improve word and document representations, including polysemy
2. Incorporate external knowledge for each word
→ use distributional semantics, i.e., word embeddings
→ improve document representations and topics
textTOvec (ICLR-19): contextualized Document Neural Autoregressive Distribution Estimator (ctx-DocNADE), with pre-trained word embeddings (ctx-DocNADEe)
Advantages of composite modeling:
→ Introduces language structure into neural autoregressive topic models via an LSTM-LM: word ordering, language concepts and long-range dependencies
→ The probability of each word is a function of global and local contexts, modeled via DocNADE and the LSTM-LM, respectively
→ Learns complementary semantics by combining joint word and latent-topic learning in a unified neural autoregressive framework
A minimal sketch of the composite step follows.
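A minimal PyTorch sketch of this mixing, with assumed dimensions and wiring (not the released textTOvec code): a DocNADE-style running sum provides the global view, the LSTM-LM hidden state the local view, and both are combined at each autoregressive step.

```python
import torch
import torch.nn as nn

class CtxDocNADESketch(nn.Module):
    """Illustrative composite of a neural topic model and an LSTM-LM."""
    def __init__(self, vocab, emb_dim=100, hid_dim=100, lam=0.5):
        super().__init__()
        self.W = nn.Embedding(vocab, hid_dim)    # topic-matrix columns W[:, v]
        self.inp = nn.Embedding(vocab, emb_dim)  # LM input embeddings
        self.lstm = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.out = nn.Linear(hid_dim, vocab)
        self.lam = lam                            # global/local mixture weight

    def forward(self, doc):                       # doc: (1, n) word ids
        global_view = torch.cumsum(self.W(doc), dim=1)  # bag-of-words sums
        local_view, _ = self.lstm(self.inp(doc))        # LSTM-LM states o_k
        h = torch.tanh(global_view + self.lam * local_view)
        return self.out(h)     # position k scores the next word v_{k+1}
```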
Code: https://github.com/pgcool/textTOvec
Outline: Topic Modeling
Track 2/2: Topic Modeling & Representation Learning
Document Informed Neural Autoregressive Topic Models with Distributional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019.
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior
Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019.
Active Research in Topic Modeling & Representation Learning:
→ Multi-view and Multi-source Transfers in Neural Topic Modeling
Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review.
→ Improving Language Models with Global Semantics via Neural Composite Networks
→ Lifelong Neural Topic Learning
(PhD Student: Mr. Yatin Chaudhary)
Summary & Thanks !!
Neural Topic ModelingNeural Relation Extraction
Interpretability
➢ Intra- and Inter-sentential RE
➢ Joint Entity & RE
➢ Weakly-supervised
Bootstrapping RE
➢ Autoregressive TMs
➢ Word Embeddings Aware TM
➢ Language Structure Aware TM (textTOvec)
➢ Multi-view Transfer Learning in TM
Interpretable RE
Interpretable
topics
Transfer Learning
Lifelong Learning
➢ Explaining RNN predictions
ReachMe / Talks: https://sites.google.com/view/gupta-pankaj/

More Related Content

PDF
textTOvec: DEEP CONTEXTUALIZED NEURAL AUTOREGRESSIVE TOPIC MODELS OF LANGUAGE...
PDF
Neural Relation ExtractionWithin and Across Sentence Boundaries
PDF
Deep Learning for Information Extraction in Natural Language Text
PDF
Lecture 07: Representation and Distributional Learning by Pankaj Gupta
PDF
Document Informed Neural Autoregressive Topic Models with Distributional Prior
PDF
Lecture 05: Recurrent Neural Networks / Deep Learning by Pankaj Gupta
PDF
IRJET- Visual Information Narrator using Neural Network
PPTX
Text Data Mining
textTOvec: DEEP CONTEXTUALIZED NEURAL AUTOREGRESSIVE TOPIC MODELS OF LANGUAGE...
Neural Relation ExtractionWithin and Across Sentence Boundaries
Deep Learning for Information Extraction in Natural Language Text
Lecture 07: Representation and Distributional Learning by Pankaj Gupta
Document Informed Neural Autoregressive Topic Models with Distributional Prior
Lecture 05: Recurrent Neural Networks / Deep Learning by Pankaj Gupta
IRJET- Visual Information Narrator using Neural Network
Text Data Mining

Similar to Neural NLP Models of Information Extraction (20)

PPTX
Information Extraction from Text, presented @ Deloitte
PPT
5-Information Extraction (IE) and Machine Translation (MT).ppt
PDF
Latent Relational Model for Relation Extraction
PPTX
Fun with Text - Managing Text Analytics
PPTX
Knowledge acquisition using automated techniques
PDF
EXTRACTING ARABIC RELATIONS FROM THE WEB
PDF
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
PPTX
Text analysis-semantic-search
PDF
Introduction to Natural Language Processing
PPTX
Natural Language Processing Advancements By Deep Learning - A Survey
PDF
Learning to Extract Relations for Protein Annotation
PPTX
PhD Research Proposal - Qualifying Exam
PDF
Text Analytics - JCC2014 Kimelfeld
PDF
A03730108
PPT
Download
PPT
Download
PDF
Novel Database-Centric Framework for Incremental Information Extraction
PDF
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
PDF
D017422528
Information Extraction from Text, presented @ Deloitte
5-Information Extraction (IE) and Machine Translation (MT).ppt
Latent Relational Model for Relation Extraction
Fun with Text - Managing Text Analytics
Knowledge acquisition using automated techniques
EXTRACTING ARABIC RELATIONS FROM THE WEB
Knowledge extraction in Web media: at the frontier of NLP, Machine Learning a...
Text analysis-semantic-search
Introduction to Natural Language Processing
Natural Language Processing Advancements By Deep Learning - A Survey
Learning to Extract Relations for Protein Annotation
PhD Research Proposal - Qualifying Exam
Text Analytics - JCC2014 Kimelfeld
A03730108
Download
Download
Novel Database-Centric Framework for Incremental Information Extraction
AI-SDV 2022: Creation and updating of large Knowledge Graphs through NLP Anal...
D017422528
Ad

More from Pankaj Gupta, PhD (8)

PDF
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language...
PDF
Poster: Neural Relation ExtractionWithin and Across Sentence Boundaries
PDF
Poster: Document Informed Neural Autoregressive Topic Models with Distributio...
PDF
Pankaj Gupta CV / Resume
PDF
LISA: Explaining RNN Judgments via Layer-wIse Semantic Accumulation and Examp...
PDF
Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retri...
PDF
Joint Bootstrapping Machines for High Confidence Relation Extraction
PDF
RNN-RSM (Topics over Time) | NAACL2018 conference talk
textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language...
Poster: Neural Relation ExtractionWithin and Across Sentence Boundaries
Poster: Document Informed Neural Autoregressive Topic Models with Distributio...
Pankaj Gupta CV / Resume
LISA: Explaining RNN Judgments via Layer-wIse Semantic Accumulation and Examp...
Replicated Siamese LSTM in Ticketing System for Similarity Learning and Retri...
Joint Bootstrapping Machines for High Confidence Relation Extraction
RNN-RSM (Topics over Time) | NAACL2018 conference talk
Ad

Recently uploaded (20)

PPTX
New ISO 27001_2022 standard and the changes
PPTX
IMPACT OF LANDSLIDE.....................
PDF
OneRead_20250728_1808.pdfhdhddhshahwhwwjjaaja
PPTX
A Complete Guide to Streamlining Business Processes
PDF
Introduction to Data Science and Data Analysis
PDF
Data Engineering Interview Questions & Answers Cloud Data Stacks (AWS, Azure,...
PDF
REAL ILLUMINATI AGENT IN KAMPALA UGANDA CALL ON+256765750853/0705037305
PDF
Introduction to the R Programming Language
PPTX
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
PPTX
IBA_Chapter_11_Slides_Final_Accessible.pptx
PPTX
Introduction to Inferential Statistics.pptx
PDF
annual-report-2024-2025 original latest.
PPTX
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
PDF
Transcultural that can help you someday.
PDF
Jean-Georges Perrin - Spark in Action, Second Edition (2020, Manning Publicat...
PDF
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
PPTX
importance of Data-Visualization-in-Data-Science. for mba studnts
PDF
How to run a consulting project- client discovery
PPTX
CYBER SECURITY the Next Warefare Tactics
PPT
ISS -ESG Data flows What is ESG and HowHow
New ISO 27001_2022 standard and the changes
IMPACT OF LANDSLIDE.....................
OneRead_20250728_1808.pdfhdhddhshahwhwwjjaaja
A Complete Guide to Streamlining Business Processes
Introduction to Data Science and Data Analysis
Data Engineering Interview Questions & Answers Cloud Data Stacks (AWS, Azure,...
REAL ILLUMINATI AGENT IN KAMPALA UGANDA CALL ON+256765750853/0705037305
Introduction to the R Programming Language
AI Strategy room jwfjksfksfjsjsjsjsjfsjfsj
IBA_Chapter_11_Slides_Final_Accessible.pptx
Introduction to Inferential Statistics.pptx
annual-report-2024-2025 original latest.
01_intro xxxxxxxxxxfffffffffffaaaaaaaaaaafg
Transcultural that can help you someday.
Jean-Georges Perrin - Spark in Action, Second Edition (2020, Manning Publicat...
Data Engineering Interview Questions & Answers Batch Processing (Spark, Hadoo...
importance of Data-Visualization-in-Data-Science. for mba studnts
How to run a consulting project- client discovery
CYBER SECURITY the Next Warefare Tactics
ISS -ESG Data flows What is ESG and HowHow

Neural NLP Models of Information Extraction

  • 1. Unrestricted © Siemens AG 2017 Neural NLP Models of Information Extraction Presenter: Pankaj Gupta | PhD with Prof. Hinrich Schütze | Research Scientist University of Munich (LMU) | Siemens AG, Munich Germany Venue: Google AI, New York City | 25 Mar, 2019
  • 2. Unrestricted © Siemens AG 2019 January 2019Page 2 Machine Intelligence / Siemens AI Lab About Me: Affiliations time Bachelors (B.Tech-IT) 2006-10 2010-13 Senior Software Developer Masters (MSc-CS) & Research Assistant Bachelor Internship 2009 2013-15 Working Student & Master Thesis 2013-15 Starting PhD 2015 PhD Research Intern (4 months) 2016-17 Research Scientist - NLP/ML 2017-Now PhD Submission 2019 Master Thesis: Deep Learning Methods for the Extraction of Relations in Natural Language Text PhD Thesis Title (tentative): Neural Models of Information Extraction from Natural Language Text Reach me: https://sites.google.com/view/gupta-pankaj/
  • 3. Unrestricted © Siemens AG 2019 January 2019Page 3 Machine Intelligence / Siemens AI Lab About Me: Research Neural Topic ModelingNeural Relation Extraction Interpretability ➢ Intra- and Inter-sentential RE ➢ Joint Entity & RE ➢ Weakly-supervised Bootstrapping RE ➢ Autoregressive TMs ➢ Word Embeddings Aware TM ➢ Language Structure Aware TM (textTOvec) ➢ Multi-view Transfer Learning in TM Interpretable RE Interpretable topics Transfer Learning Lifelong Learning ➢ Explaining RNN predictions
  • 4. Unrestricted © Siemens AG 2019 January 2019Page 4 Machine Intelligence / Siemens AI Lab Outline Two Tracks: 1/2 Track: Relation Extraction Neural Relation Extraction Within and Across Sentence Boundaries Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler. In AAAI-2019. 2/2 Track: Topic Modeling & Representation Learning (Briefly) Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019.
  • 5. Unrestricted © Siemens AG 2019 January 2019Page 5 Machine Intelligence / Siemens AI Lab Neural Relation Extraction Within and Across Sentence Boundaries Introduction: Relation Extraction spanning sentence boundaries Proposed Methods ➢ Inter-sentential Dependency-Based Neural Networks (iDepNN) → Inter-sentential Shortest Dependency Path (iDepNN-SDP) → Inter-sentential Augmented Dependency Path (iDepNN-ADP) Evaluation and Analysis ➢ State-of-the-art comparison ➢ Error analysis
  • 9. Unrestricted © Siemens AG 2019 January 2019Page 9 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S
  • 10. Unrestricted © Siemens AG 2019 January 2019Page 10 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Paul Allen has started a company and named [Vern Raburn]e1 its [president]e2. relation: per-post(e1,e2)
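Slides 9-10 define the task; as a reading aid, here is a minimal sketch of how one such labeled instance could be represented in code. The class and field names are illustrative assumptions, not the paper's data format.

```python
# Hypothetical container for a binary RE training instance, mirroring the
# per-post example above; all names here are illustrative.
from dataclasses import dataclass

@dataclass
class REInstance:
    snippet: str   # text spanning one or more sentences
    e1: str        # first entity mention
    e2: str        # second entity mention
    label: str     # relation label, e.g., "per-post(e1,e2)"

ex = REInstance(
    "Paul Allen has started a company and named [Vern Raburn]e1 its [president]e2.",
    "Vern Raburn", "president", "per-post(e1,e2)")
print(ex.label)
```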
  • 11. Unrestricted © Siemens AG 2019 January 2019Page 11 Machine Intelligence / Siemens AI Lab Need for Relation Extraction (RE) → A large part of information is expressed in free text, e.g., in web pages, blogs, social media, etc. → Need for automatic systems to extract the relevant information in the form of a structured KB • Entity Extraction • Relation Extraction • Structure the unstructured text • Knowledge Graph Construction • In web search, retrieval, Q&A, etc. Information Extraction Entity Extraction: detect entities such as person, organization, location, product, technology, sensor, etc. Relation Extraction: detect the relation between the given entities or nominals End-to-End Knowledge Base Population [Figure: Text Documents → IE Engine → Knowledge Graph, with Sensor entities linked by Competitor-of]
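The KB population described on slide 11 ultimately yields relation triples; a toy sketch of that output form with made-up data (this is not the Siemens IE engine):

```python
# Hypothetical KB-population output: relation triples indexed into a graph.
triples = [
    ("Vern Raburn", "per-post", "president"),
    ("Vern Raburn", "per-org", "Paul Allen Group"),
]
knowledge_graph = {}
for head, rel, tail in triples:
    knowledge_graph.setdefault(head, []).append((rel, tail))  # adjacency list
print(knowledge_graph["Vern Raburn"])
```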
  • 12. Unrestricted © Siemens AG 2019 January 2019Page 12 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) Relation Extraction (Based on location of entities) Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S
  • 13. Unrestricted © Siemens AG 2019 January 2019Page 13 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities)
  • 14. Unrestricted © Siemens AG 2019 January 2019Page 14 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works This work Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities) inter-sentential (entities across sentence boundaries)
  • 15. Unrestricted © Siemens AG 2019 January 2019Page 15 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works This work Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities) Paul Allen has started a company and named [Vern Raburn]e1 its [president]e2. relation: per-post(e1,e2) Example inter-sentential (entities across sentence boundaries)
  • 16. Unrestricted © Siemens AG 2019 January 2019Page 16 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works This work Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities) Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. relation: per-org(e1,e2) inter-sentential (entities across sentence boundaries) Example
  • 17. Unrestricted © Siemens AG 2019 January 2019Page 17 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works This work Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities) Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. relation: ?? inter-sentential (entities across sentence boundaries) MISSED relationships impact system performance, leading to POOR RECALL
  • 18. Unrestricted © Siemens AG 2019 January 2019Page 18 Machine Intelligence / Siemens AI Lab Introduction: Relation Extraction (RE) intra-sentential (entities within sentence boundary) most prior works This work Binary Relation Extraction (RE): identify the semantic relationship between a pair of nominals or entities e1 and e2 in a given text snippet S Relation Extraction (Based on location of entities) Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. relation: per-org(e1,e2) inter-sentential (entities across sentence boundaries) Capture relationships between entities at a distance, across sentence boundaries
  • 19. Unrestricted © Siemens AG 2019 January 2019Page 19 Machine Intelligence / Siemens AI Lab Challenges in Inter-sentential Relation Extraction (RE) This work Paul Allen has started a company and named [Vern Raburn]e1 its president. The company will coordinate the overall strategy for the group of high-tech companies that Mr. Allen owns or holds a significant stake in, will be based in Bellevue, Washington and called [Paul Allen Group]e2. relation: per-org(e1,e2) inter-sentential (entities across sentence boundaries)
  • 20. Unrestricted © Siemens AG 2019 January 2019Page 20 Machine Intelligence / Siemens AI Lab Challenges in Inter-sentential Relation Extraction (RE) This work Paul Allen has started a company and named [Vern Raburn]e1 its president. The company will coordinate the overall strategy for the group of high-tech companies that Mr. Allen owns or holds a significant stake in, will be based in Bellevue, Washington and called [Paul Allen Group]e2. relation: per-org(e1,e2) inter-sentential (entities across sentence boundaries) NOISY text in relationships spanning sentence boundaries: POOR PRECISION
  • 21. Unrestricted © Siemens AG 2019 January 2019Page 21 Machine Intelligence / Siemens AI Lab Challenges in Inter-sentential Relation Extraction (RE) This work Paul Allen has started a company and named [Vern Raburn]e1 its president. The company will coordinate the overall strategy for the group of high-tech companies that Mr. Allen owns or holds a significant stake in, will be based in Bellevue, Washington and called [Paul Allen Group]e2. relation: per-org(e1,e2) inter-sentential (entities across sentence boundaries) NOISY text in relationships spanning sentence boundaries: POOR PRECISION Need: a robust system to tackle false positives in inter-sentential RE
  • 22. Unrestricted © Siemens AG 2019 January 2019Page 22 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification 4. Tree-RNNs effective in modeling relations via recursive compositionality 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships limited to single sentences, i.e., intra-sentential relationships
  • 24. Unrestricted © Siemens AG 2019 January 2019Page 24 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships 4. Tree-RNNs effective in modeling relations via recursive compositionality 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification
  • 25. Unrestricted © Siemens AG 2019 January 2019Page 25 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction Sentences and their dependency graphs Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. Shortest Dependency Path (SDP) from root to entity e1
  • 26. Unrestricted © Siemens AG 2019 January 2019Page 26 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction Sentences and their dependency graphs Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. Shortest Dependency Path (SDP) from root to entity e2
  • 27. Unrestricted © Siemens AG 2019 January 2019Page 27 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction Sentences and their dependency graphs Inter-sentential Shortest Dependency Path (iSDP) across the sentence boundary. Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. iSDP → Connection between the roots of adjacent sentences by NEXTS
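One way to realize the iSDP of slide 27: merge the per-sentence dependency trees into a single graph, add a NEXTS edge between the roots of adjacent sentences, and take the shortest path between the two entity heads. A minimal sketch, assuming spaCy (with en_core_web_sm installed) and networkx rather than the paper's actual tooling, and a naive exact-match entity lookup:

```python
import networkx as nx
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes the model is installed

def inter_sentential_sdp(text, e1, e2):
    doc = nlp(text)
    g = nx.Graph()
    for tok in doc:                      # dependency edges within each sentence
        for child in tok.children:
            g.add_edge(tok.i, child.i)
    roots = [sent.root.i for sent in doc.sents]
    for a, b in zip(roots, roots[1:]):   # NEXTS edge between adjacent roots
        g.add_edge(a, b)
    i1 = next(t.i for t in doc if t.text == e1)  # naive single-token lookup
    i2 = next(t.i for t in doc if t.text == e2)
    return [doc[i].text for i in nx.shortest_path(g, i1, i2)]

print(inter_sentential_sdp(
    "Paul Allen has started a company and named Raburn its president. "
    "The company will be based in Bellevue.", "Raburn", "Bellevue"))
```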
  • 28. Unrestricted © Siemens AG 2019 January 2019Page 28 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification 4. Tree-RNNs effective in modeling relations via recursive compositionality 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships limited to single sentences, i.e., intra-sentential relationships
  • 29. Unrestricted © Siemens AG 2019 January 2019Page 29 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction Sentences and their dependency graphs Inter-sentential Shortest Dependency Path (iSDP) across the sentence boundary. Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. subtree | iSDP → Connection between the roots of adjacent sentences by NEXTS
  • 30. Unrestricted © Siemens AG 2019 January 2019Page 30 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships limited to single sentences, i.e., intra-sentential relationships 4. Tree-RNNs effective in modeling relations via recursive compositionality
  • 31. Unrestricted © Siemens AG 2019 January 2019Page 31 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction Sentences and their dependency graphs Inter-sentential Shortest Dependency Path (iSDP) across the sentence boundary. Paul Allen has started a company and named [Vern Raburn]e1 its president. The company, to be called [Paul Allen Group]e2 will be based in Bellevue, Washington. subtree | iSDP → Connection between the roots of adjacent sentences by NEXTS
  • 36. Unrestricted © Siemens AG 2019 January 2019Page 36 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships 4. Tree-RNNs effective in modeling relations via recursive compositionality limited to single sentences, i.e., intra-sentential relationships
  • 37. Unrestricted © Siemens AG 2019 January 2019Page 37 Machine Intelligence / Siemens AI Lab Motivation: Dependency Based Relation Extraction 1. Dependency Parse trees effective in extracting relationships limited to single sentences, i.e., intra-sentential relationships 2. Shortest Dependency Path (SDP) between entities in parse trees effective in RE limited to single sentences, i.e., intra-sentential relationships ignore additional information relevant in relation identification 3. Augmented Dependency Path (ADP) precisely models relationships limited to single sentences, i.e., intra-sentential relationships 4. Tree-RNNs effective in modeling relations via recursive compositionality limited to single sentences, i.e., intra-sentential relationships Exploit these properties in inter-sentential RE via a unified neural framework of: → bi-RNN modeling SDP → RecNN modeling ADP
  • 38. Unrestricted © Siemens AG 2019 January 2019Page 38 Machine Intelligence / Siemens AI Lab Contribution Propose a novel neural approach for Inter-sentential Relation Extraction
  • 39. Unrestricted © Siemens AG 2019 January 2019Page 39 Machine Intelligence / Siemens AI Lab Contribution 1. Neural architecture based on dependency parse trees ➔ named as inter-sentential Dependency-based Neural Network (iDepNN) 2. Unified neural framework of a bidirectional RNN (biRNN) and Recursive NN (RecNN) 3. Extract relations within and across sentence boundaries by modeling: ➔ shortest dependency path (SDP) using biRNN ➔ augmented dependency path (ADP) using RecNN Propose a novel neural approach for Inter-sentential Relation Extraction Contribution
  • 42. Unrestricted © Siemens AG 2019 January 2019Page 42 Machine Intelligence / Siemens AI Lab Contribution 1. Neural architecture based on dependency parse trees ➔ named as inter-sentential Dependency-based Neural Network (iDepNN) 2. Unified neural framework of a bidirectional RNN (biRNN) and Recursive NN (RecNN) 3. Extract relations within and across sentence boundaries by modeling: ➔ shortest dependency path (SDP) using biRNN ➔ augmented dependency path (ADP) using RecNN Propose a novel neural approach for Inter-sentential Relation Extraction Contribution 1. precisely extract relationships within and across sentence boundaries 2. show a better balance in precision and recall with an improved F1 score Benefits
  • 43. Unrestricted © Siemens AG 2019 January 2019Page 43 Machine Intelligence / Siemens AI Lab Proposed Approach: Neural Intra- and inter-sentential RE
  • 44. Unrestricted © Siemens AG 2019 January 2019Page 44 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-SDP and iDepNN-ADP 1. Modeling Inter-sentential Shortest Dependency Path 2. Modeling Inter-sentential Dependency Subtrees 1+2: Modeling Inter-sentential Augmented Dependency Path
  • 45. Unrestricted © Siemens AG 2019 January 2019Page 45 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-SDP and iDepNN-ADP
  • 46. Unrestricted © Siemens AG 2019 January 2019Page 46 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-SDP 1. Modeling Inter-sentential Shortest Dependency Path
  • 48. Unrestricted © Siemens AG 2019 January 2019Page 48 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-ADP subtree 2. Modeling Inter-sentential Dependency Subtrees
  • 49. Unrestricted © Siemens AG 2019 January 2019Page 49 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-ADP Compute subtree embedding
  • 50. Unrestricted © Siemens AG 2019 January 2019Page 50 Machine Intelligence / Siemens AI Lab Proposed Approach: Intra- and inter-sentential RE Inter-sentential Dependency-based Neural Network variants: iDepNN-ADP 1+2: Modeling Inter-sentential Augmented Dependency Path → Offers precise structure → Offers additional information in classifying the relation
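A compact sketch of how the two variants combine, as slides 44-50 describe: a recursive composition produces subtree embeddings (the ADP part), which augment the word embeddings fed to a bidirectional RNN over the shortest path (the SDP part). Written in PyTorch; the dimensions, the tanh composition, and using the final hidden state for classification are assumptions, not the paper's exact architecture:

```python
import torch
import torch.nn as nn

class SubtreeComposer(nn.Module):
    """Recursive NN: compose a subtree embedding bottom-up from its children."""
    def __init__(self, dim):
        super().__init__()
        self.w_word = nn.Linear(dim, dim)
        self.w_child = nn.Linear(dim, dim, bias=False)

    def forward(self, emb, tree, node):
        # tree: dict mapping a token id to the ids of its dependents
        child_sum = torch.zeros(self.w_word.out_features)
        for child in tree.get(node, []):
            child_sum = child_sum + self.forward(emb, tree, child)
        return torch.tanh(self.w_word(emb(torch.tensor(node)))
                          + self.w_child(child_sum))

class IDepNNADP(nn.Module):
    """biRNN over the (inter-sentential) SDP; each step sees word + subtree embs."""
    def __init__(self, vocab, dim, hidden, n_rel):
        super().__init__()
        self.emb = nn.Embedding(vocab, dim)
        self.rec = SubtreeComposer(dim)
        self.birnn = nn.RNN(2 * dim, hidden, bidirectional=True, batch_first=True)
        self.out = nn.Linear(2 * hidden, n_rel)

    def forward(self, sdp, tree):
        # sdp: token ids along the shortest dependency path between e1 and e2
        feats = [torch.cat([self.emb(torch.tensor(t)),
                            self.rec(self.emb, tree, t)]) for t in sdp]
        h, _ = self.birnn(torch.stack(feats).unsqueeze(0))  # (1, len, 2*hidden)
        return self.out(h[:, -1])                           # relation scores

model = IDepNNADP(vocab=100, dim=8, hidden=16, n_rel=3)
print(model(sdp=[5, 2, 9], tree={2: [7, 8], 9: [1]}).shape)  # (1, 3)
```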
  • 51. Unrestricted © Siemens AG 2019 January 2019Page 51 Machine Intelligence / Siemens AI Lab Evaluation and Analysis
  • 52. Unrestricted © Siemens AG 2019 January 2019Page 52 Machine Intelligence / Siemens AI Lab Evaluation and Analysis: Datasets Datasets ➢ evaluate on four datasets from medical and news domains Count of intra- and inter-sentential relationships in datasets
  • 57. Unrestricted © Siemens AG 2019 January 2019Page 57 Machine Intelligence / Siemens AI Lab Evaluation and Analysis: Datasets Datasets ➢ evaluate on four datasets from medical and news domains Count of intra- and inter-sentential relationships in datasets Result discussed in this talk Lives_In → two arguments: the bacterium and the location, where the location is a Habitat (e.g., microbial ecology such as hosts, environment, food, etc.) or a Geographical entity (e.g., geographical and organizational places) Data: http://2016.bionlp-st.org/tasks/bb2
  • 58. Unrestricted © Siemens AG 2019 January 2019Page 58 Machine Intelligence / Siemens AI Lab Evaluation and Analysis: Datasets + Baselines Datasets ➢ evaluate on four datasets from medical and news domains Baselines: → SVM, graphLSTM, i-biRNN and i-biLSTM Count of intra- and inter-sentential relationships in datasets Result discussed in this talk graphLSTMs: Peng et al., 2017. Cross-Sentence N-ary Relation Extraction with Graph LSTMs.
  • 59. Unrestricted © Siemens AG 2019 January 2019Page 59 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1) Sentence range k: k = 0 ➔ Intra-sentential; k > 0 ➔ Inter-sentential
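The sentence range k is just the number of sentence boundaries between the two entities; a small helper for bucketing candidates accordingly, with assumed input conventions:

```python
# Sketch: bucket relation candidates by sentence range k, where k = 0 means
# intra-sentential and k > 0 inter-sentential; inputs are sentence indices.
def sentence_range(sent_index_e1, sent_index_e2):
    return abs(sent_index_e1 - sent_index_e2)

def bucket(candidates):
    # candidates: list of (sent_index_e1, sent_index_e2, label) tuples
    buckets = {}
    for s1, s2, label in candidates:
        buckets.setdefault(sentence_range(s1, s2), []).append(label)
    return buckets

print(bucket([(0, 0, "Lives_In"), (0, 1, "Lives_In"), (2, 0, "NONE")]))
```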
  • 60. Unrestricted © Siemens AG 2019 January 2019Page 60 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1): Intra-sentential Training iDepNN-ADP is more precise in inter-sentential RE than both SVM and graphLSTM
  • 61. Unrestricted © Siemens AG 2019 January 2019Page 61 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1): Intra-sentential Training iDepNN-ADP outperforms both SVM and graphLSTM in terms of F1 in inter-sentential RE due to a better balance in precision and recall
  • 62. Unrestricted © Siemens AG 2019 January 2019Page 62 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1): Inter-sentential Training iDepNN-ADP outperforms both SVM and graphLSTM in terms of P and F1 in inter-sentential RE
  • 63. Unrestricted © Siemens AG 2019 January 2019Page 63 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1): Inter-sentential Training
  • 64. Unrestricted © Siemens AG 2019 January 2019Page 64 Machine Intelligence / Siemens AI Lab Results (Precision / Recall / F1): Ensemble
  • 65. Unrestricted © Siemens AG 2019 January 2019Page 65 Machine Intelligence / Siemens AI Lab Ensemble with Thresholding on Prediction Probability Ensemble scores at various thresholds. p: output probability; pr: the count of predictions.
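A hedged sketch of the thresholded ensemble: a relation prediction is kept only when at least pr ensemble members assign it probability >= p. The voting scheme shown is an assumption about how p and pr interact; the paper defines the exact rule:

```python
# Keep a relation label only if >= pr members predict it with probability >= p.
from collections import Counter

def ensemble_predict(member_probs, p=0.85, pr=2):
    # member_probs: one dict {relation_label: probability} per ensemble member
    votes = Counter(label for probs in member_probs
                    for label, prob in probs.items() if prob >= p)
    label, count = votes.most_common(1)[0] if votes else (None, 0)
    return label if count >= pr else None

print(ensemble_predict([{"Lives_In": 0.90}, {"Lives_In": 0.88}, {"NONE": 0.70}]))
```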
  • 66. Unrestricted © Siemens AG 2019 January 2019Page 66 Machine Intelligence / Siemens AI Lab Official Scores: State-of-the-art Comparison Ensemble scores at various thresholds. p: output probability; pr: the count of predictions. Official results on the test set: comparison with the published systems in the BioNLP ST 2016.
  • 67. Unrestricted © Siemens AG 2019 January 2019Page 67 Machine Intelligence / Siemens AI Lab Error Analysis: BioNLP ST 2016 dataset
  • 69. Unrestricted © Siemens AG 2019 January 2019Page 69 Machine Intelligence / Siemens AI Lab Error Analysis: BioNLP ST 2016 dataset Fewer false positives in iDepNN-ADP than in both SVM and graphLSTM [Figure panels: iDepNN-ADP | SVM | graphLSTM]
  • 70. Unrestricted © Siemens AG 2019 January 2019Page 70 Machine Intelligence / Siemens AI Lab Key Takeaways ➢ Propose a novel neural approach iDepNN for Inter-sentential Relation Extraction ➢ Precisely extract relations within and across sentence boundaries by modeling: ➔ shortest dependency path (SDP) using biRNN, i.e., iDepNN-SDP ➔ augmented dependency path (ADP) using RecNN, i.e., iDepNN-ADP ➢ Demonstrate a better balance in precision and recall with an improved F1 score ➢ Evaluate on 4 datasets from news and medical domains ➢ Achieve a gain of 5.2% (0.587 vs 0.558) in F1 over the winning team (out of 11 teams) in BioNLP Shared Task (ST) 2016 Code and Data: https://github.com/pgcool/Cross-sentence-Relation-Extraction-iDepNN
  • 71. Unrestricted © Siemens AG 2019 January 2019Page 71 Machine Intelligence / Siemens AI Lab Outline 1/2 Tracks: Relation Extraction Neural Relation Extraction Within and Across Sentence Boundaries Pankaj Gupta, Subburam Rajaram, Hinrich Schütze, Thomas Runkler. In AAAI-2019. Active Research in Information Extraction: → Neural Models of Lifelong Learning for Information Extraction → Weakly-supervised Neural Bootstrapping for Relation Extraction (PhD Student: Mr. Usama Yaseen)
  • 72. Unrestricted © Siemens AG 2019 January 2019Page 72 Machine Intelligence / Siemens AI Lab Research Outline Neural Topic Modeling | Neural Relation Extraction Interpretability ➢ Intra- and Inter-sentential RE ➢ Joint Entity & RE ➢ Weakly-supervised Bootstrapping RE ➢ Autoregressive TMs ➢ Word Embeddings Aware TM ➢ Language Structure Aware TM (textTOvec) ➢ Multi-view Transfer Learning in TM Interpretable RE Interpretable topics Transfer Learning Lifelong Learning ➢ Explaining RNN predictions
  • 74. Unrestricted © Siemens AG 2019 January 2019Page 74 Machine Intelligence / Siemens AI Lab Outline (Brief Introduction) 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review.
  • 75. Unrestricted © Siemens AG 2019 January 2019Page 75 Machine Intelligence / Siemens AI Lab Outline (Brief Introduction) 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 TL;DR → Improved Topic Modeling with full-contexts and pre-trained word embeddings textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. TL;DR → Improved Topic modeling with language structures (e.g., word ordering, local syntax and semantic information); Composite Model of a neural topic and neural language model Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review. TL;DR → Improved Topic modeling with knowledge transfer via local as well as global semantics
  • 77. Unrestricted © Siemens AG 2019 January 2019Page 77 Machine Intelligence / Siemens AI Lab Outline (Brief Introduction) 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 TL;DR → Improved Topic Modeling with context-awareness and pre-trained word embeddings textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. TL;DR → Improved Topic modeling with language structures (e.g., word ordering, local syntax and semantic information); Composite Model of a neural topic and neural language model Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review. TL;DR → Improved Topic modeling with knowledge transfer via local and global semantics
  • 78. Unrestricted © Siemens AG 2019 January 2019Page 78 Machine Intelligence / Siemens AI Lab Topic Modeling ➢ statistical modeling that examines how words co-occur across a collection of documents, and ➢ automatically discovers coherent groups of words (i.e., themes or topics) that best explain the corpus ➢ Each document is composed of a mixture of topics, and each topic is composed of a collection of words Source: http://www.cs.columbia.edu/~blei/papers/Blei2012.pdf
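To ground the definition on slide 78, a toy run of a classical topic model; gensim's LDA is used purely for illustration here (the talk's own models are neural, and gensim is an assumption, not part of this work):

```python
# Toy topic discovery: two tiny "themes" should separate into two topics.
from gensim import corpora, models

docs = [["shares", "price", "profits", "earnings"],
        ["bacterium", "habitat", "host", "environment"],
        ["price", "shares", "rises", "earnings"]]
dic = corpora.Dictionary(docs)
lda = models.LdaModel([dic.doc2bow(d) for d in docs], num_topics=2,
                      id2word=dic, passes=10, random_state=0)
for topic in lda.print_topics():
    print(topic)
```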
  • 79. Unrestricted © Siemens AG 2019 January 2019Page 79 Machine Intelligence / Siemens AI Lab Outline (Brief Introduction) 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 TL;DR → Improved Topic Modeling with full-contexts and pre-trained word embeddings textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. TL;DR → Improved Topic modeling with language structures (e.g., word ordering, local syntax and semantic information); Composite Model of a neural topic and neural language model Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review. TL;DR → Improved Topic modeling with knowledge transfer via local as well as global semantics
  • 80. Unrestricted © Siemens AG 2019 January 2019Page 80 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Need for Distributional Semantics / Prior Knowledge ➢ “Lack of Context” in short-text documents, e.g., headlines, tweets, etc. ➢ “Lack of Context” in a corpus of few documents Small number of Word co-occurrences
  • 81. Unrestricted © Siemens AG 2019 January 2019Page 81 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Need for Distributional Semantics / Prior Knowledge ➢ “Lack of Context” in short-text documents, e.g., headlines, tweets, etc. ➢ “Lack of Context” in a corpus of few documents Small number of Word co-occurrences Lack of Context Difficult to learn good representations Generate Incoherent Topics
  • 82. Unrestricted © Siemens AG 2019 January 2019Page 82 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Need for Distributional Semantics / Prior Knowledge ➢ "Lack of Context" in short-text documents, e.g., headlines, tweets, etc. ➢ "Lack of Context" in a corpus of few documents Small number of Word co-occurrences Lack of Context Difficult to learn good representations Generate Incoherent Topics Example topics for 'trading': Topic1: price, wall, china, fall, shares (incoherent); Topic2: shares, price, profits, rises, earnings (coherent)
  • 83. Unrestricted © Siemens AG 2019 January 2019Page 83 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Need for Distributional Semantics / Prior Knowledge ➢ "Lack of Context" in short-text documents, e.g., headlines, tweets, etc. ➢ "Lack of Context" in a corpus of few documents Small number of Word co-occurrences Lack of Context Difficult to learn good representations Generate Incoherent Topics Example topics for 'trading': Topic1: price, wall, china, fall, shares (incoherent); Topic2: shares, price, profits, rises, earnings (coherent) TO RESCUE: Use external/additional information, e.g., WORD EMBEDDINGS (encode semantic and syntactic relatedness of words in a vector space)
  • 84. Unrestricted © Siemens AG 2019 January 2019Page 84 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Need for Distributional Semantics / Prior Knowledge ➢ "Lack of Context" in short-text documents, e.g., headlines, tweets, etc. ➢ "Lack of Context" in a corpus of few documents Small number of Word co-occurrences Lack of Context Difficult to learn good representations Generate Incoherent Topics TO RESCUE: Use external/additional information, e.g., WORD EMBEDDINGS (encode semantic and syntactic relatedness of words in a vector space) No word overlap (e.g., under 1-hot encoding), yet same topic class → trading
  • 85. Unrestricted © Siemens AG 2019 January 2019Page 85 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) ➢ introduce weighted pre-trained word embedding aggregation (mixture weights) at each autoregressive step k ➢ E: pre-trained embeddings (e.g., GloVe) as a fixed prior ➢ generate topics with embeddings ➢ learn a complementary textual representation [Figure: Baseline Model vs Proposed Model]
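As a reading aid for slide 85, a NumPy sketch of the aggregation, assuming the hidden state takes the DocNADE-style form h_k = g(c + Σ_{i&lt;k}(W[:, v_i] + λ E[:, v_i])) with a sigmoid non-linearity g and a fixed prior E; consult the paper for the exact formulation:

```python
import numpy as np

def hidden_states(v, W, E, c, lam=1.0):
    # v: word ids of one document; W: (H, V) learned topic matrix;
    # E: (H, V) fixed pre-trained embedding prior (e.g., GloVe); c: (H,) bias
    acc = c.astype(float).copy()
    states = []
    for w in v:
        states.append(1.0 / (1.0 + np.exp(-acc)))  # h_k sees only words i < k
        acc += W[:, w] + lam * E[:, w]              # weighted prior aggregation
    return np.stack(states)

rng = np.random.default_rng(0)
print(hidden_states([3, 1, 4], rng.normal(size=(8, 10)),
                    rng.normal(size=(8, 10)), np.zeros(8)).shape)  # (3, 8)
```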
  • 86. Unrestricted © Siemens AG 2019 January 2019Page 86 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) IR-precision (on short-text datasets) → Precision at different retrieval fractions; higher is better Evaluation: Applicability (Information Retrieval)
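IR-precision at a retrieval fraction can be computed as below: for each query document, retrieve the top fraction of the corpus by similarity and measure the fraction sharing the query's label. A sketch under the assumption of L2-normalized document vectors; it is not the evaluation script from the paper:

```python
import numpy as np

def ir_precision(doc_vecs, labels, fraction=0.02):
    # doc_vecs: (N, D) L2-normalized document vectors; labels: (N,) class ids
    sims = doc_vecs @ doc_vecs.T
    np.fill_diagonal(sims, -np.inf)          # never retrieve the query itself
    k = max(1, int(fraction * (len(labels) - 1)))
    hits = []
    for q in range(len(labels)):
        top = np.argsort(-sims[q])[:k]       # top-fraction nearest documents
        hits.append(np.mean(labels[top] == labels[q]))
    return float(np.mean(hits))

rng = np.random.default_rng(0)
v = rng.normal(size=(50, 8))
v /= np.linalg.norm(v, axis=1, keepdims=True)
print(ir_precision(v, rng.integers(0, 3, size=50), fraction=0.1))
```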
  • 87. Unrestricted © Siemens AG 2019 January 2019Page 87 Machine Intelligence / Siemens AI Lab Document Informed Neural Autoregressive Topic Models with Distributional Prior (AAAI-19) Takeaways of this work: ➢ Leveraging full contextual information in a neural autoregressive topic model ➢ Introducing distributional priors via pre-trained word embeddings ➢ Gain of 5.2% (404 vs 426) in perplexity, 2.8% (.74 vs .72) in topic coherence, 11.1% (.60 vs .54) in precision at retrieval fraction 0.02, 5.2% (.664 vs .631) in F1 for text categorization on average over 15 datasets ➢ Learning better word/document representations for short/long texts Try it out: The code and data are available at https://github.com/pgcool/iDocNADEe @PankajGupta262
  • 88. Unrestricted © Siemens AG 2019 January 2019Page 88 Machine Intelligence / Siemens AI Lab Local vs Global Semantics Language Models have Local View (semantics): → A vector-space representation for each word, based on the local word collocation patterns → Due to word-word co-occurrence, limited to a window size (e.g., word2vec) or a sentence (e.g., ELMo) → Information beyond the limited context is not exposed → Good at capturing local syntactic and semantic information
  • 89. Unrestricted © Siemens AG 2019 January 2019Page 89 Machine Intelligence / Siemens AI Lab Local vs Global Semantics Language Models have Local View (semantics): → A vector-space representation for each word, based on the local word collocation patterns → Due to word-word co-occurrence, limited to a window size (e.g., word2vec) or a sentence (e.g., ELMo) → Information beyond the limited context is not exposed → Good at capturing local syntactic and semantic information → Difficulties in capturing long-range dependencies
  • 92. Unrestricted © Siemens AG 2019 January 2019Page 92 Machine Intelligence / Siemens AI Lab Local vs Global Semantics Topic Models have Global View (semantics): → Due to document-word occurrences (i.e., words are similar if these words similarly appear in documents) → Access to document context, not limited by local context → Good at capturing thematic structures or long-range dependencies in document collection Topic models have global view in the sense that each topic is learned by leveraging statistical information across documents
  • 94. Unrestricted © Siemens AG 2019 January 2019Page 94 Machine Intelligence / Siemens AI Lab Local vs Global Semantics Topic Models have Global View (semantics): → Due to document-word occurrences (i.e., words are similar if these words similarly appear in documents) → Access to document context, not limited by local context → Good at capturing thematic structures or long-range dependencies in document collection → No language structures (e.g., word ordering, local syntactic and semantic information, etc.) → Difficulties in capturing short-range dependencies Same unigram statistics, but different topics (Source Text → Sense/Topic): "Market falls into bear territory" → "trading"; "Bear falls into market territory" → "trading" Language structure helps in determining the actual meaning!
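The word-order point is easy to verify: both example sentences have identical bags of words, so any order-ignoring model receives exactly the same input.

```python
# Both sentences from slide 94 share identical unigram statistics.
from collections import Counter

s1 = "Market falls into bear territory".lower().split()
s2 = "Bear falls into market territory".lower().split()
print(Counter(s1) == Counter(s2))  # True
```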
  • 95. Unrestricted © Siemens AG 2019 January 2019Page 95 Machine Intelligence / Siemens AI Lab textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior (ICLR-19) Incorporate language structures in Topic Models → accounting for word ordering, latent syntactic and semantic features → improving word and document representations, including polysemy Improving Topic Modeling for short-text and long-text documents via contextualized features and external knowledge Incorporate external knowledge for each word → using distributional semantics, i.e., word embeddings → improving document representations and topics
  • 96. Unrestricted © Siemens AG 2019 January 2019Page 96 Machine Intelligence / Siemens AI Lab textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior (ICLR-19) Advantages of Composite Modeling: → introduce language structure into neural autoregressive topic models via an LSTM-LM, such as word ordering, language concepts and long-range dependencies. → probability of each word is a function of global and local contexts, modeled via DocNADE and LSTM-LM, respectively. → offers learning complementary semantics by combining joint word and latent topic learning in a unified neural autoregressive framework. contextualized-Document Neural Autoregressive Distribution Estimator (ctx-DocNADE) with pre-trained word embeddings (ctx-DocNADEe)
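An illustrative PyTorch sketch of the composite idea: mix a DocNADE-style bag-of-words hidden state (global view) with an LSTM-LM hidden state (local view) before predicting each word. The additive mixing with weight lam and all dimensions are assumptions; the released implementation is linked on the next slide:

```python
import torch
import torch.nn as nn

class CtxDocNADESketch(nn.Module):
    def __init__(self, vocab, hidden, lam=0.5):
        super().__init__()
        self.emb = nn.Embedding(vocab, hidden)   # input embeddings for the LM
        self.W = nn.Embedding(vocab, hidden)     # DocNADE topic matrix as lookup
        self.lstm = nn.LSTM(hidden, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab)
        self.lam = lam

    def forward(self, doc):                      # doc: (1, T) word ids
        w = self.W(doc)                          # (1, T, hidden)
        topic = torch.sigmoid(torch.cumsum(w, dim=1) - w)  # global view: i < k
        lm, _ = self.lstm(self.emb(doc))         # local view: LSTM-LM states
        return self.out(topic + self.lam * lm)   # mixed hidden -> word scores

model = CtxDocNADESketch(vocab=50, hidden=16)
print(model(torch.tensor([[3, 1, 4, 1]])).shape)  # (1, 4, 50)
```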
  • 97. Unrestricted © Siemens AG 2019 January 2019Page 97 Machine Intelligence / Siemens AI Lab textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior (ICLR-19) Code: https://github.com/pgcool/textTOvec
  • 98. Unrestricted © Siemens AG 2019 January 2019Page 98 Machine Intelligence / Siemens AI Lab Outline: Topic Modeling (Brief Introduction) 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 TL;DR → Improved Topic Modeling with context-awareness and pre-trained word embeddings textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. TL;DR → Improved Topic modeling with language structures (e.g., word ordering, local syntax and semantic information); Composite Model of a neural topic and neural language model Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review. TL;DR → Improved Topic modeling with knowledge transfer via local and global semantics
  • 99. Unrestricted © Siemens AG 2019 January 2019Page 99 Machine Intelligence / Siemens AI Lab Outline: Topic Modeling 2/2 Tracks: Topic Modeling & Representation Learning Document Informed Neural Autoregressive Topic Models with Distributional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. In AAAI-2019 textTOvec: Deep Contextualized Neural Autoregressive Topic Models of Language with Distributed Compositional Prior Pankaj Gupta, Yatin Chaudhary, Florian Buettner, Hinrich Schütze. To appear in ICLR-2019. Active Research in Topic Modeling & Representation Learning: → Multi-view and Multi-source Transfers in Neural Topic Modeling Pankaj Gupta, Yatin Chaudhary, Hinrich Schütze. Under review. → Improving Language Models with Global Semantics via Neural Composite Networks → Lifelong Neural Topic Learning (PhD Student: Mr. Yatin Chaudhary)
  • 100. Unrestricted © Siemens AG 2019 January 2019Page 100 Machine Intelligence / Siemens AI Lab Summary & Thanks! Neural Topic Modeling | Neural Relation Extraction Interpretability ➢ Intra- and Inter-sentential RE ➢ Joint Entity & RE ➢ Weakly-supervised Bootstrapping RE ➢ Autoregressive TMs ➢ Word Embeddings Aware TM ➢ Language Structure Aware TM (textTOvec) ➢ Multi-view Transfer Learning in TM Interpretable RE Interpretable topics Transfer Learning Lifelong Learning ➢ Explaining RNN predictions ReachMe / Talks: https://sites.google.com/view/gupta-pankaj/