Open IE tutorial 2018

André Freitas
Open Information
Extraction for QA
OKBQA 2018

André Freitas
Building Knowledge
Graphs for QA
OKBQA 2018

Organisation
• Goals of this Tutorial:
– Try to answer the question: What is a knowledge graph?
– Depict open IE as a foundation to build KGs.
– Establish how different semantic perspectives (including
open IE) can be integrated into a unified Knowledge
Representation model.
• Method:
– Focus on the contemporary and emerging perspectives.
– Sampling exemplar approaches and infrastructures on
each of these emerging perspectives (not an exhaustive
survey).
– Emphasis on Open IE.

“On our best behaviour”
“We need to return to our roots in Knowledge
Representation and Reasoning for language and from
language.”
Levesque, 2013
“We should not treat English text as a monolithic
source of information. Instead, we should carefully study
how simple knowledge bases might be used to make
sense of the simple language needed to build slightly
more complex knowledge bases.”

Outline
1. What is a Knowledge Graph?
2. Building a Knowledge Graph
(with Open IE)
3. Querying Knowledge Graphs
4. Inferences over Knowledge Graphs
5. Uses of Knowledge Graphs

Some Perspectives on “What”
“The Knowledge Graph is a knowledge base used by Google to enhance
its search engine's search results with semantic-search information
gathered from a wide variety of sources.”
“A Knowledge graph (i) mainly describes real world entities and
interrelations, organized in a graph (ii) defines possible classes and
relations of entities in a schema (iii) allows potentially interrelating arbitrary
entities with each other…” [Paulheim H.]
“We define a Knowledge Graph as an RDF graph. An RDF graph consists of a set of RDF
triples where each RDF triple (s,p,o) is an ordered set of the following RDF
terms ….” [Pujara J. et al.]
KDD 2014 Tutorial on Constructing and Mining Web-scale Knowledge Graphs, New York, August 24, 2014

• Open world representation of information.
• Every entry point is equal cost.
• Underpin Cortana, Google Assistant, Siri, Alexa.
• Typically (but doesn’t have to be) expressed in RDF.
• No longer a solution in search of a problem!
Dan Bennett, Thomson Reuters
Some Perspectives on “What”

Defining KG by Example

• “Knowledge is Power” Hypothesis (the Knowledge
Principle): “If a program is to perform a complex task
well, it must know a great deal about the world in
which it operates.”
• The Breadth Hypothesis: “To behave intelligently in
unexpected situations, an agent must be capable of
falling back on increasingly general knowledge.”
Some Perspectives on “Why”

• We’re surrounded by entities, which are connected by
relations.
• We need to store them somehow, e.g., using a DB or a
graph.
• Graphs can be processed efficiently and offer a
convenient abstraction.

• Knowledge models such as Linked Data and many
problems in machine learning have a natural
representation as relational data.
• Relations between entities are often more important for a
prediction task than attributes.
• For instance, it can be easier to predict the party of a
vice-president from the party of his president than from his
attributes.
[Koopman, 2010]

Schema on Write
• Fixed data model
• Slow to change
• Strong enforcement
Schema on Read
• Capture everything
• Apply logic (schema) on read
• No standards
The Data Management Perspective
Dan Bennett, Thomson Reuters

From Closed to Open
Communication

Intuitions on the Connection
Between KGs and AI

Coping with the long tail of data
variety
[Figure: long-tail curve of frequency of use against # of entities and attributes, spanning relational, NoSQL, schema-less and unstructured data. Annotations: consumption (querying, software), formalization, semiotic breakdown, exponential growth of vocabulary size.]

Regularities in Natural
Language

Breaking away from the linear
order imposed by the medium

What did we get?
• By integrating terms (mapping to a canonical form) we
reduced complexity.
– ‘Atomization’ of knowledge allows integration.
• Entities, attributes and relationships are now explicit
and interconnected.
– Predicate-argument structures.
• Ability to focus (“query”) and operate on specific sets
and relations.
• Entities are organized into hierarchies of sets.
– With that we can express knowledge at different abstraction
levels (make generalizations).

Requirements for Open IE
• Banko et al. (2007) identified three major
challenges for Open IE systems:
– Automation. Open IE systems must rely on
unsupervised extraction strategies.
– Transportability. Open IE systems are intended
for domain-independent usage.
– Efficiency. In order to readily scale to large
amounts of text, Open IE systems must be
computationally efficient.

Four Types of Systems
• Learning-based Systems
• Rule-based Systems
• Clause-based Systems
• Systems Capturing Inter-Proposition
Relationships

Learning-based and Rule-
based Systems

Clause-based Systems
• Incorporating a sentence re-structuring stage.
• Transform complex sentences, where relations are
spread over several clauses into a set of syntactically
simplified independent clauses that are easy to
segment into Open IE tuples.

Systems Capturing Inter-
Proposition Relationships
• Previous approaches usually ignore the context under
which a proposition is complete and correct.
• E.g., they do not distinguish between information asserted
in a sentence and information that is only hypothetical or
conditionally true.
<the earth; is the center of ; the universe>
“Early scientists believed that the earth is the center of the
universe.”
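The example above can be sketched in code: a minimal, hypothetical data structure that keeps the context next to the tuple, so that the attributed proposition is not mistaken for an asserted one. The class name, the field names and the rule in `is_asserted` are assumptions made for illustration and are not the representation of any particular system.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Proposition:
    """A binary Open IE tuple plus the contexts it was stated under."""
    subject: str
    predicate: str
    obj: str
    # Each context is a (relation, text) pair, e.g. ("attribution", "...").
    contexts: List[Tuple[str, str]] = field(default_factory=list)

    def is_asserted(self) -> bool:
        # Hypothetical rule: a proposition carrying an attribution context is
        # reported by someone, not asserted by the sentence itself.
        return all(relation != "attribution" for relation, _ in self.contexts)


# The bare tuple loses the fact that this was only a belief.
bare = Proposition("the earth", "is the center of", "the universe")

# Keeping the context makes the difference explicit.
contextual = Proposition(
    "the earth", "is the center of", "the universe",
    contexts=[("attribution", "Early scientists believed")],
)
```

Here `bare.is_asserted()` is true while `contextual.is_asserted()` is false, which is the distinction the slide asks for.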

Case Study: Graphene
• Captures contextual relations.
• Extends the default Open IE representation in
order to capture inter-proposition relationships.
• Increases the informativeness and
expressiveness of the extracted tuples.
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)

Extracting Rhetorical
Relations

Clausal & Phrasal
Disembedding

Asian stocks fell anew and the yen rose to session highs
in the afternoon as worries about North Korea simmered,
after a senior Pyongyang official said the U.S. is
becoming ``more vicious and more aggressive'' under
President Donald Trump.
Simplified sentences:
• Asian stocks fell anew.
• The yen rose to session highs in the afternoon.
• Worries simmered about North Korea.
• A senior Pyongyang official said: The U.S. is becoming ``more vicious and more aggressive'' under Donald Trump.
Relation labels in the figure: and, background, after, attribution, spatial.

What to expect
(Wikipedia & Newswire)
[Figure: precision and recall results.]
Creating a Hierarchy of Semantically-Linked
Propositions in Open Information Extraction,
COLING (2018)

https://github.com/Lambda-3/Graphene
Niklaus et al., A Sentence Simplification System for Improving Relation Extraction, COLING (2017)
Software: Extracting Knowledge
Graphs from Text

Graphene
• Resolves co-references.
• Transforms complex sentences (for example, containing subordinations,
coordinations, appositive phrases, etc.) into simple independent
sentences (one clause per sentence).
• Identifies rhetorical relations between those sentences.
• Extracts binary relations (subject, predicate and object) from each
sentence.
• Merges all the extracted relations into a relation graph (knowledge
graph).

Installing Graphene
$ git clone https://github.com/Lambda-3/Graphene.git
$ cd Graphene
$ docker-compose up

Using it as a Server
curl -X POST -H "Content-Type: application/json" -H "Accept: text/plain" \
  -d '{"text": "My text.", "doCoreference": "true", "isolateSentences": "false", "format": "DEFAULT"}' \
  http://localhost:8080/discourseSimplification/text
CLI Tutorial:
https://asciinema.org/a/bvhgIP8ZEgDwtmRPFctHyxALu?speed=3
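The same call can be made from Python. This is a minimal sketch using only the standard library. The endpoint, the headers and the payload fields are copied from the curl command on this slide, while the function names are hypothetical. Building the request is kept separate from sending it, so the payload can be inspected without a running server.

```python
import json
import urllib.request

# Endpoint and payload fields are copied from the curl call on the slide.
GRAPHENE_URL = "http://localhost:8080/discourseSimplification/text"


def build_request(text: str, do_coreference: bool = True) -> urllib.request.Request:
    """Build (but do not send) the POST request for the simplification endpoint."""
    payload = {
        "text": text,
        # The slide passes these flags as strings, so the sketch does the same.
        "doCoreference": "true" if do_coreference else "false",
        "isolateSentences": "false",
        "format": "DEFAULT",
    }
    return urllib.request.Request(
        GRAPHENE_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "text/plain"},
        method="POST",
    )


def simplify(text: str) -> str:
    """Send the request; this needs a Graphene server listening on localhost:8080."""
    with urllib.request.urlopen(build_request(text)) as response:
        return response.read().decode("utf-8")


request = build_request("My text.")
```

Calling `simplify("My text.")` only works once the Docker container from the installation step is up.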

Java API
Co-reference:
Discourse Simplification:
Relation Extraction:

Evaluation
• Assertedness. Extracted propositions should be
asserted by the original sentence, rather than inferred
from it.
• Minimal Propositions. In order to serve semantic
tasks, it is beneficial for Open IE systems to extract
compact, self-contained propositions.
• Completeness and Open Lexicon. Open IE systems should
aim to extract all relations that are asserted in the input
text (Banko et al., 2007).

Extrinsic Evaluation
• Potential: Stanovsky and Dagan (2016) converted the
annotations of a QA-SRL dataset to an Open IE corpus,
resulting in more than 10,000 extractions over 3,200
sentences from Wikipedia and the Wall Street Journal.

MCTest comprehension data
(Richardson et al.)
James the Turtle was always getting in trouble. Sometimes he'd reach into
the freezer and empty out all the food. Other times he'd sled on the deck
and get a splinter. His aunt Jane tried as hard as she could to keep him out
of trouble, but he was sneaky and got into lots of trouble behind her back.
One day, James thought he would go into town and see what kind of trouble
he could get into. He went to the grocery store and pulled all the pudding off
the shelves and ate two jars. Then he walked to the fast food restaurant and
ordered 15 bags of fries. He didn't pay, and instead headed home.
His aunt was waiting for him in his room. She told James that she loved
him, but he would have to start acting like a well-behaved turtle.
After about a month, and after getting into lots of trouble, James finally
made up his mind to be a better turtle.
Q: What did James pull off of the shelves in the grocery store?
A) pudding B) fries C) food D) splinters
…
Slide credit: Jason Weston

Qualitative Analysis
• Wrong boundaries, where the relational or argument phrase is either too long
or too short.
• Redundant extraction, where the proposition asserted in an extraction is
already expressed in another extraction.
• Uninformative extraction, where critical information is omitted.
• Missing extraction, i.e. a false negative, where a relation is not
detected by the system.
• Wrong extraction, where no meaningful interpretation of the proposition is
possible.
• Out of scope extraction, where a system yields a correct extraction that was
not recognized by the authors of the gold dataset.
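Several of these error classes show up directly in precision and recall. The sketch below scores extractions against a gold set by exact tuple match. That matching rule is an assumption made here for simplicity (published benchmarks typically use softer matching), and the gold tuples are hypothetical ones built from the MCTest story above.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def precision_recall_f1(extracted: Set[Triple], gold: Set[Triple]):
    """Exact-match precision, recall and F1 of extracted tuples against gold."""
    correct = len(extracted & gold)
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1


# Hypothetical gold annotations for two sentences of the story.
gold = {
    ("James", "pulled", "all the pudding"),
    ("James", "ordered", "15 bags of fries"),
}

# One correct extraction and one with a wrong boundary (argument too short).
extracted = {
    ("James", "pulled", "all the pudding"),
    ("James", "ordered", "15 bags"),
}

precision, recall, f1 = precision_recall_f1(extracted, gold)
```

Under exact matching, the wrong-boundary tuple counts both as a false positive and as a missing extraction. An out-of-scope extraction would lower precision even though it is correct.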

Open Research Questions
• Comparing results among different Open IE systems in a
large-scale, objective and reproducible fashion.
• Applicability and transferability to languages other than
English.
– The role of Universal Dependencies in this process.
• Canonicalizing relational phrases and arguments.
Normalizing extractions would be highly beneficial for
downstream semantic tasks, such as textual entailment or
knowledge base population.
• Incorporating coreference resolution into Open IE evaluations.
A Survey on Open Information Extraction, COLING (2018)

RDF-NL
https://github.com/Lambda-3/Graphene/blob/master/wiki/RDFNL-Format.md
“Although the Treasury will announce details of the November refunding on Monday , the
funding will be delayed if Congress and President Bush fail to increase the Treasury 's
borrowing capacity.”

SRDF
SRDF: Korean Open Information Extraction using
Singleton Property, Proc. International Semantic Web
Conference, ISWC (2015).
SRDF: A Novel Lexical Knowledge Graph for Whole
Sentence Knowledge Extraction, LDK (2017).

https://github.com/sanghanam/SRDF

Frame-based Extraction
(Ontology Grounded)
Background theories:
• Combinatory Categorial Grammar [C&C],
• Discourse Representation Theory [DRT, Boxer],
• Frame Semantics [Fillmore 1976]
• Ontology Design Patterns [Ontology Handbook].
Frameworks:
• Named Entity Resolution [Stanbol, TagMe],
• Coreference Resolution [CoreNLP]
• Word Sense Disambiguation [Boxer, IMS].
Gangemi et al., Semantic Web Machine Reading with FRED, Semantic Web Journal, 2017

Frame-based Extraction
(Ontology Grounded)
• N-ary relation and event extraction by using frame detection
• Relation extraction between frames, events, concepts and entities
• Negation representation
• Modality representation
• Adjective semantics
• Temporal relation extraction from tense expressions
• Semantic annotation of text fragments
• Coreference resolution
• Type and taxonomy induction
• Incremental role propagation
• Entity linking to Semantic Web data
• Word-sense disambiguation
• Pattern-based subgraph extraction
• Named-graph generation

The New York Times reported that John McCarthy died. He invented the programming language LISP.

http://wit.istc.cnr.it/stlab-tools/fred/
Software: FRED

Argumentation Structures
Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.

Argumentative Discourse
Unit Classification

Argumentation &
Rhetorical Relations
• Support: background, cause, evidence, justify,
list, motivation, reason, restatement, result.
• Rebuttal: antithesis, contrast, unless.
• Undercut: concession.

Argumentation Schemes
“Argumentation Schemes are forms of argument
(structures of inference) that represent structures of
common types of arguments used in everyday
discourse, as well as in special contexts like those
of legal argumentation and scientific
argumentation.”
Douglas Walton

Argument Mining Approaches
What to expect?
F1-score: 0.74
Stab & Gurevych, Parsing Argumentation
Structures in Persuasive Essays, 2016.

https://www.ukp.tu-darmstadt.de/data/argumentation-mining/argument-annotated-essays/
Stab & Gurevych, Parsing Argumentation Structures in Persuasive Essays, 2016.
Software: Argumentation Mining

EventKG: A Multilingual Event-Centric
Temporal Knowledge Graph
Simon Gottschalk, Elena Demidova. ESWC 2018
EventKG is an open knowledge graph containing
event-centric information: http://eventkg.l3s.uni-hannover.de/
- EventKG V1.1: 690K events and 2.3M temporal relations
- 2014 FIFA World Cup
- “The Space Shuttle Challenger is launched on its maiden voyage”
- <Jennifer Aniston, married to, Brad Pitt, [2000-07-29,2005-10-02]>
- Extracted from Wikidata, YAGO, DBpedia and Wikipedia
- Integrated data in five languages: EN, FR, DE, RU, PT
- Provides provenance information
- High coverage of event times and locations due to integration:
                       EventKG   Wikidata   DBpedia (en)
Events with Time       50.82%    33.00%     7.00%
Events with Location   26.13%    11.70%     6.21%
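A temporal relation such as the marriage example above is a triple plus a validity interval. The following is a minimal sketch of that idea. The class and its `holds_on` method are hypothetical and do not reflect the EventKG schema, and treating both interval bounds as inclusive is an assumption.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TemporalRelation:
    """A <subject, predicate, object> relation valid over [start, end]."""
    subject: str
    predicate: str
    obj: str
    start: date
    end: date

    def holds_on(self, day: date) -> bool:
        # Assumption: both interval bounds are inclusive.
        return self.start <= day <= self.end


# The example from the slide, with its validity interval.
marriage = TemporalRelation(
    "Jennifer Aniston", "married to", "Brad Pitt",
    date(2000, 7, 29), date(2005, 10, 2),
)

valid_in_2003 = marriage.holds_on(date(2003, 1, 1))
valid_in_2010 = marriage.holds_on(date(2010, 1, 1))
```

Keeping the interval on the relation itself is what lets a query ask "who was married to whom at a given time" rather than only "who was ever married to whom".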

Which science-related events took place in Lyon?
- 1921: “À Lyon, fusion de la Société de médecine et de la Société des
sciences médicales” (“In Lyon, merger of the Société de médecine and the
Société des sciences médicales”)
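PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX owl: <http://www.w3.org/2002/07/owl#>
PREFIX dcterms: <http://purl.org/dc/terms/>
PREFIX dbr: <http://dbpedia.org/resource/>
PREFIX sem: <http://semanticweb.cs.vu.nl/2009/11/sem/>
SELECT DISTINCT ?description WHERE {
  ?event rdf:type sem:Event .
  # The relation node links the event (subject) to the place (object).
  ?relation rdf:object ?lyon .
  ?relation rdf:subject ?event .
  ?event dcterms:description ?description .
  # Case-insensitive match on the event description.
  FILTER regex(?description, "science", "i") .
  ?lyon owl:sameAs dbr:Lyon .
}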

• EventKG builds upon and extends the Simple Event Model
(SEM).
• Example: The participation of Barack Obama in his second
inauguration as US president in 2013 in EventKG.

EventKG+TL: Creating Cross-Lingual Timelines
from an Event-Centric Knowledge Graph
- An overview of events related to a query entity over a time
period across languages using EventKG.
- Example: Brexit-related events. The size of each pie chart shows the overall
(i.e. language-independent) event relevance. The colored slices show the ratio
of the relevance in each language context.

Semantic Roles for Lexical
Definitions
Aristotle’s classic theory of definition introduced important aspects
such as the genus-differentia definition pattern and the
essential/non-essential property differentiation.

Data: WordNetGraph
Silva et al., Categorization of Semantic Roles for Dictionary Definitions.
Cognitive Aspects of the Lexicon CogALex@COLING, 2017.
https://github.com/Lambda-3/WordnetGraph
RDF graph generated from WordNet.

Emerging perspectives
• The evolution of parsing and classification methods in
NLP is inducing a new lightweight semantic
representation.
• This representation dialogues with elements from logics,
linguistics and the Semantic/Linked Data Web (especially
RDF).
• However, it relaxes the semantic constraints of previous
models (which were operating under assumptions for
deductive reasoning or databases).

• Knowledge graphs as lexical semantic models
operating under a semantic best-effort mode (canonical
identifiers when possible, otherwise, words).
• Possibly closer to the surface form of the text.
• Priority is on segmenting, categorizing and when
possible, integrating.
• A representation (data model) convenient for AI
engineering.

Categorization
A fact (main clause):
          s                         p                                 o
Form      term, URI                 term, URI                         term, URI
Category  instance, class, triple   type, property, schema property   instance, class, triple
* Can be a taxonomic fact.

Categorization
A fact with a context:
[Figure: the fact <s0, p0, o0> is reified and linked through p1 to a context o1.]
e.g.
• subordination
(modality, temporality,
spatiality, RSTs)
• fact probability
• polarity

Categorization
Coordinated facts:
[Figure: the facts <s0, p0, o0> and <s1, p1, o1> are linked to each other through p2.]
e.g.
• coordination
• RSTs
• ADU

Knowledge Graphs &
Distributional Semantics
(A marriage made in heaven?)

• Computational models that build contextual
semantic representations from corpus data.
• Semantic context is represented by a vector.
• Vectors are obtained through the statistical
analysis of the linguistic contexts of a word.
• Salience of contexts (cf. context weighting
scheme).
• Semantic similarity/relatedness as the core
operation over the model.
Distributional Semantic Models
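The bullets above can be made concrete with a toy model: count which words co-occur within a small window, treat the counts as a vector, and compare vectors with cosine similarity. This is a minimal sketch. The four-sentence corpus is made up, and raw counts stand in for a proper context weighting scheme.

```python
import math
from collections import Counter, defaultdict

# A made-up corpus, far too small for real use.
corpus = [
    "the dog barks on the leash",
    "the dog runs on the leash",
    "the cat runs",
    "the car runs",
]


def build_vectors(sentences, window=2):
    """Map each word to a Counter of the words seen within `window` tokens."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm = (math.sqrt(sum(x * x for x in u.values()))
            * math.sqrt(sum(x * x for x in v.values())))
    return dot / norm if norm else 0.0


vectors = build_vectors(corpus)
cat_car = cosine(vectors["cat"], vectors["car"])
cat_leash = cosine(vectors["cat"], vectors["leash"])
```

In this corpus "cat" and "car" share exactly the same contexts, so they come out as more similar than "cat" and "leash". That also illustrates the noise of such models: relatedness reflects shared contexts, not shared meaning.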

Distributional Semantic Models
• Semantic Model with low acquisition effort
(automatically built from text)
Simplification of the representation
• Enables the construction of comprehensive
commonsense/semantic KBs
• What is the cost?
Some level of noise
(semantic best-effort)
Limited semantic model

Distributional Semantics as
Commonsense Knowledge
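[Figure: a vector space containing the words car, dog, cat, bark, run and leash, with the angle θ between two word vectors. Annotations: “Commonsense is here”, “Semantic approximation is here”, “Semantic model with low acquisition effort”.]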

Context Weighting Measures
Kiela & Clark, 2014
Similarity Measures
… and of course, GloVe and W2V

Distributional-Relational
Networks
Distributional Relational Networks, AAAI Symposium (2013).
A Compositional-Distributional Semantic Model for Searching Complex Entity
Categories, ACL *SEM (2016)
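[Figure: the graph “Barack Obama nominated Sonia Sotomayor” and “Sonia Sotomayor :is_a First Supreme Court Justice of Hispanic descent”, with each element of a triple <s0, p0, o0> embedded in a distributional space (LSA, ESA, W2V, GloVe, …).]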

The vector space is segmented
Dimensional reduction
mechanism!
A Distributional Structured Semantic Space for Querying RDF Graph
Data, IJSC 2012

Compositionality of Complex
Nominals
[Figure: the graph “Barack Obama nominated Sonia Sotomayor”, with the complex nominal “Hispanic descent” attached through :is_a.]

Building on Word Vector Space
Models
• But how can we represent the meaning of longer phrases?
• By mapping them into the same vector space!
the country of my birth
the place where I was born
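One simple way to place the two phrases above in the same space is additive composition: sum the vectors of the content words and compare the results. The sketch below does exactly that. The three-dimensional word vectors and the stopword list are hypothetical values chosen for illustration, and addition is only the simplest baseline, not the syntax-driven composition discussed on the following slides.

```python
import math

# Hypothetical 3-dimensional word vectors, chosen only for illustration.
word_vectors = {
    "country": [0.9, 0.1, 0.0],
    "place":   [0.8, 0.2, 0.1],
    "birth":   [0.1, 0.9, 0.0],
    "born":    [0.2, 0.8, 0.1],
    "plants":  [0.0, 0.1, 0.9],
}
STOPWORDS = {"the", "of", "my", "where", "i", "was"}


def compose(phrase):
    """Additive composition: sum the vectors of the content words."""
    total = [0.0, 0.0, 0.0]
    for token in phrase.lower().split():
        if token in STOPWORDS:
            continue
        total = [a + b for a, b in zip(total, word_vectors[token])]
    return total


def cosine(u, v):
    """Cosine similarity between two dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


a = compose("the country of my birth")
b = compose("the place where I was born")
c = compose("plants")
```

With these toy values the two paraphrases end up close together and far from the unrelated word. Addition ignores word order, which is the limitation the function-based models address.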

Compositionality
• The meaning of a complex expression is a
function of the meaning of its constituent parts.
carnivorous plants digest
slowly

Compositionality Principles
Words in which the meaning
is directly determined by
their distributional behaviour
(e.g., nouns).
Words that act as functions
transforming the
distributional profile of other
words (e.g., verbs,
adjectives, …).

Compositionality Principles
• Take the syntactic structure to constitute the backbone
guiding the assembly of the semantic representations of
phrases.
• A correspondence between syntactic categories and
distributional objects.

Inducing distributional functions
from corpus data
- Distributional functions are
induced from input to output
transformation examples
- Regression techniques
commonly used in machine
learning.

How should we map phrases
into a vector space?
Recursive Neural Networks

Compositional-distributional
model for paraphrases
A Compositional-Distributional Semantic Model for Searching Complex
Entity Categories, *SEM (2016)

Software: Indra
• Semantic approximation server
• Multi-lingual (12 languages)
• Multi-domain
• Different compositional models
https://github.com/Lambda-3/indra
Semantic Relatedness for All (Languages): A Comparative Analysis of Multilingual
Semantic Relatedness using Machine Translation, EKAW, (2016).

Data: PPDB
Ganitkevitch et al., PPDB: The Paraphrase Database, LREC, 2013
http://paraphrase.org/#/download
• 16 languages

Recursive vs recurrent neural
networks

Segmented Spaces vs
Unified Space
s0 p0 o0
s0 p0 o0
• Assumes <s,p,o> is naturally
irreconcilable.
• Inherent dimensional reduction
mechanism.
• Facilitates the specialization of
embedding-based approximations.
• Easier to compute identity.
• Requires a complex and
high-dimensional tensorial model.

How to access Distributional-
Knowledge Graphs efficiently?
• Depends on the target operations in the
Knowledge Graphs (more on this later).

[Figure: accessing a distributional knowledge graph <s0, p0, o0> with a query q.]
Structured Queries (Database + IR): query planning, cardinality, indexing, skyline, bitmap indexes, inverted index, sharding, disk access optimization, …
Approximation Queries: Multiple Randomized K-d Tree algorithm, Priority Search K-Means Tree algorithm.

Software: StarGraph
• Distributional Knowledge Graph Database.
• Word embedding Database.
https://github.com/Lambda-3/Stargraph
Freitas et al., Natural Language Queries over Heterogeneous Linked Data Graphs: A
Distributional-Compositional Semantics Approach, 2014.

• Graph-based data models + Distributional Semantic Models
(Word embeddings) have complementary semantic values.
• Graph-based Data Models:
– Facilitates querying, integration and rule-based reasoning.
• Distributional Semantic Models:
– Supports semantic approximation, coping with vocabulary variation.

• AI systems require access to comprehensive background
knowledge for semantic interpretation tasks.
• Inheriting from Information Retrieval and Databases:
– General Indexing schemes,
– Particular Indexing schemes,
• Spatial, temporal, topological, probabilistic, causal, …
– Query planning,
– Data compression,
– Distribution,
– … even supporting hardware strategies.

• One size of embedding does not fit all: Operate with
multiple distributional + compositional models for different
data model types (I, C, P), different domains and different
languages.
• Inheriting from Information Retrieval and Databases:
– Indexing schemes,
– Query planning,
– Data compression,
– Query distribution,
– even supporting hardware.

Best-effort
Semantic Integration

Semantic Integration
• Task: Mapping near-synonymic term references to a
canonical identifier.
• Goal: Reduce the entropy (complexity) of the underlying
KG.
• Operations:
– Co-reference Resolution
– (Named) Entity Linking
– Predicate Reconciliation
• Common aspects:
– Highly dependent on the context of the mention.
– Highly dependent on target entity background knowledge.
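The integration task can be sketched as a best-effort lookup: replace a mention by its canonical identifier when one is known, and keep the words otherwise. The alias table below is hand-written and hypothetical. A real pipeline would obtain these mappings from co-reference resolution and entity linking, which depend on the context of each mention.

```python
# Hypothetical alias table mapping lower-cased mentions to canonical identifiers.
CANONICAL = {
    "barack obama": "dbr:Barack_Obama",
    "obama": "dbr:Barack_Obama",
    "sonia sotomayor": "dbr:Sonia_Sotomayor",
    "sotomayor": "dbr:Sonia_Sotomayor",
}


def canonicalize(term):
    """Return a canonical identifier when one is known, otherwise the term itself."""
    return CANONICAL.get(term.strip().lower(), term)


def integrate(triples):
    """Canonicalize subjects and objects; duplicate triples collapse in the set."""
    return {(canonicalize(s), p, canonicalize(o)) for s, p, o in triples}


raw = [
    ("Barack Obama", "nominated", "Sonia Sotomayor"),
    ("Obama", "nominated", "Sotomayor"),
    ("Sonia Sotomayor", "is a", "judge"),
]
graph = integrate(raw)
```

Three extracted triples become two after integration, and "judge" stays a plain word because no identifier is known for it. Predicates are left untouched here, since predicate reconciliation is a separate operation.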

Software: Cobalt
• KG-based co-reference resolution.
https://github.com/SeManchester/cobalt (to appear)

Software: AGDISTIS
• Agnostic Disambiguation of Named Entities Using Linked Open Data
http://aksw.org/Projects/AGDISTIS.html
Usbeck et al. AGDISTIS - Agnostic Disambiguation of Named Entities Using
Linked Open Data, ECAI, 2015

Software: StarGraph
• Predicate Reconciliation (Distributional-semantics based).

Effective Semantic Parsing
for Large KBs

The Vocabulary Problem
[Figure: the graph “Barack Obama nominated Sonia Sotomayor”, with “Hispanic descent” attached through :is_a.]

The Vocabulary Problem
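[Figure: the same graph (“Barack Obama nominated Sonia Sotomayor”, “Hispanic descent” attached through :is_a), overlaid with alternative wordings a user might employ: “Obama”, “Last US president”, “selected”, “High Judge”, “Latino origins”.]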

“On our best behaviour”
“It is not enough to build knowledge bases without paying
closer attention to the demands arising from their use.”
Levesque, 2013
“We should explore more thoroughly the
space of computations between fact
retrieval and full automated logical
reasoning.”

Vocabulary Problem for
Databases
Schema-agnostic
query mechanisms

Minimizing the Semantic Entropy
for the Semantic Matching
Definition of a semantic pivot: first query term to be resolved in the
database.
• Maximizes the reduction of the semantic configuration space.
• Less prone to more complex synonymic expressions and
abstraction-level differences.
• Semantic pivot serves as interpretation context for the remaining
alignments.
• proper nouns >> nouns >> complex nominals >> adjectives,
verbs.
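The priority order above can be turned into a small selection routine. This is a sketch only: the category labels and the function name are hypothetical, and a real system would obtain the categories from a parser and could weigh further features.

```python
# Ranks follow the order on the slide; adjectives and verbs share the last rank.
PIVOT_RANK = {
    "proper_noun": 0,
    "noun": 1,
    "complex_nominal": 2,
    "adjective": 3,
    "verb": 3,
}


def choose_pivot(terms):
    """Return the query term to resolve first, given (text, category) pairs."""
    # min() keeps the first term among those with the best rank.
    return min(terms, key=lambda term: PIVOT_RANK[term[1]])


query_terms = [
    ("nominated", "verb"),
    ("Barack Obama", "proper_noun"),
    ("Supreme Court Justice", "complex_nominal"),
]
pivot = choose_pivot(query_terms)
```

For this query the pivot is "Barack Obama". Once it is resolved in the database, only the predicates and values attached to that entity remain as candidates for the other terms.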

Distributional Semantic
Relatedness

Semantic Relatedness Measure
as a Ranking Function
A Distributional Approach for Terminological Semantic Search on the
Linked Data Web, ACM SAC, 2012

[Figure: question answering architecture. Components: reference commonsense corpora, distributional inverted index, distributional-relational model, core semantic approximation & composition operations, semantic parser (scalable semantic parsing), query plan, learn to rank. Input: question. Output: answers.]

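[Figure: a query matched against the I, P and C spaces.]
$q = t^{\Gamma}_{0}, \ldots, t^{\Gamma}_{n}$, with $\Gamma = \{I, P, C, V\}$
Term constituents: $t^{h}_{0}$, $t^{m_1}_{0}$, $t^{m_2}_{0}$
Term features: lexical specificity, # of senses, lexical category, …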

- Vector neighborhood density
- Semantic differential
[Figure: the same query representation over the I, P and C spaces, annotated with $\rho$.]

Semantic pivoting
[Figure: the same query representation over the I, P and C spaces, annotated with $\Delta_{sr}$ and $\Delta_{r}$.]

- Distributional compositionality
[Figure: the same query representation, with the term constituents $t^{h}_{0}$, $t^{m_1}_{0}$ and $t^{m_2}_{0}$ combined by a composition operator ($\circ$).]

Interpreting Law at Scale
[Figure: Open Information Extraction & Text Classification, Knowledge Graphs, Question Answering.]
Example queries:
• Show me cases which refer to article Regulation (EU) No 575/2013.
• Show me cases with similar supporting arguments to Case 2008/900922.

What to expect (@ QALD1)
F1-Score: 0.72
MRR: 0.5
Freitas & Curry, Natural Language Queries over Heterogeneous Linked Data Graphs,
IUI (2014).

Addressing the Vocabulary
Problem
• Hierarchy of approximation spaces
– Probabilistic justification.
– Semantic pivoting.
How hard is the Query? Measuring the Semantic Complexity of Schema-Agnostic Queries, IWCS
(2015).
A Distributional Approach for Terminological Semantic Search on the Linked Data Web, ACM SAC
(2012).
Schema-agnostic queries over large-schema databases: a distributional semantics approach,
PhD Thesis (2015).
On the Semantic Mapping of Schema-agnostic Queries: A Preliminary Study, NLIWoD
(2015).

How to cope with the vocabulary
problem?
• Embed resources on distributional spaces.
• Use heuristics to minimize approximation errors.
• Interaction Pattern (Semantic best-effort):
Search >> Disambiguate >> Learn
Distributional semantics
= domain and language transportability.
= high search recall and relevance ranking.

Software: StarGraph
• Semantic parsing.

The Problem
Liu, Representation Learning for Large-Scale Knowledge Graphs, 2015

Formulating the Distributional-
Relational Representation

Complex Relations

TransD

KG2E

Complex Relations:
RL4KG

RL4KG with Entity Descriptions

Relation Paths
• Complex Inference patterns for composition.

Representation of Relation Paths

Path-based TransE
Composition operations: addition, multiplication, RNN
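The additive variant can be sketched in a few lines. TransE scores a triple by the distance between head + relation and tail, and a relation path is composed by summing its relation vectors. The two-dimensional embeddings below are hypothetical values picked so that the arithmetic is easy to follow. Trained embeddings would be learned from the graph.

```python
import math


def add(u, v):
    """Element-wise vector addition."""
    return [a + b for a, b in zip(u, v)]


def transe_score(head, relation, tail):
    """TransE energy ||h + r - t||_2; lower means more plausible."""
    return math.sqrt(sum((h + r - t) ** 2
                         for h, r, t in zip(head, relation, tail)))


def compose_path_by_addition(relations):
    """Additive path composition: p = r1 + r2 + ... + rn."""
    path = [0.0] * len(relations[0])
    for relation in relations:
        path = add(path, relation)
    return path


# Hypothetical 2-dimensional embeddings for illustration only.
person, city, country = [0.0, 0.0], [1.0, 0.0], [1.0, 1.0]
born_in_city, city_in_country = [1.0, 0.0], [0.0, 1.0]

# Compose the path born_in_city -> city_in_country into one vector.
path = compose_path_by_addition([born_in_city, city_in_country])
score_country = transe_score(person, path, country)
score_city = transe_score(person, path, city)
```

The composed path carries the person to the country (energy 0.0) but not to the city (energy 1.0), which is the sense in which a path can stand in for a direct relation.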

What to expect (PTransE@FB15K)
Relation Prediction

Software: KB2E
Relation Extraction
Knowledge Graph Embeddings including TransE, TransH,
TransR and PTransE.
https://github.com/thunlp/KB2E
Yankai Lin, Zhiyuan Liu, Maosong Sun, Yang Liu, Xuan Zhu. Learning Entity and Relation
Embeddings for Knowledge Graph Completion. The 29th AAAI Conference on Artificial
Intelligence (AAAI'15).

Recognizing and Justifying
Text Entailments (TE)
using Definition KGs

Distributional semantic relatedness as a
Selectivity Heuristic
[Figure: navigation over the graph from a source node to a target node, guided by distributional heuristics, yielding the answer.]

Distributional Navigation
Algorithm

Explainable AI
“The right to explanation”
“The data subject should have
the right not to be subject to a
decision, which may include a
measure, evaluating personal
aspects relating to him or her
which is based solely on
automated processing …”
“… such processing should be
subject to suitable safeguards,
… to obtain an explanation of
the decision …”

What to expect (TE@Boeing-Princeton-ISI)
F1-Score: 0.59
What to expect (TE@Guardian Headline Samples)
F1-Score: 0.53
Recognizing and Justifying Text Entailment through Distributional
Navigation on Definition Graphs, AAAI, 2018.

• Distributional-relational models in KB completion
explored a large range of representation paradigms.
– Opportunity for exporting these representation models to other
tasks.
• Definition-based models can provide a corpus-viable,
low-data and explainable alternative to embedding-
based models.

[Figure: architecture overview.]
• KG Construction: Named Entity Recognition, Entity Linking, Co-reference Resolution, Open IE, Taxonomy Extraction, Definition Extraction, Arg. Classif., Integration.
• Inference: KG Completion, Natural Language Inference.
• Query: Semantic Parsing, Query By Example, NL Query, NL Generation, Answers, Explanations.
• Distributional Semantics Server.
• Indexes: spatial, temporal, probabilistic, causal.

[Figure: the same architecture overview, with MT components added.]
https://goo.gl/qEG5Rk

What they told
Asked financial analysts “What ruins your day?”
Customer-ranked pain points:
1. Information overload
2. Understanding relationships
3. Unable to track impact events
4. Cannot link internal & external research, especially text
[Figure: pie chart of the analyst workday with the slices Assemble, Synthesize, Interpret, Meetings, Financial Modeling and Communicate; slice sizes shown: 20%, 15%, 20%, 10%, 15%, 20%.]
“An analyst used to cover 40 firms, now it is 150 and the tools haven’t changed.”
Director, $4B US L/S equity fund
“They all still do it manually”
Equities, Market data team, $20B+ hedge fund
“The problem has gotten worse with more data and more information”
Director, Research/Tech, $10B Multi-strat
Source: Customer meetings, TR internal analyst survey
Geoff Horrell & Dan Bennett

What’s in the Graph?
• Organizations – including names, address, identifiers, Country of
HQ/Incorp
• Industry Classification
• Hierarchy – Parent, Ultimate Parent, Affiliates
• Officers & Directors
• Job History, Education
• Suppliers & Customers
• Comparable Companies
• Joint Ventures & Strategic Alliances
• Meta-Data

• 125,000,000 (equity instruments & quotes).
• 75,000,000 meta-data objects (countries, regions, cities,
currencies, commodities, holidays, industry schemes,
scripts, languages, time zones, value domains, units,
identifiers).
• 200,000,000 strategic relationships (supplier, customer,
competitor, joint venture, alliance, industry, ownership,
affiliate).

All Connections between
Onshore Oil Drilling & Venezuela

One Week Snapshot
• 6,778 news articles with company news where at least one
organization has 80% relevance to the article.
• 135,267 companies are 2 steps away.
• 217,387 strategic relationships.
• Typical analyst portfolio is 200 companies.
• Each customer creates their own relative weights for each
type of relationship.
• Requires around 800,000 shortest path calculations to
deliver the ranked news feed. Each calculation optimised to
take 10ms.
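Each of those calculations is a weighted shortest-path search. The sketch below uses Dijkstra's algorithm with the standard-library heap. The company names, the relationship weights and the choice of Dijkstra are all assumptions made for illustration, since the slide does not say which algorithm is used. At 10 ms each, 800,000 calculations amount to about 8,000 seconds if run one after another, so in practice they would have to be parallelised or cached.

```python
import heapq


def shortest_path_cost(graph, source, target):
    """Dijkstra over {node: [(neighbour, weight), ...]}; None if unreachable."""
    best = {source: 0.0}
    queue = [(0.0, source)]
    while queue:
        cost, node = heapq.heappop(queue)
        if node == target:
            return cost
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper route was already found
        for neighbour, weight in graph.get(node, []):
            new_cost = cost + weight
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))
    return None


# Hypothetical per-customer weights: a lower weight means a stronger relationship.
weights = {"supplier": 1.0, "competitor": 2.0, "joint venture": 1.5}

# Hypothetical companies: one portfolio holding and one company in the news.
graph = {
    "Portfolio Co": [("Driller A", weights["supplier"]),
                     ("Rival B", weights["competitor"])],
    "Driller A": [("News Co", weights["joint venture"])],
    "Rival B": [("News Co", weights["competitor"])],
}

cost = shortest_path_cost(graph, "Portfolio Co", "News Co")
```

Because every customer sets their own weights, the same news item can rank differently for two analysts holding the same portfolio.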

Explainable Findings
From Tensor Inferences Back to KGs

Take-away Message
• The evolution of methods, tools and the availability of data in
NLP creates the demand for a knowledge representation method
to support complex AI systems.
• A relaxed version of RDF (RDF-NL?) can provide this answer.
– Establishes a dialogue with a standard (with existing data).
– Inherits optimization aspects from Databases.
• Word-embeddings (DSMs) + compositional models + RDF.
• Moving beyond facts and taxonomies: rhetorical structures,
arguments, polarity, stories, pragmatics.

Take-away Message
• Syntactical and lexical features can go a long way for
structuring text.
– Context-preserving.
• Integration (entity reconciliation) as semantic best-effort.
– Embrace schema on read.
• KGs can support explainable AI:
– Meeting point between extraction, reasoning and querying.
– Definition-based models.
• Inherit infrastructures from DB and IR.

Take-away Message
Opportunities:
• ML orchestrated pipelines with:
– Richer discourse-representation models.
– Explicit semantic representations (centered on KGs).
– Different compositional/distributional models (beyond W2V & Glove)
• KGs and impact on explainability.
• Quantifying domain and language transportability.
