Tutorial
Meaning Representations for Natural Languages:
Design, Models and Applications
Jeffrey Flanigan, Tim O'Gorman, Ishan Jindal, Yunyao Li, Nianwen Xue, Martha Palmer
Meaning Representations for Natural Languages Tutorial Part 1
Introduction
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
What should be in a Meaning
Representation?
Motivation: From Sentences to Propositions
Who did what to whom, when, where and how?
Powell met Zhu Rongji
Proposition: meet(Powell, Zhu Rongji)
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had
a meeting
. . .
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane))
debate
consult
join
wrestle
battle
meet(Somebody1, Somebody2)
Capturing semantic roles
• Tim broke [the laser pointer].
• [The windows] were broken by the hurricane.
• [The vase] broke into pieces when it toppled over.
(Syntactically, Tim, the windows, and the vase are all SUBJ.)
Capturing semantic roles
• Tim broke [the laser pointer]. (Tim: Breaker)
• [The windows] were broken by the hurricane. (the windows: Thing broken)
• [The vase] broke into pieces when it toppled over. (the vase: Thing broken)
A proposition as a tree
Zhu and Powell discussed the return of the spy plane
discuss([Powell, Zhu], return(X, plane))
[Tree: discuss with children (Zhu and Powell) and (return, whose child is (of the spy plane))]
discuss.01 - talk about
Aliases: discussion (n.), discuss (v.), have_discussion (l.)
• Roles:
ARG0: discussant
ARG1: topic
ARG2: conversation partner, if explicit
Valency Lexicon
PropBank Frame File - 11,436 framesets
Kingsbury & Palmer, LREC 2002 – Pradhan et al., *SEM 2022
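To make the frameset above concrete, here is a minimal, purely illustrative Python sketch of how an annotation tool might hold the discuss.01 roleset in memory (the dictionary layout is our own, not the official frame-file XML schema):

# Illustrative in-memory view of the discuss.01 frameset (not the official schema).
DISCUSS_01 = {
    "roleset": "discuss.01",
    "definition": "talk about",
    "aliases": ["discussion (n.)", "discuss (v.)", "have_discussion (l.)"],
    "roles": {
        "ARG0": "discussant",
        "ARG1": "topic",
        "ARG2": "conversation partner, if explicit",
    },
}

def describe(frameset):
    """Render a frameset roughly the way an annotation tool would show it."""
    lines = [f"{frameset['roleset']} - {frameset['definition']}"]
    lines += [f"  {arg}: {desc}" for arg, desc in frameset["roles"].items()]
    return "\n".join(lines)

print(describe(DISCUSS_01))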
discuss.01
ARG0: Zhu and Powell
ARG1: return.01
Arg1: of the spy plane
Zhu and Powell discussed the return of the spy plane
discuss.01
ARG0: Zhu and Powell
ARG1: return.01
Arg1: of the spy plane
discuss([Powell, Zhu], return(X, plane))
Zhu and Powell discussed the return of the spy plane
A proposition as a tree
Zhu and Powell discussed the return of the spy plane
discuss([Powell, Zhu], return(X, plane))
[Tree with PropBank labels: discuss.01 with Arg0 (Zhu and Powell) and Arg1 (return.02, whose Arg1 is (of the spy plane) and whose Arg0 is ?? (Zhu), an implicit argument)]
Proposition Bank
• Hand annotated predicate argument structures for Penn Treebank
• Standoff XML, points directly to syntactic parse tree nodes, 1M words
• Doubly annotated and adjudicated
• (Kingsbury & Palmer, 2002, Palmer, Gildea, Xue, 2004, …).
• Based on PropBank Frame Files
• English valency lexicon: ~4K verb entries (2004) → ~11K v,n, adj, prep (2022)
• Core arguments – Arg0-Arg5
• ArgM’s for modifiers and adjuncts
• Mappings to VerbNet and FrameNet
• Annotated PropBank Corpora
• English 2M+, Chinese 1M+, Arabic .5M, Hindi/Urdu .6K, Korean, …
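To make the span-based predicate-argument annotation just described concrete, here is a minimal, purely illustrative Python sketch for the running example sentence; the token indices and dictionary layout are our own, not the standoff-XML release format:

# Span-based SRL view of "Zhu and Powell discussed the return of the spy plane".
tokens = ["Zhu", "and", "Powell", "discussed", "the", "return",
          "of", "the", "spy", "plane"]

propositions = [
    {"predicate": "discuss.01", "pred_index": 3,
     "args": {"ARG0": (0, 2),    # "Zhu and Powell"
              "ARG1": (4, 9)}},  # "the return of the spy plane"
    {"predicate": "return.02", "pred_index": 5,
     "args": {"ARG1": (6, 9)}},  # "of the spy plane"
]

for prop in propositions:
    print(prop["predicate"])
    for role, (start, end) in prop["args"].items():
        print(f"  {role}: {' '.join(tokens[start:end + 1])}")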
An Abstract Meaning Representation as a graph
Zhu and Powell discussed the return of the spy plane
discuss([Zhu, Powell], return(X, plane))
[Graph: discuss.01 with Arg0 (and: Zhu, Powell) and Arg1 (return.02, whose Arg1 is (plane, Arg0-of spy.01) and whose Arg0 is ?? (Zhu), an implicit argument)]
AMR drops:
Determiners
Function words
AMR adds:
NE tags
Wiki links
Noun Phrase Structure
Implicit Arguments
Coreference Links
• Stay tuned
AMRs – Tim O’Gorman
Motivation: From Sentences to Propositions
Who did what to whom, when, where and how?
Powell met Zhu Rongji
Proposition: meet(Powell, Zhu Rongji)
Powell met with Zhu Rongji
Powell and Zhu Rongji met
Powell and Zhu Rongji had
a meeting
. . .
When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane.
meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane))
debate
consult
join
wrestle
battle
meet(Somebody1, Somebody2)
ENGLISH!
Motivation: From Sentences to Propositions
Who did what to whom, when, where and how?
Powell reunió Zhu Rongji
Proposition: reunir(Powell, Zhu Rongji)
Powell reunió con Zhu Rongji
Powell y Zhu Rongji reunió
Powell y Zhu Rongji
tuvo una reunión
. . .
Powell se reunió con Zhu Rongji el jueves y hablaron sobre el regreso del avión espía.
reunir(Powell, Zhu) hablar[Powell, Zhu], regresar(X, avión))
зустрів (Ukrainian)
التقى (Arabic)
遇见 (Chinese)
मुलाकात की (Hindi)
พบ (Thai)
meet(Somebody1, Somebody2)
Other Languages?
• Several languages already have valency lexicons
• Chinese, Arabic, Hindi/Urdu, Korean PropBanks, ….
• Czech Tectogrammatical SynSemClass, https://ufal.mff.cuni.cz/synsemclass
• VerbNets, FrameNets: Spanish, Basque, Catalan, Portuguese, Japanese, …
• Linguistic valency lexicons: Arapaho, Lakota, Turkish, Farsi, Japanese, …
• For those without, follow EuroWordNet approach: project from English?
• Universal Proposition Banks for Multilingual Semantic Role Labeling
• See Ishan Jindal in Part 2
• Can AMR be applied universally to build language specific AMRs?
• Uniform Meaning Representation
• See Nianwen Xue after the AM break
How do we cover thousands of languages?
• Universal PropBank was developed by IBM, primarily with translation
Practical and efficient, produces consistent representations for all languages
Projects English frames to parallel sentences in 23 languages
• BUT - May obscure language specific semantic nuances
Not optimal for target language applications: IE, QA,…
• Uniform Meaning Representation
• Richer than PropBank alone
• Captures language specific characteristics while preserving consistency
• BUT - Producing sufficient hand annotated data is SLOW!
• Comparisons of UP/UMR will teach us a lot about
differences between languages
UP vs UMR
• Morning Session, Part 1
• Introduction - Martha Palmer
• Background and Resources – Martha Palmer
• Abstract Meaning Representations - Tim O'Gorman
• Break
• Morning Session, Part 2
• Relations to other Meaning Formalisms: AMR, UCCA, Tectogrammatical, DRS (Parallel
Meaning Bank), Minimal Recursion Semantics and Semantic Parsing – Tim O'Gorman
• Uniform Meaning Representations – Nianwen Xue
Tutorial Outline
• Afternoon Session, Part 1
• Modeling Meaning Representation: SRL - Ishan Jindal
• Modeling Meaning Representation: AMR – Jeff Flanigan
• Break
• Afternoon Session, Part 2
• Applying Meaning Representations – Yunyao Li, Jeff Flanigan
• Open Questions and Future Work – Tim O’Gorman
Tutorial Outline
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
• AMR as a format is older (Kasper 1989,
Langkilde & Knight 1998), but with no
PropBank, no training data.
• PropBank showed that large-scale training
sets could be annotated for SRL
• Modern AMR's (Banarescu et al. 2013)
main innovation: making large-scale
sembanking possible:
• AMR 3.0 more than 60k sentences in English
• CAMR more than 20k sentences in Chinese
“AMR” annotation
• Shift from SRL to AMR – from spans
to graphs
• In SRL we separately represent each
predicate’s arguments with spans
• AMR instead uses graphs with one
node per concept
AMR Basics – SRL to AMR
• “PENMAN” is the text-based format
used to represent these graphs
AMR Basics – PENMAN
(l / like-01
:ARG0 (c / cat
:mod (l2 / little))
:ARG1 (e / eat-01
:ARG0 c
:ARG1 (c2 / cheese)))
• Edges are represented by
indentation and colons (:EDGE)
• Individual variables identify each
node
AMR Basics – PENMAN
(l / like-01
:ARG0 (c / cat
:mod (l2 / little))
:ARG1 (e / eat-01
:ARG0 c
:ARG1 (c2 / cheese)))
• If a node has more than one edge, it
can be referred to again using that
variable.
• Terminology: We call that a re-
entrancy
• This is used for all references to the
same entity/thing in a sentence!
• This is what allows us to encode
graphs in this tree-like format
AMR Basics – PENMAN
(l / like-01
:ARG0 (c / cat
:mod (l2 / little))
:ARG1 (e / eat-01
:ARG0 c
:ARG1 (c2 / cheese)))
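If you want to inspect PENMAN graphs programmatically, the third-party penman Python package (an assumption on our part: install with pip install penman) can decode this notation; the sketch below prints the concept for each variable and then finds the re-entrant variable by counting how often each variable is the target of an edge:

import penman  # third-party library: pip install penman

AMR = """
(l / like-01
   :ARG0 (c / cat
            :mod (l2 / little))
   :ARG1 (e / eat-01
            :ARG0 c
            :ARG1 (c2 / cheese)))
"""

g = penman.decode(AMR)

# One instance triple per node: (l, :instance, like-01), (c, :instance, cat), ...
for var, _, concept in g.instances():
    print(var, "->", concept)

# A variable that is the target of more than one edge is a re-entrancy.
targets = [target for _, _, target in g.edges()]
print("re-entrant:", {v for v in targets if targets.count(v) > 1})  # expected: {'c'}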
• Inverse roles allow us to encode
things like relative clauses
• Any relation of the form “:X-of” is
an inverse.
• Interchangeable!
• (entity, ARG0-of, predicate)
generally equal to
(predicate, ARG0, entity)
AMR Basics – PENMAN
(l / like-01
:ARG0 (h / he)
:ARG1 (c / cat
:ARG0-of (e / eat-01
:ARG1 (c2 / cheese))))
• Are the graphs the same for “cats that eat cheese” and “cats eat
cheese”?
• No! Every graph gets a “Top” edge defining the semantic head/root
AMR Basics – PENMAN
(c / cat
:ARG0-of (e / eat-01
:ARG1 (c2 / cheese)))
(e / eat-01
:ARG0 (c / cat)
:ARG1 (c2 / cheese))
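With the same assumed penman package, the difference shows up directly as the graph's top (root) variable, even though the node and edge inventory is the same:

import penman  # assumed third-party library: pip install penman

relative_clause = penman.decode("(c / cat :ARG0-of (e / eat-01 :ARG1 (c2 / cheese)))")
main_clause = penman.decode("(e / eat-01 :ARG0 (c / cat) :ARG1 (c2 / cheese))")

print(relative_clause.top)  # 'c': "cats that eat cheese"
print(main_clause.top)      # 'e': "cats eat cheese"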
• Named entities are typed and then linked to a
“name” node with features for each name token.
• 70+ categories like person, government-organization,
newspaper, city, food-dish, conference
• Note that name strings (and some other things like
numbers) are constants — they aren’t assigned
variables.
• Entity linking: connect to wikipedia entry for each NE
(when available)
AMR Basics – PENMAN
• That’s AMR notation! Let’s review before discussing how
we annotate AMRs.
(e / eat-01
:ARG0 (d / dog)
:ARG1 (b / bone :quant 4
:ARG1-of (f / find-01
:ARG0 d)))
[Labels in the example above: variable, concept, constant, inverse relation, reentrancy.]
AMR Basics – PENMAN
• AMR does limited normalization aimed at reducing arbitrary
syntactic variation (“syntactic sugar”) and maximizing cross-
linguistic robustness
• Mapping all predicative things (verbs, adjectives, many nouns)
to PropBank predicates. Some morphological decomposition
• Limited speculation: mostly represent direct contents of
sentence (add pragmatic content only when it can be done
consistently)
• Canonicalize the rest: removal of semantically light predicates
and some features like definiteness (controversial)
AMR Basics 2 – Annotation Philosophy
AMR Basics 2 – Annotation Philosophy
• We generalize across parts of speech and
etymologically related words:
• But we don’t generalize over synonyms (hard
to do consistently):
My fear of snakes fear-01
I’m terrified of snakes terrify-01
Snakes creep me out creep_out-03
My fear of snakes fear-01
I am fearful of snakes fear-01
I fear snakes fear-01
I’m afraid of snakes fear-01
AMR Basics 2 – Annotation Philosophy
• Predicates use the
PropBank inventory.
• Each frame presents
annotators with a list of
senses.
• Each sense has
its own definitions for its
numbered (core)
arguments
AMR Basics 2 – Annotation Philosophy
• If a semantic role is not in
the core roles for a roleset,
AMR provides an inventory
of non-core roles
• These express things like
:time, :manner, :part,
:location, :frequency
• Inventory on handout, or in
editor (the [roles] button)
AMR Basics 2 – Annotation Philosophy
• Ideally one semantic
concept = one node
• Multi-word predicates
modeled as a single node
• Complex words can be
decomposed
• Only limited, replicable
decomposition (e.g. kill
does not become “cause to
die”)
The thief was lining his pockets with
their investments
(l / line-pocket-02
:ARG0 (p / person
:ARG0-of (t / thieve-01))
:ARG1 (t2 / thing
:ARG2-of (i2 / invest-01
:ARG0 (t3 / they))))
AMR Basics 2 – Annotation Philosophy
• All concepts drop plurality, aspect,
definiteness, and tense.
• Non-predicative terms simply represented in
singular, nominative form
A cat
The cat
cats
the cats
(c / cat)
eating
eats
ate
will eat
(e / eat-01)
They
Their
Them
(t / they)
The man described the mission as a disaster.
The man’s description of the mission: disaster.
As the man described it, the mission was a disaster.
The man described the mission as disastrous.
(d / describe-01
:ARG0 (m / man)
:ARG1 (m2 / mission)
:ARG2 (d2 / disaster))
AMR Basics 2 – Annotation Philosophy
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
• Format & Basics
• Some Details & Design Decisions
• Practice - Walking through a few AMRs
• Multi-sentence AMRs
• Relation to Other Formalisms
• UMRs
• Open Questions in Representation
Representation Roadmap
Details - Specialized Normalizations
• We also have special entity types we use for
normalizable entities.
(d / date-entity
:weekday (t / tuesday)
:day 19)
(m / monetary-quantity
:unit dollar
:quant 5)
“Tuesday the 19th” “five bucks”
Details - Specialized Normalizations
• We also have special entity types we use for
normalizable entities.
(r / rate-entity-91
:ARG1 (m / monetary-quantity
:unit dollar
:quant 3)
:ARG2 (v / volume-quantity
:unit gallon
:quant 1))
“$3 / gallon”
Details - Specialized Predicates
• Common constructions for kinship and
organizational relations are given general
predicates like have-org-role-91
(p / person
:ARG0-of (h / have-org-role-91
:ARG1 (c / country
:name (n / name :op1 "US")
:wiki "United_States")
:ARG2 (p2 / president)))
“The US president”
have-org-role-91
ARG0: office holder
ARG1: organization
ARG2: title of office held
ARG3: description of responsibility
Details - Specialized Predicates
• Common constructions for kinship and
organizational relations are given general
predicates like have-org-role-91
(p / person
:ARG0-of (h / have-rel-role-91
:ARG1 (s / she)
:ARG2 (f / father)))
“Her father”
have-rel-role-91
ARG0: entity A
ARG1: entity B
ARG2: role of entity A
ARG3: role of entity B
ARG4: relationship basis
Coreference and Control
• Within sentences, all references to the same “referent” are merged
into the same variable.
• This applies even with pronouns or even descriptions
Pat saw a moose and she ran
(a / and
:op1 (s / see-01
:ARG0 (p / person
:name (n / name :op1 “Pat”))
:ARG1 (m / moose) )
:op2 (r / run-02
:ARG0 p))
Reduction of Semantically Light Matrix Verbs
• Specific predicates (specifically
the English copula) NOT used in
AMR.
• Copular predicates which
*many languages would omit*
are good candidates for
removal
• Replace with relative
SEMANTIC assertions (e.g.
:domain is “is an attribute of”)
• UMR will discuss alternatives to
just omitting these.
the pizza is free
(f / free-01
:ARG1 (p / pizza))
The house is a pit
(p / pit
:domain (h / house))
Discourse Connectives and Coordination
• For two-place discourse connectives, we
define frames
• Although it rained, we walked home
• For list-like things (including coordination) we
use “:op#” to define places in the list:
• Apples and bananas
(a / and
:op1 (a2 / apple)
:op2 (b / banana))
have-concession-91:
Arg2: “although” clause
Arg1: main clause
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
• Format & Basics
• Some Details & Design Decisions
• Practice - Walking through a few AMRs
• Multi-sentence AMRs
• Relation to Other Formalisms
• UMRs
• Open Questions in Representation
Representation Roadmap
Practice - Let’s Try some Sentences
• Feel free to annotate by hand (or ponder how you’d want to represent them)
• Edmund Pope tasted freedom today for the first time in more than eight months.
• Pope is the American businessman who was convicted last week on spying charges and sentenced to
20 years in a Russian prison.
Taste-01:
Arg0: taster
Arg1: food
Useful Normalized forms:
- Rate-entity
- Ordinal-entity
- Date-entity
- Temporal-quantity
Useful NER types:
- Person
- Country
Convict-01
Arg0: judge
Arg1: person convicted
Arg2: convicted of what
Spy-01
Arg0: secret agent
Arg1: entity spied /seen
Charge-01
Asking price
Arg0: seller
Arg1: asking price
Arg2: buyer
Arg3 :commodity
Charge-05
Assign a role
(including criminal charges)
Arg0:assigner
Arg1 : assignee
Arg2: role or crime
Sentence-01
Arg0: judge/jury
Arg1: criminal
Arg2: punishment
Practice - Let’s Try some Sentences
Edmund Pope tasted freedom today for the first time in more than eight months.
(t2 / taste-01
:ARG0 (p / person :wiki "Edmond_Pope"
:name (n2 / name :op1 "Edmund" :op2 "Pope"))
:ARG1 (f / free-04
:ARG1 p)
:time (t3 / today)
:ord (o3 / ordinal-entity :value 1
:range (m / more-than
:op1 (t / temporal-quantity :quant 8
:unit (m2 / month)))))
Practice - Let’s Try some Sentences
Pope is the American businessman who was convicted last week
on spying charges and sentenced to 20 years in a Russian prison.
(b2 / businessman
:mod (c5 / country :wiki "United_States"
:name (n6 / name :op1 "America"))
:domain (p / person :wiki "Edmond_Pope"
:name (n5 / name :op1 "Pope"))
:ARG1-of (c4 / convict-01
:ARG2 (c / charge-05
:ARG1 b2
:ARG2 (s2 / spy-01
:ARG0 p))
:time (w / week
:mod (l / last)))
:ARG1-of (s / sentence-01
:ARG2 (p2 / prison
:mod (c3 / country :wiki "Russia"
:name (n4 / name :op1 "Russia"))
:duration (t3 / temporal-quantity :quant 20
:unit (y2 / year)))
:ARG3 s2))
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
• Format & Basics
• Some Details & Design Decisions
• Practice - Walking through a few AMRs
• Multi-sentence AMRs
• Relation to Other Formalisms
• UMRs
• Open Questions in Representation
Representation Roadmap
A final component in AMR: Multi-sentence!
• AMR 3.0 release contains Multi-sentence AMR annotations
• Document-level coreference:
• Connecting mentions that co-refer
• Connecting some partial coreference
• Making cross-sentence implicit semantic roles
• John took his car to the store.
• He bought milk (from the store).
• He put it in the trunk.
A final component in AMR: Multi-sentence!
• AMR 3.0 release contains Multi-sentence AMR annotations
• Annotation was done between AMR variables, not raw text — nodes are coreferent
(t / take-01
:ARG0 (p / person :name (n / name :op1 “John”))
:ARG1 (c / car :poss p)
:ARG3 (s / store))
(b / buy-01
:ARG0 (h / he)
:ARG1 (m / milk))
A final component in AMR: Multi-sentence!
• AMR 3.0 release contains Multi-sentence AMR annotations
• "implicit role" annotation was done by showing the remaining roles to annotators
and allowing them to be added to coreference chains.
(t / take-01
:ARG0 (p / person :name (n / name :op1 “John”))
:ARG1 (c / car :poss p)
:ARG2 (x / implicit :op1 “taken from, source…”)
:ARG3 (s / store))
(b / buy-01
:ARG0 (h / he)
:ARG1 (m / milk)
:ARG2 (x / implicit :op1 “seller”))
A final component in AMR: Multi-sentence!
• AMR 3.0 release contains Multi-sentence AMR annotations
• Implicit roles are worth considering for meaning representation, especially for
languages other than English
• Null subject (and sometimes null object) constructions are very cross-linguistically
common, can carry lots of information
• Arguments of nominalizations can carry a lot of assumed information in scientific
domains
A final component in AMR: Multi-sentence!
• Multi-sentence AMR data: training and evaluation data for creating a graph for
a whole document
• Was not impossible before multi-sentence AMR: could bootstrap with span-based
coreference data
• Also extended to spatial AMRs (human-robot interactions - Bonn et al. 2022)
• MS-AMR work was done on top of existing gold AMR annotations — a separate
process.
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
• Format & Basics
• Some Details & Design Decisions
• Practice - Walking through a few AMRs
• Multi-sentence AMRs
• Relation to Other Formalisms
• UMRs
• Open Questions in Representation
Representation Roadmap
Comparison to Other Frameworks
• Meaning representations vary along many
dimensions!
• How meaning is connected to text
• Relationship to logical and/or executable form
• Mapping to Lexicons/Ontologies/Tasks
• Relationship to discourse
• We’ll overview these followed by some side-
by-side comparisons
Alignment to Text / Compositionality
• Historical approach to meaning representations: represent context-free semantics,
as defined by a particular grammar model
• AMR at other extreme: AMR graph annotated for a single sentence, but no
individual mapping from tokens to nodes
Alignment to Text / Compositionality
Oepen & Kuhlmann (2016) “flavors” of meaning representations:
Type 0: Bilexical: nodes each correspond to one token (dependency parsing).
Examples: Universal Dependencies; Prague semantic dependencies (PSD); DM
Type 1: Anchored: nodes are aligned to text (can be subtoken or multi-token).
Examples: UCCA; MRS-connected frameworks (EDS); DRS-based frameworks (PMB / GMB); Prague tectogrammatical
Type 2: Unanchored: no mapping from graph to surface form.
Examples: AMR; some executable/task-specific semantic parsing frameworks
Alignment to Text / Compositionality
Less thoroughly defined: adherence to grammar/compositionality (cf. Bender et al. 2015)
Some frameworks (MRS / DRS below) have particular assertions about how a given meaning representation was derived (tied to a particular grammar)
AMR encodes many useful things that are often *not* considered compositional: named entity typing, cross-sentence coreference, word senses, etc.
A continuum from “sentence meaning” to extragrammatical inference:
only encode “compositional” meanings predicted by a particular theory of grammar -> some useful pragmatic inference (e.g. sense distinctions, named entity types) -> any wild inferences needed for the task
Alignment to Text / Compositionality - UCCA
• Universal Conceptual Cognitive Annotation: based on a typological
theory (Dixon's BLT) of how to do coarse-grained semantics across
languages
• Similar to a cross between dependency and constituency parses (labeled
edges) - sometimes very syntactic
• Coarse-grained roles, e.g.:
• A: participant
• S: State
• C: Center
• D: Adverbial
• E: elaborator
• “Anchored” graphs, in the Oepen & Kuhlmann taxonomy (somewhat
compositional, but no formal rules for how a given node is derived)
Alignment to Text / Compositionality - Prague
• Very similar to AMR with more general semantic roles (predicates use
Vallex predicates (valency lexicon) and a shared set of semantic roles
similar to VerbNet)
• Semantic graph is aligned to syntactic graph layers (“type 1”)
• “Prague Czech-English Dependency Treebank”
• “PSD” reduced form fully bilexical (“Type 0”) for dependency
parsing.
• Full PCEDT also has rich semantics like implicit roles (e.g. null
subjects) – “anchored” (“Type 1”)
For the Czech version of “An earthquake struck
Northern California, killing more than 50
people.” (Čmejrek et al. 2004)
Logical & Executable Forms
• Lots of logical desiderata:
• Modeling whether events happen and/or are believed (and other modality
questions): Sam believes that Bill didn’t eat the plums.
• Understanding quantifications: whether “every child has a favorite song” refers
to one song or many
• Technically our default assumption for AMR is Neo-Davidsonian: bag of triples like
(“instance-of(b, believe-01)”, “instance-of(h, he), “ARG0(b, h)”
• One cannot modify more than one node in the graph
• PENMAN is a bracketed tree that can be treated like a logical form (with certain
assumptions or addition to certain new annotations)
• Artzi et al. (2015), Bos (2016), Stabler (2017), Pustejovsky et al. (2019), etc.
• Competing frameworks like DRS and MRS more specialized for this.
Logical & Executable Forms
• Lots of logical desiderata:
• Modeling whether events happen and/or are believed (and other modality
questions): Sam believes that Bill didn’t eat the plums.
• Understanding quantifications: whether “every child has a favorite song” refers
to one song or many
• Technically our default assumption for AMR just means that something like “:polarity
-“ is a feature of a single node; no semantics for quantifiers like “every”
• With certain assumptions or addition to certain new annotations, PENMAN is a
bracketed tree that can be treated like a logical form
• Artzi et al. (2015), Bos (2016), Stabler (2017), Pustejovsky et al. (2019), etc.;
proposals for “UMR” treatments as well.
• Competing frameworks like DRS and MRS more specialized for this.
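A minimal plain-Python sketch of that “bag of triples” reading, using the believe fragment mentioned above (illustrative only; quantifier scope is exactly what this flat conjunction leaves out):

# Flat Neo-Davidsonian reading of a small AMR fragment, written as a conjunction.
triples = [
    ("instance-of", "b", "believe-01"),
    ("instance-of", "h", "he"),
    ("ARG0", "b", "h"),
]
print(" ∧ ".join(f"{rel}({x}, {y})" for rel, x, y in triples))
# instance-of(b, believe-01) ∧ instance-of(h, he) ∧ ARG0(b, h)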
Logical & Executable Forms - DRS
• Discourse Representation Structures (annotations in
Groningen Meaning Bank and Parallel Meaning Bank)
• DRS frameworks do scoped meaning representation
• Outputs originally modified from CCG parser LF
outputs -> DRS
• DRS uses “boxes” which can be negated, asserted,
believed in.
• This is not natively a graph representation! “box
variables” (bottom) one way of thinking about
these
• a triple like “agent(e1, x1)” is part of b3
• Box b3 is modified (e.g. b2 POS b3)
Logical & Executable Forms - DRS
• Grounded in long theoretical DRS tradition (Heim &
Kamp) for handling discourse referents, presuppositions,
discourse connectives, temporal relations across
sentences, etc.
• DRS for “everyone was killed” (Liu et al. 2021)
Logical & Executable Forms - MRS
Minimal Recursion Semantics (and related frameworks)
• Copestake (1997) model proposed for semantics of HPSG - this is
connected to other underspecification solutions (Glue semantics /
hole semantics / etc. )
• Define set of constraints over which variables outscope other
variables
• HPSG grammars like the English Resource Grammar produce ERS
(English resource semantics) outputs (which are roughly MRS) and
have been modified into a simplified DM format (“type 0” bilexical
dependency)
Logical & Executable Forms - MRS
• Underspecification in practice:
• MRS can be thought of as many fragments with constraints on
how they scope together
• Those define a set of MANY possible combinations
into a fully scoped output, e.g.:
Every dog barks and chases a cat (as interpreted in Manshadi et al. 2017)
Logical & Executable Forms- MRS
• Variables starting with h are “handle”
variables used to define constraints on
scope.
• h19 = things under scope of negation
• h21 = leave_v_1 head
• h19 =q h21: equality modulo
quantifiers
• (Neg outscopes leave)
• “forest” of possible readings
• Takeaway: Constraints on which variables
“outscope” others can add flexible amounts
of scope info
Lexicon/Ontology Differences
• Predicates can use different ontologies – e.g. more grounded in
grammar/valency, or more tied to taxonomies like WordNet
• Semantic Roles can be encoded differently, e.g. with non-lexicalized
semantic roles (discussed for UMR later)
• Some additional proposals: “BabelNet Meaning Representation”
proposes using VerbAtlas (clusters over WordNet senses with VerbNet
semantic role templates)
Semantic roles: DRS (GMB/PMB): VerbNet (general roles); MRS: general roles; Prague (PCEDT): general roles + valency lexicon; AMR: lexicalized numbered arguments; UCCA: fixed general roles
Predicates: DRS: WordNet; MRS: grammatical entries; Prague: Vallex valency lexicon (PropBank-like); AMR: PropBank predicates; UCCA: a few types (State vs process …)
Non-predicates: DRS: WordNet; MRS: lemmas; Prague: lemmas; AMR: named entity types; UCCA: lemmas
Task-specific Representations
• Many use “Semantic Parsing” to refer to task-specific, executable
representations
• Text-to-SQL
• interaction with robots, text to code/commands
• interaction with deterministic systems like calendars/travel
planners
• Similar distinctions to a general-purpose meaning representation, BUT
• May need to map into specific task taxonomies and ignore
content not relevant to task
• Can require more detail or inference than what's assumed for
“context-free” representations
• Often can be thought of as first-order logic forms — simple
predicates + scope
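As a toy illustration of the contrast with the general-purpose graphs above, a task-specific semantic parse often just pairs an utterance with an executable form; the sketch below invents a tiny flights table, so the schema, column names, and airport codes are assumptions:

# Toy text-to-SQL pairs: here the meaning representation *is* the executable query.
examples = [
    ("show me flights from denver to boston",
     "SELECT * FROM flights WHERE origin = 'DEN' AND destination = 'BOS'"),
    ("how many flights leave denver before 10am",
     "SELECT COUNT(*) FROM flights WHERE origin = 'DEN' AND dep_time < '10:00'"),
]
for utterance, query in examples:
    print(utterance, "->", query)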
Task-specific Representations
• Classic datasets (Table from
Dong & Lapata 2016) regard
household commands or
querying KBs
• Recent tasks for text-to-SQL
Task-specific Representations - Spatial AMR
• Additional example of task-specific semantic parsing is human-robot
interaction
• Non-trivial to simply pull those interactions from AMR: normal human
language is not normally sufficiently informative about spatial positioning,
frames of reference, etc.
• Spatial AMR project (Bonn et al. 2020) a good example of project
attempting to add all “additional detail” needed to handle structure-
building dialogues (giving instructions for building Minecraft structures)
• Released with dataset of building actions, success/failures, views
of the event from different angles.
Discourse-Level Annotation
• Do you do multi-sentence coreference?
• Partial coreference (set-subset, implicit roles,
etc.)?
• Discourse connectives?
• Treatment of multi-sentence tense, modality,
etc.?
• Prague Tectogrammatical annotations & AMR
only general-purpose representations with
extensive multi-sentence annotations
Overviewing Frameworks vs. AMR
Columns: Alignment; Logical Scoping & Interpretation; Ontologies and Task-Specific; Discourse-Level
• DRS (Groningen / Parallel): compositional/anchored; scoped representation (boxes); rich predicates (WordNet), general roles; can handle referents, connectives
• MRS: compositional/anchored; underspecified scoped representation; simple predicates, general roles; discourse: n/a
• UCCA: anchored; not really scoped; simple predicates, general roles; some implicit roles
• Prague Tecto: anchored; not really scoped; rich predicates, semi-lexicalized roles; rich multi-sentence coreference
• AMR: unanchored (English), anchored (Chinese); not really scoped yet; rich predicates, lexicalized roles; rich multi-sentence coreference
End of Meaning Representation Comparison
• What's next: UMR, a proposal among AMR-connected
scholars on next steps for AMR.
• Questions about how AMR is annotated?
• Questions about how it relates to other meaning
representation formalisms?
Meaning Representations for Natural Languages Tutorial Part 2
Common Meaning Representations
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
Outline
► Background
► Do we need a new meaning representation? What’s wrong with existing
meaning representations?
► Aspects of Uniform Meaning Representation (UMR)
► UMR starts with AMR but made a number of enrichments
► UMR is a document-level meaning representation that represents temporal
dependencies, modal dependencies, and coreference
► UMR is a cross-lingual meaning representation that
separates aspects of meaning that are shared across languages
language-independent from those that are idiosyncratic to individual
languages (language-specific)
► UMR-Writer -- a tool for annotating UMRs
Why aren’t existing meaning representations sufficient?
► Existing meaning representations vary a great deal in their focus
and perspective
► Formal semantic representations aimed at supporting logical inference
focus on the proper representation of quantification, negation, tense,
and modality (e.g., Minimal Recursion Semantics (MRS) and Discourse
Representation Theory (DRT).
► Lexical semantic representations focus on the proper representation of
core predicate-argument structures, word sense, named entities and
relations between them, coreference (e.g., Tectogrammatical
Representation (TR), AMR).
► The semantic ontologies they use also differ a great deal. For
example, MRS doesn’t have a classification of named entities at
all, while AMR has over 100 types of named entities
UMR uses AMR as a starting point
► Our starting point is AMR, which has a number of
attractive properties:
► Easy to read,
► scalable (can be directly annotated without relying on syntactic
structures),
► has information that is important to downstream applications (e.g.,
semantic roles, named entities and coreference),
► represented in a well-defined mathematical structure (a single-rooted,
directed, acyclic graph)
► Our general strategy is to augment AMR with meaning
components that are missing and adapt it to cross-lingual
settings
Participants of the UMR project
► UMR stands for Uniform Meaning Representation, and it is an
NSF funded collaborative project between Brandeis University,
University of Colorado, and University of New Mexico, with a
number of partners outside these institutions
From AMR to UMR Gysel et al. (2021)
► At the sentence level, UMR adds:
► An aspect attribute to eventive concepts
► Person and number attributes for pronouns and other nominal
expressions
► Quantification scope between quantified expressions
► At the document level UMR adds:
► Temporal dependencies in lieu of tense
► Modal dependencies in lieu of modality
► Coreference relations beyond sentence boundaries
► To make UMR cross-linguistically applicable, UMR
► defines a set of language-independent abstract concepts and
participant roles,
► uses lattices to accommodate linguistic variability
► designs specifications for complicated mappings between words and
UMR concepts.
UMR sentence-level additions
► An Aspect attribute to event concepts
► Aspect refers to the internal constituency of events - their
temporal and qualitative boundedness
► Person and number attributes for pronouns and other
nominal expressions
► A set of concepts and relations for discourse relations
between clauses
► Quantification scope between quantified expressions to
facilitate translation of UMR to logical expressions
UMR attribute: aspect
[Aspect lattice: coarse values Habitual, Imperfective, Perfective, State, Atelic Process, Process, Activity, Endeavor, Performance; finer values include Reversible State, Irreversible State, Inherent State, Point State, Undirected Activity, Directed Activity, Semelfactive, Undirected Endeavor, Directed Endeavor, Incremental Accomplishment, Nonincremental Accomplishment, Directed Achievement, Reversible, Irreversible.]
UMR attribute: coarse-grained aspect
► State: unspecified type of state
► Habitual: an event that occurs regularly in the
past or present, including generic statements
► Activity: an event that has not necessarily ended and
may be ongoing at Document Creation Time (DCT).
► Endeavor: a process that ends without reaching
completion (i.e., termination)
► Performance: a process that reaches a completed
result
state
Coarse-grained Aspect as an UMR attribute
He wants to travel to Albuquerque.
(w / want
:aspect State)
She rides her bike to
work.
(r / ride
:aspect Habitual)
He was writing his
paper yesterday.
(w / write
:aspect Activity)
Mary mowed the lawn for thirty
minutes.
(m / mow
:aspect Endeavor)
Fine-grained Aspect as an UMR attribute
My cat is hungry.
(h / have-mod-91
:aspect Reversible state)
The wine glass is
shattered.
(h / have-mod-91
:aspect Irreversible state)
My cat is black and white.
(h / have-mod-91
:aspect Inherent state)
It is 2:30pm.
(h / have-mod-91
:aspect Point state)
AMR vs UMR on how pronouns are represented
► In AMR, pronouns are treated as unanalyzable concepts
► However, pronouns differ from language to language, so UMR
decomposes them into person and number attributes
► These attributes can be applied to nominal expressions too
AMR:
(s / see-01
:ARG0 (h/ he)
:ARG1 (b/ bird
:mod (r/ rare)))
UMR:
(s / see-01
:ARG0 (p / person
:ref-person 3rd
:ref-number Sing.)
:ARG1 (b / bird
:mod (r/ rare)
:ref-number Plural))
“He saw rare birds
today.”
UMR attributes: Person
[Person lattice: person splits into non-third and non-first; non-third covers first and second; non-first covers second and third; first splits into exclusive and inclusive.]
UMR attributes: number
[Number lattice: Number splits into Singular and Non-singular; finer non-singular categories include Dual, Non-dual, Trial, Non-trial, Paucal, Plural, and Greater Plural.]
Discourse relations in UMR
► In AMR, there is a minimal system for indicating
relationships between clauses - specifically coordination:
► and concept and :opX relations for addition
► or/either/neither concepts and :opX relations for disjunction
► contrast-01 and its participant roles for contrast
► Many subordinated relationships are represented through
participant roles, e.g.:
► :manner
► :purpose
► :condition
► UMR makes explicit the semantic relations between (more
general) “coordination” semantics and (more specific)
“subordination” semantics
Discourse relations in UMR
[Discourse relations lattice: node names include and, or, but-91, inclusive-disj, exclusive-disj, and + but, and + unexpected, and + contrast, consecutive, additive, unexpected-co-occurrence-91, contrast-91, :apprehensive, :condition, :cause, :purpose, :temporal, :manner, :pure-addition, :substitute, :concession, :concessive-condition, :subtraction.]
Disambiguation of quantification scope in UMR
“Someone didn’t answer all the questions”
(a / answer-01
:ARG0 (p / person)
:ARG1 (q / question :quant All :polarity -)
:pred-of (s / scope :ARG0 p :ARG1 q))
∃p(person(p) ∧ ¬∀q(question(q) →
∃a(answer-01(a) ∧ ARG1(a, q) ∧ ARG0(a, p))))
Quantification scope annotation
► Scope will not be annotated for summation readings, nor is
it annotated where a distributive or collective reading can be
predictably derived from the lexical semantics.
► The linguistics students ran 5 kilometers to raise money for charity.
► The linguistics students carried a piano into the theater.
► Ten hurricanes hit six states over the weekend.
► The scope annotation only comes into play when some
overt linguistic element forces an interpretation that
diverges from the lexical default
► The linguistics students together ran 200 kilometers to raise
money for charity.
► The bodybuilders each carried a piano into the theater.
► Ten hurricanes each hit six states over the weekend.
From AMR to UMR Gysel et al. (2021)
► At the sentence level, UMR adds:
► An aspect attribute to eventive concepts
► Person and number attributes for pronouns and other nominal
expressions
► Quantification scope between quantified expressions
► At the document level UMR adds:
► Temporal dependencies in lieu of tense
► Modal dependencies in lieu of modality
► Coreference relations beyond sentence boundaries
► To make UMR cross-linguistically applicable, UMR
► defines a set of language-independent abstract concepts and
participant roles,
► uses lattices to accommodate linguistic variability
► designs specifications for complicated mappings between words and
UMR concepts.
UMR is a document-level representation
► Temporal relations are added to UMR graphs as
temporal dependencies
► Modal relations are also added to UMR graphs as
modal dependencies
► Coreference is added to UMR graphs as identity or
subset relations between named entities or events
No representation of tense in AMR
(t / talk-01
:ARG0 (s / she)
:ARG2 (h / he)
:medium (l / language
:name (n / name
:op1 "French")))
► “She talked to him in French.”
► “She is talking to him in French.”
► “She will talk to him in French.”
Adding tense seems straightforward...
Adding tense to AMR involves defining a temporal relation
between event-time and the Document Creation Time
(DCT) or speech time (Donatelli et al 2019).
(t / talk-01
:time (b / before
:op1 (n / now))
:ARG0 (s / she)
:ARG2 (h / he)
:medium (l / language
:name (n2 / name
:op1 "French")))
“She talked to him in French.”
... but it isn’t
► For some events, its temporal relation to the DCT or
speech time is undefined. “John said he would go to the
florist shop”.
► Is “going to the florist shop” before or after the DCT?
► Its temporal relation is more naturally defined with respect to “said”.
► In quoted speech, the speech time has shifted. “I visited my
aunt on the weekend,” Tom said.
► The reference time for “visited” has shifted to the time when
Tom said this. We only know the “visiting” event happened
before the DCT indirectly.
► Tense is not universally grammaticalized, e.g., Chinese
Limitations of simply adding tense
► Even in cases when tense, i.e., the temporal relation between an event
and the DCT is clear, tense may not give us the most precise temporal
location of the event.
► John went into the florist shop.
► He had promised Mary some flowers.
► He picked out three red roses, two white ones and one pale pink
► Example from (Webber 1988)
► All three events happened before the DCT, but we also know that the
“going” event happened after the “promising” event, but before the
“picking out” event.
UMR represents temporal relations in a document as
temporal dependency structures (TDS)
► The temporal dependency structure annotation involves
identifying the most specific reference time for each event
► Time expressions and other events are normally the most
specific reference times
► In some cases, an event may require two reference times in
order to make its temporal location as specific as possible
Zhang and Xue (2018); Yao et al. (2020)
TDS Annotation
► If an event is not clearly linked temporally to either a
time expression or another event, then it can be linked
to the DCT or tense metanodes
► Tense metanodes capture vague stretches of time that
correspond to grammatical tense
► Past_Ref, Present_Ref, Future_Ref
► DCT is a more specific reference time than a tense
metanode
Temporal dependency Structure (TDS)
► If we identify a reference time for every event and time
expression in a document, the result will be a
Temporal Dependency Graph.
[TDS figure: ROOT links to the DCT (4/30/2020); "today" depends on the DCT; the descended, arrested, and assaulted events are contained in "today", with the assault before the arrest.]
"700 people descended on the state Capitol today, according
to Michigan State Police. State Police made one arrest, where
one protester had assaulted another, Lt. Brian Oleksyk said."
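Read off the figure, the temporal dependency structure for that snippet is just a small edge list; the Python sketch below is approximate (the edge set is reconstructed from the slide) and shows how a reference-time chain can be followed up to the root:

# (event/time, relation, reference time) edges, reconstructed from the slide figure.
tds_edges = [
    ("DCT(4/30/2020)", "depends-on", "ROOT"),
    ("today",          "depends-on", "DCT(4/30/2020)"),
    ("descended",      "contained",  "today"),
    ("arrested",       "contained",  "today"),
    ("assaulted",      "before",     "arrested"),
]

def reference_chain(node, edges):
    """Follow a node up through its reference times to the root."""
    parent = {child: ref for child, _, ref in edges}
    chain = [node]
    while chain[-1] in parent:
        chain.append(parent[chain[-1]])
    return chain

print(reference_chain("assaulted", tds_edges))
# ['assaulted', 'arrested', 'today', 'DCT(4/30/2020)', 'ROOT']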
Genre in TDS Annotation
► Temporal relations function differently depending on the
genre of the text (e.g., Smith 2003)
► Certain genres proceed in temporal sequence from one
clause to the next
► While other genres involve generally non-sequenced
events
► News stories are a special type
► many events are temporally sequenced
► temporal sequence does not match with sequencing in the text
TDS Annotation
► Annotators may also consider the modal annotation when
annotating temporal relations
► Events in the same modal “world” can be temporally linked to
each other
► Events in non-real mental spaces rarely make good
reference times for events in the “real world”
► Joe got to the restaurant, but his friends had not arrived. So, he
sat down and ordered a drink.
► Exception to this are deontic complement-taking
predicates
► Events in the complement are temporally linked to the
complement-taking predicate
► E.g. I want to travel to France: After (want, travel)
Modality in AMR
► Modality characterizes the reality status of events, without
which the meaning representation of a text is incomplete
► AMR has six concepts that represent modality:
► possible-01, e.g., “The boy can go.”
► obligate-01, e.g., “The boy must go.”
► permit-01, e.g., “The boy may go.”
► recommend-01, e.g., “The boy should go.”
► likely-01, e.g., “The boy is likely to go.”
► prefer-01, e.g., “The boy would rather go.”
► Modality in AMR is represented as senses of an English
verb or adjective.
► However, the same exact concepts for modality may not
apply to other languages
Modal dependency structure
► Modality is represented as a dependency structure in
UMR
► Similar to the temporal relations
► Events and conceivers (sources) are nodes in
the dependency structure
► Modal strength and polarity values characterize the edges
► Mary might be walking the dog.
[MDS fragment: AUTH -Neutral-> walk]
Modal dependency structure
► A dependency structure:
► Allows for the nesting of modal operators (scope)
► Allows for the annotation of scope relations between
modality and negation
► Allows for the import of theoretical insights from Mental
Space Theory (Fauconnier 1994, 1997)
Modal dependency structure
► There are two types of nodes in the modal
dependency structure: events and conceivers
► Conceivers
► Mental-level entities whose perspective is modelled in
the text
► Each text has an author node (or nodes)
► All other conceivers are children of the AUTH node
► Conceivers may be nested under other conceivers
► Mary said that Henry wants...
[Conceiver chain: AUTH -> MARY -> HENRY]
Epistemic strength lattice
[Epistemic strength lattice: non-neutral covers Full and Partial; non-full covers Partial and Neutral; Partial splits into strong partial and weak partial, Neutral into strong neutral and weak neutral.]
Full: The dog barked.
Partial: The dog probably barked.
Neutral: The dog might have barked.
Modal dependency structure (MDS)
[MDS figure: ROOT -MODAL-> AUTH (CNN); AUTH -FULLAFF-> the conceivers Michigan State Police and Lt. Brian Oleksyk; Michigan State Police -FULLAFF-> descended; Lt. Brian Oleksyk -FULLAFF-> arrested, assaulted.]
"700 people descended on the state Capitol today, according to
Michigan State Police. State Police made one arrest, where one
protester had assaulted another, Lt. Brian Oleksyk said."
(Vigus et al., 2019; Yao et al., 2021)
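The modal dependencies for the same snippet can be written down the same way (child node, modal strength/polarity, parent conceiver); the edge set below is reconstructed from the slide figure, so treat it as approximate:

# Modal dependency edges: conceivers and events are nodes, edges carry strength/polarity.
mds_edges = [
    ("AUTH(CNN)",             "MODAL",   "ROOT"),
    ("Michigan State Police", "FULLAFF", "AUTH(CNN)"),
    ("descended",             "FULLAFF", "Michigan State Police"),
    ("Lt. Brian Oleksyk",     "FULLAFF", "AUTH(CNN)"),
    ("arrested",              "FULLAFF", "Lt. Brian Oleksyk"),
    ("assaulted",             "FULLAFF", "Lt. Brian Oleksyk"),
]
for child, value, parent in mds_edges:
    print(f"{parent} -{value}-> {child}")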
Entity Coreference in UMR
► same-entity:
1. Edmund Pope tasted freedom today for the first time
in more than eight months.
2. He denied any wrongdoing.
► subset:
1. He is very possessive and controlling but he has no right
to be as we are not together.
Event coreference in UMR
► same-event
1. El-Shater and Malek’s property was confiscated and is believed to
be worth millions of dollars.
2. Abdel-Maksoud stated the confiscation will affect the Brotherhood’s
financial bases.
► same-event
1. The Three Gorges project on the Yangtze River has recently introduced
the first foreign capital.
2. The loan , a sum of 12.5 million US dollars , is an export credit
provided to the Three Gorges project by the Canadian government ,
which will be used mainly for the management system of the Three
Gorges project .
► subset:
1. 1 arrest took place in the Netherlands and another in Germany.
2. The arrests were ordered by anti-terrorism judge Fragnoli.
An UMR example with coreference
He is controlling but he has no right to be as we are not together.
(s4c / but-91
:ARG1 (s4c3 / control-01
:ARG0 (s4p2 / person
:ref-person 3rd
:ref-number Singular))
:ARG2 (s4r / right-05
:ARG1 s4p2
:ARG1-of (s4c2 / cause-01
:ARG0 (s4h / have-mod-91
:ARG0 (s4p3 / person
:ref-person 1st
:ref-number Plural)
:ARG1 (s4t/ together)
:aspect State
:modstr FullNeg))
:modstr FullNeg))
(s / sentence
:coref ((s4p2 :subset-of s4p3)))
Implicit arguments
► Like MS-AMRs, UMR also annotates implicit arguments when they can
be inferred from context and can be annotated for coreference like
overt (pronominal) expressions
(s3d / deny-01
:Aspect Performance
:ARG0 (s3p / person
:ref-number Singular
:ref-person 3rd)
:ARG1 (s3t / thing
:ARG1-of (s3d2 / do-02
:ARG0 s3p
:ARG1-of
(s3w / wrong-02)
:aspect Process
:modpred s3d))
:modstr FullAff)
“He denied any wrongdoing”
The challenge: Integration of different meaning components
into one graph
► How do we represent all this information in a unified
structure that is still easy to read and scalable?
► UMR pairs a sentence-level representation (a modified
form of AMR) with a document-level representation.
► We assume that a text will still have to be processed
sentence by sentence, so each sentence will have a
fragment of the document-level super-structure.
Integrated UMR representation
1. Edmund Pope tasted freedom today for the first time in
more than eight months.
2. Pope is the American businessman who was convicted last
week on spying charges and sentenced to 20 years in a
Russian prison.
3. He denied any wrongdoing.
Sentence-level representation vs document-level representation
(s1t2 / taste-01
:Aspect Performance
:ARG0 (s1p / person
:name (s1n2 / name
:op1 “Edmund”
:op2 “Pope”))
:ARG1 (s1f / free-04 :ARG1 s1p)
:time (s1t3 / today)
:ord (s1o3 / ordinal-entity
:value 1
:range (s1m / more-than
:op1 (s1t / temporal-quantity
:quant 8
:unit (s1m2 / month)))))
Edmund Pope tasted freedom today for the first time in
more than eight months.
(s1 / sentence
:temporal ((DCT :before
s1t2) (s1t3 :contained s1t2)
(DCT :depends-on s1t3))
:modal ((ROOT :MODAL AUTH)
(AUTH :FullAff s1t2)))
Pope is the American businessman who was convicted last week on spying charges and sentenced to 20 years in a
Russian prison.
(s2i/ identity-91
:ARG0 (p/ person :wiki "Edmond_Pope"
:name (n/ name :op1 "Pope"))
:ARG1 (b/ businessman
:mod (n2/ nationality :wiki "United_States"
:name (n3/ name :op1 "America")))
:ARG1-of (c/ convict-01
:ARG2 (c2/ charge-05
:ARG1 b
:ARG2 (s/ spy-02
:ARG0 b
:modpred c2))
:temporal (w/ week
:mod ( l / last))
:aspect Performance
:modstr FullAff)
:ARG1-of (s2/ sentence-01
:ARG2 (p2/ prison
:mod (c3/ country :wiki "Russia"
:name (n4/ name :op1 "Russia"))
:duration ( t / temporal-quantity
:quant 20
:unit (y/ year)))
:ARG3 s
:aspect Performance
:modstr FullAff)
:aspect State
:modstr FullAff)
( s2 / sentence
:temporal ((s2c4 :before s1t2)
(DCT :depends-on s2w)
(s2w :contained s2c)
(s2w :contained s2s2)
(s2c :after s2s)
(s2s :after s2c4))
:modal ((AUTH :FullAff s2i)
(AUTH :FullAff s2c)
(AUTH :FullAff Null Charger)
(Null Charger :FullAff s2c2)
(s2c2 :Unsp s2s)
(AUTH :FullAff s2s2))
:coref ((s1p :same-entity s2p)))
Sentence-level representation vs document-level representation
He denied any wrongdoing.
(s3d/deny-01
:Aspect Performance
:ARG0 (s3p / person
:ref-number Singular
:ref-person 3rd)
:ARG1 (s3t / thing
:ARG1-of (s3d2 / do-02
:ARG0 s3p
:ARG1-of
(s3w/wrong-02)
:aspect Performance
:modpred s3d)
:modstr FullAff))
(s3 / sentence
:temporal ((s2c :before s3d))
:modal ( (AUTH :FullAff s3p)
(s3p :FullAff s3d)
(s3d :Unsp s3d2)))
:coref ((s2p :same-entity s3p)))
Sentence-level representation vs document-level representation
UMR graph
From AMR to UMR Gysel et al. (2021)
► At the sentence level, UMR adds:
► An aspect attribute to eventive concepts
► Person and number attributes for pronouns and other nominal
expressions
► Quantification scope between quantified expressions
► At the document level UMR adds:
► Temporal dependencies in lieu of tense
► Modal dependencies in lieu of modality
► Coreference relations beyond sentence boundaries
► To make UMR cross-linguistically applicable, UMR
► defines a set of language-independent abstract concepts and
participant roles,
► uses lattices to accommodate linguistic variability
► designs specifications for complicated mappings between words and
UMR concepts.
Elements of AMR are already cross-linguistically
applicable
► Abstract concepts (e.g., person, thing, have-org-role-91):
► Abstract concepts are concepts that do not have explicit lexical support
but can be inferred from context
► Some semantic relations (e.g., :manner, :purpose, :time) are also
cross-linguistically applicable
Language-independent vs language-specific aspects of AMR
[Chinese AMR graph for the sentence below: 加入-01 (join-01) with Arg0 a person named 皮埃尔 文肯 (Pierre Vinken, age 61 岁), Arg1 董事会 (board), and time date-entity (month 11, day 29); the person is Arg0 of have-org-role-91 with Arg2 董事 (director), modified by 执行 (executive) with polarity -.]
“61 岁的 Pierre Vinken 将于 11 月 29 日加入董事会,担任
非执行董事。”
Language-independent vs language-specific aspects of AMR
[English AMR graph for the sentence below: join-01 with Arg0 a person named "Pierre Vinken" (age: temporal-quantity 61 year), Arg1 board, and time date-entity (month 11, day 29); the person is Arg0 of have-org-role-91 with Arg2 director, modified by executive with polarity -.]
“Pierre Vinken, 61 years old, will join the board as
a nonexecutive director Nov. 29.”
Abstract concepts in UMR
► Abstract concepts inherited from AMR:
► Standardization of quantities, dates etc.: have-name-91,
have-frequency-91, have-quant-91, temporal-quantity, date-entity...
► New concepts for abstract events: “non-verbal” predication.
► New concepts for abstract entities: entity types are annotated for
named entities and implicit arguments.
► Scope: scope concept to disambiguate scope ambiguity to facilitate
translation of UMR to logical expressions (see sentence-level
structure).
► Discourse relations: concepts to capture sentence-internal discourse
relations (see sentence-level structure).
Sample abstract events
Clause Type | UMR Predicate | Arg0 | Arg1 | Arg2
Thetic/presentational possession | have-91 | possessor | possessum |
Predicative possession | belong-91 | possessum | possessor |
Thetic/presentational location | exist-91 | location | theme |
Predicative location | have-location-91 | theme | location |
Property predication | have-mod-91 | theme | property |
Object predication | have-role-91 | theme | reference point | object category
Equational | identity-91 | theme | equated referent |
How do we find abstract eventive concepts?
► Languages use different strategies to express these meanings:
► Predicativized possessum: Yukaghir
pulun-die jowje-n'-i
old.man-DIM net-PROP-3SG.INTR
'The old man has a net, lit. The old man net-has.'
► UMR trains annotators to recognize the semantics of these constructions and
select the appropriate abstract predicate and its participant roles
Language-independent vs language-specific participant roles
► Core participant roles are defined in a set of frame files (valency
lexicon, see Palmer et al. 2005). The semantic roles for each
sense of a predicate are defined:
► E.g. boil-01: apply heat to water
ARG0-PAG: applier of heat
ARG1-PPT: water
► Most languages do not have frame files
► But see e.g. Hindi (Bhat et al. 2014), Chinese (Xue 2006)
► UMR defines language-independent participant roles
► Based on ValPaL data on co-expression patterns of different
micro-roles (Hartmann et al., 2013)
Language-independent roles: an incomplete list
UMR Annotation | Definition
Actor | animate entity that initiates the action
Undergoer | entity (animate or inanimate) that is affected by the action
Theme | entity (animate or inanimate) that moves from one entity to another entity, either spatially or metaphorically
Recipient | animate entity that gains possession (or at least temporary control) of another entity
Force | inanimate entity that initiates the action
Causer | animate entity that acts on another animate entity to initiate the action
Experiencer | animate entity that cognitively or sensorily experiences a stimulus
Stimulus | entity (animate or inanimate) that is experienced by an experiencer
Road Map for annotating UMRs for under-
resourced languages
► Participant Roles:
► Stage 0: General participant roles
► Stage 1: Language-specific frame files
► UMR-Writer allows for the creation of lexicon with argument
structure information during annotation
► Morphosemantic Tests:
► Stage 0: Identify one concept per word
► Stage 1: Apply more fine-grained tests to identify concepts
► Annotation Categories with Lattices:
► Stage 0: Use grammatically encoded categories (more general if
necessary)
► Stage 1: Use (overtly expressed) fine-grained categories
► Modal Dependencies:
► Stage 0: Use simplified modal annotation
► Stage 1: Fill in lexically based modal strength values
How UMR accommodates cross-linguistic variability
► Not all languages grammaticalize/overtly express the same
meaning contrasts:
► English: I (1SG) vs. you (2SG) vs. she/he (3SG)
► Sanapaná: as- (1SG) vs. an-/ap- (2/3SG)
► However, there are typological patterns in how semantic
domains get subdivided:
► A 1/3SG person category would be much more surprising than a
2/3SG one
► UMR uses lattices for abstract concepts, attribute values, and
relations to accommodate variability across languages.
► Languages with overt grammatical distinctions can choose to use
more fine-grained categories
Lattices
► Semantic categories are organized in “lattices” to
achieve cross-lingual compatibility while
accommodating variability.
► We have lattices for abstract concepts, relations,
as well as attributes
[Person lattice: person splits into non-3rd and non-1st; non-3rd covers 1st and 2nd; non-1st covers 2nd and 3rd; 1st splits into exclusive and inclusive.]
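One way to picture how a lattice supports different granularities is to store it as a parent-to-children map and ask whether a coarse category subsumes a finer one; this is a sketch (node names follow the person lattice above), not part of the UMR tooling:

# Person lattice as parent -> children; a language annotates at whatever level it grammaticalizes.
PERSON_LATTICE = {
    "person":  ["non-3rd", "non-1st"],
    "non-3rd": ["1st", "2nd"],
    "non-1st": ["2nd", "3rd"],
    "1st":     ["exclusive", "inclusive"],
}

def subsumes(coarse, fine, lattice=PERSON_LATTICE):
    """True if `coarse` dominates (or equals) `fine` in the lattice."""
    if coarse == fine:
        return True
    return any(subsumes(child, fine, lattice) for child in lattice.get(coarse, []))

print(subsumes("non-3rd", "inclusive"))  # True: 1st-person inclusive is a kind of non-3rd
print(subsumes("3rd", "2nd"))            # False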
Wordhood vs concepthood across languages
► The mapping between words and concepts in languages is
not one-to-one: UMR designs specifications for
complicated mappings between words and concepts.
► Multiple words can map to one concept (e.g., multi-word
expressions)
► One word can map to multiple concepts (morphological
complexity)
Multiple words can map to a single (discontinuous) concept
(x0/帮忙-01
:aspect Performance
:arg0 (x1/地理学)
:affectee (x2/我)
:degree (x3/大))
地理学帮 了我很大的忙。
“Geography has helped me a lot”
(w / want-01
:ARG0 (p / person
:ref-person 3rd
:ref-number Singular)
:ARG1 (g / give-up-07
:ARG0 p
:ARG1 (t / that)
:aspect Performance
:modpred w)
:ARG1-of (c / cause-01
:ARG0 (a / umr-unknown))
:aspect State)
“Why would he want to give that up?”
One word maps to multiple UMR concepts
► One word containing predicate and arguments
Sanapaná:
yavhan anmen m-e-l-yen-ek
honey alcohol NEG-2/3M-DSTR-drink-POT
"They did not drink alcohol from honey."
(e / elyama
:actor (p / person
:ref-person 3rd
:ref-number Plural)
:undergoer (a / anmen
:material (y/ yavhan))
:modstr FullNeg
:aspect Habitual)
► Argument Indexation: Identify both predicate concept and
argument concept, don’t morphologically decompose word
One word maps to multiple UMR concepts
► One word containing predicate and arguments
Arapaho:
he'ih'iixooxookbixoh'oekoohuutoono' = he'ih'ii-xoo-xook-bixoh'oekoohuutoo-no'
NARR.PST.IPFV-REDUP-through-make.hand.appear.quickly-PL
``They were sticking their hands right through them [the ghosts] to the other
side.''
(b/ bixoh'oekoohuutoo `stick hands through'
:actor (p/ person :ref-person 3rd :ref-number Plural)
:theme (h/ hands)
:undergoer (g/ [ghosts])
:aspect Endeavor
:modstr FullAff)
► Noun Incorporation (less grammaticalized): identify predicate and
argument concept
UMR-Writer
► The annotation interface we use for UMR annotation is
called UMR-Writer
► UMR-Writer includes interfaces for project management,
sentence-level and document-level annotation, as well as
lexicon (frame file) creation.
► UMR-Writer has both keyboard-based and click-based
interfaces to accommodate the annotation habits of
different annotators.
► UMR-Writer is web-based and supports UMR annotation
for a variety of languages and formats. So far it supports
Arabic, Arapaho, Chinese, English, Kukama, Navajo, and
Sanapaná. It can easily be extended to more languages.
UMR writer: Project management
UMR writer: Sentence-level interface
UMR writer: Lexicon interface
UMR Writer: Document-level interface
UMR summary
► UMR is a rooted directed node-labeled and edge-labeled
document-level graph.
► UMR is a document-level meaning representation that
builds on sentence-level meaning representations
► UMR aims to achieve semantic stability across syntactic
variations and support logical inference
► UMR is a cross-lingual meaning representation that
separates language-general aspects of meaning from those
that are language-specific
► We are annotating UMRs for English, Chinese, Arabic, Arapaho,
Kukama, Sanapaná, Navajo, and Quechua
Use cases of UMR
► Temporal reasoning
► UMR can be used to extract temporal dependencies, which
can then be used to perform temporal reasoning
► Knowledge extraction
► UMR annotates aspect, and this can be used to extract
habitual events or states, which are typical forms of knowledge
► Factuality determination
► UMR annotates modal dependencies, and this can be used
to verify the factuality of events or claims
► As an intermediate representation for dialogue systems where
more control is needed
► UMR annotates entities and coreference, which helps
track dialogue states
Planned UMR activities
• The DMR international workshops
• UMR summer schools, tentatively in 2024 and 2025.
• UMR shared tasks once we have a sufficient amount of UMR-annotated data as
well as evaluation metrics and baseline parsing models
References
Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., and Schneider, N. (2013). Abstract Meaning Representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 178–186.
Hartmann, I., Haspelmath, M., and Taylor, B., editors (2013). The Valency Patterns Leipzig online database. Max Planck Institute for Evolutionary Anthropology, Leipzig.
Van Gysel, J. E. L., Vigus, M., Chun, J., Lai, K., Moeller, S., Yao, J., O'Gorman, T. J., Cowell, A., Croft, W. B., Huang, C. R., Hajic, J., Martin, J. H., Oepen, S., Palmer, M., Pustejovsky, J., Vallejos, R., and Xue, N. (2021). Designing a uniform meaning representation for natural language processing. Künstliche Intelligenz, pages 1–18.
Vigus, M., Van Gysel, J. E., and Croft, W. (2019). A dependency structure annotation for modality. In Proceedings of the First International Workshop on Designing Meaning Representations, pages 182–198.
Yao, J., Qiu, H., Min, B., and Xue, N. (2020). Annotating temporal dependency graphs via crowdsourcing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5368–5380.
Yao, J., Qiu, H., Zhao, J., Min, B., and Xue, N. (2021). Factuality assessment as modal dependency parsing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1540–1550.
Zhang, Y. and Xue, N. (2018). Structured interpretation of temporal relations. In Proceedings of LREC 2018.
Acknowledgements
We would like to acknowledge the support of National Science Foundation:
• NSF IIS (2018): “Building a Uniform Meaning Representation for Natural Language
Processing” awarded to Brandeis (Xue, Pustejovsky), Colorado (M. Palmer, Martin, and
Cowell) and UNM (Croft).
• NSF CCRI (2022): "Building a Broad Infrastructure for Uniform Meaning
Representations", awarded to Brandeis (Xue, Pustejovsky) and Colorado (A. Palmer, M.
Palmer, Cowell, Martin), with Croft as a consultant
All views expressed in this paper are those of the authors and do not
necessarily represent the view of the National Science Foundation.
For more information:
https://umr4nlp.github.io/web/
Meaning Representations for Natural Languages Tutorial Part 3a
Modeling Meaning Representation: SRL
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
Who did what to whom, when, where and how?
(Gildea and Jurafsky, 2000; Màrquez et al., 2008)
Semantic Role Labeling (SRL)
Example: Derik broke the window with a hammer to escape.
1 Predicate Identification: identify all predicates in the sentence → broke, escape
2 Sense Disambiguation: classify the sense of each predicate → break.01 for "broke"
"break" in three lexicons:
English PropBank (break.01): A0: breaker; A1: thing broken; A2: instrument; A3: pieces; A4: arg1 broken away from what?
FrameNet (frame Breaking_apart): Pieces, Whole, Criterion, Manner, Means, Place, ...
VerbNet (class Break-45.1): Agent, Patient, Instrument, Result
3 Argument Identification: find all roles of each predicate. Argument identification can be either
- identification of spans (span SRL), or
- identification of heads (dependency SRL).
4 Argument Classification: assign a semantic label to each role → Breaker: Derik; thing broken: the window; instrument: with a hammer; Purpose: to escape
If using PropBank labels: A0 (breaker): Derik; A1 (thing broken): the window; A2 (instrument): with a hammer; AM-PRP (purpose): to escape
5 Global Optimization: apply global constraints over predicates and arguments
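Putting the five steps together, the pipeline can be pictured as a small program. The sketch below is purely illustrative: the stub functions and the hand-written lexicon are assumptions for the running example, not an actual SRL system.

# Toy sketch of the 5-step SRL pipeline (illustrative stubs, not a real system)
sentence = "Derik broke the window with a hammer to escape".split()

def identify_predicates(tokens):            # step 1
    return [i for i, t in enumerate(tokens) if t in {"broke", "escape"}]

def disambiguate_sense(tokens, pred_idx):   # step 2
    return {"broke": "break.01", "escape": "escape.01"}[tokens[pred_idx]]

def identify_arguments(tokens, pred_idx):   # step 3 (span SRL: inclusive (start, end) spans)
    return [(0, 0), (2, 3), (4, 6), (7, 8)] if tokens[pred_idx] == "broke" else []

def classify_arguments(sense, spans):       # step 4
    labels = ["A0", "A1", "A2", "AM-PRP"]
    return list(zip(labels, spans))

def global_optimization(structures):        # step 5: enforce global constraints
    return structures                       # e.g. no overlapping core arguments

for p in identify_predicates(sentence):
    sense = disambiguate_sense(sentence, p)
    args = classify_arguments(sense, identify_arguments(sentence, p))
    print(sense, global_optimization(args))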
Outline
q Early SRL approaches [< 2017]
q Typical neural SRL model components
q Performance analysis
q Syntax-aware neural SRL models
q What, when and where?
q Performance analysis
q How to incorporate syntax?
q Syntax-agnostic neural SRL models
q Performance analysis
q Do we really need syntax for SRL?
q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
q Should we rely on this pipelined approach?
q End-to-end SRL systems
q Can we jointly predict dependency and span?
q More recent approaches
q Handling low-frequency exceptions
q Incorporating semantic role label definitions
q SRL as an MRC task
q Practical SRL system evaluations
q Are we evaluating SRL systems correctly?
q Conclusion
Early SRL Approaches
Ø 2 to 3 steps to obtain complete predicate-
argument structure
Ø Predicate Identification
Ø Often not treated as a separate task, since the
existing SRL datasets provide gold predicate
locations.
Ø Predicate sense disambiguation
Ø Logistic Regression [Roth and Lapata, 2016]
Ø Argument Identification
Ø Binary classifier [Pradhan et al., 2005; Toutanova et
al., 2008]
Ø Role Labeling
Ø Labeling is performed using a classifier (SVM,
logistic regression)
Ø Argmax over roles will result in a local assignment
Ø Requires Feature Engineering
Ø Mostly Syntactic [Gildea and Jurafsky, 2002]
Ø Re-ranking
Ø Enforce linguistic and structural constraints (e.g., no
overlaps, discontinuous arguments, reference
arguments, ...)
Ø Viterbi decoding (k-best list with constraints)
[Täckström et al., 2015]
Ø Dynamic programming [Täckström et al., 2015;
Toutanova et al., 2008]
Ø Integer linear programming [Punyakanok et al.,
2008]
Ø Re-ranking [Toutanova et al., 2008; Bjö̈rkelund et
al., 2009]
170
Typical Neural SRL Components
[Figure: Input Sentence → Embedder (word embeddings: FastText, GloVe, ELMo, BERT) → Encoder (LSTMs, attention, MLP) → Classifier]
A typical neural SRL model contains three components:
Ø Classifier: assigns a semantic role label to each token in the input sentence [local + global].
Ø Encoder: encodes context information into each token's representation.
Ø Embedder: represents each input token as a continuous vector.
Neural SRL Components – Embedder
172
Ø Embedder: represents each input token as a continuous vector.
Example: "He had dared to defy nature"
Ø Could use static or dynamic embeddings
Ø Could include syntax information
Ø Usually includes a binary predicate flag:
Ø 0 → the token is not the predicate
Ø 1 → the token is the predicate
End-to-end systems do not include this flag
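A minimal sketch of this flag (illustrative tensor shapes, assuming PyTorch): the word vector of every token is concatenated with a 0/1 indicator marking the predicate currently being labeled.

import torch

def embed_tokens(word_vectors: torch.Tensor, predicate_index: int) -> torch.Tensor:
    # word_vectors: (seq_len, emb_dim) static or contextual embeddings
    flag = torch.zeros(word_vectors.size(0), 1)
    flag[predicate_index] = 1.0          # 1 -> this token is the predicate
    return torch.cat([word_vectors, flag], dim=-1)

# "He had dared to defy nature", predicate "dared" at index 2
x = embed_tokens(torch.randn(6, 100), predicate_index=2)   # shape (6, 101)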
[Figure: Input Sentence → Embedder → Encoder → Classifier; word embeddings can be static (FastText, GloVe) or dynamic (ELMo, BERT; Merchant et al., 2020)]
Neural SRL Components – Embedder
Static embeddings:
GloVe: He et al., 2017; Strubell et al., 2018
SENNA: Ouchi et al., 2018
Dynamic embeddings:
ELMo: Marcheggiani et al., 2017; Ouchi et al., 2018; Li et al., 2019; Lyu et al., 2019; Jindal et al., 2020; Li et al., 2020
BERT: Shi et al., 2019; Jindal et al., 2020; Li et al., 2020; Conia et al., 2020; Zhang et al., 2021; Tian et al., 2022
RoBERTa: Conia et al., 2020; Blloshmi et al., 2021; Fei et al., 2021; Wang et al., 2022; Zhang et al., 2022
XLNet: Zhou et al., 2020; Tian et al., 2022
173
Performance Analysis – Embedder
Best performing model for each word-embedding type (Dataset: CoNLL09 EN):
WSJ F1: Random 85.28; GloVe (Cai et al., 2018) 89.6; ELMo (Li et al., 2019) 91.4; BERT (Conia et al., 2020) 91.5 and 92.6; RoBERTa (Wang et al., 2022) 93.3
Brown F1: Random 75.09; GloVe (He et al., 2018) 79.3; ELMo (Li et al., 2019) 83.28; BERT (Conia et al., 2020) 84.67 and 85.9; RoBERTa (Wang et al., 2022) 87.2
(Random and GloVe are static embeddings; the rest are dynamic.)
Neural SRL Components – Encoder
175
Ø Encoder:
Ø Encodes context information into each token's representation.
Types of encoder
- BiLSTMs
- Attention
[Figure: the embedded tokens of "He had dared to defy nature" pass through a BiLSTM encoder with a left-to-right and a right-to-left pass]
The encoder could be
Ø stacked BiLSTMs or some variant of LSTMs
Ø an attention network
Ø and it may include syntax information
Neural SRL Components – Classifier
176
Ø Classifier
Ø Assign a semantic role label to each token
in the input sentence.
[Figure: "He had dared to defy nature" → Embedder → Encoder → Classifier, usually a feed-forward (MLP) layer followed by softmax, producing BIO tags: B-A0 O O B-A2 I-A2 I-A2]
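A minimal sketch of the encoder and classifier stages (assuming PyTorch; the hyperparameters and label inventory are illustrative, not those of any particular system): a stacked BiLSTM followed by a feed-forward layer whose softmax assigns one BIO label per token.

import torch
import torch.nn as nn

class BiLSTMTagger(nn.Module):
    def __init__(self, in_dim=101, hidden=300, layers=2, num_labels=10):
        super().__init__()
        self.encoder = nn.LSTM(in_dim, hidden, num_layers=layers,
                               bidirectional=True, batch_first=True)
        self.classifier = nn.Linear(2 * hidden, num_labels)   # FF layer before softmax

    def forward(self, embedded):             # (batch, seq_len, in_dim) from the embedder
        states, _ = self.encoder(embedded)   # (batch, seq_len, 2*hidden)
        return self.classifier(states)       # per-token label scores

model = BiLSTMTagger()
scores = model(torch.randn(1, 6, 101))                # embedder output for one sentence
print(scores.softmax(-1).argmax(-1))                  # local BIO decisions per token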
177
What and Where Syntax?
[Derick] broke the [window] with a [hammer] to [escape] .
Surface form: Derick broke the window with a hammer to escape .
Lemma form: Derick break the window with a hammer to escape .
U{X}POS: PROPN VERB DET NOUN ADP DET NOUN PART VERB PUNCT
Dependency relations (UD): "broke" is the ROOT; the remaining tokens attach with relations such as nsubj, obj, det, obl, and mark.
Everything or anything that explains the syntactic structure of the sentence.
Parsed with the UDPipe parser: http://lindat.mff.cuni.cz/services/udpipe/
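The same kinds of features can be obtained from any dependency parser; the sketch below uses spaCy rather than UDPipe (it assumes spaCy and its en_core_web_sm model are installed).

import spacy

nlp = spacy.load("en_core_web_sm")
doc = nlp("Derick broke the window with a hammer to escape.")
for tok in doc:
    # surface form, lemma, UPOS, dependency relation, syntactic head
    print(tok.text, tok.lemma_, tok.pos_, tok.dep_, tok.head.text)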
What Syntax for SRL?
Syntax at Embedder
Concatenate {POS, dependency relation,
dependency head and other syntactic information}
Where is the syntax being used?
Marcheggiani et al., 2017b; Li et al., 2018; He et al., 2018; Wang et al., 2019; Kasai et al., 2019; He et al., 2019; Li et al., 2020; Zhou et al., 2020
179
Syntax at the encoder (ENC): the dependency tree is encoded with graph networks or tree-structured LSTMs
Marcheggiani et al., 2017; Zhou et al., 2020; Marcheggiani et al., 2020; Zhang et al., 2021; Tian et al., 2022
Syntax via joint learning / multi-task learning: Strubell et al., 2018; Shi et al., 2020
Performance Analysis – Comparing syntax-aware models (Dataset: CoNLL09 EN, WSJ F1)
Marcheggiani et al., 2017/2017b: 87.7 and 88; He et al., 2018: 89.5; Li et al., 2018: 89.8; Kasai et al., 2019: 90.2; He et al., 2019: 90.86; Lyu et al., 2019: 90.99; Zhou et al., 2020: 91.27; Li et al., 2020: 91.7; Fei et al., 2021: 92.83
[The chart also marks, for each system, whether syntax enters at the embedder (Emb) or encoder (Enc), and which systems fall into the BERT/fine-tuning regime (2019 onward).]
Dataset: CoNLL09 EN – Comparing syntax-aware models
Observations
q Syntax at the encoder level provides the best performance.
q Most likely the encoder is best suited for incorporating dependency or constituent relations.
q BERT models raised the bar, with the largest improvement on the out-of-domain dataset.
q However, the improvement since 2019 is marginal.
183
A Simple and Accurate Syntax-Agnostic Neural Model for
Dependency-based Semantic Role Labeling
Marcheggiani et al., 2017
Ø Predict semantic dependency edges between
predicates and arguments.
Ø Use predicate-specific roles (such as make-A0
instead of A0), as opposed to a generic sequence
labeling task.
184
Syntax at embedder level
Diego Marcheggiani, Anton Frolov, and Ivan Titov. 2017. A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling. In Proceedings of the 21st
Conference on Computational Natural Language Learning (CoNLL 2017), pages 411–420, Vancouver, Canada. Association for Computational Linguistics.
Marcheggiani et al., 2017
Embedder / input word representation: for each token, the concatenation of
Wp → a randomly initialized word embedding
Wr → a pre-trained word embedding
PO → a randomly initialized POS embedding
Le → a randomly initialized lemma embedding
plus a predicate-specific binary feature
[Figure: each token of "He had dared to defy nature" represented by the concatenation of Wp, Wr, PO, and Le]
Syntax at embedder level
Marcheggiani et al., 2017
Encoder
[Figure: the embedder outputs for "He had dared to defy nature" feed into stacked BiLSTM encoder layers]
Several BiLSTM layers
- capturing both the left and the right context
- each BiLSTM layer takes the lower layer as input
186
Syntax at embedder level
Marcheggiani et al., 2017
Preparation for the classifier:
provide the predicate's hidden state as an additional
input to the classifier along with each token.
+ ~6% F1 on CoNLL09 EN
187
The two ways of encoding predicate information,
using predicate-specific flag at embedder level and
incorporating the predicate state in the classifier,
turn out to be complementary.
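A minimal sketch of this second mechanism (illustrative dimensions, assuming PyTorch): the classifier scores each token from its own BiLSTM state concatenated with the hidden state of the predicate.

import torch
import torch.nn as nn

states = torch.randn(6, 256)            # BiLSTM states for "He had dared to defy nature"
pred_state = states[2]                  # hidden state of the predicate "dared"
classifier = nn.Linear(2 * 256, 25)     # 25 role labels (illustrative)

pairs = torch.cat([states, pred_state.expand_as(states)], dim=-1)   # (6, 512)
role_scores = classifier(pairs)                                      # (6, 25)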
Syntax at embedder level
Marcheggiani et al., 2017
Results (Dataset: CoNLL09 EN):
WSJ F1: Björkelund et al. (2010) 86.9; Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7
Brown F1: Björkelund et al. (2010) 75.6; Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7
Syntax at embedder level
Marcheggiani et al., 2017
Takeaways
Ø Appending POS does help → approx. 1 F1 point gain
Ø Predicate-specific encoding does help → approx. 6 F1 point gain
Ø Quite effective for classifying arguments that are far from the predicate in terms of word distance.
Ø Noted: substantial improvement on the EN out-of-domain set over previous work.
[Figure: full model (embedder → BiLSTM encoder → classifier) over "He had dared to defy nature", predicting one role label per token: A0 O O O A2 O]
189
Syntax at embedder level
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Marcheggiani et al., 2017b
[Figure: embedder → BiLSTM encoder → K GCN layers → classifier over "He had dared to defy nature", predicting A0 O O O A2 O]
Ø The basic SRL components remain the same as in [Marcheggiani et al., 2017]
Ø GCN layers are inserted between the encoder and the classifier.
Ø They re-encode the encoder representations based on the syntactic structure of the sentence,
Ø modeling the syntactic dependency structure.
190
Syntax at encoder level
Diego Marcheggiani and Ivan Titov. 2017. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1506–1515, Copenhagen, Denmark. Association for Computational Linguistics.
What is a syntactic GCN?
Marcheggiani et al., 2017b
[Figure: the dependency tree of "He had dared to defy nature" (nsubj, aux, xcomp, mark, obj edges); each GCN layer passes messages along these edges, with a weight matrix per edge type, and applies a ReLU at every node.]
Ø Self loops: a node's own input representation contributes to its induced representation (W_self).
Ø Syntactic children of a node: information flows from each dependent to its head, with label-specific weights (W_nsubj, W_aux, W_xcomp, ...).
Ø Syntactic head of a node: information also flows from the head back to each dependent, using separate weights for the reverse direction (W_nsubj', W_aux', W_xcomp', ...).
Informally, a node v's representation at layer k+1 is
h_v^(k+1) = ReLU( W_self^(k) h_v^(k) + sum over dependents d of v: W_lab(d)^(k) h_d^(k) + W_lab(v)'^(k) h_head(v)^(k) )
i.e. the new state sums a self message, messages from the node's syntactic children, and a message from its syntactic head, with direction- and label-specific weight matrices.
Ø To encode information from nodes k edges away, use k GCN layers (a k-order neighborhood).
Ø This helped capture a widened syntactic neighborhood.
Syntax at encoder level
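A minimal sketch of one such layer (assuming PyTorch; the label-specific parameters and edge gates of the original paper are omitted, and dimensions are illustrative): separate weight matrices for self-loops, for messages coming from a node's head, and for messages coming from its dependents.

import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    def __init__(self, dim: int):
        super().__init__()
        self.w_self = nn.Linear(dim, dim)   # self loop
        self.w_in = nn.Linear(dim, dim)     # message from a node's head
        self.w_out = nn.Linear(dim, dim)    # message from a node's dependents

    def forward(self, h, heads):
        # h: (seq_len, dim); heads[i] = index of token i's syntactic head (-1 for the root)
        new_h = []
        for i in range(h.size(0)):
            msg = self.w_self(h[i])
            if heads[i] >= 0:
                msg = msg + self.w_in(h[heads[i]])     # from my head
            for dep, head in enumerate(heads):
                if head == i:
                    msg = msg + self.w_out(h[dep])     # from my dependents
            new_h.append(torch.relu(msg))
        return torch.stack(new_h)

# "He had dared to defy nature": dared is the root, defy depends on dared, etc.
heads = [2, 2, -1, 4, 2, 4]
out = SyntacticGCNLayer(8)(torch.randn(6, 8), heads)   # (6, 8); stack K layers for a k-hop neighborhood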
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Marcheggiani et al., 2017b
[Figure: embedder → BiLSTM encoder → K GCN layers → classifier]
Ø Claim: GCN helps capture long-range dependencies.
Ø But: encoding the k-hop neighborhood seems to hurt performance (k = 1 works best).
195
Syntax at encoder level
Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling
Marcheggiani et al., 2017b
Ø Gold dependency can significantly improve the
performance.
Dev set F1: No syntax 82.7; GCN (predicted parses) 83.3; GCN (gold parses) 86.4
196
Syntax at encoder level
Marcheggiani et al., 2017b
Dataset: CoNLL09 EN
WSJ F1: Björkelund et al. (2010) 86.9; Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7; Marcheggiani et al. (2017b, GCN) 88
Brown F1: Björkelund et al. (2010) 75.6; Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7; Marcheggiani et al. (2017b, GCN) 77.2
Syntax at encoder level
Marcheggiani et al., 2017b
Takeaways
Ø Appending POS does help → approx. 1 F1 point gain
Ø Predicate-specific encoding does help → approx. 6 F1 point gain
Ø Modeling syntactic dependencies via a syntactic GCN further improves SRL performance, but needs a high-quality syntactic parser.
Ø Noted: improvement only on the EN in-domain set over previous work;
Ø previous work, however, showed improvement on the OOD set.
198
A Unified Syntax-aware Framework for Semantic Role Labeling
Li et al., 2018
[Figure: embedder → BiLSTM encoder → syntactic layer → classifier; the syntactic layer can be a syntactic GCN (Marcheggiani et al., 2017b), a Tree-LSTM (Tai et al., 2015), or a syntax-aware LSTM (Qian et al., 2017)]
Zuchao Li, Shexia He, Jiaxun Cai, Zhuosheng Zhang, Hai Zhao, Gongshen Liu, Linlin Li, and Luo Si. 2018. A Unified Syntax-aware Framework for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2401–2411, Brussels, Belgium. Association for Computational Linguistics.
Syntax at encoder level
Ø Syntax-aware LSTM [Qian et al., 2017]: an extension of BiLSTMs that incorporates the syntactic information into each word representation by introducing an additional gate.
Ø Tree-LSTM [Tai et al., 2015]: an extension of BiLSTMs that models tree-structured topologies.
A Unified Syntax-aware Framework for Semantic Role Labeling
Li et al., 2018
Ø Adds residual connection
Ø Allows the model to skip syntactic information if
and when necessary
Ø Adds highway layers
200
Syntax at encoder level
Li et al., 2018
Dataset: CoNLL09 EN
WSJ F1: Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7; Marcheggiani et al. (2017b) 88; Li et al. (2018) 89.8
Brown F1: Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7; Marcheggiani et al. (2017b) 77.2; Li et al. (2018) 79.8
(Li et al., 2018 uses ELMo embeddings; the earlier systems use GloVe.)
A Unified Syntax-aware Framework for Semantic Role Labeling
Li et al., 2018
Takeaways
Ø Need a high-quality parser to substantially improve model performance. [90.5 → 89.5] CoNLL09 EN test
Ø Residual connections + a deep encoder + syntactic GCN improve over the syntactic GCN alone. [88.0 → 89.8] CoNLL09 EN test
Ø However, it uses ELMo word embeddings.
202
Syntax at encoder level
Syntax-Agnostic Models
He et al., 2017
He et al., 2018
Cai et al., 2018
Ouchi et al., 2018
Guan et al., 2019
LI et al., 2019
Shi et al., 2019
Conia et al., 2020
Jindal et al., 2020
Zhou et al., 2020
Conia et al., 2021
Blloshmi et al., 2021
Wang et al., 2022
Zhang et al. 2022
Syntax-Agnostic Models
204
Comparing syntax-agnostic models (Dataset: CoNLL09 EN, WSJ F1)
He et al., 2018: 88.7; Cai et al., 2018: 89.6; Li et al., 2019: 89.1; Guan et al., 2019: 89.6; Jindal et al., 2019: 90.8; Shi et al., 2019: 92.4; Zhou et al., 2020: 91.4; Conia et al., 2020: 92.6; Blloshmi et al., 2021: 92.4; Zhang et al., 2022: 92.2; Wang et al., 2022: 93.3
[The chart marks the BERT/fine-tuning regime (2019 onward).]
205
Performance Analysis
[Chart: comparing syntax-agnostic models on the CoNLL09 EN Brown (out-of-domain) test set, F1 (2018 → 2022); models from 2019 onward are in the BERT/fine-tune regime]
He et al. (2018): 78.8
Cai et al. (2018): 79.0
Li et al. (2019): 78.9
Guan et al. (2019): 79.7
Jindal et al. (2019): 85.0
Shi et al. (2019): 85.7
Zhou et al. (2020): 87.3
Conia et al. (2020): 85.9
Blloshmi et al. (2021): 85.2
Zhang et al. (2022): 86.0
Wang et al. (2022): 87.2
206
Performance Analysis
Dataset: CoNLL09 EN, comparing syntax-agnostic models
Observations
q BERT-based models raised the bar,
q with the largest improvement on the out-of-domain dataset.
q However, the improvement since 2019 is marginal.
207
He et al., 2017
He had dared to defy nature
Embedder
[Figure: each token representation wr = pre-trained word embedding ⊕ binary predicate-specific feature]
Ø Pre-trained word embeddings
Ø Use predicate flag
Luheng He, Kenton Lee, Mike Lewis, and Luke Zettlemoyer. 2017. Deep Semantic Role Labeling: What Works and What's Next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 473–483, Vancouver, Canada. Association for Computational Linguistics.
Embedder OR Input word representation
208
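As a concrete illustration of this embedder, here is a minimal sketch (with assumed names and sizes, not the paper's exact configuration) of concatenating a pre-trained word embedding with an embedding of the binary predicate flag:

```python
import torch
import torch.nn as nn

class SRLEmbedder(nn.Module):
    """Sketch of the input layer: a pre-trained word embedding concatenated with
    an embedding of a binary is-predicate flag (names/sizes are assumptions)."""
    def __init__(self, pretrained_vectors: torch.Tensor, flag_dim: int = 100):
        super().__init__()
        self.word_emb = nn.Embedding.from_pretrained(pretrained_vectors, freeze=True)
        self.flag_emb = nn.Embedding(2, flag_dim)   # 0 = not the predicate, 1 = predicate

    def forward(self, token_ids: torch.Tensor, predicate_flags: torch.Tensor) -> torch.Tensor:
        # token_ids, predicate_flags: (batch, seq_len)
        return torch.cat([self.word_emb(token_ids), self.flag_emb(predicate_flags)], dim=-1)
```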
He et al., 2017
He had dared to defy nature
Embedder
Encoder
Ø Stacked BiLSTM
Ø Highway Connections [Srivastava et al., 2015]
Ø To alleviate the vanishing gradient problem
Ø Recurrent Dropout [Gal et al., 2016]
Ø To reduce overfitting
Encoder
209
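A minimal sketch of a stacked BiLSTM encoder with highway (gated skip) connections between layers; the hyper-parameters and module names are illustrative assumptions rather than He et al.'s exact settings:

```python
import torch
import torch.nn as nn

class HighwayBiLSTMEncoder(nn.Module):
    """Sketch of a stacked BiLSTM encoder with highway connections and dropout."""
    def __init__(self, input_dim: int, hidden_dim: int, num_layers: int = 4, dropout: float = 0.1):
        super().__init__()
        self.proj = nn.Linear(input_dim, 2 * hidden_dim)   # bring inputs to the skip-path size
        self.layers = nn.ModuleList(
            nn.LSTM(2 * hidden_dim, hidden_dim, batch_first=True, bidirectional=True)
            for _ in range(num_layers))
        self.gates = nn.ModuleList(
            nn.Linear(2 * hidden_dim, 2 * hidden_dim) for _ in range(num_layers))
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        h = self.proj(x)
        for lstm, gate in zip(self.layers, self.gates):
            out, _ = lstm(self.dropout(h))
            t = torch.sigmoid(gate(out))        # transform gate
            h = t * out + (1 - t) * h           # highway (gated skip) connection
        return h
```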
He et al., 2017
He had dared to defy nature
Embedder
Encoder
Classifier
B-A0 0 0 B-A2 I-A2 I-A2
Ø Constrained A* decoding
Ø BIO constraint
Ø Unique core roles
Ø Continuation constraint
Ø Reference constraint
Ø Syntactic constraint
Classifier: an MLP layer followed by a softmax
SRL constraints were previously discussed by Punyakanok et al. (2008) and Täckström et al. (2015)
210
Local classifier
Global optimization
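For illustration, a minimal sketch of the BIO transition constraint enforced during decoding; the other constraints (unique core roles, continuation, reference, syntactic) and the full A* search are omitted, and the function names are hypothetical:

```python
def violates_bio(prev_tag: str, tag: str) -> bool:
    """BIO transition constraint: an I-X tag may only follow B-X or I-X."""
    if tag.startswith("I-"):
        role = tag[2:]
        return prev_tag not in (f"B-{role}", f"I-{role}")
    return False

def greedy_constrained_decode(scores, tags):
    """scores: list over positions of {tag: score}; tags: the tag inventory.
    Picks the best-scoring tag at each position that keeps the BIO sequence valid."""
    prev, out = "O", []
    for pos_scores in scores:
        best = max((t for t in tags if not violates_bio(prev, t)),
                   key=lambda t: pos_scores[t])
        out.append(best)
        prev = best
    return out
```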
He et al., 2017
[Chart: CoNLL05 F1, WSJ and Brown test sets]
Surdeanu et al. (2007): WSJ 77.2, Brown 67.7
Toutanova et al. (2008): WSJ 79.7, Brown 67.8
Täckström et al. (2015): WSJ 79.9, Brown 71.3
FitzGerald et al. (2015): WSJ 79.4, Brown 71.2
Zhou and Xu (2015): WSJ 82.8, Brown 69.4
He et al. (2017): WSJ 83.1, Brown 72.1
211
Dataset: CoNLL05
He et al., 2017
What is the model good at and what kinds of mistakes
does it make?
Label confusions: The model often confuses ARG2 with AM-DIR, AM-LOC and AM-MNR. These confusions can arise because ARG2 is used in many verb frames to represent semantic relations such as direction or location.
212
Attachment Mistakes: These errors are closely tied to
prepositional phrase (PP) attachment errors.
He et al., 2017
How well do LSTMs model global structural consistency, despite conditionally independent tagging decisions?
Long-range dependencies:
Performance tends to degrade, for all models, for arguments further from the predicate.
213
He et al., 2017
214
He had dared to defy nature
Embedder
Encoder
Classifier
B-A0 0 0 B-A2 I-A2 I-A2
Takeaways
Ø General label confusion between core arguments and contextual arguments is due to the ambiguous definitions in the frame files.
Ø Layers of BiLSTMs help capture long-range predicate-argument structures.
Ø The number of BIO violations decreases when a deeper model is used.
Ø Deeper BiLSTMs are better at enforcing structural consistencies, although not perfectly.
Tan, Zhixing, Mingxuan Wang, Jun Xie, Yidong Chen, and Xiaodong Shi. "Deep semantic role
labeling with self-attention." In Proceedings of the AAAI conference on artificial intelligence, vol.
32, no. 1. 2018.
Tan et al., 2018
Do we really need all these hacks? :)
Let's break recurrence and allow every position in the sentence to attend over all positions in the input sequence:
No syntax
Use a predicate-specific flag
Use multi-head self-attention
Use GloVe embeddings
He had dared to defy nature
Embedder
Encoder
Softmax Classifier
B-A0 0 0 B-A2 I-A2 I-A2
RNN/CNN/FNN
Multi-Head Self-Attention
10x
215
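A minimal sketch of such a self-attention encoder for SRL token classification, assuming a recent PyTorch; layer sizes are illustrative, and the actual model uses its own attention/feed-forward blocks rather than the stock Transformer layer used here:

```python
import torch
import torch.nn as nn

class SelfAttentionSRLEncoder(nn.Module):
    """Sketch of a self-attention SRL encoder: learned positional embeddings
    plus a stack of multi-head self-attention + feed-forward blocks."""
    def __init__(self, d_model: int = 256, nhead: int = 8, num_layers: int = 10,
                 dim_ff: int = 1024, max_len: int = 512):
        super().__init__()
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                           dim_feedforward=dim_ff, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=num_layers)
        self.pos_emb = nn.Embedding(max_len, d_model)  # positional embeddings (see next slide)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model) token + predicate-flag embeddings
        positions = torch.arange(x.size(1), device=x.device)
        return self.encoder(x + self.pos_emb(positions))
```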
Tan et al., 2018
[Chart: CoNLL05 F1, WSJ and Brown test sets]
Surdeanu et al. (2007): WSJ 77.2, Brown 67.7
Toutanova et al. (2008): WSJ 79.7, Brown 67.8
Täckström et al. (2015): WSJ 79.9, Brown 71.3
FitzGerald et al. (2015): WSJ 79.4, Brown 71.2
Zhou and Xu (2015): WSJ 82.8, Brown 69.4
He et al. (2017): WSJ 83.1, Brown 72.1
Tan et al. (2018): WSJ 84.8, Brown 74.1
216
Dataset: CoNLL05
Deep Semantic Role Labeling with Self-Attention
Tan et al., 2018
Ø Positional embeddings are necessary to gain actual performance.
[Figure: Embedder → 10× Multi-Head Self-Attention (vs. RNN/CNN/FNN) → Softmax Classifier over "He had dared to defy nature"]
[Chart: DEV-set F1 by position encoding: No PE 20, PE 79.4, Timely PE 83.1]
217
Tan et al., 2018
Takeaways
Ø Substantial improvements on CoNLL05 WSJ compared to [He et al., 2017].
Ø No need for constrained decoding (which slows things down); just use argmax decoding. 83.1 → 83.0 [token classification]
Ø As reported earlier, model depth is the key, rather than model width.
Ø FNN seems a better choice than CNN and RNN when attention is used as the encoder.
Ø Positional embeddings are necessary to gain actual performance.
He had dared to defy nature
Embedder
Encoder
Softmax Classifier
B-A0 0 0 B-A2 I-A2 I-A2
RNN/CNN/FNN
Multi-Head Self-Attention
10x
218
Simple BERT models for relation extraction and SRL
Shi et al., 2019
He had dared to defy nature
Encoder
Classifier
A0 0 0 0 A2 0
BERT
[CLS] [SEP] dared [SEP]
q Use a BERT LM to obtain predicate-aware contextualized embeddings for the encoder (see the input-construction sketch below).
q BiLSTMs form the encoder layer (1x).
q Concatenate the predicate hidden state to the hidden state of the rest of the tokens, similar to [Marcheggiani et al., 2017], and feed the result into a one-layer MLP classifier.
219
Shi, Peng, and Jimmy Lin. "Simple bert models for relation extraction and semantic role
labeling." arXiv preprint arXiv:1904.05255 (2019).
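A minimal sketch of the predicate-aware input construction, assuming the Hugging Face transformers library; the sentence and the predicate are passed as a text pair so BERT sees "[CLS] sentence [SEP] predicate [SEP]" (the model name is illustrative):

```python
from transformers import AutoTokenizer

# Sketch: sentence and predicate as a text pair for predicate-aware encoding.
tokenizer = AutoTokenizer.from_pretrained("bert-base-cased")

sentence = "He had dared to defy nature"
predicate = "dared"
encoding = tokenizer(sentence, predicate, return_tensors="pt")
print(tokenizer.convert_ids_to_tokens(encoding["input_ids"][0]))
# e.g.: ['[CLS]', 'He', 'had', 'dared', 'to', 'defy', 'nature', '[SEP]', 'dared', '[SEP]']
```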
Are high-quality contextual embeddings enough for the SRL task?
Shi et al., 2019
[Chart: CoNLL05 F1, WSJ and Brown test sets]
FitzGerald et al. (2015): WSJ 79.4, Brown 71.2
Zhou and Xu (2015): WSJ 82.8, Brown 69.4
He et al. (2017): WSJ 83.1, Brown 72.1
Tan et al. (2018), ELMo: WSJ 84.8, Brown 74.1
Strubell et al. (2018), ELMo: WSJ 86.0, Brown 76.5
Shi et al. (2019), BERT-base: WSJ 88.1, Brown 80.9
Shi et al. (2019), BERT-large: WSJ 88.8, Brown 82.1
220
Are high-quality contextual embeddings enough for the SRL task?
Dataset: CoNLL05
(annotated gains from BERT: +2.1 on WSJ, +4.4 on Brown)
Shi et al., 2019
He had dared to defy nature
Encoder
Classifier
A0 0 0 0 A2 0
BERT
[CLS] [SEP] dared [SEP]
Ø Are powerful contextualized embeddings all we need for SRL?
Ø Do we no longer need syntax to perform better on SRL?
Ø Do we know whether BERT embeddings encode syntax implicitly?
Ø Yes [Jawahar et al., 2019]
Ø Explicit syntax information has been shown to further improve SoTA SRL performance.
221
Are high-quality contextual embeddings enough for the SRL task?
Comparison: syntax-agnostic (SG) vs. syntax-aware (SA) models
[Chart: CoNLL09 EN WSJ test F1 (2017 → 2022); each bar is marked SG or SA; models from 2019 onward are in the BERT/fine-tune regime]
Marcheggiani et al. (2017): 88.0
Cai et al. (2018): 89.6
Li et al. (2018): 89.8
Shi et al. (2019): 92.4
Lyu et al. (2019): 90.99
Conia et al. (2020): 92.6
Li et al. (2020): 91.7
Wang et al. (2022): 93.3
Fei et al. (2021): 92.83
Dataset: CoNLL09 EN
222
Dataset: CoNLL09 EN
Comparison: syntax-agnostic (SG) vs. syntax-aware (SA) models
[Chart: CoNLL09 EN Brown (out-of-domain) test F1 (2017 → 2022); each bar is marked SG or SA; models from 2019 onward are in the BERT/fine-tune regime]
Marcheggiani et al. (2017b): 77.7
Cai et al. (2018): 79.0
Li et al. (2018): 79.8
Shi et al. (2019): 85.7
Kasai et al. (2019): 80.8
Zhou et al. (2020): 87.3
Zhou et al. (2020): 86.84
Wang et al. (2022): 87.2
223
224
Outline
q Early SRL approaches
q Typical neural SRL model components
q Performance analysis
q Syntax-aware neural SRL models
q What, When and Where?
q Performance analysis
q How to incorporate Syntax?
q Syntax-agnostic neural SRL models
q Performance Analysis
q Do we really need syntax for SRL?
q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
q Should we rely on this pipelined approach?
q End-to-end SRL systems
q Can we jointly predict dependency and span?
q More recent approaches
q Handling low-frequency exceptions
q Incorporate semantic role label definitions
q SRL as MRC task
q Practical SRL system evaluations
q Are we evaluating SRL systems correctly?
q Conclusion
He et al., 2018
He had dared to defy nature
Embedder
Encoder
Classifier
q Jointly predicts all predicates, argument spans, and the relations between them.
q Builds upon the coreference resolution model of [Lee et al., 2017].
q Embedder:
q No predicate location is specified; instead, word embeddings are concatenated with the output of a charCNN.
q Each edge is identified by independently predicting which role, if any, holds between every possible pair of text spans, while using aggressive beam pruning for efficiency. The final graph is simply the union of predicted SRL roles (edges) and their associated text spans (nodes).
Encoder
Representation
225
Syntax-agnostic end-to-end SRL system
Luheng He, Kenton Lee, Omer Levy, and Luke Zettlemoyer. 2018. Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 364–369, Melbourne, Australia. Association for Computational Linguistics.
He et al., 2018
Task: predict a set of labeled predicate-argument relations
Y ⊆ P × A × L
where Y is the set of all predicate-argument relations, P is the set of all tokens (candidate predicates), A is the set of all possible spans (candidate arguments), and L is the set of all SRL labels.
From the encoder representation, the model predicts P(y_{p,a} = l | X).
226
Syntax-agnostic end-to-end SRL system
He et al., 2018
[Figure: example sentence "He had dared to defy nature" with predicate "dared"]
P(y_{p,a} = l | X)
Predicate representation and span (argument) representation:
- The predicate representation is simply the BiLSTM output at position index p.
- The argument representation contains: the end points from the BiLSTM output, a soft head word, and an embedded span-width feature.
227
Syntax-agnostic end-to-end SRL system
He et al., 2018
Jointly predicting predicates and arguments in neural SRL
[Figure: example sentence "He had dared to defy nature" with predicate "dared"]
Encoder representation → P(y_{p,a} = l | X)
Unary scores: compute a unary score Φ_p(p) for each candidate predicate (from its representation g(p)) and Φ_a(a) for each candidate argument (from its representation g(a)).
228
Syntax-agnostic end-to-end SRL system
He et al., 2018
Jointly predicting predicates and arguments in neural SRL
[Figure: example sentence "He had dared to defy nature" with predicate "dared"]
Encoder representation → P(y_{p,a} = l | X); unary scores Φ_p(p), Φ_a(a)
Relation score: compute a relation score Φ_rel(a, p, l) between each candidate predicate and argument.
Number of possible relations: O(n³ · |L|)
229
Syntax-agnostic end-to-end SRL system
Syntax-agnostic end-to-end SRL system
He et al., 2018
BEAM pruning:
two beams, B_a and B_p, store the candidate arguments and predicates, respectively. The candidates in each beam are ranked by their unary scores (Φ_a or Φ_p).
Number of possible relations: O(n³ · |L|) → O(n² · |L|)
[Figure: example sentence "He had dared to defy nature" with predicate "dared"; unary scores Φ_p, Φ_a and relation score Φ_rel feed P(y_{p,a} = l | X)]
The token "He" is less likely to be a predicate based on its unary score and is removed from forming potential relations (see the scoring sketch below).
230
Syntax-agnostic end-to-end SRL system
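A minimal sketch of the unary scoring, beam pruning, and combined relation scoring described above; `rel_scorer` and the tensor shapes are illustrative assumptions, not the paper's exact architecture:

```python
import torch

def prune(unary_scores: torch.Tensor, beam_size: int) -> torch.Tensor:
    """Keep only the top-scoring candidates: a sketch of the beams B_p and B_a
    ranked by the unary scores Phi_p / Phi_a."""
    k = min(beam_size, unary_scores.numel())
    return torch.topk(unary_scores, k=k).indices

def combined_label_scores(g_p, g_a, phi_p, phi_a, rel_scorer):
    """Sketch of the combined score for every surviving (predicate, argument) pair:
    the two unary scores plus a label-wise relation score, normalized over labels
    to give P(y_{p,a} = l | X). rel_scorer is an assumed module mapping the
    concatenated pair representation to |L| scores."""
    P, A = g_p.size(0), g_a.size(0)
    pairs = torch.cat([g_p.unsqueeze(1).expand(P, A, -1),
                       g_a.unsqueeze(0).expand(P, A, -1)], dim=-1)
    scores = rel_scorer(pairs)                                   # (P, A, |L|) relation scores
    scores = scores + phi_p[:, None, None] + phi_a[None, :, None]
    return scores.log_softmax(dim=-1)
```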
He et al., 2018
Jointly predicting predicates and arguments in neural SRL
[Figure: example sentence "He had dared to defy nature" with predicate "dared"]
Encoder representation → P(y_{p,a} = l | X)
Combined score: the unary scores Φ_p(p), Φ_a(a) and the relation score Φ_rel(a, p, l) are summed, and a softmax over labels gives P(y_{p,a} = l | X).
231
Syntax-agnostic end-to-end SRL system
Classifier
He et al., 2018
He had dared to defy nature
Embedder
Encoder
Classifier
- An end-to-end Neural SRL Model
[Chart: argument classification F1 on CoNLL05, WSJ: 87.4 (gold predicates) vs. 86.0 (end-to-end); Brown: 80.4 (gold predicates) vs. 76.1 (end-to-end)]
232
Syntax-agnostic end-to-end SRL system
He et al., 2018
He had dared to defy nature
Embedder
Encoder
Classifier
Takeaways
Ø First end-to-end neural SRL model.
Ø Strong performance against models with gold predicates.
Ø Empirically, the model does better at long-range dependencies and agreement with syntactic boundaries, but is weaker at global consistency, due to its strong independence assumption.
233
Syntax-agnostic end-to-end SRL system
Strubell et al., 2018
He had dared to defy nature
Encoder
Embedder
Classifier
B-A0 0 0 0 0 B-A1
Multi-Head Self-Attention + FF
Syntactically-informed Self-Attention + FF
Multi-Head Self-Attention + FF
Predicate + POS Tagging
FF Bilinear FF
Predicate Role
Dare B-A0 0 0 B-A2 I-A2 I-A2
defy
Linguistically-Informed Self-Attention for Semantic Role Labeling
Syntax strikes back
- A multi-task learning framework with stacked multi-head self-attention
- Jointly predicts POS tags and predicates
- Performs parsing
- Attends to the syntactic parse parent while assigning semantic role labels.
234
Syntax-aware end-to-end SRL system
Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, and Andrew McCallum. 2018. Linguistically-Informed Self-Attention for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 5027–5038, Brussels, Belgium. Association for Computational Linguistics.
Strubell et al., 2018
He had dared to defy nature
Encoder
Embedder
Classifier
B-A0 0 0 0 0 B-A1
Multi-Head Self-Attention + FF
Syntactically-informed Self-Attention + FF
Multi-Head Self-Attention + FF
Predicate + POS Tagging
FF Bilinear FF
Predicate Role
Dare B-A0 0 0 B-A2 I-A2 I-A2
defy
q Replace one attention head with the deep bi-affine model of Dozat and Manning (2017) (see the sketch below).
q Use a bi-affine operator U to obtain the attention weights for that single head.
q Encode both the dependency and the dependency label.
235
Syntax-aware end-to-end SRL system
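A minimal sketch of a syntactically-informed attention head whose weights come from a bi-affine scorer, in the spirit of LISA; module names and dimensions are illustrative assumptions:

```python
import torch
import torch.nn as nn

class BiaffineSyntacticAttention(nn.Module):
    """Sketch of one attention head whose weights are produced by a deep
    bi-affine scorer, so each token attends to its predicted syntactic head."""
    def __init__(self, d_model: int, d_attn: int = 256):
        super().__init__()
        self.dep_mlp = nn.Sequential(nn.Linear(d_model, d_attn), nn.ReLU())
        self.head_mlp = nn.Sequential(nn.Linear(d_model, d_attn), nn.ReLU())
        self.U = nn.Parameter(torch.randn(d_attn, d_attn))  # bi-affine operator

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, seq_len, d_model)
        dep = self.dep_mlp(h)                            # dependent representations
        head = self.head_mlp(h)                          # candidate-head representations
        scores = dep @ self.U @ head.transpose(1, 2)     # (batch, seq, seq) head scores
        attn = scores.softmax(dim=-1)                    # attention over syntactic heads
        return attn @ h                                  # head-attended values for this head
```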
Strubell et al., 2018
Encoder
Embedder
Classifier
B-A0 0 0 0 0 B-A1
Multi-Head Self-Attention + FF
Syntactically-informed Self-Attention + FF
Multi-Head Self-Attention + FF
Predicate + POS Tagging
FF Bilinear FF
Predicate Role
Dare B-A0 0 0 B-A2 I-A2 I-A2
defy
236
Syntax-aware end-to-end SRL system
He had dared to defy nature
Syntactic Head
Semantic Heads
Strubell et al., 2018
Encoder
Embedder
Classifier
B-A0 0 0 0 0 B-A1
Multi-Head Self-Attention + FF
Syntactically-informed Self-Attention + FF
Multi-Head Self-Attention + FF
Predicate + POS Tagging
FF Bilinear FF
Predicate Role
Dare B-A0 0 0 B-A2 I-A2 I-A2
defy
Linguistically-Informed Self-Attention for Semantic Role Labeling
Predicate-specific representation
Argument-specific representation
Bilinear transformation operator
237
Syntax-aware end-to-end SRL system
He had dared to defy nature
Strubell et al., 2018
[Chart: CoNLL05 F1, WSJ and Brown test sets]
Täckström et al. (2015): WSJ 79.9, Brown 71.3
FitzGerald et al. (2015): WSJ 79.4, Brown 71.2
Zhou and Xu (2015): WSJ 82.8, Brown 69.4
He et al. (2017): WSJ 83.1, Brown 72.1
He et al. (2018): WSJ 83.9, Brown 73.7
Tan et al. (2018): WSJ 84.8, Brown 74.1
Strubell et al. (2018): WSJ 86.0, Brown 76.5
238
Syntax-aware end-to-end SRL system
Dataset: CoNLL05
Strubell et al., 2018
He had dared to defy nature
Encoder
Embedder
Classifier
B-A0 0 0 0 0 B-A1
Multi-Head Self-Attention + FF
Syntactically-informed Self-Attention + FF
Multi-Head Self-Attention + FF
Predicate + POS Tagging
FF Bilinear FF
Predicate Role
Dare B-A0 0 0 B-A2 I-A2 I-A2
defy
Takeaways
Ø Shows a strong performance gain over other methods, with and without gold predicate locations.
Ø Incorporating parse information is helpful for resolving span-boundary errors (merged spans, split spans, etc.).
239
Syntax-aware end-to-end SRL system
Zhou et al., 2019
q Semantics is usually considered a higher layer of linguistics than syntax, so most previous studies focus on how the latter helps the former.
q Semantics benefits from syntax, but syntax may also benefit from semantics.
q Joint training (multi-task learning) of the following 5 tasks:
q Semantics
q Dependency
q Span
q Predicate
q Syntax
q Constituent
q Dependency
He had dared to defy nature
Encoder
Embedder
SRL
Classifier
Multi-Head Self-Attention + FF
FF Bilinear FF
Predicate Role
Dependency Head score
Constituent Span score
240
Syntax-aware end-to-end SRL system
Junru Zhou, Zuchao Li, and Hai Zhao. 2020. Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4438–4449, Online. Association for Computational Linguistics.
Zhou et al., 2019
Table 2 from the paper: Joint learning analysis on CoNLL-
2005, CoNLL-2009, and PTB dev sets
Interesting Insights
SEMANTICS
q Joint training of dependency and span SRL helps improve both. Further strengthened by Fei et al. (2021).
q A further improvement in both is observed when they are combined with syntactic constituents.
q Not so when combined with syntactic dependencies.
SYNTAX
q Though marginal, semantics does improve syntax.
241
Can we jointly predict dependency and span?
Hao Fei, Shengqiong Wu, Yafeng Ren, Fei Li, and Donghong Ji. 2021. Better Combine Them Together! Integrating Syntactic Constituency and Dependency Representations for Semantic Role Labeling. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021, pages 549–559, Online. Association for Computational Linguistics.
Jindal et al., 2022
242
Ishan Jindal, Alexandre Rademaker, Michał Ulewicz, Ha Linh, Huyen Nguyen, Khoi-Nguyen Tran, Huaiyu Zhu, and Yunyao Li. 2022. Universal Proposition Bank 2.0. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1700–1711, Marseille, France. European Language Resources Association.
SPADE: SPAn and DEpendency SRL model
He had dared to defy nature
A0 0 0 0 A2 0
BERT
[CLS] [SEP] dared [SEP]
B-A0 0 0 0 B-A2 I-A2
A multi-task learning framework
- Trains simultaneously on argument heads and argument spans.
Enclosing constraints
Observations:
q A slight drop in argument-head performance.
q A gain in argument-span performance.
These observations are consistent with Zhou et al., 2019.
Can we jointly predict dependency and span?
Zhou et al., 2019
243
[Chart: CoNLL05 F1, WSJ and Brown test sets]
Täckström et al. (2015): WSJ 79.9, Brown 71.3
FitzGerald et al. (2015): WSJ 79.4, Brown 71.2
Zhou and Xu (2015): WSJ 82.8, Brown 69.4
He et al. (2017): WSJ 83.1, Brown 72.1
Tan et al. (2018), ELMo: WSJ 84.8, Brown 74.1
Strubell et al. (2018), ELMo: WSJ 86.0, Brown 76.5
Zhou et al. (2019), ELMo: WSJ 87.8, Brown 80.2
Zhou et al. (2019), BERT-large: WSJ 88.7, Brown 81.2
Shi et al. (2019), BERT-base: WSJ 88.1, Brown 80.9
Shi et al. (2019), BERT-large: WSJ 88.8, Brown 82.1
Parsing All: Syntax and Semantics, Dependencies and Spans
Can we jointly predict dependency and span?
Zhou et al., 2019
Parsing All: Syntax and Semantics, Dependencies and Spans
244
[Chart: CoNLL09 F1, WSJ and Brown test sets]
FitzGerald et al. (2015): WSJ 87.3, Brown 75.2
Roth and Lapata (2016): WSJ 87.7, Brown 76.1
Marcheggiani et al. (2017): WSJ 87.7, Brown 77.7
Marcheggiani et al. (2017): WSJ 88.0, Brown 77.2
Li et al. (2018), ELMo: WSJ 89.8, Brown 79.8
Zhou et al. (2019), ELMo: WSJ 89.8, Brown 84.4
Zhou et al. (2019), BERT-large: WSJ 91.1, Brown 85.3
Shi et al. (2019), BERT-base: WSJ 92.0, Brown 85.1
Shi et al. (2019), BERT-large: WSJ 92.4, Brown 85.7
Can we jointly predict dependency and span?
245
Outline
q Early SRL approaches
q Typical neural SRL model components
q Performance analysis
q Syntax-aware neural SRL models
q What, When and Where?
q Performance analysis
q How to incorporate Syntax?
q Syntax-agnostic neural SRL models
q Performance Analysis
q Do we really need syntax for SRL?
q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
q Should we rely on this pipelined approach?
q End-to-end SRL systems
q Can we jointly predict dependency and span?
q More recent approaches
q Learn low-frequency exceptions
q Incorporate semantic role label definitions
q SRL as MRC task
q Practical SRL system evaluations
q Are we evaluating SRL systems correctly?
q Conclusion
Low-frequency Exceptions
246
Argument labeling task:
q Arguments that are syntactically realized as passive subjects are typically labeled Arg1.
q However, there exist numerous low-frequency exceptions to this rule.
q Passive subjects of certain frames (such as the frame TELL.01) are most commonly labeled Arg2.
Observations based on the CoNLL09 training data [Akbik and Li, 2015]:
q 57% of all subjects are labeled A0
q 33% of all subjects are labeled A1
q 74% of active subjects are labeled A0
q 86% of passive subjects are labeled A1
q 100% of passive subjects of SELL.01 are labeled A1
q 88% of passive subjects of TELL.01 are labeled A2
Low-frequency Exceptions
247
[Akbik and Li, 2016] Alan Akbik and Yunyao Li. 2016. K-SRL: Instance-based Learning for Semantic Role Labeling. In Proceedings of COLING 2016, the 26th International
Conference on Computational Linguistics: Technical Papers, pages 599–608, Osaka, Japan. The COLING 2016 Organizing Committee.
[Guan et al., 2019] Chaoyu Guan, Yuhao Cheng, and Hai Zhao. 2019. Semantic Role Labeling with Associated Memory Network. In Proceedings of the 2019 Conference of the
North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3361–3371, Minneapolis,
Minnesota. Association for Computational Linguistics.
[Jindal et al., 2020] Jindal, Ishan, Ranit Aharonov, Siddhartha Brahma, Huaiyu Zhu, and Yunyao Li. "Improved Semantic Role Labeling using Parameterized Neighborhood
Memory Adaptation." arXiv preprint arXiv:2011.14459 (2020).
[Figure: instance-based view, a query instance (?) is labeled from nearby labeled training instances (A1, A2, A3) in representation space]
Instance-based learning
q Extrapolates predictions from the most similar instances in the training data [Akbik and Li, 2016; Jindal et al., 2020] (see the kNN sketch below).
q Generally, staged approaches: a base model is trained first to obtain the word/span representations [Guan et al., 2019; Jindal et al., 2020].
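A minimal sketch of the instance-based idea: label a query argument by majority vote over its nearest stored training instances; the similarity measure and function signature are illustrative assumptions, not the exact K-SRL or neighborhood-memory models:

```python
from collections import Counter
import numpy as np

def knn_role(query_vec: np.ndarray, memory_vecs: np.ndarray,
             memory_labels: list, k: int = 5) -> str:
    """Label a query argument representation by majority vote over its k most
    similar stored training instances (cosine similarity); a sketch only."""
    q = query_vec / np.linalg.norm(query_vec)
    m = memory_vecs / np.linalg.norm(memory_vecs, axis=1, keepdims=True)
    sims = m @ q                                   # cosine similarity to every stored instance
    top = np.argsort(-sims)[:k]                    # indices of the k nearest neighbours
    return Counter(memory_labels[i] for i in top).most_common(1)[0][0]
```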
Understanding BERT-based models better for better SRL performance.
Understand BERT for SRL
248
Ilia Kuznetsov and Iryna Gurevych. 2020. A matter of framing: The impact of linguistic formalism on probing results. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 171–182, Online. Association for Computational Linguistics.
BERT “rediscovers” the classical NLP pipeline [Tenney et al., 2019]
q Lower layers tend to encode mostly lexical-level information, while
q upper layers seem to favor sentence-level information.
Understanding BERT-based models better for better SRL performance.
Understand BERT for SRL
He had dared to defy nature
Encoder
Classifier
A0 0 0 0 A2 0
BERT
[CLS] [SEP] dared [SEP]
= f( , , , , ….. )
249
Simone Conia and Roberto Navigli. 2022. Probing for Predicate Argument Structures in Pretrained Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4622–4632, Dublin, Ireland. Association for Computational Linguistics.
Static: last-layer activations used as static embeddings
Top-4: concatenate the activations of the top 4 layers
W-avg: parametric (weighted) sum of all layer activations
(see the pooling sketch below)
Understanding BERT-based models better for better SRL performance.
Understand BERT for SRL
He had dared to defy nature
Encoder
Classifier
A0 0 0 0 A2 0
BERT
[CLS] [SEP] dared [SEP]
= f( , , , , ….. )
250
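A minimal sketch of the three layer-pooling strategies listed above (static, top-4, w-avg), assuming the tuple of per-layer hidden states returned by a Hugging Face model with output_hidden_states=True; the class and its interface are illustrative:

```python
import torch
import torch.nn as nn

class LayerPooling(nn.Module):
    """Sketch of pooling BERT layer activations: 'static' (last layer),
    'top-4' (concatenate the top four layers), or 'w-avg' (a learned,
    softmax-normalized weighted sum over all layers)."""
    def __init__(self, num_layers: int, mode: str = "w-avg"):
        super().__init__()
        self.mode = mode
        self.weights = nn.Parameter(torch.zeros(num_layers))

    def forward(self, hidden_states):
        layers = torch.stack(hidden_states, dim=0)       # (layers, batch, seq, dim)
        if self.mode == "static":
            return layers[-1]
        if self.mode == "top-4":
            return torch.cat(tuple(layers[-4:]), dim=-1)
        w = self.weights.softmax(dim=0)                   # w-avg: parametric sum of all layers
        return (w[:, None, None, None] * layers).sum(dim=0)
```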
q Predicate senses and argument structures are encoded at
different layers in LMs
q Verbal and nominal predicate-argument structures are
represented differently across the layers of a LM;
q an SRL system benefits from treating them separately.
Interesting Insights
Label-aware NLP
• Model is given the definitions of labels, and
can effectively leverage them in many
tasks
§ Sentiment/entailment: (Schick and Schutze,
2021)
§ Event extraction: (Du and Cardie, 2020; Hongming
et al., 2021)
§ Word sense disambiguation: (Kumar et al., 2019)
• Strong even with few-shot
• Many more, but NOT for SRL (why?)
§ Semantic roles are specific to predicates
§ There are many predicates, thus many roles;
very sparse
§ 8500 Predicate senses in CoNLL09 data
§ ~8500*3 argument labels ~ 25K
251
Incorporating Role Definitions
Label-aware NLP for SRL
252
Incorporating Role Definitions
Li Zhang, Ishan Jindal, and Yunyao Li. 2022. Label Definitions Improve Semantic Role Labeling. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5613–5620, Seattle, United States. Association for Computational Linguistics.
[Zhang et al., 2022]
q Make n+1 copies of the sentence, where n is the number of core arguments defined for the frame:
q n copies, one per core argument
q +1 copy for the contextual arguments
q Append the label definition at the end of each sentence copy.
q Convert the K-class classification problem into a binary classification problem,
q i.e., determine whether a token is the "worker" or not in this example.
(see the construction sketch below)
253
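A minimal sketch of this reformulation: build one sentence copy per core-role definition (plus one for contextual arguments) and reduce K-way role classification to per-copy binary decisions; the helper name and the example definitions are hypothetical:

```python
def make_label_definition_inputs(sentence: str, core_role_definitions: dict) -> list:
    """Build n+1 sentence copies, each with one role definition appended;
    tokens are then tagged with a binary yes/no label per copy instead of
    a K-way role label. Real definitions come from the frame files."""
    copies = []
    for role, definition in core_role_definitions.items():
        copies.append((f"{sentence} [SEP] {role}: {definition}", role))
    copies.append((f"{sentence} [SEP] contextual argument", "ARGM"))
    return copies

# Example usage (hypothetical definitions for work.01):
inputs = make_label_definition_inputs(
    "He worked at the factory",
    {"ARG0": "worker", "ARG1": "job, project"})
for text, role in inputs:
    print(role, "->", text)
```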
Incorporating Role Definitions
Low-Frequency Predicates.
- SRL suffers from the long-tail phenomenon.
- LD outperforms base by up to 4.4 argument F1 for unseen
predicates, notably helping with low-frequency predicates.
Few-Shot Learning.
- LD outperforms the base model by up to 3.2 F1 in- and out-of-domain.
- The performance gap diminishes as the training size approaches 100,000.
Distant Domain Adaptation
- Evaluate models trained on CoNLL09 (news articles) on the Biology PropBank.
- The LD model achieves 55.5 argument F1, outperforming the base model, which achieves 54.6.
Interesting Insights
SRL as an extractive machine reading comprehension (MRC) task [Wang et al., 2022]
SRL as MRC Task
254
Nan Wang, Jiwei Li, Yuxian Meng, Xiaofei Sun, Han Qiu, Ziyao Wang, Guoyin Wang, and Jun He. 2022. An MRC Framework for Semantic Role Labeling. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2188–2198, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
255
Outline
q Early SRL approaches
q Typical neural SRL model components
q Performance analysis
q Syntax-aware neural SRL models
q What, When and Where?
q Performance analysis
q How to incorporate Syntax?
q Syntax-agnostic neural SRL models
q Performance Analysis
q Do we really need syntax for SRL?
q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
q Should we rely on this pipelined approach?
q End-to-end SRL systems
q Can we jointly predict dependency and span?
q More recent approaches
q Handling low-frequency exceptions
q Incorporate semantic role label definitions
q SRL as MRC task
q Practical SRL system evaluations
q Are we evaluating SRL systems correctly?
q Conclusion
256
SRL Evaluation – Issues with Evaluation Metrics
Two official evaluation scripts:
q Evaluation script from the CoNLL05 shared task (eval05.pl)
q Evaluation script from the CoNLL09 shared task (eval09.pl)
The SRL pipeline (with error propagation from one stage to the next): predicate identification → predicate sense disambiguation → argument identification → argument classification
[Table: eval05.pl: span only, assumes gold predicate locations; eval09.pl: head only, assumes gold predicate locations; in both, all tasks are evaluated independently]
257
SRL Evaluation – Predicate Error Types
Example:
[Table: predicate sense evaluation over four example error cases; eval05.pl does not evaluate predicate sense; eval09.pl recall: 1/1, 0/1, 0/1, 1/1; precision: 1/1, 0/1, 0/1, 1/1]
258
SRL Evaluation – Error Examples
Real errors from a SoTA SRL model.
All of these predicate senses are marked correct by the CoNLL09 evaluation script.
259
SRL Evaluation – Argument Error Types
Example:
[Table: argument evaluation (recall and precision for eval05.pl and eval09.pl) over four example error cases; values as charted: 3/3 3/3 3/3 3/3; 3/3 3/3 3/3 3/3; 3/3 0/3 3/3 0/3; 3/3 0/3 3/3 0/3]
260
An Improved Evaluation Scheme
Summary of issues with the existing SRL evaluation metrics:
q proper evaluation of the predicate sense disambiguation task;
q argument label evaluation in conjunction with the predicate sense;
q proper evaluation of discontinuous arguments and reference arguments; and
q unified evaluation of argument heads and spans.
Jindal, Ishan, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, Hiroshi Kanayama, Marina Danilevsky, and Yunyao Li. "PriMeSRL-Eval: A Practical Quality
Metric for Semantic Role Labeling Systems Evaluation." arXiv preprint arXiv:2210.06408 (2022).
PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation [Jindal et al., 2022]
261
With PriMeSRL-Eval we made the following observations:
q Current evaluation scripts exaggerate SRL model quality.
q A clear drop of ~7 F1 points on the OOD set is observed.
q The relative ranking of the SoTA SRL models changes.
An Improved Evaluation Scheme
262
Conclusion
q Syntax matters
q Yes, at least for argument spans.
q Not for dependency SRL.
q Eventually, you need syntax to compute spans.
q SRL can help syntax
q Contextualized embeddings
q Carry the major chunk of the performance gain in SRL.
q Fine-tuning LMs for SRL further raised the bar.
q End-to-End Systems
q More practical, but computationally expensive.
q The predicate and argument tasks are shown to improve each other.
q SRL in a few-shot setting
q Probe SRL information from large LMs.
q Given the sparsity of the SRL label space, finding the right prompt is quite challenging.
q Multilingual SRL
q Multilingual SRL resources
q Universal PropBanks for SRL
q A long way to go
q Datasets
q Datasets without predicate sense annotations
q Ethical issues
q SRL Model Re-Evaluations
Observations / Opportunities
263
References
1. Merchant, A., Rahimtoroghi, E., Pavlick, E., & Tenney, I. (2020, November). What Happens To BERT Embeddings During Fine-tuning?.
In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (pp. 33-44).
2. Tan, Z., Wang, M., Xie, J., Chen, Y., & Shi, X. (2018, April). Deep semantic role labeling with self-attention. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).
3. Marcheggiani, D., Frolov, A., & Titov, I. (2017, August). A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017) (pp. 411-420).
4. A Unified Syntax-aware Framework for Semantic Role Labeling
5. Tian, Y., Qin, H., Xia, F., & Song, Y. (2022, June). Syntax-driven Approach for Semantic Role Labeling. In Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 7129-7139).
6. Zhang, Z., Strubell, E., & Hovy, E. (2021, August). Comparing span extraction methods for semantic role labeling. In Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021) (pp. 67-77).
7. Fei, H., Wu, S., Ren, Y., Li, F., & Ji, D. (2021, August). Better combine them together! Integrating syntactic constituency and dependency representations for semantic role labeling. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 549-559).
8. Wang, N., Li, J., Meng, Y., Sun, X., & He, J. (2021). An mrc framework for semantic role labeling. arXiv preprint arXiv:2109.06660.
9. Blloshmi, R., Conia, S., Tripodi, R., & Navigli, R. (2021). Generating Senses and RoLes: An End-to-End Model for Dependency- and Span-based Semantic Role Labeling. In IJCAI (pp. 3786-3793).
10. Zhang, L., Jindal, I., & Li, Y. (2022, July). Label Definitions Improve Semantic Role Labeling. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 5613-5620).
11. Cai, J., He, S., Li, Z., & Zhao, H. (2018, August). A full end-to-end semantic role labeler, syntactic-agnostic over syntactic-aware?. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 2753-2765).
12. He, S., Li, Z., & Zhao, H. (2019, November). Syntax-aware Multilingual Semantic Role Labeling. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 5350-5359).
13. Conia, S., Bacciu, A., & Navigli, R. (2021, June). Unifying cross-lingual Semantic Role Labeling with heterogeneous linguistic resources. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 338-351).
264
References
14. Conia, S., & Navigli, R. (2020, December). Bridging the gap in multilingual semantic role labeling: a language-agnostic approach. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 1396-1410).
15. Kasai, J., Friedman, D., Frank, R., Radev, D., & Rambow, O. (2019, June). Syntax-aware Neural Semantic Role Labeling with Supertags. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 701-709).
16. He, L., Lee, K., Levy, O., & Zettlemoyer, L. (2018, July). Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 364-369).
17. Shi, T., Malioutov, I., & İrsoy, O. (2020, November). Semantic Role Labeling as Syntactic Dependency Parsing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 7551-7571).
18. Zhou, J., Li, Z., & Zhao, H. (2020, November). Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 4438-4449).
19. Zhou, J., Li, Z., & Zhao, H. (2020, November). Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 4438-4449).
20. Wang, Y., Johnson, M., Wan, S., Sun, Y., & Wang, W. (2019, July). How to best use syntax in semantic role labelling. In Annual Meeting of the Association for Computational Linguistics (57th: 2019) (pp. 5338-5343). Association for Computational Linguistics.
21. He, S., Li, Z., Zhao, H., & Bai, H. (2018, July). Syntax for semantic role labeling, to be, or not to be. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2061-2071).
22. Marcheggiani, D., & Titov, I. (2020, November). Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 3915-3928).
23. Marcheggiani, D., & Titov, I. (2017, September). Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1506-1515).
24. Marcheggiani, D., & Titov, I. (2017, September). Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1506-1515).
25. Li, Z., Zhao, H., Wang, R., & Parnow, K. (2020, November). High-order Semantic Role Labeling. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 1134-1151).
265
References
26. Lyu, C., Cohen, S. B., & Titov, I. (2019, November). Semantic Role Labeling with Iterative Structure Refinement. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 1071-1082).
27. Li, Z., He, S., Zhao, H., Zhang, Y., Zhang, Z., Zhou, X., & Zhou, X. (2019, July). Dependency or span, end-to-end uniform semantic role labeling. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 6730-6737).
28. Ouchi, H., Shindo, H., & Matsumoto, Y. (2018). A Span Selection Model for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 1630-1642).
29. Strubell, E., Verga, P., Andor, D., Weiss, D., & McCallum, A. (2018). Linguistically-Informed Self-Attention for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 5027-5038).
30. He, L., Lee, K., Lewis, M., & Zettlemoyer, L. (2017, July). Deep semantic role labeling: What works and what's next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 473-483).
31. FitzGerald, N., Täckström, O., Ganchev, K., & Das, D. (2015, September). Semantic role labeling with neural network factors. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 960-970).
32. Guan, C., Cheng, Y., & Zhao, H. (2019, June). Semantic Role Labeling with Associated Memory Network. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 3361-3371).
33. Jindal, I., Aharonov, R., Brahma, S., Zhu, H., & Li, Y. (2020). Improved Semantic Role Labeling using Parameterized Neighborhood Memory Adaptation. arXiv preprint arXiv:2011.14459.
Meaning Representations for Natural Languages Tutorial Part 3b
Modeling Meaning Representation: AMR
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
267
❏ Linearize the AMR graphs
❏ AMR parsing as sequence-to-sequence modeling
❏ Can use any seq2seq method and pre-training method (BART, etc)
Konstas et al. Neural AMR: Sequence-to-Sequence Models for Parsing and Generation. ACL 2017.
inter alia.
Seq2seq AMR Parsing
268
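For concreteness, a minimal sketch of linearizing an AMR graph, assuming the third-party penman package is installed; the variable-stripping regex illustrates the (lossy) preprocessing discussed on the following slides and is not any system's exact pipeline:

```python
import re
import penman  # third-party AMR (de)serialization library (assumed installed)

amr = """
(w / want-01
   :ARG0 (b / boy)
   :ARG1 (g / go-02
            :ARG0 b))
"""

graph = penman.decode(amr)
linearized = " ".join(penman.encode(graph).split())   # one-line PENMAN string
print(linearized)
# e.g.: (w / want-01 :ARG0 (b / boy) :ARG1 (g / go-02 :ARG0 b))

# A common (lossy) preprocessing step: strip variable names so the target
# vocabulary only contains concepts, roles and parentheses. Note the leftover
# re-entrant "b": variables must be restored with post-processing heuristics
# or kept as special tokens (see the next slides).
no_vars = re.sub(r"\b[a-z][0-9]*\s*/\s*", "", linearized)
print(no_vars)
# e.g.: (want-01 :ARG0 (boy) :ARG1 (go-02 :ARG0 b))
```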
❏ Linearization order of the AMR graph usually matters
AMR Linearization
Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
269
❏ Linearization order of the AMR graph usually matters
AMR Linearization
van Noord & Bos. Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations. Computational Linguistics in the Netherlands Journal. 2017.
270
❏ Remove variables and add them back in with post-processing heuristics
Removing Variables
271
van Noord & Bos. Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations. Computational Linguistics in the Netherlands Journal. 2017.
❏ Rather than removing variables (lossy) use special tokens
Removing Variables
272
Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
Pre-Processing for Transition and Graph-Based: Recategorization
273
Figure from Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021
❏ Collapsing verbalized concepts
❏ Anonymizing named entities (recovered with alignments)
❏ Removing sense nodes (predict most
frequent sense)
❏ Remove wiki links (predict with wikifier)
Zhang et al. 2019. AMR Parsing as Sequence-to-Graph Transduction. ACL 2019
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
274
❏ Construct the graph using a sequence of actions that build the graph
❏ Use a classifier to predict the next action
❏ Inspired by transition-based dependency parsing
Wang et al. A Transition-based Algorithm for AMR Parsing. NAACL 2015, inter alia.
Transition-Based AMR Parsing
275
Transition-Based AMR Parsing
276
Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021
Transition-Based AMR Parsing
277
Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021
Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
Simplified Transition Actions
❏ Simplified system: Transition system has 6 actions
Transition-Based AMR Parsing
278
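For illustration, a toy transition system that builds a small graph from a sequence of actions; the action set here (SHIFT/PRED/ARC) is a simplified stand-in, not the exact six actions of Zhou et al. (2021):

```python
# Minimal sketch of transition-based AMR parsing machinery: apply actions
# left-to-right over the tokens, creating nodes (concepts) and labeled edges.
def run_transitions(tokens, actions):
    nodes, edges, cursor = [], [], 0           # partial graph + token cursor
    for kind, arg in actions:
        if kind == "SHIFT":                    # move to the next token
            cursor += 1
        elif kind == "PRED":                   # create a node for the current token
            nodes.append((cursor, arg))
        elif kind == "ARC":                    # labeled edge between node indices
            label, src, tgt = arg
            edges.append((src, label, tgt))
    return nodes, edges

tokens = ["The", "boy", "wants", "to", "go"]
actions = [("SHIFT", None), ("PRED", "boy"), ("SHIFT", None),
           ("PRED", "want-01"), ("ARC", (":ARG0", 1, 0)),
           ("SHIFT", None), ("SHIFT", None), ("PRED", "go-02"),
           ("ARC", (":ARG1", 1, 2)), ("ARC", (":ARG0", 2, 0))]
print(run_transitions(tokens, actions))
```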
Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
Transition-Based AMR Parsing
279
Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021
Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
Simplified Transition Actions
Transition-Based AMR Parsing
280
Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
282
❏ Graph-based methods use the graph structure when predicting
❏ Inspired by graph-based methods for dependency parsing
❏ Can be done incrementally or using a structured prediction method
Flanigan et al. A Discriminative Graph-Based Parser for the Abstract Meaning Representation. ACL 2014.
inter alia.
Graph-Based AMR Parsing
283
Graph-Based AMR Parsing
284
Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
Graph-Based AMR Parsing
285
Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
Graph-Based AMR Parsing
286
Cai & Lam 2020. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
Graph-Based AMR Parsing
287
Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
288
❏ Fine-grained evaluation can be used to examine strengths and weaknesses
Evaluation
289
Damonte et al. An Incremental
Parser for Abstract Meaning
Representation. EACL 2017
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
291
AMR Generation: Overview
292
Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
❏ Linearize the AMR graphs (e.g., as sketched below)
❏ AMR generation as sequence-to-sequence modeling
❏ Can use any seq2seq method and pre-training method (BART, etc.)
AMR Generation: Seq2seq
293
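A minimal sketch of one common input preparation, using the third-party penman library (assumed installed): variable names are dropped and the graph is flattened into a token sequence that a seq2seq encoder can consume. Exact bracketing and special tokens vary across the cited systems.

import penman  # pip install penman (assumed available)

def linearize_for_generation(amr_str):
    """Flatten a PENMAN graph into a variable-free token sequence for a seq2seq encoder."""
    graph = penman.decode(amr_str)
    concept_of = {var: concept for var, _, concept in graph.instances()}
    tokens = []
    for source, role, target in graph.triples:
        if role == ":instance":
            continue
        tokens += [concept_of.get(source, source), role, concept_of.get(target, target)]
    return " ".join(tokens)

amr = '(d / discuss-01 :ARG0 (p / person :name (n / name :op1 "Powell")) :ARG1 (r / return-02))'
print(linearize_for_generation(amr))
# discuss-01 :ARG0 person person :name name name :op1 "Powell" discuss-01 :ARG1 return-02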
AMR Generation: Graph-Based
294
Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
AMR Generation: Graph-Based
295
Hao et al. Heterogeneous Graph Transformer for Graph-to-Sequence
AMR Generation: Graph-Based
296
Damonte & Cohen. Structural
Neural Encoders for AMR-to-
text Generation. NAACL 2019
AMR Generation: Comparison
297
Hao et al. A Survey: Neural
Networks for AMR-to-Text. 2022
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
298
❏ Gold data is human-labeled data
❏ Silver data is produced by running an existing parser on unlabeled data
❏ Adding silver data to the training data can improve performance (see the sketch below)
❏ Gigaword is usually used as the unlabeled source for silver data (more on this later)
Silver Data (Semi-supervised learning)
299
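A minimal sketch of the recipe, with a hypothetical parser object standing in for any trained AMR parser; the function and attribute names here are assumptions, not a real API.

def build_silver_corpus(parser, unlabeled_sentences, gold_pairs):
    """Parse unlabeled text with an existing model and mix the result into the training set."""
    silver_pairs = [(sent, parser.parse(sent)) for sent in unlabeled_sentences]
    # Common practice: keep gold data prominent, e.g. by up-sampling it,
    # so the noisier silver annotations do not swamp the human-labeled ones.
    return gold_pairs * 3 + silver_pairs

# training_data = build_silver_corpus(parser, gigaword_sentences, amr3_training_pairs)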
❏ Silver data sometimes helps parsing, usually on out-of-domain data
Silver Data for AMR Parsing
300
In-domain
Out-of-domain
Bevilacqua et al. One SPRING to Rule Them
Both: Symmetric AMR Semantic Parsing and
Generation without a Complex Pipeline.
AAAI 2021
❏ Silver data always helps generation on the official test sets, but be careful! The results are misleading!
❏ Silver data hurts on out-of-domain data
Silver Data for AMR Generation
301
In-domain (official test sets)
Out-of-domain
Bevilacqua et al. One SPRING to Rule Them
Both: Symmetric AMR Semantic Parsing and
Generation without a Complex Pipeline.
AAAI 2021
Baseline +Silver data
❏ Silver data always helps generation, but be careful! Results are misleading!
Silver Data for AMR Generation
302
Du & Flanigan. Avoiding Overlap in Data
Augmentation for AMR-to-Text Generation.
ACL 2020
❏ Recommendation: exclude parts of Gigaword that may overlap with the test data (see the filtering sketch below)
Silver Data for AMR Generation
303
Du & Flanigan. Avoiding Overlap in Data
Augmentation for AMR-to-Text Generation.
ACL 2020
https://github.com/jlab-nlp/amr-clean
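A minimal sketch of one way to implement such filtering: drop any silver sentence that shares a long word n-gram with the evaluation data. The n-gram length and the exact matching rule are illustrative choices, not necessarily those of the cited work or repository.

def ngrams(text, n=8):
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}

def filter_overlapping(silver_sentences, eval_sentences, n=8):
    """Drop silver sentences sharing any length-n word n-gram with the evaluation sets."""
    eval_grams = set().union(*(ngrams(s, n) for s in eval_sentences)) if eval_sentences else set()
    return [s for s in silver_sentences if not (ngrams(s, n) & eval_grams)]

eval_set = ["Edmund Pope tasted freedom today for the first time in more than eight months ."]
silver = ["Edmund Pope tasted freedom today for the first time in more than eight months .",
          "Powell met Zhu Rongji on Thursday ."]
print(filter_overlapping(silver, eval_set))  # only the second sentence survives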
❏ AMR Parsing
❏ Sequence-to-sequence methods
❏ Pre/post processing
❏ Transition-based methods
❏ Graph-based methods
❏ Evaluation
❏ AMR Generation:
❏ Sequence-to-sequence methods
❏ Graph-based methods
❏ Silver data
❏ Pre-training
Outline
304
❏ Pre-training the encoder, such as BERT, helps a lot
❏ Pre-training the decoder as well, as in BART, helps even more
❏ Structural pre-training helps as well (a fine-tuning sketch follows below)
AMR Parsing: Pretraining
305
Bai et al. Graph Pre-training for AMR Parsing and Generation.
ACL 2022
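A minimal sketch of fine-tuning a pre-trained encoder-decoder for AMR parsing with Hugging Face transformers, treating the task as sentence in, linearized graph out. The checkpoint name, toy data, and hyperparameters are placeholders; real systems add the graph-specific tokens, larger batches, and constrained decoding discussed above.

import torch
from transformers import BartTokenizer, BartForConditionalGeneration

tokenizer = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

sentence = "The cat ate the cheese."
linearized_amr = "( eat-01 :ARG0 ( cat ) :ARG1 ( cheese ) )"  # toy target

inputs = tokenizer(sentence, return_tensors="pt")
labels = tokenizer(text_target=linearized_amr, return_tensors="pt").input_ids

# One gradient step on a single (sentence, graph) pair; a real setup batches and loops.
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)
loss = model(**inputs, labels=labels).loss
loss.backward()
optimizer.step()

# Inference: generate a linearized graph, then post-process it back into an AMR.
pred_ids = model.generate(**inputs, max_length=64, num_beams=4)
print(tokenizer.decode(pred_ids[0], skip_special_tokens=True))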
❏ Structural pre-training helps as well
Structural Pretraining
306
Bai et al. Graph Pre-training for AMR Parsing and Generation.
ACL 2022
❏ Structural pre-training helps as well
Structural Pretraining
307
Bai et al. Graph Pre-training for AMR Parsing and Generation.
ACL 2022
AMR Generation: Pretraining
308
Hao et al. A Survey: Neural
Networks for AMR-to-Text. 2022
❏ Pre-training helps a lot
❏ Pre-training the encoder
and decoder helps the
most (BART)
AMR Generation: Pretraining
309
Hao et al. A Survey: Neural
Networks for AMR-to-Text. 2022
❏ Pre-training helps a lot
❏ Pre-training the encoder
and decoder helps the
most (BART)
❏ There's a lot more work we didn't have time to cover
❏ See the AMR bibliography
Lots More Work
310
https://nert-nlp.github.io/AMR-Bibliography/
Meaning Representations for Natural Languages Tutorial Part 4
Applying Meaning Representations
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
Information Extraction
• OneIE [Lin et al., ACL 2020] framework extracts the information graph from a given sentence in four
steps: encoding, identification, classification, and decoding
Moving from Seq-to-Graph to Graph-to-Graph
Slide credit: Heng Ji
● AMR converts an input sentence into a directed acyclic graph
structure with fine-grained node and edge type labels
● AMR parsing shares inherent similarities with information networks
(IE output)
● Similar node and edge semantics
● Similar graph topology
● Semantic graphs can better capture non-local context in a sentence
Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE.
NAACL 2021. Slide credit: Heng Ji
Key Idea:
Exploit the similarity between AMR and IE for joint information
extraction
AMR-IE
Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE.
NAACL 2021. Slide credit: Heng Ji
AMR Guided Graph Encoding: Using an Edge-Conditioned GAT
Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE.
NAACL 2021. Slide credit: Heng Ji
● Map each candidate entity and event to AMR nodes.
● Update entity and event representations using an edge-conditioned GAT to incorporate information from
AMR neighbors (a minimal sketch of such a layer follows below).
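A minimal PyTorch sketch of an edge-conditioned graph attention layer, where attention scores and messages are conditioned on an edge-type embedding. This illustrates the general mechanism, not the exact AMR-IE architecture; all dimensions and names are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class EdgeConditionedGATLayer(nn.Module):
    """Single-head graph attention where each edge type contributes its own embedding."""
    def __init__(self, dim, num_edge_types):
        super().__init__()
        self.edge_emb = nn.Embedding(num_edge_types, dim)
        self.attn = nn.Linear(3 * dim, 1)       # scores [target ; source ; edge]
        self.msg = nn.Linear(2 * dim, dim)       # message from [source ; edge]

    def forward(self, node_states, edges, edge_types):
        # edges: LongTensor [E, 2] of (source, target); edge_types: LongTensor [E]
        src, tgt = edges[:, 0], edges[:, 1]
        e = self.edge_emb(edge_types)
        scores = self.attn(torch.cat([node_states[tgt], node_states[src], e], dim=-1)).squeeze(-1)

        out = node_states.clone()
        for t in range(node_states.size(0)):
            mask = tgt == t
            if mask.any():
                alpha = F.softmax(scores[mask], dim=0)          # attention over incoming neighbors
                msgs = self.msg(torch.cat([node_states[src[mask]], e[mask]], dim=-1))
                out[t] = node_states[t] + (alpha.unsqueeze(-1) * msgs).sum(0)
        return out

layer = EdgeConditionedGATLayer(dim=16, num_edge_types=4)
nodes = torch.randn(3, 16)                 # e.g. entity/event candidates mapped to AMR nodes
edges = torch.tensor([[0, 1], [2, 1]])     # two AMR edges pointing into node 1
print(layer(nodes, edges, torch.tensor([0, 2])).shape)   # torch.Size([3, 16])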
AMR Guided Graph Decoding: Ordered decoding guided by AMR
Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE.
NAACL 2021. Slide credit: Heng Ji
● Beam-search-based decoding as in OneIE (Lin et al. 2020).
● The decoding order of candidate nodes is determined by the hierarchy
in AMR, in a top-down manner.
● E.g., the correct decoding order in the example graph follows the AMR hierarchy top-down.
Examples of how AMR graphs help
Slide credit: Heng Ji
Leverage Meaning Representation
for High-quality Rule-based IE
Llio Humphreys et al. Populating Legal
Ontologies using Semantic Role Labeling
extraction rules
Machine Translation
● MT methods using Transformers can make semantic errors:
● Repeating words with the same meaning
● Hallucinating information not contained in the source
Machine Translation
These errors are mostly due to failing to accurately capture
the semantics of the source in some cases.
Goal: inject semantic information into machine translation
Machine Translation
Song et al. Semantic Neural Machine Translation
using AMR. TACL 2019.
Machine Translation
Nguyen et al. Improving Neural
Machine Translation with AMR
Semantic Graphs. Hindawi
Mathematical Problems in
Engineering 2021.
Machine Translation
Nguyen et al. Improving Neural
Machine Translation with AMR
Semantic Graphs. Hindawi
Mathematical Problems in
Engineering 2021.
Machine Translation
Li & Flanigan. Improving Neural Machine
Translation with the Abstract Meaning
Representation by Combining Graph and
Sequence Transformers. DLG4NLP 2022.
Machine Translation
Li & Flanigan. Improving Neural Machine
Translation with the Abstract Meaning
Representation by Combining Graph and
Sequence Transformers. DLG4NLP 2022.
Machine Translation
Li & Flanigan. Improving Neural Machine
Translation with the Abstract Meaning
Representation by Combining Graph and
Sequence Transformers. DLG4NLP 2022.
Summarization
Liao et al. Abstract Meaning Representation for Multi-Document
Summarization. ICCL 2018
Summarization
Liao et al. Abstract Meaning Representation for Multi-Document
Summarization. ICCL 2018
Natural Language Inference
Does premise P justify an inference to hypothesis H?
P: The judge by the actor stopped the banker.
H: The banker stopped the actor.
Natural Language Inference
Does premise P justify an inference to hypothesis H?
P: The judge by the actor stopped the banker.
H: The banker stopped the actor.
Shallow heuristics
due to dataset biases
(e.g., lexical overlap)
lead to low generalization
on out-of-distribution
evaluation sets.
The HANS challenge dataset [McCoy et al., 2019] showed that NLI models trained on the MNLI or SNLI
datasets are easily fooled by heuristics when the input sentence pairs have high lexical similarity.
Semantic information (SRL)
○ Improves the semantic knowledge of NLI models
○ Less prone to dataset biases
How Can Meaning Representation Help?
P: The judge by the actor stopped the banker.
H: The banker stopped the actor.
[Figure: SRL annotations of P and H, with VERB, ARG0, and ARG1 labels showing that the two sentences have different predicate-argument structures despite their high lexical overlap]
SemBERT: Semantics-Aware BERT
Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou,
Xiang Zhou: Semantics-Aware BERT for Language Understanding. AAAI 2020
Incorporates SRL information into BERT representations (see the sketch below).
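A minimal sketch of the general idea of fusing SRL tags with contextual token representations: an embedding per SRL label is concatenated to each token vector and then projected back down. This illustrates the mechanism only, not the exact SemBERT architecture, and all sizes and label inventories are assumptions.

import torch
import torch.nn as nn

class SrlFusion(nn.Module):
    """Concatenate an SRL-label embedding to each token representation, then project."""
    def __init__(self, hidden_dim=768, num_srl_labels=10, srl_dim=32):
        super().__init__()
        self.srl_emb = nn.Embedding(num_srl_labels, srl_dim)
        self.proj = nn.Linear(hidden_dim + srl_dim, hidden_dim)

    def forward(self, token_states, srl_label_ids):
        # token_states: [batch, seq, hidden] from an encoder such as BERT
        # srl_label_ids: [batch, seq] one role label per token (e.g. O / ARG0 / ARG1 / VERB)
        fused = torch.cat([token_states, self.srl_emb(srl_label_ids)], dim=-1)
        return self.proj(fused)

fusion = SrlFusion()
bert_out = torch.randn(1, 6, 768)              # stand-in for BERT output
srl_ids = torch.tensor([[0, 1, 1, 3, 0, 2]])   # toy label ids for the 6 tokens
print(fusion(bert_out, srl_ids).shape)         # torch.Size([1, 6, 768])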
SemBERT: Semantics-Aware BERT
Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou,
Xiang Zhou: Semantics-Aware BERT for Language Understanding. AAAI 2020
Results on the GLUE benchmark
Works particularly well for smaller
Joint Training with SRL Improves NLI
Generalization
Main idea: Improve sentence understanding
(hence out-of-distribution generalization) with
joint learning of explicit semantics
Cemil Cengiz, Deniz Yuret. Joint Training with Semantic Role
Labeling for Better Generalization in Natural Language
Inference. Rep4NLP 2020
Joint Training with SRL Improves NLI
Generalization
Main idea: Improve sentence understanding
(hence out-of-distribution generalization) with
joint learning of explicit semantics
Cemil Cengiz, Deniz Yuret. Joint Training with Semantic Role
Labeling for Better Generalization in Natural Language
Inference. Rep4NLP 2020
Is Semantic-Aware BERT More Linguistically Aware?
Ling Liu, Ishan Jindal, Yunyao Li. Is Semantic-aware BERT more Linguistically Aware?
Infuse semantic knowledge via predicate-
wise concatenation with BERT
Is Semantic-Aware BERT More Linguistically Aware?
Ling Liu, Ishan Jindal, Yunyao Li. Is Semantic-aware BERT more Linguistically Aware?
Performance on HANS non-entailment
examples by models fine-tuned on
SNLI. Examples in black and normal
font are where BERT made wrong
predictions and LingBERT made correct
predictions. Examples in blue and italics
are where none of the three models
made the correct prediction. The last
three columns are the accuracy in % on
the non-entailment examples by BERT,
SemBERT, and LingBERT respectively.
Better differentiates lexical
similarity from world
knowledge
Fails to help with subsequence
/constituent heuristics
NSQA: AMR for Neural-Symbolic Question
Answering over Knowledge Graph
Pavan Kapanipathi et al.∗ Leveraging Abstract Meaning
AMR Graph → Query Graph
Acer nigrum is used in making what?
AMR Graph
Query Graph
Count the awards received by the ones
who fought the battle of france?
What cities are located on the sides
of mediterranean sea?
Pavan Kapanipathi et al.∗ Leveraging Abstract Meaning
AMR-Based Question Decomposition
Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for
AMR-Based Question Decomposition
Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for
AMR-Based Question Decomposition
Better accuracy of the final answer and quality of the sub-
questions
Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for
AMR-Based Question Decomposition
Outperforms existing question-decomposition-based multi-hop QA approaches.
Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for
Cross-Document Multi-hop Reading Comprehension
Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
Heterogeneous SRL Graph
Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
HotpotQA Results
The SRL graph improves the completeness of the graph network over an NER graph
Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
Dialog Modeling via AMR Transformation & Augmentation
Mitchell Abrams, Claire Bonial, L. Donatelli. Graph-to-graph meaning representation transformations for
human-robot dialogue. SCIL 2020.
Claire Bonial et al. Augmenting Abstract Meaning Representation for Human-Robot Dialogue. ACL-DMR.
Dialog Modeling via AMR Transformation &
Augmentation
Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
Dialog Modeling via AMR Transformation &
Augmentation
Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
(a) Using AMR to enrich text representation. (b, c) Using AMR
independently.
Dialog Modeling via AMR Transformation &
Augmentation
Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
Semantic knowledge in formal AMR is
helpful for dialogue modeling
Manually added relations are useful in
dialog relation extraction and dialog
generation
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Element
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Element
Expanded SRL as
Semantic NLP Primitives
Provided by SystemT
[ACL '10, NAACL '18]
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Element
Expanded SRL as
Semantic NLP Primitives
Business transact. verbs
in future tense
with positive polarity
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Case Study - Watson Discover Content Intelligence
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Explainability + Tooling → Better Root Cause Analysis
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Yannis Katsis and Christine T. Wolf. ModelLens: An Interactive System to Support the Model Improvement Practices of
Model Stability with Increasing Complexity
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Effectiveness of Feedback Incorporation
A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
Human & Machine Co-Creation
Prithvi Sen et al. HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop. ACL 2019
Prithvi Sen et al. Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence
User Study: Human & Machine Co-Creation
Prithvi Sen et al. HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop. ACL 2019
Prithvi Sen et al. Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence
User study
– 4 NLP engineers with 1-2 years of experience
– 2 NLP experts with 10+ years of experience
Key Takeaways
● Explanation of learned rules: the visualization tool is
very effective
● Reduction in human labor: a co-created model
built within 1.5 person-hours outperforms a black-
box sentence classifier
● Lower requirement on human expertise: the co-
created model is on par with the model created by
Super-Experts
Summary: Value of Meaning Representation
Work Out-of-box
Deeper understanding of text
Overcome Low-resource
Challenges
Robustness against linguistic
variants & complexity
Better model
generalization
Explainability &
Interpretability
Information Extraction ✔ ✔ ✔
Text Classification ✔ ✔ ✔ ✔
Natural Language Inference ✔
Question Answering ✔ ✔
Dialog ✔
Machine Translation ✔ ✔
SRL
AMR
Meaning Representations for Natural Languages Tutorial Part 5
Open Questions and Future Work
Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
• Do we need to think of an opposition between symbolic AMRs and "deep
learning"?
• Advantages of AMR: being explainable and controllable
• AMR can sometimes provide rich semantics that help generalization
• Open questions regarding the impact of AMR error propagation (how
much are we getting hurt by being discrete + symbolic?)
• How much do pretrained AMR graph representations change this (if
they become generally useful in applications)?
Open Questions - Symbolic AMRs vs LLMs?
• One simple story of the main advantages of AMR over direct end-to-end
language models: controllability and explainability.
• We have some case studies and applications, and believe that AMRs are
clearly more explainable and controllable than black-box LLMs.
• But: not a lot of work connecting symbolic meaning representations to
the current explainability literature
• No current work (that I'm aware of)
Open Questions - Explainability
• There can be advantages to approaching low-resource tasks with AMR
graphs
• Start with a lot of rich semantic distinctions!
• Open questions: how to transfer quickly to new tasks
• In theory, it's hoped AMR/UMR can be cross-linguistically robust for low-
resource languages as well
• Especially as we extend to more languages / expand UMR
• But: related tasks (domain adaptation, projecting AMRs into new
languages, etc.) remain largely unexplored.
Open Questions - Low-resource tasks and languages
• Multi-sentence AMR: only now starting to be modeled and trained on
• Huge range of IE and QA tasks going beyond the sentence that AMR might
help (e.g., multi-hop reasoning)
• AMR "entity linking" is often used, but: limited exploration of AMR with
structured knowledge sources / KBs
• Promise for rich retrieval / understanding of "entire documents"
(especially with temporal + modal information if UMR) but currently
under-explored
Open Questions - World Knowledge & Discourse
Meaning Representations for Natural Languages: Design, Models and Applications

  • 1. Tutorial Meaning Representations for Natural Languages: Design, Models and Applications Jeffrey Flanigan Tim O’Gorman Ishan Jindal Yunyao Li Nianwen Xue Martha Palmar
  • 2. Meaning Representations for Natural Languages Tutorial Part 1 Introduction Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 3. What should be in a Meaning Representation?
  • 4. Mo#va#on: From Sentences to Proposi/ons Who did what to whom, when, where and how? Powell met Zhu Rongji Proposition: meet(Powell, Zhu Rongji) Powell met with Zhu Rongji Powell and Zhu Rongji met Powell and Zhu Rongji had a meeting . . . When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane. meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane)) debate consult join wrestle battle meet(Somebody1, Somebody2)
  • 5. Capturing seman.c roles SUBJ SUBJ SUBJ • Tim broke [ the laser pointer.] • [ The windows] were broken by the hurricane. • [ The vase] broke into pieces when it toppled over.
  • 6. Capturing seman.c roles • Tim broke [ the laser pointer.] • [ The windows] were broken by the hurricane. • [ The vase] broke into pieces when it toppled over. Breake r Thing broken Thing broken
  • 7. A proposition as a tree Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss return
  • 8. discuss.01 - talk about Aliases: discussion (n.), discuss (v.), have_discussion (l.) • Roles: ARG0: discussant ARG1: topic ARG2: conversation partner, if explicit Valency Lexicon PropBank Frame File - 11,436 framesets Kingsbury & Palmer, LREC 2002 – Pradhan et. al., *SEM 2022,
  • 9. discuss.01 - talk about Aliases: discussion (n.), discuss (v.), have_discussion (l.) • Roles: ARG0: discussant ARG1: topic ARG2: conversation partner, if explicit Valency Lexicon PropBank Frame File - 11,436 framesets Kingsbury & Palmer, LREC 2002 – Pradhan et. al., *SEM 2022,
  • 10. discuss.01 ARG0: Zhu and Powell ARG1: return.01 Arg1: of the spy plane Zhu and Powell discussed the return of the spy plane
  • 11. discuss.01 ARG0: Zhu and Powell ARG1: return.01 Arg1: of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell discussed the return of the spy plane
  • 12. A proposi,on as a tree Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1
  • 13. A proposition as a tree Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1 Arg0
  • 14. A proposition as a tree Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1 Arg0 ?? (Zhu)
  • 15. A proposi,on as a tree Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1
  • 16. Proposi.on Bank • Hand annotated predicate argument structures for Penn Treebank • Standoff XML, points directly to syntac=c parse tree nodes, 1M words • Doubly annotated and adjudicated • (Kingsbury & Palmer, 2002, Palmer, Gildea, Xue, 2004, …). • Based on PropBank Frame Files • English valency lexicon: ~4K verb entries (2004) → ~11K v,n, adj, prep (2022) • Core arguments – Arg0-Arg5 • ArgM’s for modifiers and adjuncts • Mappings to VerbNet and FrameNet • Annotated PropBank Corpora • English 2M+, Chinese 1M+, Arabic .5M, Hindi/Urdu .6K, Korean, …
  • 17. An Abstract Meaning Representation as a graph Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) Zhu and Powell of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1
  • 18. An Abstract Meaning Representation as a graph Zhu and Powell discussed the return of the spy plane discuss([Zhu, Powell], return(X, plane)) and spy plane discuss.01 return.02 Arg0 Arg1 Arg1 AMR drops: Determiners Function words adds: NE tags. Wiki links
  • 19. An Abstract Meaning Representation as a graph Zhu and Powell discussed the return of the spy plane discuss([Zhu, Powell], return(X, plane)) and plane discuss.01 return.02 Arg0 Arg1 Arg1 AMR drops: Determiners Function words adds: NE tags. Wiki links Noun Phrase Structure spy.01 Arg0-of
  • 20. An Abstract Meaning Representa,on as a graph Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) and plane discuss.01 return.02 Arg0 Arg1 Arg1 Arg0 ?? (Zhu) AMR drops: Determiners Function words adds: NE tags. Wiki links Noun Phrase Structure Implicit Arguments Coreference Links spy.01 Arg0-of
  • 21. An Abstract Meaning Representa,on as a graph Zhu and Powell discussed the return of the spy plane discuss([Powell, Zhu], return(X, plane)) and of the spy plane discuss.01 return.02 Arg0 Arg1 Arg1 Arg0 ?? (Zhu) AMR drops: Determiners Function words adds: NE tags. Wiki links Noun Phrase Structure Implicit Arguments Coreference Links spy.01 Arg0-of
  • 22. • Stay tuned AMRs – Tim O’Gorman
  • 23. Mo#va#on: From Sentences to Proposi/ons Who did what to whom, when, where and how? Powell met Zhu Rongji Proposition: meet(Powell, Zhu Rongji) Powell met with Zhu Rongji Powell and Zhu Rongji met Powell and Zhu Rongji had a meeting . . . When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane. meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane)) debate consult join wrestle battle meet(Somebody1, Somebody2)
  • 24. Motivation: From Sentences to Propositions Who did what to whom, when, where and how? Powell met Zhu Rongji Proposition: meet(Powell, Zhu Rongji) Powell met with Zhu Rongji Powell and Zhu Rongji met Powell and Zhu Rongji had a meeting . . . When Powell met Zhu Rongji on Thursday they discussed the return of the spy plane. meet(Powell, Zhu) discuss([Powell, Zhu], return(X, plane)) debate consult join wrestle battle meet(Somebody1, Somebody2) ENGLISH!
  • 25. Mo#va#on: From Sentences to Proposi/ons Who did what to whom, when, where and how? Powell reunió Zhu Rongji Proposition: reunir(Powell, Zhu Rongji) Powell reunió con Zhu Rongji Powell y Zhu Rongji reunió Powell y Zhu Rongji tuvo una reunión . . . Powell se reunió con Zhu Rongji el jueves y hablaron sobre el regreso del avión espía. reunir(Powell, Zhu) hablar[Powell, Zhu], regresar(X, avión)) зустрів ‫ا‬ ‫ﻟ‬ ‫ﺘ‬ ‫ﻘ‬ ‫ﻰ‬ 遇⻅ मुलाकात की พบ meet(Somebody1, Somebody2) Thai Hindi Chinese Ukrainian Arabic Other Languages? Spanish
  • 26. • Several languages already have valency lexicons • Chinese, Arabic, Hindi/Urdu, Korean PropBanks, …. • Czech Tectogrammatical SynSemClass , https://ufal.mff.cuni.cz/synsemclass • VerbNets, FrameNets: Spanish, Basque, Catalan, Portuguese, Japanese, … • Linguistic valency lexicons: Arapaho, Lakota, Turkish, Farsi, Japanese, … • For those without, follow EuroWordNet approach: project from English? • Universal Proposition Banks for Multilingual Semantic Role Labeling • See Ishan Jindal in Part 2 • Can AMR be applied universally to build language specific AMRs? • Uniform Meaning Representation • See Nianwen Xue after the AM break How do we cover thousands of languages?
  • 27. • Universal PropBank was developed by IBM, primarily with translaLon Prac=cal and efficient, produces consistent representa=ons for all languages Projects English frames to parallel sentences in 23 languages • BUT - May obscure language specific seman=c nuances Not op=mal for target language applica=ons: IE, QA,… • Uniform Meaning RepresentaLon • Richer than PropBank alone • Captures language specific characteris=cs while preserving • consistency • BUT - Producing sufficient hand annotated data is SLOW! • Comparisons of UP/UMR will teach us a lot about differences between languages UP vs UMR
  • 28. • Morning Session, Part 1 • Introduc=on - Martha Palmer • Background and Resources – Martha Palmer • Abstract Meaning Representa=ons - Tim O’Gorman • Break • Morning Session, Part 2 • Rela=ons to other Meaning Formalisms: AMR, UCCA, Tectogramma=cal, DRS (Parallel Meaning Bank), Minimal Recursion Seman=cs and Seman=c Parsing – Tim O’Gorman • Uniform Meaning Representa=ons – Nianwen Xue Tutorial Outline
  • 29. • Afternoon Session, Part 1 • Modeling Meaning Representation: SRL - Ishan Jindal • Modeling Meaning Representation: AMR – Jeff Flanigan • Break • Afternoon Session, Part 2 • Applying Meaning Representations – Yunyao Li, Jeff Flanigan • Open Questions and Future Work – Tim O’Gorman Tutorial Outline
  • 30. Meaning Representations for Natural Languages Tutorial Part 2 Common Meaning Representations Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 31. • AMR as a format is older (Kasper 1989, Langkilde & Knight 1998), but with no PropBank, no training data. • Propbank showed that large-scale training sets could be annotated for SRL • Modern AMR (Banarescu et al. (2013) main innovation: making large-scale sembanking possible: • AMR 3.0 more than 60k sentences in English • CAMR more than 20k sentences in Chinese “AMR” annota,on
  • 32. • Shi$ from SRL to AMR – from spans to graphs • In SRL we separately represent each predicate’s arguments with spans • AMR instead uses graphs with one node per concept AMR Basics – SRL to AMR
  • 33. • “PENMAN” is the text-based format used to represent these graphs AMR Basics – PENMAN (l / like-01 :ARG0 (c / cat :mod (l / little)) :ARG1 (e / eat-01 :ARG0 c :ARG1 (c2 / cheese)))
  • 34. • Edges are represented by indentation and colons (:EDGE) • Individual variables identify each node AMR Basics – PENMAN (l / like-01 :ARG0 (c / cat :mod (l / little)) :ARG1 (e / eat-01 :ARG0 c :ARG1 (c2 / cheese)))
  • 35. • If a node has more than one edge, it can be referred to again using that variable. • Terminology: We call that a re- entrancy • This is used for all references to the same enPty/thing in a sentence! • This is what allows us to encode graphs in this tree-like format AMR Basics – PENMAN (l / like-01 :ARG0 (c / cat :mod (l / little)) :ARG1 (e / eat-01 :ARG0 c :ARG1 (c2 / cheese)))
  • 36. • Inverse roles allow us to encode things like relative clauses • Any relation of the form “:X-of” is an inverse. • Interchangeable! • (entity, ARG0-of, predicate) generally equal to (predicate, ARG0, entity) AMR Basics – PENMAN (l / like-01 :ARG0 (h / he) :ARG1 (c / cat :ARG0-of (e / eat-01 :ARG1 (c2 / cheese))))
  • 37. • Are the graphs the same for “cats that eat cheese” and “cats eat cheese”? • No! Every graph gets a “Top” edge defining the semantic head/root AMR Basics – PENMAN (c / cat :ARG0-of (e / eat-01 :ARG1 (c2 / cheese))) (e / eat-01 :ARG0 (c / cat) :ARG1 (c2 / cheese))
  • 38. • Named en))es are typed and then linked to a “name” node with features for each name token. • 70+ categories like person, government-organiza)on, newspaper, city, food-dish, conference • Note that name strings (and some other things like numbers) are constants — they aren’t assigned variables. • En)ty linking: connect to wikipedia entry for each NE (when available) AMR Basics – PENMAN
  • 39. • That’s AMR notation! Let’s review before discussing how we annotate AMRs. (e / eat-01 :ARG0 (d / dog) :ARG1 (b / bone :quant 4 :ARG1-of (f / find-01 :ARG0 d))) 3 9 variable concept constant inverse rela>on reentrancy AMR Basics – PENMAN
  • 40. • AMR does limited normalization aimed at reducing arbitrary syntactic variation (“syntactic sugar”) and maximizing cross- linguistic robustness • Mapping all predicative things (verbs, adjectives, many nouns) to PropBank predicates. Some morphological decomposition • Limited speculation: mostly represent direct contents of sentence (add pragmatic content only when it can be done consistently) • Canonicalize the rest: removal of semantically light predicates and some features like definiteness (controversial) AMR Basics 2 – Annotation Philosophy
  • 41. AMR Basics 2 – Annotation Philosophy • We generalize across parts of speech and etymologically related words: • But we don’t generalize over synonyms (hard to do consistently): 4 1 My fear of snakes fear-01 I’m terrified of snakes terrify-01 Snakes creep me out creep_out-03 My fear of snakes fear-01 I am fearful of snakes fear-01 I fear snakes fear-01 I’m afraid of snakes fear-01
  • 42. AMR Basics 2 – Annotation Philosophy • Predicates use the PropBank inventory. • Each frame presents annotators with a list of senses. • Each sense has its own definitions for its numbered (core) arguments 4 2
  • 43. AMR Basics 2 – Annotation Philosophy • If a seman)c role is not in the core roles for a roleset, AMR provides an inventory of non-core roles • These express things like :+me, :manner, :part, :loca+on, :frequency • Inventory on handout, or in editor (the [roles] bu@on) 4 3
  • 44. AMR Basics 2 – Annota4on Philosophy • Ideally one seman)c concept = one node • Mul)-word predicates modeled as a single node • Complex words can be decomposed • Only limited, replicable decomposi)on (e.g. kill does not become “cause to die”) 4 4 The thief was lining his pockets with their investments (l / line-pocket-02 :ARG0 (p / person :ARG0-of (t / thieve-01)) :ARG1 (t2 / thing :ARG2-of (i2 / invest-01 :ARG0 (t3 / they))))
  • 45. AMR Basics 2 – Annotation Philosophy • All concepts drop plurality, aspect, definiteness, and tense. • Non-predicative terms simply represented in singular, nominative form 4 5 A cat The cat cats the cats (c / cat) ea=ng eats ate will eat (e / eat-01) They Their Them (t / they)
  • 46. 4 6 The man described the mission as a disaster. The man’s description of the mission: disaster. As the man described it, the mission was a disaster. The man described the mission as disastrous. (d / describe-01 :ARG0 (m / man) :ARG1 (m2 / mission) :ARG2 (d / disaster)) AMR Basics 2 – Annotation Philosophy
  • 47. Meaning Representa=ons for Natural Languages Tutorial Part 2 Common Meaning Representa0ons • Format & Basics • Some Details & Design Decisions • Prac=ce - Walking through a few AMRs • Mul=-sentence AMRs • Rela=on to Other Formalisms • UMRs • Open Ques=ons in Representa=on Representa)on Roadmap
  • 48. Details- Specialized Normaliza3ons • We also have special entity types we use for normalizable entities. 4 8 (d / date-entity :weekday (t / tuesday) :day 19) (m / monetary-quantity :unit dollar :quant 5) “Tuesday the 19th” “five bucks”
  • 49. Details- Specialized Normaliza3ons • We also have special enLty types we use for normalizable enMMes. 4 9 (r / rate-entity-91 :ARG1 (m / monetary-quantity :unit dollar :quant 3) :ARG2 (v / volume-quantity :unit gallon :quant 1)) “$3 / gallon”
  • 50. Details - Specialized Predicates • Common construcLons for kinship and organizaLonal relaLons are given general predicates like have-org-role-91 5 0 (p / person :ARG0-of (h / have-org-role-91 :ARG1 (c / country :name (n / name :op1 "US") :wiki "United_States") :ARG2 (p2 / president) “The US president” have-org-role-91 ARG0: office holder ARG1: organization ARG2: title of office held ARG3: description of responsibility
  • 51. Details - Specialized Predicates • Common constructions for kinship and organizational relations are given general predicates like have-org-role-91 5 1 (p / person :ARG0-of (h / have-rel-role-91 :ARG1 (s / she) :ARG2 (f / father) “Her father” have-rel-role-91 ARG0: entity A ARG1: entity B ARG2: role of entity A ARG3: role of entity B ARG4: relationship basis
  • 52. Coreference and Control 5 2 • Within sentences, all references to the same “referent” are merged into the same variable. • This applies even with pronouns or even descriptions Pat saw a moose and she ran (a / and :op1 (s / see-01 :ARG0 (p /person :name (n / name :op1 “Pat”)) :ARG1 (m / moose) ) :op2 (run-02 :ARG0 p))
  • 53. Reduc)on of Seman)cally Light Matrix Verbs 5 3 • Specific predicates (specifically the English copula) NOT used in AMR. • Copular predicates which *many languages would omit* are good candidates for removal • Replace with rela=ve SEMANTIC asser=ons (e.g. :domain is “is an atribute of”) • UMR will discuss alterna=ves to just omiung these. the pizza is free (f / free-01 :arg1 (p / pizza)) The house is a pit (p / pit :domain (h / house))
  • 54. • For two-place discourse connectives, we define frames • Although it rained, we walked home • For list-like things (including coordination) we use “:op#” to define places in the list: • Apples and bananas 5 4 (a / and :op1 (a2 / apple) :op2 (b / banana)) Have-concession-91: Arg2: “although” clause Arg1: main clause Discourse Connec)ves and Coordina)on
  • 55. Meaning Representations for Natural Languages Tutorial Part 2 Common Meaning Representations • Format & Basics • Some Details & Design Decisions • Practice - Walking through a few AMRs • Multi-sentence AMRs • Relation to Other Formalisms • UMRs • Open Questions in Representation Representation Roadmap
  • 56. Practice - Let’s Try some Sentences • Feel free to annotate by hand (or ponder how you’d want to represent them) • Edmund Pope tasted freedom today for the first 3me in more than eight months. • Pope is the American businessman who was convicted last week on spying charges and sentenced to 20 years in a Russian prison. Taste-01: Arg0: taster Arg1: food Useful Normalized forms: - Rate-en5ty - Ordinal-en5ty - Date-en5ty - Temporal-quan5ty Useful NER types: - Person - Country Convict-01 Arg0: judge Arg1: person convicted Arg2: convicted of what Spy-01 Arg0: secret agent Arg1: entity spied /seen Charge-01 Asking price Arg0: seller Arg1: asking price Arg2: buyer Arg3 :commodity Charge-05 Assign a role (including criminal charges) Arg0:assigner Arg1 : assignee Arg2: role or crime Sentence-01 Arg0: judge/jury Arg1: criminal Arg2: punishment
  • 57. Prac3ce- Let’s Try some Sentences Edmund Pope tasted freedom today for the first time in more than eight months. (t2 / taste-01 :ARG0 (p / person :wiki "Edmond_Pope" :name (n2 / name :op1 "Edmund" :op2 "Pope")) :ARG1 (f / free-04 :ARG1 p) :time (t3 / today) :ord (o3 / ordinal-entity :value 1 :range (m / more-than :op1 (t / temporal-quantity :quant 8 :unit (m2 / month)))))
  • 58. Prac3ce- Let’s Try some Sentences Pope is the American businessman who was convicted last week on spying charges and sentenced to 20 years in a Russian prison. (b2 / businessman :mod (c5 / country :wiki "United_States" :name (n6 / name :op1 "America")) :domain (p / person :wiki "Edmond_Pope" :name (n5 / name :op1 "Pope")) :ARG1-of (c4 / convict-01 :ARG2 (c / charge-05 :ARG1 b2 :ARG2 (s2 / spy-01 :ARG0 p)) :time (w / week :mod (l / last))) :ARG1-of (s / sentence-01 :ARG2 (p2 / prison :mod (c3 / country :wiki "Russia" :name (n4 / name :op1 "Russia")) :duration (t3 / temporal-quantity :quant 20 :unit (y2 / year))) :ARG3 s2))
  • 59. Meaning Representations for Natural Languages Tutorial Part 2 Common Meaning Representations • Format & Basics • Some Details & Design Decisions • Practice - Walking through a few AMRs • Multi-sentence AMRs • Relation to Other Formalisms • UMRs • Open Questions in Representation Representation Roadmap
  • 60. A final component in AMR: Multi-sentence! • AMR 3.0 release contains Mul--sentence AMR annota-ons • Document-level coreference: • Connec=ng men=ons that co-refer • Connec=ng some par=al coreference • Making cross-sentence implicit seman=c roles • John took his car to the store. • He bought milk (from the store). • He put it in the trunk.
  • 61. A final component in AMR: Mul)-sentence! • AMR 3.0 release contains Mul--sentence AMR annota-ons • Annota=on was done between AMR variables, not raw text — nodes are coreferent • (t / take-01 :ARG0 (p / person :name (n / name :op1 “John”)) :ARG1 (c / car :poss p) :ARG3 (s / store) • (B / buy-01 :ARG0 (h / he) :ARG1 (m / milk))
  • 62. A final component in AMR: Mul)-sentence! • AMR 3.0 release contains Multi-sentence AMR annotations • "implicit role" annotation was done by showing the remaining roles to annotators and allowing them to be added to coreference chains. • (t / take-01 :ARG0 (p / person :name (n / name :op1 “John”)) :ARG1 (c / car :poss p) • :ARG2 (x / implicit :op1 “taken from, source…” :ARG3 (s / store) • (B / buy-01 :ARG0 (h / he) :ARG1 (m / milk) :ARG2 (x / implicit :op1“seller”)
  • 63. A final component in AMR: Multi-sentence! • AMR 3.0 release contains Multi-sentence AMR annotations • Implicit roles are worth considering for meaning representation, especially for languages other than English • Null subject (and sometimes null object) constructions are very cross-linguistically common, can carry lots of information • Arguments of nominalizations can carry a lot of assumed information in scientific domains
  • 64. A final component in AMR: Multi-sentence! • MulL-sentence AMR data: training and evaluaLon data for creaLng a graph for a whole document • Was not impossible before mul=-sentence AMR: could boostrap with span-based coreference data • Also extended to spa=al AMRs (human-robot interac=ons - Bonn et al .2022 • MS-AMR work was done on top of exisLng gold AMR annotaLons — a separate process.
  • 65. Meaning Representa=ons for Natural Languages Tutorial Part 2 Common Meaning Representa0ons • Format & Basics • Some Details & Design Decisions • Prac=ce - Walking through a few AMRs • Mul=-sentence AMRs • Rela>on to Other Formalisms • UMRs • Open Ques=ons in Representa=on Representa6on Roadmap
  • 66. Comparison to Other Frameworks 6 6 • Meaning representations vary along many dimensions! • How meaning is connected to text • Relationship to logical and/or executable form • Mapping to Lexicons/Ontologies/Tasks • Relationship to discourse • We’ll overview these followed by some side- by-side comparisons
  • 67. Alignment to Text / Compositionality 6 7 • Historical approach to meaning representa1ons: represent context-free seman1cs, as defined by a par1cular grammar model • AMR at other extreme: AMR graph annotated for a single sentence, but no individual mapping from tokens to nodes
  • 68. Alignment to Text / Composi6onality 6 8 Oepen & Kuhlmann (2016) “flavors” of meaning representations: Type 0: Bilexical Type 1: Anchored Type 2: Unanchored Nodes each correspond to one token (Dependency parsing) Nodes are aligned to text (can be subtoken or multi-token) No mapping from graph to surface form Universal Dependencies UCCA AMR MRS-connected frameworks (DM, EDS) DRS-based frameworks (PMB / GMB) Some executable/task- specific semantic parsing frameworks Prague Semantic dependencies Prague tectogrammatical
  • 69. Alignment to Text / Compositionality 6 9 Less thoroughly defined: adherence to grammar/composiAonally (cf. Bender et al. 2015) Some frameworks (MRS/ DRS below) have parAcular asserAons about how a given meaning representaAon was derived (Aed to a parAcular grammar) AMR encodes many useful things that are oPen *not* considered composiAonal — named enAty typing, cross-sentence coreference, word senses, etc. <- “Sentence meaning” Extragrammatical inference -> Only encode “compositional” meanings predicted by a particular theory of grammar some useful pragmatic inference (e.g. sense distinctions, named entity types) Any wild inferences needed for task
  • 70. Alignment to Text / Compositionality - UCCA 7 0 • Universal Conceptual Cogni2ve Annota2on : based on a typological theory (Dixon’s BLT) of how to do coarse-grained seman2cs across languages • Similar to a cross between dependency and cons2tuency parses (labeled edges)- some2mes very syntac2c • Coarse-grained roles, e.g.: • A: par2cipant • S: State • C: Center • D: Adverbial • E: elaborator • “Anchored” graphs, in the Open & Kuhlman taxonomy (somewhat composi2onal, but no formal rules for how a given node is derived)
  • 71. Alignment to Text / Compositionality - Prague 7 1 • Very similar to AMR with more general semantic roles (predicates use Vallex predicates (valency lexicon) and a shared set of semantic roles similar to VerbNet) • Semantic graph is aligned to syntactic graph layers (“type 1”) • “Prague Czech-English Dependency Treebank” • “PSD” reduced form fully bilexical (“Type 0”) for dependency parsing. • Full PCEDT also has rich semantics like implicit roles (e.g. null subjects) – “anchored” (“Type 1”) For the Czech version of “An earthquake struck Northern California, killing more than 50 people.” (Čmejrek et al. 2004)
  • 72. Logical & Executable Forms 7 2 • Lots of logical desiderata: • Modeling whether events happen and/or are believed (and other modality questions): Sam believes that Bill didn’t eat the plums. • Understanding quantifications: whether “every child has a favorite song” refers to one song or many • Technically our default assumption for AMR is Neo-Davidsonian: bag of triples like (“instance-of(b, believe-01)”, “instance-of(h, he), “ARG0(b, h)” • One cannot modify more than one node in the graph • PENMAN is a bracketed tree that can be treated like a logical form (with certain assumptions or addition to certain new annotations) • Artzi et al. 2015), Bos (2016), Stabler (2017), : Pustejovsky et al. (2019), etc. • Competing frameworks like DRS and MRS more specialized for this.
  • 73. Logical & Executable Forms 7 3 • Lots of logical desiderata: • Modeling whether events happen and/or are believed (and other modality questions): Sam believes that Bill didn’t eat the plums. • Understanding quantifications: whether “every child has a favorite song” refers to one song or many • Technically our default assumption for AMR just means that something like “:polarity -“ is a feature of a single node; no semantics for quantifiers like “every” • With certain assumptions or addition to certain new annotations, PENMAN is a bracketed tree that can be treated like a logical form • Artzi et al. 2015), Bos (2016), Stabler (2017), : Pustejovsky et al. (2019), etc.; proposals for “UMR” treatments as well. • Competing frameworks like DRS and MRS more specialized for this.
  • 74. Logical & Executable Forms - DRS 7 4 • Discourse Representa1on Structures (annota1ons in Groening Meaning Bank and Parallel Meaning Bank) • DRS frameworks do scoped meaning representa1on • Outputs originally modified from CCG parser LF outputs-> DRS • DRS uses “boxes” which can be negated, asserted, believed in. • This is not na1vely a graph representa1on! “box variables”(bo[om) one way of thinking about these • a triple like “agent(e1, x1)” is part of b3 • Box b3 is modified (e.g. b2 POS b3)
  • 75. Logical & Executable Forms - DRS 7 5 • Grounded in long theore5cal DRS tradi5on (Heim & Kamp) for handling discourse referents, presupposi/ons, discourse connec/ves, temporal rela/ons across sentences, etc. • DRS for “everyone was killed” (Liu et al. 2021)
  • 76. Logical & Executable Forms - MRS 7 6 Minimal Recursion Semantics (and related frameworks) • Copestake (1997) model proposed for semantics of HPSG - this is connected to other underspecification solutions (Glue semantics / hole semantics / etc. ) • Define set of constraints over which variables outscope other variables • HPSG grammars like the English Resource Grammar produce ERS (English resource semantics) outputs (which are roughly MRS) and have been modified into a simplified DM format (“type 0” bilexical dependency)
  • 77. Logical & Executable Forms - MRS 7 7 • Underspecification in practice: • MRS can the thought of as many fragments with constraints on how they scope together • Those define a set of MANY possible combinations into a fully scoped output, e.g.: Every dog barks and chases a cat(as interpreted in Manshadi et al. 2017)
  • 78. Logical & Executable Forms- MRS 7 8 • Variables starting with h are “handle” variables used to define constraints on scope. • h19 = things under scope of negation • H21 = leave_v_1 head • H19 =q h21 : equality modulo quantifiers • (Neg outscopes leave) • “forest” of possible readings • Takeaway: Constraints on which variables “outscope" others can add flexible amounts of scope info
  • 79. Lexicon/Ontology Differences 7 9 • Predicates can use different ontologies – e.g. more grounded in grammar/valency, or more tied to taxonomies like WordNet • Semantic Roles can be encoded differently, e.g. with non-lexicalized semantic roles (discussed for UMR later) • Some additional proposals: “BabelNet Meaning Representation” propose using VerbAtlas (clusters over wordnet senses with VerbNet semantic role templates) DRS (GMB/PMB) MRS Prague (PCEDT ) AMR UCCA Semantic Roles VerbNet (general roles) General roles General roles + valency lexicon Lexicalized numbered arguments Fixed general roles Predicates WordNet grammatical entries Vallex valency lexicon (Propbank-like) Propbank Predicates A few types (State vs process …) non-predicates wordnet Lemmas Lemmas Named entity types Lemmas
  • 80. Task-specific Representations 8 0 • Many use “Seman1c Parsing” to refer to task-specific, executable representa1ons • Text-to-SQL • interac1on with robots, text to code/commands • interac1on with determinis1c systems like calendars/travel planners • Similar dis1nc1ons to a general-purpose meaning representa1on, BUT • May need to map into specific task taxonomies and ignore content not relevant to task • Can require more detail or inference than what’s assumed for “context-free” representa1ons • Ogen can be thought of as first-order logic forms — simple predicates + scope
  • 81. Task-specific Representations 8 1 • Classic datasets (Table from Dong & Lapata 2016) regard household commands or querying KBs • Recent tasks for text-to-SQL
  • 82. Task-specific Representa6ons- Spa6al AMR 8 2 • Additional example of task-specific semantic parsing is human-robot interaction • Non-trivial to simply pull those interactions from AMR: normal human language is not normally sufficiently informative about spatial positioning, frames of reference, etc. • Spatial AMR project (Bonn et al. 2020) a good example of project attempting to add all “additional detail” needed to handle structure- building dialogues (giving instructions for building Minecraft structures) • Released with dataset of building actions, success/failures, views of the event different angles.
  • 83. Discourse-Level Annotation 8 3 • Do you do multi-sentence coreference? • Partial coreference (set-subset, implicit roles, etc.)? • Discourse connectives? • Treatment of multi-sentence tense, modality, etc.? • Prague Tectogrammatical annotations & AMR only general-purpose representations with extensive multi-sentence annotations
  • 84. Overviewing Frameworks vs. AMR Alignment Logical Scoping & Interpretation Ontologies and Task-Specifc Discourse-Level DRS (Groeningen / Parallel) Compositional /Anchored Scoped representation (boxes) Rich predicates (WordNet), general roles Can handle referents, connectives MRS Compositional /Anchored Underspecified scoped representation Simple predicates, general roles N/a UCCA Anchored Not really scoped Simple predicates, general roles Some implicit roles Prague Tecto Anchored Not really scoped Rich predicates, semi-lexicalizekd roles Rich multi- sentence conference AMR Unanchored (English); Anchored (Chinese) Not really scoped yet Rich predicates, lexicalized roles Rich multi- sentence conference
  • 85. End of Meaning Representation Comparison • What’s next: UMR — a proposal from AMR-connected scholars on next steps for AMR. • Questions about how AMR is annotated? • Questions about how it relates to other meaning representation formalisms?
  • 86. Meaning Representations for Natural Languages Tutorial Part 2 Common Meaning Representations Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 87. Outline ► Background ► Do we need a new meaning representation? What’s wrong with existing meaning representations? ► Aspects of Uniform Meaning Representation (UMR) ► UMR starts with AMR but makes a number of enrichments ► UMR is a document-level meaning representation that represents temporal dependencies, modal dependencies, and coreference ► UMR is a cross-lingual meaning representation that separates aspects of meaning that are shared across languages (language-independent) from those that are idiosyncratic to individual languages (language-specific) ► UMR-Writer -- a tool for annotating UMRs
  • 88. Why aren’t existing meaning representations sufficient? ► Existing meaning representations vary a great deal in their focus and perspective ► Formal semantic representations aimed at supporting logical inference focus on the proper representation of quantification, negation, tense, and modality (e.g., Minimal Recursion Semantics (MRS) and Discourse Representation Theory (DRT)). ► Lexical semantic representations focus on the proper representation of core predicate-argument structures, word sense, named entities and relations between them, and coreference (e.g., Tectogrammatical Representation (TR), AMR). ► The semantic ontologies they use also differ a great deal. For example, MRS doesn’t have a classification of named entities at all, while AMR has over 100 types of named entities
  • 89. UMR uses AMR as a starting point ► Our starting point is AMR, which has a number of attractive properties: ► Easy to read, ► scalable (can be directly annotated without relying on syntactic structures), ► has information that is important to downstream applications (e.g., semantic roles, named entities and coreference), ► represented in a well-defined mathematical structure (a single-rooted, directed, acyclic graph) ► Our general strategy is to augment AMR with meaning components that are missing and adapt it to cross-lingual settings
  • 90. Participants of the UMR project ► UMR stands for Uniform Meaning Representation, and it is an NSF-funded collaborative project between Brandeis University, the University of Colorado, and the University of New Mexico, with a number of partners outside these institutions
  • 91. From AMR to UMR Gysel et al. (2021) ► At the sentence level, UMR adds: ► An aspect attribute to eventive concepts ► Person and number attributes for pronouns and other nominal expressions ► Quantification scope between quantified expressions ► At the document level UMR adds: ► Temporal dependencies in lieu of tense ► Modal dependencies in lieu of modality ► Coreference relations beyond sentence boundaries ► To make UMR cross-linguistically applicable, UMR ► defines a set of language-independent abstract concepts and participant roles, ► uses lattices to accommodate linguistic variability ► designs specifications for complicated mappings between words and UMR concepts.
  • 92. UMR sentence-level additions ► An Aspect attribute to event concepts ► Aspect refers to the internal constituency of events - their temporal and qualitative boundedness ► Person and number attributes for pronouns and other nominal expressions ► A set of concepts and relations for discourse relations between clauses ► Quantification scope between quantified expressions to facilitate translation of UMR to logical expressions
  • 93. UMR attribute: aspect [aspect lattice diagram] Values from coarse- to fine-grained include: Habitual, Imperfective, Process, State, Atelic Process, Perfective; Activity, Endeavor, Performance; Reversible State, Irreversible State, Inherent State, Point State; Undirected Activity, Directed Activity, Semelfactive, Undirected Endeavor, Directed Endeavor, Incremental Accomplishment, Nonincremental Accomplishment, Directed Achievement; Reversible, Irreversible
  • 94. UMR attribute: coarse-grained aspect ► State: unspecified type of state ► Habitual: an event that occurs regularly in the past or present, including generic statements ► Activity: an event that has not necessarily ended and may be ongoing at Document Creation Time (DCT). ► Endeavor: a process that ends without reaching completion (i.e., termination) ► Performance: a process that reaches a completed result state
  • 95. Coarse-grained Aspect as an UMR attribute He wants to travel to Albuquerque. (w / want :aspect State) She rides her bike to work. (r / ride :aspect Habitual) He was writing his paper yesterday. (w / write :aspect Activity) Mary mowed the lawn for thirty minutes. (m / mow :aspect Endeavor)
  • 96. Fine-grained Aspect as an UMR attribute My cat is hungry. (h / have-mod-91 :aspect Reversible state) The wine glass is shattered. (h / have-mod-91 :aspect Irreversible state) My cat is black and white. (h / have-mod-91 :aspect Inherent state) It is 2:30pm. (h / have-mod-91 :aspect Point state)
  • 97. AMR vs UMR on how pronouns are represented ► In AMR, pronouns are treated as unanalyzable concepts ► However, pronouns differ from language to language, so UMR decomposes them into person and number attributes ► These attributes can be applied to nominal expressions too AMR: (s / see-01 :ARG0 (h/ he) :ARG1 (b/ bird :mod (r/ rare))) UMR: (s / see-01 :ARG0 (p / person :ref-person 3rd :ref-number Sing.) :ARG1 (b / bird :mod (r/ rare) :ref-number Plural)) “He saw rare birds today.”
  • 100. Discourse relations in UMR ► In AMR, there is a minimal system for indicating relationships between clauses - specifically coordination: ► and concept and :opX relations for addition ► or/either/neither concepts and :opX relations for disjunction ► contrast-01 and its participant roles for contrast ► Many subordinated relationships are represented through participant roles, e.g.: ► :manner ► :purpose ► :condition ► UMR makes explicit the semantic relations between (more general) “coordination” semantics and (more specific) “subordination” semantics
  • 101. Discourse relations in UMR [discourse relations lattice diagram] Concepts include: and, or, inclusive-disj, exclusive-disj, and + but, and + unexpected, and + contrast, but-91, unexpected-co-occurrence-91, contrast-91, consecutive, additive. Relations include: :apprehensive, :condition, :cause, :purpose, :temporal, :manner, :pure-addition, :substitute, :concession, :concessive-condition, :subtraction
  • 102. Disambiguation of quantification scope in UMR “Someone didn’t answer all the questions” (a / answer-01 :ARG0 (p / person) :ARG1 (q / question :quant All :polarity -) :pred-of (s / scope :ARG0 p :ARG1 q)) ∃p(person(p) ∧ ¬∀q(question(q) → ∃a(answer-01(a) ∧ ARG1(a, q) ∧ ARG0(a, p))))
  • 103. Quantification scope annotation ► Scope will not be annotated for summation readings, nor is it annotated where a distributive or collective reading can be predictably derived from the lexical semantics. ► The linguistics students ran 5 kilometers to raise money for charity. ► The linguistics students carried a piano into the theater. ► Ten hurricanes hit six states over the weekend. ► The scope annotation only comes into play when some overt linguistic element forces an interpretation that diverges from the lexical default ► The linguistics students together ran 200 kilometers to raise money for charity. ► The bodybuilders each carried a piano into the theater. ► Ten hurricanes each hit six states over the weekend.
  • 104. From AMR to UMR Gysel et al. (2021) ► At the sentence level, UMR adds: ► An aspect attribute to eventive concepts ► Person and number attributes for pronouns and other nominal expressions ► Quantification scope between quantified expressions ► At the document level UMR adds: ► Temporal dependencies in lieu of tense ► Modal dependencies in lieu of modality ► Coreference relations beyond sentence boundaries ► To make UMR cross-linguistically applicable, UMR ► defines a set of language-independent abstract concepts and participant roles, ► uses lattices to accommodate linguistic variability ► designs specifications for complicated mappings between words and UMR concepts.
  • 105. UMR is a document-level representation ► Temporal relations are added to UMR graphs as temporal dependencies ► Modal relations are also added to UMR graphs as modal dependencies ► Coreference is added to UMR graphs as identity or subset relations between named entities or events
  • 106. No representation of tense in AMR talk-01 she he ARG0 ARG2 medium language name name op1 “French” (t / talk-01 :ARG0 (s / she) :ARG2 (h / he) :medium (l / language :name (n / name :op1 "French"))) ► “She talked to him in French.” ► “She is talking to him in French.” ► “She will talk to him in French.”
  • 107. Adding tense seems straightforward... Adding tense to AMR involves defining a temporal relation between event-time and the Document Creation Time (DCT) or speech time (Donatelli et al., 2019). talk-01 she he ARG0 ARG2 medium time before op1 now language name name op1 “French” (t / talk-01 :time (b / before :op1 (n2 / now)) :ARG0 (s / she) :ARG2 (h / he) :medium (l / language :name (n / name :op1 "French"))) “She talked to him in French.”
  • 108. ... but it isn’t ► For some events, the temporal relation to the DCT or speech time is undefined. “John said he would go to the florist shop”. ► Is “going to the florist shop” before or after the DCT? ► Its temporal relation is more naturally defined with respect to “said”. ► In quoted speech, the speech time has shifted. “I visited my aunt on the weekend,” Tom said. ► The reference time for “visited” has shifted to the time when Tom said this. We only know indirectly that the “visiting” event happened before the DCT. ► Tense is not universally grammaticalized, e.g., Chinese
  • 109. Limitations of simply adding tense ► Even in cases when tense, i.e., the temporal relation between an event and the DCT, is clear, tense may not give us the most precise temporal location of the event. ► John went into the florist shop. ► He had promised Mary some flowers. ► He picked out three red roses, two white ones and one pale pink ► Example from (Webber 1988) ► All three events happened before the DCT, but we also know that the “going” event happened after the “promising” event, but before the “picking out” event.
  • 110. UMR represents temporal relations in a document as temporal dependency structures (TDS) ► The temporal dependency structure annotation involves identifying the most specific reference time for each event ► Time expressions and other events are normally the most specific reference times ► In some cases, an event may require two reference times in order to make its temporal location as specific as possible Zhang and Xue (2018); Yao et al. (2020)
  • 111. TDS Annotation ► If an event is not clearly linked temporally to either a time expression or another event, then it can be linked to the DCT or tense metanodes ► Tense metanodes capture vague stretches of time that correspond to grammatical tense ► Past_Ref, Present_Ref, Future_Ref ► DCT is a more specific reference time than a tense metanode
  • 112. Temporal dependency structure (TDS) ► If we identify a reference time for every event and time expression in a document, the result will be a Temporal Dependency Graph. [TDS diagram: ROOT, the DCT (4/30/2020), the time expression “today”, and the events descended, arrested, and assaulted, linked by Depends-on, Contained, After, and Before edges; a sketch follows below] “700 people descended on the state Capitol today, according to Michigan State Police. State Police made one arrest, where one protester had assaulted another, Lt. Brian Oleksyk said.”
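A temporal dependency structure can be stored as a simple graph in which every event or time expression points to its most specific reference time. The sketch below encodes one plausible reading of the diagram on this slide; the exact edge assignments are illustrative, not the gold annotation.

```python
# Minimal sketch of a temporal dependency structure (TDS) for the Capitol example.
# Each edge links a child (event or time expression) to its most specific reference time.
# The particular relations below are one illustrative reading of the slide's diagram.
tds_edges = [
    ("DCT:4/30/2020", "Depends-on", "ROOT"),
    ("today",         "Depends-on", "DCT:4/30/2020"),
    ("descended",     "Contained",  "today"),
    ("arrested",      "Contained",  "today"),
    ("assaulted",     "Before",     "arrested"),
]

def reference_time(node, edges):
    """Return the (relation, reference time) pair annotated for a node, if any."""
    for child, relation, parent in edges:
        if child == node:
            return relation, parent
    return None

print(reference_time("assaulted", tds_edges))  # ('Before', 'arrested')
```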
  • 113. Genre in TDS Annotation ► Temporal relations function differently depending on the genre of the text (e.g., Smith 2003) ► Certain genres proceed in temporal sequence from one clause to the next ► While other genres involve generally non-sequenced events ► News stories are a special type ► many events are temporally sequenced ► temporal sequence does not match with sequencing in the text
  • 114. TDS Annotation ► Annotators may also consider the modal annotation when annotating temporal relations ► Events in the same modal “world” can be temporally linked to each other ► Events in non-real mental spaces rarely make good reference times for events in the “real world” ► Joe got to the restaurant, but his friends had not arrived. So, he sat down and ordered a drink. ► Exception to this are deontic complement-taking predicates ► Events in the complement are temporally linked to the complement-taking predicate ► E.g. I want to travel to France: After (want, travel)
  • 115. Modality in AMR ► Modality characterizes the reality status of events, without which the meaning representation of a text is incomplete ► AMR has six concepts that represent modality: ► possible-01, e.g., “The boy can go.” ► obligate-01, e.g., “The boy must go.” ► permit-01, e.g., “The boy may go.” ► recommend-01, e.g., “The boy should go.” ► likely-01, e.g., “The boy is likely to go.” ► prefer-01, e.g., “The boy would rather go.” ► Modality in AMR is represented as senses of an English verb or adjective. ► However, the exact same concepts for modality may not apply to other languages
  • 116. Modal dependency structure ► Modality is represented as a dependency structure in UMR ► Similar to the temporal relations ► Events and conceivers (sources) are nodes in the dependency structure ► Modal strength and polarity values characterize the edges ► Mary might be walking the dog. AUTH Neutral walk
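A minimal sketch of how such a modal dependency could be stored, using the example above; the node and edge labels follow the slide, everything else is illustrative.

```python
# Modal dependencies as labelled edges: (parent conceiver/event, strength/polarity, child event).
modal_edges = [
    ("ROOT", "MODAL",   "AUTH"),   # the author node is introduced for every text
    ("AUTH", "Neutral", "walk"),   # "Mary might be walking the dog." -> neutral epistemic strength
]

def modal_strength(event, edges):
    """Collect the (parent, strength) annotations attached to an event."""
    return [(parent, value) for parent, value, child in edges if child == event]

print(modal_strength("walk", modal_edges))  # [('AUTH', 'Neutral')]
```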
  • 117. Modal dependency structure ► A dependency structure: ► Allows for the nesting of modal operators (scope) ► Allows for the annotation of scope relations between modality and negation ► Allows for the import of theoretical insights from Mental Space Theory (Fauconnier 1994, 1997)
  • 118. Modal dependency structure ► There are two types of nodes in the modal dependency structure: events and conceivers ► Conceivers ► Mental-level entities whose perspective is modelled in the text ► Each text has an author node (or nodes) ► All other conceivers are children of the AUTH node ► Conceivers may be nested under other conceivers ► Mary said that Henry wants... AUTH MARY HENRY
  • 119. Epistemic strength lattice [lattice diagram] Epistemic Strength divides into Non-neutral and Non-full; values include Full, Partial (strong partial, weak partial), and Neutral (strong neutral, weak neutral). Full: The dog barked. Partial: The dog probably barked. Neutral: The dog might have barked.
  • 120. Modal dependency structure (MDS) [MDS diagram: ROOT -MODAL-> AUTH (CNN); the conceivers Michigan State Police and Lt. Brian Oleksyk and the events descended, arrested, and assaulted are linked by FULLAFF edges] “700 people descended on the state Capitol today, according to Michigan State Police. State Police made one arrest, where one protester had assaulted another, Lt. Brian Oleksyk said.” (Vigus et al., 2019; Yao et al., 2021)
  • 121. Entity Coreference in UMR ► same-entity: 1. Edmund Pope tasted freedom today for the first time in more than eight months. 2. He denied any wrongdoing. ► subset: 1. He is very possessive and controlling but he has no right to be as we are not together.
  • 122. Event coreference in UMR ► same-event 1. El-Shater and Malek’s property was confiscated and is believed to be worth millions of dollars. 2. Abdel-Maksoud stated the confiscation will affect the Brotherhood’s financial bases. ► same-event 1. The Three Gorges project on the Yangtze River has recently introduced the first foreign capital. 2. The loan, a sum of 12.5 million US dollars, is an export credit provided to the Three Gorges project by the Canadian government, which will be used mainly for the management system of the Three Gorges project. ► subset: 1. 1 arrest took place in the Netherlands and another in Germany. 2. The arrests were ordered by anti-terrorism judge Fragnoli.
  • 123. An UMR example with coreference He is controlling but he has no right to be as we are not together. (s4c / but-91 :ARG1 (s4c3 / control-01 :ARG0 (s4p2 / person :ref-person 3rd :ref-number Singular)) :ARG2 (s4r / right-05 :ARG1 s4p2 :ARG1-of (s4c2 / cause-01 :ARG0 (s4h / have-mod-91 :ARG0 (s4p3 / person :ref-person 1st :ref-number Plural) :ARG1 (s4t/ together) :aspect State :modstr FullNeg)) :modstr FullNeg)) (s / sentence :coref ((s4p2 :subset-of s4p3)))
  • 124. Implicit arguments ► Like MS-AMRs, UMR also annotates implicit arguments when they can be inferred from context and can be annotated for coreference like overt (pronominal) expressions (s3d / deny-01 :Aspect Performance :ARG0 (s3p / person :ref-number Singular :ref-person 3rd) :ARG1 (s3t / thing :ARG1-of (s3d2 / do-02 :ARG0 s3p :ARG1-of (s3w / wrong-02) :aspect Process :modpred s3d)) :modstr FullAff) “He denied any wrongdoing”
  • 125. The challenge: Integration of different meaning components into one graph ► How do we represent all this information in a unified structure that is still easy to read and scalable? ► UMR pairs a sentence-level representation (a modified form of AMR) with a document-level representation. ► We assume that a text will still have to be processed sentence by sentence, so each sentence will have a fragment of the document-level super-structure.
  • 126. Integrated UMR representa6on 1. Edmund Pope tasted freedom today for the first time in more than eight months. 2. Pope is the American businessman who was convicted last week on spying charges and sentenced to 20 years in a Russian prison. 3. He denied any wrongdoing.
  • 127. Sentence-level representation vs document-level representation (s1t2 / taste-01 :Aspect Performance :ARG0 (s1p / person :name (s1n2 / name :op1 “Edmund” :op2 “Pope”)) :ARG1 (s1f / free-04 :ARG1 s1p) :time (s1t3 / today) :ord (s1o3 / ordinal-entity :value 1 :range (s1m / more-than :op1 (s1t / temporal-quantity :quant 8 :unit (s1m2 / month))))) Edmund Pope tasted freedom today for the first time in more than eight months. (s1 / sentence :temporal ((DCT :before s1t2) (s1t3 :contained s1t2) (DCT :depends-on s1t3)) :modal ((ROOT :MODAL AUTH) (AUTH :FullAff s1t2)))
  • 128. Pope is the American businessman who was convicted last week on spying charges and sentenced to 20 years in a Russian prison. (s2i/ identity-91 :ARG0 (p/ person :wiki "Edmond_Pope" :name (n/ name :op1 "Pope")) :ARG1 (b/ businessman :mod (n2/ nationality :wiki "United_States" :name (n3/ name :op1 "America"))) :ARG1-of (c/ convict-01 :ARG2 (c2/ charge-05 :ARG1 b :ARG2 (s/ spy-02 :ARG0 b :modpred c2)) :temporal (w/ week :mod (l / last)) :aspect Performance :modstr FullAff) :ARG1-of (s2/ sentence-01 :ARG2 (p2/ prison :mod (c3/ country :wiki "Russia" :name (n4/ name :op1 "Russia")) :duration (t / temporal-quantity :quant 20 :unit (y/ year))) :ARG3 s :aspect Performance :modstr FullAff) :aspect State :modstr FullAff) (s2 / sentence :temporal ((s2c4 :before s1t2) (DCT :depends-on s2w) (s2w :contained s2c) (s2w :contained s2s2) (s2c :after s2s) (s2s :after s2c4)) :modal ((AUTH :FullAff s2i) (AUTH :FullAff s2c) (AUTH :FullAff Null Charger) (Null Charger :FullAff s2c2) (s2c2 :Unsp s2s) (AUTH :FullAff s2s2)) :coref ((s1p :same-entity s2p))) Sentence-level representation vs document-level representation
  • 129. He denied any wrongdoing. (s3d / deny-01 :Aspect Performance :ARG0 (s3p / person :ref-number Singular :ref-person 3rd) :ARG1 (s3t / thing :ARG1-of (s3d2 / do-02 :ARG0 s3p :ARG1-of (s3w / wrong-02) :aspect Performance :modpred s3d)) :modstr FullAff) (s3 / sentence :temporal ((s2c :before s3d)) :modal ((AUTH :FullAff s3p) (s3p :FullAff s3d) (s3d :Unsp s3d2)) :coref ((s2p :same-entity s3p))) Sentence-level representation vs document-level representation
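One way to picture the pairing of sentence-level and document-level annotation is to collect the per-sentence :temporal, :modal, and :coref triples into document-wide graphs. The sketch below does this for a subset of the triples shown on these slides; the data layout is an assumption made for illustration, not the official UMR file format.

```python
# Assumed (illustrative) layout: each sentence contributes a fragment of the document-level graph.
doc_fragments = {
    "s1": {"temporal": [("DCT", ":before", "s1t2"), ("s1t3", ":contained", "s1t2")],
           "modal":    [("ROOT", ":MODAL", "AUTH"), ("AUTH", ":FullAff", "s1t2")],
           "coref":    []},
    "s3": {"temporal": [("s2c", ":before", "s3d")],
           "modal":    [("AUTH", ":FullAff", "s3p"), ("s3p", ":FullAff", "s3d")],
           "coref":    [("s2p", ":same-entity", "s3p")]},
}

document_graph = {"temporal": [], "modal": [], "coref": []}
for fragment in doc_fragments.values():
    for layer, triples in fragment.items():
        document_graph[layer].extend(triples)

print(len(document_graph["temporal"]), "temporal edges collected for the document")
```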
  • 131. From AMR to UMR Gysel et al. (2021) ► At the sentence level, UMR adds: ► An aspect attribute to eventive concepts ► Person and number attributes for pronouns and other nominal expressions ► Quantification scope between quantified expressions ► At the document level UMR adds: ► Temporal dependencies in lieu of tense ► Modal dependencies in lieu of modality ► Coreference relations beyond sentence boundaries ► To make UMR cross-linguistically applicable, UMR ► defines a set of language-independent abstract concepts and participant roles, ► uses lattices to accommodate linguistic variability ► designs specifications for complicated mappings between words and UMR concepts.
  • 132. Elements of AMR are already cross-linguistically applicable ► Abstract concepts (e.g., person, thing, have-org-role-91): ► Abstract concepts are concepts that do not have explicit lexical support but can be inferred from context ► Some semantic relations (e.g., :manner, :purpose, :time) are also cross-linguistically applicable
  • 133. Language-independent vs language-specific aspects of AMR 加入-01 person 董事会 date-entity name temporal-quantity ” 文肯” ” 皮埃尔” 61 岁 have-org-role-91 董事 11 29 Arg0 Arg1 time name op1 op2 age quant unit Arg1-of Arg0 Arg2 month day mod 执行 polarity - “61 岁的 Pierre Vinken 将于 11 月 29 日加入董事会,担任 非执行董事。”
  • 134. Language-independent vs language-specific aspects of AMR join-01 person board date-entity name temporal-quantity ”Vinken” ”Pierre” 61 year have-org-role-91 director 11 29 Arg0 Arg1 time name op1 op2 age quant unit Arg1-of Arg0 Arg2 month day mod executive polarity - ““Pierre Vinken , 61 years old , will join the board as a nonexecutive director Nov. 29 .”
  • 135. Abstract concepts in UMR ► Abstract concepts inherited from AMR: ► Standardization of quantities, dates etc.: have-name-91, have-frequency-91, have-quant-91, temporal-quantity, date-entity... ► New concepts for abstract events: “non-verbal” predication. ► New concepts for abstract entities: entity types are annotated for named entities and implicit arguments. ► Scope: scope concept to disambiguate scope ambiguity to facilitate translation of UMR to logical expressions (see sentence-level structure). ► Discourse relations: concepts to capture sentence-internal discourse relations (see sentence-level structure).
  • 136. Sample abstract events (clause type: UMR predicate, with its participant roles) ► Thetic/presentational possession: have-91 (Arg0 = possessor, Arg1 = possessum) ► Predicative possession: belong-91 (Arg0 = possessum, Arg1 = possessor) ► Thetic/presentational location: exist-91 (Arg0 = location, Arg1 = theme) ► Predicative location: have-location-91 (Arg0 = theme, Arg1 = location) ► Property predication: have-mod-91 (Arg0 = theme, Arg1 = property) ► Object predication: have-role-91 (Arg0 = theme, Arg1 = ref point, Arg2 = object category) ► Equational: identity-91 (Arg0 = theme, Arg1 = equated referent)
  • 137. How do we find abstract eventive concepts? ► Languages use different strategies to express these meanings: ► Predicativized possessum: Yukaghir pulun-die jowje-n'-i old.man-DIM net-PROP 3SG.INTR `The old man has a net, lit. The old man net- has.' ► UMR trains annotators to recognize the semantics of these constructions and select the appropriate abstract predicate and its participant roles
  • 138. Language-independent vs language-specific participant roles ► Core participant roles are defined in a set of frame files (valency lexicon, see Palmer et al. 2005). The semantic roles for each sense of a predicate are defined: ► E.g. boil-01: apply heat to water ARG0-PAG: applier of heat ARG1-PPT: water ► Most languages do not have frame files ► But see e.g. Hindi (Bhat et al. 2014), Chinese (Xue 2006) ► UMR defines language-independent participant roles ► Based on ValPaL data on co-expression patterns of different micro-roles (Hartmann et al., 2013)
  • 139. Language-independent roles: an incomplete list ► Actor: animate entity that initiates the action ► Undergoer: entity (animate or inanimate) that is affected by the action ► Theme: entity (animate or inanimate) that moves from one entity to another entity, either spatially or metaphorically ► Recipient: animate entity that gains possession (or at least temporary control) of another entity ► Force: inanimate entity that initiates the action ► Causer: animate entity that acts on another animate entity to initiate the action ► Experiencer: animate entity that cognitively or sensorily experiences a stimulus ► Stimulus: entity (animate or inanimate) that is experienced by an experiencer
  • 140. Road Map for annotating UMRs for under- resourced languages ► Participant Roles: ► Stage 0: General participant roles ► Stage 1: Language-specific frame files ► UMR-Writer allows for the creation of lexicon with argument structure information during annotation ► Morphosemantic Tests: ► Stage 0: Identify one concept per word ► Stage 1: Apply more fine-grained tests to identify concepts ► Annotation Categories with Lattices: ► Stage 0: Use grammatically encoded categories (more general if necessary) ► Stage 1: Use (overtly expressed) fine-grained categories ► Modal Dependencies: ► Stage 0: Use simplified modal annotation ► Stage 1: Fill in lexically based modal strength values
  • 141. How UMR accommodates cross-linguistic variability ► Not all languages grammaticalize/overtly express the same meaning contrasts: ► English: I (1SG) vs. you (2SG) vs. she/he (3SG) ► Sanapaná: as- (1SG) vs. an-/ap- (2/3SG) ► However, there are typological patterns in how semantic domains get subdivided: ► A 1/3SG person category would be much more surprising than a 2/3SG one ► UMR uses lattices for abstract concepts, attribute values, and relations to accommodate variability across languages. ► Languages with overt grammatical distinctions can choose to use more fine-grained categories
  • 142. Lattices ► Semantic categories are organized in “lattices” to achieve cross-lingual compatibility while accommodating variability. ► We have lattices for abstract concepts, relations, as well as attributes. [Person lattice diagram: person splits into Non-3rd and Non-1st; 1st and 2nd fall under Non-3rd, 2nd and 3rd under Non-1st, and Excl./Incl. under 1st; see the sketch below]
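The lattice can be treated as a partial order over annotation values, so that a coarse value chosen for one language remains compatible with a finer value chosen for another. The sketch below encodes one reading of the person lattice shown here; a full lattice would allow multiple parents (e.g. 2nd under both Non-3rd and Non-1st), which this simplified dictionary does not capture.

```python
# One reading of the UMR person lattice, stored as child -> parent links (illustrative;
# a proper lattice would allow a value such as "2nd" to have more than one parent).
parent = {
    "Non-3rd": "person", "Non-1st": "person",
    "1st": "Non-3rd", "2nd": "Non-3rd", "3rd": "Non-1st",
    "Excl.": "1st", "Incl.": "1st",
}

def subsumes(coarse, fine):
    """True if the coarse-grained value dominates the fine-grained one (or they are equal)."""
    while fine is not None:
        if fine == coarse:
            return True
        fine = parent.get(fine)
    return False

print(subsumes("Non-3rd", "Excl."))  # True: exclusive 1st person is a kind of non-3rd person
```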
  • 143. Wordhood vs concepthood across languages ► The mapping between words and concepts in languages is not one-to-one: UMR designs specifications for complicated mappings between words and concepts. ► Multiple words can map to one concept (e.g., multi-word expressions) ► One word can map to multiple concepts (morphological complexity)
  • 144. Multiple words can map to a single (discontinuous) concept (x0/帮忙-01 :aspect Performance :arg0 (x1/地理学) :affectee (x2/我) :degree (x3/大)) 地理学帮了我很大的忙。 “Geography has helped me a lot” (w / want-01 :Aspect State :ARG0 (p / person :ref-person 3rd :ref-number Singular) :ARG1 (g / give-up-07 :ARG0 p :ARG1 (t / that) :aspect Performance :modpred w) :ARG1-of (c / cause-01 :ARG0 (a / umr-unknown)) :aspect State) “Why would he want to give that up?”
  • 145. One word maps to multiple UMR concepts ► One word containing predicate and arguments Sanapaná: yavhan anmen m-e-l-yen-ek honey alcohol NEG-2/3M-DSTR-drink-POT "They did not drink alcohol from honey." (e / elyama :actor (p / person :ref-person 3rd :ref-number Plural) :undergoer (a / anmen :material (y/ yavhan)) :modstr FullNeg :aspect Habitual) ► Argument Indexation: Identify both predicate concept and argument concept, don’t morphologically decompose word
  • 146. One word maps to multiple UMR concepts ► One word containing predicate and arguments Arapaho: he'ih'iixooxookbixoh'oekoohuutoono' he'ih'ii-xoo-xook- bixoh'oekoohuutoo-no' NARR.PST.IPFV-REDUP-through-make.hand.appear.quickly-PL ``They were sticking their hands right through them [the ghosts] to the other side.'' (b/ bixoh'oekoohuutoo `stick hands through' :actor (p/ person :ref-person 3rd :ref-number Plural) :theme (h/ hands) :undergoer (g/ [ghosts]) :aspect Endeavor :modstr FullAff) ► Noun Incorporation (less grammaticalized): identify predicate and argument concept
  • 147. UMR-Writer ► The annotation interface we use for UMR annotation is called UMR-Writer ► UMR-Writer includes interfaces for project management, sentence-level and document-level annotation, as well as lexicon (frame file) creation. ► UMR-Writer has both keyboard-based and click-based interfaces to accommodate the annotation habits of different annotators. ► UMR-Writer is web-based and supports UMR annotation for a variety of languages and formats. So far it supports Arabic, Arapaho, Chinese, English, Kukama, Navajo, and Sanapana. It can easily be extended to more languages.
  • 148. UMR writer: Project management
  • 149. UMR writer: Project management
  • 151. UMR writer: Lexicon interface
  • 153. UMR summary ► UMR is a rooted, directed, node-labeled and edge-labeled document-level graph. ► UMR is a document-level meaning representation that builds on sentence-level meaning representations ► UMR aims to achieve semantic stability across syntactic variations and support logical inference ► UMR is a cross-lingual meaning representation that separates language-general aspects of meaning from those that are language-specific ► We are annotating UMRs for English, Chinese, Arabic, Arapaho, Kukama, Sanapana, Navajo, and Quechua
  • 154. Use cases of UMR ► Temporal reasoning ► UMR can be used to extract temporal dependencies, which can then be used to perform temporal reasoning ► Knowledge extraction ► UMR annotates aspect, and this can be used to extract habitual events or states, which are typical knowledge forms ► Factuality determination ► UMR annotates modal dependencies, and this can be used to verify the factuality of events or claims ► As an intermediate representation for dialogue systems where more control is needed ► UMR annotates entities and coreference, which helps track dialogue states
  • 155. Planned UMR activities • The DMR international workshops • UMR summer schools, tentatively in 2024 and 2025 • UMR shared tasks once we have a sufficient amount of UMR-annotated data as well as evaluation metrics and baseline parsing models
  • 156. References Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Koehn, P., Palmer, M., and Schneider, N. (2013). Abstract meaning representation for sembanking. In Proceedings of the 7th Linguistic Annotation Workshop and Interoperability with Discourse, pages 178–186. Hartmann, I., Haspelmath, M., and Taylor, B., editors (2013). The Valency Patterns Leipzig online database. Max Planck Institute for Evolutionary Anthropology, Leipzig. Van Gysel, J. E. L., Vigus, M., Chun, J., Lai, K., Moeller, S., Yao, J., O’Gorman, T. J., Cowell, A., Croft, W. B., Huang, C. R., Hajic, J., Martin, J. H., Oepen, S., Palmer, M., Pustejovsky, J., Vallejos, R., and Xue, N. (2021). Designing a uniform meaning representation for natural language processing. Künstliche Intelligenz, pages 1–18. Vigus, M., Van Gysel, J. E., and Croft, W. (2019). A dependency structure annotation for modality. In Proceedings of the First International Workshop on Designing Meaning Representations, pages 182–198. Yao, J., Qiu, H., Min, B., and Xue, N. (2020). Annotating temporal dependency graphs via crowdsourcing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 5368–5380. Yao, J., Qiu, H., Zhao, J., Min, B., and Xue, N. (2021). Factuality assessment as modal dependency parsing. In Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers), pages 1540–1550. Zhang, Y. and Xue, N. (2018). Structured interpretation of temporal relations. In Proceedings of LREC 2018.
  • 157. Acknowledgements We would like to acknowledge the support of the National Science Foundation: • NSF IIS (2018): “Building a Uniform Meaning Representation for Natural Language Processing”, awarded to Brandeis (Xue, Pustejovsky), Colorado (M. Palmer, Martin, and Cowell) and UNM (Croft). • NSF CCRI (2022): “Building a Broad Infrastructure for Uniform Meaning Representations”, awarded to Brandeis (Xue, Pustejovsky) and Colorado (A. Palmer, M. Palmer, Cowell, Martin), with Croft as consultant. All views expressed in this paper are those of the authors and do not necessarily represent the view of the National Science Foundation.
  • 159. Meaning Representations for Natural Languages Tutorial Part 3a Modeling Meaning Representation: SRL Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 160. Who did what to whom, when, where and how? (Gildea and Jurafsky, 2000; Màrquez et al., 2008) 160 Semantic Role Labeling (SRL)
  • 161. Semantic Role Labeling (SRL), Step 1: Predicate Identification: identify all predicates in the sentence. Example: “Derik broke the window with a hammer to escape.” → predicates: broke, escape
  • 162. Semantic Role Labeling (SRL), Step 2: Sense Disambiguation: classify the sense of each predicate (here, broke → break.01). English PropBank break.01: A0: breaker, A1: thing broken, A2: instrument, A3: pieces, A4: arg1 broken away from what? FrameNet frame Breaking_apart: Pieces, Whole, Criterion, Manner, Means, Place… VerbNet class Break-45.1: Agent, Patient, Instrument, Result. Example: “Derik broke the window with a hammer to escape.”
  • 163. Semantic Role Labeling (SRL), Step 3: Argument Identification: find all arguments of each predicate (here, broke / break.01). Argument identification can either be identification of the argument span (span SRL) or identification of the argument head (dependency SRL). Example: “Derik broke the window with a hammer to escape”
  • 164. Semantic Role Labeling (SRL), Step 4: Argument Classification: assign a semantic label to each role. Example: [Derik: breaker] [broke: break.01] [the window: thing broken] [with a hammer: instrument] [to escape: purpose]
  • 165. Semantic Role Labeling (SRL), Step 4 with PropBank labels: [Derik: A0 breaker] [broke: break.01] [the window: A1 thing broken] [with a hammer: A2 instrument] [to escape: AM-PRP purpose]
  • 166. Semantic Role Labeling (SRL), Step 5: Global Optimization: apply global constraints over predicates and arguments. Full pipeline: 1 Predicate Identification, 2 Sense Disambiguation, 3 Argument Identification, 4 Argument Classification, 5 Global Optimization (see the sketch below).
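Putting the five steps together, the pipeline's output for the running example can be summarized as a structure like the following. This is a sketch of one possible output format, not the format of any particular system or dataset.

```python
# Sketch of structured SRL output for "Derik broke the window with a hammer to escape."
# (PropBank-style labels; the dictionary layout is illustrative.)
srl_output = {
    "tokens": ["Derik", "broke", "the", "window", "with", "a", "hammer", "to", "escape"],
    "predicates": [
        {
            "token_index": 1,                       # step 1: predicate identification
            "sense": "break.01",                    # step 2: sense disambiguation
            "arguments": [                          # steps 3-4: identification + classification
                {"span": (0, 0), "label": "A0"},        # breaker
                {"span": (2, 3), "label": "A1"},        # thing broken
                {"span": (4, 6), "label": "A2"},        # instrument
                {"span": (7, 8), "label": "AM-PRP"},    # purpose
            ],
        },
    ],
}
# Step 5 (global optimization) would rescore this structure under global constraints,
# e.g. ruling out overlapping argument spans for the same predicate.
print(len(srl_output["predicates"][0]["arguments"]), "arguments for break.01")
```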
  • 167. 167 Outline q Early SRL approaches [< 2017] q Typical neural SRL model components q Performance analysis q Syntax-aware neural SRL models q What, When and Where? q Performance analysis q How to incorporate Syntax? q Syntax-agnostic neural SRL models q Performance Analysis q Do we really need syntax for SRL? q Are high quality contextual embedding enough for SRL task? q Practical SRL systems q Should we rely on this pipelined approach? q End-to-end SRL systems q Can we jointly predict dependency and span? q More recent approaches q Handling low-frequency exceptions q Incorporate semantic role label definitions q SRL as MRC task q Practical SRL system evaluations q Are we evaluating SRL systems correctly? q Conclusion
  • 168. 168 Outline q Early SRL approaches q Typical neural SRL model components q Performance analysis q Syntax-aware neural SRL models q What, When and Where? q Performance analysis q How to incorporate Syntax? q Syntax-agnostic neural SRL models q Performance Analysis q Do we really need syntax for SRL? q Are high quality contextual embedding enough for SRL task? q Practical SRL systems q Should we rely on this pipelined approach? q End-to-end SRL systems q Can we jointly predict dependency and span? q More recent approaches q Handling low-frequency exceptions q Incorporate semantic role label definitions q SRL as MRC task q Practical SRL system evaluations q Are we evaluating SRL systems correctly? q Conclusion
  • 169. Early SRL Approaches Ø 2 to 3 steps to obtain the complete predicate-argument structure Ø Predicate Identification Ø Generally not treated as a task, as all the existing SRL datasets provide gold predicate locations. Ø Predicate sense disambiguation Ø Logistic Regression [Roth and Lapata, 2016] Ø Argument Identification Ø Binary classifier [Pradhan et al., 2005; Toutanova et al., 2008] Ø Role Labeling Ø Labeling is performed using a classifier (SVM, logistic regression) Ø Argmax over roles will result in a local assignment Ø Requires Feature Engineering Ø Mostly Syntactic [Gildea and Jurafsky, 2002] Ø Re-ranking Ø Enforce linguistic and structural constraints (e.g., no overlaps, discontinuous arguments, reference arguments, ...) Ø Viterbi decoding (k-best list with constraints) [Täckström et al., 2015] Ø Dynamic programming [Täckström et al., 2015; Toutanova et al., 2008] Ø Integer linear programming [Punyakanok et al., 2008] Ø Re-ranking [Toutanova et al., 2008; Björkelund et al., 2009]
  • 170. 170 Outline q Early SRL approaches q Typical neural SRL model components q Performance analysis q Syntax-aware neural SRL models q What, When and Where? q Performance analysis q How to incorporate Syntax? q Syntax-agnostic neural SRL models q Performance Analysis q Do we really need syntax for SRL? q Are high quality contextual embedding enough for SRL task? q Practical SRL systems q Should we rely on this pipelined approach? q End-to-end SRL systems q Can we jointly predict dependency and span? q More recent approaches q Handling low-frequency exceptions q Incorporate semantic role label definitions q SRL as MRC task q Practical SRL system evaluations q Are we evaluating SRL systems correctly? q Conclusion
  • 171. Encoder Classifier Embedder Input Sentence Word embeddings - FastText, GloVe - ELMo, BERT Types of encoder - LSTMs, Attention - MLP Typical Neural SRL Components 171 A typical neural SRL model contains three components Ø Classifier Ø Assign a semantic role label to each token in the input sentence. [Local + Global] Ø Encoder: Ø Encodes the context information to each token. Ø Embedder: Ø Represent input token into continuous vector representation.
  • 172. Encoder Classifier Embedder Input Sentence Word embeddings - FastText, GloVe - ELMo, BERT Neural SRL Components – Embedder Ø Embedder: Ø Represent each input token as a continuous vector representation. He had dared to defy nature Embedder Ø Could be static or dynamic embeddings Ø Could include syntax information Ø Usually, a binary flag Ø 0 → represents no predicate Ø 1 → represents the predicate End-to-end systems do not include this flag
  • 173. Encoder Classifier Embedder Input Sentence Word embeddings - FastText, GloVe - ELMo, BERT Dynamic Embeddings Merchant et al., 2020 Neural SRL Components – Embedder Static Embeddings GLoVe: • He et al., 2017 • Strubell et al., 2018 SENNA: • Ouchi et al., 2018 ELMo: • Marcheggiani et al., 2017 • Ouchi et al., 2018 • Li et al., 2019 • Lyu et al., 2019 • Jindal et al., 2020 • Li et al., 2020 BERT: • Shi et al., 2019 • Jindal et al., 2020 • Li et al., 2020 BERT: • Shi et al., 2019 • Conia et al., 2020 • Zhang et al., 2021 • Tian et al., 2022 RoBERTa: • Conia et al., 2020 • Blloshmi et al., 2021 • Fei et al., 2021 • Wang et al., 2022 • Zhang et al. 2022 XLNet: • Zhou et al., 2020 • Tian et al., 2022 173 Ø Embedder: Ø Represent input token into continuous vector representation.
  • 174. Performance Analysis: best performing model for each word embedding type (dataset: CoNLL09 EN). WSJ F1: Random 85.28; GloVe (Cai et al., 2018) 89.6; ELMo (Li et al., 2019) 91.4; BERT (Conia et al., 2020) 91.5; BERT (Conia et al., 2020) 92.6; RoBERTa (Wang et al., 2022) 93.3. Brown F1: Random 75.09; GloVe (He et al., 2018) 79.3; ELMo (Li et al., 2019) 83.28; BERT (Conia et al., 2020) 84.67; BERT (Conia et al., 2020) 85.9; RoBERTa (Wang et al., 2022) 87.2. (Random and GloVe are static embeddings.)
  • 175. Encoder Classifier Embedder Input Sentence Neural SRL Components – Encoder 175 Ø Encoder: Ø Encodes the context information to each token. Types of encoder - BiLSTMs - Attention He had dared to defy nature Embedder Encoder Left pass Right pass Encoder could be Ø Stacked BiLSTMs or some variant of LSTMs Ø Attention Network Ø Include syntax information
  • 176. Encoder Classifier Embedder Input Sentence Neural SRL Components – Classifier 176 Ø Classifier Ø Assign a semantic role label to each token in the input sentence. He had dared to defy nature Embedder Encoder Usually a FF followed by Softmax - MLP Classifier B-A0 0 0 B-A2 I-A2 I-A2
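The three components just described can be put together in a few dozen lines. The PyTorch-style sketch below is a generic illustration of an embedder (with the binary predicate-indicator flag), a BiLSTM encoder, and a per-token classifier; dimensions, names, and layer choices are placeholders rather than any specific published model.

```python
import torch
import torch.nn as nn

class SimpleSRLTagger(nn.Module):
    """Minimal sketch of the generic embedder -> encoder -> classifier SRL architecture.
    Hyper-parameters and layer choices are placeholders, not a specific published system."""

    def __init__(self, vocab_size, num_labels, word_dim=100, hidden_dim=300, layers=2):
        super().__init__()
        # Embedder: word embedding plus a 1-dim binary predicate-indicator flag.
        self.word_emb = nn.Embedding(vocab_size, word_dim)
        # Encoder: stacked BiLSTM over the sentence.
        self.encoder = nn.LSTM(word_dim + 1, hidden_dim, num_layers=layers,
                               batch_first=True, bidirectional=True)
        # Classifier: per-token feed-forward layer over BIO role labels.
        self.classifier = nn.Linear(2 * hidden_dim, num_labels)

    def forward(self, token_ids, predicate_flags):
        # token_ids: (batch, seq_len); predicate_flags: (batch, seq_len) with 1 at the predicate.
        x = torch.cat([self.word_emb(token_ids), predicate_flags.unsqueeze(-1).float()], dim=-1)
        states, _ = self.encoder(x)
        return self.classifier(states)            # (batch, seq_len, num_labels) logits

# Usage sketch: one 9-token sentence, predicate at position 1.
model = SimpleSRLTagger(vocab_size=10000, num_labels=20)
tokens = torch.randint(0, 10000, (1, 9))
flags = torch.zeros(1, 9, dtype=torch.long)
flags[0, 1] = 1
print(model(tokens, flags).shape)  # torch.Size([1, 9, 20])
```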
  • 177. 177 Outline q Early SRL approaches q Typical neural SRL model components q Performance analysis q Syntax-aware neural SRL models q What, When and Where? q Performance analysis q How to incorporate Syntax? q Syntax-agnostic neural SRL models q Performance Analysis q Do we really need syntax for SRL? q Are high quality contextual embedding enough for SRL task? q Prac/cal SRL systems q Should we rely on this pipelined approach? q End-to-end SRL systems q Can we jointly predict dependency and span? q More recent approaches q Handling low-frequency excepcons q Incorporate semancc role label definicons q SRL as MRC task q Prac/cal SRL system evalua/ons q Are we evaluacng SRL systems correctly? q Conclusion
  • 178. What and Where Syntax? What syntax for SRL? Everything or anything that explains the syntactic structure of the sentence. Example: “[Derick] broke the [window] with a [hammer] to [escape].” Surface form: Derick broke the window with a hammer to escape . Lemma form: Derick break the window with a hammer to escape . U{X}POS: PROPN VERB DET NOUN ADP DET NOUN PART VERB PUNCT. Dependency relations: nsubj, det, obj, mark, det, obl, mark, obl, ROOT. Parsed with the UDPipe parser: http://lindat.mff.cuni.cz/services/udpipe/
  • 179. Syntax at Embedder Concatenate {POS, dependency relation, dependency head and other syntactic information} Where the Syntax is being used? Marcheggiani et al.,2017b Li et al., 2018 He et al., 2018 Wang et al., 2019 Kasai et al., 2019 HE et al., 2019 Li et al., 2020 Zhou et al., 2020 179 Encoder Classifier Embedder Input Sentence Word embeddings - FastText, GloVe - ELMo, BERT EMB
  • 180. Syntax at Encoder Dependency tree - Graphs - LSTMs Trees Marcheggiani et al., 2017 Zhou et al., 2020 Marcheggiani et al., 2020 Zhang et al., 2021 Tian et al., 2022 180 Encoder Classifier Embedder Input Sentence Types of encoder - BiLSTMs - Attention ENC Where the Syntax is being used?
  • 181. Joint Learning At what level Syntax is used? Strubell et al., 2018 Shi et al., 2020 Multi-task learning 181 Encoder Classifier Embedder Input Sentence Word embeddings - FastText, GloVe - ELMo, BERT Types of encoder - BiLSTMs - Attention - MLP
  • 182. Performance Analysis: comparing syntax-aware models, WSJ F1 (dataset: CoNLL09 EN). Marcheggiani et al., 2017b: 87.7; Marcheggiani et al., 2017: 88; He et al., 2018: 89.5; Li et al., 2018: 89.8; Kasai et al., 2019: 90.2; He et al., 2019: 90.86; Lyu et al., 2019: 90.99; Zhou et al., 2020: 91.27; Li et al., 2020: 91.7; Fei et al., 2021: 92.83. Each model is marked with where the syntax is injected (Emb, Enc, or Enc + Emb); models from 2019 onward fall under the BERT/fine-tune regime (chart annotations: +2.0, -2.9).
  • 183. Dataset: CoNLL09 EN Comparing Syntax-aware models Observations q Syntax at encoder level provides the best performance. q Most likely, the encoder is best suited for incorporating dependency or constituent relations. q BERT models raised the bar q With the largest improvement on the out-of-domain dataset q However, the improvement since 2019 is marginal
  • 184. A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling Marcheggiani et al., 2017 Ø Predict semantic dependency edges between predicates and arguments. Ø Use predicate-specific roles (such as make-A0 instead of A0) as opposed to generic sequence labeling task. 184 Syntax at embedder level Diego Marcheggiani, Anton Frolov, and Ivan Titov. 2017. A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017), pages 411–420, Vancouver, Canada. Association for Computational Linguistics.
  • 185. Marcheggiani et al., 2017: Embedder (input word representation). Wp → randomly initialized word embeddings; Wr → pre-trained word embeddings; PO → randomly initialized POS embeddings; Le → randomly initialized lemma embeddings; plus a predicate-specific binary feature. [Diagram: each token of “He had dared to defy nature” is represented as the concatenation Wp ⊕ Wr ⊕ PO ⊕ Le] Syntax at embedder level
  • 186. Marcheggiani et al., 2017: Encoder. Several BiLSTM layers - capturing both the left and the right context - each BiLSTM layer takes the lower layer’s output as input. [Diagram: “He had dared to defy nature” passed through the Embedder and then the stacked Encoder] Syntax at embedder level
  • 187. Marcheggiani et al., 2017: Preparation for the classifier. Provide the predicate’s hidden state as another input to the classifier, along with each token’s hidden state (“He had dared to defy nature”; see the sketch below). This yields approximately +6% F1 on CoNLL09 EN. The two ways of encoding predicate information, using the predicate-specific flag at the embedder level and incorporating the predicate state in the classifier, turn out to be complementary. Syntax at embedder level
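A short sketch of that classifier preparation: every token's encoder state is concatenated with the predicate's encoder state before role scoring (simplified from the idea in Marcheggiani et al., 2017; dimensions and the label count are placeholders).

```python
import torch
import torch.nn as nn

hidden_dim = 600                        # size of one BiLSTM output state (placeholder)
states = torch.randn(1, 6, hidden_dim)  # encoder output for "He had dared to defy nature"
pred_index = 2                          # position of the predicate "dared"

pred_state = states[:, pred_index, :]                          # (1, hidden_dim)
pred_expanded = pred_state.unsqueeze(1).expand_as(states)      # repeat it for every token
classifier_input = torch.cat([states, pred_expanded], dim=-1)  # (1, 6, 2 * hidden_dim)

role_scorer = nn.Linear(2 * hidden_dim, 20)  # 20 = number of role labels (placeholder)
print(role_scorer(classifier_input).shape)   # torch.Size([1, 6, 20])
```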
  • 188. Marcheggiani et al., 2017: results (dataset: CoNLL09 EN). WSJ F1: Björkelund et al. (2010) 86.9; Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7. Brown F1: Björkelund et al. (2010) 75.6; Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7. Syntax at embedder level
  • 189. Marcheggiani et al., 2017 Takeaways Ø Appending POS does help → approx. 1 F1 point gain Ø Predicate-specific encoding does help → approx. 6 F1 point gain Ø Quite effective for the classification of arguments which are far from the predicate in terms of word distance. Ø Note: substantial improvement on EN out-of-domain data over previous work. [Diagram: “He had dared to defy nature” passed through Embedder, Encoder, and Classifier, producing A0 0 0 0 A2 0] Syntax at embedder level
  • 190. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling Marcheggiani et al., 2017b [Diagram: “He had dared to defy nature” passed through Embedder, Encoder, K GCN layers, and Classifier, producing A0 0 0 0 A2 0] Ø Basic SRL components remain the same as in [Marcheggiani et al., 2017] Ø GCN layers are inserted between the Encoder and the Classifier. Ø They re-encode the encoder representations based on the syntactic structure of the sentence. Ø Modeling syntactic dependency structure Syntax at encoder level Diego Marcheggiani and Ivan Titov. 2017. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1506–1515, Copenhagen, Denmark. Association for Computational Linguistics.
  • 191. What is a syntactic GCN? Marcheggiani et al., 2017b. Example sentence: “He had dared to defy nature”, with dependency edges nsubj, xcomp, obj, aux, mark. Ø Self loops Ø Allow the input feature representation of a node to affect its own induced representation: each node’s state is transformed by a dedicated self-loop matrix W_self(1), so the update built up so far for the node “He” is h_He(k+1) = h_self(k) + … Syntax at encoder level
  • 192. What is a syntactic GCN? Marcheggiani et al., 2017b. Example sentence: “He had dared to defy nature”. Ø Syntactic children set of a node Ø Contributions from a node’s dependents are added along the dependency edges, each transformed by a label-specific weight matrix (W_nsubj(1), W_xcomp(1), W_obj(1), W_aux(1), W_mark(1)) on top of the self loop: h_He(k+1) = h_self(k) + h_child-of(k) + … Syntax at encoder level
  • 193. Marcheggiani et al., 2017b — What is a syntactic GCN? Syntax at the encoder level.
[GCN layer figure: the same dependency arcs, now also followed in the reverse direction with separate weights ×W(1)_subj′, ×W(1)_xcomp′, ×W(1)_aux′.]
Ø Syntactic head of a child
Ø Allows information to flow both to and from the dependent
Full update for the token "He": h_He^(k+1) = h_He_self^(k) + h_He_child_of^(k) + h_He_parent_of^(k)
  • 194. Marcheggiani et al., 2017b — What is a syntactic GCN? Syntax at the encoder level.
[Figure: two stacked GCN layers over the dependency graph of "He had dared to defy nature", with the same ×W(1)_self and per-label edge weights in each layer.]
Ø Want to encode information from nodes k hops away
Ø Use k layers to encode the k-order neighborhood
Ø Helped capture the widened syntactic neighborhood
(A minimal code sketch of one such layer follows below.)
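A minimal sketch of one syntactic GCN layer in the spirit of Marcheggiani et al. (2017b), not their actual code: separate weight matrices for the self-loop, for messages from a token's dependents, and for the message from its syntactic head (the per-label weights and edge gates of the full model are omitted). Class and variable names are illustrative; assumes PyTorch.

import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    # One GCN layer over a dependency tree (sketch, not the authors' code).
    # h_i^(k+1) = ReLU( W_self h_i + W_parent h_head(i) + sum_{j: head(j)=i} W_child h_j )
    def __init__(self, dim):
        super().__init__()
        self.w_self = nn.Linear(dim, dim)
        self.w_child = nn.Linear(dim, dim)    # messages coming up from dependents
        self.w_parent = nn.Linear(dim, dim)   # message coming down from the syntactic head

    def forward(self, h, heads):
        # h: (n, dim) token states; heads[i] = index of token i's head, or -1 for the root
        n = h.size(0)
        new_h = []
        for i in range(n):
            msg = self.w_self(h[i])                       # self loop
            if heads[i] >= 0:
                msg = msg + self.w_parent(h[heads[i]])    # edge from the head of token i
            for j in range(n):
                if heads[j] == i:
                    msg = msg + self.w_child(h[j])        # edges from the dependents of token i
            new_h.append(torch.relu(msg))
        return torch.stack(new_h)

# Stacking k such layers exposes each token to its k-hop syntactic neighborhood.
layer = SyntacticGCNLayer(dim=8)
h = torch.randn(6, 8)                 # "He had dared to defy nature"
heads = [2, 2, -1, 4, 2, 4]           # toy dependency heads (illustrative)
out = layer(layer(h, heads), heads)   # two hops of syntactic context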
  • 195. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling — Marcheggiani et al., 2017b. Syntax at the encoder level.
[Architecture figure: "He had dared to defy nature" → Embedder (Wp, Wr, PO, Le per token) → Encoder with K GCN layers → Classifier producing A0 O O O A2 O.]
Ø Claim: the GCN helps capture long-range dependencies.
Ø But: encoding the k-hop neighborhood seems to hurt performance (k = 1 works best).
  • 196. Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling — Marcheggiani et al., 2017b. Syntax at the encoder level.
[Same architecture figure as the previous slide.]
Ø Gold dependency parses can significantly improve performance.
[Bar chart, dev set F1: No Syntax 82.7; GCN (Predicted) 83.3; GCN (Gold) 86.4.]
  • 197. Marcheggiani et al., 2017b. Syntax at the encoder level. Dataset: CoNLL09 EN.
[Bar charts, F1:]
WSJ — Björkelund et al. (2010) 86.9; Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7; Marcheggiani et al. (2017) 88.0
Brown — Björkelund et al. (2010) 75.6; Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7; Marcheggiani et al. (2017) 77.2
  • 198. Marcheggiani et al., 2017b — Takeaways
[Architecture figure: Embedder (Wp, Wr, PO, Le) → Encoder with K GCN layers → Classifier producing A0 O O O A2 O.]
Ø Appending POS does help → approx. 1 F1 point gain
Ø Predicate-specific encoding does help → approx. 6 F1 point gain
Ø Modeling syntactic dependencies via a syntactic GCN further improves SRL performance, but NEEDS a high-quality syntactic parser
Ø Note: improvement over previous work only on the EN in-domain set
Ø However, previous work did show improvements on the out-of-domain set
  • 199. A Unified Syntax-aware Framework for Semantic Role Labeling — Li et al., 2018. Syntax at the encoder level.
[Architecture figure: "He had dared to defy nature" → Embedder → Encoder → Syntactic Layer → Classifier.]
The syntactic layer can be one of:
Ø Syntactic GCN [Marcheggiani et al., 2017b]
Ø Syntax-aware LSTM [Qian et al., 2017]: an extension of BiLSTMs that incorporates syntactic information into each word representation by introducing an additional gate
Ø Tree-LSTM [Tai et al., 2015]: an extension of BiLSTMs that models tree-structured topologies
Zuchao Li, Shexia He, Jiaxun Cai, Zhuosheng Zhang, Hai Zhao, Gongshen Liu, Linlin Li, and Luo Si. 2018. A Unified Syntax-aware Framework for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2401–2411, Brussels, Belgium. Association for Computational Linguistics.
  • 200. A Unified Syntax-aware Framework for Semantic Role Labeling — Li et al., 2018. Syntax at the encoder level.
[Architecture figure: Embedder → Encoder → Syntactic Layer → Highway Layers → Classifier.]
Ø Adds residual connections
Ø Allows the model to skip syntactic information if and when necessary
Ø Adds highway layers (a minimal sketch follows below)
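A minimal sketch of a highway layer (Srivastava et al., 2015) of the kind added here, assuming PyTorch; names are illustrative and this is not the authors' implementation. The learned gate lets the layer fall back to the untransformed input, which is how a model can "skip" an injected syntactic signal when it is unhelpful.

import torch
import torch.nn as nn

class Highway(nn.Module):
    # y = t * ReLU(W x) + (1 - t) * x, with t = sigmoid(W_t x) a learned gate.
    def __init__(self, dim):
        super().__init__()
        self.transform = nn.Linear(dim, dim)
        self.gate = nn.Linear(dim, dim)

    def forward(self, x):
        t = torch.sigmoid(self.gate(x))   # per-dimension carry/transform gate
        return t * torch.relu(self.transform(x)) + (1.0 - t) * x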
  • 201. Li et al., 2018. Syntax at the encoder level. Dataset: CoNLL09 EN.
[Bar charts, F1 (earlier systems use GloVe; Li et al., 2018 uses ELMo):]
WSJ — Täckström et al. (2015) 87.3; FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7; Marcheggiani et al. (2017) 88.0; Li et al., 2018 89.8
Brown — Täckström et al. (2015) 75.7; FitzGerald et al. (2015) 75.2; Roth and Lapata (2016) 76.1; Marcheggiani et al. (2017) 77.7; Marcheggiani et al. (2017) 77.2; Li et al., 2018 79.8
  • 202. A Unified Syntax-aware Framework for Semantic Role Labeling — Li et al., 2018. Syntax at the encoder level.
[Architecture figure: Embedder → Encoder → Syntactic Layer → Highway Layers → Classifier.]
Takeaways
Ø Needs a high-quality parser to substantially improve model performance [90.5 → 89.5, CoNLL09 EN test]
Ø Residual connections + a deep encoder + syntactic GCN improve over the syntactic GCN alone [88.0 → 89.8, CoNLL09 EN test]
Ø However, it uses ELMo word embeddings.
  • 203. Outline
q Early SRL approaches
q Typical neural SRL model components
  q Performance analysis
q Syntax-aware neural SRL models
  q What, when and where?
  q Performance analysis
  q How to incorporate syntax?
q Syntax-agnostic neural SRL models
  q Performance analysis
  q Do we really need syntax for SRL?
  q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
  q Should we rely on this pipelined approach?
  q End-to-end SRL systems
  q Can we jointly predict dependency and span?
q More recent approaches
  q Handling low-frequency exceptions
  q Incorporate semantic role label definitions
  q SRL as an MRC task
q Practical SRL system evaluations
  q Are we evaluating SRL systems correctly?
q Conclusion
  • 204. Syntax-Agnostic Models
He et al., 2017; He et al., 2018; Cai et al., 2018; Ouchi et al., 2018; Guan et al., 2019; Li et al., 2019; Shi et al., 2019; Conia et al., 2020; Jindal et al., 2020; Zhou et al., 2020; Conia et al., 2021; Blloshmi et al., 2021; Wang et al., 2022; Zhang et al., 2022
[Architecture figure: Input sentence → Embedder → Encoder → Classifier.]
Word embeddings: FastText, GloVe; ELMo, BERT
Types of encoder: BiLSTMs; attention; MLP
  • 205. Performance Analysis — comparing syntax-agnostic models. Dataset: CoNLL09 EN.
[Bar chart, WSJ F1, 2018 → 2022 (2019 onward: BERT / fine-tune regime):]
He et al., 2018: 88.7; Cai et al., 2018: 89.6; Li et al., 2019: 89.1; Guan et al., 2019: 89.6; Jindal et al., 2019: 90.8; Shi et al., 2019: 92.4; Zhou et al., 2020: 91.4; Conia et al., 2020: 92.6; Blloshmi et al., 2021: 92.4; Zhang et al., 2022: 92.2; Wang et al., 2022: 93.3
(chart annotations: +2.5, −2.1)
  • 206. Performance Analysis — comparing syntax-agnostic models. Dataset: CoNLL09 EN.
[Bar chart, Brown F1, 2018 → 2022 (2019 onward: BERT / fine-tune regime):]
He et al., 2018: 78.8; Cai et al., 2018: 79.0; Li et al., 2019: 78.9; Guan et al., 2019: 79.7; Jindal et al., 2019: 85.0; Shi et al., 2019: 85.7; Zhou et al., 2020: 87.3; Conia et al., 2020: 85.9; Blloshmi et al., 2021: 85.2; Zhang et al., 2022: 86.0; Wang et al., 2022: 87.2
(chart annotations: +2.3, −6.2)
  • 207. Dataset: CoNLL09 EN. Comparing syntax-agnostic models — Observations
q BERT-based models raised the bar
q With the largest improvement on the out-of-domain dataset
q However, the improvement since 2019 is marginal
  • 208. He et al., 2017 — Embedder / input word representation
[Figure: "He had dared to defy nature" → Embedder.]
wr → pre-trained word embeddings; predicate-specific feature [binary]
Ø Pre-trained word embeddings
Ø Use a predicate flag (a minimal sketch of this input representation follows below)
Luheng He, Kenton Lee, Mike Lewis, and Luke Zettlemoyer. 2017. Deep Semantic Role Labeling: What Works and What's Next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 473–483, Vancouver, Canada. Association for Computational Linguistics.
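A minimal sketch of that input representation — the pre-trained word vector concatenated with a binary predicate-indicator feature — in plain Python/NumPy; the word_vectors lookup is an illustrative stand-in, not the authors' code.

import numpy as np

def input_representation(tokens, predicate_index, word_vectors, dim=100):
    # word_vectors: dict mapping a token to its pre-trained embedding (illustrative stand-in)
    reps = []
    for i, tok in enumerate(tokens):
        emb = word_vectors.get(tok, np.zeros(dim))                 # pre-trained word embedding
        flag = np.array([1.0 if i == predicate_index else 0.0])    # binary predicate feature
        reps.append(np.concatenate([emb, flag]))
    return np.stack(reps)                                          # (n, dim + 1), fed to the encoder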
  • 209. He et al., 2017 — Encoder
[Figure: "He had dared to defy nature" → Embedder → Encoder.]
Ø Stacked BiLSTM
Ø Highway connections [Srivastava et al., 2015] — to alleviate the vanishing gradient problem
Ø Recurrent dropout [Gal et al., 2016] — to reduce overfitting
  • 210. He et al., 2017 — Classifier
[Figure: Embedder → Encoder → Classifier producing B-A0 O O B-A2 I-A2 I-A2.]
Classifier: a local MLP layer followed by a softmax, with global optimization via constrained A* decoding
Ø Constrained A* decoding
  Ø BIO constraint
  Ø Unique core roles
  Ø Continuation constraint
  Ø Reference constraint
  Ø Syntactic constraint
SRL constraints were previously discussed by Punyakanok et al. (2008) and Täckström et al. (2015). (A minimal sketch of the first two constraints follows below.)
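A minimal sketch, in plain Python, of the first two constraints — BIO well-formedness and unique core roles — that constrained decoding enforces when searching over tag sequences. The helper names and the A0–A5 core-role set are illustrative; this is not the authors' A* decoder.

CORE = {"A0", "A1", "A2", "A3", "A4", "A5"}

def violates_bio(prev_tag, tag):
    """An I-X tag may only continue a span of the same role (B-X or I-X immediately before it)."""
    if tag.startswith("I-"):
        role = tag[2:]
        return prev_tag not in ("B-" + role, "I-" + role)
    return False

def violates_unique_core(tags):
    """Each core role A0..A5 may be started at most once per predicate."""
    starts = [t[2:] for t in tags if t.startswith("B-") and t[2:] in CORE]
    return len(starts) != len(set(starts))

# e.g. violates_bio("O", "I-A0") -> True; violates_unique_core(["B-A0", "O", "B-A0"]) -> True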
  • 211. He et al., 2017. Dataset: CoNLL05.
[Bar charts, F1:]
WSJ — Surdeanu et al. (2007) 77.2; Toutanova et al. (2008) 79.7; Täckström et al. (2015) 79.9; FitzGerald et al. (2015) 79.4; Zhou and Xu (2015) 82.8; He et al. (2017) 83.1
Brown — Surdeanu et al. (2007) 67.7; Toutanova et al. (2008) 67.8; Täckström et al. (2015) 71.3; FitzGerald et al. (2015) 71.2; Zhou and Xu (2015) 69.4; He et al. (2017) 72.1
  • 212. He et al., 2017 — What is the model good at, and what kinds of mistakes does it make?
Label confusions: the model often confuses ARG2 with AM-DIR, AM-LOC and AM-MNR. These confusions can arise because many verb frames use ARG2 to represent semantic relations such as direction or location.
Attachment mistakes: these errors are closely tied to prepositional phrase (PP) attachment errors.
  • 213. He et al., 2017 — How well do LSTMs model global structural consistency, despite conditionally independent tagging decisions?
Long-range dependencies: performance tends to degrade, for all models, for arguments further from the predicate.
  • 214. He et al., 2017 — Takeaways
[Figure: Embedder → Encoder → Classifier producing B-A0 O O B-A2 I-A2 I-A2.]
Ø General label confusion between core arguments and contextual arguments is due to ambiguous definitions in the frame files.
Ø Layers of BiLSTMs help capture long-range predicate–argument structures.
Ø The number of BIO violations decreases when we use a deeper model.
Ø Deeper BiLSTMs are better at enforcing structural consistencies, although not perfectly.
  • 215. Tan et al., 2018 — Deep Semantic Role Labeling with Self-Attention
Do we really need all these hacks? :) Let's break recurrence and allow every position in the sentence to attend over all positions in the input sequence.
Ø No syntax
Ø Use a predicate-specific flag
Ø Use multi-head self-attention
Ø Use GloVe embeddings
[Figure: Embedder → Encoder (10× blocks of multi-head self-attention + RNN/CNN/FNN sublayer) → Softmax classifier producing B-A0 O O B-A2 I-A2 I-A2.]
Tan, Zhixing, Mingxuan Wang, Jun Xie, Yidong Chen, and Xiaodong Shi. "Deep semantic role labeling with self-attention." In Proceedings of the AAAI Conference on Artificial Intelligence, vol. 32, no. 1. 2018.
  • 216. Tan et al., 2018. Dataset: CoNLL05.
[Bar charts, F1:]
WSJ — Surdeanu et al. (2007) 77.2; Toutanova et al. (2008) 79.7; Täckström et al. (2015) 79.9; FitzGerald et al. (2015) 79.4; Zhou and Xu (2015) 82.8; He et al. (2017) 83.1; Tan et al., 2018 84.8
Brown — Surdeanu et al. (2007) 67.7; Toutanova et al. (2008) 67.8; Täckström et al. (2015) 71.3; FitzGerald et al. (2015) 71.2; Zhou and Xu (2015) 69.4; He et al. (2017) 72.1; Tan et al., 2018 74.1
  • 217. Deep Semantic Role Labeling with Self-Attention — Tan et al., 2018
[Figure: Embedder → Encoder (multi-head self-attention ×10) → Softmax classifier.]
Ø Positional embeddings are necessary to gain actual performance.
[Bar chart, dev set F1: No PE 20.0; PE 79.4; Timing PE 83.1.]
(A minimal sketch of the timing positional encoding follows below.)
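A minimal sketch of the sinusoidal ("timing") positional encoding, following the standard Transformer formulation; assumes NumPy and an even embedding dimension. Without some positional signal, self-attention is order-blind, which matches the large drop reported when positional encodings are removed.

import numpy as np

def timing_positional_encoding(n_positions, dim):
    # PE[pos, 2i]   = sin(pos / 10000^(2i/dim))
    # PE[pos, 2i+1] = cos(pos / 10000^(2i/dim))
    pos = np.arange(n_positions)[:, None]        # (n, 1)
    i = np.arange(0, dim, 2)[None, :]            # (1, dim/2)
    angles = pos / np.power(10000.0, i / dim)
    pe = np.zeros((n_positions, dim))
    pe[:, 0::2] = np.sin(angles)                 # even dimensions
    pe[:, 1::2] = np.cos(angles)                 # odd dimensions
    return pe                                    # added to the token embeddings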
  • 218. Tan et al., 2018 — Takeaways
[Figure: Embedder → Encoder (multi-head self-attention ×10) → Softmax classifier.]
Ø Substantial improvements on CoNLL05 WSJ as compared to [He et al., 2017]
Ø No need for CONSTRAINED decoding (it slows things down); just use argmax decoding: 83.1 → 83.0 [token classification]
Ø As reported earlier, model depth is the key, as compared against model width
Ø FNN seems a better choice than CNN and RNN when attention is used in the encoder
Ø Positional embeddings are necessary to gain actual performance
  • 219. Simple BERT models for relation extraction and SRL — Shi et al., 2019
Are high-quality contextual embeddings enough for the SRL task?
[Figure: BERT input "[CLS] He had dared to defy nature [SEP] dared [SEP]" → Encoder → Classifier producing A0 O O O A2 O.]
q Use a BERT LM to obtain predicate-aware contextualized embeddings for the encoder (a minimal sketch of the input packing follows below).
q BiLSTM as the encoder layer (1×).
q Concatenate the predicate hidden state to the hidden states of the remaining tokens, similar to [Marcheggiani et al., 2017], and feed them into a one-layer MLP classifier.
Shi, Peng, and Jimmy Lin. "Simple BERT models for relation extraction and semantic role labeling." arXiv preprint arXiv:1904.05255 (2019).
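A minimal sketch of the predicate-aware input packing, using the Hugging Face transformers API purely for illustration; the library calls and checkpoint name are assumptions for the sketch, not part of the original paper's code.

from transformers import BertTokenizer, BertModel  # assumed available for the sketch

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
model = BertModel.from_pretrained("bert-base-uncased")

sentence = "He had dared to defy nature"
predicate = "dared"

# Text-pair encoding yields: [CLS] sentence tokens [SEP] predicate [SEP]
inputs = tokenizer(sentence, predicate, return_tensors="pt")
hidden = model(**inputs).last_hidden_state   # (1, seq_len, 768) predicate-aware token states
# These states would then feed a (Bi)LSTM encoder and an MLP role classifier.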
  • 220. Shi et al., 2019 — Are high-quality contextual embeddings enough for the SRL task? Dataset: CoNLL05.
[Bar charts, F1:]
WSJ — FitzGerald et al. (2015) 79.4; Zhou and Xu (2015) 82.8; He et al. (2017) 83.1; Tan et al. (2018) ELMo 84.8; Strubell et al. (2018) ELMo 86.0; Shi et al. (2019) BERT-S 88.1; Shi et al. (2019) BERT-L 88.8
Brown — FitzGerald et al. (2015) 71.2; Zhou and Xu (2015) 69.4; He et al. (2017) 72.1; Tan et al. (2018) ELMo 74.1; Strubell et al. (2018) ELMo 76.5; Shi et al. (2019) BERT-S 80.9; Shi et al. (2019) BERT-L 82.1
(chart annotations: +2.1 on WSJ, +4.4 on Brown)
  • 221. Shi et al., 2019 — Are high-quality contextual embeddings enough for the SRL task?
[Figure: BERT input "[CLS] He had dared to defy nature [SEP] dared [SEP]" → Encoder → Classifier producing A0 O O O A2 O.]
Ø Are powerful contextualized embeddings all we need for SRL?
Ø Do we no longer need syntax to perform better on SRL?
Ø Do we know whether BERT embeddings encode syntax implicitly? Yes [Jawahar et al., 2019]
Ø Explicit syntactic information has been shown to further improve SoTA SRL performance.
  • 222. Dataset: CoNLL09 EN. Comparison of syntax-agnostic (SG) vs. syntax-aware (SA) models.
[Bar chart, WSJ F1, 2017 → 2022 (2019 onward: BERT / fine-tune regime); each bar is labeled SG or SA in the chart:]
Marcheggiani et al., 2017: 88.0; Cai et al., 2018: 89.6; Li et al., 2018: 89.8; Shi et al., 2019: 92.4; Lyu et al., 2019: 90.99; Conia et al., 2020: 92.6; Li et al., 2020: 91.7; Wang et al., 2022: 93.3; Fei et al., 2021: 92.83
  • 223. Dataset: CoNLL09 EN. Comparison of syntax-agnostic (SG) vs. syntax-aware (SA) models.
[Bar chart, Brown F1, 2017 → 2022 (2019 onward: BERT / fine-tune regime); each bar is labeled SG or SA in the chart:]
Marcheggiani et al., 2017b: 77.7; Cai et al., 2018: 79.0; Li et al., 2018: 79.8; Shi et al., 2019: 85.7; Kasai et al., 2019: 80.8; Zhou et al., 2020: 87.3; Zhou et al., 2020: 86.84; Wang et al., 2022: 87.2
  • 224. Outline
q Early SRL approaches
q Typical neural SRL model components
  q Performance analysis
q Syntax-aware neural SRL models
  q What, when and where?
  q Performance analysis
  q How to incorporate syntax?
q Syntax-agnostic neural SRL models
  q Performance analysis
  q Do we really need syntax for SRL?
  q Are high-quality contextual embeddings enough for the SRL task?
q Practical SRL systems
  q Should we rely on this pipelined approach?
  q End-to-end SRL systems
  q Can we jointly predict dependency and span?
q More recent approaches
  q Learn low-frequency exceptions
  q Incorporate semantic role label definitions
  q SRL as an MRC task
q Practical SRL system evaluations
  q Are we evaluating SRL systems correctly?
q Conclusion
  • 225. He et al., 2018 — a syntax-agnostic end-to-end SRL system
[Figure: "He had dared to defy nature" → Embedder → Encoder → Classifier.]
q Jointly predicts all predicates, argument spans, and the relations between them.
q Builds upon the coreference resolution model of [Lee et al., 2017].
q Embedder: no predicate location is specified; instead, word embeddings are concatenated with the output of a character CNN.
q Each edge is identified by independently predicting which role, if any, holds between every possible pair of text spans, while using aggressive beam pruning for efficiency. The final graph is simply the union of predicted SRL roles (edges) and their associated text spans (nodes).
Luheng He, Kenton Lee, Omer Levy, and Luke Zettlemoyer. 2018. Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), pages 364–369, Melbourne, Australia. Association for Computational Linguistics.
  • 226. He et al., 2018 — syntax-agnostic end-to-end SRL system
Task: predict a set of labeled predicate–argument relations Y ⊆ P × A × L, where P is the set of all tokens (candidate predicates), A is the set of all possible spans (candidate arguments), and L is the set of all SRL labels.
The model estimates P(y_{p,a} = l | X) for every predicate–argument pair, on top of the encoder representation.
  • 227. He et al., 2018 — syntax-agnostic end-to-end SRL system
To obtain predicate and argument representations for P(y_{p,a} = l | X):
Predicate representation: simply the BiLSTM output at the predicate's position index p.
Argument (span) representation contains the following:
- the BiLSTM outputs at the span end points
- a soft head word (an attention-weighted average over the words in the span)
- an embedded span-width feature
(A minimal sketch of the span representation follows below.)
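A minimal sketch of the argument span representation g(a) — endpoints, soft head word, and span-width embedding — assuming PyTorch; class names and dimensions are illustrative, not the authors' code.

import torch
import torch.nn as nn

class SpanRepresentation(nn.Module):
    # g(a) = [h_start ; h_end ; soft_head ; width_embedding]
    def __init__(self, dim, max_width=30, width_dim=20):
        super().__init__()
        self.head_scorer = nn.Linear(dim, 1)              # attention for the soft head word
        self.width_emb = nn.Embedding(max_width, width_dim)

    def forward(self, h, start, end):
        # h: (n, dim) BiLSTM outputs; [start, end] is an inclusive token span
        span = h[start:end + 1]                           # (width, dim)
        attn = torch.softmax(self.head_scorer(span), dim=0)
        soft_head = (attn * span).sum(dim=0)              # attention-weighted head word
        width = self.width_emb(torch.tensor(end - start))
        return torch.cat([h[start], h[end], soft_head, width])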
  • 228. He et al., 2018 — syntax-agnostic end-to-end SRL system
Compute unary scores for predicates and arguments: Φ_p(p) from the predicate representation g(p), and Φ_a(a) from the argument representation g(a); both feed into P(y_{p,a} = l | X).
  • 229. He et al., 2018 — syntax-agnostic end-to-end SRL system
In addition to the unary scores, compute a relation score Φ_rel(p, a, l) between every candidate predicate and every candidate argument span.
Number of possible relations: O(n³ |L|).
  • 230. He et al., 2018 — syntax-agnostic end-to-end SRL system
Beam pruning: two beams, B_a and B_p, store the candidate arguments and predicates, respectively. The candidates in each beam are ranked by their unary score (Φ_a or Φ_p).
A token such as "He" that is unlikely to be a predicate, based on its unary score, is removed and never forms a potential relation.
Number of possible relations: O(n³ |L|) → O(n² |L|).
(A minimal sketch of this pruning step follows below.)
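A minimal sketch of the pruning step in plain Python; the helper names and the beam-size factors lambda_p and lambda_a are illustrative assumptions for the sketch, not the authors' exact settings.

def prune(candidates, unary_score, beam_size):
    """Keep the top-`beam_size` candidates by their unary score (Phi_p or Phi_a)."""
    return sorted(candidates, key=unary_score, reverse=True)[:beam_size]

def candidate_relations(tokens, spans, score_p, score_a, lambda_p=0.4, lambda_a=0.8):
    """Only surviving predicates and argument spans are paired for relation scoring."""
    n = len(tokens)
    pred_beam = prune(list(range(n)), score_p, int(lambda_p * n))   # predicate beam B_p
    arg_beam = prune(spans, score_a, int(lambda_a * n))             # argument beam B_a
    return [(p, a) for p in pred_beam for a in arg_beam]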
  • 231. He et al., 2018 — syntax-agnostic end-to-end SRL system
The classifier combines the unary scores Φ_p(p), Φ_a(a) and the relation score Φ_rel(p, a, l) into a single combined score that defines P(y_{p,a} = l | X).
  • 232. He et al., 2018 — an end-to-end neural SRL model (syntax-agnostic)
[Figure: Embedder → Encoder → Classifier.]
[Bar chart, argument classification F1 on CoNLL05: WSJ — gold predicate 87.4, end-to-end 86.0; Brown — gold predicate 80.4, end-to-end 76.1.]
  • 233. He et al., 2018 — Takeaways (syntax-agnostic end-to-end SRL system)
Ø First end-to-end neural SRL model.
Ø Strong performance even against models given gold predicates.
Ø Empirically, the model does better at long-range dependencies and agreement with syntactic boundaries, but is weaker at global consistency, due to its strong independence assumptions.
  • 234. Strubell et al., 2018 — Linguistically-Informed Self-Attention for Semantic Role Labeling (LISA). Syntax strikes back: a syntax-aware end-to-end SRL system.
[Architecture figure: Embedder → multi-head self-attention + FF → syntactically-informed self-attention + FF → multi-head self-attention + FF; predicate + POS tagging FF; bilinear FF over predicate ("dared", "defy") and role representations producing B-A0 O O B-A2 I-A2 I-A2.]
- A multi-task learning framework with stacked multi-head self-attention
- Jointly predicts POS tags and predicates
- Performs parsing
- Attends to each token's syntactic parse parent while assigning semantic role labels
Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, and Andrew McCallum. 2018. Linguistically-Informed Self-Attention for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 5027–5038, Brussels, Belgium. Association for Computational Linguistics.
  • 235. Strubell et al., 2018 — syntax-aware end-to-end SRL system
[Architecture figure as on the previous slide.]
q Replace one attention head with the deep biaffine model of Dozat and Manning (2017).
q Use a biaffine operator U to obtain attention weights for that single head.
q Encode both the dependency and the dependency label.
(A minimal sketch of a biaffine attention head follows below.)
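A minimal sketch of a (deep) biaffine attention head in the style of Dozat and Manning (2017), assuming PyTorch; names are illustrative, and the full LISA head also predicts dependency labels and is trained with parse supervision, which is omitted here.

import torch
import torch.nn as nn

class BiaffineHead(nn.Module):
    # Attention head whose weights are supervised to point at each token's syntactic parent.
    # score[i, j] = q_i^T U k_j + q_i^T u
    def __init__(self, dim, head_dim):
        super().__init__()
        self.q = nn.Linear(dim, head_dim)   # 'dependent' projection
        self.k = nn.Linear(dim, head_dim)   # 'candidate head' projection
        self.U = nn.Parameter(torch.randn(head_dim, head_dim))
        self.u = nn.Parameter(torch.randn(head_dim))

    def forward(self, h):
        # h: (n, dim); returns an (n, n) attention distribution over candidate syntactic heads
        q, k = self.q(h), self.k(h)
        scores = q @ self.U @ k.T + (q @ self.u)[:, None]   # bilinear term + bias term
        return torch.softmax(scores, dim=-1)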
  • 236. Strubell et al., 2018 — syntax-aware end-to-end SRL system
[Figure: for "He had dared to defy nature", one attention head attends to the syntactic head of each token, while the remaining heads act as ordinary semantic heads.]
  • 237. Strubell et al., 2018 — Linguistically-Informed Self-Attention for Semantic Role Labeling. Syntax-aware end-to-end SRL system.
[Architecture figure as before.]
Role scoring combines a predicate-specific representation and an argument-specific representation through a bilinear transformation operator.
  • 238. Strubell et al., 2018. Dataset: CoNLL05. Syntax-aware (SA) end-to-end SRL system.
[Bar charts, F1:]
WSJ — Täckström et al. (2015) 79.9; FitzGerald et al. (2015) 79.4; Zhou and Xu (2015) 82.8; He et al. (2017) 83.1; He et al. (2018) 83.9; Tan et al. (2018) 84.8; Strubell et al. (2018) 86.0
Brown — Täckström et al. (2015) 71.3; FitzGerald et al. (2015) 71.2; Zhou and Xu (2015) 69.4; He et al. (2017) 72.1; He et al. (2018) 73.7; Tan et al. (2018) 74.1; Strubell et al. (2018) 76.5
  • 239. Strubell et al., 2018 — Takeaways (syntax-aware end-to-end SRL system)
[Architecture figure as before.]
Ø Shows strong performance gains over other methods both with and without gold predicate locations.
Ø Incorporating parse information helps resolve span boundary errors (merged spans, split spans, etc.).
  • 240. Zhou et al., 2019 — syntax-aware end-to-end SRL system
q Semantics is usually considered a higher linguistic layer than syntax, so most previous studies focus on how the latter helps the former.
q Semantics benefits from syntax, but syntax may also benefit from semantics.
q Joint (multi-task) training of the following 5 tasks:
  q Semantic: dependency SRL; span SRL; predicate identification
  q Syntactic: constituent parsing; dependency parsing
[Architecture figure: "He had dared to defy nature" → Embedder → Encoder (multi-head self-attention + FF) → SRL classifier (FF + bilinear over predicate/role), dependency head scores, constituent span scores.]
Junru Zhou, Zuchao Li, and Hai Zhao. 2020. Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020, pages 4438–4449, Online. Association for Computational Linguistics.
  • 241. Zhou et al., 2019 Table 2 from the paper: Joint learning analysis on CoNLL- 2005, CoNLL-2009, and PTB dev sets q Joint training of dependency and span for SRL helps improve both. Further strengthened by Fei et al. (2021). InteresOng Insights q Further improve for both is observed when combined with syntacOc consOtuent. SEMANTICS q Though marginal, semanOc do improve syntax SYNTAX 241 Can we jointly predict dependency and span? Hao Fei, Shengqiong Wu, Yafeng Ren, Fei Li, and Donghong Ji. 2021. Beker Combine Them Together! IntegraAng SyntacAc ConsAtuency and Dependency RepresentaAons for SemanAc Role Labeling. In Findings of the AssociaBon for ComputaBonal LinguisBcs: ACL-IJCNLP 2021, pages 549–559, Online. AssociaAon for ComputaAonal LinguisAcs. q Not so when combined with syntacOc dependency
  • 242. Jindal et al., 2022. SPADE: a SPAn and DEpendency SRL model. A multi-task learning framework that trains simultaneously on argument heads and argument spans, with enclosing constraints. (Diagram: a BERT-based tagger over "He had to defy nature" with the predicate "dared", predicting both head role labels and BIO span labels.) Observations: a slight drop in argument-head performance and a gain in argument-span performance; these observations are consistent with Zhou et al., 2019. Can we jointly predict dependency and span? (A minimal sketch of such a joint objective follows below.) Ishan Jindal, Alexandre Rademaker, Michał Ulewicz, Ha Linh, Huyen Nguyen, Khoi-Nguyen Tran, Huaiyu Zhu, and Yunyao Li. 2022. Universal Proposition Bank 2.0. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 1700–1711, Marseille, France. European Language Resources Association.
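A hedged sketch of the joint head-plus-span idea (not the actual SPADE implementation): two classifiers share one encoder, one predicting dependency-style argument-head roles and one predicting BIO span tags, with the two losses summed. The `encoder` is assumed to expose a HuggingFace-style `last_hidden_state`.

```python
import torch.nn as nn

class JointHeadSpanSRL(nn.Module):
    """Multi-task SRL sketch: one linear head labels argument-head tokens
    (dependency SRL) and another tags argument spans (BIO), both over a
    shared contextual encoder, trained with a summed cross-entropy loss."""

    def __init__(self, encoder, hidden: int, n_roles: int):
        super().__init__()
        self.encoder = encoder                               # e.g. a BERT-style encoder
        self.head_clf = nn.Linear(hidden, n_roles)           # role of each argument head token
        self.span_clf = nn.Linear(hidden, 2 * n_roles + 1)   # BIO span tags
        self.loss = nn.CrossEntropyLoss(ignore_index=-100)

    def forward(self, inputs, head_labels, span_labels):
        h = self.encoder(**inputs).last_hidden_state         # (B, T, hidden)
        head_logits = self.head_clf(h)
        span_logits = self.span_clf(h)
        return (self.loss(head_logits.flatten(0, 1), head_labels.flatten())
                + self.loss(span_logits.flatten(0, 1), span_labels.flatten()))
```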
  • 243. Zhou et al., 2019. Parsing All: Syntax and Semantics, Dependencies and Spans. CoNLL05 WSJ (F1): Täckström et al. (2015) 79.9; FitzGerald et al. (2015) 79.4; Zhou and Xu (2015) 82.8; He et al. (2017) 83.1; Tan et al. (2018) ELMo 84.8; Strubell et al. (2018) ELMo 86.0; Zhou et al. (2019) ELMo 87.8; Zhou et al. (2019) BERT-L 88.7; Shi et al. (2019) BERT-S 88.1; Shi et al. (2019) BERT-L 88.8. CoNLL05 Brown (F1), same systems: 71.3; 71.2; 69.4; 72.1; 74.1; 76.5; 80.2; 81.2; 80.9; 82.1. Can we jointly predict dependency and span?
  • 244. Zhou et al., 2019. Parsing All: Syntax and Semantics, Dependencies and Spans. CoNLL09 WSJ (F1): FitzGerald et al. (2015) 87.3; Roth and Lapata (2016) 87.7; Marcheggiani et al. (2017) 87.7; Marcheggiani et al. (2017) 88.0; Li et al. (2018) ELMo 89.8; Zhou et al. (2019) ELMo 89.8; Zhou et al. (2019) BERT-L 91.1; Shi et al. (2019) BERT-S 92.0; Shi et al. (2019) BERT-L 92.4. CoNLL09 Brown (F1), same systems: 75.2; 76.1; 77.7; 77.2; 79.8; 84.4; 85.3; 85.1; 85.7. Can we jointly predict dependency and span?
  • 245. Outline: Early SRL approaches (typical neural SRL model components; performance analysis). Syntax-aware neural SRL models (what, when and where?; performance analysis; how to incorporate syntax?). Syntax-agnostic neural SRL models (performance analysis; do we really need syntax for SRL?; are high-quality contextual embeddings enough for the SRL task?). Practical SRL systems (should we rely on this pipelined approach?). End-to-end SRL systems (can we jointly predict dependency and span?). More recent approaches (learn low-frequency exceptions; incorporate semantic role label definitions; SRL as an MRC task). Practical SRL system evaluations (are we evaluating SRL systems correctly?). Conclusion.
  • 246. Low-frequency Exceptions. Argument labeling task: arguments that are syntactically realized as passive subjects are typically labeled Arg1; however, there exist numerous low-frequency exceptions to this rule; passive subjects of certain frames (such as TELL.01) are most commonly labeled Arg2. Observations based on the CoNLL09 training data [Akbik and Li, 2015]: 57% of all subjects are labeled A0; 33% of all subjects are labeled A1; 74% of active subjects are labeled A0; 86% of passive subjects are labeled A1; 100% of passive subjects of SELL.01 are labeled A1; 88% of passive subjects of TELL.01 are labeled A2.
  • 247. Low-frequency Exceptions. Instance-based learning: extrapolate predictions from the most similar instances in the training data [Akbik and Li, 2016; Jindal et al., 2020]; generally staged approaches, where a base model is trained first to obtain the word/span representations [Guan et al., 2019; Jindal et al., 2020]. (A nearest-neighbour sketch follows below.) [Akbik and Li, 2016] Alan Akbik and Yunyao Li. 2016. K-SRL: Instance-based Learning for Semantic Role Labeling. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 599–608, Osaka, Japan. The COLING 2016 Organizing Committee. [Guan et al., 2019] Chaoyu Guan, Yuhao Cheng, and Hai Zhao. 2019. Semantic Role Labeling with Associated Memory Network. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 3361–3371, Minneapolis, Minnesota. Association for Computational Linguistics. [Jindal et al., 2020] Ishan Jindal, Ranit Aharonov, Siddhartha Brahma, Huaiyu Zhu, and Yunyao Li. 2020. Improved Semantic Role Labeling using Parameterized Neighborhood Memory Adaptation. arXiv preprint arXiv:2011.14459.
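A small nearest-neighbour sketch of the instance-based idea above; the argument representations are assumed to come from a previously trained base SRL model, and the function name is hypothetical.

```python
import numpy as np

def knn_role_label(query_vec, memory_vecs, memory_labels, k=5):
    """Instance-based role labeling: retrieve the k training argument
    representations most similar to the query (cosine similarity) and
    predict by majority vote, so rare patterns such as Arg2 passive
    subjects of TELL.01 can be recovered from their nearest neighbours."""
    sims = memory_vecs @ query_vec / (
        np.linalg.norm(memory_vecs, axis=1) * np.linalg.norm(query_vec) + 1e-8)
    top_k = np.argsort(-sims)[:k]                     # indices of the k closest instances
    labels, counts = np.unique(memory_labels[top_k], return_counts=True)
    return labels[np.argmax(counts)]                  # majority label among neighbours
```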
  • 248. Understand BERT for SRL: understanding BERT-based models better for better SRL performance. BERT "rediscovers" the classical NLP pipeline [Tenney et al., 2019]: lower layers tend to encode mostly lexical-level information, while upper layers seem to favor sentence-level information. Ilia Kuznetsov and Iryna Gurevych. 2020. A matter of framing: The impact of linguistic formalism on probing results. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP), pages 171–182, Online. Association for Computational Linguistics.
  • 249. Understand BERT for SRL: probing configurations for the token representation fed to the SRL classifier. Static: last-layer activations used as static embeddings. Top-4: concatenate the activations of the top 4 layers. W-avg: parametric weighted sum of all layer activations. (See the sketch below.) Simone Conia and Roberto Navigli. 2022. Probing for Predicate Argument Structures in Pretrained Language Models. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 4622–4632, Dublin, Ireland. Association for Computational Linguistics.
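The three configurations can be written as a small layer-aggregation module; this is a generic sketch of the probing setup rather than the authors' exact code.

```python
import torch
import torch.nn as nn

class LayerAggregator(nn.Module):
    """'static' uses the last layer only, 'top4' concatenates the top four
    layers, 'w-avg' learns a softmax-weighted sum over all layers
    (ELMo-style scalar mixing)."""

    def __init__(self, n_layers: int, mode: str = "w-avg"):
        super().__init__()
        self.mode = mode
        self.weights = nn.Parameter(torch.zeros(n_layers))   # used by w-avg only

    def forward(self, hidden_states):
        # hidden_states: tuple of per-layer tensors, each (batch, seq, dim)
        stacked = torch.stack(hidden_states, dim=0)           # (L, B, T, D)
        if self.mode == "static":
            return stacked[-1]                                 # last layer
        if self.mode == "top4":
            return torch.cat(list(stacked[-4:]), dim=-1)       # (B, T, 4D)
        w = torch.softmax(self.weights, dim=0)                 # (L,)
        return (w.view(-1, 1, 1, 1) * stacked).sum(dim=0)      # weighted average
```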
  • 250. Understand BERT for SRL: understanding BERT-based models better for better SRL performance. Interesting insights: predicate senses and argument structures are encoded at different layers in LMs; verbal and nominal predicate-argument structures are represented differently across the layers of a LM; an SRL system benefits from treating them separately.
  • 251. Incorporating Role Definitions. Label-aware NLP: the model is given the definitions of the labels and can effectively leverage them in many tasks, e.g. sentiment/entailment (Schick and Schütze, 2021), event extraction (Du and Cardie, 2020; Hongming et al., 2021), and word sense disambiguation (Kumar et al., 2019); it is strong even in few-shot settings. Many more, but NOT for SRL. Why? Semantic roles are specific to predicates; there are many predicates, thus many roles, and the label space is very sparse: 8,500 predicate senses in the CoNLL09 data, i.e. roughly 8,500 × 3 argument labels, about 25K.
  • 252. Incorporating Role Definitions: label-aware NLP for SRL [Zhang et al., 2022]. Make n+1 copies of the sentence, where n is the number of core arguments defined for the frame and the extra copy covers contextual arguments; append the label definition at the end of each copy; this converts the K-class classification problem into a binary one, i.e. deciding whether a token is the "worker" or not in this example. (See the sketch below.) Li Zhang, Ishan Jindal, and Yunyao Li. 2022. Label Definitions Improve Semantic Role Labeling. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, pages 5613–5620, Seattle, United States. Association for Computational Linguistics.
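An illustrative sketch of the instance construction; the helper name, dictionary format, and separator token are hypothetical, not the exact input format of Zhang et al. (2022).

```python
def build_definition_instances(tokens, predicate_idx, role_definitions):
    """Turn one K-way labeling problem into several binary ones: for each
    role in `role_definitions` (the frame's core roles plus one contextual
    copy), append the role definition to the sentence; the model then only
    decides, per token, whether it fills *that* role (e.g. 'worker') or not."""
    instances = []
    for role, definition in role_definitions.items():   # e.g. {"ARG0": "worker", ...}
        instances.append({
            "tokens": tokens + ["[SEP]"] + definition.split(),
            "predicate_idx": predicate_idx,
            "role": role,   # binary target per token: fills this role / does not
        })
    return instances
```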
  • 253. Incorporating Role Definitions: interesting insights. Low-frequency predicates: SRL suffers from the long-tail phenomenon; LD outperforms the base model by up to 4.4 argument F1 for unseen predicates, notably helping with low-frequency predicates. Few-shot learning: LD outperforms the base model by up to 3.2 F1 in- and out-of-domain; the performance gap diminishes as the training size approaches 100,000. Distant domain adaptation: evaluating models trained on CoNLL09 (news articles) on the Biology PropBank, the LD model achieves 55.5 argument F1, outperforming the base model at 54.6.
  • 254. SRL as MRC Task: SRL cast as an extractive machine reading comprehension task [Wang et al., 2022]. (An illustrative query construction follows below.) Nan Wang, Jiwei Li, Yuxian Meng, Xiaofei Sun, Han Qiu, Ziyao Wang, Guoyin Wang, and Jun He. 2022. An MRC Framework for Semantic Role Labeling. In Proceedings of the 29th International Conference on Computational Linguistics, pages 2188–2198, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.
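Illustratively, the reformulation builds one query per (predicate, role) pair and hands it to an extractive QA model; the query wording below is a placeholder, not the paper's actual templates.

```python
def make_srl_queries(sentence, predicate, roles):
    """SRL cast as extractive reading comprehension: one natural-language
    query per (predicate, role) pair; an extractive QA model then returns
    the answer span, which becomes the argument span."""
    return [(f"What is the {role} of the predicate '{predicate}'?", sentence)
            for role in roles]

# Example (hypothetical wording):
# make_srl_queries("He had dared to defy nature", "defy", ["ARG0", "ARG1"])
```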
  • 255. Outline: Early SRL approaches (typical neural SRL model components; performance analysis). Syntax-aware neural SRL models (what, when and where?; performance analysis; how to incorporate syntax?). Syntax-agnostic neural SRL models (performance analysis; do we really need syntax for SRL?; are high-quality contextual embeddings enough for the SRL task?). Practical SRL systems (should we rely on this pipelined approach?). End-to-end SRL systems (can we jointly predict dependency and span?). More recent approaches (handling low-frequency exceptions; incorporate semantic role label definitions; SRL as MRC task). Practical SRL system evaluations (are we evaluating SRL systems correctly?). Conclusion.
  • 256. SRL Evaluation: Issues with Evaluation Metrics. Two official evaluation scripts: the CoNLL05 shared-task script (eval05.pl, spans only, assumes gold predicate locations) and the CoNLL09 shared-task script (eval09.pl, heads only, assumes gold predicate locations). The sub-tasks (predicate identification, predicate sense disambiguation, argument identification, argument classification) are all evaluated independently, which leads to ERROR PROPAGATION.
  • 257. SRL Evaluation: Predicate Error Types. Example: predicate sense evaluation. eval05.pl does not evaluate predicate sense at all; eval09.pl scores it, but each sub-task is scored independently of the others, so the slide's recall and precision counts (1/1 vs. 0/1) differ between the two scripts on the same predictions.
  • 258. SRL Evaluation: Error Examples. Real errors from a SoTA SRL model; all of these predicate senses are marked correct by the CoNLL09 evaluation script.
  • 259. SRL Evaluation: Argument Error Types. Example: argument evaluation. The slide contrasts eval05.pl and eval09.pl recall and precision counts (3/3 vs. 0/3) on the same predictions, showing how differently the two scripts penalize the same argument errors.
  • 260. An Improved Evaluation Scheme. Summary of issues with the existing SRL evaluation metrics: proper evaluation of the predicate sense disambiguation task; argument label evaluation in conjunction with the predicate sense; proper evaluation of discontinuous arguments and reference arguments; and unified evaluation of argument heads and spans. (A toy sketch of joint sense-and-argument scoring follows below.) PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation [Jindal et al., 2022]. Jindal, Ishan, Alexandre Rademaker, Khoi-Nguyen Tran, Huaiyu Zhu, Hiroshi Kanayama, Marina Danilevsky, and Yunyao Li. 2022. PriMeSRL-Eval: A Practical Quality Metric for Semantic Role Labeling Systems Evaluation. arXiv preprint arXiv:2210.06408.
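As a toy illustration of the stricter scoring idea (not the actual PriMeSRL-Eval implementation), an argument can be counted as correct only when the governing predicate sense is also correct, so sense errors propagate into the argument score:

```python
def strict_f1(gold, pred):
    """`gold`/`pred` map predicate position -> (sense, {arg_span: label}).
    An argument counts as a true positive only if its label matches AND the
    predicted sense of its predicate is correct.  This is a simplified sketch;
    discontinuous/reference arguments and head-vs-span unification are omitted."""
    tp = fp = fn = 0
    for pos, (g_sense, g_args) in gold.items():
        p_sense, p_args = pred.get(pos, (None, {}))
        sense_ok = (p_sense == g_sense)
        for span, label in g_args.items():
            if sense_ok and p_args.get(span) == label:
                tp += 1
            else:
                fn += 1
        fp += sum(1 for span, label in p_args.items()
                  if not (sense_ok and g_args.get(span) == label))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    return 2 * p * r / (p + r) if p + r else 0.0
```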
  • 261. An Improved Evaluation Scheme. With PriMeSRL-Eval we made the following observations: the current evaluation scripts exaggerate SRL model quality; a clear drop of about 7 F1 points is observed on the OOD set; and the relative ranking of the SoTA SRL models changes.
  • 262. Conclusion. Observations: Syntax matters (yes, at least for argument spans; not for dependency SRL; eventually you need syntax to compute spans; SRL can also help syntax). Contextualized embeddings carry the major chunk of the performance gain in SRL, and fine-tuning the LM for SRL raised the bar further. End-to-end systems are more practical but computationally expensive; the predicate and argument tasks have been shown to improve each other. Opportunities: SRL in few-shot settings (probe SRL information from large LMs; given the sparsity of the SRL label space, finding the right prompt is quite challenging). Multilingual SRL (multilingual SRL resources; Universal PropBanks for SRL; a long way to go). Datasets (datasets without predicate sense annotations; ethical issues). SRL model re-evaluations.
  • 263. References
1. Merchant, A., Rahimtoroghi, E., Pavlick, E., & Tenney, I. (2020, November). What Happens To BERT Embeddings During Fine-tuning?. In Proceedings of the Third BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (pp. 33-44).
2. Tan, Z., Wang, M., Xie, J., Chen, Y., & Shi, X. (2018, April). Deep semantic role labeling with self-attention. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 32, No. 1).
3. Marcheggiani, D., Frolov, A., & Titov, I. (2017, August). A Simple and Accurate Syntax-Agnostic Neural Model for Dependency-based Semantic Role Labeling. In Proceedings of the 21st Conference on Computational Natural Language Learning (CoNLL 2017) (pp. 411-420).
4. A Unified Syntax-aware Framework for Semantic Role Labeling.
5. Tian, Y., Qin, H., Xia, F., & Song, Y. (2022, June). Syntax-driven Approach for Semantic Role Labeling. In Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 7129-7139).
6. Zhang, Z., Strubell, E., & Hovy, E. (2021, August). Comparing span extraction methods for semantic role labeling. In Proceedings of the 5th Workshop on Structured Prediction for NLP (SPNLP 2021) (pp. 67-77).
7. Fei, H., Wu, S., Ren, Y., Li, F., & Ji, D. (2021, August). Better combine them together! Integrating syntactic constituency and dependency representations for semantic role labeling. In Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021 (pp. 549-559).
8. Wang, N., Li, J., Meng, Y., Sun, X., & He, J. (2021). An MRC framework for semantic role labeling. arXiv preprint arXiv:2109.06660.
9. Blloshmi, R., Conia, S., Tripodi, R., & Navigli, R. (2021). Generating Senses and RoLes: An End-to-End Model for Dependency- and Span-based Semantic Role Labeling. In IJCAI (pp. 3786-3793).
10. Zhang, L., Jindal, I., & Li, Y. (2022, July). Label Definitions Improve Semantic Role Labeling. In Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 5613-5620).
11. Cai, J., He, S., Li, Z., & Zhao, H. (2018, August). A full end-to-end semantic role labeler, syntactic-agnostic over syntactic-aware?. In Proceedings of the 27th International Conference on Computational Linguistics (pp. 2753-2765).
12. He, S., Li, Z., & Zhao, H. (2019, November). Syntax-aware Multilingual Semantic Role Labeling. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 5350-5359).
13. Conia, S., Bacciu, A., & Navigli, R. (2021, June). Unifying cross-lingual Semantic Role Labeling with heterogeneous linguistic resources. In Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (pp. 338-351).
  • 264. References
14. Conia, S., & Navigli, R. (2020, December). Bridging the gap in multilingual semantic role labeling: a language-agnostic approach. In Proceedings of the 28th International Conference on Computational Linguistics (pp. 1396-1410).
15. Kasai, J., Friedman, D., Frank, R., Radev, D., & Rambow, O. (2019, June). Syntax-aware Neural Semantic Role Labeling with Supertags. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 701-709).
16. He, L., Lee, K., Levy, O., & Zettlemoyer, L. (2018, July). Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 364-369).
17. Shi, T., Malioutov, I., & İrsoy, O. (2020, November). Semantic Role Labeling as Syntactic Dependency Parsing. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 7551-7571).
18. Zhou, J., Li, Z., & Zhao, H. (2020, November). Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 4438-4449).
19. Zhou, J., Li, Z., & Zhao, H. (2020, November). Parsing All: Syntax and Semantics, Dependencies and Spans. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 4438-4449).
20. Wang, Y., Johnson, M., Wan, S., Sun, Y., & Wang, W. (2019, July). How to best use syntax in semantic role labelling. In Annual Meeting of the Association for Computational Linguistics (57th: 2019) (pp. 5338-5343). Association for Computational Linguistics.
21. He, S., Li, Z., Zhao, H., & Bai, H. (2018, July). Syntax for semantic role labeling, to be, or not to be. In Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 2061-2071).
22. Marcheggiani, D., & Titov, I. (2020, November). Graph Convolutions over Constituent Trees for Syntax-Aware Semantic Role Labeling. In Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing (EMNLP) (pp. 3915-3928).
23. Marcheggiani, D., & Titov, I. (2017, September). Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1506-1515).
24. Marcheggiani, D., & Titov, I. (2017, September). Encoding Sentences with Graph Convolutional Networks for Semantic Role Labeling. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing (pp. 1506-1515).
25. Li, Z., Zhao, H., Wang, R., & Parnow, K. (2020, November). High-order Semantic Role Labeling. In Findings of the Association for Computational Linguistics: EMNLP 2020 (pp. 1134-1151).
  • 265. References
26. Lyu, C., Cohen, S. B., & Titov, I. (2019, November). Semantic Role Labeling with Iterative Structure Refinement. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP) (pp. 1071-1082).
27. Li, Z., He, S., Zhao, H., Zhang, Y., Zhang, Z., Zhou, X., & Zhou, X. (2019, July). Dependency or span, end-to-end uniform semantic role labeling. In Proceedings of the AAAI Conference on Artificial Intelligence (Vol. 33, No. 01, pp. 6730-6737).
28. Ouchi, H., Shindo, H., & Matsumoto, Y. (2018). A Span Selection Model for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 1630-1642).
29. Strubell, E., Verga, P., Andor, D., Weiss, D., & McCallum, A. (2018). Linguistically-Informed Self-Attention for Semantic Role Labeling. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (pp. 5027-5038).
30. He, L., Lee, K., Lewis, M., & Zettlemoyer, L. (2017, July). Deep semantic role labeling: What works and what's next. In Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers) (pp. 473-483).
31. FitzGerald, N., Täckström, O., Ganchev, K., & Das, D. (2015, September). Semantic role labeling with neural network factors. In Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing (pp. 960-970).
32. Guan, C., Cheng, Y., & Zhao, H. (2019, June). Semantic Role Labeling with Associated Memory Network. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers) (pp. 3361-3371).
33. Jindal, I., Aharonov, R., Brahma, S., Zhu, H., & Li, Y. (2020). Improved Semantic Role Labeling using Parameterized Neighborhood Memory Adaptation. arXiv preprint arXiv:2011.14459.
  • 266. Meaning Representations for Natural Languages Tutorial Part 3b: Modeling Meaning Representation: AMR. Jeffrey Flanigan, Tim O'Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 267. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 268. Seq2seq AMR Parsing. ❏ Linearize the AMR graphs ❏ AMR parsing as sequence-to-sequence modeling ❏ Can use any seq2seq method and pre-training method (BART, etc.). Konstas et al. Neural AMR: Sequence-to-Sequence Models for Parsing and Generation. ACL 2017. inter alia.
  • 269. AMR Linearization. ❏ The linearization order of the AMR graph usually matters. Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
  • 270. AMR Linearization. ❏ The linearization order of the AMR graph usually matters. van Noord & Bos. Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations. Computational Linguistics in the Netherlands Journal. 2017.
  • 271. Removing Variables. ❏ Remove variables and add them back in with post-processing heuristics. van Noord & Bos. Neural Semantic Parsing by Character-based Translation: Experiments with Abstract Meaning Representations. Computational Linguistics in the Netherlands Journal. 2017.
  • 272. Removing Variables. ❏ Rather than removing variables (lossy), use special tokens. (A minimal linearization sketch follows below.) Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
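A minimal sketch of a variable-preserving linearization in the spirit of SPRING; the tokenization and special-token scheme here are simplified assumptions, not the exact preprocessing of the cited systems.

```python
import re

def linearize_amr(penman_str):
    """Tokenize a PENMAN string and map each variable to a special token
    (<R0>, <R1>, ...), so re-entrant variables stay recoverable while the
    graph becomes a flat token sequence for a seq2seq model."""
    tokens = re.findall(r"\(|\)|/|:[^\s()]+|[^\s()/]+", penman_str)
    var_map, out = {}, []
    for i, tok in enumerate(tokens):
        # a token is a variable if it is defined next ("var / concept") or was seen before
        is_var = (i + 1 < len(tokens) and tokens[i + 1] == "/") or tok in var_map
        if is_var:
            var_map.setdefault(tok, f"<R{len(var_map)}>")
            out.append(var_map[tok])
        else:
            out.append(tok)
    return " ".join(out)

# linearize_amr("(d / dare-01 :ARG0 (h / he) :ARG1 (d2 / defy-01 :ARG0 h))")
# -> "( <R0> / dare-01 :ARG0 ( <R1> / he ) :ARG1 ( <R2> / defy-01 :ARG0 <R1> ) )"
```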
  • 273. Pre-Processing for Transition- and Graph-Based Parsers: Recategorization. ❏ Collapsing verbalized concepts ❏ Anonymizing named entities (recovered with alignments) ❏ Removing sense nodes (predict the most frequent sense) ❏ Removing wiki links (predict with a wikifier). Figure from Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021. Zhang et al. 2019. AMR Parsing as Sequence-to-Graph Transduction. ACL 2019
  • 274. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 275. Transition-Based AMR Parsing. ❏ Construct the graph using a sequence of actions that build the graph ❏ Use a classifier to predict the next action ❏ Inspired by transition-based dependency parsing. Wang et al. A Transition-based Algorithm for AMR Parsing. NAACL 2015, inter alia.
  • 276. Transition-Based AMR Parsing. Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021
  • 277. Transition-Based AMR Parsing. Simplified transition actions. Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021. Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
  • 278. Transition-Based AMR Parsing. ❏ Simplified system: the transition system has 6 actions. (A toy executor for such actions is sketched below.) Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
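A toy executor for a transition-based AMR parser, just to make the action-sequence view concrete; the action inventory (SHIFT, NODE, LA, RA) is illustrative and smaller than the actual systems' action sets.

```python
def run_transitions(tokens, actions):
    """Apply a sequence of actions to build an AMR-like graph.
    SHIFT moves the cursor to the next source token, NODE(concept) creates a
    graph node aligned to the current token, LA/RA(label, target) add a
    labeled edge between the most recent node and a previously created one."""
    nodes, edges, cursor = [], [], 0
    for act in actions:
        if act == "SHIFT":
            cursor += 1
        elif act[0] == "NODE":
            nodes.append((len(nodes), act[1], cursor))   # (node_id, concept, aligned token index)
        elif act[0] in ("LA", "RA"):
            label, target = act[1], act[2]
            head, dep = (len(nodes) - 1, target) if act[0] == "RA" else (target, len(nodes) - 1)
            edges.append((head, label, dep))
        else:
            raise ValueError(f"unknown action {act}")
    return nodes, edges

# Hypothetical oracle sequence for "He had dared to defy nature":
# actions = [("NODE", "he"), "SHIFT", "SHIFT", ("NODE", "dare-01"), ("RA", ":ARG0", 0), "SHIFT", ...]
```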
  • 279. Transition-Based AMR Parsing. Simplified transition actions. Zhou et al. AMR Parsing with Action-Pointer Transformer. NAACL 2021. Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
  • 280. Transition-Based AMR Parsing. Zhou et al. Structure-aware Fine-tuning of Sequence-to-sequence Transformers for Transition-based AMR Parsing. EMNLP 2021.
  • 281. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 282. Graph-Based AMR Parsing. ❏ Graph-based methods use the graph structure when predicting ❏ Inspired by graph-based methods for dependency parsing ❏ Can be done incrementally or using a structured prediction method. Flanigan et al. A Discriminative Graph-Based Parser for the Abstract Meaning Representation. ACL 2014. inter alia.
  • 283. Graph-Based AMR Parsing. Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
  • 284. Graph-Based AMR Parsing. Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
  • 285. Graph-Based AMR Parsing. Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
  • 286. Graph-Based AMR Parsing. Cai & Lam. AMR Parsing via Graph-Sequence Iterative Inference. ACL 2020.
  • 287. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 288. Evaluation. ❏ Fine-grained evaluation can be used to examine strengths and weaknesses. Damonte et al. An Incremental Parser for Abstract Meaning Representation. EACL 2017
  • 289. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 290. AMR Generation: Overview. Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
  • 291. AMR Generation: Seq2seq. ❏ Linearize the AMR graphs ❏ AMR generation as sequence-to-sequence modeling ❏ Can use any seq2seq method and pre-training method (BART, etc.)
  • 292. AMR Generation: Graph-Based. Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
  • 293. AMR Generation: Graph-Based. Hao et al. Heterogeneous Graph Transformer for Graph-to-Sequence
  • 294. AMR Generation: Graph-Based. Damonte & Cohen. Structural Neural Encoders for AMR-to-text Generation. NAACL 2019
  • 295. AMR Generation: Comparison. Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
  • 296. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 297. Silver Data (Semi-supervised learning). ❏ Gold data is human-labeled data ❏ Silver data is produced by running an existing parser on unlabeled data ❏ You can add silver data to the training data to improve performance ❏ Usually people use Gigaword for the silver data (more on this later). (A minimal self-training sketch follows below.)
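The silver-data recipe is essentially self-training; a minimal sketch, assuming a `parser` object with a `parse` method (hypothetical API):

```python
def build_silver_corpus(parser, unlabeled_sentences, gold_corpus):
    """Parse unlabeled text (e.g. Gigaword) with an existing AMR parser and
    append the resulting 'silver' graphs to the gold training data.  In
    practice the silver portion is often used for pre-training or
    down-weighted before fine-tuning on gold."""
    silver = [(sent, parser.parse(sent)) for sent in unlabeled_sentences]
    return gold_corpus + silver
```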
  • 298. Silver Data for AMR Parsing. ❏ Silver data sometimes helps parsing, usually on out-of-domain data. (In-domain vs. out-of-domain results.) Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
  • 299. Silver Data for AMR Generation. ❏ Silver data always helps generation, but be careful! Results are misleading! ❏ Silver data hurts on out-of-domain data. (In-domain results on the official test sets vs. out-of-domain results; baseline vs. +silver data.) Bevilacqua et al. One SPRING to Rule Them Both: Symmetric AMR Semantic Parsing and Generation without a Complex Pipeline. AAAI 2021
  • 300. Silver Data for AMR Generation. ❏ Silver data always helps generation, but be careful! Results are misleading! Du & Flanigan. Avoiding Overlap in Data Augmentation for AMR-to-Text Generation. ACL 2020
  • 301. Silver Data for AMR Generation. ❏ Recommend excluding parts of Gigaword that may overlap with test data. Du & Flanigan. Avoiding Overlap in Data Augmentation for AMR-to-Text Generation. ACL 2020. https://github.com/jlab-nlp/amr-clean
  • 302. Outline: ❏ AMR Parsing ❏ Sequence-to-sequence methods ❏ Pre/post processing ❏ Transition-based methods ❏ Graph-based methods ❏ Evaluation ❏ AMR Generation: ❏ Sequence-to-sequence methods ❏ Graph-based methods ❏ Silver data ❏ Pre-training
  • 303. AMR Parsing: Pretraining. ❏ Pre-training the encoder, such as BERT, helps a lot ❏ Pre-training the decoder too, such as BART, helps even more ❏ Structural pre-training helps as well. Bai et al. Graph Pre-training for AMR Parsing and Generation. ACL 2022
  • 304. Structural Pretraining. ❏ Structural pre-training helps as well. Bai et al. Graph Pre-training for AMR Parsing and Generation. ACL 2022
  • 305. Structural Pretraining. ❏ Structural pre-training helps as well. Bai et al. Graph Pre-training for AMR Parsing and Generation. ACL 2022
  • 306. AMR Generation: Pretraining. ❏ Pre-training helps a lot ❏ Pre-training both the encoder and decoder helps the most (BART). Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
  • 307. AMR Generation: Pretraining. ❏ Pre-training helps a lot ❏ Pre-training both the encoder and decoder helps the most (BART). Hao et al. A Survey: Neural Networks for AMR-to-Text. 2022
  • 308. ❏ There’s a lot more work we didn’t have Lme to cover ❏ See the AMR bibliography Lots More Work 310 hfps://nert-nlp.github.io/AMR-Bibliography/
  • 309. Meaning Representa=ons for Natural Languages Tutorial Part 4 Applying Meaning Representa0ons Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 310. Information Extraction. OneIE [Lin et al., ACL 2020] extracts an information graph from a given sentence in four steps: encoding, identification, classification, and decoding.
  • 311. Moving from Seq-to-Graph to Graph-to-Graph. Key idea: exploit the similarity between AMR and IE for joint information extraction. ● AMR converts the input sentence into a directed, acyclic graph structure with fine-grained node and edge type labels ● AMR parsing shares inherent similarities with the information network (IE output) ● Similar node and edge semantics ● Similar graph topology ● Semantic graphs can better capture non-local context in a sentence. Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE. NAACL 2021. Slide credit: Heng
  • 312. AMR-IE. Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE. NAACL 2021. Slide credit: Heng
  • 313. AMR-Guided Graph Encoding: Using an Edge-Conditioned GAT. ● Map each candidate entity and event to AMR nodes. ● Update entity and event representations using an edge-conditioned GAT to incorporate information from AMR neighbors. Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE. NAACL 2021. Slide credit: Heng
  • 314. AMR-Guided Graph Decoding: Ordered decoding guided by AMR. ● Beam-search-based decoding as in OneIE (Lin et al. 2020). ● The decoding order of candidate nodes is determined by the AMR hierarchy, in a top-down manner. ● E.g., the correct ordered decoding in the following graph is: Zixuan Zhang, Heng Ji. AMR-IE: An AMR-guided encoding and decoding framework for IE. NAACL 2021. Slide credit: Heng
  • 315. Examples of how AMR graphs help. Slide credit: Heng
  • 316. Leverage Meaning Representation for High-quality Rule-based IE. Llio Humphreys et al. Populating Legal Ontologies using Semantic Role Labeling extraction rules
  • 317. Machine Translation. ● Repeating words with the same meaning ● MT methods using Transformers can make semantic errors ● Hallucinate information not contained in the source
  • 318. Machine Translation. Goal: inject semantic information into machine translation. These errors are mostly due to failing to accurately capture the semantics of the source in some cases.
  • 319. Machine Translation. Song et al. Semantic Neural Machine Translation using AMR. TACL 2019.
  • 320. Machine Translation. Nguyen et al. Improving Neural Machine Translation with AMR Semantic Graphs. Hindawi Mathematical Problems in Engineering 2021.
  • 321. Machine Translation. Nguyen et al. Improving Neural Machine Translation with AMR Semantic Graphs. Hindawi Mathematical Problems in Engineering 2021.
  • 322. Machine Translation. Li & Flanigan. Improving Neural Machine Translation with the Abstract Meaning Representation by Combining Graph and Sequence Transformers. DLG4NLP 2022.
  • 323. Machine Translation. Li & Flanigan. Improving Neural Machine Translation with the Abstract Meaning Representation by Combining Graph and Sequence Transformers. DLG4NLP 2022.
  • 324. Machine Translation. Li & Flanigan. Improving Neural Machine Translation with the Abstract Meaning Representation by Combining Graph and Sequence Transformers. DLG4NLP 2022.
  • 325. Summarization. Liao et al. Abstract Meaning Representation for Multi-Document Summarization. ICCL 2018
  • 326. Summarization. Liao et al. Abstract Meaning Representation for Multi-Document Summarization. ICCL 2018
  • 327. Natural Language Inference. Does premise P justify an inference to hypothesis H? P: The judge by the actor stopped the banker. H: The banker stopped the actor.
  • 328. Natural Language Inference. Does premise P justify an inference to hypothesis H? P: The judge by the actor stopped the banker. H: The banker stopped the actor. Shallow heuristics due to dataset biases (e.g. lexical overlap) lead to low generalization on out-of-distribution evaluation sets. The HANS challenge dataset [McCoy et al., 2019] showed that NLI models trained on the MNLI or SNLI datasets get fooled easily by heuristics when the input sentence pairs have high lexical similarity.
  • 329. How Can Meaning Representation Help? Semantic information (SRL) can improve the semantic knowledge of NLI models and make them less prone to dataset biases. P: The judge by the actor stopped the banker. H: The banker stopped the actor. (SRL annotations mark the VERB, ARG0, and ARG1 of each sentence.)
  • 330. SemBERT: Semantics-Aware BERT. Incorporate SRL information with BERT representations. (A minimal fusion sketch follows below.) Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou: Semantics-Aware BERT for Language Understanding. AAAI 2020
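A minimal sketch of the SemBERT-style fusion: embed the per-predicate SRL tag sequences and concatenate them with the BERT token representations. SemBERT itself uses a BiGRU and subword-to-word alignment, which are simplified away here; the HuggingFace-style `last_hidden_state` access is an assumption.

```python
import torch
import torch.nn as nn

class SemanticAwareEncoder(nn.Module):
    """Concatenate BERT token representations with embeddings of the
    predicate-wise SRL tags, producing a semantics-aware representation
    that a downstream task classifier can consume."""

    def __init__(self, bert, n_labels: int, role_dim: int = 10):
        super().__init__()
        self.bert = bert
        self.role_emb = nn.Embedding(n_labels, role_dim)

    def forward(self, inputs, role_tags):
        # role_tags: (batch, n_predicates, seq_len) SRL tag ids, one row per predicate
        h = self.bert(**inputs).last_hidden_state      # (B, T, D)
        r = self.role_emb(role_tags).mean(dim=1)       # (B, T, role_dim), fused over predicates
        return torch.cat([h, r], dim=-1)               # semantics-aware token representation
```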
  • 331. SemBERT: Semantics-Aware BERT. Results on the GLUE benchmark; works particularly well for smaller datasets. Zhuosheng Zhang, Yuwei Wu, Hai Zhao, Zuchao Li, Shuailiang Zhang, Xi Zhou, Xiang Zhou: Semantics-Aware BERT for Language Understanding. AAAI 2020
  • 332. Joint Training with SRL Improves NLI Generalization. Main idea: improve sentence understanding (hence out-of-distribution generalization) with joint learning of explicit semantics. Cemil Cengiz, Deniz Yuret. Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference. Rep4NLP 2020
  • 333. Joint Training with SRL Improves NLI Generalization. Main idea: improve sentence understanding (hence out-of-distribution generalization) with joint learning of explicit semantics. Cemil Cengiz, Deniz Yuret. Joint Training with Semantic Role Labeling for Better Generalization in Natural Language Inference. Rep4NLP 2020
  • 334. Is Semantic-Aware BERT More Linguistically Aware? Infuse semantic knowledge via predicate-wise concatenation with BERT. Ling Liu, Ishan Jindal, Yunyao Li. Is Semantic-aware BERT more Linguistically Aware?
  • 335. Is Semantic-Aware BERT More Linguistically Aware? Ling Liu, Ishan Jindal, Yunyao Li. Is Semantic-aware BERT more Linguistically Aware?
  • 336. Performance on HANS non-entailment examples by models fine-tuned on SNLI. Examples in black and normal font are where BERT made wrong predictions and LingBERT made correct predictions. Examples in blue and italics are where none of the three models made the correct prediction. The last three columns are the accuracy in % on the non-entailment examples by BERT, SemBERT, and LingBERT respectively. Better differentiates lexical similarity from world knowledge; fails to help with subsequence/constituent heuristics.
  • 337. NSQA: AMR for Neural-Symbolic Question Answering over Knowledge Graphs. Pavan Kapanipathi et al. Leveraging Abstract Meaning Representation for Knowledge Base Question Answering
  • 338. AMR Graph → Query Graph. "Acer nigrum is used in making what?" (AMR graph vs. query graph). "Count the awards received by the ones who fought the battle of france?" "What cities are located on the sides of the Mediterranean Sea?" Pavan Kapanipathi et al. Leveraging Abstract Meaning Representation for Knowledge Base Question Answering
  • 339. AMR-Based Question Decomposition. Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
  • 340. AMR-Based Question Decomposition. Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
  • 341. AMR-Based Question Decomposition. Better accuracy of the final answer and better quality of the sub-questions. Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
  • 342. AMR-Based Question Decomposition. Outperforms existing question-decomposition-based multi-hop QA approaches. Zhenyun Deng et al. Interpretable AMR-Based Question Decomposition for Multi-hop Question Answering
  • 343. Cross-Document Multi-hop Reading Comprehension. Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
  • 344. Heterogeneous SRL Graph. Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
  • 345. HotpotQA Result. The SRL graph improves the completeness of the graph network over an NER graph. Zheng and Kordjamshidi. SRLGRN: Semantic Role Labeling Graph Reasoning Network.
  • 346. Dialog Modeling via AMR Transformation & Augmentation. Mitchell Abrams, Claire Bonial, L. Donatelli. Graph-to-graph meaning representation transformations for human-robot dialogue. SCiL 2020. Claire Bonial et al. Augmenting Abstract Meaning Representation for Human-Robot Dialogue. ACL-DMR.
  • 347. Dialog Modeling via AMR Transformation & Augmentation. Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
  • 348. Dialog Modeling via AMR Transformation & Augmentation. (a) Using AMR to enrich the text representation. (b, c) Using AMR independently. Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
  • 349. Dialog Modeling via AMR Transformation & Augmentation. The semantic knowledge in formal AMR is helpful for dialogue modeling; the manually added relations are useful in dialog relation extraction and dialog generation. Xuefeng Bai, Yulong Chen, Linfeng Song, Yue Zhang. Semantic Representation for Dialogue Modeling. ACL 2021
  • 350. Case Study: Watson Discovery Content Intelligence. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 351. Case Study: Watson Discovery Content Intelligence. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 352. Case Study: Watson Discovery Content Intelligence. Element: Expanded SRL as semantic NLP primitives, provided by SystemT [ACL '10, NAACL '18]. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 353. Case Study: Watson Discovery Content Intelligence. Element: Expanded SRL as semantic NLP primitives, e.g., business transaction verbs in future tense with positive polarity. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 354. Case Study: Watson Discovery Content Intelligence. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 355. Case Study: Watson Discovery Content Intelligence. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 356. Case Study: Watson Discovery Content Intelligence. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 357. Explainability + Tooling → Better Root Cause Analysis. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021. Yannis Katsis and Christine T. Wolf. ModelLens: An Interactive System to Support the Model Improvement Practices of
  • 358. Model Stability with Increasing Complexity. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 359. Effectiveness of Feedback Incorporation. A. Agarwal et al. Development of an Enterprise-Grade Contract Understanding System. NAACL (industry) 2021
  • 360. Human & Machine Co-Creation. Prithvi Sen et al. HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop. ACL 2019. Prithvi Sen et al. Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
  • 361. User Study: Human & Machine Co-Creation. User study: 4 NLP engineers with 1-2 years of experience; 2 NLP experts with 10+ years of experience. Key takeaways: ● Explanation of learned rules: the visualization tool is very effective ● Reduction in human labor: the co-created model, built within 1.5 person-hours, outperforms a black-box sentence classifier ● Lower requirement on human expertise: the co-created model is on par with the model created by super-experts. Prithvi Sen et al. HEIDL: Learning Linguistic Expressions with Deep Learning and Human-in-the-Loop. ACL 2019. Prithvi Sen et al. Learning Explainable Linguistic Expressions with Neural Inductive Logic Programming for Sentence Classification.
  • 362. Summary: Value of Meaning Representation. (Table: rows are applications, columns are benefits of SRL and AMR: works out-of-box, deeper understanding of text, overcomes low-resource challenges, robustness against linguistic variants & complexity, better model generalization, explainability & interpretability.) Information Extraction ✔ ✔ ✔; Text Classification ✔ ✔ ✔ ✔; Natural Language Inference ✔; Question Answering ✔ ✔; Dialog ✔; Machine Translation ✔ ✔.
  • 363. Meaning Representations for Natural Languages Tutorial Part 5 Open Questions and Future Work Jeffrey Flanigan, Tim O’Gorman, Ishan Jindal, Yunyao Li, Martha Palmer, Nianwen Xue
  • 364. Open Questions: Symbolic AMRs vs. LLMs? • Do we need to think of an opposition between symbolic AMRs and "deep learning"? • Advantages of AMR: it is explainable and controllable • AMR can sometimes provide rich semantics that help generalization • Open questions regarding the impact of AMR error propagation (how much are we getting hurt by being discrete + symbolic?) • How much do pretrained AMR graph representations change this (if they become generally useful in applications)?
  • 365. Open Questions: Explainability. • One simple story of the main advantages of AMR over direct end-to-end language models: controllability and explainability. • We have some case studies and applications, and believe that AMRs are clearly more explainable and controllable than black-box LLMs. • But: not a lot of work connecting symbolic meaning representations to the current explainability literature • No current work (that I'm aware of)
  • 366. Open Questions: Low-resource Tasks and Languages. • There can be advantages to approaching low-resource tasks with AMR graphs: you start with a lot of rich semantic distinctions! • Open questions: how to transfer quickly to new tasks • In theory, it is hoped that AMR/UMR can be cross-linguistically robust for low-resource languages as well, especially as we extend to more languages / expand UMR • But: related tasks (domain adaptation, projecting AMRs into new languages, etc.) remain largely unexplored.
  • 367. Open Questions: World Knowledge & Discourse. • Multi-sentence AMR is only now starting to be modeled and trained on • A huge range of IE and QA tasks go beyond the sentence, where AMR might help (e.g. multi-hop reasoning) • AMR "entity linking" is often used, but there are limited explorations of AMR with structured knowledge sources / KBs • There is promise for rich retrieval and understanding of entire documents (especially with temporal and modal information, as in UMR), but this is currently under-explored.