A Transition-Based Directed Acyclic Graph Parser
for Universal Conceptual Cognitive Annotation
Daniel Hershcovich, Omri Abend and Ari Rappoport
ACL 2017
TUPA — Transition-based UCCA Parser
The first parser to support the combination of three properties:
1. Non-terminal nodes — entities and events over the text
2. Reentrancy — allows argument sharing
3. Discontinuity — conceptual units may be split
— needed for many semantic schemes (e.g. AMR, UCCA).

[Example: UCCA graph of "You want to take a long bath"]
Introduction
Linguistic Structure Annotation Schemes
• Syntactic dependencies
• Semantic dependencies (Oepen et al., 2016)
• Semantic role labeling (PropBank, FrameNet)
• AMR (Banarescu et al., 2013)
• UCCA (Abend and Rappoport, 2013)
• Other semantic representation schemes¹

[Figure: bilexical dependency graphs for "You want to take a long bath" —
syntactic (UD), with arcs such as nsubj, xcomp, mark, dobj, det, amod,
and semantic (DM), with arcs such as top, ARG1, ARG2, BV]

Semantic representation schemes attempt to abstract away from
syntactic detail that does not affect meaning:
. . . bathed = . . . took a bath

¹ See the recent survey (Abend and Rappoport, 2017).
The UCCA Semantic Representation Scheme
Universal Conceptual Cognitive Annotation (UCCA)
Cross-linguistically applicable (Abend and Rappoport, 2013).
Stable in translation (Sulem et al., 2015).
[Figure: parallel UCCA graphs of an English sentence and its Hebrew translation]
Universal Conceptual Cognitive Annotation (UCCA)
Rapid and intuitive annotation interface (Abend et al., 2017).
Usable by non-experts. ucca-demo.cs.huji.ac.il
Facilitates semantics-based human evaluation of machine
translation (Birch et al., 2016). ucca.cs.huji.ac.il/mteval
Graph Structure
UCCA generates a directed acyclic graph (DAG).
Text tokens are terminals, complex units are non-terminal nodes.
Remote edges enable reentrancy for argument sharing.
Phrases may be discontinuous (e.g., multi-word expressions).
[Figure: UCCA graph of "You want to take a long bath"; solid lines mark
primary edges, dashed lines mark remote edges]
Edge labels: P process, A participant, C center, D adverbial, F function.
Transition-based UCCA Parsing
Transition-Based Parsing
First used for dependency parsing (Nivre, 2004).
Parse text w1 ... wn into a graph G incrementally by applying
transitions to the parser state: stack, buffer and constructed graph.
Initial state:
stack: (empty)    buffer: You want to take a long bath
TUPA transitions:
{Shift, Reduce, Node_X, Left-Edge_X, Right-Edge_X,
Left-Remote_X, Right-Remote_X, Swap, Finish}
These support non-terminal nodes, reentrancy and discontinuity.
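A minimal sketch of the state and transition semantics in Python. This is an illustration of the mechanics described above, not the actual TUPA implementation (see github.com/danielhers/tupa); the names Node, State and apply are hypothetical.

from dataclasses import dataclass, field

@dataclass
class Node:
    text: str = None                              # terminal token, or None for a non-terminal
    outgoing: list = field(default_factory=list)  # (label, child, is_remote) triples

@dataclass
class State:
    stack: list
    buffer: list
    finished: bool = False

def apply(state, transition, label=None):
    # Apply one transition (simplified: no validity checks, no root node).
    if transition == "Shift":              # move the buffer head to the stack
        state.stack.append(state.buffer.pop(0))
    elif transition == "Reduce":           # discard the stack top
        state.stack.pop()
    elif transition == "Node":             # new non-terminal parent of the stack top,
        parent = Node()                    # placed at the head of the buffer
        parent.outgoing.append((label, state.stack[-1], False))
        state.buffer.insert(0, parent)
    elif transition == "Right-Edge":       # primary edge stack[-2] -> stack[-1]
        state.stack[-2].outgoing.append((label, state.stack[-1], False))
    elif transition == "Left-Edge":        # primary edge stack[-1] -> stack[-2]
        state.stack[-1].outgoing.append((label, state.stack[-2], False))
    elif transition == "Right-Remote":     # remote edge stack[-2] -> stack[-1]
        state.stack[-2].outgoing.append((label, state.stack[-1], True))
    elif transition == "Left-Remote":      # remote edge stack[-1] -> stack[-2]
        state.stack[-1].outgoing.append((label, state.stack[-2], True))
    elif transition == "Swap":             # send stack[-2] back to the buffer
        state.buffer.insert(0, state.stack.pop(-2))
    elif transition == "Finish":           # terminate, mark the graph complete
        state.finished = True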
Example
Parsing "You want to take a long bath", step by step. Each line shows the
transition applied and the terminal tokens visible on the stack and buffer
afterwards; non-terminal nodes created by Node transitions also occupy
stack and buffer positions, and the graph is constructed incrementally
(the complete graph appears on the Training slide below).

⇒ Shift            stack: You         buffer: want to take a long bath
⇒ Right-Edge_A     stack: You         buffer: want to take a long bath
⇒ Shift            stack: You want    buffer: to take a long bath
⇒ Swap             stack: want        buffer: You to take a long bath
⇒ Right-Edge_P     stack: want        buffer: You to take a long bath
⇒ Reduce           stack:             buffer: You to take a long bath
⇒ Shift            stack: You         buffer: to take a long bath
⇒ Shift            stack: You to      buffer: take a long bath
⇒ Node_F           stack: You to      buffer: take a long bath
⇒ Reduce           stack: You         buffer: take a long bath
⇒ Shift            stack: You         buffer: take a long bath
⇒ Shift            stack: You take    buffer: a long bath
⇒ Node_C           stack: You take    buffer: a long bath
⇒ Reduce           stack: You         buffer: a long bath
⇒ Shift            stack: You         buffer: a long bath
⇒ Right-Edge_P     stack: You         buffer: a long bath
⇒ Shift            stack: You a       buffer: long bath
⇒ Right-Edge_F     stack: You a       buffer: long bath
⇒ Reduce           stack: You         buffer: long bath
⇒ Shift            stack: You long    buffer: bath
⇒ Swap             stack: You long    buffer: bath
⇒ Right-Edge_D     stack: You long    buffer: bath
⇒ Reduce           stack: You         buffer: bath
⇒ Swap             stack: You         buffer: bath
⇒ Right-Edge_A     stack: You         buffer: bath
⇒ Reduce           stack: You         buffer: bath
⇒ Reduce           stack: You         buffer: bath
⇒ Shift            stack: You         buffer: bath
⇒ Shift            stack: You         buffer: bath
⇒ Left-Remote_A    stack: You         buffer: bath
⇒ Shift            stack: You bath    buffer:
⇒ Right-Edge_C     stack: You bath    buffer:
⇒ Finish
Training
An oracle provides the transition sequence given the correct graph:

[Gold UCCA graph of "You want to take a long bath", as on the Graph Structure slide]
⇓
Shift, Right-Edge_A, Shift, Swap, Right-Edge_P, Reduce, Shift,
Shift, Node_F, Reduce, Shift, Shift, Node_C, Reduce, Shift,
Right-Edge_P, Shift, Right-Edge_F, Reduce, Shift, Swap,
Right-Edge_D, Reduce, Swap, Right-Edge_A, Reduce, Reduce, Shift,
Shift, Left-Remote_A, Shift, Right-Edge_C, Finish
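Sketched as a training loop, assuming a static oracle that is followed at every step; apply is the function from the transition sketch above, while initial_state, features and oracle are hypothetical helpers, not the actual TUPA API.

def train_example(classifier, tokens, gold_graph, oracle):
    state = initial_state(tokens)                  # empty stack, all tokens on the buffer
    while not state.finished:
        gold = oracle(state, gold_graph)           # correct (transition, label) for this state
        classifier.update(features(state), gold)   # supervised update toward the oracle
        apply(state, *gold)                        # follow the oracle transition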
TUPA Model
Learn to greedily predict the next transition based on the current state.
Experimenting with three classifiers:
Sparse: Perceptron with sparse features (Zhang and Nivre, 2011).
MLP: Embeddings + feedforward NN (Chen and Manning, 2014).
BiLSTM: Embeddings + deep bidirectional LSTM + MLP
(Kiperwasser and Goldberg, 2016).
Features: words, POS, syntactic dependencies and existing edge labels
from the stack and buffer + parents, children, grandchildren;
ordinal features (height, number of parents and children).
The BiLSTM encodes effective "lookahead" in the representation.

[Figure: a deep bidirectional LSTM runs over "You want to take a long
bath"; the encodings of the current stack and buffer elements feed an MLP
that scores the next transition. In the state shown, with "You take" on
the stack and "a long bath" on the buffer, the prediction is Node_C]
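An illustrative sketch of the BiLSTM variant, written in PyTorch rather than the paper's own implementation: encode the sentence once, then at each step concatenate the encodings of a few stack/buffer positions and score all transitions with an MLP. The dimensions and the choice of four feature positions are assumptions made for the sketch.

import torch
import torch.nn as nn

class BiLSTMTransitionClassifier(nn.Module):
    def __init__(self, vocab_size, n_transitions, d_emb=100, d_lstm=200, layers=2):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d_emb)
        self.encoder = nn.LSTM(d_emb, d_lstm, num_layers=layers,
                               bidirectional=True, batch_first=True)
        # Four feature positions (e.g. two stack tops, two buffer heads),
        # each represented by a 2*d_lstm bidirectional encoding.
        self.mlp = nn.Sequential(nn.Linear(4 * 2 * d_lstm, 200),
                                 nn.ReLU(),
                                 nn.Linear(200, n_transitions))

    def forward(self, token_ids, feature_positions):
        # token_ids: (1, n) tensor of word indices for the whole sentence.
        encoded, _ = self.encoder(self.embed(token_ids))  # (1, n, 2*d_lstm)
        features = torch.cat([encoded[0, i] for i in feature_positions])
        return self.mlp(features)                         # scores per transition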
Experiments
Experimental Setup
• UCCA Wikipedia corpus (train: 4268 + dev: 454 + test: 503 sentences).
• Out-of-domain: English part of English-French parallel corpus,
Twenty Thousand Leagues Under the Sea (506 sentences).
Baselines
No existing UCCA parsers ⇒ conversion-based approximation.
Bilexical DAG parsers (allow reentrancy):
• DAGParser (Ribeyre et al., 2014): transition-based.
• TurboParser (Almeida and Martins, 2015): graph-based.
Tree parsers (all transition-based):
• MaltParser (Nivre et al., 2007): bilexical tree parser.
• Stack LSTM Parser (Dyer et al., 2015): bilexical tree parser.
• uparse (Maier, 2015): allows non-terminals, discontinuity.
[Figure: bilexical DAG approximation of the UCCA graph for "You want to
take a long bath", with edge labels A, F, D and C]
UCCA bilexical DAG approximation (for trees, delete remote edges).
Bilexical Graph Approximation
1. Convert UCCA to bilexical dependencies.
2. Train bilexical parsers and apply to test sentences.
3. Reconstruct UCCA graphs and compare with gold standard.
[Figure: UCCA graph and its converted bilexical dependency graph for
"After graduation, Joe moved to Paris", with labels L, U, A, H, R, P, C]
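For illustration, a heavily hedged sketch of step 1. The paper defines its own head rules for the conversion; here a unit's head is simply the head of its first center (C) child when one exists, which is one plausible choice, not the paper's exact procedure. Node is the structure from the parsing sketch above.

def head_terminal(node):
    # A terminal is its own head; otherwise prefer a C (center) child.
    if node.text is not None:
        return node
    children = sorted(node.outgoing, key=lambda e: e[0] != "C")
    return head_terminal(children[0][1])

def to_bilexical(root):
    # Yield (head_token, dependent_token, label) triples for primary edges.
    for label, child, remote in root.outgoing:
        if not remote:
            h, d = head_terminal(root), head_terminal(child)
            if h is not d:
                yield (h.text, d.text, label)
            yield from to_bilexical(child)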
Evaluation
Comparing graphs over the same sequence of tokens:
• Match edges by their terminal yield and label.
• Calculate labeled precision, recall and F1 scores.
• Separate primary and remote edges.
[Figure: gold and predicted UCCA graphs for "After graduation, Joe moved
to Paris"; the prediction differs from gold in several edge labels (e.g.,
S instead of P on "graduation", F instead of R on "to") and in one extra
remote edge]

Primary:  LP = 6/9 = 67%   LR = 6/10 = 60%   LF = 64%
Remote:   LP = 1/2 = 50%   LR = 1/1 = 100%   LF = 67%
Results
TUPA_BiLSTM obtains the highest F-scores in all metrics:

                   Primary edges          Remote edges
                  LP    LR    LF         LP    LR    LF
TUPA_Sparse      64.5  63.7  64.1       19.8  13.4  16
TUPA_MLP         65.2  64.6  64.9       23.7  13.2  16.9
TUPA_BiLSTM      74.4  72.7  73.5       47.4  51.6  49.4
Bilexical DAG    (conversion upper bound: 91)   (58.3)
DAGParser        61.8  55.8  58.6        9.5   0.5   1
TurboParser      57.7  46    51.2       77.8   1.8   3.7
Bilexical tree   (conversion upper bound: 91)   –
MaltParser       62.8  57.7  60.2         –     –     –
Stack LSTM       73.2  66.9  69.9         –     –     –
Tree             (conversion upper bound: 100)  –
uparse           60.9  61.2  61.1         –     –     –

Results on the Wiki test set.
Results
Comparable results on the out-of-domain test set:

                   Primary edges          Remote edges
                  LP    LR    LF         LP    LR    LF
TUPA_Sparse      59.6  59.9  59.8       22.2   7.7  11.5
TUPA_MLP         62.3  62.6  62.5       20.9   6.3   9.7
TUPA_BiLSTM      68.7  68.5  68.6       38.6  18.8  25.3
Bilexical DAG    (conversion upper bound: 91.3)  (43.4)
DAGParser        56.4  50.6  53.4         –    0     0
TurboParser      50.3  37.7  43.1       100    0.4   0.8
Bilexical tree   (conversion upper bound: 91.3)  –
MaltParser       57.8  53    55.3         –     –     –
Stack LSTM       66.1  61.1  63.5         –     –     –
Tree             (conversion upper bound: 100)   –
uparse           52.7  52.8  52.8         –     –     –

Results on the 20K Leagues out-of-domain set.
Conclusion
• UCCA’s semantic distinctions require a graph structure
including non-terminals, reentrancy and discontinuity.
• TUPA is an accurate transition-based UCCA parser, and the
first to support UCCA and any DAG over the text tokens.
• Outperforms strong conversion-based baselines.
Future Work:
• More languages (German corpus construction is underway).
• Parsing other schemes, such as AMR.
• Compare semantic representations through conversion.
• Text simplification, MT evaluation and other applications.
Code: github.com/danielhers/tupa
Demo: bit.ly/tupademo
Corpora: cs.huji.ac.il/~oabend/ucca.html
Thank you!
References I
Abend, O. and Rappoport, A. (2013).
Universal Conceptual Cognitive Annotation (UCCA).
In Proc. of ACL, pages 228–238.
Abend, O. and Rappoport, A. (2017).
The state of the art in semantic representation.
In Proc. of ACL.
to appear.
Abend, O., Yerushalmi, S., and Rappoport, A. (2017).
UCCAApp: Web-application for syntactic and semantic phrase-based annotation.
In Proc. of ACL: System Demonstration Papers.
to appear.
Almeida, M. S. C. and Martins, A. F. T. (2015).
Lisbon: Evaluating TurboSemanticParser on multiple languages and out-of-domain data.
In Proc. of SemEval, pages 970–973.
Banarescu, L., Bonial, C., Cai, S., Georgescu, M., Griffitt, K., Hermjakob, U., Knight, K., Palmer, M., and
Schneider, N. (2013).
Abstract Meaning Representation for sembanking.
In Proc. of the Linguistic Annotation Workshop.
Birch, A., Abend, O., Bojar, O., and Haddow, B. (2016).
HUME: Human UCCA-based evaluation of machine translation.
In Proc. of EMNLP, pages 1264–1274.
Chen, D. and Manning, C. (2014).
A fast and accurate dependency parser using neural networks.
In Proc. of EMNLP, pages 740–750.
References II
Dyer, C., Ballesteros, M., Ling, W., Matthews, A., and Smith, N. A. (2015).
Transition-based dependency parsing with stack long short-term memory.
In Proc. of ACL, pages 334–343.
Kiperwasser, E. and Goldberg, Y. (2016).
Simple and accurate dependency parsing using bidirectional LSTM feature representations.
TACL, 4:313–327.
Maier, W. (2015).
Discontinuous incremental shift-reduce parsing.
In Proc. of ACL, pages 1202–1212.
Nivre, J. (2004).
Incrementality in deterministic dependency parsing.
In Keller, F., Clark, S., Crocker, M., and Steedman, M., editors, Proceedings of the ACL Workshop
Incremental Parsing: Bringing Engineering and Cognition Together, pages 50–57, Barcelona, Spain.
Association for Computational Linguistics.
Nivre, J., Hall, J., Nilsson, J., Chanev, A., Eryiğit, G., Kübler, S., Marinov, S., and Marsi, E. (2007).
MaltParser: A language-independent system for data-driven dependency parsing.
Natural Language Engineering, 13(02):95–135.
Oepen, S., Kuhlmann, M., Miyao, Y., Zeman, D., Cinková, S., Flickinger, D., Hajič, J., Ivanova, A., and Urešová, Z. (2016).
Towards comparability of linguistic graph banks for semantic parsing.
In LREC.
Ribeyre, C., Villemonte de la Clergerie, E., and Seddah, D. (2014).
Alpage: Transition-based semantic graph parsing with syntactic features.
In Proc. of SemEval, pages 97–103.
References III
Sulem, E., Abend, O., and Rappoport, A. (2015).
Conceptual annotations preserve structure across translations: A French-English case study.
In Proc. of S2MT, pages 11–22.
Zhang, Y. and Nivre, J. (2011).
Transition-based dependency parsing with rich non-local features.
In Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human
Language Technologies, pages 188–193.
Backup
UCCA Corpora
                      Wiki                      20K
                 Train     Dev     Test      Leagues
# passages         300       34      33         154
# sentences       4268      454     503         506
# nodes        298,993   33,704  35,718      29,315
% terminal       42.96    43.54   42.87       42.09
% non-term.      58.33    57.60   58.35       60.01
% discont.        0.54     0.53    0.44        0.81
% reentrant       2.38     1.88    2.15        2.03
# edges        287,914   32,460  34,336      27,749
% primary        98.25    98.75   98.74       97.73
% remote          1.75     1.25    1.26        2.27
Average per non-terminal node:
# children        1.67     1.68    1.66        1.61

Corpus statistics.
Evaluation
Mutual edges between predicted graph $G_p = (V_p, E_p, \ell_p)$ and gold
graph $G_g = (V_g, E_g, \ell_g)$, both over terminals $W = \{w_1, \ldots, w_n\}$:

$$M(G_p, G_g) = \{(e_1, e_2) \in E_p \times E_g \mid y(e_1) = y(e_2) \wedge \ell_p(e_1) = \ell_g(e_2)\}$$

The yield $y(e) \subseteq W$ of an edge $e = (u, v)$ in either graph is the set
of terminals in $W$ that are descendants of $v$; $\ell$ is the edge label.
Labeled precision, recall and F-score are then defined as:

$$LP = \frac{|M(G_p, G_g)|}{|E_p|}, \qquad LR = \frac{|M(G_p, G_g)|}{|E_g|}, \qquad LF = \frac{2 \cdot LP \cdot LR}{LP + LR}.$$

Two variants: one for primary edges, and another for remote edges.
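These definitions translate directly into a few lines of code. A minimal sketch, assuming edges are represented as (yield, label) pairs with the yield given as a frozenset of terminal positions; this is a hypothetical representation, not the official evaluation script.

def labeled_scores(predicted, gold):
    # predicted, gold: sets of (frozenset_of_terminal_indices, label) pairs.
    mutual = predicted & gold                      # same yield and same label
    lp = len(mutual) / len(predicted) if predicted else 0.0
    lr = len(mutual) / len(gold) if gold else 0.0
    lf = 2 * lp * lr / (lp + lr) if lp + lr else 0.0
    return lp, lr, lf

# Computed separately over primary and over remote edges, as on the
# Evaluation slide: e.g. 6 matching of 9 predicted and 10 gold primary
# edges gives LP = 67%, LR = 60%.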