1
Learning for Semantic Parsing Using
Statistical Syntactic Parsing
Techniques
Ruifang Ge
Ph.D. Final Defense
Supervisor: Raymond J. Mooney
Machine Learning Group
Department of Computer Science
The University of Texas at Austin
2
Semantic Parsing
 Semantic Parsing: Transforming natural
language (NL) sentences into completely
formal meaning representations (MRs)
 Sample application domains where MRs
are directly executable by another
computer system to perform some task
 CLang: RoboCup Coach Language
 Geoquery: A Database Query Application
3
CLang (RoboCup Coach Language)
 In RoboCup Coach competition, teams compete to coach
simulated players
 The coaching instructions are given in a formal language
called CLang
Simulated soccer field
Coach: If our player 2 has the ball, then position our player 5 in the midfield.
Semantic Parsing
CLang: ((bowner (player our {2})) (do (player our {5}) (pos (midfield))))
4
GeoQuery: A Database Query Application
 Query application for U.S. geography
database [Zelle & Mooney, 1996]
User: What are the rivers in Texas?
Semantic Parsing
Query: answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Database: Angelina, Blanco, …
5
Motivation for Semantic
Parsing
 Theoretically, it answers the question of how
people interpret language
 Practical applications
 Question answering
 Natural language interface
 Knowledge acquisition
 Reasoning
6
Motivating Example
Semantic parsing is a compositional process.
Sentence structures are needed for building meaning representations.
((bowner (player our {2})) (do our {4} (pos (half our))))
If our player 2 has the ball, our player 4 should stay in our half
bowner: ball owner
pos: position
7
Syntax-Based Approaches
 Meaning composition follows the tree structure of a
syntactic parse
 Composing the meaning of a constituent from the
meanings of its sub-constituents in a syntactic parse
 Hand-built approaches (Woods, 1970, Warren and Pereira,
1982)
 Learned approaches
 Miller et al. (1996): Conceptually simple sentences
 Zettlemoyer & Collins (2005)): hand-built Combinatory
Categorial Grammar (CCG) template rules
8
Example
our player 2 has the ball
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))
MR: bowner(player(our,2))
Use the structure of a syntactic parse
9
Example
our player 2 has the ball
(S (NP (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))
MR: bowner(player(our,2))
Assign semantic concepts to words
10
Example
our player 2 has the ball
(S (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP (VB-bowner(_) has) (NP (DT-null the) (NN-null ball))))
MR: bowner(player(our,2))
Compose meaning for the internal nodes
11
Example
our player 2 has the ball
(S (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))
MR: bowner(player(our,2))
Compose meaning for the internal nodes
12
Example
our player 2 has the ball
(S-bowner(player(our,2)) (NP-player(our,2) (PRP$-our our) (NN-player(_,_) player) (CD-2 2)) (VP-bowner(_) (VB-bowner(_) has) (NP-null (DT-null the) (NN-null ball))))
MR: bowner(player(our,2))
Compose meaning for the internal nodes
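To make the composition step concrete, here is a minimal sketch (not from the thesis) of the bottom-up process in this example. A node's meaning is either null, a finished MR string, or a predicate still waiting for arguments; the helper names are assumptions for illustration.

```python
# Hedged sketch of the bottom-up meaning composition in the example above.
# A node's meaning is None, a finished MR string, or a function that is still
# waiting for its remaining arguments (an "open" predicate such as bowner(_)).
def fill(open_pred, child_meanings):
    """Apply an open predicate to whichever children carry a meaning;
    with no meaningful children, the predicate stays open."""
    args = [m for m in child_meanings if m is not None]
    return open_pred(*args) if args else open_pred

player = lambda team, unum: f"player({team},{unum})"   # player(_,_)
bowner = lambda p: f"bowner({p})"                      # bowner(_)

np_player = fill(player, ["our", "2"])    # NP-player(our,2)
np_ball   = None                          # NP-null ("the ball")
vp        = fill(bowner, [np_ball])       # VP-bowner(_): still open
s         = fill(vp, [np_player])         # S-bowner(player(our,2))
print(s)                                  # bowner(player(our,2))
```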
13
Semantic Grammars
 Non-terminals in a semantic grammar
correspond to semantic concepts in
application domains
 Hand-built approaches (Hendrix et al., 1978)
 Learned approaches
 Tang & Mooney (2001), Kate & Mooney (2006), Wong &
Mooney (2006)
14
Example
our player 2 has the ball
Semantic-grammar derivation: bowner covers the whole sentence; player covers "our player 2", with our and 2 as its sub-constituents
MR: bowner(player(our,2))
bowner → player has the ball
15
Thesis Contributions
 Introduce two novel syntax-based
approaches to semantic parsing
 Theoretically well-founded in computational
semantics (Blackburn and Bos, 2005)
 Great opportunity: leverage the significant
progress made in statistical syntactic parsing
for semantic parsing (Collins, 1997; Charniak and
Johnson, 2005; Huang, 2008)
16
Thesis Contributions
 SCISSOR: a novel integrated syntactic-
semantic parser
 SYNSEM: exploits an existing syntactic parser
to produce disambiguated parse trees that
drive the meaning composition process
 Investigate when the knowledge of syntax
can help
17
Representing Semantic Knowledge in
Meaning Representation Language Grammar
(MRLG)
Production                        Predicate
CONDITION → (bowner PLAYER)       P_BOWNER
PLAYER → (player TEAM {UNUM})     P_PLAYER
UNUM → 2                          P_UNUM
TEAM → our                        P_OUR
 Assumes a meaning representation language (MRL)
is defined by an unambiguous context-free
grammar.
 Each production rule introduces a single predicate
in the MRL.
 The parse of an MR gives its predicate-argument
structure.
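As a concrete illustration, below is a minimal sketch (assumptions for illustration, not the thesis implementation) of how the productions above and the single predicate each one introduces might be represented, and how walking an MR's parse tree reads off its predicate-argument structure.

```python
# Hedged sketch: MRLG productions paired with the predicate each one introduces.
MRLG = {
    "CONDITION": [("(bowner PLAYER)", "P_BOWNER")],
    "PLAYER":    [("(player TEAM {UNUM})", "P_PLAYER")],
    "UNUM":      [("2", "P_UNUM")],
    "TEAM":      [("our", "P_OUR")],
}

def predicates_of(mr_parse):
    """Walk a parsed MR, given as nested (non_terminal, rhs_used, children)
    tuples, and read off its predicate-argument structure."""
    nonterm, rhs_used, children = mr_parse
    predicate = dict(MRLG[nonterm])[rhs_used]
    return (predicate, [predicates_of(c) for c in children])

# Parse of the MR "(bowner (player our {2}))" under the grammar above:
mr_parse = ("CONDITION", "(bowner PLAYER)",
            [("PLAYER", "(player TEAM {UNUM})",
              [("TEAM", "our", []), ("UNUM", "2", [])])])
print(predicates_of(mr_parse))
# -> ('P_BOWNER', [('P_PLAYER', [('P_OUR', []), ('P_UNUM', [])])])
```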
18
Roadmap
 SCISSOR
 SYNSEM
 Future Work
 Conclusions
19
 Semantic Composition that Integrates Syntax
and Semantics to get Optimal Representations
 Integrated syntactic-semantic parsing
 Allows both syntax and semantics to be used
simultaneously to obtain an accurate combined
syntactic-semantic analysis
 A statistical parser is used to generate a
semantically augmented parse tree (SAPT)
SCISSOR
20
Syntactic Parse
our player 2 has the ball
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))
21
SAPT
our player 2 has the ball
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
Non-terminals now have both syntactic and semantic labels
Semantic labels: the dominant predicates in the sub-trees
22
SAPT
our player 2 has the ball
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
MR: P_BOWNER(P_PLAYER(P_OUR,P_UNUM))
23
SCISSOR Overview
TRAINING: SAPT Training Examples → learner → Integrated Semantic Parser
24
SCISSOR Overview
TESTING: NL Sentence → Integrated Semantic Parser → SAPT → Compose MR → MR
25
Extending Collins’ (1997) Syntactic
Parsing Model
 Find a SAPT with the maximum probability
 A lexicalized head-driven syntactic parsing
model
 Extending the parsing model to generate
semantic labels simultaneously with syntactic
labels
26
Why Extend Collins’ (1997) Syntactic
Parsing Model
 Suitable for incorporating semantic
knowledge
 Head dependency: predicate-argument relation
 Syntactic subcategorization: a set of arguments
that a predicate appears with
 Bikel (2004) implementation: easily
extendable
27
Parser Implementation
 Supervised training on annotated SAPTs is
just frequency counting
 Testing: a variant of standard CKY chart-
parsing algorithm
 Details in the thesis
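The thesis gives the actual algorithm; purely to illustrate the family of algorithms being adapted, here is a generic Viterbi CKY sketch over binarized rules. In SCISSOR the chart labels would be the combined syntactic-semantic labels and the probabilities would come from the lexicalized head-driven model; none of that detail is shown here, and the grammar format is an assumption.

```python
# Hedged sketch of Viterbi CKY over binarized rules; in SCISSOR the chart labels
# would be combined syntactic-semantic labels learned from SAPTs.
def cky(words, lexical, binary):
    """lexical: {(label, word): prob}; binary: {(parent, left, right): prob}."""
    n = len(words)
    chart = [[{} for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        for (label, word), p in lexical.items():
            if word == w and p > chart[i][i + 1].get(label, 0.0):
                chart[i][i + 1][label] = p
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for (parent, left, right), p in binary.items():
                    if left in chart[i][j] and right in chart[j][k]:
                        score = p * chart[i][j][left] * chart[j][k][right]
                        if score > chart[i][k].get(parent, 0.0):
                            chart[i][k][parent] = score
    return chart[0][n]   # best score for each label spanning the whole sentence
```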
28
Smoothing
 Each label in SAPT is the combination of a
syntactic label and a semantic label
 Increases data sparsity
 Break the parameters down
Ph(H | P, w)
= Ph(Hsyn, Hsem | P, w)
= Ph(Hsyn | P, w) × Ph(Hsem | P, w, Hsyn)
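A count-based sketch of this factorization (variable names are assumptions, and the additional back-off smoothing a real model would need is omitted): since supervised training is frequency counting, each factor can be estimated by relative frequency.

```python
from collections import Counter

# Hedged sketch: estimate Ph(Hsyn | P, w) * Ph(Hsem | P, w, Hsyn) from counts
# gathered off annotated SAPTs.
syn_counts, syn_sem_counts, parent_counts = Counter(), Counter(), Counter()

def observe(parent, word, h_syn, h_sem):
    parent_counts[(parent, word)] += 1
    syn_counts[(parent, word, h_syn)] += 1
    syn_sem_counts[(parent, word, h_syn, h_sem)] += 1

def p_head(parent, word, h_syn, h_sem):
    p_syn = syn_counts[(parent, word, h_syn)] / parent_counts[(parent, word)]
    p_sem = syn_sem_counts[(parent, word, h_syn, h_sem)] / syn_counts[(parent, word, h_syn)]
    return p_syn * p_sem

observe("S-P_BOWNER", "has", "VP", "P_BOWNER")
print(p_head("S-P_BOWNER", "has", "VP", "P_BOWNER"))   # 1.0 on this toy count
```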
29
Experimental Corpora
 CLang (Kate, Wong & Mooney, 2005)
 300 pieces of coaching advice
 22.52 words per sentence
 Geoquery (Zelle & Mooney, 1996)
 880 queries on a geography database
 7.48 words per sentence
 MRL: Prolog and FunQL
30
Prolog vs. FunQL (Wong, 2007)
Prolog:
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
What are the rivers in Texas?
FunQL:
answer(river(loc_2(stateid(texas))))
x1: river; x2: texas
Logical forms: widely used as MRLs in
computational semantics, support reasoning
31
Prolog:
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
What are the rivers in Texas?
FunQL:
answer(river(loc_2(stateid(texas))))
Prolog: flexible order of conditions
FunQL: strict order
Better generalization on Prolog
Prolog vs. FunQL (Wong, 2007)
32
Experimental Methodology
 Standard 10-fold cross-validation
 Correctness
 CLang: exactly matches the correct MR
 Geoquery: retrieves the same answers as the
correct MR
 Metrics
 Precision: % of the returned MRs that are correct
 Recall: % of NLs with their MRs correctly returned
 F-measure: harmonic mean of precision and recall
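For concreteness, a small sketch of the three metrics; the example counts are made up, chosen only to roughly reproduce one row of the later results tables.

```python
def evaluate(num_correct, num_returned, num_sentences):
    """Precision, recall, and F-measure as defined on this slide."""
    precision = num_correct / num_returned
    recall = num_correct / num_sentences
    f_measure = 2 * precision * recall / (precision + recall)
    return precision, recall, f_measure

# e.g. 221 correct MRs out of 247 returned, over 300 test sentences:
print(evaluate(221, 247, 300))   # roughly (0.895, 0.737, 0.808)
```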
33
Compared Systems
 COCKTAIL (Tang & Mooney, 2001)
 Deterministic, inductive logic programming
 WASP (Wong & Mooney, 2006)
 Semantic grammar, machine translation
 KRISP (Kate & Mooney, 2006)
 Semantic grammar, string kernels
 Z&C (Zettleymoyer & Collins, 2007)
 Syntax-based, combinatory categorial grammar (CCG)
 LU (Lu et al., 2008)
 Semantic grammar, generative parsing model
34
Compared Systems
 COCKTAIL (Tang & Mooney, 2001)
 Deterministic, inductive logic programming
 WASP (Wong & Mooney, 2006)
 Semantic grammar, machine translation
 KRISP (Kate & Mooney, 2006)
 Semantic grammar, string kernels
 Z&C (Zettleymoyer & Collins, 2007)
 Syntax-based, combinatory categorial grammar (CCG)
 LU (Lu et al., 2008)
 Semantic grammar, generative parsing model
Hand-built
lexicon
for Geoquery
Manual CCG
Template rules
35
Compared Systems
 COCKTAIL (Tang & Mooney, 2001)
 Deterministic, inductive logic programming
 WASP (Wong & Mooney, 2006)
 Semantic grammar, machine translation
 KRISP (Kate & Mooney, 2006)
 Semantic grammar, string kernels
 Z&C (Zettleymoyer & Collins, 2007)
 Syntax-based, combinatory categorial grammar (CCG)
 LU (Lu et al., 2008)
 Semantic grammar, generative parsing model
λ-WASP,
handling logical
forms
36
Results on CLang
            Precision  Recall  F-measure
COCKTAIL    -          -       -           (memory overflow)
SCISSOR     89.5       73.7    80.8
WASP        88.9       61.9    73.0
KRISP       85.2       61.9    71.7
Z&C         -          -       -           (not reported)
LU          82.4       57.7    67.8
(LU: F-measure after reranking is 74.4%)
37
Results on CLang
Precision Recall F-measure
SCISSOR 89.5 73.7 80.8
WASP 88.9 61.9 73.0
KRISP 85.2 61.9 71.7
LU 82.4 57.7 67.8
(LU: F-measure after reranking is 74.4%)
38
Results on Geoquery
            Precision  Recall  F-measure
SCISSOR     92.1       72.3    81.0        (FunQL)
WASP        87.2       74.8    80.5        (FunQL)
KRISP       93.3       71.7    81.1        (FunQL)
LU          86.2       81.8    84.0        (FunQL)
COCKTAIL    89.9       79.4    84.3        (Prolog)
λ-WASP      92.0       86.6    89.2        (Prolog)
Z&C         95.5       83.2    88.9        (Prolog)
(LU: F-measure after reranking is 85.2%)
39
Results on Geoquery (FunQL)
Precision Recall F-measure
SCISSOR 92.1 72.3 81.0
WASP 87.2 74.8 80.5
KRISP 93.3 71.7 81.1
LU 86.2 81.8 84.0
(LU: F-measure after reranking is 85.2%)
competitive
40
Why Knowledge of Syntax
does not Help
 Geoquery: 7.48 words per sentence
 Short sentence
 Sentence structure can be feasibly learned from
NLs paired with MRs
 Gain from knowledge of syntax vs.
flexibility loss
41
Limitation of Using Prior
Knowledge of Syntax
What state is the smallest
(traditional parse tree with internal nodes N1 and N2)
answer(smallest(state(all)))
Traditional syntactic analysis
42
Limitation of Using Prior
Knowledge of Syntax
What state is the smallest
answer(smallest(state(all)))
Traditional syntactic analysis vs. semantic grammar (two parse trees with internal nodes N1, N2)
Semantic grammar: isomorphic syntactic structure with the MR
Better generalization
43
Why Prior Knowledge of
Syntax does not Help
 Geoquery: 7.48 words per sentence
 Short sentence
 Sentence structure can be feasibly learned from
NLs paired with MRs
 Gain from knowledge of syntax vs.
flexibility loss
 LU vs. WASP and KRISP
 Decomposed model for semantic grammar
44
Detailed CLang Results on Sentence Length
(bar chart of F-measure by sentence-length bin; bins: 0-10 words (7% of sentences), 11-20 (33%), 21-30 (46%), 31-40 (13%))
45
SCISSOR Summary
 Integrated syntactic-semantic parsing
approach
 Learns accurate semantic interpretations
by utilizing the SAPT annotations
 knowledge of syntax improves performance
on long sentences
46
Roadmap
 SCISSOR
 SYNSEM
 Future Work
 Conclusions
47
SYNSEM Motivation
 SCISSOR requires extra SAPT annotation for
training
 Must learn both syntax and semantics from
same limited training corpus
 High performance syntactic parsers are
available that are trained on existing large
corpora (Collins, 1997; Charniak & Johnson,
2005)
48
SCISSOR Requires SAPT Annotation
our player 2 has the ball
(S-P_BOWNER (NP-P_PLAYER (PRP$-P_OUR our) (NN-P_PLAYER player) (CD-P_UNUM 2)) (VP-P_BOWNER (VB-P_BOWNER has) (NP-NULL (DT-NULL the) (NN-NULL ball))))
Time consuming. Automate it!
49
Part I: Syntactic Parse
our player 2 has the ball
(S (NP (PRP$ our) (NN player) (CD 2)) (VP (VB has) (NP (DT the) (NN ball))))
Use a statistical syntactic parser
50
Part II: Word Meanings
our player 2 has the ball
Word meanings: our → P_OUR, player → P_PLAYER, 2 → P_UNUM, has → P_BOWNER, the → NULL, ball → NULL
Use a word alignment model (Wong and Mooney, 2006)
51
Learning a Semantic Lexicon
 IBM Model 5 word alignment (GIZA++)
 top 5 word/predicate alignments for each training
example
 Assume each word alignment and syntactic parse
defines a possible SAPT for composing the correct
MR
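A minimal sketch of turning the retained word/predicate alignments into a semantic lexicon. Running IBM Model 5 itself is not shown, and the alignment format (a list of (word, predicate) pairs per kept alignment) is an assumption about how GIZA++ output would be post-processed.

```python
from collections import defaultdict

# Hedged sketch: collect candidate word -> predicate pairings from the
# top-5 alignments kept for each training example.
def build_lexicon(alignments):
    """alignments: list of [(word, predicate), ...], one list per kept alignment."""
    lexicon = defaultdict(set)
    for alignment in alignments:
        for word, predicate in alignment:
            if predicate is not None:          # unaligned words are labeled NULL later
                lexicon[word].add(predicate)
    return dict(lexicon)

example = [[("our", "P_OUR"), ("player", "P_PLAYER"), ("2", "P_UNUM"),
            ("has", "P_BOWNER"), ("the", None), ("ball", None)]]
print(build_lexicon(example))
```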
52
Introducing λ variables in semantic labels for missing arguments (a1: the first argument)
our player 2 has the ball
Leaf labels: our → P_OUR, player → λa1λa2P_PLAYER, 2 → P_UNUM, has → λa1P_BOWNER, the → NULL, ball → NULL
(the internal S, NP, VP, NP nodes are not yet semantically labeled)
53
Part III: Internal Semantic Labels
our player 2 has the ball
Leaf labels: our → P_OUR, player → λa1λa2P_PLAYER, 2 → P_UNUM, has → λa1P_BOWNER, the → NULL, ball → NULL
MR predicate-argument structure: P_BOWNER(P_PLAYER(P_OUR, P_UNUM))
How to choose the dominant predicates for the internal nodes?
54
Learning Semantic Composition Rules
player 2
The node over "player 2" has children λa1λa2P_PLAYER and P_UNUM; what is its label?
Rule: λa1λa2P_PLAYER + P_UNUM → {λa1P_PLAYER, a2=c2}
(c2: child 2)
55
Learning Semantic Composition Rules
our player 2 has the ball
The NP over "player 2" gets λa1P_PLAYER; the NP over "our player 2" is still undetermined (?)
Rule used: λa1λa2P_PLAYER + P_UNUM → {λa1P_PLAYER, a2=c2}
56
Learning Semantic Composition Rules
our player 2 has the ball
The NP over "our player 2" has children P_OUR and λa1P_PLAYER; it gets P_PLAYER
Rule: P_OUR + λa1P_PLAYER → {P_PLAYER, a1=c1}
57
Learning Semantic Composition Rules
our player 2 has the ball
The NP over "the ball" gets NULL; the VP gets λa1P_BOWNER; the S node is still undetermined (?)
58
Learning Semantic Composition Rules
our player 2 has the ball
The S node has children P_PLAYER and λa1P_BOWNER; it gets P_BOWNER, completing P_BOWNER(P_PLAYER(P_OUR, P_UNUM))
Rule: P_PLAYER + λa1P_BOWNER → {P_BOWNER, a1=c1}
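A minimal sketch of what applying these extracted rules might look like; the rule and node representations are assumptions for illustration (a1/a2 are argument slots of the head predicate, c1/c2 the children that fill them).

```python
# Hedged sketch of applying learned composition rules such as
#   λa1λa2P_PLAYER + P_UNUM -> {λa1P_PLAYER, a2=c2}
# A node meaning is (predicate, [arg1, arg2, ...]) with None for open slots.
def apply_rule(children, head_index, bindings):
    """bindings maps an argument position of the head child's predicate to the
    index of the child that fills it, e.g. {1: 1} for a2=c2 (0-based here)."""
    predicate, args = children[head_index]
    args = list(args)
    for arg_pos, child_idx in bindings.items():
        args[arg_pos] = children[child_idx]
    return (predicate, args)

player_open = ("P_PLAYER", [None, None])             # λa1λa2P_PLAYER
unum        = ("P_UNUM", [])
our         = ("P_OUR", [])
step1 = apply_rule([player_open, unum], 0, {1: 1})   # a2 = c2 -> λa1P_PLAYER
step2 = apply_rule([our, step1], 1, {0: 0})          # a1 = c1 -> P_PLAYER
print(step2)
# -> ('P_PLAYER', [('P_OUR', []), ('P_UNUM', [])])
```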
59
Ensuring Meaning Composition
What state is the smallest
(traditional parse tree with internal nodes N1, N2)
answer(smallest(state(all)))
Non-isomorphism
60
Ensuring Meaning Composition
 Non-isomorphism between NL parse and MR
parse
 Various linguistic phenomena
 Machine translation between NL and MRL
 Use automated syntactic parses
 Introduce macro-predicates that combine
multiple predicates.
 Ensure that MR can be composed using a
syntactic parse and word alignment
61
SYNSEM Overview
Before training & testing: each training/test sentence S is run through a syntactic parser to obtain its syntactic parse tree T
Training (inputs: the unambiguous CFG of the MRL; training set {(S, T, MR)}):
semantic knowledge acquisition → semantic lexicon & composition rules
parameter estimation → probabilistic parsing model
Testing: input sentence parse T → semantic parsing → output MR
62
SYNSEM Overview
Before training & testing: each training/test sentence S is run through a syntactic parser to obtain its syntactic parse tree T
Training (inputs: the unambiguous CFG of the MRL; training set {(S, T, MR)}):
semantic knowledge acquisition → semantic lexicon & composition rules
parameter estimation → probabilistic parsing model
Testing: input sentence S → semantic parsing → output MR
63
Parameter Estimation
• Apply the learned semantic knowledge to all training
examples to generate possible SAPTs
• Use a standard maximum-entropy model similar to that
of Zettlemoyer & Collins (2005), and Wong & Mooney
(2006)
• Training finds parameters that (approximately)
maximize the conditional log-likelihood of
the training set, including syntactic parses
• Incomplete data since SAPTs are hidden variables
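Schematically, the hidden-variable objective for one example can be sketched as below; derivations_for, features, and yields_correct_mr are placeholders for the real SAPT enumeration, feature map, and correctness check, so this is only an illustration of the form of the objective.

```python
import math

# Hedged sketch of the per-example conditional log-likelihood when the SAPT
# (derivation) is a hidden variable.
def log_likelihood(example, weights, derivations_for, features, yields_correct_mr):
    derivs = list(derivations_for(example))
    def score(d):
        return sum(weights.get(f, 0.0) * v for f, v in features(example, d).items())
    def logsumexp(xs):
        m = max(xs)
        return m + math.log(sum(math.exp(x - m) for x in xs))
    all_scores = [score(d) for d in derivs]
    good_scores = [s for d, s in zip(derivs, all_scores) if yields_correct_mr(example, d)]
    # log p(correct MR | sentence, syntactic parse) = logsumexp(good) - logsumexp(all)
    return logsumexp(good_scores) - logsumexp(all_scores)
```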
64
Features
 Lexical features:
 Unigram features: # that a word is assigned a
predicate
 Bigram features: # that a word is assigned a
predicate given its previous/subsequent word.
 Rule features: # a composition rule applied in
a derivation
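A minimal sketch of counting these features over one derivation, assuming the derivation is available as the per-word predicate assignments plus the list of composition rules applied; the exact feature encoding is an assumption.

```python
from collections import Counter

# Hedged sketch of the unigram, bigram, and rule features for one derivation.
def extract_features(words, word_predicates, rules_applied):
    feats = Counter()
    for i, (w, p) in enumerate(zip(words, word_predicates)):
        feats[("unigram", w, p)] += 1
        prev = words[i - 1] if i > 0 else "<s>"
        feats[("bigram_prev", prev, w, p)] += 1
        nxt = words[i + 1] if i + 1 < len(words) else "</s>"
        feats[("bigram_next", nxt, w, p)] += 1
    for rule in rules_applied:
        feats[("rule", rule)] += 1
    return feats
```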
65
Handling Logical Forms
What are the rivers in Texas?
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
λv1P_ANSWER(x1)
(λv1P_RIVER(x1), λv1λv2P_LOC(x1,x2), λv1P_EQUAL(x2))
Handle shared logical variables
Use Lambda Calculus (v: variable)
66
Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
What are the rivers in Texas?
λv1P_ANSWER(x1)
(λv1P_RIVER(x1) λv1 λv2P_LOC(x1,x2) λv1P_EQUAL(x2))
Handle shared logical variables
Use Lambda Calculus (v: variable)
67
Prolog Example
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
What are the rivers in Texas?
λv1P_ANSWER(x1)
(λv1P_RIVER(x1) λv1λv2P_LOC(x1,x2) λv1P_EQUAL(x2))
Handle shared logical variables
Use Lambda Calculus (v: variable)
68
Prolog Example
What are the rivers in Texas
(SBARQ (WHNP What) (SQ (VBP are) (NP (NP the rivers) (PP (IN in) (NP Texas)))))
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Start from a syntactic parse
69
Prolog Example
What are the rivers in Texas
Word predicates: What → λv1λa1P_ANSWER, are/the → NULL, rivers → λv1P_RIVER, in → λv1λv2P_LOC, Texas → λv1P_EQUAL
Add predicates to words
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
70
Prolog Example
What are the rivers in Texas
Word predicates as above; the node over "in Texas" gets λv1P_LOC
answer(x1, (river(x1), loc(x1,x2), equal(x2,stateid(texas))))
Learn a rule with variable unification:
λv1λv2P_LOC(x1,x2) + λv1P_EQUAL(x2) → λv1P_LOC
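A minimal sketch of the variable unification behind this rule, treating a meaning as a set of λ-bound variables plus conjuncts and renaming the argument's variable to the head's shared variable; the representation is an assumption for illustration, not the thesis machinery.

```python
# Hedged sketch of the unification rule
#   λv1λv2 P_LOC(x1,x2) + λv1 P_EQUAL(x2) -> λv1 P_LOC
# A meaning is (free_vars, conjuncts); combining unifies the shared variable.
def combine(head, arg, head_slot):
    """Unify the argument's first λ-variable with the head's variable named
    head_slot; the result keeps only the head's remaining free variables."""
    head_vars, head_conjs = head
    arg_vars, arg_conjs = arg
    renamed = [c.replace(arg_vars[0], head_slot) for c in arg_conjs]
    remaining = [v for v in head_vars if v != head_slot]
    return (remaining, head_conjs + renamed)

loc   = (["x1", "x2"], ["loc(x1,x2)"])            # λv1λv2 P_LOC(x1,x2)
equal = (["y1"], ["equal(y1,stateid(texas))"])    # λv1 P_EQUAL(x2)
print(combine(loc, equal, "x2"))
# -> (['x1'], ['loc(x1,x2)', 'equal(x2,stateid(texas))'])
```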
71
Experimental Results
 CLang
 Geoquery (Prolog)
72
Syntactic Parsers (Bikel,2004)
 WSJ only
 CLang(SYN0): F-measure=82.15%
 Geoquery(SYN0) : F-measure=76.44%
 WSJ + in-domain sentences
 CLang(SYN20): 20 sentences, F-measure=88.21%
 Geoquery(SYN40): 40 sentences, F-measure=91.46%
 Gold-standard syntactic parses (GOLDSYN)
73
Questions
 Q1. Can SYNSEM produce accurate semantic
interpretations?
 Q2. Can more accurate Treebank syntactic parsers
produce more accurate semantic parsers?
 Q3. Does it also improve on long sentences?
 Q4. Does it improve on limited training data due
to the prior knowledge from large treebanks?
 Q5. Can it handle syntactic errors?
74
Results on CLang
            Precision  Recall  F-measure
GOLDSYN     84.7       74.0    79.0        (SYNSEM)
SYN20       85.4       70.0    76.9        (SYNSEM)
SYN0        87.0       67.0    75.7        (SYNSEM)
SCISSOR     89.5       73.7    80.8        (SAPTs)
WASP        88.9       61.9    73.0
KRISP       85.2       61.9    71.7
LU          82.4       57.7    67.8
(LU: F-measure after reranking is 74.4%)
GOLDSYN > SYN20 > SYN0
75
Questions
 Q1. Can SynSem produce accurate semantic
interpretations? [yes]
 Q2. Can more accurate Treebank syntactic parsers
produce more accurate semantic parsers? [yes]
 Q3. Does it also improve on long sentences?
76
Detailed CLang Results on Sentence Length
(bar chart of F-measure by sentence-length bin; bins: 0-10 words (7% of sentences), 11-20 (33%), 21-30 (46%), 31-40 (13%))
Prior knowledge + flexibility + syntactic error = ?
77
Questions
 Q1. Can SynSem produce accurate semantic
interpretations? [yes]
 Q2. Can more accurate Treebank syntactic parsers
produce more accurate semantic parsers? [yes]
 Q3. Does it also improve on long sentences? [yes]
 Q4. Does it improve on limited training data due
to the prior knowledge from large treebanks?
78
Results on CLang
(training size = 40)
            Precision  Recall  F-measure
GOLDSYN     61.1       35.7    45.1        (SYNSEM)
SYN20       57.8       31.0    40.4        (SYNSEM)
SYN0        53.5       22.7    31.9        (SYNSEM)
SCISSOR     85.0       23.0    36.2        (SAPTs)
WASP        88.0       14.4    24.7
KRISP       68.35      20.0    31.0
The quality of the syntactic parser is critically important!
79
Questions
 Q1. Can SynSem produce accurate semantic
interpretations? [yes]
 Q2. Can more accurate Treebank syntactic parsers
produce more accurate semantic parsers? [yes]
 Q3. Does it also improve on long sentences? [yes]
 Q4. Does it improve on limited training data due
to the prior knowledge from large treebanks? [yes]
 Q5. Can it handle syntactic errors?
80
Handling Syntactic Errors
 Training ensures meaning composition from
syntactic parses with errors
 For test NLs that generate correct MRs, measure
the F-measures of their syntactic parses
 SYN0: 85.5%
 SYN20: 91.2%
If DR2C7 is true
then players 2 , 3 , 7 and 8 should pass to player 4
81
Questions
 Q1. Can SynSem produce accurate semantic
interpretations? [yes]
 Q2. Can more accurate Treebank syntactic parsers
produce more accurate semantic parsers? [yes]
 Q3. Does it also improve on long sentences? [yes]
 Q4. Does it improve on limited training data due
to the prior knowledge of large treebanks? [yes]
 Q5. Is it robust to syntactic errors? [yes]
82
Results on Geoquery (Prolog)
            Precision  Recall  F-measure
GOLDSYN     91.9       88.2    90.0        (SYNSEM)
SYN40       90.2       86.9    88.5        (SYNSEM)
SYN0        81.8       79.0    80.4        (SYNSEM)
COCKTAIL    89.9       79.4    84.3
λ-WASP      92.0       86.6    89.2
Z&C         95.5       83.2    88.9
SYN0 does not perform well
All other recent systems perform competitively
83
SYNSEM Summary
 Exploits an existing syntactic parser to drive
the meaning composition process
 Prior knowledge of syntax improves
performance on long sentences
 Prior knowledge of syntax improves
performance on limited training data
 Handles syntactic errors
84
Discriminative Reranking for
Semantic Parsing
 Adapt global features used for reranking
syntactic parsing for semantic parsing
 Improvement on CLang
 No improvement on Geoquery, where sentences
are short and global features are less likely
to help
85
Roadmap
 SCISSOR
 SYNSEM
 Future Work
 Conclusions
86
Future Work
 Improve SCISSOR
 Discriminative SCISSOR (Finkel, et al., 2008)
 Handling logical forms
 SCISSOR without extra annotation (Klein and
Manning, 2002, 2004)
 Improve SYNSEM
 Utilizing syntactic parsers with improved accuracy
and in other syntactic formalisms
87
Future Work
 Utilizing wide-coverage semantic
representations (Curran et al., 2007)
 Better generalizations for syntactic variations
 Utilizing semantic role labeling (Gildea and
Palmer, 2002)
 Provides a layer of correlated semantic
information
88
Roadmap
 SCISSOR
 SYNSEM
 Future Work
 Conclusions
89
Conclusions
 SCISSOR: a novel integrated syntactic-semantic
parser.
 SYNSEM: exploits an existing syntactic parser to
produce disambiguated parse trees that drive the
meaning composition process.
 Both produce accurate semantic interpretations.
 Using the knowledge of syntax improves
performance on long sentences.
 SYNSEM also improves performance on limited
training data.
90
Thank you!
 Questions?