SPADE: Evaluation Dataset for
Monolingual Phrase Alignment
Yuki Arase*† and Junichi Tsujii†◊
*Osaka University, Japan
†Artificial Intelligence Research Center (AIRC), AIST, Japan
◊NaCTeM, School of Computer Science, University of Manchester, UK
Created and released a dataset annotating
phrase alignments on parse trees
of paraphrases
[Figure: parse trees of the paraphrase pair "her life is excellent and wonderful…" and "she also has a very splendid… life" (node labels include COOD, ADJP, VP, NP, S), with phrase alignments marked by Annotator #1, Annotator #2, and Annotator #3]
15,721 alignments
https://catalog.ldc.upenn.edu/LDC2018T09
Phrasal (N-gram) Paraphrases
♠Phrasal paraphrases of N-grams have been useful
for NLP applications
• Semantic parsing (Berant and Liang, 2014)
• Automatic QA (Dong et al., 2017)
♠PPDB (Ganitkevitch et al., 2013) is widely used as
an abundant resource
Are N-grams Sufficient?
♠Syntactic structures are important in modeling
phrases/sentences
• Semantic relatedness (Tai et al., 2015)
• Phrase embedding (Wieting et al., 2015)
♠Part of PPDB provides phrasal paraphrases under a
synchronous context-free grammar (SCFG)
♠SCFG captures only a fraction of paraphrasing
phenomena (Weese et al., 2014)
• Only 9.1% of paraphrases were reachable using SCFG
Phrase Alignment on Paraphrases
♠Phrasal paraphrases under a linguistically
motivated grammar would deliver richer
syntactic information
♠For systematic research,
• SPADE annotates phrase alignments under
the head-driven phrase structure grammar (Pollard
and Sag, 1994)
• Evaluation metrics are proposed for benchmarking
Annotation Target
Paraphrases extracted from MT evaluation corpora
♠Paraphrases by linguistic operations
♠Paraphrases with simple summarization
Relying on team spirit, expedition members defeated difficulties.
Members of the scientific team overcame challenges living on Mars
through teamwork.
Approach
1. Gold-tree annotation by a linguistic expert
2. Phrase alignment annotation
• 3 annotators independently identified phrase
alignments using a provided annotation tool
• Annotators referred to the tree structures when helpful
Gold-Tree Annotation
[Figure: gold parse trees of "her life is excellent and wonderful…" and "she also has a very splendid… life" (node labels include COOD, ADJP, VP, NP, S)]
Phrase Alignment Annotation
[Figure: phrase alignments drawn between the parse trees of "her life is excellent and wonderful…" and "she also has a very splendid… life" (node labels include COOD, ADJP, VP, NP, S)]
SPADE Statistics
                              Dev      Test
# of sentence pairs            50       151
# of tokens                 2,494     7,276
# of types                    736     1,573
# of phrases (w/o tokens)   5,201    15,075
# of alignments (∪)         3,932    11,789
# of alignments (∩)         2,518     7,134
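As a quick consistency check on the table, the sketch below sums the union (∪) alignment counts of the Dev and Test splits, which gives the 15,721 alignments quoted at the start of the deck. It also derives the share of union alignments that appear in the intersection (∩). That share is not a statistic reported on the slides; it is computed here only for illustration, and the names `stats`, `total_union`, and `overlap_ratio` are hypothetical.

```python
# Counts copied from the SPADE Statistics table.
stats = {
    "dev": {"union": 3932, "intersection": 2518},
    "test": {"union": 11789, "intersection": 7134},
}


def total_union(all_stats):
    """Sum the union (∪) alignment counts over all splits."""
    return sum(split["union"] for split in all_stats.values())


def overlap_ratio(split):
    """Share of union alignments that are also in the intersection.

    Illustrative derived figure, not a number reported on the slides.
    """
    return split["intersection"] / split["union"]


total = total_union(stats)
ratios = {name: overlap_ratio(split) for name, split in stats.items()}

print(f"total union alignments: {total:,}")
for name, ratio in ratios.items():
    print(f"{name}: intersection / union = {ratio:.1%}")
```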
Evaluation Metric
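♠ALIR (ALIgnment Recall) evaluates how well the gold alignments
($\mathbb{G}$ and $\mathbb{G}'$) can be replicated by automatic alignment ($\mathbb{H}_a$)
\[ \mathrm{ALIR} = \frac{|\{h \mid h \in \mathbb{H}_a \land h \in \mathbb{G} \cap \mathbb{G}'\}|}{|\mathbb{G} \cap \mathbb{G}'|} \]
♠ALIP (ALIgnment Precision) evaluates how well the automatic
alignments overlap with alignments that at least one
annotator aligned
\[ \mathrm{ALIP} = \frac{|\{h \mid h \in \mathbb{H}_a \land h \in \mathbb{G} \cup \mathbb{G}'\}|}{|\mathbb{H}_a|} \]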
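The two metrics reduce to set operations, which the minimal sketch below illustrates. It assumes each alignment is a hashable pair of phrase identifiers and that the two gold annotations and the automatic output are plain Python sets. The function names `alir` and `alip`, the toy identifiers, and the choice to return 0.0 for an empty denominator are assumptions made for illustration, not part of the released dataset or its tooling.

```python
def alir(h_a, g, g_prime):
    """ALIR: share of alignments agreed on by both annotators (G ∩ G')
    that the automatic alignment H_a reproduces."""
    gold_agreed = g & g_prime
    if not gold_agreed:
        return 0.0  # assumption: define the score as 0 when G ∩ G' is empty
    return len(h_a & gold_agreed) / len(gold_agreed)


def alip(h_a, g, g_prime):
    """ALIP: share of automatic alignments H_a that at least one
    annotator also aligned (G ∪ G')."""
    gold_any = g | g_prime
    if not h_a:
        return 0.0  # assumption: define the score as 0 when H_a is empty
    return len(h_a & gold_any) / len(h_a)


# Toy data: each alignment is a (source phrase id, target phrase id) pair.
g = {("s1", "t1"), ("s2", "t2"), ("s3", "t3")}
g_prime = {("s1", "t1"), ("s2", "t2"), ("s4", "t4")}
h_a = {("s1", "t1"), ("s3", "t3"), ("s5", "t5")}

recall = alir(h_a, g, g_prime)     # 1 of the 2 agreed alignments is found
precision = alip(h_a, g, g_prime)  # 2 of the 3 automatic alignments are gold
print(f"ALIR = {recall:.2%}, ALIP = {precision:.2%}")
```

Recall is measured against the intersection, so it only counts alignments both annotators agreed on, while precision is measured against the union, so an automatic alignment is credited if either annotator produced it.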
Benchmark
[Figure: bar chart of ALIR and ALIP (vertical axis from 70 to 95) comparing Human and (Arase and Tsujii, 2017); plotted values: 90.65, 88.21, 83.64, 78.91]
Y. Arase and J. Tsujii. 2017.
Monolingual Phrase Alignment
on Parse Forests, in Proc. of
EMNLP, pp. 1-11.
Future Directions
Expand the dataset
1. Size
• Working on annotating 5k more paraphrase pairs
2. Linguistic phenomena in paraphrases
• SPADE used reference translations as paraphrases
• It covers relatively simple paraphrases because of constraints imposed by
the source sentences
Future Directions (Cont’d)
2. Linguistic phenomena in paraphrases
• Annotate paraphrases from other datasets
• Microsoft Research Paraphrase Corpus (Dolan et al., 2004)
• Twitter URL corpus (Lan et al., 2017)
• Cover diverse linguistic phenomena of
paraphrases in the wild
Ex) Paraphrases involving inferences/entailments
Scientists overcame challenges living on Mars.
Scientists overcame water and oxygen scarcity on the red planet.
