A CROSS-LINGUAL ANNOTATION PROJECTION
APPROACH FOR RELATION DETECTION

   The 23rd International Conference on Computational Linguistics (COLING 2010)
                              August 24th, 2010, Beijing

                       Seokhwan Kim (POSTECH)
                     Minwoo Jeong (Saarland University)
                         Jonghoon Lee (POSTECH)
                       Gary Geunbae Lee (POSTECH)
Contents
• Introduction
• Methods
    Cross-lingual Annotation Projection for Relation Detection
    Noise Reduction Strategies
• Evaluation
• Conclusion




What’s Relation Detection?
• Relation Extraction
    To identify semantic relations between a pair of entities
    ACE RDC
       • Relation Detection (RD)
       • Relation Categorization (RC)



    Example: Owner-Of(Jan Mullins, Computer Recycler Incorporated)
         Jan Mullins, owner of Computer Recycler Incorporated said that …

What’s the Problem?
• Many supervised machine learning approaches have been
  successfully applied to the RDC task
    (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta
     and Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al.,
     2006)
• Datasets for relation detection
    Labeled corpora for supervised learning
    Available for only a few languages
       • English, Chinese, Arabic
    No resources for other languages
       • Korean


Cross-lingual Annotation Projection
• Goal
    To learn a relation detector without significant manual annotation effort
• Method
    To leverage parallel corpora to project relation annotations from
     the source language LS onto the target language LT

Cross-lingual Annotation Projection
• Previous Work
    Part-of-speech tagging (Yarowsky and Ngai, 2001)
    Named-entity tagging (Yarowsky et al., 2001)
    Verb classification (Merlo et al., 2002)
    Dependency parsing (Hwa et al., 2005)
    Mention detection (Zitouni and Florian, 2008)
    Semantic role labeling (Pado and Lapata, 2009)
• To the best of our knowledge, no previous work has applied annotation
  projection to the RDC task

Overall Architecture
• Parallel Corpus
    Sentences in LS and sentences in LT
• Annotation (LS)
    Sentences in LS → Preprocessing (POS Tagging, Parsing) → NER →
     Relation Detection → Annotated Sentences in LS
• Projection (LT)
    Sentences in LT → Preprocessing (POS Tagging, Parsing) → Word Alignment →
     Projection → Annotated Sentences in LT

How to Reduce Noise?
• Error Accumulation
    Numerous errors can be generated and accumulated throughout the
     annotation projection procedure
      • Preprocessing for LS and LT
      • NER for LS
      • Relation Detection for LS
      • Word Alignment between LS and LT

• Noise Reduction
    A key factor in improving the performance of annotation projection

How to Reduce Noise?
• Noise Reduction Strategies (1)
    Alignment Filtering
        • Based on Heuristics
                 A projection for an entity mention should be based on alignments between
                  contiguous word sequences
                 Both an entity mention in LS and its projection in LT should include at
                  least one base noun phrase
                 The projected instance in LT should satisfy the clausal agreement with the
                  original instance in LS

   [Figure: accepted and rejected alignment examples for each heuristic]

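The first two heuristics can be sketched as simple checks over a word alignment. This is an illustrative reading of the slide, not the authors' code: it assumes "contiguous" means the aligned target tokens form an unbroken run, and it approximates "includes a base noun phrase" by the presence of at least one noun POS tag (tags starting with "N"). The clausal agreement check is not covered. All function names are hypothetical.

```python
from typing import Dict, List, Set, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def is_contiguous_projection(span: Span, alignment: Dict[int, Set[int]]) -> bool:
    """Accept a mention only if its aligned target tokens form one unbroken run."""
    targets = set()
    for i in range(span[0], span[1] + 1):
        targets |= alignment.get(i, set())
    if not targets:
        return False
    return max(targets) - min(targets) + 1 == len(targets)


def contains_noun(span: Span, pos_tags: List[str]) -> bool:
    """Simplified stand-in for the base noun phrase check (assumes noun tags start with 'N')."""
    return any(tag.startswith("N") for tag in pos_tags[span[0]:span[1] + 1])


print(is_contiguous_projection((0, 1), {0: {3}, 1: {4}}))  # aligned to 3,4
print(is_contiguous_projection((0, 1), {0: {3}, 1: {6}}))  # aligned to 3,6 (gap)
```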
How to Reduce Noise?
• Noise Reduction Strategies (2)
    Alignment Correction
       • Based on a bilingual dictionary for entity mentions
            Each entry of the dictionary pairs an entity mention in LS with its
             translation or transliteration in LT

   FOR each entity ES in LS
      RETRIEVE counterpart ET from DICT(E-T)
      SEEK ET in the sentence ST in LT
      IF matched THEN
          MAKE new alignment ES-ET
      ENDIF
   ENDFOR

   [Figure: source words A B C D E F G aligned to target words; the dictionary
    entry BCD - βγ produces the corrected alignment]

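The pseudocode above can be sketched in Python as follows. This is a sketch under stated assumptions: the dictionary maps a source mention string to a list of target tokens, matching is exact token-sequence search, and a match replaces the alignment of every source token in the mention with the whole matched target span. The names `find_sublist` and `correct_alignment` and the toy data are hypothetical.

```python
from typing import Dict, List, Optional, Set, Tuple

Span = Tuple[int, int]  # inclusive (start, end) token indices


def find_sublist(haystack: List[str], needle: List[str]) -> Optional[int]:
    """Return the start index of the first occurrence of needle in haystack."""
    if not needle:
        return None
    n = len(needle)
    for start in range(len(haystack) - n + 1):
        if haystack[start:start + n] == needle:
            return start
    return None


def correct_alignment(
    source_entities: List[Tuple[Span, str]],
    target_tokens: List[str],
    dictionary: Dict[str, List[str]],
    alignment: Dict[int, Set[int]],
) -> Dict[int, Set[int]]:
    """Overwrite alignments of entity mentions whose dictionary counterpart occurs in ST."""
    corrected = {i: set(js) for i, js in alignment.items()}
    for (start, end), mention in source_entities:
        counterpart = dictionary.get(mention)       # RETRIEVE ET from the dictionary
        if counterpart is None:
            continue
        pos = find_sublist(target_tokens, counterpart)  # SEEK ET in ST
        if pos is None:
            continue
        matched = set(range(pos, pos + len(counterpart)))
        for i in range(start, end + 1):             # MAKE new alignment ES-ET
            corrected[i] = set(matched)
    return corrected


# Toy data modelled on the slide's figure: mention "B C D" should align to "β γ".
target = ["α", "β", "γ", "δ", "ε"]
noisy = {0: {0}, 1: {0}, 2: {2}, 3: {4}}
fixed = correct_alignment([((1, 3), "B C D")], target, {"B C D": ["β", "γ"]}, noisy)
print(fixed)
```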
How to Reduce Noise?
• Noise Reduction Strategies (3)
    Assessment-based Instance Selection
      • Based on the reliability of projected instances in LT
           Evaluated by the confidence score of monolingual relation detection for
            the original counterpart instance in LS
           Only instances with scores larger than the threshold value θ are accepted
      • Example with θ = 0.7
           conf = 0.9: accepted
           conf = 0.6: rejected

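The selection rule amounts to a threshold filter. A minimal sketch, assuming each projected instance carries the confidence score of its source-side counterpart under a hypothetical `conf` key:

```python
from typing import Dict, List


def select_instances(instances: List[Dict], theta: float) -> List[Dict]:
    """Keep only projected instances whose source-side confidence exceeds theta."""
    return [inst for inst in instances if inst["conf"] > theta]


# The two cases from the slide, with theta = 0.7.
projected = [{"id": "a", "conf": 0.9}, {"id": "b", "conf": 0.6}]
accepted = select_instances(projected, theta=0.7)
print([inst["id"] for inst in accepted])  # ['a']
```

The comparison is strict, following the slide's wording ("larger scores than threshold value θ").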
Experimental Setup
• Dataset
    English-Korean parallel corpus
       • 454,315 English-Korean sentence pairs
       • Aligned by GIZA++
    Korean RDC corpus
       • Annotated following the LDC guidelines for the ACE RDC corpus
       • 100 news documents in Korean
             835 sentences
             3,331 entity mentions
             8,354 relation instances

Experimental Setup
• Preprocessors
    English
      • Stanford Parser (Klein and Manning, 2003)
      • Stanford Named Entity Recognizer (Finkel et al., 2005)
    Korean
      • Korean POS Tagger (Lee et al., 2002)
      • MST Parser (McDonald et al., 2006)

Experimental Setup
• Relation Detection for English Sentences
    Tree kernel-based SVM classifier
       • Training Dataset
            ACE 2003 corpus
                 • 674 documents
                 • 9,683 relation instances
       • Model
            Shortest path enclosed subtrees kernel (Zhang et al., 2006)
       • Implementation
            SVM-Light (Joachims, 1998)
            Tree Kernel Tools (Moschitti, 2006)




Experimental Setup
• Relation Detection for Korean Sentences
    Tree kernel-based SVM classifier
       • Training Dataset
            Half of the Korean RDC corpus (baseline)
            Projected instances
       • Model
            Shortest path dependency kernel (Bunescu and Mooney, 2005)
       • Implementation
            SVM-Light (Joachims, 1998)
            Tree Kernel Tools (Moschitti, 2006)




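For orientation, the shortest path dependency kernel of Bunescu and Mooney (2005) compares two dependency paths position by position: it is zero when the paths differ in length, and otherwise the product of the number of features shared at each position. The sketch below illustrates that definition only; the feature sets and the English example paths are hypothetical and say nothing about the features actually used for Korean in this work.

```python
from math import prod
from typing import FrozenSet, List

Path = List[FrozenSet[str]]  # one feature set per position on the path


def shortest_path_kernel(path_x: Path, path_y: Path) -> int:
    """Product of per-position common-feature counts; 0 for paths of unequal length."""
    if len(path_x) != len(path_y):
        return 0
    return prod(len(fx & fy) for fx, fy in zip(path_x, path_y))


def fs(*features: str) -> FrozenSet[str]:
    """Small helper to build a feature set."""
    return frozenset(features)


# Hypothetical paths: word, POS and entity-type features alternate with arc directions.
x = [fs("his", "PRP", "PERSON"), fs("->"), fs("actions", "NNS", "Noun"),
     fs("<-"), fs("in", "IN"), fs("<-"), fs("Brcko", "NNP", "Noun", "LOCATION")]
y = [fs("his", "PRP", "PERSON"), fs("->"), fs("arrival", "NN", "Noun"),
     fs("<-"), fs("in", "IN"), fs("<-"), fs("Beijing", "NNP", "Noun", "LOCATION")]
print(shortest_path_kernel(x, y))  # 3 * 1 * 1 * 1 * 2 * 1 * 3 = 18
```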
Experimental Setup
• Experimental Sets
    Combinations of noise reduction strategies
      • (S1: Heuristic, S2: Dictionary, S3: Assessment)
      1. Baseline
             Trained with only half of the Korean RDC corpus
      2. Baseline + Projections (no noise reduction)
      3. Baseline + Projections (S1)
      4. Baseline + Projections (S1 + S2)
      5. Baseline + Projections (S3)
      6. Baseline + Projections (S1 + S3)
      7. Baseline + Projections (S1 + S2 + S3)



Experimental Setup
• Evaluation
    On the second half of the Korean RDC corpus
       • The first half is used to train the baseline
    On true entity mentions with true coreference chains
    Evaluated by Precision/Recall/F-measure

Experimental Results

                                                   no assessment        with assessment
Model                                              P      R      F      P      R      F
baseline                                           60.5   20.4   30.5   -      -      -
baseline + projection                              22.5   6.5    10.0   29.1   13.2   18.2
baseline + projection (heuristics)                 51.4   15.5   23.8   56.1   22.9   32.5
baseline + projection (heuristics + dictionary)    55.3   19.4   28.7   59.8   26.7   36.9

                                                                      23
Non-filtered Projections Were Poor

Heuristics Were Helpful

Much Worse Than Baseline

Dictionary Was Also Helpful

Still Worse Than Baseline

Assessment Boosted Performance


Combined Strategies Achieved
  Better Performance Than Baseline


Conclusion
• Summary
    A cross-lingual annotation projection approach for relation detection
    Three strategies for noise reduction
    Projected instances from an English-Korean parallel corpus helped
     to improve relation detection performance
       • when combined with the noise reduction strategies

• Future work
    Cross-lingual annotation projection for relation categorization
    More elaborate noise reduction strategies to improve the
     projection performance for relation extraction

Q&A

More Related Content

PDF
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
PPTX
1909 paclic
PDF
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
PDF
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
PDF
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
PDF
Colloquium talk on modal sense classification using a convolutional neural ne...
PPTX
2010 PACLIC - pay attention to categories
PDF
Declarative analysis of noisy information networks
A Graph-based Cross-lingual Projection Approach for Weakly Supervised Relatio...
1909 paclic
Engineering Intelligent NLP Applications Using Deep Learning – Part 2
Ontology Integration and Interoperability (OntoIOp) – Part 1: The Distributed...
Engineering Intelligent NLP Applications Using Deep Learning – Part 1
Colloquium talk on modal sense classification using a convolutional neural ne...
2010 PACLIC - pay attention to categories
Declarative analysis of noisy information networks

What's hot (20)

PDF
Learning to understand phrases by embedding the dictionary
PPT
Pronominal Anaphora resolution
PDF
Multi modal retrieval and generation with deep distributed models
PPTX
NLP Bootcamp
PDF
Natural language processing with python and amharic syntax parse tree by dani...
PDF
dialogue act modeling for automatic tagging and recognition
PPT
Amharic WSD using WordNet
PPTX
Latest trends in NLP - Exploring BERT
PDF
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
PDF
A survey on parallel corpora alignment
PDF
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
PDF
Deep Learning, an interactive introduction for NLP-ers
PPTX
An Improved Approach to Word Sense Disambiguation
PDF
Information Retrieval with Deep Learning
PDF
Deep Learning for Natural Language Processing: Word Embeddings
PDF
Anthiil Inside workshop on NLP
PDF
Finite Wordlength Linear-Phase FIR Filter Design Using Babai's Algorithm
PDF
Deep learning for natural language embeddings
PDF
Lean Logic for Lean Times: Entailment and Contradiction Revisited
Learning to understand phrases by embedding the dictionary
Pronominal Anaphora resolution
Multi modal retrieval and generation with deep distributed models
NLP Bootcamp
Natural language processing with python and amharic syntax parse tree by dani...
dialogue act modeling for automatic tagging and recognition
Amharic WSD using WordNet
Latest trends in NLP - Exploring BERT
Tiancheng Zhao - 2017 - Learning Discourse-level Diversity for Neural Dialog...
A survey on parallel corpora alignment
A NOVEL APPROACH FOR NAMED ENTITY RECOGNITION ON HINDI LANGUAGE USING RESIDUA...
Deep Learning, an interactive introduction for NLP-ers
An Improved Approach to Word Sense Disambiguation
Information Retrieval with Deep Learning
Deep Learning for Natural Language Processing: Word Embeddings
Anthiil Inside workshop on NLP
Finite Wordlength Linear-Phase FIR Filter Design Using Babai's Algorithm
Deep learning for natural language embeddings
Lean Logic for Lean Times: Entailment and Contradiction Revisited
Ad

Viewers also liked (8)

PPTX
jiaju.com首页前端优化一期报告
PDF
Wikipedia-based Kernels for Dialogue Topic Tracking
PPTX
Дипломная Работа: Guerrilla Marketing
PDF
A Cross-lingual Annotation Projection-based Self-supervision Approach for Ope...
PPTX
Cancer al utero
PPTX
张所勇:前端开发工具推荐
PDF
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
jiaju.com首页前端优化一期报告
Wikipedia-based Kernels for Dialogue Topic Tracking
Дипломная Работа: Guerrilla Marketing
A Cross-lingual Annotation Projection-based Self-supervision Approach for Ope...
Cancer al utero
张所勇:前端开发工具推荐
EPG 정보 검색을 위한 예제 기반 자연어 대화 시스템
Ad

Similar to A Cross-Lingual Annotation Projection Approach for Relation Detection (20)

PDF
Unsupervised Extraction of False Friends from Parallel Bi-Texts Using the Web...
PDF
Thomson Reuters at NIST TAC 2008
ODP
Reference Scope Identification in Citing Sentences
PDF
Improving Machine Learning Approaches to Coreference Resolution
PDF
A Bridge Not too Far
PDF
SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity
PPTX
MEBI 591C/598 – Data and Text Mining in Biomedical Informatics
PDF
DETECTING OXYMORON IN A SINGLE STATEMENT
PDF
AbstractKR on Pargram 2006
PPTX
DH Tools Workshop #1: Text Analysis
PDF
Evaluation of subjective answers using glsa enhanced with contextual synonymy
PPTX
2023 EMNLP day_san.pptx
PPT
Introduction to Natural Language Processing
PPTX
The CLUES database: automated search for linguistic cognates
PPTX
Not just for reference: Dictionaries and corpora as language acquisition tools
PDF
Machine Learning of Natural Language
PDF
Wei Yang - 2014 - Consistent Improvement in Translation Quality of Chinese–Ja...
PDF
13. Constantin Orasan (UoW) Natural Language Processing for Translation
PDF
semeval2016
Unsupervised Extraction of False Friends from Parallel Bi-Texts Using the Web...
Thomson Reuters at NIST TAC 2008
Reference Scope Identification in Citing Sentences
Improving Machine Learning Approaches to Coreference Resolution
A Bridge Not too Far
SemEval-2012 Task 6: A Pilot on Semantic Textual Similarity
MEBI 591C/598 – Data and Text Mining in Biomedical Informatics
DETECTING OXYMORON IN A SINGLE STATEMENT
AbstractKR on Pargram 2006
DH Tools Workshop #1: Text Analysis
Evaluation of subjective answers using glsa enhanced with contextual synonymy
2023 EMNLP day_san.pptx
Introduction to Natural Language Processing
The CLUES database: automated search for linguistic cognates
Not just for reference: Dictionaries and corpora as language acquisition tools
Machine Learning of Natural Language
Wei Yang - 2014 - Consistent Improvement in Translation Quality of Chinese–Ja...
13. Constantin Orasan (UoW) Natural Language Processing for Translation
semeval2016

More from Seokhwan Kim (17)

PDF
The Eighth Dialog System Technology Challenge (DSTC8)
PDF
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
PDF
Dynamic Memory Networks for Dialogue Topic Tracking
PDF
The Fifth Dialog State Tracking Challenge (DSTC5)
PDF
Natural Language in Human-Robot Interaction
PDF
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
PDF
The Fourth Dialog State Tracking Challenge (DSTC4)
PDF
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
PDF
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
PDF
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
PDF
Sequential Labeling for Tracking Dynamic Dialog States
PDF
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
PDF
MMR-based active machine learning for Bio named entity recognition
PDF
A semi-supervised method for efficient construction of statistical spoken lan...
PDF
A spoken dialog system for electronic program guide information access
PDF
An alignment-based approach to semi-supervised relation extraction including ...
PDF
An Alignment-based Pattern Representation Model for Information Extraction
The Eighth Dialog System Technology Challenge (DSTC8)
Deep Recurrent Neural Networks with Layer-wise Multi-head Attentions for Punc...
Dynamic Memory Networks for Dialogue Topic Tracking
The Fifth Dialog State Tracking Challenge (DSTC5)
Natural Language in Human-Robot Interaction
Exploring Convolutional and Recurrent Neural Networks in Sequential Labelling...
The Fourth Dialog State Tracking Challenge (DSTC4)
Wikification of Concept Mentions within Spoken Dialogues Using Domain Constra...
Towards Improving Dialogue Topic Tracking Performances with Wikification of C...
A Composite Kernel Approach for Dialog Topic Tracking with Structured Domain ...
Sequential Labeling for Tracking Dynamic Dialog States
A Graph-based Cross-lingual Projection Approach for Spoken Language Understan...
MMR-based active machine learning for Bio named entity recognition
A semi-supervised method for efficient construction of statistical spoken lan...
A spoken dialog system for electronic program guide information access
An alignment-based approach to semi-supervised relation extraction including ...
An Alignment-based Pattern Representation Model for Information Extraction

Recently uploaded (20)

PPTX
AI IN MARKETING- PRESENTED BY ANWAR KABIR 1st June 2025.pptx
PDF
Credit Without Borders: AI and Financial Inclusion in Bangladesh
PDF
Enhancing emotion recognition model for a student engagement use case through...
PDF
Two-dimensional Klein-Gordon and Sine-Gordon numerical solutions based on dee...
PDF
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
PPT
Module 1.ppt Iot fundamentals and Architecture
PPTX
Microsoft Excel 365/2024 Beginner's training
PDF
TrustArc Webinar - Click, Consent, Trust: Winning the Privacy Game
PDF
1 - Historical Antecedents, Social Consideration.pdf
PPT
Geologic Time for studying geology for geologist
PDF
sustainability-14-14877-v2.pddhzftheheeeee
PDF
STKI Israel Market Study 2025 version august
PDF
Five Habits of High-Impact Board Members
PDF
Produktkatalog für HOBO Datenlogger, Wetterstationen, Sensoren, Software und ...
PDF
CloudStack 4.21: First Look Webinar slides
PPT
What is a Computer? Input Devices /output devices
PPT
Galois Field Theory of Risk: A Perspective, Protocol, and Mathematical Backgr...
PDF
Abstractive summarization using multilingual text-to-text transfer transforme...
PDF
Getting started with AI Agents and Multi-Agent Systems
PDF
Architecture types and enterprise applications.pdf
AI IN MARKETING- PRESENTED BY ANWAR KABIR 1st June 2025.pptx
Credit Without Borders: AI and Financial Inclusion in Bangladesh
Enhancing emotion recognition model for a student engagement use case through...
Two-dimensional Klein-Gordon and Sine-Gordon numerical solutions based on dee...
Hybrid horned lizard optimization algorithm-aquila optimizer for DC motor
Module 1.ppt Iot fundamentals and Architecture
Microsoft Excel 365/2024 Beginner's training
TrustArc Webinar - Click, Consent, Trust: Winning the Privacy Game
1 - Historical Antecedents, Social Consideration.pdf
Geologic Time for studying geology for geologist
sustainability-14-14877-v2.pddhzftheheeeee
STKI Israel Market Study 2025 version august
Five Habits of High-Impact Board Members
Produktkatalog für HOBO Datenlogger, Wetterstationen, Sensoren, Software und ...
CloudStack 4.21: First Look Webinar slides
What is a Computer? Input Devices /output devices
Galois Field Theory of Risk: A Perspective, Protocol, and Mathematical Backgr...
Abstractive summarization using multilingual text-to-text transfer transforme...
Getting started with AI Agents and Multi-Agent Systems
Architecture types and enterprise applications.pdf

A Cross-Lingual Annotation Projection Approach for Relation Detection

  • 1. A CROSS-LINGUAL ANNOTATION PROJECTION APPROACH FOR RELATION DETECTION The 23rd International Conference on Computational Linguistics (COLING 2010) August 24th, 2010, Beijing Seokhwan Kim (POSTECH) Minwoo Jeong (Saarland University) Jonghoon Lee (POSTECH) Gary Geunbae Lee (POSTECH)
  • 2. Contents • Introduction • Methods  Cross-lingual Annotation Projection for Relation Detection  Noise Reduction Strategies • Evaluation • Conclusion 2
  • 3. Contents • Introduction • Methods  Cross-lingual Annotation Projection for Relation Detection  Noise Reduction Strategies • Evaluation • Conclusion 3
  • 4. What’s Relation Detection? • Relation Extraction  To identify semantic relations between a pair of entities  ACE RDC • Relation Detection (RD) • Relation Categorization (RC) Owner-Of Jan Mullins, owner of Computer Recycler Incorporated said that … 4
  • 5. What’s the Problem? • Many supervised machine learning approaches have been successfully applied to the RDC task  (Kambhatla, 2004; Zhou et al., 2005; Zelenko et al., 2003; Culotta and Sorensen, 2004; Bunescu and Mooney, 2005; Zhang et al., 2006) • Datasets for relation detection  Labeled corpora for supervised learning  Available for only a few languages • English, Chinese, Arabic  No resources for other languages • Korean 5
  • 6. Contents • Introduction • Methods  Cross-lingual Annotation Projection for Relation Detection  Noise Reduction Strategies • Evaluation • Conclusion 6
  • 7. Cross-lingual Annotation Projection • Goal  To learn the relation detector without significant annotation efforts • Method  To leverage parallel corpora to project the relation annotation on the source language LS to the target language LT 7
  • 8. Cross-lingual Annotation Projection • Previous Work  Part-of-speech tagging (Yarowsky and Ngai, 2001)  Named-entity tagging (Yarowsky et al., 2001)  Verb classification (Merlo et al., 2002)  Dependency parsing (Hwa et al., 2005)  Mention detection (Zitouni and Florian, 2008)  Semantic role labeling (Pado and Lapata, 2009) • To the best of our knowledge, no work has reported on the RDC task 8
  • 9. Overall Architecture Annotation Parallel Projection Corpus Sentences in Sentences in Ls Lt Preprocessing Preprocessing (POS Tagging, (POS Tagging, Parsing) Parsing) NER Word Alignment Relation Detection Projection Annotated Annotated Sentences in Sentences in Ls Lt 9
  • 10. How to Reduce Noise? • Error Accumulation  Numerous errors can be generated and accumulated through a procedure of annotation projection • Preprocessing for LS and LT • NER for LS • Relation Detection for LS • Word Alignment between LS and LT • Noise Reduction  A key factor to improve the performance of annotation projection 10
  • 11. How to Reduce Noise? • Noise Reduction Strategies (1)  Alignment Filtering • Based on Heuristics  A projection for an entity mention should be based on alignments between contiguous word sequences accepted rejected 11
  • 12. How to Reduce Noise? • Noise Reduction Strategies (1)  Alignment Filtering • Based on Heuristics  A projection for an entity mention should be based on alignments between contiguous word sequences  Both an entity mention in LS and its projection in LT should include at least one base noun phrase N N N N accepted rejected accepted rejected N 12
  • 13. How to Reduce Noise? • Noise Reduction Strategies (1)  Alignment Filtering • Based on Heuristics  A projection for an entity mention should be based on alignments between contiguous word sequences  Both an entity mention in LS and its projection in LT should include at least one base noun phrase  The projected instance in LT should satisfy the clausal agreement with the original instance in LS N N N N accepted rejected accepted rejected rejected N 13
  • 14. How to Reduce Noise? • Noise Reduction Strategies (2)  Alignment Correction • Based on a bilingual dictionary for entity mentions  Each entry of the dictionary is a pair of entity mention in LS and its translation or transliteration in LT FOR each entity ES in LS A B C D E F G RETRIEVE counterpart ET from DICT(E-T) SEEK ET from the sentence ST in LT IF matched THEN BCD - βγ MAKE new alignment ES-ET ENDIF ENDFOR α β γ δ ε δ ε corrected 14
  • 15. How to Reduce Noise? • Noise Reduction Strategies (3)  Assessment-based Instance Selection • Based on the reliability of a projected instances in LT  Evaluated by the confidence score of monolingual relation detection for the original counterpart instance in LS  Only instances with larger scores than threshold value θ are accepted conf = 0.9 conf = 0.6 θ = 0.7 accepted rejected 15
  • 16. Contents • Introduction • Methods  Cross-lingual Annotation Projection for Relation Detection  Noise Reduction Strategies • Evaluation • Conclusion 16
  • 17. Experimental Setup • Dataset  English-Korean parallel corpus • 454,315 bi-sentence pairs in English and Korean • Aligned by GIZA++  Korean RDC corpus • Annotated following LDC guideline for ACE RDC corpus • 100 news documents in Korean  835 sentences  3,331 entity mentions  8,354 relation instances 17
  • 18. Experimental Setup • Preprocessors  English • Stanford Parser (Klein and Manning, 2003) • Stanford Named Entity Recognizer (Finkel et al., 2005)  Korean • Korean POS Tagger (Lee et al., 2002) • MST Parser (R. McDonald et al., 2006) 18
  • 19. Experimental Setup
    • Relation Detection for English Sentences
         Tree kernel-based SVM classifier
            • Training Dataset
                 ACE 2003 corpus
                    • 674 documents
                    • 9,683 relation instances
            • Model
                 Shortest path enclosed subtrees kernel (Zhang et al., 2006)
            • Implementation
                 SVM-Light (Joachims, 1998)
                 Tree Kernel Tools (Moschitti, 2006)
  • 20. Experimental Setup
    • Relation Detection for Korean Sentences
         Tree kernel-based SVM classifier
            • Training Dataset
                 Half of the Korean RDC corpus (baseline)
                 Projected instances
            • Model
                 Shortest path dependency kernel (Bunescu and Mooney, 2005)
            • Implementation
                 SVM-Light (Joachims, 1998)
                 Tree Kernel Tools (Moschitti, 2006)
  • 21. Experimental Setup
    • Experimental Sets
         Combinations of noise reduction strategies
            • (S1: Heuristic, S2: Dictionary, S3: Assessment)
        1. Baseline
             Trained with only half of the Korean RDC corpus
        2. Baseline + Projections (no noise reduction)
        3. Baseline + Projections (S1)
        4. Baseline + Projections (S1 + S2)
        5. Baseline + Projections (S3)
        6. Baseline + Projections (S1 + S3)
        7. Baseline + Projections (S1 + S2 + S3)
  • 22. Experimental Setup
    • Evaluation
         On the second half of the Korean RDC corpus
            • The first half is for the baseline
         On true entity mentions with true chaining of coreference
         Evaluated by Precision/Recall/F-measure
  • 23. Experimental Results

                                                       no assessment       with assessment
    Model                                              P     R     F       P     R     F
    baseline                                           60.5  20.4  30.5    -     -     -
    baseline + projection                              22.5   6.5  10.0    29.1  13.2  18.2
    baseline + projection (heuristics)                 51.4  15.5  23.8    56.1  22.9  32.5
    baseline + projection (heuristics + dictionary)    55.3  19.4  28.7    59.8  26.7  36.9
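As a consistency check on the results table, the F column can be recomputed from P and R. The sketch assumes F is the balanced F-measure, F = 2PR / (P + R), which the slides do not state explicitly; the function and constant names are hypothetical. Because P and R are themselves rounded to one decimal, small deviations from the reported F are expected.

```python
from typing import List, Tuple


def f_measure(precision: float, recall: float) -> float:
    """Balanced F-measure (harmonic mean of precision and recall)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# (label, P, R, reported F), copied from the results table.
REPORTED: List[Tuple[str, float, float, float]] = [
    ("baseline", 60.5, 20.4, 30.5),
    ("projection", 22.5, 6.5, 10.0),
    ("projection, assessment", 29.1, 13.2, 18.2),
    ("heuristics", 51.4, 15.5, 23.8),
    ("heuristics, assessment", 56.1, 22.9, 32.5),
    ("heuristics + dictionary", 55.3, 19.4, 28.7),
    ("heuristics + dictionary, assessment", 59.8, 26.7, 36.9),
]


def max_deviation(rows: List[Tuple[str, float, float, float]]) -> float:
    """Largest absolute gap between recomputed and reported F."""
    return max(abs(f_measure(p, r) - f) for _, p, r, f in rows)


if __name__ == "__main__":
    for label, p, r, f in REPORTED:
        print(f"{label}: recomputed {f_measure(p, r):.2f}, reported {f}")
```

Under this assumption every recomputed value stays within 0.1 of the reported figure.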
  • 24. Non-filtered Projections Were Poor
    (Same results table as slide 23.)
  • 25. Heuristics Were Helpful
    (Same results table as slide 23.)
  • 26. Much Worse Than Baseline
    (Same results table as slide 23.)
  • 27. Dictionary Was Also Helpful
    (Same results table as slide 23.)
  • 28. Still Worse Than Baseline
    (Same results table as slide 23.)
  • 29. Assessment Boosted Performance
    (Same results table as slide 23.)
  • 30. Combined Strategies Achieved Better Performance Than Baseline
    (Same results table as slide 23.)
  • 32. Conclusion
    • Summary
         A cross-lingual annotation projection for relation detection
         Three strategies for noise reduction
         Projected instances from an English-Korean parallel corpus helped to improve the performance of the task
            • with the noise reduction strategies
    • Future work
         A cross-lingual annotation projection for relation categorization
         More elaborate strategies for noise reduction to improve the projection performance for relation extraction
  • 33. Q&A