Exploring Higher Order Dependency Parsers

             Pranava Swaroop Madhyastha

   Supervised by: Prof. Michael Rosner & RNDr. Daniel Zeman


                   September 6, 2011
Introduction

     ◮   Dependency Grammar.
            ◮   Binary, asymmetric relations between a head and a
                modifier - highly lexical relationships.
      ◮   A quick example: (dependency tree figure)
     ◮   Projective Constraint
      ◮   Graph-Based Dependency Parsing
            ◮   Arc-Factored Parsing (sketched below)
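Under the arc-factored assumption, the score of a candidate tree decomposes into a sum of independent arc scores, which is what makes exact decoding tractable (Eisner's algorithm for projective trees, maximum spanning tree for the non-projective case). A minimal sketch of the decomposition, assuming a hypothetical arc_score helper (one way to score an arc is shown on the Features slide):

```python
# Sketch of arc-factored tree scoring: the tree score is the sum of
# independent arc scores. arc_score is an assumed helper, not part of
# the thesis implementation.
def tree_score(sentence, heads, arc_score):
    """heads[m] is the head index of token m (None for the artificial root)."""
    return sum(arc_score(sentence, h, m)
               for m, h in enumerate(heads) if h is not None)
```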
Problem Description



    ◮   Augmentation of Features
          ◮   Semantic features
          ◮   Morpho-syntactic features
    ◮   Higher-order parsing
          ◮   Context availability
          ◮   Horizontal and vertical context

    ◮   Motivation
          ◮   Semi-supervised dependency parsing and improvements.
                Using well-defined linguistic components.
What is Higher Order Dependency Parsing?
    ◮   First-order model - decomposition of the tree into individual
        head-modifier dependencies (arcs).
    ◮   Second-order models - additionally include either a sibling of
        the modifier (head, modifier, sibling) or a child of the
        modifier (head, modifier, grandchild).
    ◮   Third-order models - extend the context one level further, e.g.
        grand-sibling parts that also include the grandparent (a part
        enumeration is sketched after this slide).




    ◮   An illustration
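To make the part structures concrete, here is a hedged sketch (hypothetical, not the thesis implementation) that enumerates first-order arcs, second-order adjacent-sibling parts, and third-order grand-sibling parts from a head array:

```python
# Hypothetical part enumeration for first-, second- and third-order
# models; the exact part definitions in the implemented parsers may
# differ in detail.
from collections import defaultdict

def enumerate_parts(heads):
    """heads[m] is the head of token m; the artificial root has head None."""
    deps = defaultdict(list)
    for m, h in enumerate(heads):
        if h is not None:
            deps[h].append(m)

    # First-order parts: plain head-modifier arcs.
    first = [(h, m) for m, h in enumerate(heads) if h is not None]

    # Second-order parts: adjacent sibling pairs under the same head,
    # taken outward on each side of the head separately.
    siblings = []
    for h, mods in deps.items():
        left = sorted(m for m in mods if m < h)[::-1]   # innermost first
        right = sorted(m for m in mods if m > h)
        for side in (left, right):
            siblings += [(h, s, m) for s, m in zip(side, side[1:])]

    # Third-order grand-sibling parts add the grandparent of the head.
    grand = [(heads[h], h, s, m) for (h, s, m) in siblings
             if heads[h] is not None]
    return first, siblings, grand
```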
Still Why?
Features


    ◮   For a feature vector φ and a corresponding weight (parameter)
        vector w, each part p of sentence x is scored as

                              Part(x, p) = w · φ(x, p)                  (1)
    ◮   The feature vector is assembled from individual feature
        templates such as (a scoring sketch follows this list):
           ◮   dir.pos(h).pos(m)
           ◮   dir.form(h).pos(m)
           ◮   and so on ...
    ◮   The most basic feature patterns consider the surface form,
        part-of-speech, lemma and other morphosyntactic attributes
        of the head or the modifier of a dependency.
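A minimal sketch of how such templates can be instantiated and combined into the dot-product score of equation (1); the feature strings, weights and token representation here are hypothetical illustrations, not the exact setup used in these experiments:

```python
# Template-based arc features and the dot-product score of equation (1).
from collections import defaultdict

def phi(sentence, h, m):
    """Instantiate templates like dir.pos(h).pos(m) for the arc h -> m."""
    d = "R" if h < m else "L"
    head, mod = sentence[h], sentence[m]
    return [
        f"{d}.pos({head['pos']}).pos({mod['pos']})",
        f"{d}.form({head['form']}).pos({mod['pos']})",
        # ... lemma and other morphosyntactic attributes, and so on
    ]

def part_score(w, sentence, h, m):
    """Part(x, p) = w . phi(x, p) for a first-order (arc) part."""
    return sum(w[f] for f in phi(sentence, h, m))

# Usage with a toy two-token sentence and a single non-zero weight:
w = defaultdict(float, {"R.pos(VBD).pos(NN)": 1.3})
sent = [{"form": "saw", "pos": "VBD"}, {"form": "dog", "pos": "NN"}]
print(part_score(w, sent, 0, 1))  # 1.3
```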
Experiments were carried out with:

    ◮   English - Penn Treebank
          ◮   Sections 2 to 10 as the training set - 15,000 sentences.
          ◮   Random sets of sentences from Sections 15, 17, 19 and 25
              of the Penn Treebank as development data - 1,000
              sentences.
          ◮   The test set was chosen from Sections 0, 1, 21 and 23 of
              the Penn Treebank - 2,000 sentences.
    ◮   Czech - Prague Dependency Treebank
          ◮   The sentences were chosen from the pdt2-full-automorph
              dataset.
          ◮   The training set consisted of the train1 to train5 splits
              - 15,000 sentences.
          ◮   The development set consisted of the train6 and train7
              splits - 1,000 sentences.
          ◮   The test set was made up of the dtest and etest parts -
              2,000 sentences.
Experimentation

    ◮   Fine- and Coarse-Grained Wordsenses
    ◮   Approximation
    ◮   For English:
          ◮   Both fine- and coarse-grained wordsense extraction make
              use of the WordNet::SenseRelate package (a rough analogue
              is sketched after this list).
          ◮   Fine-grained wordsense restricts a word to one particular
              sense, e.g. a noun's first sense as extracted from
              WordNet.
          ◮   Coarse-grained wordsense is a more generic description:
              the WordNet semantic file to which the word belongs.
    ◮   For Czech:
          ◮   Only (approximate) fine-grained wordsense extraction.
          ◮   Extracted using the sempos attribute already annotated in
              the Prague Dependency Treebank.
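A rough analogue of the English fine vs. coarse wordsense features, using NLTK's WordNet interface with the first-sense heuristic (the thesis used the Perl package WordNet::SenseRelate, so this is only an approximation):

```python
# Approximate fine- and coarse-grained wordsense features via NLTK's
# WordNet; first-sense heuristic only, not WordNet::SenseRelate.
from nltk.corpus import wordnet as wn

def fine_sense(word, pos=wn.NOUN):
    """Fine-grained: commit to one particular sense (here the first)."""
    synsets = wn.synsets(word, pos=pos)
    return synsets[0].name() if synsets else None     # e.g. 'dog.n.01'

def coarse_sense(word, pos=wn.NOUN):
    """Coarse-grained: the WordNet lexicographer ('semantic') file."""
    synsets = wn.synsets(word, pos=pos)
    return synsets[0].lexname() if synsets else None  # e.g. 'noun.animal'
```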
Results for the Wordsense augmentation experiment

     ◮   Sibling-based parsers show a statistically significant
         improvement.
     ◮   For English with fine-grained wordsense addition, the
         third-order grand-sibling parser gives an improvement of
         +0.81 percent (unlabeled accuracy score). A closer statistical
         examination showed that sibling interactions close to each
         other have better precision.
     ◮   For English with coarse-grained wordsense addition, the
         second-order sibling parser gives an improvement of
         approximately +1.09 percent.
     ◮   For Czech with fine-grained wordsense augmentation, the
         third-order sibling parser gives an improvement of
         approximately +1.20 percent.
Results for Morphosyntactic augmentation experiment




    ◮   Morphosyntactic features were used directly, by extracting tags
        from the corpus.
    ◮   For Czech, instead of the full 15-position tagset, we tried out
        a subset (Person, Number, PossGender, Tense, Voice and Case),
        as sketched below.
    ◮   For English, we integrated the fine-grained part-of-speech
        tags.
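A hedged sketch of projecting a 15-position PDT tag down to the reduced subset; the slot indices follow the standard PDT positional tagset and should be treated as assumptions to verify against the corpus documentation:

```python
# Project a 15-position PDT tag onto the reduced feature subset.
PDT_SLOTS = {            # 0-based index into the 15-character tag
    "Number": 3, "Case": 4, "PossGender": 5,
    "Person": 7, "Tense": 8, "Voice": 11,
}

def reduced_tag(tag15):
    """Keep only the Person, Number, PossGender, Tense, Voice, Case slots."""
    assert len(tag15) == 15, "PDT positional tags have 15 characters"
    return "".join(tag15[i] for i in PDT_SLOTS.values())

# e.g. reduced_tag("VB-S---3P-AA---") -> "S--3PA"
```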
Results




     ◮    For both English and Czech, there is a significant
          improvement in parsing accuracy with the grandchild-based
          algorithms.
     ◮    For Czech, the third-order grand-sibling algorithm shows an
          improvement of +1.72 percent.
     ◮    For English, the third-order grand-sibling algorithm shows an
          improvement of +1.21 percent.
Conclusion



    ◮   Semantic features work better with sibling-based parsers
        (larger horizontal contexts).
    ◮   Morpho-syntactic features work better with grandchild-based
        parsers (larger vertical contexts).
    ◮   Such features can be instrumental in several tasks, including
        accurate labeling of semantic roles.
    ◮   Linguistic information can be better handled by higher-order
        parsing algorithms.
Future Work




    ◮   Higher-order parsers with labels (we have not yet tested
        labeled accuracy scores).
    ◮   Joint extraction of word senses and semantic roles.
    ◮   Experimentation with lexical clusters.
    ◮   Thorough experimentation with several features.
    ◮   Determining maximum and minimum order requirements.
Thanks
