Tokens
Example: Will you read the newspaper? Will you read it? I won’t read it.
Lexemes
Example: Did you see him? I didn’t see him. I didn’t see anyone.
Morphemes
Typology
Isolating, or analytic
Synthetic
Agglutinative
Fusional
Concatenative
Nonlinear
• Morphological parsing tries to eliminate or reduce the variability of word forms in order to provide higher-level linguistic units whose lexical and morphological properties are explicit and well defined.
• It attempts to remove unnecessary irregularity and ambiguity.
• Irregularity means the existence of forms and structures that are not described appropriately by a prototypical linguistic model.
• Some irregularities can be understood by redesigning the model and improving its rules, but other, lexically dependent irregularities often cannot be generalized.
• Ambiguity is indeterminacy in the interpretation of expressions of language.
• Besides accidental ambiguity and ambiguity due to lexemes having multiple senses, there is the issue of syncretism, or systematic ambiguity.
• Morphological modeling also faces the problem of productivity and creativity in language.
• Usually, though, words that are not licensed by the lexicon of a morphological system remain unparsed.
• This unknown word problem is particularly severe in speech or writing.
Issues:
1. Irregularity
2. Ambiguity
3. Productivity
1. Irregularity:
• Morphological parsing is motivated by generalization and abstraction in the world of words.
• Immediate descriptions of given linguistic data may not be the ultimate ones, due to either their inadequate accuracy or inappropriate complexity, and better formulations may be needed.
• The design principles of the morphological model are therefore very important.
• In Arabic, a deeper study of the morphological processes that are in effect during inflection and derivation is needed, even for the so-called irregular words.
Issues & Morphological models NLP engineering
• Is the inventory of words in a language finite, or is it unlimited? This question leads directly to two fundamental approaches to language, summarized in the distinction between langue and parole.
• The distribution of words or other elements of language follows the “80/20 rule,” also known as the law of the vital few.
• It says that most of the word tokens in a given corpus can be identified with a small number of word types in its vocabulary, while words from the rest of the vocabulary occur much less commonly, if not rarely, in the corpus.
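The token/type skew described above can be measured directly. The following is a minimal sketch, assuming a whitespace-tokenized toy corpus (adapted from the example sentences on the first slide, with "won't" written out as "will not"); the helper name `coverage` is hypothetical.

```python
from collections import Counter

def coverage(tokens, top_k):
    """Fraction of all tokens accounted for by the top_k most frequent types."""
    if not tokens:
        return 0.0
    counts = Counter(tokens)
    covered = sum(count for _, count in counts.most_common(top_k))
    return covered / len(tokens)

# Toy corpus (an assumption for illustration, not real corpus data).
text = "will you read the newspaper will you read it i will not read it"
tokens = text.split()          # word tokens: every running word
types = set(tokens)            # word types: distinct words in the vocabulary

print(len(tokens), "tokens,", len(types), "types")
print("share of tokens covered by the 4 most frequent types:",
      round(coverage(tokens, 4), 2))
```

On a real corpus the same function can be called with increasing `top_k` to see how quickly a small part of the vocabulary covers most of the text.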
In Czech, negation is a productive morphological operation. Verbs,
nouns, adjectives, and adverbs can be prefixed with ne- to define
the complementary lexical concept.
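Because ne- prefixation is productive, a parser cannot list every negated form in advance. A minimal sketch of one way to handle this, assuming a tiny hypothetical lexicon (`LEXICON`) and a hypothetical `analyze` function that falls back to stripping the prefix when a word is unknown:

```python
# Hypothetical three-entry lexicon; a real system would be far larger.
LEXICON = {
    "vidět": "verb",       # "to see"
    "přítel": "noun",      # "friend"
    "možný": "adjective",  # "possible"
}

def analyze(word, lexicon=LEXICON):
    """Return an analysis dict, or None if the word stays unparsed."""
    if word in lexicon:
        return {"lemma": word, "pos": lexicon[word], "negated": False}
    # Productive rule: ne- + known base yields the complementary concept.
    if word.startswith("ne") and word[2:] in lexicon:
        base = word[2:]
        return {"lemma": base, "pos": lexicon[base], "negated": True}
    return None  # unknown word problem: not licensed by the lexicon

print(analyze("nemožný"))   # parsed via the productive rule
print(analyze("nebe"))      # "sky": starts with ne- but is not a negation
```

The second call shows why the rule is checked against the lexicon: not every word beginning with ne- is a negated form.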
• There are many possible approaches to designing and implementing morphological models.
• Computational linguistics has witnessed the development of a number of formalisms and frameworks, in particular grammars.
Types of Models
1. Dictionary Lookup
2. Finite-State Morphology
3. Unification-Based Morphology
4. Functional Morphology
5. Morphology Induction
• Morphological parsing is a process by which word forms of a language are associated with corresponding linguistic descriptions.
• In the simplest systems, analyzing a word form is reduced to looking it up in word lists, dictionaries, or databases, which may themselves be produced by more sophisticated models of the language.
• A dictionary is understood as a data structure that directly enables obtaining some precomputed results, in our case word analyses.
• Dictionaries can be implemented, for instance, as lists, binary search trees, tries, hash tables, and so on.
• One dictionary-based approach to Korean depends on a large dictionary of all possible combinations of allomorphs and morphological alternations.
• The word list or dictionary-based approach has been used in various ad hoc implementations for many languages.
• In finite-state morphological models, the specifications written by human programmers are directly compiled into finite-state transducers.
• The two most popular tools supporting this approach, which are available online for multiple languages, are:
1. XFST (Xerox Finite-State Tool)
2. LexTools
• The role of finite-state transducers is to capture and compute regular relations on sets.
• Transducers specify relations between the input and output languages.
• In morphological analysis, the input word forms are referred to as surface strings and the output descriptions as lexical strings.
• In English, a finite-state transducer could analyze the surface string children into the lexical string child [+plural], for instance, or generate women from woman [+plural].
• Relations on languages can also be viewed as functions.
• Let us have a relation R, and let us denote by [Σ] the set of all sequences over some set of symbols Σ, so that the domain and the range of R are subsets of [Σ].
• We can then consider R as a function mapping an input string into a set of output strings.
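The view of R as a function from an input string to a set of output strings can be sketched with a toy transducer. This is only an illustration under stated assumptions: it is not how XFST or LexTools work internally, the transducer merely enumerates two hand-aligned pairs, and the names `build_fst`, `invert`, and `apply_fst` are hypothetical.

```python
from itertools import zip_longest

EPS = ""  # epsilon: an arc side that consumes or emits nothing

def build_fst(pairs):
    """Build (arcs, finals) accepting exactly the given (surface, lexical) pairs.
    Symbols are aligned position by position; the shorter side is padded with EPS."""
    arcs = {0: []}   # state -> list of (input symbol, output symbol, next state)
    finals = set()
    for surface, lexical in pairs:
        state = 0
        for a, b in zip_longest(surface, lexical, fillvalue=EPS):
            new_state = len(arcs)
            arcs[new_state] = []
            arcs[state].append((a, b, new_state))
            state = new_state
        finals.add(state)
    return arcs, finals

def invert(fst):
    """Swap input and output on every arc: analysis becomes generation."""
    arcs, finals = fst
    return {q: [(b, a, r) for a, b, r in out] for q, out in arcs.items()}, finals

def apply_fst(fst, symbols):
    """Map an input symbol sequence to the SET of output sequences it relates to."""
    arcs, finals = fst
    results = set()
    stack = [(0, 0, ())]  # (state, input position, output emitted so far)
    while stack:
        state, i, out = stack.pop()
        if i == len(symbols) and state in finals:
            results.add(out)
        for a, b, nxt in arcs[state]:
            emitted = out + ((b,) if b != EPS else ())
            if a == EPS:
                stack.append((nxt, i, emitted))
            elif i < len(symbols) and symbols[i] == a:
                stack.append((nxt, i + 1, emitted))
    return results

analyzer = build_fst([
    (list("children"), list("child") + ["[+plural]"]),
    (list("women"), list("woman") + ["[+plural]"]),
])
generator = invert(analyzer)

print(apply_fst(analyzer, list("children")))
print(apply_fst(generator, list("woman") + ["[+plural]"]))
```

The same arcs serve both directions, which mirrors the point that a transducer specifies a relation rather than a one-way procedure. An unknown input yields the empty set.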
• Rewrite rules of phonology and morphology have been around since the two-level morphology model.
• They are further presented in Computational Approaches to Morphology and Syntax and in Morphology and Computation.
• The compile-replace transducer operation was introduced for handling nonconcatenative phenomena in morphology.
• A theoretical limitation of finite-state models of morphology is the problem of capturing reduplication of words or their elements.
• Unification-based approaches to morphology use various formal linguistic frameworks for complete grammatical descriptions of human languages, especially head-driven phrase structure grammar (HPSG),
• and languages developed for lexical knowledge representation, such as DATR.
• The concepts and methods of these formalisms are often closely connected to those of logic programming.
• In finite-state morphological models, both surface and lexical forms are by themselves unstructured strings of symbols.
• In higher-level approaches, linguistic information is expressed by more appropriate data structures.
• Morphological parsing P thus associates linear forms φ with alternatives of structured content ψ; that is, P maps each form φ to a set of alternatives ψ.
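To make the idea of structured content concrete, here is a minimal sketch of unification over feature structures represented as nested Python dicts. It is a simplification under stated assumptions: it has no variables or shared (reentrant) values, which HPSG-style formalisms do support, and the function name `unify` and the feature names are chosen for this example only.

```python
def unify(a, b):
    """Unify two feature structures (nested dicts with atomic leaf values).
    Returns the merged structure, or None if the two conflict."""
    if isinstance(a, dict) and isinstance(b, dict):
        result = dict(a)
        for key, value in b.items():
            if key in result:
                merged = unify(result[key], value)
                if merged is None:
                    return None   # conflicting values under the same feature
                result[key] = merged
            else:
                result[key] = value
        return result
    # Atomic values unify only if they are equal.
    return a if a == b else None

stem = {"lemma": "child", "cat": "noun"}
plural = {"cat": "noun", "agr": {"num": "pl"}}
singular_context = {"agr": {"num": "sg"}}

combined = unify(stem, plural)
print(combined)
print(unify(combined, singular_context))  # conflict on agr.num
```

A parse then corresponds to a set of such structures, one per alternative analysis of the form.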
• This group of morphological models includes not only the ones following the methodology of functional morphology but also related ones, such as the grammars of Grammatical Framework.
• Functional morphology defines its models using principles of functional programming and type theory.
• It treats morphological operations and processes as pure mathematical functions and organizes the linguistic as well as abstract elements of a model into distinct types of values and type classes.
• Functional morphology implementations are intended to be reused as programming libraries capable of handling the complete morphology of a language and to be incorporated into various kinds of applications.
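The idea of treating inflection as pure functions over typed values can be sketched as follows. Functional morphology is usually written in a typed functional language; this Python version with `Enum` and type hints is only an approximation, and the names `Number`, `Noun`, `regular_noun`, `irregular_noun`, and `table` are hypothetical.

```python
from enum import Enum
from typing import Callable, Dict

class Number(Enum):
    """A distinct type for the grammatical category of number."""
    SG = "singular"
    PL = "plural"

# A noun is modeled as a pure function from a feature value to a word form.
Noun = Callable[[Number], str]

def regular_noun(stem: str) -> Noun:
    """Paradigm for regular English nouns (simplified spelling rule)."""
    def inflect(n: Number) -> str:
        if n is Number.SG:
            return stem
        if stem.endswith(("s", "x", "z", "ch", "sh")):
            return stem + "es"
        return stem + "s"
    return inflect

def irregular_noun(sg: str, pl: str) -> Noun:
    """Paradigm for nouns whose plural must be listed lexically."""
    return lambda n: sg if n is Number.SG else pl

def table(noun: Noun) -> Dict[Number, str]:
    """Enumerate the full inflection table by applying the function to every value."""
    return {n: noun(n) for n in Number}

print(table(regular_noun("box")))
print(table(irregular_noun("child", "children")))
```

Because the paradigms are ordinary functions, they can be imported and reused by other programs, which is the library-style reuse the slide describes.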
• Grammars in the OpenCCG project can be viewed as functional models, too.
• Their formalism discerns declarations of features, categories, and families.
Morphology Induction
• The goal is to discover the structure of words.
• Several directions of research exist in this domain.
• Deducing word structure raises several challenging issues.
• To improve the chances of statistical inference, Snyder and Barzilay propose parallel learning of morphologies for multiple languages.
• The discriminative log-linear model of Poon, Cherry, and Toutanova enhances its generalization options by employing overlapping contextual features when making segmentation decisions.