LINGUISTIC EVALUATION
OF SUPPORT VERB CONSTRUCTIONS
BY OPENLOGOS AND GOOGLE TRANSLATE
ANABELA BARREIRO
INESC-ID
KUTZ ARRIETA
Oracle
JOHANNA MONTI
University of Sassari
WANG LING
CMU-IST
BRIGITTE ORLIAC
Logos Institute
FERNANDO BATISTA
INESC-ID, ISCTE-IUL
SUSANNE PREUß
Saarland University
ISABEL TRANCOSO
INESC-ID, IST
Language Resources and Evaluation Conference 26-31 May, Reykjavik, Iceland
• Introduction
– Towards Hybrid Machine Translation
– OpenLogos and Google Translate Models
• Evaluation Task
– Corpus and Datasets
– Quantitative Results
– Linguistic Evaluation Details
• Current Work
– Semantico-Syntactic Knowledge Integration into SMT
• Conclusions and Future Work
Outline
• MT GOAL
– researchers aim for robust MT systems that can produce high
quality translations
• CURRENT PROBLEMS
– translations produced by widely used MT systems still show
unfortunate errors that require significant post-editing effort
– there is a lack of periodic qualitative evaluation efforts
involving MT systems of different natures
– state-of-the-art quality metrics and estimation have been
targeting human-factors tasks (post-editing time and effort),
but NOT diagnosing fine-grained linguistic errors to improve
syntactic structure and meaning
Introduction
• CURRENT TREND
– produce systems that combine linguistic resources and
analysis with statistical techniques that will lead to
linguistically enhancing SMT models
• OUR MOTIVATION
– belief that an effective method to advance MT research is to
bring different approaches together, comparing them and
measuring which modules need improvement
– to our knowledge, no major effort has been made to combine
the strengths of different MT approaches with the purpose of
overcoming known weaknesses on the basis of a joint
linguistic evaluation of those weaknesses
Introduction
• OUR GOALS
– advance hybrid MT, starting by understanding different
approaches, their weaknesses and strengths
– perform a systematic fine-grained linguistic analysis of the
performance of individual models
– The first exercise to achieve our goals is to evaluate the
performance of RBMT and SMT when dealing with a very
specific linguistic phenomenon: support verb constructions
Introduction
• A current trend in MT research is the creation of HMT models
that combine linguistic knowledge with statistical techniques
• HMT systems attempt to combine RBMT systems [Scott, 2003]
with data-driven MT systems, such as phrase-based SMT
[Koehn, 2007]
• System combination often leads to improvements in
translation quality, as different systems tend to address
different translation challenges
• it is still not obvious which HMT approach will be the most
efficient one and will lead to higher quality translation in the
long run
Towards Hybrid Machine Translation
• SMT models learn generalizations of the translation process
using parallel corpora
– they tend to perform better than RBMT when parallel corpora
are abundant (English-Mandarin)
– when parallel corpora are scarce (Spanish-Basque), they have
insufficient data to learn generalizations [Labaka, 2007]
• Morphologically rich languages require more data to learn
accurate translations
– SMT models for morphologically rich languages have been
proposed [Chahuneau, 2013]
– RBMT systems with manually-encoded morphology are an
alternative for resource-poor languages
Towards Hybrid Machine Translation
• Some methods to combine RBMT with SMT:
– combine the translations of the same text by two different systems
[Eisele, 2008] [Heafield, 2011]
– use data-driven techniques to improve RBMT systems
[Eisele, 2008] uses phrase pair extraction in phrase-based SMT to
extract phrasal translations used to improve the coverage of a
RBMT system
– a similar method using example-based MT for the same end has
been proposed [Sanchez, 2009]
– use statistical post-editing methods to improve RBMT translation
quality [Elming, 2006] [Simard, 2007] [Dugast, 2007] [Terumasa,
2007]
– use RBMT systems to enhance data-driven approaches. [Shirai,
1997] uses an example-based MT system [Brown, 1996] to create
an initial translation template, and a RBMT system to translate
individual words and phrases according to this template
Towards Hybrid Machine Translation
• is an open source copy of the commercial Logos System
• addresses morphology, syntax, and semantics, has robust
parsers, sets of semantico-syntactic rules, terminology sets
and tools
• pattern-based methodology
– closer in spirit to the SMT approach with the advantage of
including semantic knowledge/understanding
• uses an intermediate language (SAL) to encode linguistic
information and process text
– SAL contributes to the high translation quality of OpenLogos (OL)
and lessens one of the main problems in SMT (the sparseness
of linguistic examples)
• its linguistic knowledge databases have not been developed
for over 10 years
The OpenLogos Model
• one of the most widely used online MT systems
• this SMT system benefits from the large amount of parallel
data that Google collects from the web
– in March 2014, it was set to account for 80 language pairs
• translation quality is highly dependent on the language pair,
producing better results for close language pairs (Portuguese
and Spanish) and languages for which large amounts of
parallel data are available
• closed system; no semantic understanding component is known
to exist in Google Translate (GT)
The Google Translate Model
• sentences containing 100 support verb constructions (SVC)
extracted from the news and Internet
• SVC - multiword or complex predicate formed by a
semantically weak verb, and a predicate
noun/adjective/adverb [Barreiro, 2008]
– make a presentation
support verb make + predicate noun presentation
– make it simple
support verb make + predicate adjective simple
Evaluation Task: Corpus
• Why SVC?
– studied systematically within the Lexicon-Grammar Theory
• the scientific study of SVC eliminates subjectivity concerns
for the evaluation task
– occur abundantly in texts
– recognized and processed computationally
• in general and specific-purpose corpora
• for several languages
– most MT systems still fail at addressing the compositional
aspect of multiword units
• when translated incorrectly, SVC have a negative impact on
the understandability and quality of translations
Evaluation Task: Corpus
• Why SVC?
– SVC can be non-contiguous (the individual elements that
compose the unit are placed apart in the sentence), with a
smaller or greater number of inserts
• An insert is any word in between elements of the multiword
other than an article before a predicate noun
• we are taking a growing interest in
– non-contiguous SVC are extremely difficult to align in SMT,
remaining one of the key cross-language challenges for MT
Evaluation Task: Corpus
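The insert definition above can be sketched in a few lines of Python. The function name `find_inserts`, the whitespace tokenization, and the short article list are assumptions made for illustration. The sketch simplifies the definition by skipping every article that precedes the predicate noun, and it does not reflect how OL or GT process SVC internally.

```python
# Hypothetical helper for illustration only.
ARTICLES = {"a", "an", "the"}  # assumed English article list


def find_inserts(tokens, svc_elements, predicate_noun=None):
    """Return the words located between the SVC elements, in sentence order.

    tokens: the tokenized sentence.
    svc_elements: the words of the SVC, in the order they occur.
    predicate_noun: if given, articles before it are not counted as inserts.
    """
    # Locate each SVC element, scanning left to right.
    positions = []
    start = 0
    for element in svc_elements:
        idx = tokens.index(element, start)  # ValueError if the element is absent
        positions.append(idx)
        start = idx + 1

    noun_pos = None
    if predicate_noun is not None:
        noun_pos = positions[svc_elements.index(predicate_noun)]

    element_positions = set(positions)
    inserts = []
    for i in range(positions[0] + 1, positions[-1]):
        if i in element_positions:
            continue
        is_article_before_noun = (
            noun_pos is not None
            and i < noun_pos
            and tokens[i].lower() in ARTICLES
        )
        if not is_article_before_noun:
            inserts.append(tokens[i])
    return inserts


sentence = "we are taking a growing interest in".split()
print(find_inserts(sentence, ["taking", "interest", "in"], "interest"))
# -> ['growing']
```

A contiguous SVC such as "make a presentation" yields an empty list, since the only intervening word is the article before the predicate noun.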
Support Verb Construction Types in Our Corpus
Nominal Support Verb Construction (NSVC)
make a presentation
Adjectival Support Verb Construction (ADJSVC)
be meaningful
Non-contiguous nominal (NON-CONT NSVC)
have [ADV+ADJ-particularly good] links
Prepositional nominal (PREPNSVC)
give an illustration of
Non-contiguous prepositional nominal (NON-CONT PREPNSVC)
be the [ADJ-immediate] cause of
Idiomatic nominal (IDIOM NSVC)
set in motion, place at risk, go on strike
Idiomatic prepositional nominal (IDIOM PREPNSVC)
earn an income of
Non-contiguous idiomatic nominal (NON-CONT IDIOM NSVC)
hold [NP-the option] in place, be of [ADJ-practical] value
Non-contiguous idiomatic prepositional nominal (NON-CONT IDIOM PREPNSVC)
give [PRO-us] a [bird’s-eye] view of, be [ADV-clearly] at odds with, open talks [May 14] with
Non-contiguous adjectival (NON-CONT ADJSVC)
be [ADV-extremely] selective
Prepositional adjectival (PREPADJSVC)
be known as; be involved in
Non-contiguous prepositional adjectival (NON-CONT PREPADJSVC)
fall [ADV-so far] short of
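The labels in the taxonomy above appear to be composed from four features: predicate type, contiguity, presence of a preposition, and idiomaticity. A minimal Python sketch of that composition follows. The function `svc_label` and its parameter names are hypothetical, and the function only builds a label; it does not check that the resulting combination is one of the types listed in the corpus.

```python
# Hypothetical helper: composes a taxonomy label from four features.
def svc_label(predicate, contiguous=True, prepositional=False, idiomatic=False):
    """Build a label such as 'NON-CONT IDIOM PREPNSVC'.

    predicate: 'noun' or 'adjective' (the predicate element of the SVC).
    """
    base = {"noun": "NSVC", "adjective": "ADJSVC"}[predicate]
    if prepositional:
        base = "PREP" + base
    parts = []
    if not contiguous:
        parts.append("NON-CONT")
    if idiomatic:
        parts.append("IDIOM")
    parts.append(base)
    return " ".join(parts)


# Examples taken from the taxonomy above.
print(svc_label("noun"))  # make a presentation
print(svc_label("adjective", contiguous=False, prepositional=True))  # fall [so far] short of
print(svc_label("noun", contiguous=False, prepositional=True, idiomatic=True))  # be [clearly] at odds with
```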
• Each SVC was annotated according to the SVC taxonomy
• SVC corpus was translated into FR, GE, IT, PT and ES, using the
OL and the GT systems
• native linguists evaluated the SVC translation quality for each
target language and classified each translation according to a binary
evaluation metric:
– OK or ERR (agreement, morphology-related, or other
problems, such as incorrect prepositions or wrong word order)
• a comprehensive qualitative evaluation of mistranslations
according to the different types of SVC was provided
• none of the systems was trained for the task - texts were not
domain specific
Evaluation Task: Setup
Quantitative Results
Lang. pair   System   OK   ERR   Agreem   Other
EN-FR        GT       64   32    4        -
EN-FR        OL       51   48    1        -
EN-GE        GT       37   46    3        14
EN-GE        OL       60   33    1        6
EN-IT        GT       61   31    -        8
EN-IT        OL       43   52    -        5
EN-PT        GT       68   27    5        -
EN-PT        OL       41   58    1        -
EN-ES        GT       51   41    6        2
EN-ES        OL       25   70    3        2
Results for translation of the 100 SVC in our corpus into FR, GE, IT, PT, and ES with the OL and the GT MT systems
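The figures in the results table can be checked with a short Python sketch. The counts are copied from the table; reading "-" as 0 is an assumption, and the names `RESULTS`, `best_system`, and `mean_ok` are introduced here only for illustration. Since the corpus contains 100 SVC, each count is also a percentage.

```python
# (OK, ERR, Agreem, Other) per system; "-" in the table is read as 0 (assumption).
RESULTS = {
    "EN-FR": {"GT": (64, 32, 4, 0), "OL": (51, 48, 1, 0)},
    "EN-GE": {"GT": (37, 46, 3, 14), "OL": (60, 33, 1, 6)},
    "EN-IT": {"GT": (61, 31, 0, 8), "OL": (43, 52, 0, 5)},
    "EN-PT": {"GT": (68, 27, 5, 0), "OL": (41, 58, 1, 0)},
    "EN-ES": {"GT": (51, 41, 6, 2), "OL": (25, 70, 3, 2)},
}


def best_system(pair):
    """Return the system with the higher OK count for a language pair."""
    systems = RESULTS[pair]
    return max(systems, key=lambda name: systems[name][0])


def mean_ok(system):
    """Average OK count of a system over the five language pairs."""
    counts = [RESULTS[pair][system][0] for pair in RESULTS]
    return sum(counts) / len(counts)


for pair in RESULTS:
    print(pair, best_system(pair))
print("GT mean OK:", mean_ok("GT"))
print("OL mean OK:", mean_ok("OL"))
```

Every row sums to 100, OL has the higher OK count only for EN-GE, and the mean OK counts are 56.2 for GT and 44.0 for OL.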
• OL translates correctly more SVC than GT
• incorrect translations (for both systems) concern:
– word choice, incl. most prepositions - lexical (L)
– word order, incl. incorrect clause segmentation - order (O)
– word form, incl. choice between bare infinitive and to-infinitive - morphology (M)
– missing word, mainly auxiliary and main verb - ellipsis (E)
• GT has more lexical, morphology, and missing-word errors than OL
• GT lexical coverage is poor with respect to contiguous SVC
• GT does not translate the GE verb split well (even after
reordering)
Linguistic Evaluation: EN-GE
• GT translates correctly more SVC than OL
• most translation errors by both systems involved:
– incorrect lexical choice for some or all of the elements of the
SVC (non-translation or literal translation)
– wrong agreement (subject-verb, subject-predicate adjective)
– non-contiguous and idiomatic SVC
– less idiomatic SVC - problems with (i) prepositions; (ii) literal
translation of the support verb and (iii) wrong lexical choice
for the predicate noun
– prepositions and determiner assignment, which require
minor post-editing corrections (e.g., prepositional adjectival
SVC)
Linguistic Evaluation: EN-FR/IT/PT/ES
• In general, SVC problems by GT were more structural, while SVC
problems by OL were more lexical
• OL would easily translate contiguous and non-contiguous SVC
correctly, provided they were added to its dictionary and rule DB
• OL is able to resolve the SVC internal modifiers better than GT,
which removes some meaning from the source in the translation
• OL use of linguistic knowledge in its structural analysis is a
powerful feature that can make OL performance for the Romance
languages as satisfactory as that for GE
• Higher quality translation can be achieved if we combine:
• OL ability to translate different surface structures of a sentence
• GT rich word selection powered by sophisticated statistical
methods to extract knowledge from large volumes of parallel data
Linguistic Evaluation: Conclusions
• In the OL system, linguistic elements are represented in a
semantico-syntactic abstraction language (SAL) with
ontological properties
• SAL represents the heart of OL, accounting for its effectiveness
in parsing and semantic understanding
[Scott, 2003] [Barreiro et al., 2011] [Barreiro et al., 2014]
– http://www.l2f.inesc-id.pt/~abarreiro/openlogos-tutorial/INDEX.HTM
• SAL is hierarchical, made up of supersets, sets and subsets
• SAL knowledge is encoded in the lexicon,
both in the dictionary entries and in the rules.
• Bilingual dictionaries with SAL knowledge are available at:
– http://metanet4u.l2f.inesc-id.pt
Proposal for Semantico-Syntactic Knowledge
Integration into SMT
SAL hierarchy (example for nouns):
word class: nouns
superset: concrete
set: functionals
subset: conduits, barriers, containers, …
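The SAL hierarchy (word class, superset, set, subset) can be pictured as a nested mapping. The Python sketch below uses only the example categories shown on the slide; the nested-dictionary layout and the function `sal_path` are assumptions for illustration and say nothing about how OL stores SAL internally.

```python
# Illustrative fragment only: word class -> superset -> set -> subsets.
SAL = {
    "nouns": {
        "concrete": {
            "functionals": ["conduits", "barriers", "containers"],
        },
    },
}


def sal_path(subset):
    """Return (word class, superset, set, subset) for a subset, or None."""
    for word_class, supersets in SAL.items():
        for superset, sets in supersets.items():
            for set_name, subsets in sets.items():
                if subset in subsets:
                    return (word_class, superset, set_name, subset)
    return None


print(sal_path("barriers"))
# -> ('nouns', 'concrete', 'functionals', 'barriers')
```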
• In OL, all NL input sentences are converted into SAL patterns,
which represent the semantico-syntactic and morphological
features of each word
• SAL elements interact with semantico-syntactic rules called
SEMTAB rules, which
– represent the meaning of words on the basis of their
association with other words (context)
– disambiguate the meanings of words in the source text by
identifying the syntactic structures underlying each meaning
– provide the target language equivalents of each identified
meaning of a source language
– are conceptual and encode deep structure relations
Proposal for Semantico-Syntactic Knowledge
Integration into SMT
• SEMTAB rules are called after dictionary look-up and during the
execution of target transfer rules (TRAN rules) to solve ambiguity
problems (verb dependencies) and multiwords, overriding the
default dictionary transfer
• When a sentence is being parsed by TRAN, OL sends the SAL
patterns to the SEMTAB database to look for a rule match
• If the rule exists for a linguistic string, TRAN uses that rule and
overrides the dictionary transfer for that string
Proposal for Semantico-Syntactic Knowledge
Integration into SMT
• A string can maintain the SVC structure or be paraphrased
apply paint to
PT: aplicar tinta a / pintar
• The SEMTAB rule applies to different surface structures of the
SVC and any insert specified in the rule
they applied immediately red paint (immediately) to
PT: aplicaram imediatamente tinta vermelha a
Proposal for Semantico-Syntactic Knowledge
Integration into SMT
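The override behaviour described on these slides (a matching SEMTAB rule replaces the default dictionary transfer, even when inserts separate the SVC elements) can be sketched in Python. This is a toy illustration and not the OL implementation: the rule format, the function names, and the `DICTIONARY` entries are hypothetical, the input is assumed to be already lemmatized, and inserts are neither translated nor re-inflected. Only the rule target "aplicar tinta a" comes from the slide.

```python
# Hypothetical word-level defaults, made up for this sketch.
DICTIONARY = {"apply": "aplicar", "paint": "tinta", "to": "para"}

# Hypothetical rule store: source pattern (lemmas, in order) -> target string.
SEMTAB = [
    {"pattern": ("apply", "paint", "to"), "target": "aplicar tinta a"},
]


def matches(pattern, lemmas):
    """True if the pattern elements occur in order, with any inserts between."""
    remaining = iter(lemmas)
    # 'element in remaining' consumes the iterator up to the first match,
    # so later elements must appear after earlier ones.
    return all(element in remaining for element in pattern)


def transfer(lemmas):
    """Return (translation, source), preferring a SEMTAB rule when one matches."""
    for rule in SEMTAB:
        if matches(rule["pattern"], lemmas):
            return rule["target"], "SEMTAB"
    # Fall back to word-by-word dictionary transfer.
    words = [DICTIONARY.get(lemma, lemma) for lemma in lemmas]
    return " ".join(words), "dictionary"


print(transfer(["apply", "paint", "to"]))
print(transfer(["apply", "immediately", "red", "paint", "to"]))
print(transfer(["apply", "to"]))
```

The first two calls are resolved by the rule, including the non-contiguous variant; the third has no matching rule and falls back to the dictionary.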
• As long as the SEMTAB rule exists in the database, OL can
process and translate correctly all the incorrectly translated
SVC in our corpus (by OL and GT)
• The OL method can overcome the structural problems
presented by SMT, not only the contiguous, but also the non-
contiguous SVC, independently of how remotely they occur in
the sentence
• The OL methodology applies to any type of multiword and
allows the translation of other context-sensitive challenges
Proposal for Semantico-Syntactic Knowledge
Integration into SMT
• Multiwords (SVC) are responsible for most translation errors
– researchers need to develop approach-independent
systematic linguistic quality evaluation metrics with
phased error categorization tasks where specific linguistic
phenomena (such as SVC) can be evaluated individually in
stages by MT expert linguists
• fine-grained error categorization can contribute to more
controlled and systematic evaluation tasks
• evaluation needs to target each group of linguistic errors and
identify which system has more difficulties translating each
type of linguistic challenge (paradigmatic evaluation)
Conclusions and Future Work
• evaluation tasks require the construction of corpora to test
grammatical correctness addressing individual linguistic
phenomena
– different types of multiwords, relative constructions,
passives, pronouns, determiners, locative prepositions, etc.
• TOWARDS HYBRIDIZATION
– the question “how effectively can rule-based and statistical
MT be combined?” can only be answered after linguistic
quality evaluation metrics are developed and validated by
the MT community
• no effective hybridization can take place before linguistic
evaluation of the results provided by different approaches is
successfully accomplished
Conclusions and Future Work
Thank you!
This research was supported by FCT (Fundação para a Ciência e Tecnologia),
through grant SFRH/BPD/91446/2012 and project PEst-OE/EEI/LA0021/2013.

More Related Content

PDF
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
PDF
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
PDF
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
PDF
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
PDF
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
PDF
Tech capabilities with_sa
PDF
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
PPTX
K-SRL: Instance-based Learning for Semantic Role Labeling
MultiMWE: Building a Multi-lingual Multi-Word Expression (MWE) Parallel Corpora
ADAPT Centre and My NLP journey: MT, MTE, QE, MWE, NER, Treebanks, Parsing.
CUHK intern PPT. Machine Translation Evaluation: Methods and Tools
Chinese Character Decomposition for Neural MT with Multi-Word Expressions
9. Manuel Harranz (pangeanic) Hybrid Solutions for Translation
Tech capabilities with_sa
LEPOR: an augmented machine translation evaluation metric - Thesis PPT
K-SRL: Instance-based Learning for Semantic Role Labeling

What's hot (10)

PPTX
PDF
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
PPTX
VOC real world enterprise needs
PDF
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
PPTX
Arabic question answering ‫‬
PDF
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
PDF
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
PDF
Answer Selection and Validation for Arabic Questions
PDF
Requirements Engineering: focus on Natural Language Processing, Lecture 2
PDF
What java developers (don’t) know about api compatibility
TSD2013 PPT.AUTOMATIC MACHINE TRANSLATION EVALUATION WITH PART-OF-SPEECH INFO...
VOC real world enterprise needs
cushLEPOR uses LABSE distilled knowledge to improve correlation with human tr...
Arabic question answering ‫‬
Lung-Hao Lee - 2015 - Overview of the NLP-TEA 2015 Shared Task for Chinese Gr...
Elior Sulem - 2018 - Semantic Structural Evaluation for Text Simplification
Answer Selection and Validation for Arabic Questions
Requirements Engineering: focus on Natural Language Processing, Lecture 2
What java developers (don’t) know about api compatibility
Ad

Viewers also liked (20)

PDF
When Multiwords Go Bad in Machine Translation
PPTX
Level3 new exam_model
PPTX
Is Google Translate Effective At Sentence Changing
ODP
Google Translate + TectoMT
PPTX
Google translate 1
PDF
Human vs-Machine-Translation
ZIP
Language Use And Preservation Online
PPT
Testing and evaluation
PDF
Building Translate on Glass
PPTX
Pptphrase tagset mapping for french and english treebanks and its application...
PPT
Google translate (new russian)
ODP
8 Google Translate
PPTX
Google Translate in the Classroom
PPTX
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
PDF
Amharic document clustering
PPS
Google Translate Update
PPSX
Linguistic & language teaching
PPTX
Google translate
PDF
Natural language processing with python and amharic syntax parse tree by dani...
PPTX
Machine Translation=Google Translator
When Multiwords Go Bad in Machine Translation
Level3 new exam_model
Is Google Translate Effective At Sentence Changing
Google Translate + TectoMT
Google translate 1
Human vs-Machine-Translation
Language Use And Preservation Online
Testing and evaluation
Building Translate on Glass
Pptphrase tagset mapping for french and english treebanks and its application...
Google translate (new russian)
8 Google Translate
Google Translate in the Classroom
Bilingual Data Mining for the English-Amharic Statistical Machine Translation...
Amharic document clustering
Google Translate Update
Linguistic & language teaching
Google translate
Natural language processing with python and amharic syntax parse tree by dani...
Machine Translation=Google Translator
Ad

Similar to Linguistic Evaluation of Support Verb Construction Translations by OpenLogos and Google Translate (20)

PPTX
Past, Present, and Future: Machine Translation & Natural Language Processing ...
PPTX
Past, Present, and Future: Machine Translation & Natural Language Processing ...
PDF
"Machine Translation 101" and the Challenge of Patents
PDF
The Latest Advances in Patent Machine Translation
PDF
The Effect of Translationese on Statistical Machine Translation
PDF
Machine Translation Approaches and Design Aspects
PDF
13. Constantin Orasan (UoW) Natural Language Processing for Translation
PDF
Using ontology based context in the
PPTX
Machine translator Introduction
PPT
Machine Translation ppt for engineering students
PPTX
Evaluation of hindi english mt systems, challenges and solutions
PDF
PDF
Error Analysis of Rule-based Machine Translation Outputs
PDF
Improved Word Alignments Using the Web as a Corpus
PDF
The First English-Persian statistical machine translation
PDF
Make it simple with paraphrases: Automated paraphrasing for authoring aids an...
PDF
Nakov S., Nakov P., Paskaleva E., Improved Word Alignments Using the Web as a...
PDF
A new hybrid metric for verifying
PDF
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
PPTX
Cross language alignments - challenges guidelines and gold sets
Past, Present, and Future: Machine Translation & Natural Language Processing ...
Past, Present, and Future: Machine Translation & Natural Language Processing ...
"Machine Translation 101" and the Challenge of Patents
The Latest Advances in Patent Machine Translation
The Effect of Translationese on Statistical Machine Translation
Machine Translation Approaches and Design Aspects
13. Constantin Orasan (UoW) Natural Language Processing for Translation
Using ontology based context in the
Machine translator Introduction
Machine Translation ppt for engineering students
Evaluation of hindi english mt systems, challenges and solutions
Error Analysis of Rule-based Machine Translation Outputs
Improved Word Alignments Using the Web as a Corpus
The First English-Persian statistical machine translation
Make it simple with paraphrases: Automated paraphrasing for authoring aids an...
Nakov S., Nakov P., Paskaleva E., Improved Word Alignments Using the Web as a...
A new hybrid metric for verifying
CONSTRUCTION OF ENGLISH-BODO PARALLEL TEXT CORPUS FOR STATISTICAL MACHINE TRA...
Cross language alignments - challenges guidelines and gold sets

More from INESC-ID (Spoken Language Systems Laboratory - L2F) (20)

PDF
Análise comparativa das edições portuguesa e brasileira de Os livros que dev...
PDF
Welcome session 3rd Annual MC Meeting - enetCollect COST Action
PPTX
Syntactic-semantic analysis for information extraction in biomedicine
PPT
Cross language semantic relations between English and Portuguese
PPTX
Paraphrasing biomedical support verb constructions for machine translation
PDF
PPTX
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
PDF
Barreiro et al POP@PROPOR2018-informal2formal-language
PDF
Rebelo-Arnold et al POP@PROPOR2018-EP-BP-alignments
PPTX
Barreiro-Batista-LR4NLP@Coling2018-presentation
PPTX
Barreiro-Mota-VarDial@Coling2018-poster
PDF
Poster @ enetCollect CA MC meeting in Iasi, Romania
PDF
ReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software Tool
Análise comparativa das edições portuguesa e brasileira de Os livros que dev...
Welcome session 3rd Annual MC Meeting - enetCollect COST Action
Syntactic-semantic analysis for information extraction in biomedicine
Cross language semantic relations between English and Portuguese
Paraphrasing biomedical support verb constructions for machine translation
eSPERTo’s Paraphrastic Knowledge Applied to Question-Answering and Summarization
Barreiro et al POP@PROPOR2018-informal2formal-language
Rebelo-Arnold et al POP@PROPOR2018-EP-BP-alignments
Barreiro-Batista-LR4NLP@Coling2018-presentation
Barreiro-Mota-VarDial@Coling2018-poster
Poster @ enetCollect CA MC meeting in Iasi, Romania
ReEscreve: A Translator-Friendly Multi-Purpose Paraphrasing Software Tool

Recently uploaded (20)

PDF
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
PPTX
OMC Textile Division Presentation 2021.pptx
PDF
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
PPTX
Group 1 Presentation -Planning and Decision Making .pptx
PDF
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
PDF
Empathic Computing: Creating Shared Understanding
PDF
Agricultural_Statistics_at_a_Glance_2022_0.pdf
PPT
Teaching material agriculture food technology
PDF
MIND Revenue Release Quarter 2 2025 Press Release
PDF
Per capita expenditure prediction using model stacking based on satellite ima...
PPTX
Digital-Transformation-Roadmap-for-Companies.pptx
PPTX
Spectroscopy.pptx food analysis technology
PDF
A comparative analysis of optical character recognition models for extracting...
PDF
A comparative study of natural language inference in Swahili using monolingua...
PDF
Univ-Connecticut-ChatGPT-Presentaion.pdf
PDF
Diabetes mellitus diagnosis method based random forest with bat algorithm
PDF
Accuracy of neural networks in brain wave diagnosis of schizophrenia
PDF
NewMind AI Weekly Chronicles - August'25-Week II
PPTX
Tartificialntelligence_presentation.pptx
PDF
Machine learning based COVID-19 study performance prediction
Video forgery: An extensive analysis of inter-and intra-frame manipulation al...
OMC Textile Division Presentation 2021.pptx
Profit Center Accounting in SAP S/4HANA, S4F28 Col11
Group 1 Presentation -Planning and Decision Making .pptx
Blue Purple Modern Animated Computer Science Presentation.pdf.pdf
Empathic Computing: Creating Shared Understanding
Agricultural_Statistics_at_a_Glance_2022_0.pdf
Teaching material agriculture food technology
MIND Revenue Release Quarter 2 2025 Press Release
Per capita expenditure prediction using model stacking based on satellite ima...
Digital-Transformation-Roadmap-for-Companies.pptx
Spectroscopy.pptx food analysis technology
A comparative analysis of optical character recognition models for extracting...
A comparative study of natural language inference in Swahili using monolingua...
Univ-Connecticut-ChatGPT-Presentaion.pdf
Diabetes mellitus diagnosis method based random forest with bat algorithm
Accuracy of neural networks in brain wave diagnosis of schizophrenia
NewMind AI Weekly Chronicles - August'25-Week II
Tartificialntelligence_presentation.pptx
Machine learning based COVID-19 study performance prediction

Linguistic Evaluation of Support Verb Construction Translations by OpenLogos and Google Translate

  • 1. technology from seed LINGUISTIC EVALUATION OF SUPPORT VERB CONSTRUCTIONS BY OPENLOGOS AND GOOGLE TRANSLATE ANABELA BARREIRO INESC-ID KUTZ ARRIETA Oracle JOHANNA MONTI University of Sassari WANG LING CMU-IST BRIGITTE ORLIAC Logos Institute FERNANDO BATISTA INESC-ID, ISCTE-IUL SUSANNE PREUß Saarland University ISABEL TRANCOSO INESC-ID, IST Language Resources and Evaluation Conference 26-31 May, Reykjavik, Iceland
  • 2. • Introduction – Towards Hybrid Machine Translation – OpenLogos and Google Translate Models • Evaluation Task – Corpus and Datasets – Quantitative Results – Linguistic Evaluation Details • Current Work – Semantico-Syntactic Knowledge Integration into SMT • Conclusions and Future Work Outline 2
  • 3. • MT GOAL – researchers aim for robust MT systems that can produce high quality translations • CURRENT PROBLEMS – translations produced by widely used MT systems still show unfortunate errors that require significant post-editing effort – there is lack of periodical qualitative evaluation efforts involving MT systems of different nature – state-of-the-art quality metrics and estimation have been targeting human-factors tasks (post-editing time and effort), but NOT diagnosing fine-grained linguistic errors to improve syntactic structure and meaning Introduction 3
  • 4. • CURRENT TREND – produce systems that combine linguistic resources and analysis with statistical techniques that will lead to linguistically enhancing SMT models • OUR MOTIVATION – belief that an effective method to advance MT research is to bring different approaches together, comparing them and measuring which modules need improvement – to our knowledge, no major effort has been made to combine the strengths of different MT approaches with the purpose of overcoming known weaknesses on the basis of a joint linguistic evaluation of those weaknesses Introduction 4
  • 5. • OUR GOALS – advance hybrid MT, starting by understanding different approaches, their weaknesses and strengths – perform a systematic fine-grained linguistic analysis of the performance of individual models – The first exercise to achieve our goals is to evaluate the performance of RBMT and SMT when dealing with a very specific linguistic phenomenon: support verb constructions Introduction 5
  • 6. • A current trend in MT research is the creation of HMT models that combine linguistic knowledge with statistical techniques • HMT systems attempt to combine RBMT systems [Scott, 2003] with data-driven MT systems, such as phrase-based SMT [Koehn, 2007] • System combination often leads to improvements in translation quality, as different systems tend to address different translation challenges • it is still not obvious which HMT approach will be the most efficient one and will lead to higher quality translation in the long run Towards Hybrid Machine Translation 6
  • 7. • SMT models learn generalizations of the translation process using parallel corpora – they tend to perform better than RBMT when parallel corpora is abundant (English-Mandarin) – when parallel corpora is scarce (Spanish-Basque), they have insufficient data to learn generalizations [Labaka, 2007] • Morphologically rich languages require more data to learn accurate translations – SMT models for morphologically rich languages have been proposed [Chahuneau, 2013] – RBMT systems with manually-encoded morphology are an alternative for resource-poor languages Towards Hybrid Machine Translation 7
  • 8. • Some methods to combine RBMT with SMT: – combine the translations of the same text by two different systems [Eisele, 2008] [Heafield, 2011] – use data-driven techniques to improve RBMT systems [Eisele, 2008] uses phrase pair extraction in phrase-based SMT to extract phrasal translations used to improve the coverage of a RBMT system – a similar method using example-based MT for the same end has been proposed [Sanchez, 2009] – use statistical post-editing methods to improve RBMT translation quality [Elming, 2006] [Simard, 2007] [Dugast, 2007] [Terumasa, 2007] – use RBMT systems to enhance data-driven approaches. [Shirai, 1997] uses an example-based MT system [Brown, 1996] to create an initial translation template, and a RBMT system to translate individual words and phrases according to this template Towards Hybrid Machine Translation 8
  • 9. • is an open source copy of the commercial Logos System • addresses morphology, syntax, and semantics, has robust parsers, sets of semantico-syntactic rules, terminology sets and tools • pattern-based methodology – closer in spirit to the SMT approach with the advantage of including semantic knowledge/understanding • uses an intermediate language (SAL) to encode linguistic information and process text – SAL contributes to OpenLogos (OL) high quality translation and lessens one of the main problems in SMT (the sparseness in linguistic examples) • its linguistic knowledge databases have not been developed for over 10 years The OpenLogos Model 9
  • 10. • one of the most widely used online MT systems • this SMT system benefits from the large amount of parallel data that Google collects from the web – in March 2014, it was set to account for 80 language pairs • translation quality is highly dependent on the language pair, producing better results for close language pairs (Portuguese and Spanish) and languages for which large amounts of parallel data are available • closed system, however, no knowledge of semantic understanding is known to exist in Google Translate (GT) The Google Translate Model 10
  • 11. • sentences containing 100 support verb constructions (SVC) extracted from the news and Internet • SVC - multiword or complex predicate formed by a semantically weak verb, and a predicate noun/adjective/adverb [Barreiro, 2008] – make a presentation support verb make + predicate noun presentation – make it simple support verb make + predicate adjective simple Evaluation Task: Corpus 11
  • 12. • Why SVC? – studied systematically within the Lexicon-Grammar Theory • the scientific study of SVC eliminates subjectivity concerns for the evaluation task – occur abundantly in texts – recognized and processed computationally • in general and specific-purpose corpora • for several languages – most MT systems still fail at addressing the compositional aspect of multiword units • when translated incorrectly, SVC have a negative impact in the understandability and quality of translations Evaluation Task: Corpus 12
  • 13. • Why SVC? – SVC can be non-contiguous (the individual elements that compose the unit are placed apart in the sentence), with a smaller or greater number of inserts • An insert is any word in between elements of the multiword other than an article before a predicate noun • we are taking a growing interest in – non-contiguous SVC are extremely difficult to align in SMT, remaining one of the key cross-language challenges for MT Evaluation Task: Corpus 13
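The definition of an insert given on this slide can be illustrated with a minimal Python sketch. It assumes the sentence is already tokenized and that the support verb and the predicate word are known; the function name `find_inserts` and the article list are hypothetical, and every article between the two elements is skipped, which is a simplification (an article inside a longer inserted noun phrase would also be dropped).

```python
ARTICLES = {"a", "an", "the"}  # assumption: English articles only


def find_inserts(tokens, support_verb, predicate):
    """Return the words between the support verb and the predicate,
    leaving out articles (simplified reading of the slide's definition)."""
    lowered = [t.lower() for t in tokens]
    verb_pos = lowered.index(support_verb)
    # Look for the predicate only to the right of the support verb.
    pred_pos = lowered.index(predicate, verb_pos + 1)
    between = tokens[verb_pos + 1:pred_pos]
    return [t for t in between if t.lower() not in ARTICLES]


tokens = "we are taking a growing interest in".split()
print(find_inserts(tokens, "taking", "interest"))
```

An SVC with an empty insert list is contiguous in this simplified sense; a non-empty list marks it as non-contiguous.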
  • 14. Support Verb Construction Types in Our Corpus 14
  Nominal Support Verb Construction (NSVC): make a presentation
  Adjectival Support Verb Construction (ADJSVC): be meaningful
  Non-contiguous nominal (NON-CONT NSVC): have [ADV+ADJ-particularly good] links
  Prepositional nominal (PREPNSVC): give an illustration of
  Non-contiguous prepositional nominal (NON-CONT PREPNSVC): be the [ADJ-immediate] cause of
  Idiomatic nominal (IDIOM NSVC): set in motion, place at risk, go on strike
  Idiomatic prepositional nominal (IDIOM PREPNSVC): earn an income of
  Non-contiguous idiomatic nominal (NON-CONT IDIOM NSVC): hold [NP-the option] in place, be of [ADJ-practical] value
  Non-contiguous idiomatic prepositional nominal (NON-CONT IDIOM PREPNSVC): give [PRO-us] a [bird’s-eye] view of, be [ADV-clearly] at odds with, open talks [May 14] with
  • 15. Support Verb Construction Types in Our Corpus 15
  Nominal Support Verb Construction (NSVC): make a presentation
  Adjectival Support Verb Construction (ADJSVC): be meaningful
  Non-contiguous adjectival (NON-CONT ADJSVC): be [ADV-extremely] selective
  Prepositional adjectival (PREPADJSVC): be known as; be involved in
  Non-contiguous prepositional adjectival (NON-CONT PREPADJSVC): fall [ADV-so far] short of
  • 16. • Each SVC was annotated according to the SVC taxonomy • the SVC corpus was translated into FR, GE, IT, PT and ES, using the OL and the GT systems • native linguists evaluated the SVC translation quality for each target language and classified each translation according to a binary evaluation metric: – OK – ERR (agreement, morphologically-related or other problems, such as incorrect prepositions, wrong word order) • a comprehensive qualitative evaluation of mistranslations according to the different types of SVC was provided • neither system was trained for the task - texts were not domain specific Evaluation Task: Setup 16
  • 17. Quantitative Results 17
  Results for translation of the 100 SVC in our corpus for FR, GE, IT, PT, and ES with the OL and the GT MT systems:
  Lang. pair  System  OK  ERR  Agreem  Other
  EN-FR       GT      64  32   4       -
  EN-FR       OL      51  48   1       -
  EN-GE       GT      37  46   3       14
  EN-GE       OL      60  33   1       6
  EN-IT       GT      61  31   -       8
  EN-IT       OL      43  52   -       5
  EN-PT       GT      68  27   5       -
  EN-PT       OL      41  58   1       -
  EN-ES       GT      51  41   6       2
  EN-ES       OL      25  70   3       2
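As a reading aid for the table above, the following Python sketch recomputes which system has more correct translations per language pair and the mean OK count per system. The counts are copied from the slide; treating the "-" cells as 0 is an assumption, and the function names are illustrative only.

```python
# Counts copied from the results slide; "-" cells are read as 0 (assumption).
RESULTS = {
    ("EN-FR", "GT"): {"OK": 64, "ERR": 32, "Agreem": 4, "Other": 0},
    ("EN-FR", "OL"): {"OK": 51, "ERR": 48, "Agreem": 1, "Other": 0},
    ("EN-GE", "GT"): {"OK": 37, "ERR": 46, "Agreem": 3, "Other": 14},
    ("EN-GE", "OL"): {"OK": 60, "ERR": 33, "Agreem": 1, "Other": 6},
    ("EN-IT", "GT"): {"OK": 61, "ERR": 31, "Agreem": 0, "Other": 8},
    ("EN-IT", "OL"): {"OK": 43, "ERR": 52, "Agreem": 0, "Other": 5},
    ("EN-PT", "GT"): {"OK": 68, "ERR": 27, "Agreem": 5, "Other": 0},
    ("EN-PT", "OL"): {"OK": 41, "ERR": 58, "Agreem": 1, "Other": 0},
    ("EN-ES", "GT"): {"OK": 51, "ERR": 41, "Agreem": 6, "Other": 2},
    ("EN-ES", "OL"): {"OK": 25, "ERR": 70, "Agreem": 3, "Other": 2},
}


def best_system_per_pair(results):
    """Map each language pair to the system with the higher OK count."""
    best = {}
    for (pair, system), counts in results.items():
        current = best.get(pair)
        if current is None or counts["OK"] > results[(pair, current)]["OK"]:
            best[pair] = system
    return best


def mean_ok_by_system(results):
    """Average OK count per system across all language pairs."""
    totals, n = {}, {}
    for (_, system), counts in results.items():
        totals[system] = totals.get(system, 0) + counts["OK"]
        n[system] = n.get(system, 0) + 1
    return {system: totals[system] / n[system] for system in totals}


print(best_system_per_pair(RESULTS))
print(mean_ok_by_system(RESULTS))
```

Each row sums to 100, the size of the SVC corpus, so the OK counts can also be read directly as percentages.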
  • 18. • OL translates correctly more SVC than GT • incorrect translations (for both systems) concern: – word choice, incl. most prepositions - lexical (L) – word order, incl. incorrect clause segmentation - order (O) – word form, incl. choice between bare-infinitive and to-infinitive - morphology (M) – missing word, mainly auxiliary and main verb - ellipsis (E) • GT has more lexical, morphology and missing-word errors than OL • GT lexical coverage is poor with respect to contiguous SVC • GT does not translate the GE verb split well (even after reordering) Linguistic Evaluation EN-GE 18
  • 19. • GT translates correctly more SVC than OL • most translation errors by both systems involved: – incorrect lexical choice for some or all of the elements of the SVC (non-translation or literal translation) – wrong agreement (subject-verb, subject-predicate adjective) – non-contiguous and idiomatic SVC – less idiomatic SVC - problems with (i) prepositions; (ii) literal translation of the support verb and (iii) wrong lexical choice for the predicate noun – prepositions and determiner assignment, which require minor post-editing corrections (e.g., prepositional adjectival SVC) Linguistic Evaluation EN-FR/IT/PT/ES 19
  • 20. • In general, SVC problems by GT were more structural, while SVC problems by OL were more lexical • OL would easily translate contiguous and non-contiguous SVC correctly, provided they were added to its dictionary and rule DB • OL is able to resolve the SVC internal modifiers better than GT, which removes some meaning from the source in the translation • OL use of linguistic knowledge in its structural analysis is a powerful feature that can make OL performance for the Romance languages as satisfactory as that for GE • Higher quality translation can be achieved if we combine: • OL ability to translate different surface structures of a sentence • GT rich word selection powered by sophisticated statistical methods to extract knowledge from large volumes of parallel data Linguistic Evaluation: Conclusions 20
  • 21. • In the OL system, linguistic elements are represented in a semantico-syntactic abstraction language (SAL) with ontological properties • SAL represents the heart of OL, accounting for its effectiveness in parsing and semantic understanding [Scott, 2003] [Barreiro et al., 2011] [Barreiro et al., 2014] – http://www.l2f.inesc-id.pt/~abarreiro/openlogos-tutorial/INDEX.HTM • SAL is hierarchical, made up of supersets, sets and subsets • SAL knowledge is encoded in the lexicon, both in the dictionary entries and in the rules • Bilingual dictionaries with SAL knowledge are available at: – http://metanet4u.l2f.inesc-id.pt Proposal for Semantico-Syntactic Knowledge Integration into SMT 21 [Figure: example of the SAL hierarchy, from word class (nouns) to superset (concrete), set (functionals) and subsets (conduits, barriers, containers, …)]
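To make the superset/set/subset layering concrete, here is a minimal Python sketch of a SAL-like hierarchy as nested dictionaries. It only covers the branch named in the figure labels on this slide (nouns, concrete, functionals, conduits, barriers, containers); reading those labels as a single path is an assumption, and the structure and the `sal_path` helper are hypothetical, not the OpenLogos data format.

```python
# Hypothetical fragment of a SAL-like hierarchy:
# word class -> superset -> set -> subsets.
SAL_SAMPLE = {
    "nouns": {
        "concrete": {
            "functionals": ["conduits", "barriers", "containers"],
        },
    },
}


def sal_path(subset, hierarchy=SAL_SAMPLE):
    """Return (word class, superset, set, subset) for a subset, or None."""
    for word_class, supersets in hierarchy.items():
        for superset, sets in supersets.items():
            for set_name, subsets in sets.items():
                if subset in subsets:
                    return (word_class, superset, set_name, subset)
    return None


print(sal_path("barriers"))
```

A rule written against a higher level (for example the set) would then cover every subset beneath it, which is one way a hierarchy of this kind can reduce the number of examples needed.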
  • 22. • In OL, all NL input sentences are converted into SAL patterns, which represent the semantico-syntactic and morphological features of each word • SAL elements interact with semantico-syntactic rules called SEMTAB rules, which – represent the meaning of words on the basis of their association with other words (context) – disambiguate the meanings of words in the source text by identifying the syntactic structures underlying each meaning – provide the target language equivalents of each identified meaning of a source-language word – are conceptual and encode deep structure relations Proposal for Semantico-Syntactic Knowledge Integration into SMT 22
  • 23. • SEMTAB rules are called after dictionary look-up and during the execution of target transfer rules (TRAN rules) to solve ambiguity problems (verb dependencies) and multiwords, overriding the default dictionary transfer • When a sentence is being parsed by TRAN, OL sends the SAL patterns to the SEMTAB database to look for a rule match • If a rule exists for a linguistic string, TRAN uses that rule and overrides the dictionary transfer for that string Proposal for Semantico-Syntactic Knowledge Integration into SMT 23
  • 24. • A string can maintain the SVC structure or be paraphrased apply paint to PT: aplicar tinta a / pintar • The SEMTAB rule applies to different surface structures of the SVC and any insert specified in the rule they applied immediately red paint (immediately) to PT: aplicaram imediatamente tinta vermelha a Proposal for Semantico-Syntactic Knowledge Integration into SMT 24
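The look-up-and-override behaviour described on the last two slides can be sketched in Python as follows. This is a toy illustration under strong assumptions: it matches lemma strings rather than SAL patterns, the names `DICTIONARY`, `RULES`, `match_rule` and `transfer` are hypothetical, the default dictionary entry for "apply" is invented for contrast, and target word order is not modelled. It is not the OpenLogos implementation.

```python
from typing import List, Optional

# Hypothetical default word-level transfers (EN lemma -> PT).
DICTIONARY = {"apply": "candidatar-se", "paint": "tinta", "to": "a"}

# Hypothetical SEMTAB-like rules: ordered SVC elements -> transfer of the unit.
RULES = {("apply", "paint", "to"): "aplicar tinta a"}


def match_rule(lemmas: List[str], elements) -> Optional[List[str]]:
    """Return the inserts if `elements` occur in order in `lemmas`, else None."""
    inserts, matched, started = [], 0, False
    for lemma in lemmas:
        if matched < len(elements) and lemma == elements[matched]:
            matched += 1
            started = True
        elif started and matched < len(elements):
            # Word sitting between two elements of the rule: an insert.
            inserts.append(lemma)
    return inserts if matched == len(elements) else None


def transfer(lemmas: List[str]) -> dict:
    """Use a matching rule if one exists; otherwise fall back to the dictionary."""
    for elements, target in RULES.items():
        inserts = match_rule(lemmas, elements)
        if inserts is not None:  # an empty list still counts as a match
            return {"source": "rule", "svc": target, "inserts": inserts}
    words = " ".join(DICTIONARY.get(lemma, lemma) for lemma in lemmas)
    return {"source": "dictionary", "svc": words, "inserts": []}


print(transfer(["apply", "paint", "to"]))
print(transfer(["they", "apply", "immediately", "red", "paint", "to"]))
```

The second call shows the same rule firing on a non-contiguous surface form, with "immediately" and "red" reported as inserts that would still need their own transfer and placement in the target sentence.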
  • 25. • As long as the SEMTAB rule exists in the database, OL can process and translate correctly all the SVC in our corpus that OL and GT translated incorrectly • The OL method can overcome the structural problems presented by SMT, not only for contiguous but also for non-contiguous SVC, independently of how remotely their elements occur in the sentence • The OL methodology applies to any type of multiword and allows the translation of other context-sensitive challenges Proposal for Semantico-Syntactic Knowledge Integration into SMT 25
  • 26. • Multiwords (SVC) are responsible for most translation errors – researchers need to develop approach-independent systematic linguistic quality evaluation metrics with phased error categorization tasks where specific linguistic phenomena (such as SVC) can be evaluated individually in stages by MT expert linguists • fine-grained error categorization can contribute to more controlled and systematic evaluation tasks • evaluation needs to target each group of linguistic errors and identify which system has more difficulties translating each type of linguistic challenge (paradigmatic evaluation) Conclusions and Future Work 26
  • 27. • evaluation tasks require the construction of corpora to test grammatical correctness addressing individual linguistic phenomena – different types of multiwords, relative constructions, passives, pronouns, determiners, locative prepositions, etc. • TOWARDS HYBRIDIZATION – the question “how effectively can rule-based and statistical MT be combined?” can only be answered after linguistic quality evaluation metrics are developed and validated by the MT community • no effective hybridization can take place before linguistic evaluation of the results provided by different approaches is successfully accomplished Conclusions and Future Work 27
  • 28. Thank you! This research was supported by FCT (Fundação para a Ciência e a Tecnologia) through grant SFRH/BPD/91446/2012 and project PEst-OE/EEI/LA0021/2013.