CONTENT WRITING OPTIMIZATION WITH REWRITER
Anabela Barreiro
ab@metatrad.com
Translating and the Computer Conference 2010
Anabela Barreiro London, 18-19 November 2010
OUTLINE
▪ LINGUISTIC QUALITY ASSURANCE
▪ HOW TO ACHIEVE WRITING OPTIMIZATION
▪ REWRITER
▪ PARAPHRASES COVERED BY REWRITER
▪ REWRITER V0.1 AND V0.2 – INTERFACE AND MODUS OPERANDI
▪ LINGUISTIC RESOURCES USED BY REWRITER
▪ EVALUATION RESULTS
▪ TRANSFORMATIONS FOR THE NEAR FUTURE
▪ WHO COULD BENEFIT FROM REWRITER?
▪ FROM REWRITER TO MACHINE TRANSLATION
▪ IMPROVEMENT OF REWRITER
LINGUISTIC QUALITY ASSURANCE
▪ Ensures that texts meet a guaranteed level of linguistic quality
▪ Uses techniques to verify the quality of specialized documentation
▪ Helps individuals and businesses connect with their target audience in a clear and understandable way by achieving:
  • strategic, high-quality, meaningful content
  • relevant content written and optimized for a particular purpose (or a particular business)
  • custom content (relevant keywords and adequate terminology)
  • original texts (creative writing) or domain-specific texts (technical writing)
  • readable, publishable content
LINGUISTIC QUALITY ASSURANCE
▪ Professional text writing and revision
  • Orthographic, grammatical, stylistic and terminological verification of technical documents
  • Use of tools
    ▪ Spell and grammar checkers
    ▪ Writing/authoring aids
    ▪ Style guides
    ▪ Terminologies (technical domains)
▪ Controlled language
  • Consistent, direct, and simple language
  • Restricted grammar (certain types of construction are avoided)
  • No complex reasoning, figures of speech, metaphors, etc.
  • Elimination of wordiness
HOW TO ACHIEVE WRITING OPTIMIZATION
▪ Synonymy
▪ Stylistics
▪ PARAPHRASING
  • effective in helping writers with writing difficulties or learners of a non-native language (pedagogical exercise)
  • an essential component of the translation process, and of great value to machine translation pre-editing
  • can be employed in linguistic quality assurance for both source and target texts
For the past few years, researchers have been working on automated paraphrasing, in response to commercial demand for paraphrases in text processing tools, authoring aids, learning tools, etc.
REWRITER – PARAPHRASING TOOL
▪ Authoring aid (word processing applications)
▪ Language composition tool
▪ Text production and style editor
▪ Empirical testbed for linguistic quality assurance (source and target texts)
▪ Text (pre-)editing (machine translation)
▪ “Revision memory” tool (≈ “translation memory”)
▪ Applicable to general and technical language (e.g. student texts or legal texts)
Portuguese version “ReEscreve”: publicly available service at http://www.linguateca.pt/ReEscreve/
Soon to be integrated into a cyber school project, “Ciberescola”
PARAPHRASES COVERED BY REWRITER
▪ Synonyms in context (e.g. phrasal verbs into equivalent expressions)
  to clear up (weather) = (weather) to become better/brighter
▪ Support verb constructions into single verbs
  to make a decision = to decide
  to make a presentation of = to present
  to give support to N(AN) = to support N(AN)
  to go on V-ing = to continue V-ing
  to get into contact with = to contact
  to turn off N(light) = to extinguish N(light)
  to become acid = to acidify
▪ Support verb constructions into their stylistic variants
  to make an audit = to perform an audit
  to make an impression = to create an impression
▪ Aspectual constructions into single verbs
  to launch an attack = to attack
▪ Adverbials (compounds into single adverbs)
  in a constructive way = constructively
  on purpose > purposely = deliberately
▪ Relatives into participial adjectives
  the president that was elected = the president elect
▪ Relatives into possessives
  the role that Europe plays/has = the role of Europe
  the position that the Church defends = the position of the Church
▪ Relatives into compound nouns (and vice-versa)
  a container for the milk = a milk container
  a bottle made of plastic = a plastic bottle
▪ Agentive passives into actives
  the young man is released by the police officer = the police officer releases the young man
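The support verb examples above can be sketched as a small lookup-and-replace routine. This is only an illustration, not ReWriter's implementation: the rule table, its tuple layout and the function name `paraphrase_svc` are assumptions made for the example, and the pattern only covers the infinitive form shown on the slides.

```python
import re

# Hypothetical rule table built from the slide examples:
# (support verb, predicate noun, optional preposition) -> single verb.
SVC_RULES = {
    ("make", "decision", None): "decide",
    ("make", "presentation", "of"): "present",
    ("give", "support", "to"): "support",
    ("launch", "attack", None): "attack",
}


def paraphrase_svc(text: str) -> str:
    """Replace 'to <verb> [a/an] <noun> [prep]' with 'to <single verb>'."""
    for (verb, noun, prep), single in SVC_RULES.items():
        # Optional article between the support verb and the predicate noun.
        pattern = rf"\bto {verb} (?:an? )?{noun}"
        if prep:
            pattern += rf" {prep}"
        pattern += r"\b"
        text = re.sub(pattern, f"to {single}", text)
    return text


print(paraphrase_svc("We need to make a decision today."))
print(paraphrase_svc("They plan to give support to the team."))
```

A string pattern like this misses inflected forms ("made a decision") and inserted modifiers ("make a quick decision"), which is one reason a tool of this kind would rely on dictionaries and grammars instead.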
REWRITER (V0.1)
INTERFACE FOR THE PUBLIC SERVICE
Interactive ReWriter for word processing applications, such as text editing
REWRITER (V0.1)
SUGGESTIONS FOR EXAMPLE SENTENCES
Suggestions for general-language linguistic phenomena:
Compound adverbs > single adverbs
Support verb constructions > single verbs
Relatives > participial adjectives
REWRITER (V0.1)
USER’S CAPABILITY TO ADD NEW REWRITING OPTIONS
The user can suggest new words or expressions (synonyms or paraphrases)
REWRITER (V0.2)
COMMERCIAL INTERFACE - PROTOTYPE
Users can select among general and technical dictionaries (more than one can be selected) and among grammars for specific linguistic transformations (one, several or all can be selected). The interface provides sample texts for testing.
[Screenshot: sample LEGAL text, with informative details about the linguistic resources selected]
REWRITER (V0.2)
COMMERCIAL INTERFACE - PROTOTYPE
Identification of legal terms in the text
Suggestions for the term “breach of law”
Users can select one term from the list of suggestions or provide a new suggestion
REWRITER (V0.2)
COMMERCIAL INTERFACE - PROTOTYPE
Text rewritten
• In red, the expressions in the source text
• In green, suggestions provided by ReWriter and selected by the user
The user can go back and change the selected option as many times as necessary
LINGUISTIC RESOURCES USED BY REWRITER
▪ Eng4NooJ – linguistic knowledge database
  • OpenLogos dictionary (http://logos-os.dfki.de/); the commercial version belongs to GROUP Business Software, Germany
  • converted into NooJ format and enhanced with new properties, including derivational, morpho-syntactic and semantic relations
▪ Allows linguistic annotation and processing of corpora
▪ Empirical testbed to support theoretical linguistics research
▪ Basis for language technology applications, including machine translation between several languages
▪ Sample of Dictionary of Legal Terms
[Screenshot: sample of terms classified as Information + Instructional/legal]
LINGUISTIC RESOURCES USED BY REWRITER
Morpho-syntactic and semantic relations
Rules to transform morpho-syntactically and semantically related words of different parts of speech:
  NDRV04 = <B>ion/Npred+Nom
  ADRV02 = <B>icable
  AVDRV01 = <E>ly/ADV
  AVDRV04 = <B>tically/ADV
General language dictionary entries:
  impress,V+FLX=POLISH+SAL=PVPCpleasetype+PT=impressionar+DRV=NDRV01:BOOK+VSUP=make+VSUP=cause+NPREP=on
  aesthetic,A+FLX=NATURAL+SAL=AVstate+PT=aesthetically+DRV=AVDRV03
  skepticism,N+FLX=BOOK+SAL=ABcause+PT=cepticismo+DRV=NAVDRV02
[Figure: grammar to recognize adverbial compounds and transform them into equivalent single adverbs]
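To make the layout of the dictionary entries easier to follow, here is a minimal sketch that splits one entry into lemma, category and properties. It assumes only what the sample lines show (a comma after the lemma, then `+`-separated properties, some of them repeated, such as `VSUP`). The `Entry` class and `parse_entry` function are hypothetical names and are not part of NooJ.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Entry:
    lemma: str
    category: str
    # A property may occur several times (e.g. VSUP=make+VSUP=cause),
    # so each key maps to a list of values.
    features: Dict[str, List[str]] = field(default_factory=dict)


def parse_entry(line: str) -> Entry:
    """Split 'lemma,CAT+KEY=VALUE+...' into its parts."""
    lemma, _, rest = line.partition(",")
    category, *pairs = rest.split("+")
    features: Dict[str, List[str]] = {}
    for pair in pairs:
        key, _, value = pair.partition("=")
        features.setdefault(key, []).append(value)
    return Entry(lemma, category, features)


entry = parse_entry(
    "impress,V+FLX=POLISH+SAL=PVPCpleasetype+PT=impressionar"
    "+DRV=NDRV01:BOOK+VSUP=make+VSUP=cause+NPREP=on"
)
print(entry.lemma, entry.category, entry.features["VSUP"], entry.features["NPREP"])
```

Read this way, the entry for "impress" records two support verbs (make, cause) and the noun's preposition (on), which appears to be the information needed to relate "make an impression on" to "impress".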
EVALUATION RESULTS: PARAPHRASING PRECISION
Evaluation of recognition and paraphrasing of support verb constructions (SVCs)
Corpus of fiction: 500 sentences, 100 for each of 5 elementary support verbs

Support verb   SVC recognition precision   SVC recognition recall   SVC paraphrasing precision
Pôr            73/73 - 100%                73/100 - 73%             72/73 - 98.6%
Tomar          75/75 - 100%                75/100 - 75%             68/73 - 93.1%
Ter            65/65 - 100%                65/100 - 65%             59/65 - 90.7%
Dar            57/60 - 95%                 57/100 - 57%             46/51 - 90.1%
Fazer          43/45 - 95.5%               43/100 - 43%             40/45 - 88.8%
Average        62.6/63.6 - 98.4%           62.6/100 - 62.6%         57/61 - 93.4%
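As a worked check of how such figures are derived, the sketch below recomputes precision and recall by pooling the raw counts printed in the table. The tuple layout and the function name are assumptions made for this example. It also assumes that the recall denominator is the 100 sentences per verb, as the table suggests.

```python
# Counts as printed in the table, per support verb:
# (correctly recognized, recognized, sentences, correct paraphrases, paraphrased)
COUNTS = {
    "Pôr": (73, 73, 100, 72, 73),
    "Tomar": (75, 75, 100, 68, 73),
    "Ter": (65, 65, 100, 59, 65),
    "Dar": (57, 60, 100, 46, 51),
    "Fazer": (43, 45, 100, 40, 45),
}


def pooled_scores(counts):
    """Pool raw counts over all verbs.

    Returns (recognition precision, recognition recall, paraphrasing precision).
    """
    rec_ok = sum(c[0] for c in counts.values())
    rec_all = sum(c[1] for c in counts.values())
    sentences = sum(c[2] for c in counts.values())
    par_ok = sum(c[3] for c in counts.values())
    par_all = sum(c[4] for c in counts.values())
    return rec_ok / rec_all, rec_ok / sentences, par_ok / par_all


precision, recall, paraphrasing = pooled_scores(COUNTS)
print(f"{precision:.1%} {recall:.1%} {paraphrasing:.1%}")  # 98.4% 62.6% 92.8%
```

Pooling gives 313/318 and 313/500 for recognition, matching the "Average" row. For paraphrasing it gives 285/307 (92.8%), slightly below the 93.4% printed in that row, which divides 57 by a denominator rounded to 61.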
EVALUATION RESULTS: IMPACT ON TRANSLATABILITY (MT)
Same corpus, 50 sentences selected randomly
(i) support verb constructions were automatically pre-processed with ReWriter and converted into equivalent single verbs
(ii) the pre-processed sentences (automatically generated paraphrases) and the original sentences were submitted to MT, and the output translations of both were compared
• In 29 cases (58%), the best translation was that of the automatically generated paraphrase
• In 9 cases (18%), the best translation was that of the original support verb construction
• In 12 cases (24%), the two translations were equally good or equally bad
CONCLUSION
The experiment indicates that paraphrases such as those generated by ReWriter help improve translation quality
• The automated paraphrasing of support verb constructions through ReWriter allowed a significant improvement in the quality of the MT results in that context
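The percentages above follow directly from the three totals, and a simple sign test gives a rough idea of how lopsided the comparison is. The sketch below is an illustration only. The slides report no statistical test, so the sign test is an assumption added here, and the per-sentence judgments are reconstructed from the reported totals, since the slides do not list them individually.

```python
from collections import Counter
from math import comb


def preference_shares(judgments):
    """Percentage of sentence pairs falling under each judgment label."""
    counts = Counter(judgments)
    total = len(judgments)
    return {label: 100 * n / total for label, n in counts.items()}


def sign_test_p(wins_a, wins_b):
    """Two-sided sign test p-value, ignoring ties (assumed test, not from the slides)."""
    n = wins_a + wins_b
    k = max(wins_a, wins_b)
    tail = sum(comb(n, i) for i in range(k, n + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Labels rebuilt from the reported totals: 29 / 9 / 12 out of 50.
judgments = ["paraphrase"] * 29 + ["original"] * 9 + ["tie"] * 12
shares = preference_shares(judgments)
p_value = sign_test_p(29, 9)
print(shares, p_value)
```

With ties set aside, 29 of the 38 decided cases favor the paraphrase. Under this assumed test, that split is unlikely to arise by chance.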
TRANSFORMATIONS FOR THE NEAR FUTURE
[Popular versus technical terms]
around the orbit of the eye ≡ periorbital
[If clauses]
if it is necessary ≡ if necessary
[Passives into actives - whenever suitable]
That book was written by Saramago in 2008 > Saramago wrote that book in 2008
Florida was hit by a tornado > A tornado hit Florida
[Coordinated noun phrases - conjoining or disjoining]
linguistic resources for teaching and for research > linguistic resources for teaching and research
[Subjunctive clauses - into infinitives]
we ask the favor that you confirm your attendance > we ask the favor to confirm your attendance
[Marked-up constructions]
if the end-user need is to create controlled language text > if the end-user needs to create controlled language text
[Vague and undefined or null subject sentences] (whenever the real subject/actor is known)
[-] there was shouting in the street > [N-PRON]/someone shouted in the street
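Two of the simpler transformations listed above, the reduced if-clause and the conjoined noun phrase, can be approximated with pattern rules. This is a rough sketch under that assumption, not the grammar formalism ReWriter uses. The rule list and the function name `rewrite` are hypothetical, and passives or null-subject sentences would need real syntactic analysis.

```python
import re

# Hypothetical pattern rules for two of the planned transformations.
RULES = [
    # [If clauses] "if it is necessary" -> "if necessary" (keeps the case of "if").
    (re.compile(r"\b(if) it is (necessary)\b", re.IGNORECASE), r"\1 \2"),
    # [Coordinated noun phrases] "for X and for Y" -> "for X and Y".
    (re.compile(r"\b(for \w+) and for (\w+)\b"), r"\1 and \2"),
]


def rewrite(text: str) -> str:
    """Apply each rule in order and return the rewritten text."""
    for pattern, replacement in RULES:
        text = pattern.sub(replacement, text)
    return text


print(rewrite("If it is necessary, restart the process."))
print(rewrite("linguistic resources for teaching and for research"))
```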
WHO COULD BENEFIT FROM REWRITER?
▪ Writers in general (searching for synonyms and paraphrases)
▪ Technical writers (searching for the “exact” term)
▪ Editors
▪ MT pre-editors
▪ Translators
▪ Learners of a second language
▪ Students learning language and writing skills
FROM REWRITER TO MACHINE TRANSLATION
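Portuguese support verb constructions with the suggested English single verb after the slash (literal English glosses in brackets):
a fazer um estágio para dar aulas de / tutor Religião [doing a traineeship to give classes in Religion]
a fazer um estágio para dar aulas de / lecture Religião [same sentence, alternative verb]
a fazer um estágio para dar aulas de / teach Religião [same sentence, alternative verb]
começa a dar exemplos / exemplify : [starts to give examples:]
sentia-se capaz de dar um murro em / punch quem quisesse detê-lo [he felt capable of giving a punch to whoever tried to stop him]
gostávamos de lhe dar uma palavrinha / speak . [we would like to have a quick word with him/her/you.]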
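The examples above pair a Portuguese support verb construction with one or more English single verbs. A minimal sketch of that pairing as a lookup follows, assuming a plain substring match. The table and the `annotate` function are hypothetical and only hold the four constructions shown on the slide.

```python
# Hypothetical table taken from the slide examples:
# Portuguese support verb construction -> candidate English single verbs.
PT_SVC_TO_EN = {
    "dar aulas de": ["tutor", "lecture", "teach"],
    "dar exemplos": ["exemplify"],
    "dar um murro em": ["punch"],
    "dar uma palavrinha": ["speak"],
}


def annotate(sentence: str):
    """Return (construction, candidate verbs) pairs found in the sentence."""
    lowered = sentence.lower()
    return [(svc, verbs) for svc, verbs in PT_SVC_TO_EN.items() if svc in lowered]


print(annotate("a fazer um estágio para dar aulas de Religião"))
print(annotate("gostávamos de lhe dar uma palavrinha."))
```

A surface match like this misses inflected forms such as "deu aulas de", so a fuller version would match on lemmas taken from a dictionary.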
IMPROVEMENT OF REWRITER
▪ Writing (pedagogical) exercises: students learning how to improve their writing skills in a native or foreign language
▪ Professional writers and translators using (and testing) the tool, marking [informal], [idiomatic], [slang] and other uses of the terminology
▪ Detection of errors (words that are not synonyms, or not synonyms in a particular context)
▪ Definition of linguistic rules to improve precision in specific contexts (e.g. [bring(vt) N(charge; action) > present(vt) N(idem)])
▪ Inclusion of “revision memories” (recycling validated, reviewed sentences, structures or phrases)
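The context-restricted rule and the "revision memory" idea can be sketched together. This is an illustration under stated assumptions, not ReWriter's design: the rule is encoded as a regular expression limited to the object nouns named on the slide, and the revision memory is a plain dictionary from an original segment to its validated revision. All names here are hypothetical.

```python
import re

# Hypothetical encoding of [bring(vt) N(charge; action) > present(vt) N(idem)]:
# "bring" becomes "present" only when its object is one of the listed nouns.
ALLOWED_OBJECTS = ("charge", "charges", "action", "actions")
RULE = re.compile(r"\bbring ((?:an? |the )?(?:%s))\b" % "|".join(ALLOWED_OBJECTS))


def apply_rule(text: str) -> str:
    """Rewrite 'bring' as 'present' in the restricted context only."""
    return RULE.sub(r"present \1", text)


class RevisionMemory:
    """Stores validated revisions so that they can be suggested again."""

    def __init__(self):
        self._store = {}

    def remember(self, original: str, revised: str) -> None:
        self._store[original] = revised

    def suggest(self, segment: str):
        # Returns None when the segment has never been revised.
        return self._store.get(segment)


memory = RevisionMemory()
source = "to bring an action against the company"
memory.remember(source, apply_rule(source))
print(memory.suggest(source))
print(apply_rule("to bring a book"))
```

Limiting the rule to listed object nouns is what keeps "bring a book" untouched, which is the gain in precision the rule is meant to provide.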
To learn more on paraphrases… Soon to be released!
Anabela Barreiro
ab@metatrad.com



Editor's Notes

  • #2: Good afternoon! I’m Anabela Barreiro and I’m presenting ReWriter, a paraphrasing tool designed to help with content writing optimization.
  • #17: The system includes several dictionaries. The structure of the dictionary is XXX