Better Translation Technology
Andrzej Zydroń, CTO XTM International
DITA Localization
In the beginning
Technical documentation was without form, and darkness was upon the face of
the page:
– Manual typesetting
– RTF
– WordPerfect
– MS Word
– FrameMaker
– Ventura Publisher
– PageMaker
– SGML
In the beginning
Lack of standards
•Proprietary solutions
•Problems with character encoding
•Expensive to design
•Expensive to build
•Expensive to maintain
•Expensive to localize
Along came XML
Let there be light:
– XML born in 1996-1998 from SGML/HTML (W3C Recommendation: February 1998)
– Review of lessons learned from SGML
– Easier to implement
– Removed unnecessary complexity
– Declared standard encoding - Unicode
DITA
Standards, Standards, Standards
DITA: the advent of standards in technical documentation
DITA is not perfect!
DITA - the good
Extremely well thought out XML document architecture:
– modularity
– fine level of granularity
– reuse
– bookmap
– standardized elements
– Write once, translate once, reuse many times
– Multiple output formats, multiple places, multiple docs:
• PDF, HTML, mobile, web, paper etc.
DITA Localization
Practical considerations:
– Controlled Authoring:
• Consistency
• Terminology
– Delivery for localization:
• All at once in one big heap
• JIT - individual topics when ready
– Translation Consistency:
• Translation Memory
• Terminology
DITA Localization - the good
Modularity:
– Translate a topic once
– Reuse many times!
• No need to retranslate
– Just in time translation
• Translate as soon as source is ready
• Dramatic improvement in time to market
• All documentation in all languages is ready concurrently
DITA Localization - the good
• Decide how you want to translate:
– Whole document as one using bookmap
– Individual topics navigated according to bookmap
– Individual topics as and when ready
• Handling last minute engineering changes
– JIT translation
– Many TMSs do not handle this well
– Automatically update already-translated segments
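The handling of last-minute changes can be pictured as an exact-match lookup against a translation memory: segments whose source text is unchanged keep their existing translation, and only new or edited segments go back to the translator. What follows is a minimal sketch under stated assumptions, not a description of any particular TMS. The in-memory dictionary, the naive sentence splitter, and the sample sentences are all hypothetical; real systems use proper segmentation rules and fuzzy matching.

```python
import re

def segment(text):
    """Naive sentence splitter; a stand-in for real segmentation rules."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def pretranslate(topic_text, memory):
    """Reuse exact matches from the memory and queue everything else."""
    translated, pending = [], []
    for seg in segment(topic_text):
        if seg in memory:
            translated.append(memory[seg])   # unchanged segment: reuse
        else:
            translated.append(None)          # changed or new: needs a translator
            pending.append(seg)
    return translated, pending

# Hypothetical English -> French memory built during the first translation pass.
memory = {
    "Open the cover.": "Ouvrez le capot.",
    "Remove the filter.": "Retirez le filtre.",
}

# Engineering edits one sentence shortly before release.
revised = "Open the cover. Remove the filter carefully."
translated, pending = pretranslate(revised, memory)
print(translated)
print(pending)
```

Only the edited sentence lands in `pending`; the untouched one is filled in from the memory without being sent out again.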
DITA Localization - the <bad/><ugly/>
The bad and downright ugly (the three villains!):
– Word Substitution
• CONREF
• KEYREF
• DITAVAL
– Specialization
– Conditional processing
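To make the conditional-processing villain concrete, here is a rough sketch of what a DITAVAL exclude rule does to phrase-level content. The `<val>`/`<prop att=… val=… action="exclude">` markup follows the DITAVAL format; the paragraph, the `product` values, and the helper functions are hypothetical, and the filter is deliberately simplified (single attribute values only, no flagging).

```python
import xml.etree.ElementTree as ET

# DITAVAL rule: exclude content flagged product="pro".
DITAVAL = '<val><prop att="product" val="pro" action="exclude"/></val>'

# Hypothetical paragraph using phrase-level conditions.
PARAGRAPH = (
    '<p>Connect the <ph product="basic">cable</ph>'
    '<ph product="pro">wireless adapter</ph> to the device.</p>'
)

def excluded_pairs(ditaval_xml):
    """Collect (attribute, value) pairs marked action="exclude"."""
    root = ET.fromstring(ditaval_xml)
    return {
        (prop.get("att"), prop.get("val"))
        for prop in root.iter("prop")
        if prop.get("action") == "exclude"
    }

def filter_text(xml_text, excluded):
    """Remove excluded elements and return the remaining plain text."""
    root = ET.fromstring(xml_text)
    for parent in list(root.iter()):
        previous = None
        for child in list(parent):
            if any(child.get(att) == val for att, val in excluded):
                # Keep the text that followed the removed element.
                tail = child.tail or ""
                if previous is None:
                    parent.text = (parent.text or "") + tail
                else:
                    previous.tail = (previous.tail or "") + tail
                parent.remove(child)
            else:
                previous = child
    return "".join(root.itertext())

result = filter_text(PARAGRAPH, excluded_pairs(DITAVAL))
print(result)
```

The English output reads cleanly whichever phrase survives. A translator, however, receives "Connect the", two interchangeable noun fragments, and "to the device" as separate pieces, and in many languages the article or case ending would have to change with the noun that is kept.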
DITA: square peg, round hole
• Do not try to force DITA to do what it was not designed for!
• DITA = Modular technical documentation
• Small, discrete topics
• No more than one page of text per topic
• Use the Open Toolkit
• Do not get overambitious with substitutions
– What works for English and Mandarin will not work for most other languages
DITA: Object Oriented Documentation
• DITA is an attempt to use OO design for XML documentation
• Very tempting for computer scientists
• We did it for computer programming
• Why not documentation?
• Problems arise with the nature of documentation
• Problems arise with the nature of human language
Language – why humans mess things up!
What language is this?
What is he saying?
Understanding the nature of English
• Why is English different from most other languages?
• English is a fusion language: in effect, a creole
– Roughly 60% Old English (Anglo-Saxon) + 40% French
• Other creoles with a high number of speakers:
– French (Vulgar Latin + Frankish)
– Swahili (Bantu + Arabic)
– Urdu (Hindi + Arabic)
– Mandarin
• (Many Sino-Tibetan languages)
Understanding the nature of English
• Primitive morphology
– Nouns:
• Singular, plural, possessive
– ship, ships, ship’s, ships’
– No gender
• a ship, the ship, the ships
– No adjectival agreement
• green ship, green ships
• We can substitute nouns and noun phrases without causing grammatical errors
• This is not true of most other languages
• English does not work like most other languages
• Your documentation WILL be translated sooner or later
DITA Localization
Avoid word substitution (CONREF, KEYREF, DITAVAL):
– Linguistic issues
– Adjectival agreement
– Grammatical case
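• Presenting the new Ford <keyword keyref="model"/> for 2014.
– very bad idea!
• English key values: Focus, Fiesta, Mondeo
• Polish, nominative (the adjective must agree in gender): Nowy Focus, Nowa Fiesta, Nowe Mondeo
• Getting the agreement wrong is akin to saying ‘Presenting the Ford new Focus’
• Polish, instrumental case (the forms change again): Nowym Focus’em, Nową Fiestą, Nowym Mondeo
– May work for alphanumeric names (for example, model numbers)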
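The agreement problem can be reproduced in a few lines. The sketch below uses only the Polish forms given on the slide; the template strings and the `substitute` helper are hypothetical stand-ins for what a keyref resolver does after translation.

```python
# Correct Polish forms from the slide: the adjective "new" agrees with
# the grammatical gender of each model name.
CORRECT_PL = {
    "Focus": "Nowy Focus",    # masculine
    "Fiesta": "Nowa Fiesta",  # feminine
    "Mondeo": "Nowe Mondeo",  # neuter
}

def substitute(template, model):
    """Stand-in for keyref resolution: drop the key value into the text."""
    return template.format(model=model)

# English needs one template for every key value.
EN_TEMPLATE = "new {model}"

# The translator can only write one form around the placeholder,
# here the masculine one that suits "Focus".
PL_TEMPLATE = "Nowy {model}"

mismatches = [
    model for model, correct in CORRECT_PL.items()
    if substitute(PL_TEMPLATE, model) != correct
]

for model in CORRECT_PL:
    print(substitute(EN_TEMPLATE, model), "|", substitute(PL_TEMPLATE, model))
print("Ungrammatical after substitution:", mismatches)
```

Two of the three key values produce incorrect Polish, and no single translated template can fix that, because the words around the placeholder depend on the value that is substituted.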
DITA Localization
Only use substitution for linguistically complete sentences
– Warnings
– Cautions
– Notes
Avoid substitution for individual words or noun phrases
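By contrast, reusing a whole warning is linguistically safe, because the translator sees a complete sentence and the reference simply pulls in its translated counterpart. The sketch below resolves a `conref` of the form `file#topicid/elementid` against an in-memory library. The file name, ids, sample text, and the `resolve_conrefs` helper are hypothetical, and the resolver is simplified: it copies text and attributes only, not nested markup.

```python
import xml.etree.ElementTree as ET

# Hypothetical library topic holding reusable, complete-sentence warnings.
LIBRARY = """<topic id="warnings"><title>Reusable warnings</title><body>
<note id="hot_surface" type="warning">Do not touch the surface while it is hot.</note>
</body></topic>"""

# Hypothetical topic that reuses the warning by reference.
TOPIC = """<topic id="cleaning"><title>Cleaning</title><body>
<note conref="warnings.dita#warnings/hot_surface"/>
<p>Wipe the surface with a damp cloth.</p>
</body></topic>"""

def resolve_conrefs(topic_xml, sources):
    """Replace each conref with the text and attributes of its target."""
    root = ET.fromstring(topic_xml)
    for element in root.iter():
        ref = element.attrib.pop("conref", None)
        if ref is None:
            continue
        filename, _, fragment = ref.partition("#")
        topic_id, _, element_id = fragment.partition("/")
        source_root = ET.fromstring(sources[filename])
        if source_root.get("id") != topic_id:
            raise ValueError("topic id not found: " + topic_id)
        target = next(
            e for e in source_root.iter() if e.get("id") == element_id
        )
        element.text = target.text
        for name, value in target.attrib.items():
            if name != "id":
                element.set(name, value)
    return root

resolved = resolve_conrefs(TOPIC, {"warnings.dita": LIBRARY})
note = resolved.find("body/note")
print(note.get("type"), "-", note.text)
```

Passing the translated library topic instead of the English one yields the translated warning with no grammatical side effects, since nothing outside the sentence depends on it.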
Specialization
• Specialize at your peril!
– A double edged sword
• Exponentially increases the difficulty of:
– Authoring
– Publishing
– Localization
• New elements/attributes
– How are they to be treated?
– For localization: completely new document type
DITA and OAXAL
• OAXAL - Open Architecture for XML Authoring and Localization
• DITA Authoring and Localization in a Standards context:
– DITA is an Open Standard
– Why use proprietary software for Authoring and Localization of DITA?
OAXAL
http://wiki.oasis-open.org/oaxal/FrontPage
OAXAL Stack
OAXAL Interaction
OAXAL Source Lifecycle
OAXAL Translation Lifecycle
DITA Localization - considerations
• Choosing the right TMS/CAT System
– Can it handle XML properly:
• Entity references e.g. ‘&amp;’
• Encoding
• Validation
– Does it understand DITA
– Does it understand ditamap/bookmap
– Can you navigate using the bookmap
– Can it handle specialization
– Does it handle JIT
– Can it handle last minute changes
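The first items on this checklist (entity references and encoding) can be spot-checked with a simple round trip: extract the text, replace it with a translation, serialize, and confirm the result is still well-formed XML. The sketch below uses Python's standard `xml.etree.ElementTree`, which checks well-formedness only and does not validate against a DTD or schema. The sample paragraph and its French translation are hypothetical.

```python
import xml.etree.ElementTree as ET

# Source paragraph as bytes, containing an escaped ampersand.
source = (
    b'<?xml version="1.0" encoding="UTF-8"?>'
    b"<p>Contact R&amp;D for details.</p>"
)

root = ET.fromstring(source)

# What the translator should see: the entity resolved to a plain "&".
extracted = root.text

# Hypothetical French translation with a non-ASCII character (e-acute).
root.text = "Contactez le service R&D pour plus de d\u00e9tails."

# Serialize back to UTF-8 bytes; the ampersand must be escaped again.
output = ET.tostring(root, encoding="utf-8")

# Round trip: the translated file must parse and yield the same text.
reparsed = ET.fromstring(output).text

print(extracted)
print(output.decode("utf-8"))
```

A tool that writes the bare `&` into the target file, or that saves the accented character in a different encoding from the one declared, produces output that no longer parses. This is the kind of failure the checklist is probing for.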
How to reduce your translation costs
• Write less!
– Ford of Europe reduced translation costs by 50% in 2005
– It costs as much to translate into one language as it does to write the original
• Use more graphics
– Integrate with CAD/CAM systems
– But beware text in graphics – use callouts
• People may actually start using your documentation
• KISS
• Manage your own translation assets: e.g. invest in your own TMS
– Save an additional 20% on average on cost and 50% on turnaround
Less is More
Contact Details
• Postal address:
– PO Box 2167
– Gerrards Cross
– Bucks SL9 8XF
– United Kingdom
• Phone: +44 1753 480 467
• Fax: +44 1753 480 465
• Andrzej Zydroń – azydron@xtm-intl.com

DITA for Localization
