INTENTO
Building a Multi-Purpose MT Portfolio
© Intento, Inc. / September 2020
AGENDA
Multi-Purpose MT?
—
MT usage scenarios and requirements
—
Case Study 1: Entity Protection
—
Case Study 2: Custom Terminology
—
Case Study 3: Tone of Voice
—
Key Takeaways
MULTI-PURPOSE MT?
ENTERPRISES MASSIVELY FAIL TO ADOPT AI
20%*
* Share of US companies with successful AI deployment (Deloitte State of Cognitive Survey 2017)
Wrong vendor selected
Failed integrations
Failed pilots
Failed to deliver ROI
BRIDGING THE GAP BETWEEN
MT CAPABILITIES AND ADOPTION
[Diagram: MT Need on one side, MT Systems on the other, built up in four stages]
MT Procurement
MT Curation
Multi-Engine MT
Multi-Purpose MT
MT Need: Localization, Customer Service, Office Productivity, Global Community
MULTI-PURPOSE MT
Instant ROI on the investments already made
Combining resources of multiple stakeholders to benefit everyone
MT Requirements beyond the objective linguistic quality
Optimizing for features may compromise the quality
MT USAGE SCENARIOS
AND REQUIREMENTS
MULTI-PURPOSE MT
REQUIREMENTS BEYOND QUALITY
large text translation
—
batch translation
—
latency and jitter
—
tolerance to bad source
—
language detection
—
tag support
multilingual source
—
profanity control
—
metadata protection
—
entity protection
—
custom terminology
—
tone of voice consistency
ADDITIONAL CHALLENGES
WITH SPECIFIC COMBINATIONS
large text translation + HTML support
source language detection + multilingual source
…
MT REQUIREMENTS MATRIX
EVERY CASE HAS ITS OWN NEEDS
[Matrix: use cases against requirements; the cell values did not survive in this transcript]
Use cases: Post-editing / TMS, Support tickets, Live chats, Chatbots, On-the-fly UGC, Real-time communication, Knowledge bases
Requirements: large text translation, batch translation, latency and jitter, tolerance to bad source, language detection, tag support, multilingual source, profanity control, metadata protection, entity protection, custom terminology, tone of voice control
ALSO different for inbound and outbound…
MT REQUIREMENTS SUPPORT
BY POPULAR MT ENGINES
[Matrix: MT engines against requirements; the cell values did not survive in this transcript]
Engines: Amazon Translate, Google Translate Advanced, DeepL Pro API, IBM Watson Translator, Microsoft Text Translator, ModernMT, Systran PNMT
Requirements: large text translation, batch translation, latency and jitter, tolerance to bad source, language detection, tag support, multilingual source, profanity control, metadata protection, entity protection, custom terminology, tone of voice control
Legend: supported; support or its quality depends on the language pair / model
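One way to work with a requirements matrix like the ones above is to treat each use case and each engine as a set of features and keep only the engines whose feature set covers the use case. The sketch below is a minimal illustration of that idea. The engine names and every capability entry are hypothetical placeholders, because the actual matrix cells are not available in this text.

```python
from typing import Dict, List, Set

# HYPOTHETICAL capability data for illustration only.
# These are not the real cells of the matrix on the slides.
ENGINE_FEATURES: Dict[str, Set[str]] = {
    "engine_a": {"batch translation", "tag support", "custom terminology"},
    "engine_b": {"tag support", "language detection", "tone of voice control"},
    "engine_c": {"batch translation", "tag support", "language detection",
                 "custom terminology", "entity protection"},
}

# HYPOTHETICAL requirement sets per use case.
USE_CASE_NEEDS: Dict[str, Set[str]] = {
    "support tickets": {"tag support", "language detection", "entity protection"},
    "post-editing": {"batch translation", "custom terminology"},
}


def candidate_engines(use_case: str) -> List[str]:
    """Return engines whose features cover every requirement of the use case."""
    needs = USE_CASE_NEEDS[use_case]
    return sorted(name for name, feats in ENGINE_FEATURES.items() if needs <= feats)


def missing_features(use_case: str, engine: str) -> Set[str]:
    """Return the requirements an engine does not cover for a use case."""
    return USE_CASE_NEEDS[use_case] - ENGINE_FEATURES[engine]
```

The `missing_features` helper shows the gap that would have to be closed by other means if an engine is preferred for its linguistic quality despite lacking a feature.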
CASE STUDY 1:
ENTITY PROTECTION
ENTITY PROTECTION
SOME SAMPLES
Simplest cases: protecting emails, URLs, phone numbers, file paths
Crucial for Customer Service
Easily broken by MT

Source text (English): I just want to let you know about a spam mail I have received on Friday - it’s in D: DrvPrtEpsonUniversal driver x64ABC6eeecu120m.inf
Machine Translation: Я просто хочу уведомить вас о спаме, который я получил в пятницу - он здесь D: DrvPrtEpsonУниверсальный драйвер x64ABC6eeecu120m.inf
Broken: the "Universal driver" part of the file path was translated.

Source text (English): It has been Ivan Mitrich (ASAP, email.some+plus@example.com.tr) from Belgrad, but in the future it will be me.
Machine Translation: Bio je to Ivan Mitrich (ASAP, email.some+plus@ekample.com.tr) iz Belgrada, ali u budućnosti to ću biti ja.
Broken: "example" became "ekample" in the email domain.

Source text (English): Would you like to help with a new phone for the ABC department - (772) 194 59 65 ext 4406/4408).
Machine Translation: Desideri aiutarti con un nuovo telefono per il dipartimento ABC - (772) 194 59 65 ext 406/4408).
Broken: extension "4406" became "406".

Source text (English): You must submit such a request via ABC-portal, attached link: www.example.com/en/submit
Machine Translation: Deve enviar o pedido de tal atraves do ABC-portal, link anexo: www.example.com/pt/submit
Broken: the URL segment "/en/" was changed to "/pt/".
ENTITY PROTECTION
EXPERIMENTAL RESULTS
Selecting the MT based on the default entity protection may compromise the quality
What if we enforce protection via MT-agnostic NLP?
[Chart: ENTITY PROTECTION IN TWO STOCK ENGINES, BASELINE. Labels: emails, URLs, phones, paths; MT Engine 1, MT Engine 2. Bar values in extraction order: 94 %, 0 %, 66 %, 68 %, 51 %, 67 %, 100 %, 97 %]
[Chart: the same engines, +NLP. Bar values in extraction order: 94 %, 54 %, 79 %, 71 %, 75 %, 75 %, 100 %, 100 %]
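The slides do not describe how the MT-agnostic protection is implemented. A common technique for this kind of problem is placeholder masking: replace each entity with an opaque token before translation and restore it afterwards. The sketch below illustrates that technique under stated assumptions. The regular expressions are deliberately simple, the `__ENTn__` token format is made up for this example, and the approach assumes the MT engine passes such tokens through unchanged, which has to be verified per engine.

```python
import re
from typing import Callable, Dict, Tuple

# Simplified patterns for illustration; real-world emails and URLs need more care
# (for example, this URL pattern keeps a trailing full stop).
ENTITY_PATTERN = re.compile(
    r"(?P<email>[\w.+-]+@[\w-]+(?:\.[\w-]+)+)"
    r"|(?P<url>(?:https?://|www\.)[^\s,;)]+)"
)


def mask_entities(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each entity with a numbered token and remember the original."""
    mapping: Dict[str, str] = {}

    def repl(match: "re.Match[str]") -> str:
        token = f"__ENT{len(mapping)}__"  # hypothetical token format
        mapping[token] = match.group(0)
        return token

    return ENTITY_PATTERN.sub(repl, text), mapping


def unmask_entities(text: str, mapping: Dict[str, str]) -> str:
    """Put the original entities back in place of their tokens."""
    for token, original in mapping.items():
        text = text.replace(token, original)
    return text


def protected_translate(text: str, translate: Callable[[str], str]) -> str:
    """Wrap any translate function (an assumption: str in, str out) with masking."""
    masked, mapping = mask_entities(text)
    return unmask_entities(translate(masked), mapping)
```

Because `protected_translate` only needs a function from string to string, the same wrapper can sit in front of any engine, which is the point of an MT-agnostic approach.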
CASE STUDY 2:
CUSTOM TERMINOLOGY
CUSTOM TERMINOLOGY
IMPROVES FIDELITY
Simplest cases: enforcing acronyms, brand names and other proper nouns.
Without Custom Terminology support, NMT easily breaks them.
[Chart: GOOGLE TRANSLATE ADVANCED (WITH GLOSSARY SUPPORT), W/O GLOSSARY. Labels: brands, acronyms, other proper nouns; EN > L1, EN > L2, EN > L3, EN > L4. Bar values in extraction order: 32 %, 29 %, 34 %, 28 %, 78 %, 95 %, 91 %, 97 %, 96 %, 91 %, 93 %, 94 %]
[Chart: the same setup, WITH GLOSSARY. Bar values in extraction order: 99.3 %, 99.3 %, 99.3 %, 99.3 %, 98.8 %, 99.4 %, 99.5 %, 99.3 %, 99.9 %, 99.7 %, 99.9 %, 99.6 %]
CUSTOM TERMINOLOGY
IMPROVES FIDELITY
Selecting the MT engine by the custom terminology support may compromise MT quality
MT-agnostic glossary on top of NMT
[Chart: STOCK MT ENGINES WITHOUT GLOSSARY SUPPORT, W/O GLOSSARY. Labels: brands, acronyms, other proper nouns; ModernMT, EN > KO; DeepL, EN > FR; Baidu, EN > ZH. Bar values in extraction order: 41 %, 31 %, 21 %, 89 %, 81 %, 93 %, 0 %, 96 %, 90 %]
[Chart: the same engines, +NLP. Bar values in extraction order: 85 %, 82 %, 95 %, 97 %, 97 %, 100 %, 95 %, 98 %, 95 %]
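The deck does not say how the MT-agnostic glossary works internally. As a rough illustration, one simple approach is to swap glossary terms for tokens before translation, substitute the required target form afterwards, and measure fidelity as the share of expected terms found in the output. Everything in the sketch below is an assumption made for the example: the `__TERMn__` token format, the exact-match lookup, and the fidelity definition. It ignores inflection and case variation, which a production system would have to handle.

```python
import re
from typing import Callable, Dict, Iterable, Tuple


def apply_glossary(text: str, glossary: Dict[str, str],
                   translate: Callable[[str], str]) -> str:
    """Translate text while forcing glossary terms to their target forms."""
    if not glossary:
        return translate(text)
    # Longest terms first so that "Acme Cloud" wins over "Acme".
    terms = sorted(glossary, key=len, reverse=True)
    pattern = re.compile("|".join(r"\b" + re.escape(t) + r"\b" for t in terms))
    slots: Dict[str, str] = {}

    def repl(match: "re.Match[str]") -> str:
        token = f"__TERM{len(slots)}__"  # hypothetical token format
        slots[token] = glossary[match.group(0)]
        return token

    output = translate(pattern.sub(repl, text))
    for token, target in slots.items():
        output = output.replace(token, target)
    return output


def term_fidelity(pairs: Iterable[Tuple[str, str]], glossary: Dict[str, str]) -> float:
    """Share of glossary terms present in a source whose target form is in the output."""
    expected = hits = 0
    for source, output in pairs:
        for term, target in glossary.items():
            if term in source:
                expected += 1
                hits += int(target in output)
    return hits / expected if expected else 1.0
```

Running `term_fidelity` on the same test set with and without `apply_glossary` gives a before/after comparison in the spirit of the charts above, though not necessarily with the same metric the authors used.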
CASE STUDY 3:
TONE OF VOICE CONTROL
TONE OF VOICE CONTROL
SAMPLES FROM SUPPORT CHATS
Formal vs. Informal
Crucial for Live Chats
Baseline MT engines are not consistent

Source text (English): Can you share your screen?
Machine Translation (German): Können Sie Ihren Bildschirm freigeben?
Comment: FORMAL

Source text (English): Could you help me?
Machine Translation (German): Kannst du mir helfen?
Comment: INFORMAL

Source text (English): Make sure you report any of these issues.
Machine Translation (German): Stellen Sie sicher, dass Sie eines dieser Probleme melden.
Comment: FORMAL

Source text (English): Can you give an example?
Machine Translation (German): Kannst du ein Beispiel geben?
Comment: INFORMAL
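To quantify this kind of inconsistency, each translated segment can be labelled formal or informal and the labels counted per engine. The sketch below is a crude keyword heuristic for German written for this illustration, not the method used in the study. It looks for informal second-person forms such as "du" and for capitalised "Sie", "Ihnen" or "Ihr..." away from the start of a sentence. The word lists are incomplete, and sentence-initial "Sie" is skipped because it is ambiguous.

```python
import re
from collections import Counter
from typing import Iterable

# Informal second-person forms (incomplete list, illustration only).
INFORMAL = re.compile(r"\b(du|dich|dir|dein\w*|kannst|hast|bist|willst)\b", re.IGNORECASE)
# Capitalised polite forms that are not at the start of the text or of a sentence.
FORMAL = re.compile(r"(?<!^)(?<![.!?]\s)\b(Sie|Ihnen|Ihr\w*)\b")


def classify_formality(segment: str) -> str:
    """Label a German segment as formal, informal, mixed or neutral."""
    informal = bool(INFORMAL.search(segment))
    formal = bool(FORMAL.search(segment))
    if formal and informal:
        return "mixed"
    if formal:
        return "formal"
    if informal:
        return "informal"
    return "neutral"


def formality_profile(segments: Iterable[str]) -> Counter:
    """Count the labels over a set of segments, for example one engine's output."""
    return Counter(classify_formality(s) for s in segments)
```

Applied to the four German samples above, this heuristic reproduces the FORMAL and INFORMAL comments from the slide.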
TONE OF VOICE CONTROL
DEFAULT MT OUTPUT
English to German
210 segments
stock models
[Chart, labels: A, B, C, D, E, F, G]
TONE OF VOICE CONTROL
HOW TO MAKE IT INFORMAL?
Option 1: Use DeepL with formality=less (99.5% accuracy)
What if you need a custom model and terminology, or another MT has better linguistic quality for you?
Option 2: Generate synthetic training data, hoping translations become more informal
Expensive and time-consuming, also introduces bias into the model
Option 3: MT-agnostic NLP
Works to a certain extent, provides a wider choice of MT engines
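For Option 1, the request might look roughly like the sketch below, which only builds an HTTP request and does not send it. The endpoint URL, the `DeepL-Auth-Key` authorization header and the parameter names follow DeepL's public REST API as best understood here and should be checked against the current DeepL documentation. The function name and the key value are placeholders invented for this example.

```python
import urllib.parse
import urllib.request

# Assumed endpoint of the DeepL Pro REST API; verify against current documentation.
DEEPL_URL = "https://api.deepl.com/v2/translate"


def build_deepl_request(text: str, auth_key: str,
                        target_lang: str = "DE",
                        formality: str = "less") -> urllib.request.Request:
    """Build (but do not send) a translation request asking for informal output."""
    body = urllib.parse.urlencode({
        "text": text,
        "target_lang": target_lang,
        "formality": formality,  # "less" requests the informal register
    }).encode("utf-8")
    return urllib.request.Request(
        DEEPL_URL,
        data=body,
        headers={
            "Authorization": f"DeepL-Auth-Key {auth_key}",
            "Content-Type": "application/x-www-form-urlencoded",
        },
        method="POST",
    )
```

Sending the request with `urllib.request.urlopen` requires a valid key and network access, and formality is only supported for some target languages.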
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
210 segments
stock models
let’s make it more INFORMAL
[Chart, labels: A, B, C, D, E, F, G]
TONE OF VOICE CONTROL
MT-AGNOSTIC ADJUSTMENT
English to German
210 segments
stock models
let’s make it more FORMAL
[Chart, labels: A, B, C, D, E, F, G]
KEY TAKEAWAYS
Multi-Purpose MT brings an instant ROI on the MT investment already made.
Different use cases impose multiple requirements beyond linguistic quality.
Meeting the requirements takes either the right MT engine choice or clever engineering.
We do both, by implementing MT-agnostic fine-tuning algorithms to avoid compromising the MT quality.
THANKS!
Konstantin Savenkov, CEO
ks@inten.to
2150 Shattuck Ave
Berkeley CA 94705
INTENTO
https://inten.to