No Hardware. No Software. No Hassle MT.
New Breakthroughs in Machine Translation Technology
in association with #KantanWebinar
KantanMT.Com
NO HARDWARE. NO SOFTWARE. NO HASSLE MT
Tony O’Dowd
Founder & Chief Architect
What do we aim to cover today?
What is KantanMT.com?
Challenges of the L10N Industry
  • Making the right Project Management decisions
  • Going beyond the baseline of MT quality
Conclusions
15 minutes
What is KantanMT.com?
Statistical MT System
• Cloud-based:
  • Highly scalable
  • Inexpensive to operate
  • Quick to deploy
Our Vision
• To put Machine Translation:
  • Customization
  • Improvement
  • Deployment
• …into your hands
Active KantanMT Engines: 6,191
Training Words Uploaded: 28,243,234,615
Member Words Translated: 427,526,741
Fully operational for 15 months
Initial Steps of any project are:
• Determine scope
  • How long will it take?
  • How much will it cost?
  • What is my margin?
• Determine resources
  • How many translators will I need?
Introducing KantanAnalytics™
• …think Fuzzy-Match report and you’ve got it in one!
Challenge #1
How can Project Managers ‘manage’ Post-Editing Projects?
KantanAnalytics™
Kantan TotalRecall – Advanced TM
% of TM hits in this job
KantanMT – automated translations
% of automated translations for this job
Range of QE Scores
QE ranges are defined to match the fuzzy-match ranges already used by the L10N industry
Quality Estimation Scores
Segment-level QE scores, akin to fuzzy-match scores
Word Counts – Project Stats
Can be used to develop a project timeline and a tiered pricing model for post-editing projects
Placeholder & Tag Counts
Used by PMs for complexity surcharges
KantanAnalytics embeds QE scores into:
• TRADOS Studio
• MemoQ
• XLIFF
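The banding and pricing idea on this slide can be sketched in a few lines. This is a minimal illustration only: the band boundaries, the per-word rates and the names `BANDS`, `band_for` and `price_job` are hypothetical and are not taken from KantanAnalytics.

```python
# Hypothetical QE bands (inclusive lower bound) and per-word post-editing rates.
# Real projects would substitute their own fuzzy-match bands and prices.
BANDS = [
    ("95-100%", 95, 0.02),
    ("85-94%", 85, 0.04),
    ("75-84%", 75, 0.06),
    ("0-74%", 0, 0.10),
]


def band_for(score):
    """Return (band name, rate) for a segment-level QE score between 0 and 100."""
    if not 0 <= score <= 100:
        raise ValueError(f"QE score out of range: {score}")
    for name, lower, rate in BANDS:
        if score >= lower:
            return name, rate
    raise ValueError(f"no band for score {score}")


def price_job(segments):
    """Aggregate word counts and cost per band.

    segments: iterable of (qe_score, word_count) pairs.
    """
    report = {name: {"words": 0, "cost": 0.0} for name, _, _ in BANDS}
    for score, words in segments:
        name, rate = band_for(score)
        report[name]["words"] += words
        report[name]["cost"] += words * rate
    return report


job = [(98, 120), (88, 300), (60, 80), (79, 50)]
report = price_job(job)
total = sum(band["cost"] for band in report.values())
print(round(total, 2))  # 25.4
```

The same per-band word counts could feed a timeline estimate by replacing the rate with an assumed post-editing throughput per band.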
KantanAnalytics™
Helping PMs make the right business decisions!
KantanAnalytics™ - Helping PMs make the right decisions
Challenge #2: Going beyond the baseline and developing production-ready MT!
Easy to build a first baseline engine
• Aggregate training data: TM, Mono, Stock, Terminology
• Use a cloud-based platform, like KantanMT.com
Real Challenge:
• How do these platforms go beyond the baseline engine and achieve higher levels of production quality?
Introducing Kantan BuildAnalytics
• Data analytics and visualisation providing insights into the customisation of SMT engines.
Kantan BuildAnalytics™
Rapidly develop production ready engines
• Summary Report
• Training Rejects Reports
• F-Measure Analysis
• BLEU Analysis
• TER Analysis
• Gap Analysis
• Timeline Report
• Deep Tuning
Kantan BuildAnalytics™
F-Measure Score
Measures word recall & precision of KantanMT engines
Distributions
Provides the distribution of F-Measure scores across all reference translations
Kantan Insight™
Holistic analysis of the score and advice on how to improve it for KantanMT engines
Detailed Analysis
Segment-level F-Measure analysis to help SMT developers improve training material
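As a rough illustration of what a segment-level F-Measure captures, the sketch below computes word precision, word recall and their harmonic mean for one MT output against one reference. It assumes whitespace tokenisation, lowercasing and a balanced F1; the slides do not say how KantanMT computes its score, so `f_measure` is a hypothetical helper, not the product's method.

```python
from collections import Counter


def f_measure(candidate, reference):
    """Return (precision, recall, f1) over words, with clipped counts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps, per word, the smaller of the two counts.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0, 0.0, 0.0
    precision = overlap / sum(cand.values())  # share of MT words found in the reference
    recall = overlap / sum(ref.values())      # share of reference words the MT produced
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


p, r, f1 = f_measure("the valve must be closed", "the valve has to be closed")
print(round(p, 3), round(r, 3), round(f1, 3))  # 0.8 0.667 0.727
```

Scoring every segment of a test set this way gives the kind of distribution the slide mentions: low-scoring segments point at training material worth reviewing.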
Kantan BuildAnalytics™
Detailed Reports for: F-Measure, BLEU and TER
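For readers unfamiliar with TER, the following sketch shows the core idea: count the word-level edits needed to turn the MT output into the reference and divide by the reference length. It is a simplification under a stated assumption: full TER also allows block shifts, which are omitted here, so this is closer to a word error rate. The function names are hypothetical.

```python
def word_edit_distance(hypothesis, reference):
    """Minimum number of word insertions, deletions and substitutions."""
    hyp, ref = hypothesis.split(), reference.split()
    prev = list(range(len(ref) + 1))  # distance from empty hypothesis
    for i, hyp_word in enumerate(hyp, start=1):
        curr = [i]
        for j, ref_word in enumerate(ref, start=1):
            cost = 0 if hyp_word == ref_word else 1
            curr.append(min(prev[j] + 1,          # delete a hypothesis word
                            curr[j - 1] + 1,      # insert a reference word
                            prev[j - 1] + cost))  # match or substitute
        prev = curr
    return prev[-1]


def simple_ter(hypothesis, reference):
    """Edits per reference word (shift operations of full TER are not modelled)."""
    ref_len = len(reference.split())
    if ref_len == 0:
        raise ValueError("reference must not be empty")
    return word_edit_distance(hypothesis, reference) / ref_len


edits = word_edit_distance("the valve must be closed", "the valve has to be closed")
score = simple_ter("the valve must be closed", "the valve has to be closed")
print(edits, round(score, 3))  # 2 0.333
```

Lower is better: a score of 0 means the output already matches the reference, which is why this family of metrics is often read as a proxy for post-editing effort.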
Kantan BuildAnalytics™
Gap Analysis – quickest way of improving fluency
Kantan BuildAnalytics™
Training Rejects Report – Improve training data rapidly
Kantan BuildAnalytics™
Timeline – Tracks history of KantanMT engines
Kantan BuildAnalytics™ - Rapid MT Customisation
bmmt GmbH and KantanMT:
The Real-World Use
of Machine Translation
Maxim Khalilov
Technical Lead
bmmt GmbH
maxim.khalilov@machine-translation.eu
KantanMT webinar
April 10, 2014
MT in industry: context and rationale
The combination of these two technologies, well-established TM and cutting-edge MT, plus
post-editing allows the creation of a high-quality translation that reads just as well as a
“classically” produced translation.
MT in industry: what about cost?
The cost structure changes when machine translation is integrated into the translation pipeline.
When machine translation is adopted, the data preparation and quality assurance (editing) costs rise
whereas translation costs fall to as low as zero. Most importantly, the total cost of translation is
reduced dramatically as illustrated.
MT case study
• Customer: a large German machine manufacturer
• Project: 51,000 words of technical documentation, English into German. Approach: hybrid MT/TM.
• Settings: the files were processed through Trados Studio 2011.
• Implementation: KantanMT
• Description: Roughly 7,000 words came from TM as high matches. The remainder went through MT-based pretranslation, followed by a post-editing cycle, with the overall goal of producing the same level of quality as an all-human translation.
• Training material: Our customer had not worked in this language combination before, so there was no TM to go on. But we knew that the English authors based their work on material that the customer had previously translated from German into English. We therefore reversed the language direction of that TM and used it to train a customer-specific engine.
• Results: 44,000 words were post-edited to a final quality level that the customer was very happy with.
• Cost savings > 30%.
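The TM reversal described in the case study can be sketched as follows. The slides do not say which format the TM was stored in; this example assumes a TMX file and uses only Python's standard library. `SAMPLE_TMX` and `reversed_pairs` are hypothetical names, and the sample segment is invented for illustration.

```python
import xml.etree.ElementTree as ET

# ElementTree exposes the xml:lang attribute under the XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

# Invented one-unit TMX: German was the original source, English the target.
SAMPLE_TMX = """<tmx version="1.4">
  <header srclang="de-DE" datatype="plaintext" segtype="sentence"
          adminlang="en" creationtool="demo" creationtoolversion="1" o-tmf="demo"/>
  <body>
    <tu>
      <tuv xml:lang="de-DE"><seg>Ventil schließen.</seg></tuv>
      <tuv xml:lang="en-GB"><seg>Close the valve.</seg></tuv>
    </tu>
  </body>
</tmx>"""


def reversed_pairs(tmx_text, new_source, new_target):
    """Return (source, target) training pairs in the reversed direction.

    Only plain-text segments are handled; inline markup inside <seg>
    would need extra treatment.
    """
    root = ET.fromstring(tmx_text)
    pairs = []
    for tu in root.iter("tu"):
        segs = {tuv.get(XML_LANG): tuv.findtext("seg") for tuv in tu.findall("tuv")}
        if new_source in segs and new_target in segs:
            pairs.append((segs[new_source], segs[new_target]))
    return pairs


pairs = reversed_pairs(SAMPLE_TMX, new_source="en-GB", new_target="de-DE")
print(pairs)
```

One caveat worth keeping in mind: the English side of such a TM is itself translated text, so the reversed pairs train the engine on translated English as its source.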
MT: benefits of KantanMT solution
• Fully automated system training
  • One-click system customization
  • Automatic data pre-processing
• Fully automated translation
  • Automatic pre- and post-processing
• Quality assessment
  • KantanWatch
  • Gap Analysis
  • Reject Report
• No worry about maintenance and infrastructure
MT: benefits of KantanMT solution
• Transparent file format conversion
  • Training material conversion: TM conversion, monolingual material
  • Documents to translate: TMS format into MTable format
  • SDLXliff
• Smooth terminology integration
  • Consistent terminology
• Tag handling and mark-up transfer
Source: <x id="16480"/>SWord1 SWord2 SWord3 SWord4 <g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>
Target: <x id="16480"/>TWord1 TWord2 TWord3 TWord4 <g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>
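In the example above the two `<g>` elements swap places in the target but every tag id survives. A simple way to check that property is to compare the tag inventory of source and target regardless of order. The sketch below is an assumption about how such a check could look, not a description of KantanMT's tag handling; it only recognises `<x>` and `<g>` tags with numeric ids, as in the slide.

```python
import re
from collections import Counter

# Matches opening <g id="..."> and standalone <x id="..."/> tags only.
TAG_RE = re.compile(r'<(x|g)\s+id="(\d+)"\s*/?>')


def tag_inventory(segment):
    """Count each (tag name, id) pair found in a segment."""
    return Counter(TAG_RE.findall(segment))


def tags_preserved(source, target):
    """True when target carries the same tags as source, in any order."""
    return tag_inventory(source) == tag_inventory(target)


source = ('<x id="16480"/>SWord1 SWord2 SWord3 SWord4 '
          '<g id="16481">Number</g><g id="16480">SWord 8 SWord 9</g>')
target = ('<x id="16480"/>TWord1 TWord2 TWord3 TWord4 '
          '<g id="16480">TWord 8 TWord 9</g><g id="16481">Number</g>')

print(tags_preserved(source, target))  # True
```

Comparing inventories rather than positions is the point: word order legitimately changes between languages, so a position-based check would flag correct translations.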
bmmt GmbH
• Founded in 2013 by a group of language industry experts who wanted to offer innovative translation technology solutions
• Three operations centers in Germany: Munich, Berlin and Stuttgart
• bmmt GmbH has relied heavily on KantanMT services since 2013
• Primary industries: Automotive and Trucks, Machine Engineering, Telecommunications, Construction, IT
• Types of documents: workshop texts, product catalogues and other highly repetitive informational documents
• Primary source language: German
• Integration: SDL Trados, SDL WorldServer and others
• Find out more: www.machine-translation.eu
Berlin
Alt-Moabit 92
10559 Berlin
Phone: +49 30-3117505-15
Fax: +49 30-3117505-20
Munich
Bernhard-Wicki-Straße 5
80636 Munich
Phone: +49 89 2000037-17
Fax: +49 89 2000037-11
Stuttgart
Ruppmannstraße 33b
70565 Stuttgart
Phone: +49 711 16646-66
Fax: +49 711 16646-50
bmmt GmbH
info@machine-translation.eu
Thank you
Speakers
Tony O’Dowd, tonyod@kantanmt.com
Maxim Khalilov, maxim.khalilov@machine-translation.eu
