Improving Lexical Choice
in Neural Machine Translation
Toan Q. Nguyen and David Chiang
arXiv 2017
presentation
Sekizawa Yuuki
2017/12/5
Overview
• NMT learns word representations in a continuous space
• NMT translations tend to read naturally in context but do not always reflect the content of the source sentence
• due to word frequency in the training data (rare words are the most affected)
• proposed method: two solutions
1. the standard output layer, which computes the inner product of a vector representing the context with all possible output word embeddings, rewards frequent words disproportionately; the authors propose fixing the norms of both vectors to a constant value
2. integrate a simple lexical module that is jointly trained with the rest of the model
Output a word in NMT
• f: source sequence
• e: output word
• We: embedding of word e
• be: bias for word e (a scalar)
• h̃: hidden vector, depending only on the source sentence and previous output words
• We, be: depend only on e
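With the symbols listed on this slide, the standard output layer can be written as a softmax over the target vocabulary. This is a sketch in the slide's notation; the vocabulary V, the summation index e', and the shorthand e_{<t} for the previous output words are added symbols, and the conditioning on f and e_{<t} enters only through h̃.

```latex
p(e \mid f, e_{<t})
  = \frac{\exp\left(W_e \cdot \tilde{h} + b_e\right)}
         {\sum_{e' \in V} \exp\left(W_{e'} \cdot \tilde{h} + b_{e'}\right)}
```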
Proposed method: arguments
1. the term We · h̃, which measures how well e fits into the context h̃, favors common words disproportionately; fixing the norms of both vectors to a constant helps
2. add a new term representing a more direct connection from the source sentence, which allows the model to better memorize translations of rare words
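Why the norms matter in the first point follows from the geometry of the inner product. The identity below is standard; θ is an added symbol for the angle between We and h̃, and r stands for the shared constant norm.

```latex
W_e \cdot \tilde{h} = \|W_e\| \, \|\tilde{h}\| \cos\theta_{e,\tilde{h}}
\qquad\Longrightarrow\qquad
W_e \cdot \tilde{h} = r^{2} \cos\theta_{e,\tilde{h}}
\quad \text{when } \|W_e\| = \|\tilde{h}\| = r
```

With unconstrained norms, a word can raise its score in every context simply by having a large ‖We‖; once both norms are fixed, only the angle can separate words.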
Proposed method: Normalization
• the norm constraint is enforced by projected gradient descent: after each update, project each We onto the hypersphere of radius r
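The projection step can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' code: the plain SGD update, the function names, the learning rate, and the toy shapes are all assumptions.

```python
import numpy as np

def project_rows_to_sphere(W, r, eps=1e-12):
    """Rescale every row of W so that its Euclidean norm equals r."""
    norms = np.linalg.norm(W, axis=1, keepdims=True)
    # eps guards against division by zero for an all-zero row
    return r * W / np.maximum(norms, eps)

def sgd_step_with_projection(W, grad, lr, r):
    """One gradient step (plain SGD, an assumption) followed by the projection."""
    W = W - lr * grad
    return project_rows_to_sphere(W, r)

rng = np.random.default_rng(0)
W = rng.normal(size=(5, 4))     # 5 hypothetical words, embedding size 4
grad = rng.normal(size=(5, 4))  # stand-in for a real gradient
W_new = sgd_step_with_projection(W, grad, lr=0.1, r=3.0)
row_norms = np.linalg.norm(W_new, axis=1)
```

After the step, every entry of `row_norms` equals r up to floating-point error, regardless of how large the gradient was.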
Previous work: integrating a lexicon into NMT
• background
• the hidden state contains information about both
• the source word(s) corresponding to the current target word
• the contexts of those source words and the preceding context of the target word
• this could make the model prone to generating a target word that fits the context but does not necessarily correspond to the source word(s)
• Arthur et al. (2016)
• tried to alleviate this issue by integrating a count-based lexicon into an NMT system
• however, this lexicon must be trained separately using GIZA++, and its parameters form a large, sparse array that can be difficult to store in GPU memory
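A rough sketch of how a count-based lexicon can be added as a bias on the output scores. The slides do not give the formula, so the form log(lexicon probability + eps), the attention-weighted mixing of lexicon columns, the dense toy table, and all names are assumptions made for illustration.

```python
import numpy as np

def softmax(x):
    z = x - np.max(x)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def lexicon_biased_probs(scores, attention, p_lex, eps=1e-3):
    """scores: [V] model scores; attention: [n_src]; p_lex: [V, n_src],
    where column j holds lexicon probabilities of target words given source word j."""
    lex = p_lex @ attention  # attention-weighted lexicon probability, shape [V]
    return softmax(scores + np.log(lex + eps))

# Toy case: 3 target words, 2 source words, attention fully on source word 0.
scores = np.zeros(3)
attention = np.array([1.0, 0.0])
p_lex = np.array([[0.9, 0.1],
                  [0.1, 0.1],
                  [0.0, 0.8]])
probs = lexicon_biased_probs(scores, attention, p_lex)
```

In this toy case the model scores are uniform, so the lexicon alone decides: the target word the lexicon prefers for the attended source word gets the highest probability. A real lexicon table has one entry per (target word, source word) pair, which is why it is large and sparse.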
Proposed method: Lexical Translation
• use a simple feedforward neural network (FFNN)
• trained jointly with the rest of the NMT model to generate a target word based directly on the source word(s)
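One way such a lexical module could look is sketched below. The slides only say that a simple FFNN maps source words to a target word, so everything else here is an assumption: the attention-weighted average of source embeddings, the tanh nonlinearities, the skip connection, adding the lexical scores to the usual output scores before the softmax, and all names and shapes.

```python
import numpy as np

def softmax(x):
    z = x - np.max(x)  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def lexical_scores(attention, src_emb, W_h, W_lex, b_lex):
    """Score target words directly from the attended source word embeddings."""
    f_lex = np.tanh(attention @ src_emb)  # [d] weighted average of source embeddings (assumed)
    h_lex = np.tanh(W_h @ f_lex) + f_lex  # [d] one hidden layer plus skip connection (assumed)
    return W_lex @ h_lex + b_lex          # [V] one score per target word

def output_probs(h_tilde, W_o, b_o, lex):
    """Add the lexical scores to the standard output-layer scores (assumed combination)."""
    return softmax(W_o @ h_tilde + b_o + lex)

rng = np.random.default_rng(0)
n_src, d, V = 3, 4, 6                  # toy sizes
attention = np.array([0.7, 0.2, 0.1])  # attention over the 3 source words
src_emb = rng.normal(size=(n_src, d))
W_h = rng.normal(size=(d, d))
W_lex, b_lex = rng.normal(size=(V, d)), np.zeros(V)
W_o, b_o = rng.normal(size=(V, d)), np.zeros(V)
h_tilde = rng.normal(size=d)

lex = lexical_scores(attention, src_emb, W_h, W_lex, b_lex)
probs = output_probs(h_tilde, W_o, b_o, lex)
```

Because the module is made of ordinary dense layers, it can be trained by backpropagation together with the rest of the model, and its parameters are dense matrices rather than a separately estimated sparse table.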
Experiment: Data settings
• Tamil (ta), Urdu (ur), Hausa (ha), Turkish (tu), and Hungarian (hu) to English (en), using data from the LORELEI program
• English to Vietnamese (vi), using data from the IWSLT 2015 shared task
• English to Japanese (ja), using the KFTT and BTEC datasets
Experiment: NMT systems
• NMT baselines
• untied: does not tie the rows of Wo to the target word embeddings
• tied: ties the rows of Wo to the target word embeddings
• other baselines
• Moses: the Moses phrase-based translation system; it used the full vocabulary from the training data, and unknown words were copied to the target sentence
• Arthur: a reimplementation of the discrete lexicon approach of Arthur et al. (2016); only their auto lexicon was tried, built with GIZA++ and integrated using their bias approach
• proposed methods
• fixnorm: the normalization approach
• fixnorm+lex: fixnorm with the addition of the lexical translation module
Result: BLEU evaluation
Numbers in parentheses are relative to tied.
A dagger † indicates an insignificant difference in BLEU (p > 0.01).
Result: Translation example
Result: Alignment
Analysis: Hyperparameter r
Analysis: Training process
(figure: BLEU on the development set)
Conclusion
• presented two simple yet effective changes to the output layer of an NMT model
• both of these changes improve translation quality
• substantially on low-resource language pairs
• the baseline NMT system performs poorly relative to phrase-based translation, but the proposed system surpasses it
• conclusion: NMT equipped with these methods is a more viable choice for low-resource translation
