Quoc Le, Software Engineer, Google at MLconf SF

Sequence Learning for
Language Understanding
Presenter: Quoc V. Le
Google
Thanks: Andrew Dai, Jeff Dean, Matthieu Devin, Geoff
Hinton, Thang Luong, Rajat Monga, Ilya Sutskever, Oriol
Vinyals

Sequence Learning
Typical success of Machine Learning: mapping a fixed-length input to
a scalar value:
- Image recognition (Pixels -> “cat”)
- Speech recognition (Waveforms -> the utterance of “cat”)
Many language understanding problems require mapping from
sequences to sequences:
- Machine Translation (“I love music” -> “J’aime la musique”)

How does Machine Translation work?
Use a dictionary to translate one word at a time
Use a model to reorder the words so that the sentence looks
reasonable.
Lots of rules:
- Phrases instead of words (“New York” should not be translated
as “New” + “York”)
- Meaning of words depends on context
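The dictionary-plus-rules approach above can be sketched in a few lines. This is only a toy illustration: the `WORDS` lexicon, the `PHRASES` table, and both function names are assumptions made up for the example, not part of any real translation system, and the reordering model is left out.

```python
# Toy lexicon and phrase table; every entry is an illustrative assumption.
WORDS = {"i": "je", "love": "aime", "new": "nouveau", "york": "York"}
PHRASES = {("new", "york"): "New York"}


def translate_word_by_word(sentence):
    """Look up each word independently; unknown words pass through unchanged."""
    return " ".join(WORDS.get(w, w) for w in sentence.lower().split())


def translate_with_phrases(sentence, max_len=3):
    """Greedy longest match: prefer a phrase-table entry over single words."""
    tokens = sentence.lower().split()
    out, i = [], 0
    while i < len(tokens):
        # Try the longest chunk first, then shorter ones.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            chunk = tuple(tokens[i:i + n])
            if chunk in PHRASES:
                out.append(PHRASES[chunk])
                i += n
                break
        else:
            # No phrase matched: fall back to the single-word dictionary.
            out.append(WORDS.get(tokens[i], tokens[i]))
            i += 1
    return " ".join(out)


print(translate_word_by_word("I love New York"))
print(translate_with_phrases("I love New York"))
```

The word-by-word version splits “New York” into two separate lookups, while the phrase table keeps it as one unit. Neither version handles context-dependent meanings, which is the second rule on the slide.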

Sequence Learning
Ideas:
- Use a Recurrent Neural Net encoder to map an input sequence
to a vector
- Use a Recurrent Neural Net decoder to map the vector to
another sequence
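A minimal sketch of the encoder/decoder idea, in pure Python. It assumes a plain tanh recurrent step, one-hot inputs, random untrained weights, and `<EOS>` (id 0) doubling as the decoder's start symbol; all names and sizes here are illustrative, and the output tokens are meaningless until the weights are trained.

```python
import math
import random


def make_matrix(rows, cols, rng):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def matvec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def rnn_step(w_in, w_rec, x, h):
    """One vanilla RNN step: h' = tanh(W_in x + W_rec h)."""
    return [math.tanh(a + b) for a, b in zip(matvec(w_in, x), matvec(w_rec, h))]


def one_hot(index, size):
    return [1.0 if i == index else 0.0 for i in range(size)]


def encode(params, token_ids, vocab_size, hidden_size):
    """Encoder: fold the whole input sequence into one fixed-size vector."""
    h = [0.0] * hidden_size
    for t in token_ids:
        h = rnn_step(params["enc_in"], params["enc_rec"], one_hot(t, vocab_size), h)
    return h


def decode(params, h, vocab_size, eos_id, max_len=10):
    """Decoder: start from the encoder vector and emit tokens until <EOS>."""
    output, prev = [], eos_id  # assumption: <EOS> is also the start symbol
    for _ in range(max_len):
        h = rnn_step(params["dec_in"], params["dec_rec"], one_hot(prev, vocab_size), h)
        scores = matvec(params["dec_out"], h)
        prev = max(range(vocab_size), key=scores.__getitem__)  # greedy pick
        if prev == eos_id:
            break
        output.append(prev)
    return output


rng = random.Random(0)
VOCAB, HIDDEN, EOS = 6, 8, 0
params = {
    "enc_in": make_matrix(HIDDEN, VOCAB, rng),
    "enc_rec": make_matrix(HIDDEN, HIDDEN, rng),
    "dec_in": make_matrix(HIDDEN, VOCAB, rng),
    "dec_rec": make_matrix(HIDDEN, HIDDEN, rng),
    "dec_out": make_matrix(VOCAB, HIDDEN, rng),
}
vector = encode(params, [1, 2, 3], VOCAB, HIDDEN)  # e.g. "A B C"
tokens = decode(params, vector, VOCAB, EOS)
print(len(vector), tokens)
```

The point of the sketch is the shape of the computation: whatever the input length, the encoder hands the decoder a single vector of `HIDDEN` numbers.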

Sequence Learning
Example network that maps ABC -> WXYZ
Outputs: W X Y Z <EOS>
Inputs: A B C <EOS> W X Y Z
At test time, feed each output back into the decoder as its next input.
For a better output sequence, generate several candidates at each step
and feed each one back into the decoder, keeping a beam of possible sequences.
Use “beam search” to find the top sequences.
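Beam search can be sketched as follows. The function `toy_next_log_probs` is a hypothetical stand-in for one decoder step, with hand-picked probabilities chosen only so that greedy decoding and a wider beam disagree; a real decoder would compute these scores from its hidden state.

```python
import math

EOS = "<EOS>"


def toy_next_log_probs(prefix):
    """Hypothetical decoder step: log P(next token | tokens so far)."""
    table = {
        (): {"W": 0.6, "X": 0.4},
        ("W",): {"X": 0.55, EOS: 0.45},
        ("X",): {"Y": 0.95, EOS: 0.05},
        ("W", "X"): {EOS: 1.0},
        ("X", "Y"): {EOS: 1.0},
    }
    return {tok: math.log(p) for tok, p in table[tuple(prefix)].items()}


def beam_search(next_log_probs, beam_size, max_len=10):
    """Keep the beam_size best partial sequences at every step."""
    beam = [(0.0, [])]  # (cumulative log-probability, tokens so far)
    finished = []
    for _ in range(max_len):
        # Extend every sequence in the beam by every possible next token.
        candidates = []
        for score, seq in beam:
            for tok, lp in next_log_probs(seq).items():
                candidates.append((score + lp, seq + [tok]))
        candidates.sort(key=lambda c: c[0], reverse=True)
        # Keep the top candidates; those ending in <EOS> are complete.
        beam = []
        for score, seq in candidates[:beam_size]:
            (finished if seq[-1] == EOS else beam).append((score, seq))
        if not beam:
            break
    return sorted(finished, key=lambda c: c[0], reverse=True)


greedy = beam_search(toy_next_log_probs, beam_size=1)
wider = beam_search(toy_next_log_probs, beam_size=2)
print(greedy[0][1], round(math.exp(greedy[0][0]), 2))
print(wider[0][1], round(math.exp(wider[0][0]), 2))
```

With `beam_size=1` the search commits to “W” because it is the most likely first token, and ends with a sequence of probability 0.33. With `beam_size=2` it also keeps “X” alive and finds “X Y”, whose overall probability is 0.38.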

A machine translation experiment
WMT’2014 (small in comparison to Google’s data):
- State of the art (a combination of many methods, took 20 years to
develop): 37
- Our method (took 3 person-years): 37
An important achievement because it is a new way to represent input
texts and output texts. A potential breakthrough in many other areas
of language understanding.

Contact: Quoc V. Le (qvl@google.com),
Ilya Sutskever (ilyasu@google.com),
Oriol Vinyals (vinyals@google.com),
Minh-Thang Luong (lmthang@cs.stanford.edu)
Papers:
- Sequence to Sequence Learning with Neural Networks
- Addressing the Rare Word Problem in Neural Machine Translation
Upcoming NIPS paper