Wei Xu at AI Frontiers : Language Learning in an Interactive and Embodied Setting

Horizon
Robotics
Language Learning in an Interactive and Embodied Setting
11/2018
Wei Xu

A Developmental Approach to Machine Intelligence
1. It might be easier than solving all the tasks a human adult can do
2. Learn skills and knowledge unspecified at design time
3. Gradually proceed from easy tasks to difficult tasks
“Instead of trying to produce a program to simulate the adult mind, why
not rather try to produce one which simulates the child's? If this were then
subjected to an appropriate course of education one would obtain the adult
brain.” - Alan Turing (1950)

Why Embodied?
• Learn from the experiences that come from the machine's interactions with its environment
• Learn common sense through observation of and interaction with the environment
• Meaning emerges by “grounding” language in the modalities of our environment

Human driving: < 1,000 miles
Self-driving: > 10 million miles

Why Interactive?
• A useful robot needs to be able to understand and communicate effectively
• It is easier for humans to teach machines directly using language than by writing code
• Humans are great teachers
• Learn the effects of speaking by observing feedback from the conversational partner
• Learn human values through interaction

Answering Questions and Following Commands
1. Is it possible to learn to follow commands using end-to-end reinforcement learning without any pretraining for vision or language?
2. Can learning question answering help with learning to follow commands?
3. Can the machine understand words in new contexts not seen in training?
Haonan Yu, Haichao Zhang, Wei Xu “Interactive Grounded Language Acquisition and Generalization in a 2D World” ICLR 2018

Problem Setup
“east” and “avocado” never appear together in training
“watermelon” only appears in answers during training

Model Architecture
[Figure: model architecture with outputs labeled answer, action, value]

Experiments
[Figure: experimental results; one condition is labeled “No QA training”]

Generalization Ability
• We can generalize to word combinations never seen in training
• We can generalize to questions containing words never seen in training
Held out X (%): X% of words/combinations are held out from training
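
The held-out protocol above can be illustrated with a minimal sketch. It assumes the held-out items are (direction, object) word pairs chosen uniformly at random. The function name `hold_out_combinations`, the seed, and any vocabulary beyond the words shown on the slides are hypothetical, and the talk does not say how the actual split was drawn.

```python
import random


def hold_out_combinations(words_a, words_b, held_out_pct, seed=0):
    """Split all (a, b) word pairs into a training list and a held-out set.

    Assumption: pairs are held out uniformly at random; the real
    experiments may have chosen them differently.
    """
    pairs = [(a, b) for a in words_a for b in words_b]
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(pairs)
    n_held = round(len(pairs) * held_out_pct / 100)
    held_out = set(pairs[:n_held])
    train = [p for p in pairs if p not in held_out]
    return train, held_out


# Hypothetical vocabulary; only some of these words appear on the slides.
directions = ["east", "west", "north", "south"]
objects = ["avocado", "watermelon", "grape", "coconut", "apple"]

train_pairs, held_out_pairs = hold_out_combinations(directions, objects, 20)
print(len(train_pairs), "training pairs,", len(held_out_pairs), "held-out pairs")
```

Commands built from `held_out_pairs` would then be used only at test time, so that success on them reflects generalization to unseen combinations.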

Navigation in a 3D Environment
Challenges:
• Partially observed
• Much longer reward delay
• More visual variation
Example command: “Navigate to the dog!”

Guided Feature Transformation
[Figure: model architecture with outputs labeled action, value]
Haonan Yu, Xiaochen Lian, Haichao Zhang, Wei Xu “Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents” CoRL 2018

Experimental Results

Demo
Example commands:
• the object besides candle is your target .
• please move to the object that is front of the basketball .
• can you reach the object right of toilet ?
• go to the object to the right of bike please .
• reach the location between car and trampoline please .
• please navigate to the grid between gift and tower .
• please navigate to the grid between bucket and chair .
• please move to the object that is front of basketball .

Learning to Speak and Remember
1. How to learn to speak by talking with other people?
2. What information should be remembered?
3. How to utilize knowledge in memory?
Haichao Zhang, Haonan Yu, Wei Xu “Interactive Language Acquisition with One-Shot
Visual Concept Learning through a Conversation Game” ACL 2018

Problem Setup
Rewards are given for each learner response based on its appropriateness

Memory Augmented Imitation + Behavior Shaping Through RL
[Figure: model components labeled Interpreter, Speaker, Vision, Memory; example exchange “What is this?” / “It is a bird.”; Reward signal]

Model Detail
Trained end-to-end using gradient descent over Imitation Cost + Reinforce Cost
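
One way to read the combined objective is sketched below. It assumes the imitation cost is the negative log-likelihood of the teacher's tokens and the Reinforce cost is the standard REINFORCE surrogate on the learner's sampled response. The function names, the `baseline` term, and the `rl_weight` factor are hypothetical, and the slide does not give the exact form used in the paper.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_log_prob(step_logits, tokens):
    """Sum of log-probabilities of `tokens`, one logit vector per step."""
    return sum(log_softmax(l)[t] for l, t in zip(step_logits, tokens))


def imitation_cost(step_logits, teacher_tokens):
    """Negative log-likelihood of the teacher's sentence under the model."""
    return -sequence_log_prob(step_logits, teacher_tokens)


def reinforce_cost(step_logits, sampled_tokens, reward, baseline=0.0):
    """REINFORCE surrogate: -(reward - baseline) * log p(sampled response)."""
    return -(reward - baseline) * sequence_log_prob(step_logits, sampled_tokens)


def total_cost(imit_logits, teacher_tokens, resp_logits, sampled_tokens,
               reward, rl_weight=1.0):
    """Imitation cost plus (assumed) weighted Reinforce cost."""
    return (imitation_cost(imit_logits, teacher_tokens)
            + rl_weight * reinforce_cost(resp_logits, sampled_tokens, reward))


# Toy example: 2-word vocabulary, uniform predictions, reward of +1.
cost = total_cost([[0.0, 0.0], [0.0, 0.0]], [0, 1], [[0.0, 0.0]], [1], reward=1.0)
print(cost)
```

In an actual model the logits would come from the network, and an automatic-differentiation framework would propagate the gradient of this scalar through all components, which is what makes the training end-to-end.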

Example Dialogs
T: Virtual teacher
L: Learner (machine)

T: i see grape
L: watermelon grape watermelon
T: tell what you see
L: see see see see see
T: there is grape
L: grape grape watermelon
T: i can observe coconut
L: fox watermelon watermelon

[Figure panels: “Before learning” and “After learning”]

Summary
• What we have now:
  • Learning to understand and use simple language, memorize useful information, and execute simple commands through interactions with a virtual teacher in virtual environments
• What we will do in the future:
  • Simple → complex
  • Virtual → real

AI Research at Horizon Robotics
• About the company
  • A leading technology powerhouse of edge AI platforms
  • Provides algorithms, processors and hardware jointly optimized for high-performance, low-power and low-cost edge AI capabilities
  • CES 2019 Innovation Award
• General AI Lab @ Silicon Valley
  • Research towards the company’s long-term vision for artificial general intelligence
  • Build machines that can learn skills and knowledge unspecified at design time
• Applied AI Lab @ Silicon Valley
  • Applied research focusing on near-term needs
  • Developing novel AI technologies that are critical to our current products
Jobs: bit.ly/general-ai-lab
bit.ly/applied-ai-lab
