Teacher-Aware Active Robot Learning

Mattia Racca, Antti Oulasvirta and Ville Kyrki
ACM/IEEE International Conference on Human-Robot Interaction (HRI), 2019
mattia.racca@aalto.fi

Why (active) learning robots?
Programming robots is hard; pre-programming them
for each task is harder, if not impossible.

Robots should learn by interacting with humans!
M. Racca and V. Kyrki, Active Robot Learning for Temporal Task Models, HRI ’18

The idea behind Active Learning
The agent can efficiently choose what to learn next.
… and improve its model faster!

Important aspects of Active Learning for HRI
1. Interactive Nature
● Transparency
● Design of questions
● Control over interaction
● Timing of questions
2. Query Efficiency
● Learning faster (with less data)
But what about REAL users?

What if efficient query selection is not best for the interaction?

Can efficiency indirectly counter its own benefits?
Query Efficiency
● Complex questions
● Questions out of context
Harder for the teacher
● slower interaction
● more effort
● more errors!

Different types of Active Learning
1. Classic AL strategy (Learner C)
2. Teacher-Aware AL strategy (Learner M)
3. Hybrid AL strategy (Learner H)

Problem statement & Evaluation scenario
An agent has to learn the value of a certain attribute a for a
set E of entities by making queries. We used the Animals with
Attributes 2* dataset with 50 animals (entities) and 85
semantic attributes.
* Y. Xian et al., Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly, T-PAMI

Example query: “Do giraffes have patches?” Answer: YES

● Categories C over the entities are built using WordNet.
● Learner assumption: entities in the same category are
more likely to share the same attribute value.

Classic AL: Uncertainty Sampling
● Learner C:
○ uses Uncertainty Sampling
○ selects the most uncertain query,
given the current model
○ is, as expected, efficient!
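As a rough illustration of "select the most uncertain query", the sketch below ranks unasked entities by the entropy of the model's predicted yes/no answer. Entropy is one standard uncertainty measure and is an assumption here; the slides do not say which measure Learner C uses. The names binary_entropy, most_uncertain_query and the belief values are hypothetical.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a yes/no answer whose P(yes) is p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def most_uncertain_query(p_yes: dict[str, float]) -> str:
    """Return the unasked entity whose predicted answer has the highest entropy."""
    return max(p_yes, key=lambda entity: binary_entropy(p_yes[entity]))


# Made-up model beliefs P(attribute applies) for three unasked entities.
beliefs = {"giraffe": 0.55, "zebra": 0.90, "dolphin": 0.05}
next_query = most_uncertain_query(beliefs)
print(next_query)  # giraffe (its belief is closest to 0.5)
```

A belief near 0.5 means the model cannot tell which answer is more likely, so that question is expected to be the most informative one.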

● Learner C drawbacks:
○ Some questions are difficult!
○ Topic or context switches!

In response to the drawbacks
● Teacher-Aware strategy (Learner M)
○ Inspired by the ACT-R declarative memory model:
“Information associated with recently
retrieved information is easier to retrieve”
○ minimizes the distance between consecutive
queries
● Hybrid strategy (Learner H)
○ a tradeoff between Learner C and Learner M
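The two strategies above could be sketched as follows. This is only an illustration under stated assumptions: the distance between queries is taken to be the number of edges between entities in a small category tree, and the hybrid score is a weighted sum of uncertainty and closeness. PATHS, path_distance, closest_query, hybrid_query and the weight parameter are hypothetical and not taken from the authors' code.

```python
# Hypothetical root-to-leaf category paths; a real system might derive them from WordNet.
PATHS = {
    "giraffe": ("mammal", "ungulate", "giraffe"),
    "zebra": ("mammal", "ungulate", "zebra"),
    "dolphin": ("mammal", "cetacean", "dolphin"),
}


def path_distance(a: str, b: str) -> int:
    """Number of tree edges between two entities (assumed distance measure)."""
    shared = 0
    for x, y in zip(PATHS[a], PATHS[b]):
        if x != y:
            break
        shared += 1
    return (len(PATHS[a]) - shared) + (len(PATHS[b]) - shared)


def closest_query(previous: str, candidates: list[str]) -> str:
    """Teacher-aware choice: the candidate nearest to the previous query."""
    return min(candidates, key=lambda e: path_distance(previous, e))


def hybrid_query(previous: str, uncertainty: dict[str, float], weight: float = 0.5) -> str:
    """Blend uncertainty (weight) with closeness to the previous query (1 - weight)."""
    max_dist = max(path_distance(previous, e) for e in uncertainty) or 1

    def score(e: str) -> float:
        closeness = 1.0 - path_distance(previous, e) / max_dist
        return weight * uncertainty[e] + (1.0 - weight) * closeness

    return max(uncertainty, key=score)


uncertainty = {"zebra": 0.2, "dolphin": 1.0}  # made-up uncertainty scores in [0, 1]
print(closest_query("giraffe", ["zebra", "dolphin"]))    # zebra
print(hybrid_query("giraffe", uncertainty, weight=0.9))  # dolphin
print(hybrid_query("giraffe", uncertainty, weight=0.1))  # zebra
```

With a weight near 1 the hybrid behaves like Learner C and jumps to the most uncertain entity; with a weight near 0 it behaves like Learner M and stays close to the previous question.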

Teacher-Aware AL: Memory Effort strategy

Performance in Simulation
Simulation on the entire dataset:
● Perfect users (no errors, no distraction)
● Baseline: asks random questions and cannot leverage our
model to make predictions
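The simulated setup above can be pictured with a small sketch: a perfect user always returns the ground-truth answer, and the random baseline only knows the pairs it has explicitly asked about. The function name simulate_random_baseline and the toy ground truth are hypothetical; the values are illustrative and not taken from the dataset.

```python
import random


def simulate_random_baseline(truth: dict[str, bool], n_queries: int, seed: int = 0) -> dict[str, bool]:
    """Ask n_queries random, non-repeated questions to a perfect simulated user."""
    rng = random.Random(seed)
    order = rng.sample(sorted(truth), k=min(n_queries, len(truth)))
    known = {}
    for entity in order:
        known[entity] = truth[entity]  # perfect user: always the true answer
    return known


# Toy ground truth for a single attribute (illustrative values only).
truth = {"giraffe": True, "zebra": False, "dolphin": False, "leopard": True}
known = simulate_random_baseline(truth, n_queries=2)
coverage = len(known) / len(truth)  # the baseline cannot predict unasked pairs
print(coverage)  # 0.5
```

Because the baseline makes no predictions, its knowledge grows by exactly one pair per query, which is what a model-based learner is expected to beat.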

What about real users?
User study: 26 participants,
the 3 strategies as conditions
(within-subject).

Data logged:
● NASA TLX
● Q&A, response times,
prediction power
● Overall preferences
Our hypotheses:
Learner M makes the participants reply (a) faster and (b) with fewer
errors compared to Learner C, with Learner H achieving
intermediate results.

Discussion
● Higher response time and more errors for Learner C.
○ stressful, unpredictable and requiring more
thinking
● Higher response time and more errors for Learner M.
○ easy, natural and predictable
○ too easy? lowering attention or causing boredom
○ too predictable? using the same (maybe wrong)
answer
● Overall preferences:
● Learner C as efficient: mitigating difficulty!
● Learner M as useless: frustration and boredom!
● AVOID USELESS QUESTIONS!

Conclusions
Can efficiency-driven Active Learning counter its
own benefits?
If we consider non-oracle users in the equation, yes!
But we just scratched the surface...
● We need a better understanding of the interaction
aspects that can affect learning
● We need strategies that can adapt to the specific user

Thank you for the attention!
Code available at github.com/MattiaRacca
Can efficiency-driven Active Learning
counter its own benefits?
If we consider non-oracle users and the
interaction in the equation, yes!

Attribute-Category Model
We model the probability of attribute a applying
to category c as [equation missing], and we maintain a prior over these
distributions. We can then compute the
probability of a applying to entity e as [equation missing],
and therefore predict attribute-entity pairs, given our current model.
The update step of the model is the computation of the posterior distributions,
given the user answer r as an observation.
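A minimal sketch of the update step described above, assuming a Beta-Bernoulli model per category and a single category per entity. Both choices are assumptions made for illustration and are not necessarily the authors' formulation. BetaBelief, CATEGORY_OF, predict and observe are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta prior/posterior over P(attribute applies) for one category (assumed form)."""
    alpha: float = 1.0  # pseudo-count of "yes" answers
    beta: float = 1.0   # pseudo-count of "no" answers

    def mean(self) -> float:
        return self.alpha / (self.alpha + self.beta)

    def update(self, answer: bool) -> None:
        """Posterior update given one user answer r."""
        if answer:
            self.alpha += 1.0
        else:
            self.beta += 1.0


# Hypothetical entity-to-category assignment (one category per entity for simplicity).
CATEGORY_OF = {"giraffe": "ungulate", "zebra": "ungulate", "dolphin": "cetacean"}
beliefs = {category: BetaBelief() for category in set(CATEGORY_OF.values())}


def predict(entity: str) -> float:
    """Predicted probability that the attribute applies to the entity."""
    return beliefs[CATEGORY_OF[entity]].mean()


def observe(entity: str, answer: bool) -> None:
    """Fold a user answer into the belief of the entity's category."""
    beliefs[CATEGORY_OF[entity]].update(answer)


observe("giraffe", True)   # user answers YES for the giraffe
print(predict("zebra"))    # rises above 0.5: same category as the giraffe
print(predict("dolphin"))  # stays at 0.5: different category, prior unchanged
```

The example mirrors the learner assumption that entities in the same category are more likely to share an attribute value: one answer about the giraffe shifts the prediction for the zebra but not for the dolphin.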

Scores for each active learner
● Learner C
● Learner M
● Learner H
