Teacher-Aware
Active Robot Learning
Mattia Racca, Antti Oulasvirta and Ville Kyrki
ACM/IEEE International Conference on
Human-Robot Interaction (HRI), 2019
mattia.racca@aalto.fi
Why (active) learning robots?
Programming robots is hard; pre-programming them for each task is impossible.
Robots should learn by interacting with humans!
M. Racca and V. Kyrki, Active Robot Learning for Temporal Task Models, HRI '18
The idea behind Active Learning
The agent can efficiently choose what to learn next.
… and improve its model faster!
Important aspects of Active Learning for HRI
1. Interactive Nature
● Transparency
● Design of questions
● Control over interaction
● Timing of questions
2. Query Efficiency
● Learning faster (with less data)
But what about REAL users?
What if efficient query selection is not best for the interaction?
Can efficiency indirectly counter its own benefits?
Query efficiency can lead to complex questions and to questions out of context, which are harder for the teacher:
● slower interaction
● more effort
● more errors!
Different types of Active Learning
1. Classic AL strategy (Learner C)
2. Teacher-Aware AL strategy (Learner M)
3. Hybrid AL strategy (Learner H)
Problem statement & Evaluation scenario
An agent has to learn the value of a certain attribute a for a set E of entities by making queries. We used the Animals with Attributes 2* dataset, with 50 animals (entities) and 85 semantic attributes.
Example query: "Do giraffes have patches?" Teacher's answer: "YES".
● Categories C are defined over the entities using WordNet
● Learner assumption: entities in the same category are more likely to share the same attribute value.
* Y. Xian et al., Zero-Shot Learning - A Comprehensive Evaluation of the Good, the Bad and the Ugly, T-PAMI
Classic AL: Uncertainty Sampling
● Learner C:
○ uses Uncertainty Sampling
○ selects the most uncertain query, given the current model
○ as expected, efficient!
● Learner C drawbacks:
○ Some questions are difficult!
○ Topic or context switches!
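As an illustration of the uncertainty sampling idea, here is a minimal sketch. It assumes the learner keeps, for each entity, a probability that the attribute applies; the function names and the example beliefs are hypothetical and are not taken from the authors' code.

```python
import math


def binary_entropy(p: float) -> float:
    """Entropy (in bits) of a yes/no answer that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1.0 - p) * math.log2(1.0 - p))


def most_uncertain_query(beliefs: dict, asked: set) -> str:
    """Return the not-yet-asked entity whose predicted answer is most uncertain."""
    candidates = [e for e in beliefs if e not in asked]
    if not candidates:
        raise ValueError("no queries left")
    return max(candidates, key=lambda e: binary_entropy(beliefs[e]))


# Hypothetical beliefs about the attribute "has patches" (assumed values).
beliefs = {"giraffe": 0.55, "zebra": 0.20, "dolphin": 0.05, "cow": 0.70}
first = most_uncertain_query(beliefs, asked=set())
second = most_uncertain_query(beliefs, asked={first})
```

With these assumed beliefs, the first query concerns the giraffe (its probability is closest to 0.5) and the second concerns the cow. Nothing in this selection rule looks at what was asked before, which is where the topic switches come from.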
In response to the drawbacks
● Teacher-Aware strategy (Learner M)
○ inspired by the ACT-R declarative memory model: “Information associated with recently retrieved information is easier to retrieve”
○ minimizes the distance between consecutive queries
● Hybrid strategy (Learner H)
○ a tradeoff between Learner C and Learner M
Teacher-Aware AL: Memory Effort strategy
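The two strategies can be sketched as follows. This is only an illustration under stated assumptions: the toy category tree, the use of tree path length as the "distance" between queries, and the linear weighting in the hybrid score are all hypothetical choices, not the formulation used in the paper.

```python
import math

# Hypothetical category tree (child -> parent); the slides build theirs from WordNet.
PARENT = {
    "giraffe": "ungulate", "zebra": "ungulate", "cow": "ungulate",
    "dolphin": "aquatic", "seal": "aquatic",
    "ungulate": "mammal", "aquatic": "mammal",
}


def path_to_root(node):
    """List of nodes from the given node up to the root of the tree."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path


def tree_distance(a, b):
    """Number of edges between a and b in the category tree (assumed distance)."""
    path_b = path_to_root(b)
    depth_in_b = {n: i for i, n in enumerate(path_b)}
    for i, n in enumerate(path_to_root(a)):
        if n in depth_in_b:
            return i + depth_in_b[n]
    raise ValueError("nodes are not in the same tree")


def entropy(p):
    """Entropy (in bits) of a yes/no answer that is 'yes' with probability p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def learner_m(candidates, previous):
    """Teacher-aware choice: the candidate closest to the previous query."""
    return min(candidates, key=lambda e: tree_distance(e, previous))


def learner_h(beliefs, candidates, previous, weight=0.5):
    """Hybrid choice: reward uncertainty, penalize distance from the previous query."""
    max_dist = max(tree_distance(e, previous) for e in candidates) or 1

    def score(e):
        closeness_penalty = tree_distance(e, previous) / max_dist
        return weight * entropy(beliefs[e]) - (1 - weight) * closeness_penalty

    return max(candidates, key=score)


# Assumed beliefs; the previous question was about the giraffe.
beliefs = {"zebra": 0.9, "cow": 0.8, "dolphin": 0.5, "seal": 0.45}
candidates = list(beliefs)
pick_m = learner_m(candidates, previous="giraffe")
pick_h = learner_h(beliefs, candidates, previous="giraffe")
pick_c = learner_h(beliefs, candidates, previous="giraffe", weight=1.0)
```

In this toy setting the teacher-aware rule stays with the ungulates (zebra), the pure uncertainty rule (`weight=1.0`) jumps to the dolphin, and the hybrid rule picks the cow, which is both nearby and still fairly uncertain.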
Performance in Simulation
Simulation on the entire dataset:
● Perfect users (no errors, no distraction)
● Baseline: asks random questions and cannot leverage our model to make predictions
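The simulation setup can be pictured with a small sketch of a perfect simulated teacher and the random baseline. The ground-truth values, function names, and the "coverage" measure are hypothetical; the sketch only assumes that a learner without a model knows exactly the answers it has been given.

```python
import random


def perfect_user(truth, entity):
    """Simulated teacher that always answers correctly and never gets distracted."""
    return truth[entity]


def random_baseline(truth, n_queries, seed=0):
    """Ask n_queries random questions; without a model, only asked entities are known."""
    rng = random.Random(seed)
    order = rng.sample(sorted(truth), k=len(truth))  # random question order
    known = {}
    coverage = []  # fraction of entities with a known attribute value after each query
    for entity in order[:n_queries]:
        known[entity] = perfect_user(truth, entity)
        coverage.append(len(known) / len(truth))
    return known, coverage


# Assumed ground truth for one attribute over four entities.
truth = {"giraffe": True, "zebra": False, "cow": True, "dolphin": False}
known, coverage = random_baseline(truth, n_queries=2)
```

Because this baseline cannot generalize from one entity to another, its coverage grows by exactly one entity per question; a learner that exploits the category assumption can instead predict values for entities it has not asked about.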
What about real users?
User study: 26 participants, the 3 strategies as conditions (within-subject).
Data logged:
● NASA TLX
● Q&A, response times, prediction power
● Overall preferences
Our hypotheses: Learner M makes the participants reply (a) faster and (b) with fewer errors compared to Learner C, with Learner H achieving intermediate results.
Results
(Unexpected) Results
[Figures: result plots]
Discussion
● Higher response time and more errors for Learner C.
○ stressful, unpredictable and requiring more thinking
● Higher response time and more errors for Learner M.
○ easy, natural and predictable
○ too easy? lowering attention or causing boredom
○ too predictable? using the same (maybe wrong) answer
● Overall preferences:
○ Learner C as efficient → Mitigating difficulty!
○ Learner M as useless → Frustration and boredom!
● AVOID USELESS QUESTIONS!
Conclusions
Can efficiency-driven Active Learning counter its own benefits?
If we bring non-oracle users into the equation, yes!
But we have just scratched the surface...
● We need a better understanding of the interaction aspects that can affect learning
● We need strategies that can adapt to the specific user
Thank you for the attention!
Code available at github.com/MattiaRacca
Can efficiency-driven Active Learning counter its own benefits? If we bring non-oracle users and the interaction into the equation, yes!
Tree building algorithm
Attribute-Category Model
We model the probability of attribute a applying to category c as [equation], and we maintain a prior over these distributions. We can then compute the probability of a applying to entity e as [equation], and therefore predict attribute-entity pairs given our current model.
The update step of the model is the computation of the posterior distributions, given the user answer r as an observation.
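Since the model's equations are not recoverable from the slides, the following is only a sketch of one common way to realize such a model: a Bernoulli parameter per category with a Beta prior, updated from yes/no answers. The Beta-Bernoulli choice, the uniform prior, the one-category-per-entity mapping, and all names below are assumptions, not the authors' formulation.

```python
from dataclasses import dataclass


@dataclass
class BetaBelief:
    """Beta(alpha, beta) belief over how likely the attribute applies in a category."""
    alpha: float = 1.0  # assumed uniform prior
    beta: float = 1.0

    def mean(self) -> float:
        """Posterior mean of the per-category probability."""
        return self.alpha / (self.alpha + self.beta)

    def update(self, answer: bool) -> None:
        """Posterior after observing one yes/no answer r from the teacher."""
        if answer:
            self.alpha += 1.0
        else:
            self.beta += 1.0


# Hypothetical entity -> category assignment (one category per entity, for simplicity).
CATEGORY = {"giraffe": "ungulate", "zebra": "ungulate", "dolphin": "aquatic"}
beliefs = {c: BetaBelief() for c in set(CATEGORY.values())}


def predict(entity: str) -> float:
    """Probability that the attribute applies to the entity, via its category."""
    return beliefs[CATEGORY[entity]].mean()


def observe(entity: str, answer: bool) -> None:
    """Update the belief of the entity's category with the teacher's answer."""
    beliefs[CATEGORY[entity]].update(answer)


observe("giraffe", True)  # "Do giraffes have patches?" -> YES
```

After this single answer, the prediction for the zebra (same category) rises to 2/3, while the prediction for the dolphin stays at the prior mean of 0.5. This mirrors the learner assumption that entities in the same category are more likely to share an attribute value.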
Scores for each active learner
Learner C
Learner M
Learner H
Assumption choice
