Juxi Leitner
arc centre of excellence for robotic vision
queensland university of technology
<j.leitner@qut.edu.au>

http://Juxi.net
(deep) reinforcement learning
http://roboticvision.org/
tinyurl.com/QUTRobotics
roboticvision.org
Dalle Molle Institute for AI (IDSIA)
Work
Juxi
Leitner
PhD Informatics / Intelligent Systems
MSc Space Robotics & Automation
BSc Information & Software Engineering
Intelligent (Space) Robots
European Space Agency (ESA)
Erasmus Intelligent Systems
Work (Humanoid) Robot Vision
Instituto Superior Técnico (IST)
Mobility Intelligent Space Systems Laboratory
About Me
Current Robotic Vision and Actions
Queensland University of Technology (QUT)
arc centre of excellence for robotic vision | qut

juxi.net | roboticvision.org | bne-robotics.net | brisbane.ai
BRISBANE ARTIFICIAL INTELLIGENCE
@BrisbaneAI #brai
sponsors
Event hosts
startups research industry
create agents that see and
interact with the world
http://Juxi.net/aboutme
eye-hand coordination
Vision and Action
http://Juxi.net/projects/VisionAndAction/
BRISBANE.AI
defining AI
study of "intelligent agents":
any device that perceives its environment and takes actions
that maximize its chance of success at some goal
agent interacting

with the world
IM-CLeVeR Teaser
https://www.youtube.com/watch?v=OyfonCDxUiU
full video: https://vimeo.com/51011081
machine learning
supervised learning
…
machine learning
unsupervised learning
machine learning
reinforcement learning
[diagram: agent sees, acts and receives reward; candidate actions: run, hug, ?]
machine learning
reinforcement learning
[diagram: agent sees, acts and receives reward; actions: run, hug; outcome: hug :)]
machine learning
reinforcement learning
reward
reward
robot RL
Konidaris et al.
Autonomous Skill Acquisition on a Mobile Manipulator
https://www.youtube.com/watch?v=yUICAkSQTZY
robot RL
Jan Peters et al.
Motor Skill Learning from Demonstration
https://www.youtube.com/watch?v=qtqubguikMk
foundations
a policy, a reward signal, a value function,
and, optionally, a model of the environment
http://cs.stanford.edu/people/karpathy/reinforcejs/
http://karpathy.github.io/2016/05/31/rl/
policy
Policy defines the actions to be taken per state
value function
Value function is a prediction of future reward per state
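As a rough illustration of "future reward", the sketch below sums a sequence of rewards with a discount factor. The discount factor `gamma` and the name `discounted_return` are assumptions made for this example; the slide does not specify them. A value function would predict the expected value of this quantity for each state.

```python
def discounted_return(rewards, gamma=0.9):
    """Sum of rewards, each discounted by gamma once per time step.

    gamma is an assumed discount factor; it is not given on the slide.
    """
    g = 0.0
    # Work backwards so that each step computes G_t = r_t + gamma * G_{t+1}.
    for r in reversed(rewards):
        g = r + gamma * g
    return g
```

With `gamma=1.0` this is the plain sum of rewards; smaller values weight near-term rewards more heavily.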
mdp
An information state (a.k.a. Markov state)
contains all useful information from the history.
i.e. the state is a sufficient statistic of the future
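The Markov property behind this statement can be written as follows, where $S_t$ (a symbol assumed here, not used on the slide) denotes the state at time step $t$:

```latex
\mathbb{P}\left[ S_{t+1} \mid S_t \right] = \mathbb{P}\left[ S_{t+1} \mid S_1, \dots, S_t \right]
```

In words: once the current state is known, the earlier history adds nothing to the prediction of the next state.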
pomdp
what if: robot with camera vision isn’t told its absolute location
agent state != environment state



Formally this is a partially observable Markov decision process
(POMDP)
mdp
policy iteration
value iteration
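Only the slide title survives here, so the following is a minimal sketch of value iteration on a hypothetical four-cell corridor, not the example used in the lecture. The corridor, the reward of 1 for reaching the last cell, the discount `GAMMA` and all function names are assumptions for illustration.

```python
GAMMA = 0.9           # assumed discount factor
N_STATES = 4          # corridor cells 0..3; cell 3 is the goal (terminal)
ACTIONS = (-1, +1)    # move left or right

def step(state, action):
    """Deterministic toy dynamics: returns (next_state, reward, done)."""
    nxt = min(max(state + action, 0), N_STATES - 1)
    done = nxt == N_STATES - 1
    return nxt, (1.0 if done else 0.0), done

def backup(values, state, action):
    """One-step lookahead: immediate reward plus discounted next-state value."""
    nxt, reward, done = step(state, action)
    return reward + (0.0 if done else GAMMA * values[nxt])

def value_iteration(tol=1e-9):
    """Repeat the Bellman optimality backup until the values stop changing."""
    values = [0.0] * N_STATES
    while True:
        delta = 0.0
        for s in range(N_STATES - 1):      # the terminal state keeps value 0
            new_v = max(backup(values, s, a) for a in ACTIONS)
            delta = max(delta, abs(new_v - values[s]))
            values[s] = new_v
        if delta < tol:
            return values

def greedy_policy(values):
    """Pick, for every non-terminal state, the action with the best backup."""
    return [max(ACTIONS, key=lambda a: backup(values, s, a))
            for s in range(N_STATES - 1)]
```

In this toy problem the values grow towards the goal, and the greedy policy moves right in every cell.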
Q function
on/off policy
Q-learning (Watkins, 1989)
https://studywolf.wordpress.com/2013/07/01/reinforcement-learning-sarsa-vs-q-learning/
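The on/off-policy distinction is easiest to see in the update targets. The sketch below contrasts tabular Q-learning (off-policy) with SARSA (on-policy). The table layout, the learning rate `alpha`, the discount `gamma` and the function names are assumptions for illustration and are not taken from the slides.

```python
from collections import defaultdict

def q_learning_update(q, s, a, r, s_next, actions, alpha=0.1, gamma=0.9):
    """Off-policy: bootstrap from the best next action, whatever is taken next."""
    target = r + gamma * max(q[(s_next, b)] for b in actions)
    q[(s, a)] += alpha * (target - q[(s, a)])

def sarsa_update(q, s, a, r, s_next, a_next, alpha=0.1, gamma=0.9):
    """On-policy: bootstrap from the action the behaviour policy actually chose."""
    target = r + gamma * q[(s_next, a_next)]
    q[(s, a)] += alpha * (target - q[(s, a)])

def new_q_table():
    """Q-table keyed by (state, action); unseen entries default to 0.0."""
    return defaultdict(float)
```

The two updates differ only in the bootstrap term: Q-learning uses the maximum over next actions, SARSA uses the value of the next action that was really selected.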
deep hype
Dartmouth 1956
Initial Hype
AI Winter
Expert Systems Hype
AI Winter
AI Spring
Deep Learning Hype
deep RL
Sergey Levine / Google Brain
DeepMind
QUT
deep RL
DeepMind
idea: learn a policy to play a computer game
using only visual information
state: game screen
actions: button press
reward: game score
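The mapping of state to game screen, action to button press and reward to game score can be sketched as an interaction loop. `ToyGame` below is a hypothetical stand-in for a game emulator, invented for this example; it is not a real library API, and the fixed policies in the usage note replace the learned network.

```python
class ToyGame:
    """Hypothetical one-dimensional game: reach the right edge to score."""
    BUTTONS = ("left", "right")

    def __init__(self, width=5):
        self.width = width
        self.reset()

    def reset(self):
        self.pos = 0
        self.score = 0
        return self.screen()

    def screen(self):
        # "Pixels": a 1-D frame with the player position marked as 1.
        return tuple(1 if i == self.pos else 0 for i in range(self.width))

    def press(self, button):
        move = 1 if button == "right" else -1
        self.pos = min(max(self.pos + move, 0), self.width - 1)
        done = self.pos == self.width - 1
        reward = 1 if done else 0          # change in game score
        self.score += reward
        return self.screen(), reward, done

def play_episode(game, policy, max_steps=100):
    """Run one episode: the policy sees only the screen and returns a button."""
    screen = game.reset()
    total = 0
    for _ in range(max_steps):
        button = policy(screen)
        screen, reward, done = game.press(button)
        total += reward
        if done:
            break
    return total
```

For example, `play_episode(ToyGame(), lambda screen: "right")` collects the single point, while a policy that always presses "left" collects none.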
problems?
understanding limitations of deep nets,
reinforcement learning and transfer of knowledge
deep learning visual control
[Tow et al, ACRA 2015]
[Zhang et al, ACRA 2015]
deep learning visual control
[Zhang et al, arxiv.org]
deep learning visual control
understanding limitations of deep nets,
reinforcement learning and transfer of knowledge
transfer visual control

from simulation to real world
reward issue
perception issue
noise issue
exploration issue
deep learning visual servoing
[figure: network architecture]
Perception Module: input I (84×84) -> Conv1 (7×7 conv + ReLU, stride 2, 64 filters) -> Conv2 (4×4 conv + ReLU, stride 2, 64 filters) -> Conv3 (3×3 conv + ReLU, stride 1, 64 filters) -> fully conn. -> bottleneck BN (5 units, θ)
Control Module: FC_c1, FC_c2, FC_c3 (fully connected layers with 400, 300 and 9 units) -> Q-values
[figure: panels A to E, bottleneck or occlusion cases]
[Zhang et al, arxiv.org]
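The kernel sizes and strides listed on the architecture slide determine how the 84×84 input shrinks through the three convolutions. The sketch below computes those sizes under the assumption of no padding and floor rounding; the slide does not state the padding, so the numbers are illustrative rather than the authors' exact figures. Function names are hypothetical.

```python
def conv_out(size, kernel, stride, padding=0):
    """Spatial output size of a convolution, using floor division."""
    return (size + 2 * padding - kernel) // stride + 1

# (kernel, stride) pairs as read off the slide; zero padding is an assumption.
LAYERS = [(7, 2), (4, 2), (3, 1)]

def feature_map_sizes(input_size=84, layers=LAYERS):
    """Side length of the feature map after each convolutional layer."""
    sizes = []
    size = input_size
    for kernel, stride in layers:
        size = conv_out(size, kernel, stride)
        sizes.append(size)
    return sizes

def flattened_features(filters=64):
    """Number of values handed to the first fully connected layer."""
    side = feature_map_sizes()[-1]
    return side * side * filters
```

Under these assumptions the feature maps are 39, 18 and 16 pixels wide, so the first fully connected layer would receive 16 × 16 × 64 = 16384 values.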
understanding limitations of deep nets,
reinforcement learning and transfer of knowledge
ARC Centre of Excellence for Robotic Vision roboticvision.org
limitations of current robotic systems

reproducible research on TASKS not datasets 

picking benchmark
http://Juxi.net/dataset/acrv-picking-benchmark/
https://arxiv.org/abs/1609.05258
deep RL
Sergey Levine, Peter Pastor et al. (Google Brain)
artificial curiosity
idea:
Reward the reward-optimizing controller for
actions yielding data that cause
improvements of the adaptive predictor or
data compressor!
similar: intrinsic motivation
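One way to read this idea in code: the intrinsic reward is the amount by which new data improves the predictor. The sketch below uses a deliberately tiny predictor (an exponential moving average of scalar observations) that is a hypothetical stand-in, chosen only to keep the example short; it is not the predictor or compressor from the cited work.

```python
class MovingAveragePredictor:
    """Hypothetical adaptive predictor: guesses the next observation as an
    exponential moving average of what it has seen so far."""

    def __init__(self, learning_rate=0.5):
        self.estimate = 0.0
        self.learning_rate = learning_rate

    def error(self, observation):
        """Squared prediction error for this observation."""
        return (observation - self.estimate) ** 2

    def update(self, observation):
        """Move the estimate a fraction of the way towards the observation."""
        self.estimate += self.learning_rate * (observation - self.estimate)

def curiosity_reward(predictor, observation):
    """Intrinsic reward = how much this observation improved the predictor."""
    error_before = predictor.error(observation)
    predictor.update(observation)
    error_after = predictor.error(observation)
    return error_before - error_after
```

Seeing the same observation repeatedly yields a shrinking reward, so the agent is pushed towards data it can still learn from.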
RL applications
Motion Planning
Grasping
End-to-end
Vision (?)
challenges
Curiosity?
Exploration/Exploitation?
Oracles?
discrete -> continuous?
real-world reward?
https://github.com/aikorea/awesome-rl#theory
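For the exploration/exploitation question in the list above, a common baseline is epsilon-greedy action selection. The slides do not name a specific strategy, so this is an illustrative sketch; the function name and the dictionary layout of the value estimates are assumptions.

```python
import random

def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick an action from q_values, a dict mapping action -> estimated value.

    With probability epsilon a random action is tried (exploration);
    otherwise the currently best-looking action is taken (exploitation).
    """
    if rng.random() < epsilon:
        return rng.choice(list(q_values))
    return max(q_values, key=q_values.get)
```

Setting `epsilon=0.0` gives a purely greedy agent, and `epsilon=1.0` a purely random one; in practice the value is often decayed over training.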
new developments
how to keep in the loop?
arxiv-sanity, twitter & get your hands dirty
come to Brisbane.AI meetups! :)
http://Juxi.net/workshop/deep-learning-rss-2017/
Tools and toolboxes
Neuroscience vs Deep Learning
&
Evolutionary approaches
Generative Adversarial Networks
Unsupervised Learning, Embodied Learning
Jürgen ‘Juxi’ Leitner
arc centre of excellence for robotic vision | qut

juxi.net | roboticvision.org | bne-robotics.net | brisbane.ai
In which we try to explain why we consider artificial
intelligence to be a subject most worthy of study, and
in which we try to decide what exactly it is, this
being a good thing to decide before embarking.
TUTORIAL ONE
BRISBANE ARTIFICIAL INTELLIGENCE
http://Juxi.net

<juxi.leitner@gmail.com>
interested?
j.leitner@roboticvision.org
http://Juxi.net/projects
Juxi
BEB801/2 Projects
PhD positions
ideas…
Amazon Robotics Challenge
Jürgen ‘Juxi’ Leitner
arc centre of excellence for robotic vision
queensland university of technology
<j.leitner@qut.edu.au> http://Juxi.net
object manipulation in clutter
long term goal
#cartman
Hardware
Deep Reinforcement Learning | Amazon Robotics Challenge, Image Processing Lecture (EGH444, QUT)
s m a r t . r o b o t s .
SMRTRobots
END-EFFECTOR
Perception
semantic segmentation
rapid training
grasp synthesis
in Action
Videos
https://www.youtube.com/watch?time_continue=5&v=p-WhO0LF4oY (ARC Pick failure)
https://www.youtube.com/watch?v=BB5Pyh4dtxw (ARC Quick Learning of Items)
https://www.youtube.com/watch?v=VEKanLH2gFY (ARC Finals)
https://www.youtube.com/watch?v=a4_j6EAK3rs&feature=youtu.be (Reaching Learning)
Papers / TechRep
Cartman: The low-cost Cartesian Manipulator that won the Amazon Robotics Challenge. Douglas Morrison, et al.
https://arxiv.org/abs/1709.06283
Mechanical Design of a Cartesian Manipulator for Warehouse Pick and Place. M. McTaggart, et al.
http://juxi.net/papers/ACRV-TR-2017-02.pdf
Design of a Multi-Modal End-Effector and Grasping System: How Integrated Design helped win the ARC. S. Wade-McCue, N. Kelly-Boxall, et al.
http://juxi.net/papers/ACRV-TR-2017-03.pdf
Semantic Segmentation from Limited Training Data. Anton Milan, et al.
http://juxi.net/papers/ACRV-TR-2017-04.pdf
Sim-to-real Transfer of Visuo-motor Policies for Reaching in Clutter: Domain Randomization & Adaptation with Modular Nets. Fangyi Zhang, et al.
https://arxiv.org/abs/1709.05746
Training Deep Neural Networks for Visual Servoing. Quentin Bateux, et al.
https://arxiv.org/abs/1705.08940
http://Juxi.net/papers
Nice Features
HW=12k
project=42k
travel=22k
w/o salaries
#teamACRVRoboticVisionAU
Adam Tow
Steve Martin
Rohan Smith
Jordan Erskine
Anthony Gillespie
Riccardo Grinover
Alec Gurman
Tom Hunn
Darryl Lee
Nathan Perkins
Gerard Rallos
Andrew Razjigaev
Juxi Leitner, Ian Reid, Peter Corke
http://facebook.com/TeamACRV
Doug Morrison
Matt McTaggart
Zheyu Zhuang
Norton Kelly-Boxall
Sean Wade-McCue
Thomas Rowntree
Trung Pham
Vijay Kumar
Ming Cai
Saroj Weerasekera
Chris Lehnert
Anton Milan
Thank
You!
