Methods
 Data from over 100 different users was collected
on 12 pre-defined gestures: Tap, Two-finger tap,
Swipe, Grab, Release, Pinch, Wipe, Checkmark,
Figure 8, Lowercase e, Capital E, Capital F
 The total dataset is 1.5 GB, containing ~9,600
gesture instances
 Preprocessing gestures into hand-selected
features, such as maximum displacement, and
feeding them into a three-layer neural network
returned mixed results
 Needed a way to turn messy temporal data with
variable timespans for the same gesture into a
fixed representation with constant-sized input
 To do this, created motion images of each type of
gesture by mapping 3D locations to pixels,
projecting onto XY, YZ, and XZ planes
 This image representation of gestures gives
learning models considerable flexibility
 Well-defined methods exist for image
classification
 Temporal history can be encoded in the images
by decaying older parts of the path
 Can easily augment the dataset through skews,
reflections, and transformations of the captured
images instead of modifying the underlying data
 Data augmentation leads to reduced overfitting
in the learning models
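The motion-image construction and temporal decay described above can be sketched as follows. This is an illustrative sketch, not the project's code; the 32x32 grid size and 0.95 decay factor are assumed parameters.

```python
import numpy as np

def motion_images(path, size=32, decay=0.95):
    """path: (T, 3) sequence of 3D positions; returns three (size, size)
    images for the XY, YZ, and XZ projections, older samples drawn dimmer."""
    path = np.asarray(path, dtype=float)
    # Normalize each axis into [0, size-1] so gestures of any spatial extent
    # or duration map onto the same fixed-size grid (constant-sized input).
    mins, maxs = path.min(axis=0), path.max(axis=0)
    span = np.where(maxs - mins > 0, maxs - mins, 1.0)
    pix = ((path - mins) / span * (size - 1)).astype(int)
    planes = [(0, 1), (1, 2), (0, 2)]  # axis pairs: XY, YZ, XZ
    images = np.zeros((3, size, size))
    T = len(path)
    for t, p in enumerate(pix):
        weight = decay ** (T - 1 - t)  # newest sample brightest, older fade
        for k, (a, b) in enumerate(planes):
            images[k, p[b], p[a]] = max(images[k, p[b], p[a]], weight)
    return images
```

Because the gesture is now just an image, the dataset can be augmented without touching the raw data; for example, `np.flip(images, axis=2)` produces a horizontal reflection of all three planes.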
 Deep belief nets: use a Restricted Boltzmann
Machine trained with contrastive divergence to
extract features from the data without needing
class labels
 Stack RBMs by using the output of a hidden
layer as the visible input to the next RBM
 Add a softmax output layer as the classifier and
use backpropagation to fine-tune the model
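One contrastive-divergence (CD-1) update for a binary RBM can be sketched as below. This is an illustrative NumPy sketch assuming binary visible and hidden units, not the implementation used in this work.

```python
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, b, c, lr=0.1):
    """v0: (n_vis,) binary visible vector; W: (n_vis, n_hid); b, c: biases.
    No class labels are needed: the update only uses the data vector v0."""
    # Positive phase: hidden probabilities given the data.
    ph0 = sigmoid(v0 @ W + c)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)
    # Negative phase: one Gibbs step back down to a reconstruction.
    pv1 = sigmoid(h0 @ W.T + b)
    v1 = (rng.random(pv1.shape) < pv1).astype(float)
    ph1 = sigmoid(v1 @ W + c)
    # Approximate log-likelihood gradient: data statistics minus
    # reconstruction statistics.
    W += lr * (np.outer(v0, ph0) - np.outer(v1, ph1))
    b += lr * (v0 - v1)
    c += lr * (ph0 - ph1)
    return W, b, c
```

Stacking then means feeding the trained hidden activations `sigmoid(v @ W + c)` in as the visible data for the next RBM, before the final softmax layer and backpropagation pass.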
 Convolutional Neural Networks: widely used
for the task of handwritten digit classification
and object recognition
 Also combines feature extraction with
classification, but greatly reduces the number of
parameters by sharing weights across every
location in a feature map
 Can tolerate translations and skew in the input
through overlapping receptive fields and pooling
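The parameter savings from weight sharing, and the translation tolerance from pooling, can be made concrete with a toy forward pass. The sizes here (32x32 input, one 5x5 feature map) are hypothetical, not the network trained in this work.

```python
import numpy as np

def conv2d_valid(image, kernel, bias=0.0):
    """Naive 'valid' 2D convolution (cross-correlation, as in most CNNs)
    with a single kernel shared across all image locations."""
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.empty((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel) + bias
    return out

def maxpool2x2(fmap):
    """2x2 max pooling: the output is unchanged by small translations."""
    H, W = fmap.shape
    return fmap[:H // 2 * 2, :W // 2 * 2].reshape(H // 2, 2, W // 2, 2).max(axis=(1, 3))

image = np.zeros((32, 32))
kernel = np.ones((5, 5)) / 25.0
fmap = conv2d_valid(image, kernel)   # (28, 28) outputs from one shared kernel
pooled = maxpool2x2(fmap)            # (14, 14) after pooling
shared_params = kernel.size + 1      # 26 parameters for the whole feature map
dense_params = 32 * 32 * 28 * 28     # weights a fully connected layer would need
```

The same 26 shared parameters produce all 784 feature-map outputs, where a dense layer mapping the same input to the same output size would need over 800,000 weights.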
Figure 2. The data captured from the Leap Motion device
References
 G.E. Hinton and R.R. Salakhutdinov, "Reducing the
Dimensionality of Data with Neural Networks", Science,
vol. 313, no. 5786, pp. 504-507, July 2006.
 G.E. Hinton, S. Osindero, and Y. Teh, "A Fast Learning
Algorithm for Deep Belief Nets", Neural Computation,
vol. 18, 2006.
 Y. LeCun, L. Bottou, Y. Bengio, and P. Haffner, "Gradient-
Based Learning Applied to Document Recognition",
Proceedings of the IEEE, 86(11):2278-2324, November 1998.
 A. Graves, Supervised Sequence Labelling with Recurrent
Neural Networks, vol. 385, Springer, 2012.
 S. Hochreiter and J. Schmidhuber, "Long Short-Term
Memory", Neural Computation, 9(8):1735-1780, 1997.
 Leap Motion. https://www.leapmotion.com/.
Accessed March 19, 2014.
 Y. LeCun, "LeNet-5, Convolutional Neural Networks".
Accessed February 24, 2015.
Introduction
 The Leap Motion controller uses IR cameras to
capture the position and orientation of a hand
 Allows the user's hands to be present onscreen,
performing gestures and actions as a way to
interact with a computer without a mouse
 By itself, the model can only say what is
happening now, not what has been done over
time or what the user is trying to communicate
 Common solution is to use a finite state machine
to map gestures to controls:
If moving down at 45° then up at 45° → check mark
 This approach can’t scale to a large corpus of
gestures and can’t segment or parse continuous
gestures into a semantically meaningful language
 Continuous gesture recognition has applications
in sign language translation, design tools, robot
control, gaming, and stereoscopic computing
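A toy version of the rule-based recognizer described above (with hypothetical direction rules standing in for the 45° conditions) shows why each gesture needs its own hand-written state machine:

```python
# Hand-written finite state machine for one gesture: a checkmark is a stroke
# that moves down-right, then turns and moves up-right. Every new gesture
# requires another rule like this, and none of them can segment a continuous
# stream of input -- the limitation motivating the learned approach.
def is_checkmark(points):
    """points: list of (x, y) screen positions, with y increasing downward."""
    state = "down"
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        moving_down = y1 > y0 and x1 > x0
        moving_up = y1 < y0 and x1 > x0
        if state == "down" and moving_up:
            state = "up"          # the turn at the bottom of the check
        elif state == "down" and not moving_down:
            return False          # any other motion breaks the rule
        elif state == "up" and not moving_up:
            return False
    return state == "up"
```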
Objectives
 Collect a dataset using the Leap Motion
controller for training models
 Use deep learning to segment, classify, and
parse a series of human input gestures
 Create a “gesture engine” that can be trained on
desired gestures and used inside Leap Motion-
enabled applications for continuous, meaningful
interaction with a computer without a mouse
3D Gesture Recognition with the Leap Motion Controller
Robert McCartney, Dr. Hans-Peter Bischof
Rochester Institute of Technology
Figure 1. Leap
Motion’s model
of the hand
Figure 3. A checkmark
Figure 4. A capital E
Figure 5. A figure 8
Figure 6. RBM as an energy model
Figure 7. Deep belief net
Figure 8. Convolutional NN
Taken from http://parse.ele.tue.nl/education/cluster2
Future Work
 Working with Jie Yuan to incorporate his HMM
in order to model the hidden rejection state
when the user is not performing any gesture
 The HMM will segment online gestures, with
segmented data then turned into motion images
 This will allow for continuous online gesture
recognition
 Alternative future approaches include Recurrent
Neural Networks and LSTMs
 Various dimensionality reduction techniques
remain to be explored: autoencoders, PCA, and
manifold methods such as Locality Preserving
Projections (LPP)
 Pending issue: long training times require GPU
implementations to run efficiently
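One of the dimensionality-reduction options listed above, PCA, can be sketched on flattened motion images. The shapes are hypothetical (100 gestures, three 32x32 planes) and this is not the project's pipeline:

```python
import numpy as np

def pca_fit_transform(X, n_components):
    """X: (n_samples, n_features). Returns the projected scores and the
    principal axes, computed via an economy SVD of the centered data."""
    Xc = X - X.mean(axis=0)
    # Rows of Vt are the orthonormal principal directions, ordered by
    # decreasing explained variance.
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    components = Vt[:n_components]
    return Xc @ components.T, components

rng = np.random.default_rng(0)
X = rng.random((100, 3 * 32 * 32))            # 100 gestures, flattened images
scores, components = pca_fit_transform(X, n_components=50)
```

Reducing each 3,072-pixel motion image to a few dozen components would also cut the input size of the downstream networks, which bears directly on the long-training-time issue noted above.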