Silhouette analysis based action recognition via exploiting human poses


CONTENTS
• Abstract & Objective
• Introduction
• Software Requirement
• Hardware Requirement
• Existing System & Disadvantages
• Proposed System & Advantages
• Literature Survey
• Application
• Conclusion
• References

Abstract
• In this paper, we propose a novel scheme for human action recognition that combines the advantages of both local and global representations.
• We explore human silhouettes for human action representation by taking into account the correlation between sequential poses in an action.
• A modified bag-of-words model, named bag of correlated poses, is introduced to encode temporally local features of actions.
• To utilize the property of visual word ambiguity, we reduce the dimensionality of our model.
• To compensate for the loss of structural information, we propose an extended motion template, i.e., extensions of the motion history image, to capture the holistic structural features.
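
The motion history image mentioned above can be illustrated with a minimal sketch. It shows the basic, commonly used update rule only (moving pixels are reset to a maximum value, all others decay), not the extended template proposed in the paper. The function names, the `tau` duration, and the intensity `threshold` are assumptions chosen for illustration, and frames are assumed to be grayscale lists of rows.

```python
def update_mhi(mhi, prev_frame, frame, tau=5, threshold=30):
    """Return the next motion history image (basic rule, not the extended one).

    A pixel whose intensity changed by more than `threshold` is set to `tau`;
    every other pixel decays by 1, never dropping below 0.
    """
    rows, cols = len(frame), len(frame[0])
    out = [[0] * cols for _ in range(rows)]
    for y in range(rows):
        for x in range(cols):
            moving = abs(frame[y][x] - prev_frame[y][x]) > threshold
            out[y][x] = tau if moving else max(0, mhi[y][x] - 1)
    return out


def motion_history(frames, tau=5, threshold=30):
    """Accumulate an MHI over a list of equally sized grayscale frames."""
    mhi = [[0] * len(frames[0][0]) for _ in frames[0]]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, threshold)
    return mhi
```

In the resulting image, recently moving pixels hold the largest values and older motion fades, so a single 2D template summarizes where and how recently motion occurred.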

OBJECTIVE
• The objective of vision-based human action
recognition is to label the video sequence with its
corresponding action category.

Software Requirement
• Operating System : Windows XP
• Language : MATLAB
• Version : MATLAB 7.9

Hardware Requirement
• Pentium IV – 2.7 GHz
• 1 GB DDR RAM
• 250 GB Hard Disk

Existing system
• Space-time interest points (STIPs) detected using a temporal Gabor filter and a spatial Gaussian filter.
• STIP detectors such as Harris3D, Cuboid, 3D-Hessian, dense sampling, and the spatiotemporal regularity-based feature, with descriptors such as HOG/HOF, HOG3D, extended SURF, and MoSIFT.
• A PageRank-based centrality measure to select key poses according to the recovered geometric structure.
• Utilizing properties of the solution to the Poisson equation to extract space-time features.
• Calculating the differences between frames and using them as intermediate features.
• An action recognition framework fusing local 3D-SIFT descriptors and holistic Zernike motion energy image (MEI) features.

Disadvantages
• Segmentation and tracking are not possible.
• Computing the feature points is too time-consuming.
• Sparse representation, such as bag of visual words (BoVWs), discards the geometric relationship of the features and is less discriminative.
• Hard-assignment quantization during codebook construction for BoVW introduces quantization error.
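
To make the quantization issue concrete, the sketch below contrasts hard assignment with a Gaussian-kernel soft assignment when building a bag-of-visual-words histogram. It is an illustrative sketch only: the kernel form, the `sigma` parameter, and the function names are assumptions and are not taken from the paper.

```python
import math


def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_histogram(features, codebook):
    """Each feature votes only for its single nearest codeword."""
    hist = [0.0] * len(codebook)
    for f in features:
        nearest = min(range(len(codebook)), key=lambda k: sq_dist(f, codebook[k]))
        hist[nearest] += 1.0
    return hist


def soft_histogram(features, codebook, sigma=1.0):
    """Each feature spreads one unit of vote over all codewords (assumed Gaussian kernel)."""
    hist = [0.0] * len(codebook)
    for f in features:
        dists = [sq_dist(f, c) for c in codebook]
        d_min = min(dists)  # shift by the minimum so the largest weight is exactly 1
        weights = [math.exp(-(d - d_min) / (2 * sigma ** 2)) for d in dists]
        total = sum(weights)
        for k, w in enumerate(weights):
            hist[k] += w / total
    return hist
```

A feature lying halfway between two codewords gets an arbitrary single vote under hard assignment but contributes equally to both under soft assignment, which is the ambiguity that hard quantization discards.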

Proposed system
• We propose a method to recognize actions from human silhouettes.
• We extract the BoCP (bag of correlated poses) feature.
• The BoCP feature is extracted in a sequence of steps:
– PCA feature extraction, followed by k-means clustering and construction of the correlogram matrix.
– The correlogram dimension is then reduced using LDA.
• The BoCP feature descriptor and the extended-MHI together form the feature vector.
• An SVM (support vector machine) is trained on the features and predicts the result.
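
The correlogram step in the pipeline above can be sketched as follows, assuming each frame has already been assigned a pose cluster label (for example by k-means on PCA features). The function name, the single time offset `delta`, and the normalization by the total pair count are assumptions made for illustration; the paper's exact construction may differ.

```python
def pose_correlogram(labels, num_poses, delta=1):
    """Co-occurrence matrix of pose labels that are `delta` frames apart.

    Entry [i][j] is the fraction of frame pairs (t, t + delta) in which
    pose i at time t is followed by pose j at time t + delta.
    """
    counts = [[0] * num_poses for _ in range(num_poses)]
    for a, b in zip(labels, labels[delta:]):
        counts[a][b] += 1
    total = sum(map(sum, counts))
    if total == 0:  # sequence shorter than delta + 1 frames
        return [[0.0] * num_poses for _ in range(num_poses)]
    return [[c / total for c in row] for row in counts]
```

Unlike a plain histogram of pose labels, the matrix records which pose tends to follow which, so two actions that use the same poses in a different order yield different descriptors. With k pose clusters the matrix has k × k entries per offset, which is why a dimensionality reduction step such as LDA follows.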

Advantages
• Reduces computational complexity and quantization error.
• The proposed scheme takes advantage of both local and global features.
• Provides a discriminative representation for human actions.

Action recognition using context and appearance distribution features
• We first propose a new spatio-temporal context distribution feature of interest points for human action recognition.
• Each action video is expressed as a set of relative XYT coordinates between pairwise interest points in a local region.
• We learn a global GMM (referred to as the Universal Background Model, UBM) using the relative coordinate features from all the training videos, and then represent each video as the normalized parameters of a video-specific GMM adapted from the global GMM.
• In order to capture the spatio-temporal relationships at different levels, multiple GMMs are utilized to describe the context distributions of interest points over multi-scale local regions.
• To describe the appearance information of an action video, we also propose to use a GMM to characterize the distribution of local appearance features from the cuboids centered around the interest points.
• Accordingly, an action video can be represented by two types of distribution features:
– 1) multiple GMM distributions of spatio-temporal context;
– 2) GMM distribution of local video appearance.
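
As a rough illustration of the relative XYT coordinate features described on this slide, the sketch below collects pairwise offsets between interest points. Modeling the "local region" as a Euclidean ball of a given `radius` in XYT space is an assumption, as is the function name; the GMM fitting and adaptation steps are not shown.

```python
def relative_xyt(points, radius):
    """Relative (dx, dy, dt) offsets between ordered pairs of interest points.

    `points` is a list of (x, y, t) tuples. A pair is kept only when the
    second point lies within `radius` of the first (assumed local region).
    """
    feats = []
    for i, (xi, yi, ti) in enumerate(points):
        for j, (xj, yj, tj) in enumerate(points):
            if i == j:
                continue
            dx, dy, dt = xj - xi, yj - yi, tj - ti
            if dx * dx + dy * dy + dt * dt <= radius * radius:
                feats.append((dx, dy, dt))
    return feats
```

Calling the function with several radii gives one feature set per scale, which mirrors the idea of describing context over multi-scale local regions.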

Action Recognition using Space-time Shape Difference Images
• In this paper, we present a novel motion representation based on difference images.
• We have presented a new method of extracting useful features from human action videos for action recognition.
• We show that this representation exploits the dynamics of motion, and show its effectiveness in action recognition.
• We showed the effectiveness of our method and compared our results against other well-established algorithms; the comparison shows that our algorithm has competitive accuracy, is fast, and is not very sensitive to video resolution, partial shape deformation of actions, or the number of clusters used.
• Future work can include combining other features containing additional shape information, and improving the quality of silhouette extraction.
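
A difference image between consecutive silhouettes can be computed as in the minimal sketch below. It assumes binary silhouettes stored as lists of rows of 0/1 values, and the function name is hypothetical; it illustrates only the frame differencing idea, not the full representation of the surveyed paper.

```python
def difference_images(silhouettes):
    """Pixel-wise absolute difference between each pair of consecutive silhouettes."""
    diffs = []
    for prev, cur in zip(silhouettes, silhouettes[1:]):
        diffs.append([
            [abs(c - p) for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(prev, cur)
        ])
    return diffs
```

For binary inputs the result marks exactly the pixels that entered or left the silhouette between two frames, so static body parts contribute nothing.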

Making action recognition robust to occlusions and viewpoint changes
• We propose a novel approach to providing robustness to both occlusions and viewpoint changes that yields significant improvements over existing techniques.
• At its heart is a local partitioning and hierarchical classification of the 3D Histogram of Oriented Gradients (HOG) descriptor to represent sequences of images that have been concatenated into a data volume.
• We achieve robustness to occlusions and viewpoint changes by combining training data from all viewpoints to train classifiers that estimate action labels independently over sets of HOG blocks.
• A top level classifier combines these local labels into a global action class decision.
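
The idea of combining local labels into one global decision can be illustrated with a simple majority vote. This is a stand-in chosen for brevity, not the trained top-level classifier the slide describes; the function name and the use of `None` to mark a block with no usable label (for example an occluded block) are assumptions.

```python
from collections import Counter


def combine_local_labels(block_labels):
    """Majority vote over per-block action labels, ignoring blocks marked None."""
    votes = Counter(label for label in block_labels if label is not None)
    if not votes:
        return None  # no block produced a usable label
    return votes.most_common(1)[0][0]
```

Because each block votes independently, a few occluded or misclassified blocks do not necessarily change the global decision.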

Action recognition using correlogram of body poses and spectral regression

• In this paper, we propose a novel representation for
human actions using Correlogram of Body Poses
(CBP) which takes advantage of both the probabilistic
distribution and the temporal relationship of human
poses.
• To reduce the high dimensionality of the CBP
representation, an efficient subspace learning
technique called Spectral Regression Discriminant
Analysis (SRDA) is explored.
• Experimental results on the challenging IXMAS
dataset show that the proposed algorithm
outperforms the state-of-the-art methods on action
recognition.

Evaluation of local spatio-temporal features for action recognition
• The purpose of this paper is to evaluate and compare previously proposed space-time features in a common experimental setup.
• In particular, we consider four different feature detectors and six local feature descriptors and use a standard bag-of-features SVM approach for action recognition.
• We investigate the performance of these methods on a total of 25 action classes distributed over three datasets with varying difficulty.
• Among interesting conclusions, we demonstrate that regular sampling of space-time features consistently outperforms all tested space-time interest point detectors for human actions in realistic settings.
• We also demonstrate a consistent ranking for the majority of methods over different datasets and discuss their advantages and limitations.

Applications
• Video Surveillance
• Robotics
• Human–Computer Interaction
• User Interface Design
• Multimedia Video Retrieval

Conclusion
• In this paper, we proposed two new representations for action recognition, namely, BoCP and the extended-MHI.
• BoCP is a temporally local feature descriptor, and the extended-MHI is a holistic motion descriptor.
• The extension of MHI compensated for information loss in the original approach, and we verified the conjecture that local and holistic features are complementary to each other.
• Our system showed promising performance and produced better results than any published paper on the IXMAS dataset.
• With more sophisticated feature descriptors and advanced dimensionality reduction methods, we expect that better performance can be achieved.

Future Work
• We propose to replace PCA (Principal Component
Analysis) feature extraction by ICA (Independent
Component Analysis), so that the accuracy of
recognition can be improved.

References
• X. Wu, D. Xu, L. Duan, and J. Luo, “Action
recognition using context and appearance distribution
features,”
• H. Qu, L. Wang, and C. Leckie, “Action recognition
using space-time shape difference images,”
• D. Weinland, M. Özuysal, and P. Fua, “Making
action recognition robust to occlusions and viewpoint
changes,”
• L. Shao, D. Wu, and X. Chen, “Action recognition
using correlogram of body poses and spectral
regression,”
• H. Wang, M. Ullah, A. Kläser, I. Laptev, and C.
Schmid, “Evaluation of local spatio-temporal features
for action recognition,”
