Silhouette Analysis-Based Action Recognition via Exploiting Human Poses
CONTENTS
• Abstract & Objective
• Introduction
• Software Requirement
• Hardware Requirement
• Existing System & Disadvantages
• Proposed System & Advantages
• Literature Survey
• Application
• Conclusion
• References
Abstract
• In this paper, we propose a novel scheme for human action recognition that combines the advantages of both local and global representations.
• We explore human silhouettes for human action representation by taking into account the correlation between sequential poses in an action.
• A modified bag-of-words model, named bag of correlated poses, is introduced to encode temporally local features of actions.
• To utilize the property of visual word ambiguity, we reduce the dimensionality of our model.
• To compensate for the loss of structural information, we propose an extended motion template, i.e., extensions of the motion history image, to capture the holistic structural features.
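As a rough illustration of the motion history image (MHI) that the extended motion template builds on, the sketch below implements the standard MHI update, not the extension proposed in the paper: a pixel that moved in the current frame is set to a duration `tau`, and every other pixel decays by one toward zero. The function names, the frame-difference motion test, and the `threshold` parameter are assumptions made for this example.

```python
def update_mhi(mhi, prev_frame, frame, tau, threshold):
    """Return the MHI after one new frame (standard update rule)."""
    rows, cols = len(frame), len(frame[0])
    new_mhi = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Assumed motion test: absolute frame difference against a threshold.
            moving = abs(frame[r][c] - prev_frame[r][c]) >= threshold
            if moving:
                new_mhi[r][c] = tau  # most recent motion gets the full duration
            else:
                new_mhi[r][c] = max(0, mhi[r][c] - 1)  # older motion fades out
    return new_mhi


def motion_history_image(frames, tau, threshold=1):
    """Fold a sequence of equally sized 2-D frames into a single MHI."""
    rows, cols = len(frames[0]), len(frames[0][0])
    mhi = [[0] * cols for _ in range(rows)]
    for prev_frame, frame in zip(frames, frames[1:]):
        mhi = update_mhi(mhi, prev_frame, frame, tau, threshold)
    return mhi


# A one-row clip in which a single silhouette pixel moves left to right.
clip = [[[1, 0, 0]], [[0, 1, 0]], [[0, 0, 1]]]
history = motion_history_image(clip, tau=3)
```

In the resulting image, pixels touched by the most recent motion hold the largest values, so a single grayscale template summarizes where and how recently motion occurred.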
OBJECTIVE
• The objective of vision-based human action
recognition is to label the video sequence with its
corresponding action category.
Software Requirement
• Operating System: Windows XP
• Language: MATLAB
• Version: MATLAB 7.9
Hardware Requirement
• Pentium IV – 2.7 GHz
• 1 GB DDR RAM
• 250 GB Hard Disk
Existing system
• Detecting spatiotemporal interest points (STIPs) using a temporal Gabor filter and a spatial Gaussian filter.
• STIP detectors and descriptors such as Harris3D, Cuboid, 3D-Hessian, dense sampling, the spatiotemporal regularity-based feature, HOG/HOF, HOG3D, extended SURF, and MoSIFT.
• A PageRank-based centrality measure to select key poses according to the recovered geometric structure.
• Utilizing properties of the solution to the Poisson equation to extract space-time features.
• Calculating the differences between frames and using them as intermediate features.
• An action recognition framework fusing local 3D-SIFT descriptors and holistic Zernike motion energy image (MEI) features.
Disadvantages
• Segmentation and tracking are not possible.
• Computing the feature points is too time-consuming.
• Sparse representations, such as bag of visual words (BoVW), discard the geometric relationships between features and are less discriminative.
• BoVW uses hard-assignment quantization during the codebook construction, which introduces quantization error.
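To make the hard-assignment issue concrete, the sketch below contrasts hard assignment with a soft assignment over the same codebook. Hard assignment maps a feature to its single nearest visual word, so a feature lying between two words loses that ambiguity. The Gaussian-kernel weighting, the `sigma` parameter, and all function names are assumptions chosen for illustration, not the exact scheme used in the paper.

```python
import math


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def hard_assign(feature, codebook):
    """Index of the nearest visual word (ties go to the lowest index)."""
    distances = [squared_distance(feature, word) for word in codebook]
    return distances.index(min(distances))


def soft_assign(feature, codebook, sigma=1.0):
    """Normalized Gaussian-kernel weights of the feature over all words.

    Assumes the feature is close enough to at least one word that the
    kernel values do not all underflow to zero.
    """
    kernels = [
        math.exp(-squared_distance(feature, word) / (2 * sigma ** 2))
        for word in codebook
    ]
    total = sum(kernels)
    return [k / total for k in kernels]


codebook = [[0.0, 0.0], [2.0, 0.0]]
feature = [0.5, 0.0]
nearest = hard_assign(feature, codebook)   # one word index only
weights = soft_assign(feature, codebook)   # one weight per word, summing to 1
```

With soft assignment, the feature contributes to both words in proportion to its closeness, which is one common way to reduce quantization error.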
Proposed system
• We propose a method to recognize actions from human silhouettes.
• We extract the BoCP (bag of correlated poses) feature.
• The BoCP feature is extracted in the following sequence of steps:
– PCA feature extraction, followed by k-means clustering and construction of the correlogram matrix.
– Reduction of the correlogram dimension using LDA.
• The BoCP feature descriptor and the extended MHI together form the feature vector.
• An SVM (Support Vector Machine) is trained on the features and predicts the result.
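The correlogram step can be sketched as follows. Assume each frame has already been reduced by PCA and assigned a pose label by k-means; the correlogram then counts how often pose `i` is followed by pose `j` a fixed number of frames later. This is a minimal hard-label version under those assumptions: the function name, the single `offset` parameter, and the normalization are illustrative choices, and the paper's actual construction may differ.

```python
def pose_correlogram(labels, num_poses, offset):
    """Normalized co-occurrence matrix of pose labels `offset` frames apart.

    Entry [i][j] is the fraction of frame pairs (t, t + offset) whose
    labels are i at time t and j at time t + offset.
    """
    counts = [[0.0] * num_poses for _ in range(num_poses)]
    for earlier, later in zip(labels, labels[offset:]):
        counts[earlier][later] += 1.0
    total = sum(sum(row) for row in counts)
    if total == 0:
        return counts  # sequence too short for this offset
    return [[value / total for value in row] for row in counts]


def flatten(matrix):
    """Row-major flattening, giving a vector for later dimension reduction."""
    return [value for row in matrix for value in row]


# Hypothetical label sequence for a clip that alternates between two poses.
labels = [0, 1, 0, 1]
correlogram = pose_correlogram(labels, num_poses=2, offset=1)
descriptor = flatten(correlogram)
```

Unlike a plain bag-of-words histogram, the matrix keeps the temporal ordering between pairs of poses. The flattened vector has `num_poses ** 2` entries, which is why a dimension-reduction step follows.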
Advantages
• Reduces computational complexity and quantization error.
• Takes advantage of both local and global features.
• Provides a discriminative representation for human actions.
Literature survey
Action recognition using context and appearance distribution features
• We first propose a new spatio-temporal context distribution feature of interest points for human action recognition.
• Each action video is expressed as a set of relative XYT coordinates between pairwise interest points in a local region.
• We learn a global GMM (referred to as Universal Background Model, UBM) using the relative coordinate features from all the training videos, and then represent each video as the normalized parameters of a video-specific GMM adapted from the global GMM.
• In order to capture the spatio-temporal relationships at different levels, multiple GMMs are utilized to describe the context distributions of interest points over multi-scale local regions.
• To describe the appearance information of an action video, we also propose to use GMM to characterize the distribution of local appearance features from the cuboids centered around the interest points.
• Accordingly, an action video can be represented by two types of distribution features:
– 1) multiple GMM distributions of spatio-temporal context;
– 2) GMM distribution of local video appearance.
Action Recognition using Space-time Shape Difference Images
• In this paper, we present a novel motion representation based on difference images.
• We have presented a new method of extracting useful features from human action videos for action recognition.
• We show that this representation exploits the dynamics of motion, and show its effectiveness in action recognition.
• We showed the effectiveness of our method and compared our results against other well-established algorithms. The comparison shows that our algorithm has competitive accuracy, is fast, and is not very sensitive to video resolution, partial shape deformation of actions, or the number of clusters used.
• Future work can include combining other features containing additional shape information, and improving the quality of silhouette extraction.
Making action recognition robust to occlusions and viewpoint changes
• We propose a novel approach to providing robustness to both occlusions and viewpoint changes that yields significant improvements over existing techniques.
• At its heart is a local partitioning and hierarchical classification of the 3D Histogram of Oriented Gradients (HOG) descriptor to represent sequences of images that have been concatenated into a data volume.
• We achieve robustness to occlusions and viewpoint changes by combining training data from all viewpoints to train classifiers that estimate action labels independently over sets of HOG blocks.
• A top-level classifier combines these local labels into a global action class decision.
Action recognition using correlogram of body poses and spectral regression
• In this paper, we propose a novel representation for
human actions using Correlogram of Body Poses
(CBP) which takes advantage of both the probabilistic
distribution and the temporal relationship of human
poses.
• To reduce the high dimensionality of the CBP
representation, an efficient subspace learning
technique called Spectral Regression Discriminant
Analysis (SRDA) is explored.
• Experimental results on the challenging IXMAS
dataset show that the proposed algorithm
outperforms the state-of-the-art methods on action
recognition.
Evaluation of local spatio-temporal features for action recognition
• The purpose of this paper is to evaluate and compare previously proposed space-time features in a common experimental setup.
• In particular, we consider four different feature detectors and six local feature descriptors and use a standard bag-of-features SVM approach for action recognition.
• We investigate the performance of these methods on a total of 25 action classes distributed over three datasets with varying difficulty.
• Among interesting conclusions, we demonstrate that regular sampling of space-time features consistently outperforms all tested space-time interest point detectors for human actions in realistic settings.
• We also demonstrate a consistent ranking for the majority of methods over different datasets and discuss their advantages and limitations.
Applications
• Video Surveillance
• Robotics
• Human–Computer Interaction
• User Interface Design
• Multimedia Video Retrieval
Conclusion
• In this paper, we proposed two new representations for action recognition, namely BoCP and the extended-MHI.
• BoCP was a temporally local feature descriptor and the extended-MHI was a holistic motion descriptor.
• The extension of MHI compensated for information loss in the original approach, and we later verified the conjecture that local and holistic features were complementary to each other.
• Our system showed promising performance and produced better results than any published paper on the IXMAS dataset.
• With more sophisticated feature descriptors and advanced dimensionality reduction methods, we expect better performance.
Future Work
• We propose to replace PCA (Principal Component
Analysis) feature extraction by ICA (Independent
Component Analysis), so that the accuracy of
recognition can be improved.
References
• X. Wu, D. Xu, L. Duan, and J. Luo, “Action recognition using context and appearance distribution features,”
• H. Qu, L. Wang, and C. Leckie, “Action recognition using space-time shape difference images,”
• D. Weinland, M. Özuysal, and P. Fua, “Making action recognition robust to occlusions and viewpoint changes,”
• L. Shao, D. Wu, and X. Chen, “Action recognition using correlogram of body poses and spectral regression,”
• H. Wang, M. Ullah, A. Klaser, I. Laptev, and C. Schmid, “Evaluation of local spatio-temporal features for action recognition,”
/AvvenireTechnologies

/avveniretech
