Anisotropic Partial Differential Equation
based Video Saliency Detection
Vartika Sharma, Vembarasan Vaitheeswaran, Chee Seng Chan
Original LESD Model
Our Contributions
• First, we propose a novel method to generate a static saliency map based on an adaptive nonlinear PDE model. It builds on the Linear Elliptic System with Dirichlet boundary (LESD) model for image saliency detection.
• We refine this model for video saliency detection, because the original LESD model does not consider the orientation and motion information contained in a video.
• Further, the original LESD algorithm was evaluated on the MSRA and Berkeley datasets, where images are mostly noiseless and the salient object lies near the image center. Most video datasets, in contrast, contain heavy noise, and the salient object usually moves within the frames. For this reason, we do not use the center prior of the original LESD model; instead, we use an extensive direction map consisting of background prior, color prior, texture, and luminance features.
• We then combine the static map with a motion map, built from motion features extracted from the motion vectors of the predicted frames, to obtain the final saliency map. Figure 1 shows the pipeline of our model.
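As a rough illustration of this final fusion step, the sketch below combines a static and a motion saliency map with a simple weighted sum. The min-max normalization and the weight `alpha` are assumptions for illustration, not necessarily the exact fusion rule used in our model.

```python
import numpy as np

def fuse_saliency(static_map, motion_map, alpha=0.5):
    """Combine a static and a motion saliency map into a final map.

    The weighted-sum rule and the weight alpha are illustrative
    assumptions; the actual fusion in the model may differ.
    """
    # normalize each map to [0, 1] so the weights are comparable
    s = (static_map - static_map.min()) / (static_map.max() - static_map.min() + 1e-8)
    m = (motion_map - motion_map.min()) / (motion_map.max() - motion_map.min() + 1e-8)
    return alpha * s + (1 - alpha) * m
```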
CVPR presentation
Addition of Non-Linear Metric Tensor
• The diffusion PDE seen previously does not give reliable information
in the presence of flow-like structures (e.g. fingerprints).
• We extend our model to flow-like structures, where the PDE flow must
be rotated towards the orientation of the interesting features.
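One standard way to recover the orientation of flow-like features, which this rotated diffusion needs, is the structure tensor built from image gradients. The sketch below is illustrative only: the usual Gaussian smoothing of the tensor entries is omitted for brevity, and our model's actual tensor may differ.

```python
import numpy as np

def structure_tensor_orientation(img):
    """Per-pixel orientation of the dominant gradient structure.

    Illustrative sketch: tensor entries are not smoothed here,
    though a Gaussian smoothing of the products is usual.
    """
    # image gradients via central differences (axis 0 = y, axis 1 = x)
    gy, gx = np.gradient(img.astype(float))
    # structure-tensor entries
    Jxx, Jxy, Jyy = gx * gx, gx * gy, gy * gy
    # orientation of the dominant eigenvector per pixel
    theta = 0.5 * np.arctan2(2 * Jxy, Jxx - Jyy)
    return theta
```

For a purely horizontal intensity ramp the dominant orientation is zero, i.e. aligned with the x-axis.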
Feature Extraction From DCT Coefficients
• Three features, luminance, color, and texture, are extracted from the
unpredicted frames (I-frames) using DCT coefficients.
• On a given video frame, the DCT operates on one 8×8 block at a time.
Each block contains 64 elements (64 coefficients), and the DCT
traverses the block from left to right and top to bottom (zig-zag
sequencing).
Feature Extraction From DCT Coefficients
• A 64-element DCT transform yields 1 DC coefficient and 63
AC coefficients.
• The DC coefficient represents the average color of the 8×8 region
(color and luminance priors).
• The 63 AC coefficients represent color change across the
block (texture).
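A minimal sketch of the transform behind these priors: an orthonormal 2-D DCT-II of an 8×8 block, where the DC coefficient captures the block average and the AC coefficients capture variation (texture). The zig-zag scan and the construction of the actual priors are omitted.

```python
import numpy as np

def dct2(block):
    """Orthonormal 2-D DCT-II of a square block, via the DCT matrix."""
    N = block.shape[0]
    n = np.arange(N)
    # DCT-II basis matrix: C[k, i] = cos(pi * (2i + 1) * k / (2N))
    C = np.cos(np.pi * (2 * n[None, :] + 1) * n[:, None] / (2 * N))
    C[0, :] *= 1 / np.sqrt(2)       # orthonormal scaling of the DC row
    C *= np.sqrt(2 / N)
    return C @ block @ C.T

# a flat gray 8x8 block: all energy lands in the DC coefficient
block = np.full((8, 8), 100.0)
coeffs = dct2(block)
# DC = N * mean for the orthonormal DCT (here 8 * 100 = 800);
# all 63 AC coefficients vanish because the block has no texture
```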
Motion Feature Extraction from Motion
Vectors
• Motion Vector: A two-dimensional vector used for inter prediction
that provides an offset from the coordinates in the decoded picture to
the coordinates in a reference picture.
• There are two types of predicted frames: P frames use motion
compensated prediction from a past reference frame, while B frames
are bidirectionally predictive-coded by using motion compensated
prediction from a past and/or a future reference frame.
Motion Feature Extraction from Motion
Vectors
• As there is just one prediction direction for P frames (prediction
from a past reference frame), the original motion vector MV is used
to represent the motion feature for P frames.
• As B frames may include two types of motion-compensated prediction
(backward and forward), we calculate combined motion vectors for
B frames from both prediction directions.
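As a sketch of this step, the helper below forms a single motion feature for a B frame by averaging the two prediction directions, with the backward vector negated because it points toward a future reference. The averaging rule and the sign convention are assumptions for illustration; the exact combination used in our model may differ.

```python
import numpy as np

def b_frame_motion(mv_forward, mv_backward):
    """Combine a B frame's two prediction directions into one vector.

    Illustrative assumption: the backward vector points toward a
    future reference, so it is negated before averaging with the
    forward vector.
    """
    return 0.5 * (np.asarray(mv_forward, dtype=float)
                  - np.asarray(mv_backward, dtype=float))
```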
Anisotropic Partial Differential Equation based Video Saliency
Detection
Vartika Sharma, Vembarasan Vaitheeswaran, Chee Seng Chan
Result of our Video Saliency Detection model on KTH Action
Dataset
Results on KTH Action Datasetϯ
Number of action classes = 6
{boxing, hand clapping, hand waving, jogging, running, walking}
Boxing Hand Clapping Hand Waving Jogging Running Walking
Original Action Videos*
Final Saliency Maps
* For convenience, I have chosen only 16 frames per video
Ϯ "Recognizing Human Actions: A Local SVM Approach", Christian Schuldt, Ivan Laptev and Barbara Caputo; in Proc. ICPR'04, Cambridge, UK.
Layout of coefficients in an 8×8 DCT block (DC at top-left):
DC AC01 AC02 AC03 AC04 AC05 AC06 AC07
AC10 AC11 AC12 AC13 AC14 AC15 AC16 AC17
AC20 AC21 AC22 AC23 AC24 AC25 AC26 AC27
AC30 AC31 AC32 AC33 AC34 AC35 AC36 AC37
AC40 AC41 AC42 AC43 AC44 AC45 AC46 AC47
AC50 AC51 AC52 AC53 AC54 AC55 AC56 AC57
AC60 AC61 AC62 AC63 AC64 AC65 AC66 AC67
AC70 AC71 AC72 AC73 AC74 AC75 AC76 AC77
• We performed salient region segmentation using the MCMC segmentation method
proposed by Barbu et al. [Barbu2012] for crowd counting. The main purpose of
our experiment is to estimate the crowd in a particular video frame and to
calculate the rate at which the crowd count changes across consecutive frames.
Although CCTV cameras are now very common for video surveillance, very few
algorithms are available for real-time automated crowd counting. It is important
to note that our focus is on the rate of change of the crowd count rather than
the actual crowd count of every frame. A sudden increase or decrease in the
crowd count can act as a warning sign of an unusual activity such as an
explosion, a fight, or some other emergency. For our experiment, we calculate
the standard deviation of the crowd count over consecutive video frames for
every 10 seconds as a risk indicator. We train our algorithm on 2000 video
frames from the Mall Dataset [Loy2013] to set the threshold on the standard
deviation below which the rate of change of the crowd count is 'safe'. We
further test our algorithm on a few videos from the Pedestrian Traffic Database.
The figure shows our result on the Mall Dataset for crowd counting.
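A minimal sketch of this risk indicator, assuming per-frame crowd counts are already available: it computes the standard deviation of the count in each 10-second window and flags windows above a trained threshold. The frame rate, window length, and threshold value below are illustrative, not the values learned from the Mall Dataset.

```python
import statistics

def risky_windows(counts, fps=25, window_sec=10, threshold=5.0):
    """Flag windows whose crowd-count standard deviation exceeds
    a threshold learned from training footage.

    fps, window_sec, and threshold are illustrative assumptions.
    """
    win = fps * window_sec  # frames per window
    flags = []
    for start in range(0, len(counts) - win + 1, win):
        sd = statistics.pstdev(counts[start:start + win])
        flags.append(sd > threshold)
    return flags
```

A stable count produces no flags, while a sudden jump within a window raises one.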
• Figure: Crowd counting result on frames of the Mall Dataset. (a) the original video frames, (b) our saliency detection results, and (c) the segmentation based on the MCMC method.
Comparison of final saliency maps: CIOFM, MRS, SURPRISE, CA, and Our Model.
Thank You
