1
Motion Analysis
Mike Knowles
mjk802@bham.ac.uk
January 2006
2
Introduction
• So far you have seen techniques for
analysing static images – 2 Dimensional
information
• Now we shall consider time-varying
images – video.
• Variations in a scene through time are
caused by motion
3
Motion
• Motion analysis allows us to extract much
useful information from a scene:
– Object locations and tracks
– Camera Motion
– 3D Geometry of the scene
4
Contents
• Perspective and motion geometry
• Optical flow
• Estimation of flow
• Feature point detection, matching and
tracking
• Techniques for tracking moving objects
• Structure from motion
5
Perspective Geometry
• An image can be
modelled as a
projection of the scene
at a distance of f from
the optical centre O
• Note convention:
capital letters denote
scene properties,
lowercase for image
properties
6
Perspective Geometry
• The position of our point in the image is
related to the position in 3D space by the
perspective projection:
x / X = y / Y = f / Z,  i.e.  x = f X / Z,  y = f Y / Z
7
Motion Geometry
• If the point is moving in
space then it will also
move in the image
• Thus we have a set of vectors v(x,y)
describing the motion present in the
image at a given position – this is the
optical flow
8
Optical Flow
• An optical flow
field is simply a
set of vectors
describing the
image motion at
any point in the
image.
9
Estimating Optical Flow
• In order to estimate optical flow we need
to study adjacent frame pairs
• There are 2 approaches we can take to
this:
– Greylevel gradient based methods
– ‘Interesting’ feature matching
10
Greylevel conservation
• If we have a perfect optical flow field:
f(x, y, t) = f(x + v_x dt, y + v_y dt, t + dt)
11
Greylevel conservation
• Generally we measure time in frames so
dt = 1
• This leaves us with:

f(x, y, t) = f(x + v_x, y + v_y, t + 1)
12
Greylevel Conservation
• Taking a Taylor expansion and eliminating
the higher order terms:
f(x + v_x, y + v_y, t + 1) = f(x, y, t) + ∂f(x,y,t)/∂x · v_x + ∂f(x,y,t)/∂y · v_y + ∂f(x,y,t)/∂t
13
Greylevel Conservation
• Tidying up we are left with:

∂f(x,y,t)/∂x · v_x + ∂f(x,y,t)/∂y · v_y + ∂f(x,y,t)/∂t = 0

• This is the standard form of the greylevel
constraint equation
• But…
14
Limitations of the greylevel
constraint equation
• The greylevel constraint equation only allows us
to measure the flow in the direction of the
greylevel image gradient
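The constraint equation above can be checked numerically. Below is a minimal numpy sketch; the linear-ramp image and the known flow (v_x, v_y) are illustrative assumptions chosen so the gradients are exact:

```python
import numpy as np

# Verify the greylevel constraint f_x*v_x + f_y*v_y + f_t = 0 on a
# synthetic ramp image translated by a known flow (v_x, v_y).
def constraint_residual(frame0, frame1, vx, vy):
    # np.gradient returns (d/drow, d/dcol) = (f_y, f_x).
    fy, fx = np.gradient(frame0)
    ft = frame1 - frame0          # forward temporal difference (dt = 1 frame)
    return fx * vx + fy * vy + ft

h, w = 32, 32
yy, xx = np.mgrid[0:h, 0:w].astype(float)
frame0 = 0.5 * xx + 0.25 * yy                 # linear ramp: exact gradients
vx, vy = 2.0, 1.0                             # true motion in pixels/frame
frame1 = 0.5 * (xx - vx) + 0.25 * (yy - vy)   # scene shifted by (vx, vy)

residual = constraint_residual(frame0, frame1, vx, vy)
max_err = np.abs(residual[2:-2, 2:-2]).max()  # interior residual vanishes
```

On a linear ramp the Taylor expansion is exact, so the residual is zero; on real imagery the higher-order terms the derivation dropped would leave a small residual.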
15
The aperture problem
• Consider a plot in (vx ,
vy) space, at a single
point in space - the
greylevel conservation
equation gives a line
on which the true flow
lies
16
The aperture problem
17
The aperture problem
• Thus we cannot generate a flow vector for
a single point – we have to use a window
• The larger the window is the better the
chance of overcoming this problem
• But the larger the window the greater the
chance of the motion being different
• This is called the aperture problem
18
Overcoming the aperture problem
• Several solutions have been proposed:
– Assume v(x,y) is smooth (Horn and Schunck’s
algorithm)
– Assume v(x,y) is locally piecewise linear or
constant (Lucas and Kanade’s algorithm)
– Assume v(x,y) obeys some simple model
(Black and Anandan’s algorithm)
• We shall consider the latter two solutions
19
Assuming a locally constant field
• This algorithm assumes that the flow field
around some point is constant
20
The model
• Model:

f(x, y, t) = f(x + v_x, y + v_y, t + 1) + n(x, y, t)

• This model is valid for points in some m-point
neighbourhood where the optical flow is
assumed constant.
21
Noise
• n(x,y,t) is noise corrupting the true
greylevel values and is assumed zero-
mean and uncorrelated with variance:
E( n(x, y, t) ) = 0
E( n(x, y, t) · n(x′, y′, t) ) = σ_n² δ_xx′ δ_yy′
22
• We can linearise our model:

f(x, y, t) = f(x + v_x, y + v_y, t + 1) + n(x, y, t)
f(x, y, t) = f(x, y, t + 1) + ∇f(x, y, t + 1)^T v + n(x, y, t)
Δf(x, y, t) = ∇f(x, y, t + 1)^T v + n(x, y, t)

• Where:

Δf(x, y, t) = f(x, y, t) − f(x, y, t + 1)
23
• For each point we have an equation:

Δf(x_i, y_i, t) = ∂f(x_i, y_i, t+1)/∂x · v_x + ∂f(x_i, y_i, t+1)/∂y · v_y + n(x_i, y_i, t)

• We can write this in matrix form:

h = A v + n,  v = (v_x, v_y)^T,  n = ( n(x_1, y_1, t), n(x_2, y_2, t), …, n(x_m, y_m, t) )^T
24
• Matrix A and vector v are:
A
h






  

  



































f x y t
x
f x y t
y
f x y t
x
f x y t
y
f x y t
x
f x y t
y
f x y t
f x y t
f
m m m m
( , , ) ( , , )
( , , ) ( , , )
. .
. .
( , , ) ( , , )
( , , )
( , , )
.
.
(
1 1 1 1
2 2 2 2
1 1
2 2
1 1
1 1
1 1


 x y t
m m
, , )
















25
• We can solve for v̂ using a least squares
technique:

v̂ = arg min_v ‖A v − h‖² = arg min_v (A v − h)^T (A v − h)

• The result is:

v̂ = (A^T A)^{−1} A^T h
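This least-squares estimate can be sketched in a few lines of numpy. As an idealisation, the second frame below is synthesised to satisfy the linearised model exactly, so the window estimate recovers the true flow; real frames would add higher-order and noise error:

```python
import numpy as np

# Local least-squares flow estimate v_hat = (A^T A)^{-1} A^T h over a
# small window, assuming constant flow inside the window.
def lk_flow(frame0, frame1, cy, cx, r=3):
    fy, fx = np.gradient(frame0)          # spatial gradients
    ft = frame1 - frame0                  # temporal difference
    win = (slice(cy - r, cy + r + 1), slice(cx - r, cx + r + 1))
    A = np.stack([fx[win].ravel(), fy[win].ravel()], axis=1)  # m x 2
    h = -ft[win].ravel()                  # h collects the -f_t terms
    v, *_ = np.linalg.lstsq(A, h, rcond=None)  # = (A^T A)^-1 A^T h
    return v                              # (v_x, v_y)

yy, xx = np.mgrid[0:48, 0:48].astype(float)
frame0 = np.sin(0.3 * xx) + np.cos(0.4 * yy)  # texture keeps A^T A invertible
vx_true, vy_true = 1.5, -0.5
gy, gx = np.gradient(frame0)
frame1 = frame0 - (gx * vx_true + gy * vy_true)  # exact linearised motion

v_hat = lk_flow(frame0, frame1, 24, 24)
```

The textured pattern matters: on a region with gradient in only one direction, A^T A is singular and the window suffers exactly the aperture problem described earlier.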
26
• We are also interested in the quality of the
estimate as measured by the covariance
matrix of v̂:

C_v = E[ (v̂ − E(v̂)) (v̂ − E(v̂))^T ]

• It can be shown that:

C_v = σ_n² (A^T A)^{−1}

• Thus we can determine the variances of
the estimates of the components vx and vy
27
• We can use the
covariance matrix to
determine a
confidence ellipse at
a certain probability
(e.g. 99%) that the
flow lies in that ellipse
28
• It can be seen from the expression for the
variance estimates that the accuracy of
the algorithm depends on:
– Noise variance
– Size of the neighbourhood
– Edge busyness
29
Modelling Flow
• An alternative to assuming constant flow is
to use a model of the flow field
• One such model is the affine model:

u(x, y) = a_0 + a_1 x + a_2 y
v(x, y) = a_3 + a_4 x + a_5 y
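The affine model (u = a0 + a1·x + a2·y, v = a3 + a4·x + a5·y) is linear in its six parameters, so it can be evaluated and fitted with ordinary least squares. The parameter values below are illustrative:

```python
import numpy as np

# Evaluate an affine flow field over a grid, then recover the six
# parameters back by linear least squares (each pixel gives two
# equations that are linear in the parameters).
def affine_flow(params, x, y):
    a0, a1, a2, a3, a4, a5 = params
    return a0 + a1 * x + a2 * y, a3 + a4 * x + a5 * y

yy, xx = np.mgrid[0:16, 0:16].astype(float)
true_params = np.array([0.5, 0.01, -0.02, -0.3, 0.00, 0.03])
u, v = affine_flow(true_params, xx, yy)

ones = np.ones_like(xx).ravel()
X = np.stack([ones, xx.ravel(), yy.ravel()], axis=1)  # design matrix [1, x, y]
fit_u, *_ = np.linalg.lstsq(X, u.ravel(), rcond=None)
fit_v, *_ = np.linalg.lstsq(X, v.ravel(), rcond=None)
recovered = np.concatenate([fit_u, fit_v])
```

In practice the flow field itself is unknown, so the parameters are fitted to the greylevel constraint rather than to known vectors, as the following slides describe.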
30
Estimating motion models
• Black and Anandan propose an algorithm
for estimating the parameters of the
model
• This uses robust estimation to separate
different classes of motion
31
Minimisation of Error Function
• Once again, if we are to find the optimum
parameters we need an error function to
minimise:

E(x, y) = f(x, y, t) − f(x + u(x, y), y + v(x, y), t + 1)

• But this is not in a form that is easy to
minimise…
32
Gradient-based Formulation
• Applying Taylor expansion to the error function:
• This is the greylevel constraint equation again
E(x, y) = ∂I/∂x · u + ∂I/∂y · v + ∂I/∂t = ∇I^T (u, v)^T + ∂I/∂t
33
Gradient-descent Minimisation
• If we know how the error changes with respect
to the parameters, we can home in on the
minimum error
34
Applying Gradient Descent
• We need:
• Using the chain rule:
• We need:

∂E/∂a_n

• Using the chain rule:

∂E/∂a_n = ∂E/∂u · ∂u/∂a_n
35
Robust Estimation
• What about points that do not belong to
the motion we are estimating?
• These will pull the solution away from the
true one
36
Robust Estimators
• Robust estimators decrease the effect of
outliers on estimation
37
Error w.r.t. parameters
• The complete function is:

∂E_R/∂a_n = ∂E_R/∂E · ∂E/∂u · ∂u/∂a_n
38
Aside – Influence Function
• It can be seen that the first derivative of the
robust estimator is used in the minimisation:
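A sketch of one common choice, the Geman-McClure function used by Black and Anandan, and its influence function (the scale sigma is illustrative):

```python
import numpy as np

# Geman-McClure robust estimator rho(e) = e^2 / (sigma^2 + e^2).
# Its derivative (the influence function) decays for large errors,
# so outliers stop pulling on the solution; a plain quadratic's
# influence grows without bound.
def rho(e, sigma=1.0):
    return e**2 / (sigma**2 + e**2)

def influence(e, sigma=1.0):
    # d rho / d e = 2 e sigma^2 / (sigma^2 + e^2)^2
    return 2 * e * sigma**2 / (sigma**2 + e**2) ** 2

e = np.linspace(-10, 10, 2001)
psi = influence(e)
peak_at = e[np.argmax(psi)]   # analytically at sigma / sqrt(3)
```

Note how influence(10) is far smaller than influence(1): a gross outlier contributes almost nothing to the gradient of the error function.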
39
Pyramid Approach
• Trying to estimate the parameters from
scratch at full scale can be wasteful
• Therefore a ‘pyramid of resolutions’ or
‘Gaussian pyramid’ is used
• The principle is to estimate the parameters
on a smaller scale and refine until full
scale is reached
40
Pyramid of Resolutions
• Each level in the pyramid is half the scale of
the one below – i.e. a quarter of the area
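A minimal pyramid-construction sketch; the 1-2-1 binomial blur is an illustrative choice of smoothing kernel:

```python
import numpy as np

# Gaussian-pyramid sketch: each level blurs then subsamples by 2, so each
# level has half the width/height (a quarter of the area) of the one below.
def blur(img):
    # Separable 1-2-1 binomial smoothing with edge replication.
    p = np.pad(img, 1, mode="edge")
    img = (p[:-2, 1:-1] + 2 * p[1:-1, 1:-1] + p[2:, 1:-1]) / 4.0
    p = np.pad(img, 1, mode="edge")
    return (p[1:-1, :-2] + 2 * p[1:-1, 1:-1] + p[1:-1, 2:]) / 4.0

def pyramid(img, levels):
    levs = [img]
    for _ in range(levels - 1):
        levs.append(blur(levs[-1])[::2, ::2])  # smooth, then take every 2nd pixel
    return levs

img = np.random.default_rng(0).random((64, 64))
levs = pyramid(img, 4)
shapes = [l.shape for l in levs]
```

Motion parameters estimated at the coarsest level are used to warp and initialise the next level down, so large motions become small ones at each refinement step.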
41
• Out pops the solution….
– When combined with a suitable gradient
based minimisation scheme…
• Black and Anandan suggest the use of
Graduated Non-convexity
42
Feature Matching
• Feature point matching offers an
alternative to gradient based techniques
for finding optical flow
• The principle is to extract the locations of
particular features from the frame and
track their position in subsequent frames
43
Feature point selection
• Feature points must be :
• Local (extended line segments are no good,
we require local disparity)
• Distinct (a lot ‘different’ from neighbouring
points)
• Invariant (to rotation, scale, illumination)
• The matching process must be :
• Local (thus limiting the search area)
• Consistent (leading to ‘smooth’ disparity
estimates)
44
Approaches to Feature point
selection
• Previous approaches to feature point
selection have been
– Moravec interest operator, this is based on
thresholding local greylevel squared
differences
– Symmetric features e.g. circular features,
spirals
– Line segment endpoints
– Corner points
45
Example of feature point detection and matching
46
Motion and 3D Structure From
Optical Flow
• This area of computer vision attempts to
reconstruct the structure of the 3D
environment and the motion of objects
within it using optical flow
• Applications are many, the dominant one
is autonomous navigation
47
• As we saw
previously, the
relationship
between image
plane motion
and the 3D
motion that it
describes is
summed up by
the perspective
projection
48
• The perspective projection is described as:

(x, y)^T = ( f / Z(x, y) ) · (X, Y)^T

• We can differentiate this w.r.t. time:

d/dt (x, y)^T = (v_x, v_y)^T = (f / Z²) · (Z V_X − X V_Z, Z V_Y − Y V_Z)^T
              = ( f V_X / Z − f X V_Z / Z², f V_Y / Z − f Y V_Z / Z² )^T
49
• Substituting in the original perspective
projection equation:

(v_x, v_y)^T = (1/Z) · ( f V_X − x V_Z, f V_Y − y V_Z )^T
             = (1/Z) · [ f 0 −x ; 0 f −y ] · (V_X, V_Y, V_Z)^T

• We can invert this by solving for (V_X, V_Y, V_Z)
50
• This gives us two components – one parallel to
the image plane and one along our line of sight
51
Focus of Expansion
• From the expression for optical flow we can
determine a simple structure for the flow vectors
in an image corresponding to a rigid body
translation:
(v_x, v_y)^T = (1/Z) · ( f V_X − x V_Z, f V_Y − y V_Z )^T
             = (V_Z / Z) · ( f V_X / V_Z − x, f V_Y / V_Z − y )^T
52
• (x_0, y_0) = ( f V_X / V_Z, f V_Y / V_Z ) is called the Focus of
Expansion (FOE)
• For V_Z towards the camera (negative) the
flow vectors point away from the FOE
(expansion) and for V_Z away from the
camera (positive) the flow vectors point
towards the FOE (contraction).
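The radial structure around the FOE can be verified numerically. With the flow written as v(x, y) = (V_Z/Z)·(x0 − x, y0 − y), every vector is a multiple of the displacement from the FOE; the focal length, depth and translation below are toy values:

```python
import numpy as np

# For rigid translation V = (VX, VY, VZ), the flow at image point (x, y) is
#   v(x, y) = (VZ / Z) * (x0 - x, y0 - y),  x0 = f*VX/VZ,  y0 = f*VY/VZ.
f, Z = 1.0, 10.0
VX, VY, VZ = 0.2, -0.1, -0.5           # VZ < 0: motion towards the camera
x0, y0 = f * VX / VZ, f * VY / VZ      # focus of expansion

def flow(x, y):
    return (VZ / Z) * (x0 - x), (VZ / Z) * (y0 - y)

# At a sample point the flow is a positive multiple of (x - x0, y - y0),
# i.e. it points away from the FOE: expansion.
x, y = 0.7, 0.3
vx, vy = flow(x, y)
scale = vx / (x - x0)                  # same positive scale on both axes
```

Flipping the sign of VZ flips the sign of the scale factor, turning the expansion into the contraction described above.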
53
• The FOE provides important 3D
information:

(x_0, y_0, f) = ( f V_X / V_Z, f V_Y / V_Z, f ) = (f / V_Z) · (V_X, V_Y, V_Z)

• Thus the direction of translational motion
can be determined: V / |V|
54
• We can also estimate the time to impact,
τ = Z / V_Z, from flow measurements close to the FOE
• Both the FOE and time to impact can be
estimated using least squares on the
optical flow field at a number of image
points
55
Structure from Motion
• Here we shall discuss a simple structure
from motion algorithm, which uses optical
flow to estimate the 3D structure of the
scene.
• We shall be looking at a simplified
situation where the camera is assumed to
be fixed – i.e. no pan or tilt
56
• The starting point is the optical flow
equation:

(v_x, v_y)^T = (1/Z) · [ f 0 −x ; 0 f −y ] · (V_X, V_Y, V_Z)^T

• Rearranging, (f/Z)·V is the vector sum of
(v_x, v_y, 0)^T and (V_Z/Z)·(x, y, f)^T, so the vector
product of these 2 vectors is orthogonal to
V = (V_X, V_Y, V_Z)^T
57
58
• But:

(v_x, v_y, 0)^T × (x, y, f)^T = ( v_y f, −v_x f, v_x y − v_y x )^T

• So:

( v_y f, −v_x f, v_x y − v_y x ) · (V_X, V_Y, V_Z)^T = 0
59
• This equation applies to all points (x_i, y_i), i = 1…N
• Obviously a trivial solution to this equation
would be V = 0
• Also if some non-zero vector V is a
solution then so is the vector cV for any
scalar constant c.
• This confirms that we cannot determine
the absolute magnitude of the velocity
vector, we can only determine it to a
multiplicative scale constant.
60
• The solution to this is to solve the equation
using a least squares formulation subject
to the condition that:

V^T V = k²
61
• We can re-write the orthogonality constraint
for all points in matrix form:

A · (V_X, V_Y, V_Z)^T = 0

• Where:

A = [ v_y1 f   −v_x1 f   v_x1 y_1 − v_y1 x_1
      ⋮         ⋮         ⋮
      v_yN f   −v_xN f   v_xN y_N − v_yN x_N ]
62
• Thus the problem is:

min_V (AV)^T (AV) subject to V^T V = k²

• This is a classic optimisation problem; the
solution is that the optimal value V̂ is
given by the eigenvector of A^T A that is
produced by the minimum eigenvalue.
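A toy numerical check of this eigenvector solution: each row of A is (v_y·f, −v_x·f, v_x·y − v_y·x), built from flow synthesised with an assumed translation and random depths, and the smallest-eigenvalue eigenvector of AᵀA recovers the translation direction:

```python
import numpy as np

# Recover the translation direction (up to scale) as the eigenvector of
# A^T A with the smallest eigenvalue.
rng = np.random.default_rng(1)
f = 1.0
V_true = np.array([0.3, -0.2, 0.5])           # translation, scale arbitrary

rows = []
for _ in range(50):
    x, y = rng.uniform(-1, 1, 2)
    Z = rng.uniform(2, 10)                    # random scene depth
    vx = (f * V_true[0] - x * V_true[2]) / Z  # optical flow equation
    vy = (f * V_true[1] - y * V_true[2]) / Z
    rows.append([vy * f, -vx * f, vx * y - vy * x])
A = np.array(rows)

w, U = np.linalg.eigh(A.T @ A)                # eigenvalues in ascending order
V_est = U[:, 0]                               # eigenvector of least eigenvalue
V_est *= np.sign(V_est @ V_true)              # fix the sign ambiguity
direction_error = np.linalg.norm(V_est - V_true / np.linalg.norm(V_true))
```

With noise-free flow the smallest eigenvalue is zero and the direction is exact; noisy flow would leave a small but non-zero minimum eigenvalue.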
63
• Once we have our estimate V̂ we can
compute our scene depths using the
original optical flow equation:

(v_x, v_y)^T = (1/Z) · [ f 0 −x ; 0 f −y ] · (V_X, V_Y, V_Z)^T

• i.e.:

Z v_x = f V_X − x V_Z
Z v_y = f V_Y − y V_Z
64
• We can estimate each depth using a least
squares formulation:

min_{Z_i} ( Z_i v_xi − (f V_X − x_i V_Z) )² + ( Z_i v_yi − (f V_Y − y_i V_Z) )²

• The solution of which is:

Z_i = ( v_xi (f V_X − x_i V_Z) + v_yi (f V_Y − y_i V_Z) ) / ( v_xi² + v_yi² )

• The scene co-ordinates can be found
using perspective projection
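The closed-form depth estimate can be sanity-checked on a synthetic flow vector; the focal length, translation and depth below are toy values:

```python
import numpy as np

# Given the (scaled) translation V and the flow at (x, y), recover depth with
#   Z = (v_x*(f*VX - x*VZ) + v_y*(f*VY - y*VZ)) / (v_x^2 + v_y^2).
f = 1.0
VX, VY, VZ = 0.3, -0.2, 0.5

def flow_at(x, y, Z):
    # Optical flow equation for pure translation.
    return (f * VX - x * VZ) / Z, (f * VY - y * VZ) / Z

def depth_from_flow(x, y, vx, vy):
    num = vx * (f * VX - x * VZ) + vy * (f * VY - y * VZ)
    return num / (vx**2 + vy**2)

x, y, Z_true = 0.4, -0.6, 7.0
vx, vy = flow_at(x, y, Z_true)
Z_est = depth_from_flow(x, y, vx, vy)
```

Because V itself is only known up to scale, the recovered depths inherit the same multiplicative ambiguity: relative depth, not absolute.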
65
Summary
• 3D Geometry
• Optical Flow
• Flow estimation and the aperture problem
• Focus of Expansion
• Structure from Motion
66
Recap
• Geometry
• Flow
• Flow estimation
• Feature trackers
• FOE and structure from motion
67
Tracking
• Goal – to detect and track objects moving
independently of the background
• Two situations to be considered:
– Static Background
– Moving Background
68
Applications of Motion Tracking
• Control Applications
– Object Avoidance
– Automatic Guidance
– Head Tracking for Video Conferencing
• Surveillance/Monitoring Applications
– Security Cameras
– Traffic Monitoring
– People Counting
69
My Work
• Started by tracking moving objects in a
static scene
• Develop a statistical model of the
background
• Mark all regions that do not conform to the
model as moving objects
70
My Work
• Now working on object detection and
classification from a moving camera
• Current focus is motion compensated
background filtering
• Determine motion of background and
apply to the model.
71
Detecting moving objects in a static
scene
• Simplest method:
– Subtract consecutive frames.
– Ideally this will leave only moving objects.
– This is not an ideal world….
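The frame-differencing idea fits in a few lines; the threshold value is illustrative:

```python
import numpy as np

# Simplest moving-object detector: threshold the absolute difference of
# consecutive frames. Only pixels whose greylevel changed are flagged, so
# untextured object interiors are missed (hence the background model later).
def frame_difference_mask(prev, curr, thresh=10):
    diff = np.abs(curr.astype(int) - prev.astype(int))
    return (diff > thresh).astype(np.uint8)

prev = np.full((20, 20), 100, dtype=np.uint8)  # flat background
curr = prev.copy()
curr[5:10, 5:10] = 180                         # a moving object appears here
mask = frame_difference_mask(prev, curr)
n_moving = int(mask.sum())
```

On this synthetic pair the mask is perfect; on real footage noise, shadows and object overlap between frames make the result far less clean.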
72
Using a background model
• Lack of texture in objects means incomplete
object masks are produced.
• In order to obtain complete object masks
we must have a model of the background
as a whole.
73
Adapting to variable backgrounds
• In order to cope with varying backgrounds
it is necessary to make the model dynamic
• A statistical system is used to update the
model over time
74
Background Filtering
• My algorithm based on:
“Learning Patterns of Activity using Real-Time Tracking” C.
Stauffer and W.E.L. Grimson. IEEE Trans. On Pattern
Analysis and Machine Intelligence. August 2000
• The history of each pixel is modelled by a
sequence of Gaussian distributions
75
Multi-dimensional Gaussian
Distributions
• Described mathematically as:

η(X_t, μ_t, Σ_t) = (2π)^{−n/2} |Σ_t|^{−1/2} · exp( −½ (X_t − μ_t)^T Σ_t^{−1} (X_t − μ_t) )

• More easily visualised as:
(2-Dimensional)
76
Simplifying….
• Calculating the full Gaussian for every
pixel in the frame is very, very slow
• Therefore I use a linear approximation
77
How do we use this to represent a
pixel?
• Stauffer and Grimson suggest using a
static number of Gaussians for each pixel
• This was found to be inefficient – so the
number of Gaussians used to represent
each pixel is variable
78
Weights
• Each Gaussian carries a weight value
• This weight is a measure of how well the
Gaussian represents the history of the pixel
• If a pixel is found to match a Gaussian then the
weight is increased and vice-versa
• If the weight drops below a threshold then that
Gaussian is eliminated
79
Matching
• Each incoming pixel value must be
checked against all the Gaussians at that
location
• If a match is found then the value of that
Gaussian is updated
• If there is no match then a new Gaussian
is created with a low weight
80
Updating
• If a Gaussian matches a pixel, then the
value of that Gaussian is updated using
the current value
• The rate of learning is greater in the early
stages when the model is being formed
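The matching and updating cycle for a single pixel might be sketched as follows. The learning rate, match threshold and pruning constants are illustrative choices, not the values from Stauffer and Grimson's paper, and the example uses scalar greylevels rather than colour:

```python
import numpy as np

ALPHA = 0.05          # learning rate (illustrative)
MATCH_SIGMAS = 2.5    # a value within 2.5 sigma of a mode counts as a match

def update_pixel(modes, value):
    """modes: list of [mean, var, weight]; returns True if value matched."""
    matched = False
    for m in modes:
        mean, var, w = m
        if abs(value - mean) <= MATCH_SIGMAS * np.sqrt(var):
            m[0] = (1 - ALPHA) * mean + ALPHA * value       # pull mean in
            m[1] = (1 - ALPHA) * var + ALPHA * (value - mean) ** 2
            m[2] = (1 - ALPHA) * w + ALPHA                  # raise weight
            matched = True
            break
    if not matched:
        modes.append([value, 100.0, 0.05])  # new low-weight, high-variance mode
    # Renormalising lowers unmatched weights; negligible modes are dropped.
    total = sum(m[2] for m in modes)
    for m in modes:
        m[2] /= total
    modes[:] = [m for m in modes if m[2] > 0.01]
    return matched

modes = [[100.0, 25.0, 1.0]]          # background settled around greylevel 100
matched_bg = update_pixel(modes, 101.0)   # matches the background mode
n_after_match = len(modes)
matched_fg = update_pixel(modes, 200.0)   # foreground value: spawns a new mode
n_after_outlier = len(modes)
```

A pixel whose value keeps matching a high-weight mode is classified as background; values that only match low-weight modes are candidate foreground.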
81
Static Scene Object Detection and
Tracking
• Model the background and subtract to
obtain object mask
• Filter to remove noise
• Group adjacent pixels to obtain objects
• Track objects between frames to develop
trajectories
82
Moving Camera Sequences
• Basic Idea is the same as before
– Detect and track objects moving within a
scene
• BUT – this time the camera is not
stationary, so everything is moving
83
Motion Segmentation
• Use a motion estimation algorithm on the
whole frame
• Iteratively apply the same algorithm to
areas that do not conform to this motion to
find all motions present
• Problem – this is very, very slow
84
Motion Compensated Background
Filtering
• Basic Principle
– Develop and maintain background model as
previously
– Determine global motion and use this to
update the model between frames
85
Advantages
• Only one motion model has to be found
– This is therefore much faster
• Estimating motion for small regions can be
unreliable
• Not as easy as it sounds though…..
86
Motion Models
• Trying to determine the exact optical flow
at every point in the frame would be
ridiculously slow
• Therefore we try to fit a parametric model
to the motion
87
Affine Motion Model

u(x, y) = a_0 + a_1 x + a_2 y
v(x, y) = a_3 + a_4 x + a_5 y
• The affine model describes the vector at each
point in the image
• Need to find values for the parameters that best
fit the motion present
88
Background Motion Estimation
• Uses the framework described earlier by
Black and Anandan
89
Examples
90
Other approaches to Tracking
• Many approaches using active contours –
a.k.a. snakes
– Parameterised curves
– Fitted to the image by minimising some cost
function – often based on fitting the contour to
edges
91
Constraining shape
• To avoid the snake being influenced by
points we aren’t interested in, use a model
to constrain its shape.
92
CONDENSATION
• No discussion on tracking can omit the
CONDENSATION algorithm developed by
Isard and Blake.
• CONditional DENSity propagATION
• Non-Gaussian substitute for the Kalman
filter
• Uses factored sampling to model non-
Gaussian probability densities and
propagate them through time.
93
CONDENSATION
• Thus we can take a set of parameters and
estimate them from frame to frame, using
current information from the frames
• These parameters may be positions or
shape parameters from a snake.
94
CONDENSATION - Algorithm
• Randomly take samples from the previous
distribution.
• Apply a deterministic drift and random diffusion,
based on a model of how the parameters
behave, to the samples.
• Weight each sample on the basis of the current
information.
• Estimate of actual value can be either a
weighted average or a peak value from the
distribution
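These steps can be sketched for a one-dimensional state. The Gaussian measurement likelihood and constant-velocity drift below are illustrative modelling choices, not part of Isard and Blake's formulation:

```python
import numpy as np

rng = np.random.default_rng(42)

def condensation_step(samples, weights, drift, diffusion, measurement, meas_sigma):
    # 1. Factored sampling: pick old samples in proportion to their weights.
    idx = rng.choice(len(samples), size=len(samples), p=weights)
    samples = samples[idx]
    # 2. Predict: deterministic drift plus random diffusion.
    samples = samples + drift + rng.normal(0.0, diffusion, size=len(samples))
    # 3. Measure: weight each sample by the current observation likelihood.
    w = np.exp(-0.5 * ((samples - measurement) / meas_sigma) ** 2)
    return samples, w / w.sum()

n = 500
samples = rng.normal(0.0, 5.0, n)        # broad prior over position
weights = np.full(n, 1.0 / n)
true_pos = 0.0
for _ in range(20):                      # object moves right 1 unit per frame
    true_pos += 1.0
    measurement = true_pos + rng.normal(0.0, 0.5)
    samples, weights = condensation_step(samples, weights, 1.0, 0.3, measurement, 0.5)

estimate = float(np.sum(weights * samples))  # weighted-mean state estimate
```

Because the posterior is carried as a weighted sample set rather than a mean and covariance, multi-modal densities (e.g. two competing hypotheses about where the object is) survive from frame to frame, which a Kalman filter cannot represent.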
95
Tracking Summary
• Static-scene background subtraction
methods
• Extensions to moving camera systems
• Use of model-constrained active contour
systems
• CONDENSATION

More Related Content

PDF
Lecture 12 andrew fitzgibbon - 3 d vision in a changing world
PPTX
Dense Motion Estimation for Computer Vision
PDF
UNIT-4.pdf image processing btech aktu notes
PPT
CS 223-B L9 Odddddddddddddddddddpticadddl Flodzdzdsw.ppt
PDF
Data-Driven Motion Estimation With Spatial Adaptation
PDF
CV_Chap 6 Motion Representation
PPTX
Low level feature extraction - chapter 4
PDF
Image segmentation
Lecture 12 andrew fitzgibbon - 3 d vision in a changing world
Dense Motion Estimation for Computer Vision
UNIT-4.pdf image processing btech aktu notes
CS 223-B L9 Odddddddddddddddddddpticadddl Flodzdzdsw.ppt
Data-Driven Motion Estimation With Spatial Adaptation
CV_Chap 6 Motion Representation
Low level feature extraction - chapter 4
Image segmentation

Similar to Mk slides.ppt (20)

PDF
Digital_video_processing by M Tekalp.pdf
PDF
Lecture notes on planetary sciences and orbit determination
PDF
Morton john canty image analysis and pattern recognition for remote sensing...
PDF
International Journal of Engineering Research and Development (IJERD)
PPT
Chapter10_Segmentation.ppt
PDF
Determining Optical Flow
PDF
IRJET-Motion Segmentation
PDF
Introduction to Optial Flow
PDF
Calculus of variations & solution manual russak
PDF
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
PDF
Practical Digital Image Processing 5
PPT
Optimization
PPT
cohenmedioni.ppt
PPTX
Object tracking
PDF
DTAM: Dense Tracking and Mapping in Real-Time, Robot vision Group
PDF
PCSJ-IMPS2024招待講演「動作認識と動画像符号化」2024年度画像符号化シンポジウム(PCSJ 2024) 2024年度映像メディア処理シンポジ...
PDF
Efficient Model-based 3D Tracking by Using Direct Image Registration
PDF
MBIP-book.pdf
PDF
Edge detection
Digital_video_processing by M Tekalp.pdf
Lecture notes on planetary sciences and orbit determination
Morton john canty image analysis and pattern recognition for remote sensing...
International Journal of Engineering Research and Development (IJERD)
Chapter10_Segmentation.ppt
Determining Optical Flow
IRJET-Motion Segmentation
Introduction to Optial Flow
Calculus of variations & solution manual russak
IVR - Chapter 2 - Basics of filtering I: Spatial filters (25Mb)
Practical Digital Image Processing 5
Optimization
cohenmedioni.ppt
Object tracking
DTAM: Dense Tracking and Mapping in Real-Time, Robot vision Group
PCSJ-IMPS2024招待講演「動作認識と動画像符号化」2024年度画像符号化シンポジウム(PCSJ 2024) 2024年度映像メディア処理シンポジ...
Efficient Model-based 3D Tracking by Using Direct Image Registration
MBIP-book.pdf
Edge detection
Ad

More from Tabassum Saher (20)

PPTX
Convergence and it’s anomalies PRESENTATION
PPTX
Colour & its role in occupation (3b).pptx
PPTX
BINOCULAR BALANCING ( 7TH SEMESTER).pptx
PPTX
INFRA RED therapy in physiotherapy .pptx
PPTX
CARDIAC SURGERIES and its discription i detail
PPTX
stretching third sem.pptx EXERCISE THERAPY TECHNIQUES
PPTX
GATE ANALYSIS IN PHYSIOTHERAPY ASSESSMENT
PPTX
COUGH PPT mechanism and concept in physiotherapy..
PPT
INFRA RED THERAPY.ppt PHYSIOTHERAPY REHABILITATION
PPT
HEATING MODALITIES ELECTROTHERAPY.ppt
PPT
SUPERFICIAL HEATING MODALITIES B ELECTRO PT.ppt
PPTX
HYDROCOLLATOR PACKS THIRD SEMESTER .pptx
PPTX
HEMODYNAMIC MONITORING for cardiovascular cases
PPTX
usingcombinationtherapyunitsinphysicaltherapytreatment-180318150417 (1).pptx
PPTX
SLEEP , TYPES OF SLEEP AND ITS MANAGEMENT PPT..
PPTX
PULMONARY HYPERTENSION, COMPLICATION OF LUNG DISEASES
PPTX
cardiac assessment. basic assessment about the cardiac assessment
PPTX
HUMIDIFICATION AND NEBULISATION PPT.pptx
PPT
Biomech of swimming PRESENTATION OF BIOMECHANICS
PPT
Dose-Response of drugs towards the human body
Convergence and it’s anomalies PRESENTATION
Colour & its role in occupation (3b).pptx
BINOCULAR BALANCING ( 7TH SEMESTER).pptx
INFRA RED therapy in physiotherapy .pptx
CARDIAC SURGERIES and its discription i detail
stretching third sem.pptx EXERCISE THERAPY TECHNIQUES
GATE ANALYSIS IN PHYSIOTHERAPY ASSESSMENT
COUGH PPT mechanism and concept in physiotherapy..
INFRA RED THERAPY.ppt PHYSIOTHERAPY REHABILITATION
HEATING MODALITIES ELECTROTHERAPY.ppt
SUPERFICIAL HEATING MODALITIES B ELECTRO PT.ppt
HYDROCOLLATOR PACKS THIRD SEMESTER .pptx
HEMODYNAMIC MONITORING for cardiovascular cases
usingcombinationtherapyunitsinphysicaltherapytreatment-180318150417 (1).pptx
SLEEP , TYPES OF SLEEP AND ITS MANAGEMENT PPT..
PULMONARY HYPERTENSION, COMPLICATION OF LUNG DISEASES
cardiac assessment. basic assessment about the cardiac assessment
HUMIDIFICATION AND NEBULISATION PPT.pptx
Biomech of swimming PRESENTATION OF BIOMECHANICS
Dose-Response of drugs towards the human body
Ad

Recently uploaded (20)

PPTX
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
PDF
Pre independence Education in Inndia.pdf
PDF
Supply Chain Operations Speaking Notes -ICLT Program
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PDF
Sports Quiz easy sports quiz sports quiz
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PDF
Microbial disease of the cardiovascular and lymphatic systems
PDF
Abdominal Access Techniques with Prof. Dr. R K Mishra
PDF
Complications of Minimal Access Surgery at WLH
PDF
Classroom Observation Tools for Teachers
PDF
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
PDF
Basic Mud Logging Guide for educational purpose
PPTX
Cell Structure & Organelles in detailed.
PDF
01-Introduction-to-Information-Management.pdf
PDF
2.FourierTransform-ShortQuestionswithAnswers.pdf
PPTX
Institutional Correction lecture only . . .
PDF
Module 4: Burden of Disease Tutorial Slides S2 2025
PPTX
Pharma ospi slides which help in ospi learning
PDF
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
PDF
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf
Introduction_to_Human_Anatomy_and_Physiology_for_B.Pharm.pptx
Pre independence Education in Inndia.pdf
Supply Chain Operations Speaking Notes -ICLT Program
Final Presentation General Medicine 03-08-2024.pptx
Sports Quiz easy sports quiz sports quiz
FourierSeries-QuestionsWithAnswers(Part-A).pdf
Microbial disease of the cardiovascular and lymphatic systems
Abdominal Access Techniques with Prof. Dr. R K Mishra
Complications of Minimal Access Surgery at WLH
Classroom Observation Tools for Teachers
grade 11-chemistry_fetena_net_5883.pdf teacher guide for all student
Basic Mud Logging Guide for educational purpose
Cell Structure & Organelles in detailed.
01-Introduction-to-Information-Management.pdf
2.FourierTransform-ShortQuestionswithAnswers.pdf
Institutional Correction lecture only . . .
Module 4: Burden of Disease Tutorial Slides S2 2025
Pharma ospi slides which help in ospi learning
Black Hat USA 2025 - Micro ICS Summit - ICS/OT Threat Landscape
Saundersa Comprehensive Review for the NCLEX-RN Examination.pdf

Mk slides.ppt

  • 2. 2 Introduction • So far you have seen techniques for analysing static images – 2 Dimensional information • Now we shall consider time-varying images – video. • Variations in a scene through time are caused by motion
  • 3. 3 Motion • Motion analysis allows us to extract much useful information from a scene: – Object locations and tracks – Camera Motion – 3D Geometry of the scene
  • 4. 4 Contents • Perspective and motion geometry • Optical flow • Estimation of flow • Feature point detection, matching and tracking • Techniques for tracking moving objects • Structure from motion
  • 5. 5 Perspective Geometry • An image can be modelled as a projection of the scene at a distance of f from the optical centre O • Note convention: capital letters denote scene properties, lowercase for image properties
  • 6. 6 Perspective Geometry • The position of our point in the image is related to the position in 3D space by the perspective projection: x X y Y f Z  
  • 7. 7 Motion Geometry • If the point is moving in space then it will also move in the image • Thus we have a set if vectors v(x,y) describing the motion present in the image at a given position – This is the Optical flow
  • 8. 8 Optical Flow • An optical flow field is simply a set of vectors describing the image motion at any point in the image.
  • 9. 9 Estimating Optical Flow • In order to estimate optical flow we need to study adjacent frame pairs • There are 2 approaches we can take to this: – Greylevel gradient based methods – ‘Interesting’ feature matching
  • 10. 10 Greylevel conservation • If we have a perfect optical flow field: f x y t f x v dt y v dt t dt x y ( , , ) ( , , )    
  • 11. 11 Greylevel conservation • Generally we measure time in frames so dt = 1 • This leaves us with ) 1 , , ( ) , , (     t v y v x f t y x f y x
  • 12. 12 Greylevel Conservation • Taking a Taylor expansion and eliminating the higher order terms: t t y x f v y t y x f v x t y x f t y x f t y x f y x       ) , , ( ) , , ( + ) , , ( ) , , ( ) , , (   
  • 13. 13 Greylevel Conservation • Tidying up we are left with: • This is the standard form of the greylevel constraint equation • But…..       f x y t x v f x y t y v f x y t t x y ( , , ) ( , , ) ( , , )    0
  • 14. 14 Limitations of the greylevel constraint equation • The greylevel constraint equation only allows us to measure the flow in the direction of the greylevel image gradient
  • 15. 15 The aperture problem • Consider a plot in (vx , vy) space, at a single point in space - the greylevel conservation equation gives a line on which the true flow lies
  • 17. 17 The aperture problem • Thus we cannot generate a flow vector for a single point – we have to use a window • The larger the window is the better the chance of overcoming this problem • But the larger the window the greater the chance of the motion being different • This is called the aperture problem
  • 18. 18 Overcoming the aperture problem • Several solutions have been proposed: – Assume v(x,y) is smooth (Horn and Schunck’s algorithm) – Assume v(x,y) is locally piecewise linear or constant (Lucas and Kanade’s algorithm) – Assume v(x,y) obeys some simple model (Black and Annandan’s algorithm) • We shall consider the latter two solutions
  • 19. 19 Assuming a locally constant field • This algorithm assumes that the flow field around some point is constant
  • 20. 20 The model • Model: • This model is valid for points in some -point neighbourhood where the optical flow is assumed constant. f x y t f x v y v t n x y t x y ( , , ) ( , , ) ( , , )      1
  • 21. 21 Noise • n(x,y,t) is noise corrupting the true greylevel values and is assumed zero- mean and uncorrelated with variance: 0 )) , , ( (  t y x n E ' ' 2 )) , ' , ' ( ) , , ( ( yy xx n t y x n t y x n E    
  • 22. 22 • We can linearise our model: • Where: f x y t f x v y v t n x y t x y ( , , ) ( , , ) ( , , )      1 f x y t f x y t f x y t n x y t T ( , , ) ( , , ) ( , , ) ( , , )       1 1 v f x y t f x y t n x y t T ( , , ) ( , , ) ( , , )     1 v f x y t f x y t f x y t ( , , ) ( , , ) ( , , )    1
  • 23. 23 • For each point we have an equation: • We can write this in matrix form: f x y t f x y t x v f x y t y v n x y t i i i i x i i y i i ( , , ) ( , , ) ( , , ) ( , , )           1 1 h Av n   v n                         v v n x y t n x y t n x y t x y m m ( , , ) ( , , ) . . ( , , ) 1 1 2 2
  • 24. 24 • Matrix A and vector v are: A h                                                 f x y t x f x y t y f x y t x f x y t y f x y t x f x y t y f x y t f x y t f m m m m ( , , ) ( , , ) ( , , ) ( , , ) . . . . ( , , ) ( , , ) ( , , ) ( , , ) . . ( 1 1 1 1 2 2 2 2 1 1 2 2 1 1 1 1 1 1    x y t m m , , )                
  • 25. 25 • We can solve for using a least squares technique: • The result is:      min min v Av h Av h Av h v v      2 T    v A A A h   T T 1 v̂
  • 26. 26 • We are also interested in the quality of the estimate as measured by the covariance matrix of: • It can be shown that: • Thus we can determine the variances of the estimates of the components vx and vy v̂    C v v v v v   (  )  (  )    E E E T   1 2 ˆ   A A Cv T n 
  • 27. 27 • We can use the covariance matrix to determine a confidence ellipse at a certain probability (e.g. 99%) that the flow lies in that ellipse
  • 28. 28 • It can be seen from the expression for the variance estimates that the accuracy of the algorithm depends on: – Noise variance – Size of the neighbourhood – Edge business
  • 29. 29 Modelling Flow • An alternative to assuming constant flow is to use a model of the flow field • One such model is the Affine model:                                   y x a a a a a a v u 5 4 2 1 3 0
  • 30. 30 Estimating motion models • Black and Annandan propose an algorithm for estimating the parameters of the the model parameters • This uses robust estimation to separate different classes of motion
  • 31. 31 Minimisation of Error Function • Once again, if we are to find the optimum parameters we need an error function to minimise: E(x, y) = f(x, y, t) - f(x + u(x, y), y + v(x, y), t + 1) • But this is not in a form that is easy to minimise…
  • 32. 32 Gradient-based Formulation • Applying a Taylor expansion to the error function: E(x, y) = \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = \nabla I^T (u, v)^T + I_t • This is the greylevel constraint equation again
  • 33. 33 Gradient-descent Minimisation • If we know how the error changes with respect to the parameters, we can home in on the minimum error
  • 34. 34 Applying Gradient Descent • We need: \frac{\partial E}{\partial a_n} • Using the chain rule: \frac{\partial E}{\partial a_n} = \frac{\partial E}{\partial u} \frac{\partial u}{\partial a_n}
  • 35. 35 Robust Estimation • What about points that do not belong to the motion we are estimating? • These will pull the solution away from the true one
  • 36. 36 Robust Estimators • Robust estimators decrease the effect of outliers on estimation
  • 37. 37 Error w.r.t. parameters • The complete function is: \frac{\partial E_R}{\partial a_n} = \frac{\partial E_R}{\partial E} \frac{\partial E}{\partial u} \frac{\partial u}{\partial a_n}
  • 38. 38 Aside – Influence Function • It can be seen that the first derivative of the robust estimator is used in the minimisation:
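A common robust estimator in this setting is the Geman-McClure function; whether it is the exact choice made in the lecture is an assumption here. A minimal sketch of the estimator and its first derivative (the influence function used in the minimisation):

```python
import numpy as np

def geman_mcclure(x, sigma):
    """Geman-McClure robust error: rho(x) = x^2 / (sigma^2 + x^2).

    Saturates at 1 for large residuals instead of growing like x^2."""
    return x**2 / (sigma**2 + x**2)

def influence(x, sigma):
    """Influence function psi(x) = d rho / dx = 2*x*sigma^2 / (sigma^2 + x^2)^2.

    Falls back towards zero for large residuals, so outliers stop
    pulling the solution away from the dominant motion."""
    return 2 * x * sigma**2 / (sigma**2 + x**2)**2
```

Compare this with least squares, whose influence function grows linearly without bound: a single large outlier can dominate the quadratic fit, but here its influence decays to zero.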
  • 39. 39 Pyramid Approach • Trying to estimate the parameters from scratch at full scale can be wasteful • Therefore a ‘pyramid of resolutions’ or ‘Gaussian pyramid’ is used • The principle is to estimate the parameters on a smaller scale and refine until full scale is reached
  • 40. 40 Pyramid of Resolutions • Each level in the pyramid is half the scale of the one below – i.e. a quarter of the area
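Building the pyramid of resolutions can be sketched as repeated blur-and-subsample. As a simplifying assumption, a 2x2 block mean stands in for the usual small Gaussian kernel:

```python
import numpy as np

def pyramid(image, levels):
    """Build a pyramid of resolutions: each level is half the scale of
    the one below, i.e. a quarter of the area. A 2x2 block average is
    used as a crude blur + subsample in one step (a Gaussian kernel
    would normally be used)."""
    pyr = [np.asarray(image, dtype=float)]
    for _ in range(levels - 1):
        im = pyr[-1]
        h, w = im.shape[0] // 2 * 2, im.shape[1] // 2 * 2  # trim to even size
        im = im[:h, :w]
        down = (im[0::2, 0::2] + im[1::2, 0::2]
                + im[0::2, 1::2] + im[1::2, 1::2]) / 4.0
        pyr.append(down)
    return pyr  # pyr[0] is full scale, pyr[-1] is the coarsest level
```

Estimation then starts at `pyr[-1]`, and the parameters found there (scaled appropriately) initialise the search at the next finer level.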
  • 41. 41 • When combined with a suitable gradient-based minimisation scheme, out pops the solution… • Black and Anandan suggest the use of Graduated Non-Convexity
  • 42. 42 Feature Matching • Feature point matching offers an alternative to gradient based techniques for finding optical flow • The principle is to extract the locations of particular features from the frame and track their position in subsequent frames
  • 43. 43 Feature point selection • Feature points must be: • Local (extended line segments are no good, we require local disparity) • Distinct (a lot ‘different’ from neighbouring points) • Invariant (to rotation, scale, illumination) • The matching process must be: • Local (thus limiting the search area) • Consistent (leading to ‘smooth’ disparity estimates)
  • 44. 44 Approaches to Feature point selection • Previous approaches to feature point selection have been: – The Moravec interest operator, based on thresholding local greylevel squared differences – Symmetric features, e.g. circular features, spirals – Line segment endpoints – Corner points
  • 45. 45 EG OF FEATURE POINT STUFF
  • 46. 46 Motion and 3D Structure From Optical Flow • This area of computer vision attempts to reconstruct the structure of the 3D environment and the motion of objects within it using optical flow • Applications are many; the dominant one is autonomous navigation
  • 47. 47 • As we saw previously, the relationship between image plane motion and the 3D motion that it describes is summed up by the perspective projection
  • 48. 48 • The perspective projection is described as: (x, y) = \frac{f}{Z}(X, Y) • We can differentiate this w.r.t. time, with V = (V_X, V_Y, V_Z) = \frac{d}{dt}(X, Y, Z): \frac{d}{dt}(x, y) = (v_x, v_y) = \frac{f}{Z^2}(Z V_X - X V_Z, \; Z V_Y - Y V_Z) = \left(\frac{f V_X}{Z} - \frac{f X V_Z}{Z^2}, \; \frac{f V_Y}{Z} - \frac{f Y V_Z}{Z^2}\right)
  • 49. 49 • Substituting in the original perspective projection equation: (v_x, v_y) = \frac{1}{Z}(f V_X - x V_Z, \; f V_Y - y V_Z) • In matrix form: \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \frac{1}{Z} \begin{bmatrix} f & 0 & -x \\ 0 & f & -y \end{bmatrix} \begin{pmatrix} V_X \\ V_Y \\ V_Z \end{pmatrix} • Given the depth Z, we can invert this relationship to solve for V = (V_X, V_Y, V_Z)^T
  • 50. 50 • This gives us two components – one parallel to the image plane and one along our line of sight
  • 51. 51 Focus of Expansion • From the expression for optical flow we can determine a simple structure for the flow vectors in an image corresponding to a rigid-body translation: (v_x, v_y) = \frac{1}{Z}(f V_X - x V_Z, \; f V_Y - y V_Z) = \frac{V_Z}{Z}\left(\frac{f V_X}{V_Z} - x, \; \frac{f V_Y}{V_Z} - y\right) = \frac{V_Z}{Z}(x_0 - x, \; y_0 - y)
  • 52. 52 • (x_0, y_0) = \left(\frac{f V_X}{V_Z}, \frac{f V_Y}{V_Z}\right) is called the Focus of Expansion (FOE) • For motion towards the camera (V_Z negative) the flow vectors point away from the FOE (expansion), and for motion away from the camera (V_Z positive) the flow vectors point towards the FOE (contraction).
  • 53. 53 • The FOE provides important 3D information: (x_0, y_0, f) = \left(\frac{f V_X}{V_Z}, \frac{f V_Y}{V_Z}, f\right) = \frac{f}{V_Z}(V_X, V_Y, V_Z) • Thus the direction of translational motion, V / |V|, can be determined
  • 54. 54 • We can also estimate the time to impact \tau = Z / V_Z (or rather its inverse V_Z / Z) from flow measurements close to the FOE • Both the FOE and the time to impact can be estimated using least squares on the optical flow field at a number of image points
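The least-squares FOE estimate can be sketched directly from the flow structure above: each flow vector must be collinear with the ray from the FOE through its image point, so v_y(x - x_0) - v_x(y - y_0) = 0 gives one linear equation per point. A minimal sketch, not the lecture's exact formulation:

```python
import numpy as np

def estimate_foe(x, y, vx, vy):
    """Least-squares estimate of the FOE (x0, y0) from flow vectors
    of a rigid-body translation.

    Collinearity of each flow vector with the ray from the FOE gives
    vy*x0 - vx*y0 = vy*x - vx*y, one linear equation per point."""
    A = np.column_stack([vy, -vx])
    b = vy * x - vx * y
    foe, *_ = np.linalg.lstsq(A, b, rcond=None)
    return foe  # (x0, y0)
```

On a synthetic expanding flow field the recovered FOE matches the point the vectors radiate from; with noisy flow, more points simply tighten the least-squares fit.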
  • 55. 55 Structure from Motion • Here we shall discuss a simple structure from motion algorithm, which uses optical flow to estimate the 3D structure of the scene. • We shall be looking at a simplified situation where the camera is assumed to be fixed – i.e. no pan or tilt
  • 56. 56 • The starting point is the optical flow equation: \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \frac{1}{Z} \begin{bmatrix} f & 0 & -x \\ 0 & f & -y \end{bmatrix} \begin{pmatrix} V_X \\ V_Y \\ V_Z \end{pmatrix}, which rearranges to f V = Z (v_x, v_y, 0)^T + V_Z (x, y, f)^T • Thus, since V = (V_X, V_Y, V_Z)^T is a (scaled) vector sum of (v_x, v_y, 0)^T and (x, y, f)^T, the vector product of these 2 vectors is orthogonal to V
  • 57. 57
  • 58. 58 • But: (v_x, v_y, 0)^T \times (x, y, f)^T = (v_y f, \; -v_x f, \; v_x y - v_y x)^T • So: (v_y f, \; -v_x f, \; v_x y - v_y x) \begin{pmatrix} V_X \\ V_Y \\ V_Z \end{pmatrix} = 0
  • 59. 59 • This equation applies at all points (x_i, y_i), i = 1..N • Obviously a trivial solution to this equation would be V = 0 • Also, if some non-zero vector V is a solution then so is the vector cV for any scalar constant c • This confirms that we cannot determine the absolute magnitude of the velocity vector; we can only determine it up to a multiplicative scale constant.
  • 60. 60 • The solution to this is to solve the equation using a least-squares formulation subject to the condition that: V^T V = k^2
  • 61. 61 • We can re-write the orthogonality constraint for all points in matrix form: AV = 0 • Where: A = \begin{bmatrix} v_{y_1} f & -v_{x_1} f & v_{x_1} y_1 - v_{y_1} x_1 \\ \vdots & \vdots & \vdots \\ v_{y_N} f & -v_{x_N} f & v_{x_N} y_N - v_{y_N} x_N \end{bmatrix}
  • 62. 62 • Thus the problem is: \min_V (AV)^T (AV) subject to V^T V = k^2 • This is a classic optimisation problem; the optimal value is given by the eigenvector of A^T A corresponding to the minimum eigenvalue.
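The constrained minimisation can be sketched with NumPy's symmetric eigendecomposition, building each row of A as (v_y·f, −v_x·f, v_x·y − v_y·x). A minimal sketch under the same assumptions (pure translation, known focal length):

```python
import numpy as np

def estimate_translation(x, y, vx, vy, f=1.0):
    """Recover the translation direction V (up to scale and sign) as
    the eigenvector of A^T A with the smallest eigenvalue, where each
    row of A is (vy*f, -vx*f, vx*y - vy*x)."""
    A = np.column_stack([vy * f, -vx * f, vx * y - vy * x])
    w, vecs = np.linalg.eigh(A.T @ A)  # eigh returns ascending eigenvalues
    return vecs[:, 0]                  # unit-norm; overall sign is ambiguous
```

Because the solution is a unit eigenvector, the scale constant k is fixed to 1; the sign ambiguity reflects the fact that the flow alone cannot distinguish V from −V without knowing whether depths are positive.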
  • 63. 63 • Once we have our estimate \hat{V} we can compute our scene depths using the original optical flow equation: \begin{pmatrix} v_x \\ v_y \end{pmatrix} = \frac{1}{Z} \begin{bmatrix} f & 0 & -x \\ 0 & f & -y \end{bmatrix} \hat{V}, i.e. Z v_x = f \hat{V}_X - x \hat{V}_Z and Z v_y = f \hat{V}_Y - y \hat{V}_Z
  • 64. 64 • We can estimate each depth using a least-squares formulation: \min_{Z_i} \left(Z_i v_{x_i} - f \hat{V}_X + x_i \hat{V}_Z\right)^2 + \left(Z_i v_{y_i} - f \hat{V}_Y + y_i \hat{V}_Z\right)^2 • The solution of which is: \hat{Z}_i = \frac{v_{x_i}(f \hat{V}_X - x_i \hat{V}_Z) + v_{y_i}(f \hat{V}_Y - y_i \hat{V}_Z)}{v_{x_i}^2 + v_{y_i}^2} • The scene co-ordinates can be found using the perspective projection
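The per-point depth recovery can be sketched directly from the closed-form least-squares solution. Depths inherit the scale ambiguity of the translation estimate, so they are correct only relative to the scale of V supplied:

```python
import numpy as np

def estimate_depths(x, y, vx, vy, V, f=1.0):
    """Per-point scene depth from flow and a (scaled) translation
    estimate V: closed-form least-squares solution of the pair
    Z*vx = f*Vx - x*Vz and Z*vy = f*Vy - y*Vz for each point."""
    Vx, Vy, Vz = V
    b1 = f * Vx - x * Vz   # target for Z*vx
    b2 = f * Vy - y * Vz   # target for Z*vy
    return (vx * b1 + vy * b2) / (vx**2 + vy**2)
```

On noise-free synthetic flow generated from known depths and the true V, the formula recovers those depths exactly; with an estimated V it recovers them up to the common scale factor.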
  • 65. 65 Summary • 3D Geometry • Optical Flow • Flow estimation and the aperture problem • Focus of Expansion • Structure from Motion
  • 66. 66 Recap • Geometry • Flow • Flow estimation • Feature trackers • FOE and structure from motion
  • 67. 67 Tracking • Goal – to detect and track objects moving independently of the background • Two situations to be considered: – Static Background – Moving Background
  • 68. 68 Applications of Motion Tracking • Control Applications – Object Avoidance – Automatic Guidance – Head Tracking for Video Conferencing • Surveillance/Monitoring Applications – Security Cameras – Traffic Monitoring – People Counting
  • 69. 69 My Work • Started by tracking moving objects in a static scene • Develop a statistical model of the background • Mark all regions that do not conform to the model as moving object
  • 70. 70 My Work • Now working on object detection and classification from a moving camera • Current focus is motion compensated background filtering • Determine motion of background and apply to the model.
  • 71. 71 Detecting moving objects in a static scene • Simplest method: – Subtract consecutive frames. – Ideally this will leave only moving objects. – This is not an ideal world….
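The frame-subtraction idea can be sketched in a couple of lines; the threshold value below is an illustrative choice, not one from the lecture:

```python
import numpy as np

def difference_mask(prev, curr, threshold=25):
    """Mark pixels whose greylevel changed by more than a threshold
    between consecutive frames. Casting to int avoids unsigned
    wraparound when the frames are uint8 images."""
    return np.abs(curr.astype(int) - prev.astype(int)) > threshold
```

As the slide warns, this only flags pixels where the greylevel actually changes, so the interiors of uniformly textured moving objects are missed — the motivation for the full background model that follows.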
  • 72. 72 Using a background model • Lack of texture in objects means incomplete object masks are produced. • In order to obtain complete object masks we must have a model of the background as a whole.
  • 73. 73 Adapting to variable backgrounds • In order to cope with varying backgrounds it is necessary to make the model dynamic • A statistical system is used to update the model over time
  • 74. 74 Background Filtering • My algorithm based on: “Learning Patterns of Activity using Real-Time Tracking” C. Stauffer and W.E.L. Grimson. IEEE Trans. On Pattern Analysis and Machine Intelligence. August 2000 • The history of each pixel is modelled by a sequence of Gaussian distributions
  • 75. 75 Multi-dimensional Gaussian Distributions • Described mathematically as: \eta(X_t, \mu, \Sigma) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} e^{-\frac{1}{2}(X_t - \mu)^T \Sigma^{-1} (X_t - \mu)} • More easily visualised as a surface (2-Dimensional)
  • 76. 76 Simplifying…. • Calculating the full Gaussian for every pixel in the frame is very, very slow • Therefore I use a linear approximation
  • 77. 77 How do we use this to represent a pixel? • Stauffer and Grimson suggest using a static number of Gaussians for each pixel • This was found to be inefficient – so the number of Gaussians used to represent each pixel is variable
  • 78. 78 Weights • Each Gaussian carries a weight value • This weight is a measure of how well the Gaussian represents the history of the pixel • If a pixel is found to match a Gaussian then the weight is increased and vice-versa • If the weight drops below a threshold then that Gaussian is eliminated
  • 79. 79 Matching • Each incoming pixel value must be checked against all the Gaussians at that location • If a match is found then the value of that Gaussian is updated • If there is no match then a new Gaussian is created with a low weight
  • 80. 80 Updating • If a Gaussian matches a pixel, then the value of that Gaussian is updated using the current value • The rate of learning is greater in the early stages when the model is being formed
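The matching, weighting and updating rules on the last few slides can be sketched for a single greylevel pixel. This is a simplified sketch of the Stauffer and Grimson scheme, not their exact formulation: the match threshold, learning rate, initial variance and weight floor below are all illustrative values, and only scalar greylevels are handled.

```python
class PixelModel:
    """Simplified per-pixel background model: a variable-length list
    of [mean, variance, weight] Gaussians, as in the slides."""

    def __init__(self, match_sigma=2.5, alpha=0.05, min_weight=0.01):
        self.gaussians = []          # each entry is [mean, var, weight]
        self.match_sigma = match_sigma
        self.alpha = alpha           # learning rate
        self.min_weight = min_weight # Gaussians below this are eliminated

    def update(self, value):
        """Check the incoming value against all Gaussians; update on a
        match, otherwise create a new low-weight Gaussian. Returns True
        if the value matched (i.e. fits the pixel's history)."""
        matched = False
        for g in self.gaussians:
            mean, var, w = g
            if not matched and (value - mean) ** 2 <= (self.match_sigma ** 2) * var:
                # matched: pull mean/variance towards the new value, raise weight
                g[0] += self.alpha * (value - mean)
                g[1] += self.alpha * ((value - mean) ** 2 - var)
                g[2] = w + self.alpha * (1 - w)
                matched = True
            else:
                g[2] = w * (1 - self.alpha)  # unmatched weights decay
        self.gaussians = [g for g in self.gaussians if g[2] >= self.min_weight]
        if not matched:
            # no match: new Gaussian with the current value, a large
            # variance and a low weight (initial values are assumptions)
            self.gaussians.append([float(value), 100.0, self.alpha])
        return matched
```

A pixel that fails to match any Gaussian is a candidate moving-object pixel; over time, a stable new background value accumulates weight and is absorbed into the model.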
  • 81. 81 Static Scene Object Detection and Tracking • Model the background and subtract to obtain object mask • Filter to remove noise • Group adjacent pixels to obtain objects • Track objects between frames to develop trajectories
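The grouping step in the pipeline above — collecting adjacent mask pixels into objects — can be sketched with a simple 4-connected flood fill (a sketch; any connected-components routine would do):

```python
from collections import deque

def label_objects(mask):
    """Group adjacent foreground pixels (4-connectivity) into objects.

    mask: 2D list/array of truthy foreground values.
    Returns a list of objects, each a list of (row, col) pixels."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    objects = []
    for i in range(h):
        for j in range(w):
            if mask[i][j] and not seen[i][j]:
                # breadth-first flood fill from this seed pixel
                queue, pixels = deque([(i, j)]), []
                seen[i][j] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                objects.append(pixels)
    return objects
```

Each returned pixel group can then be reduced to a bounding box or centroid and matched between frames to build trajectories.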
  • 82. 82 Moving Camera Sequences • Basic Idea is the same as before – Detect and track objects moving within a scene • BUT – this time the camera is not stationary, so everything is moving
  • 83. 83 Motion Segmentation • Use a motion estimation algorithm on the whole frame • Iteratively apply the same algorithm to areas that do not conform to this motion to find all motions present • Problem – this is very, very slow
  • 84. 84 Motion Compensated Background Filtering • Basic Principle – Develop and maintain background model as previously – Determine global motion and use this to update the model between frames
  • 85. 85 Advantages • Only one motion model has to be found – This is therefore much faster • Estimating motion for small regions can be unreliable • Not as easy as it sounds though…..
  • 86. 86 Motion Models • Trying to determine the exact optical flow at every point in the frame would be ridiculously slow • Therefore we try to fit a parametric model to the motion
  • 87. 87 Affine Motion Model • The affine model describes the flow vector at each point in the image: u(x, y) = a_0 + a_1 x + a_2 y, \quad v(x, y) = a_3 + a_4 x + a_5 y • Need to find values for the parameters that best fit the motion present
  • 88. 88 Background Motion Estimation • Uses the framework described earlier by Black and Anandan
  • 90. 90 Other approaches to Tracking • Many approaches using active contours – a.k.a. snakes – Parameterised curves – Fitted to the image by minimising some cost function – often based on fitting the contour to edges
  • 91. 91 Constraining shape • To avoid the snake being influenced by points we aren’t interested in, use a model to constrain its shape.
  • 92. 92 CONDENSATION • No discussion on tracking can omit the CONDENSATION algorithm developed by Isard and Blake. • CONditional DENSity propagATION • A non-Gaussian substitute for the Kalman filter • Uses factored sampling to model non-Gaussian probability densities, and to estimate and propagate them through time.
  • 93. 93 CONDENSATION • Thus we can take a set of parameters and estimate them from frame to frame, using current information from the frames • These parameters may be positions or shape parameters from a snake.
  • 94. 94 CONDENSATION - Algorithm • Randomly take samples from the previous distribution. • Apply a random drift and deterministic diffusion based on a model of how the parameters behave to the samples. • Weight each sample on the basis of the current information. • Estimate of actual value can be either a weighted average or a peak value from the distribution
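The four steps above can be sketched for a one-dimensional state (e.g. an object's horizontal position). This is a minimal factored-sampling sketch in the spirit of CONDENSATION, not Isard and Blake's implementation; the drift, diffusion and likelihood functions are supplied by the caller:

```python
import random

def condensation_step(samples, weights, drift, diffuse, likelihood):
    """One CONDENSATION iteration over a weighted sample set.

    1. resample from the previous distribution according to weight,
    2. predict: deterministic drift + random diffusion per sample,
    3. measure: reweight each sample with the current frame's
       likelihood, then normalise."""
    picked = random.choices(samples, weights=weights, k=len(samples))
    predicted = [diffuse(drift(s)) for s in picked]
    new_w = [likelihood(s) for s in predicted]
    total = sum(new_w) or 1.0   # guard against an all-zero likelihood
    return predicted, [w / total for w in new_w]

def estimate(samples, weights):
    """Weighted-mean state estimate (a peak of the sample set could
    be used instead, as the slide notes)."""
    return sum(s * w for s, w in zip(samples, weights))
```

Because the density is carried by the samples themselves, multi-modal (non-Gaussian) distributions survive from frame to frame — e.g. two competing hypotheses about an occluded object's position — which a Kalman filter's single Gaussian cannot represent.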
  • 95. 95 Tracking Summary • Static-scene background subtraction methods • Extensions to moving camera systems • Use of model-constrained active contour systems • CONDENSATION