Image Feature Extraction II
SIFT
Elsayed Hemayed
Overview
• Image Matching and local features
• Scale-invariant feature transform (SIFT)
Image Features II SIFT 2
Image Matching
Invariant local features
- Your eyes don’t see everything at once; they jump around, and you
see only about 2 degrees with high resolution
- Find features that are invariant to transformations
– geometric invariance: translation, rotation, scale
– photometric invariance: brightness, exposure, …
Feature Descriptors
How to achieve invariance
Need both of the following:
1. Make sure your detector is invariant
– Harris is invariant to translation and rotation
– Scale is trickier
• common approach is to detect features at many scales using a Gaussian pyramid
• More sophisticated methods find “the best scale” to represent each feature (e.g., SIFT)
2. Design an invariant feature descriptor
– A descriptor captures the information in a region around the
detected feature point
– The simplest descriptor: a square window of pixels
• What’s this invariant to?
– better approaches exist now …
Feature descriptors
We know how to detect good points
Next question: How to match them?
Lots of possibilities (this is a popular research area)
– Simple option: match square windows around the point
– Better approach: SIFT
• David Lowe, UBC http://www.cs.ubc.ca/~lowe/keypoints/
SIFT Background
• Scale-invariant feature transform
– SIFT detects and describes local features in an image.
– Proposed by David Lowe in ICCV 1999.
– Refined in IJCV 2004.
– Cited more than 60,000 times to date.
– Widely used in image search, object recognition, video tracking, gesture recognition, etc.
Why is SIFT so popular?
• Desired properties of SIFT
– Invariant to scale change
– Invariant to rotation change
– Invariant to illumination change
– Robust to addition of noise
– Robust to a substantial range of affine transformation
– Robust to 3D viewpoint change
– Highly distinctive for discrimination
How to extract SIFT
Detector: where are the local features?
Descriptor: how to describe them?
SIFT Algorithm Steps
• Step 1: Constructing a scale space
• Step 2: Laplacian of Gaussian
approximation
• Step 3: Finding Keypoints
• Step 4: Eliminate edges and low contrast
regions
• Step 5: Assign an orientation to the
keypoints
• Step 6: Generate SIFT features
Step 1: Constructing a scale space
• To create a scale space, take the original image and generate
progressively more blurred versions of it using a Gaussian blur.
Blurring suppresses fine detail, so each level focuses on structures
at a coarser scale of the scene.
Gaussian Blur
L(x, y, σ) = G(x, y, σ) * I(x, y)
The symbols:
• L is the blurred image
• G is the Gaussian blur operator
• I is the input image
• x, y are the location coordinates
• σ is the “scale” parameter. Think of it as the amount of blur: the greater the value, the greater the blur.
Here’s an example:
• Look at how the cat’s helmet loses detail. So do its whiskers.
• SIFT takes scale spaces to the next level: resize the original image
to half size, generate the blurred images again, and keep repeating
for each octave.
4 octaves and 5 blur levels
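As a sketch, the 4-octave × 5-level construction above might look like this in plain NumPy. A real pipeline would use cv2.GaussianBlur and cv2.resize; the separable-blur helper, the clipped kernel radius, and the σ₀ = 1.6, k = √2 values are assumptions in the spirit of Lowe's paper, not settings stated on the slides.

```python
import numpy as np

def blur(img, sigma):
    """Separable Gaussian blur; the kernel radius is clipped so the
    kernel never exceeds the image (a toy simplification)."""
    radius = max(1, min(int(3 * sigma), min(img.shape) // 2 - 1))
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2 * sigma**2))
    k /= k.sum()
    out = np.apply_along_axis(lambda r: np.convolve(r, k, mode='same'), 1, img)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode='same'), 0, out)

def build_scale_space(img, n_octaves=4, n_levels=5, sigma0=1.6, k=np.sqrt(2)):
    """Each octave: progressively blurred copies; next octave: half size."""
    octaves = []
    base = img.astype(float)
    for _ in range(n_octaves):
        octaves.append([blur(base, sigma0 * k**i) for i in range(n_levels)])
        base = base[::2, ::2]          # halve the image for the next octave
    return octaves

space = build_scale_space(np.random.rand(64, 64))
print(len(space), len(space[0]), space[1][0].shape)   # 4 5 (32, 32)
```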
Step 2: Laplacian of Gaussian
approximation
• The Laplacian of Gaussian (LoG) operation goes like this: take an
image, blur it a little, then compute its second-order derivatives
(the “Laplacian”). These are good for finding keypoints.
• The problem: calculating all those second-order derivatives is
computationally intensive.
• Solution: use the Difference of Gaussians (DoG).
– Use the scale space from the previous step.
– Compute the difference between each pair of consecutive scales.
– These DoG images are great for finding interesting keypoints in the image.
These Difference of Gaussian images are approximately equivalent to the
Laplacian of Gaussian. And we’ve replaced a computationally intensive
process with a simple subtraction (fast and efficient).
Difference-of-Gaussians
D(x, y, σ) = (G(x, y, kσ) − G(x, y, σ)) * I(x, y)
           = L(x, y, kσ) − L(x, y, σ)
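The subtraction itself is trivial; a minimal sketch, with toy constant images standing in for the real blurred levels of one octave:

```python
import numpy as np

# Stand-in "blurred" levels; in the real pipeline these come from Step 1.
levels = [np.full((8, 8), float(i)) for i in range(5)]   # 5 blur levels
dog = [levels[i + 1] - levels[i] for i in range(len(levels) - 1)]

print(len(dog))        # 4 DoG images from 5 blur levels
print(dog[0][0, 0])    # 1.0
```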
Step 3: Finding Keypoints
• Iterate through each pixel and check all of its neighbours. The check is
done within the current DoG image, and also in the scale above and below
it. Something like this:
X marks the current pixel.
The green circles mark the neighbours.
X is marked as a “keypoint” if it is the greatest or least of all 26
neighbours.
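A minimal sketch of the 26-neighbour check; the `dog` stack of toy arrays here stands in for three consecutive DoG scales:

```python
import numpy as np

def is_extremum(dog, s, y, x):
    """True if dog[s][y, x] is strictly greater or smaller than all 26
    neighbours: 8 in its own DoG image plus 9 in the scales above and below."""
    cube = np.stack([dog[s - 1][y-1:y+2, x-1:x+2],
                     dog[s    ][y-1:y+2, x-1:x+2],
                     dog[s + 1][y-1:y+2, x-1:x+2]])
    val = cube[1, 1, 1]
    others = np.delete(cube.ravel(), 13)   # drop the centre pixel itself
    return bool(val > others.max() or val < others.min())

dog = [np.zeros((5, 5)) for _ in range(3)]
dog[1][2, 2] = 9.0                         # a lone peak in the middle scale
print(is_extremum(dog, 1, 2, 2))           # True
print(is_extremum(dog, 1, 2, 1))           # False
```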
Step 4: Eliminate edges and low
contrast regions
• The previous step produces a lot of keypoints. Some of them lie
along an edge, or they don’t have enough contrast. In both cases,
they are not useful as features, so we need to get rid of them.
• Reject points with poor contrast:
– |DoG| smaller than 0.03 (image values in [0, 1])
• Reject edges:
– Use a Harris-style detector and keep only corners
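A sketch of both rejection tests. The slide suggests a Harris-style corner check; Lowe's paper instead uses the ratio of principal curvatures of the 2×2 Hessian of the DoG image D (with r = 10), which is what this toy version implements:

```python
import numpy as np

def keep_keypoint(D, y, x, contrast_thresh=0.03, r=10.0):
    """Keep a candidate only if it has enough contrast and is not edge-like."""
    if abs(D[y, x]) < contrast_thresh:
        return False                         # low contrast
    # 2x2 Hessian of D by finite differences
    dxx = D[y, x+1] - 2*D[y, x] + D[y, x-1]
    dyy = D[y+1, x] - 2*D[y, x] + D[y-1, x]
    dxy = (D[y+1, x+1] - D[y+1, x-1] - D[y-1, x+1] + D[y-1, x-1]) / 4.0
    tr, det = dxx + dyy, dxx*dyy - dxy*dxy
    if det <= 0:
        return False                         # curvatures differ in sign
    return tr**2 / det < (r + 1)**2 / r      # reject elongated (edge) responses

blob = np.zeros((5, 5)); blob[2, 2] = 0.5    # isolated peak -> corner-like
edge = np.zeros((5, 5)); edge[2, :] = 0.5    # ridge along a row -> edge-like
print(keep_keypoint(blob, 2, 2))             # True
print(keep_keypoint(edge, 2, 2))             # False
```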
Maxima in D
Remove low contrast and edges
Step 5: Assign an orientation to the keypoints
• The idea is to collect gradient magnitude and orientation around
each keypoint. Then we figure out the most prominent orientation(s)
in that region. And we assign this orientation(s) to the keypoint.
• This orientation provides rotation invariance
• For a keypoint, let L be the blurred image with the closest scale.
– Compute the gradient magnitude and orientation using finite
(central) differences:
GradientVector = ( L(x+1, y) − L(x−1, y), L(x, y+1) − L(x, y−1) )
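A sketch of these finite differences; note that practical implementations use atan2 rather than plain atan, so the orientation covers all four quadrants:

```python
import numpy as np

def grad_mag_ori(L, y, x):
    """Gradient magnitude and orientation at (x, y) by central differences."""
    dx = L[y, x+1] - L[y, x-1]
    dy = L[y+1, x] - L[y-1, x]
    mag = np.hypot(dx, dy)
    theta = np.degrees(np.arctan2(dy, dx)) % 360.0   # full 0-360 degree range
    return mag, theta

L = np.zeros((3, 3)); L[1, 2] = 1.0                  # brighter to the right
m, t = grad_mag_ori(L, 1, 1)
print(m, t)   # 1.0 0.0
```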
Step 5: Assign an orientation to the keypoints
• The magnitude and orientation are calculated for all pixels around the
keypoint. Then a histogram is created: the 360 degrees of orientation are
broken into 36 bins (10 degrees each). The histogram will have a peak at
some point.
• Above, the histogram peaks at 20–29 degrees, so the keypoint is assigned
orientation 3 (the third bin). The “amount” added to a bin is proportional
to the gradient magnitude at that point.
• Also, any other peak above 80% of the highest peak is converted into a new
keypoint. This new keypoint has the same location and scale as the original,
but its orientation is equal to that other peak. So orientation assignment
can split one keypoint into multiple keypoints.
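A toy sketch of the 36-bin, magnitude-weighted histogram and the 80%-of-peak rule; the sample magnitudes and angles below are made up for illustration:

```python
import numpy as np

def orientation_peaks(mags, thetas, n_bins=36):
    """Magnitude-weighted 36-bin orientation histogram; return every bin
    whose height is within 80% of the highest peak (each becomes a keypoint)."""
    hist = np.zeros(n_bins)
    width = 360 // n_bins                      # 10 degrees per bin
    for m, t in zip(mags, thetas):
        hist[int(t // width) % n_bins] += m
    return [b for b in range(n_bins) if hist[b] >= 0.8 * hist.max()]

# Most gradients cluster near 25 deg, with a strong secondary cluster near 115 deg
mags   = [1.0, 1.0, 1.0, 0.9, 0.9, 0.9]
thetas = [22.0, 25.0, 28.0, 112.0, 115.0, 118.0]
print(orientation_peaks(mags, thetas))   # [2, 11]
```

Two surviving bins means this keypoint is split into two keypoints with different orientations.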
Orientation assignment
SIFT descriptor
Step 6: Generate SIFT features
• Each point so far has x, y, σ, m, θ
– Location x,y
– Scale: σ
– gradient magnitude and orientation: m, θ
• Now we need a descriptor for the region
– Could sample intensities around point, but…
• Sensitive to lighting changes
• Sensitive to slight errors in x, y, θ
Making descriptor rotation invariant
• Rotate patch according to its dominant gradient orientation
• This puts the patches into a canonical orientation.
Step 6: Generate SIFT features
• So far we have scale and rotation invariance. Now we create a
fingerprint to identify each keypoint.
• To do this, take a 16×16 window around the keypoint. This 16×16
window is broken into sixteen 4×4 windows.
Within each 4×4 window, gradient magnitudes and orientations are calculated,
and the orientations are put into an 8-bin histogram. The amount added to a
bin depends on the magnitude of the gradient, and also on the distance from
the keypoint, so gradients far from the keypoint add smaller values to the
histogram.
Step 6: Generate SIFT features (Cont.)
Do this for all sixteen 4×4 regions. So you end up with 4x4x8 = 128
numbers. Once you have all 128 numbers, you normalize them.
These 128 numbers form the “feature vector”.
This keypoint is uniquely identified by this feature vector.
=> Feature vector (128)
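A sketch of the whole descriptor step: 16×16 gradients → sixteen 4×4 cells → 8-bin histograms → normalized 128-vector. The Gaussian distance-weighting σ here is an assumed value; the slides only say that distant gradients contribute less:

```python
import numpy as np

def sift_descriptor(mags, thetas, sigma_weight=8.0):
    """128-D descriptor sketch from 16x16 gradient magnitudes/orientations
    (in degrees) centred on the keypoint."""
    assert mags.shape == thetas.shape == (16, 16)
    yy, xx = np.mgrid[0:16, 0:16] - 7.5              # offsets from the keypoint
    w = np.exp(-(xx**2 + yy**2) / (2 * sigma_weight**2))
    weighted = mags * w                              # far gradients count less
    desc = np.zeros(128)
    for cy in range(4):
        for cx in range(4):
            cell = (slice(4*cy, 4*cy + 4), slice(4*cx, 4*cx + 4))
            bins = (thetas[cell] // 45).astype(int) % 8   # 8 bins of 45 degrees
            hist = np.bincount(bins.ravel(),
                               weights=weighted[cell].ravel(), minlength=8)
            desc[(4*cy + cx) * 8:(4*cy + cx) * 8 + 8] = hist
    n = np.linalg.norm(desc)
    return desc / n if n > 0 else desc               # normalize the 128 numbers

d = sift_descriptor(np.ones((16, 16)), np.full((16, 16), 30.0))
print(d.shape, round(float(np.linalg.norm(d)), 3))   # (128,) 1.0
```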
Numeric Example (by Yao Lu)
[Figure: a 4×4 cell of sample gradient magnitudes, e.g. 0.98, 0.97, 0.91, 0.90, 0.79, 0.75, 0.45, 0.37, …]
For a pixel with value L(x, y) and neighbours L(x−1, y), L(x+1, y), L(x, y−1), L(x, y+1):

magnitude(x, y) = √( (L(x+1, y) − L(x−1, y))² + (L(x, y+1) − L(x, y−1))² )

θ(x, y) = atan( (L(x, y+1) − L(x, y−1)) / (L(x+1, y) − L(x−1, y)) )
The orientations in each of the 16 pixels of the cell all ended up in two
bins: 11 in one bin, 5 in the other (rough count), giving the cell histogram:
5 11 0 0 0 0 0 0
Summary of SIFT Feature
• Descriptor: 128-D
– 4 by 4 patches, each with 8-D gradient angle
histogram:
4×4×8 = 128
– Normalized to reduce the effects of illumination
change.
• Position: (x, y)
– Where the feature is located.
• Scale
– Controls the region size for descriptor extraction.
• Orientation
– Makes the descriptor rotation-invariant.
Properties of SIFT
• Extraordinarily robust matching technique
– Can handle changes in viewpoint
• Up to about 30 degrees of out-of-plane rotation
– Can handle significant changes in illumination
• Sometimes even day vs. night (below)
– Fast and efficient—can run in real time
– Various code available
• http://www.cs.ubc.ca/~lowe/keypoints/
NASA Mars Rover images
with SIFT feature matches
Figure by Noah Snavely
Example
Example: Object Recognition
Lowe, IJCV04
SIFT is extremely powerful for object instance
recognition, especially for well-textured objects
Example: Google Goggle
Panorama?
• We need to match (align) images
Matching with Features
•Detect feature points in both images
•Find corresponding pairs
•Use these matching pairs to align images - the
required mapping is called a homography.
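A sketch of the "find corresponding pairs" step using the nearest/second-nearest ratio test from Lowe's paper (0.8 is his suggested threshold); the descriptors below are random stand-ins for real SIFT vectors:

```python
import numpy as np

def match_descriptors(desc1, desc2, ratio=0.8):
    """Match each descriptor in desc1 to its nearest neighbour in desc2,
    keeping it only if it beats the second-nearest by Lowe's ratio test."""
    matches = []
    for i, d in enumerate(desc1):
        dists = np.linalg.norm(desc2 - d, axis=1)
        j, j2 = np.argsort(dists)[:2]
        if dists[j] < ratio * dists[j2]:
            matches.append((i, int(j)))
    return matches

rng = np.random.default_rng(0)
desc2 = rng.random((10, 128))
desc1 = desc2[[3, 7]] + 0.001 * rng.random((2, 128))   # near-copies of rows 3, 7
print(match_descriptors(desc1, desc2))   # [(0, 3), (1, 7)]
```

The surviving pairs would then be fed to a homography estimator (e.g. RANSAC) to align the images.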
Automatic mosaicing
Recognition of specific objects, scenes
Rothganger et al. 2003 Lowe 2002
Schmid and Mohr 1997 Sivic and Zisserman, 2003
Kristen Grauman
Example: 3D Reconstructions
• Photosynth (also called Photo Tourism)
developed at UW by Noah Snavely, Steve Seitz,
Rick Szeliski and others
http://www.youtube.com/watch?v=p16frKJLVi0
• Building Rome in a day, developed at UW by
Sameer Agarwal, Noah Snavely, Steve Seitz
and others
http://www.youtube.com/watch?v=kxtQqYLRaSQ&feature=player_embedded
When does the SIFT descriptor fail?
Patches SIFT thought were the same but aren’t:
References
• David G. Lowe, “Distinctive image features from scale-invariant keypoints,” International Journal of Computer Vision, 2004. http://www.cs.ubc.ca/~lowe/papers/ijcv04.pdf
• SIFT: Scale-Invariant Feature Transform (AI Shack). http://www.aishack.in/2010/05/sift-scale-invariant-feature-transform/
• Implementing SIFT in OpenCV (AI Shack). http://www.aishack.in/2010/07/implementing-sift-in-opencv/
Thank You
Elsayed Hemayed
hemayed@ieee.org