Image representation usage guide

Lu, Wang-Chou
Image Representation Usage Guide
2014/10/01 @ 林口 (Linkou)

Why Image Representation?
Machine Learning Course @ Caltech
Xi = 500 x 375 D (raw pixels, 187,500 dimensions)
-> Low Dimensional Vector

Map of Representation
Axes: Robust vs Computation Cost
Color Histogram, PCA, Sparse Coding, Bag of Visual Word, DPM, Deep Learning…

Outline
❖ Hand Crafted Features
❖ Machine Learning Approach
❖ Hierarchical Approaches
❖ When to use? Real-time vs Precision

Hand Crafted Features
❖ Color Histogram, Template, Haar Features
❖ Interest Point Detector + HOG
❖ Bag of Visual Word

Simple Features
Histogram Based
Haar Features
Template Based
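
As an illustration of the histogram-based case, here is a minimal sketch of a joint RGB color histogram. The function name `color_histogram`, the choice of 8 bins per channel, and the random stand-in image are assumptions for this example, not something the slides specify.

```python
import numpy as np

def color_histogram(image, bins_per_channel=8):
    """Joint RGB histogram, L1-normalized. image: (H, W, 3) uint8 array."""
    # Quantize each channel from 0..255 down to bins_per_channel levels.
    q = (image.astype(np.int64) * bins_per_channel) // 256
    # Fold the three quantized channels into a single bin index.
    idx = (q[..., 0] * bins_per_channel + q[..., 1]) * bins_per_channel + q[..., 2]
    hist = np.bincount(idx.ravel(), minlength=bins_per_channel ** 3)
    hist = hist.astype(np.float64)
    return hist / hist.sum()

# Synthetic stand-in for a 500 x 375 color image (hypothetical data).
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(375, 500, 3), dtype=np.uint8)
feature = color_histogram(image)
print(image.size, "->", feature.shape[0])  # 562500 -> 512
```

The histogram discards all spatial layout, which is why it is cheap but only weakly discriminative.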

SIFT Like Approach
SIFT: 128D, |V| = 1 (normalized)
4 x 4 grid * 8 bins
David Lowe [IJCV 2004]
HOG: 3780D
7 x 15 overlapped blocks * (normalized 2 x 2 grid of cells) * 9 bins
N. Dalal et al. [CVPR 2005]
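
The two descriptor lengths quoted above follow from simple arithmetic. The sketch below reproduces them; the 64 x 128 pixel window and 8-pixel cells are assumed here (they are the usual pedestrian-detector settings, not stated on the slide), and the function names are made up for this example.

```python
def hog_length(window=(64, 128), cell=8, block=2, stride_cells=1, bins=9):
    """Length of a HOG descriptor built from overlapping blocks of cells."""
    cells_x, cells_y = window[0] // cell, window[1] // cell
    # Blocks slide over the cell grid one cell at a time, so they overlap.
    blocks_x = (cells_x - block) // stride_cells + 1
    blocks_y = (cells_y - block) // stride_cells + 1
    return blocks_x * blocks_y * block * block * bins

def sift_length(grid=4, bins=8):
    """Length of a SIFT descriptor: grid x grid cells, one histogram each."""
    return grid * grid * bins

print(hog_length())   # 7 * 15 * (2 * 2) * 9 = 3780
print(sift_length())  # 4 * 4 * 8 = 128
```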

Bag of Visual Word + SPM
Pipeline: SIFT Descriptor -> Bag of Visual Word + SPM -> SVM
S. Lazebnik et al. [CVPR 2006]
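
A rough sketch of the encoding step, assuming local descriptors (e.g. SIFT) and their normalized image positions are already available. It uses a plain k-means codebook and concatenates unweighted histograms over 1 x 1 and 2 x 2 grids; the per-level weights of the original SPM paper and the SVM stage are left out. All names and the random data are hypothetical.

```python
import numpy as np

def assign(data, centers):
    """Index of the nearest center (visual word) for each descriptor."""
    d2 = ((data[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def kmeans(data, k, iters=10, seed=0):
    """Plain Lloyd's k-means; returns the k cluster centers (the codebook)."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), size=k, replace=False)].copy()
    for _ in range(iters):
        labels = assign(data, centers)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old center if a cluster is empty
                centers[j] = members.mean(axis=0)
    return centers

def bow_histogram(descriptors, centers):
    """Normalized count of visual words (sum pooling over descriptors)."""
    words = assign(descriptors, centers)
    hist = np.bincount(words, minlength=len(centers)).astype(np.float64)
    return hist / max(hist.sum(), 1.0)

def spm_histogram(descriptors, xy, centers, levels=2):
    """Concatenate BoW histograms over 1x1, 2x2, ... grids. xy lies in [0, 1)."""
    parts = []
    for level in range(levels):
        n = 2 ** level
        cell = np.minimum((xy * n).astype(int), n - 1)
        cell_id = cell[:, 1] * n + cell[:, 0]
        for c in range(n * n):
            parts.append(bow_histogram(descriptors[cell_id == c], centers))
    return np.concatenate(parts)

rng = np.random.default_rng(0)
descriptors = rng.random((200, 128))  # stand-in for 128D SIFT descriptors
xy = rng.random((200, 2))             # stand-in for keypoint positions
centers = kmeans(descriptors, k=16)
feature = spm_histogram(descriptors, xy, centers)
print(feature.shape)  # (80,) = 16 words * (1 + 4) cells
```

The resulting vector is what would be fed to the SVM.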

Machine Learning Approach
❖ Dimensionality Reduction
❖ PCA, Manifold Learning, Sparse Coding, LSH
❖ Deformable Part Model
❖ Neural Network
❖ Convolutional Neural Network

Principal Component Analysis
M. A. Turk et al. [CVPR 1991]
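
A minimal PCA sketch via the SVD, assuming each image has already been flattened into one row of a data matrix. The function names, the number of components, and the random data are illustrative assumptions.

```python
import numpy as np

def pca_fit(X, n_components):
    """Return the data mean and the top principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of Vt are the principal directions of the centered data.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:n_components]

def pca_transform(X, mean, components):
    """Project centered data onto the principal directions."""
    return (X - mean) @ components.T

rng = np.random.default_rng(0)
X = rng.random((40, 500))  # 40 hypothetical flattened images, 500 pixels each
mean, components = pca_fit(X, n_components=10)
Z = pca_transform(X, mean, components)
print(X.shape, "->", Z.shape)  # (40, 500) -> (40, 10)
```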

Manifold Learning
[ISOMAP, LLE 2003]

Sparse Coding
Objective: reconstruction error + sparsity
Y: Input Vector
B: Basis Matrix
Z: Weight
H. Lee et al. [NIPS 2007]
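
One common way to write the two terms is 0.5 * ||Y - B Z||^2 + lam * |Z|_1, where the trade-off weight lam is an assumption of this sketch. The code below solves for Z with a fixed basis B using ISTA (iterative soft thresholding), which is a simple substitute and not the solver proposed in the cited paper; learning B itself is omitted.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink every entry toward zero by t (proximal step of the L1 term)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def objective(Y, B, Z, lam):
    """Reconstruction error plus sparsity penalty."""
    return 0.5 * np.sum((Y - B @ Z) ** 2) + lam * np.sum(np.abs(Z))

def sparse_code(Y, B, lam=0.1, iters=200):
    """ISTA: gradient step on the reconstruction error, then shrinkage."""
    L = np.linalg.norm(B, 2) ** 2  # Lipschitz constant of the gradient
    Z = np.zeros(B.shape[1])
    for _ in range(iters):
        grad = B.T @ (B @ Z - Y)
        Z = soft_threshold(Z - grad / L, lam / L)
    return Z

rng = np.random.default_rng(0)
B = rng.standard_normal((20, 50))
B /= np.linalg.norm(B, axis=0)        # unit-norm basis vectors
Z_true = np.zeros(50)
Z_true[[3, 17, 41]] = [1.0, -2.0, 1.5]
Y = B @ Z_true                        # input built from three basis vectors
Z = sparse_code(Y, B)
print("nonzero weights:", np.count_nonzero(Z), "of", Z.size)
```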

Locality Sensitive Hashing Embedding
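
One simple LSH scheme is random-hyperplane hashing, which maps a vector to a short binary code so that vectors at a small angle tend to share most bits. The slide does not say which LSH family is meant, so this is an assumed example with made-up function names.

```python
import numpy as np

def make_hyperplanes(dim, n_bits, seed=0):
    """One random hyperplane (normal vector) per output bit."""
    return np.random.default_rng(seed).standard_normal((n_bits, dim))

def lsh_code(x, planes):
    """Bit i records which side of hyperplane i the vector x falls on."""
    return (planes @ x >= 0).astype(np.uint8)

def hamming(a, b):
    """Number of differing bits between two codes."""
    return int(np.count_nonzero(a != b))

rng = np.random.default_rng(1)
planes = make_hyperplanes(dim=64, n_bits=32)
x = rng.standard_normal(64)
near = x + 0.05 * rng.standard_normal(64)  # small perturbation of x
far = rng.standard_normal(64)              # unrelated vector
print(hamming(lsh_code(x, planes), lsh_code(near, planes)))
print(hamming(lsh_code(x, planes), lsh_code(far, planes)))
```

The code depends only on the direction of the vector, so comparing images reduces to a Hamming distance between short bit strings.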

Deformable Part Model
Pedro F. Felzenszwalb et al. [PAMI 2010]

Neural Network
Tanh & Sigmoid nonlinear functions
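
For reference, a small sketch of the two nonlinearities and of one fully connected layer that applies them. The names `layer`, `W`, and `b` and the layer sizes are hypothetical.

```python
import numpy as np

def sigmoid(x):
    """Logistic function, output in (0, 1)."""
    return 1.0 / (1.0 + np.exp(-x))

def layer(x, W, b, activation=np.tanh):
    """One layer: affine map followed by an elementwise nonlinearity."""
    return activation(W @ x + b)

rng = np.random.default_rng(0)
x = rng.standard_normal(8)
W = rng.standard_normal((4, 8))
b = np.zeros(4)
print(layer(x, W, b))           # tanh outputs lie in (-1, 1)
print(layer(x, W, b, sigmoid))  # sigmoid outputs lie in (0, 1)
```

The two functions are closely related: tanh(x) = 2 * sigmoid(2x) - 1.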

Convolutional Neural Network
LeCun 1989
Krizhevsky et al. [NIPS 2012]
ReLU
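
A toy sketch of the two building blocks named here: a single-channel "valid" convolution (implemented as cross-correlation, as is common in CNN code) followed by ReLU. The loop-based implementation and the tiny example are for illustration only and say nothing about how the cited networks are actually implemented.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide the kernel over the image without padding (cross-correlation)."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Rectified linear unit: keep positive responses, zero the rest."""
    return np.maximum(x, 0.0)

image = np.arange(16, dtype=float).reshape(4, 4)  # values grow left to right
rising = np.array([[-1.0, 1.0]])   # responds to left-to-right increase
falling = np.array([[1.0, -1.0]])  # responds to left-to-right decrease
print(relu(conv2d_valid(image, rising)))   # all ones
print(relu(conv2d_valid(image, falling)))  # all zeros
```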

State of the Art
GoogLeNet 2014
MSRA 2014

Deepness Table
Convolutional Neural Network & Deformable Part Model use max pooling;
others use sum pooling, also called histogram pooling.
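
The difference between the two pooling styles can be shown on a small feature map. This is a sketch with non-overlapping 2 x 2 regions; the helper name `pool` and the example values are made up.

```python
import numpy as np

def pool(feature_map, size, reduce):
    """Apply `reduce` (np.max or np.sum) over non-overlapping size x size blocks."""
    h, w = feature_map.shape
    h2, w2 = h // size, w // size
    blocks = feature_map[:h2 * size, :w2 * size].reshape(h2, size, w2, size)
    return reduce(blocks, axis=(1, 3))

fm = np.array([[1, 0, 2, 0],
               [0, 3, 0, 0],
               [0, 0, 1, 1],
               [4, 0, 1, 1]], dtype=float)

print(pool(fm, 2, np.max))  # [[3. 2.] [4. 1.]]
print(pool(fm, 2, np.sum))  # [[4. 2.] [4. 4.]]
```

Max pooling keeps only the strongest response in each region, while sum pooling accumulates all of them, which is what a histogram does.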

Image Representation Usage Guide
[Chart: Deepness (axis from 1 to 5) vs hardware]
Representations, listed from the top of the Deepness axis down:
Convolutional Neural Network
BoW + SPM, Deformable Part Model
Bag of Visual Word
SIFT/HOG
Color Histogram
Application ranges: Real Time Application, Interactive Application
Hardware (single-precision throughput):
iPhone 5s: 45 GFLOPS
Tegra K1 or PC: 370 GFLOPS
GeForce Titan HPC: 5.1 TFLOPS
PC: i7 3.5 GHz, 4 cores, parallel computing
TRAINING TIME NOT INCLUDED

Some Tips
❖ GPU ~= 50 CPU Cores
❖ Hand Crafted Features are shallow; higher-level feature templates need to be learnt.
❖ Do Dimensionality Reduction
❖ Deeper Features, More Training Data
❖ Handle Invariance: Registration vs Spatial Pooling
❖ The Learnt Deep Representation (CNN) is shareable
