Lu, Wang-Chou 
Image Representation 
Usage Guide 2014/10/01 @ 林口
Why Image Representation? 
Machine Learning Course @ Caltech 
x_i = 500 x 375 D (187,500 values)
Low-Dimensional Vector
Map of Representation
[Figure: methods placed on two axes, Robust and Computation Cost]
Color Histogram, PCA, Sparse Coding, Bag of Visual Words, DPM, Deep Learning…
Outline 
❖ Hand Crafted Features
❖ Machine Learning Approach
❖ Hierarchical Approaches
❖ When to use? Real-time vs Precision
Hand Crafted Features 
❖ Color Histogram, Template, Haar Features
❖ Interest Point Detector + HOG
❖ Bag of Visual Words
Simple Features 
Histogram Based
Haar Features
Template Based
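As an illustration of the histogram-based feature, here is a minimal sketch of a joint RGB color histogram. The function name `color_histogram`, the 8 bins per channel, and the H x W x 3 uint8 input layout are assumptions made for this example, not details from the slides.

```python
import numpy as np

def color_histogram(image, bins=8):
    """image: H x W x 3 uint8 array. Returns a normalized joint RGB histogram."""
    pixels = image.reshape(-1, 3).astype(np.int64)
    # Per-channel bin index in [0, bins); 256 intensity levels are assumed.
    idx = pixels * bins // 256
    # Combine the three channel indices into one index in [0, bins**3).
    flat = (idx[:, 0] * bins + idx[:, 1]) * bins + idx[:, 2]
    hist = np.bincount(flat, minlength=bins ** 3).astype(float)
    return hist / hist.sum()
```

With 8 bins per channel, any image size maps to the same 512-D vector, which is what makes this a cheap fixed-length representation.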
SIFT Like Approach
HOG, 3780D: 7 x 15 overlapping blocks * (normalized 2 x 2 grid of cells) * 9 bins, N. Dalal et al. [CVPR 2005]
SIFT, 128D, |V| = 1: 4 x 4 grid * 8 bins, David Lowe [IJCV 2004]
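The two descriptor sizes above follow from simple counting, sketched below. The 64 x 128 detection window and 8-pixel cells are assumptions taken from the usual Dalal and Triggs setup rather than from the slide; the function names are illustrative.

```python
def hog_dim(win_w=64, win_h=128, cell=8, block=2, bins=9):
    """Length of a HOG descriptor with blocks sliding one cell at a time."""
    cells_x, cells_y = win_w // cell, win_h // cell              # 8 x 16 cells
    blocks_x, blocks_y = cells_x - block + 1, cells_y - block + 1  # 7 x 15 blocks
    return blocks_x * blocks_y * block * block * bins

def sift_dim(grid=4, bins=8):
    """Length of a SIFT descriptor: one orientation histogram per grid cell."""
    return grid * grid * bins

print(hog_dim(), sift_dim())
```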
Bag of Visual Words + SPM
SIFT Descriptor
SVM
S. Lazebnik et al. [CVPR 2006]
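A rough sketch of how a bag-of-visual-words vector with a spatial pyramid could be built, assuming the local descriptors, their normalized positions `xy`, and a codebook of visual words are already available. All names are hypothetical, and the per-level weights used in the spatial pyramid matching paper are left out to keep the example short.

```python
import numpy as np

def bow_histogram(descriptors, codebook):
    """Count how many descriptors fall on each visual word (nearest centre)."""
    # Squared Euclidean distance between every descriptor and every word.
    d2 = ((descriptors[:, None, :] - codebook[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=len(codebook)).astype(float)

def spatial_pyramid(descriptors, xy, codebook, levels=(1, 2)):
    """Concatenate BoW histograms over 1x1, 2x2, ... grids; xy lies in [0, 1)."""
    parts = []
    for g in levels:
        # Grid cell of each keypoint at this pyramid level.
        cell = np.minimum((xy * g).astype(int), g - 1)
        for cy in range(g):
            for cx in range(g):
                mask = (cell[:, 0] == cx) & (cell[:, 1] == cy)
                parts.append(bow_histogram(descriptors[mask], codebook))
    feat = np.concatenate(parts)
    return feat / max(feat.sum(), 1.0)
```

The resulting fixed-length vector is what would then be fed to a classifier such as the SVM named on the slide.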
Machine Learning Approach 
❖ Dimensionality Reduction
❖ PCA, Manifold Learning, Sparse Coding, LSH
❖ Deformable Part Model
❖ Neural Network
❖ Convolutional Neural Network
Principal Component Analysis
M. A. Turk et al. [CVPR 1991]
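A minimal PCA sketch using the SVD of the centered data, assuming one flattened image per row of `X`. The function names are illustrative and this is not meant as the method of the cited paper.

```python
import numpy as np

def pca_fit(X, k):
    """Return the data mean and the top-k principal directions (as rows)."""
    mean = X.mean(axis=0)
    # Rows of Vt are principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X - mean, full_matrices=False)
    return mean, Vt[:k]

def pca_project(X, mean, components):
    """Map each row of X to its k-dimensional PCA coordinates."""
    return (X - mean) @ components.T
```

Projecting onto the first k directions turns each high-dimensional image vector into a k-D vector, which is the "low dimensional vector" idea from the opening slide.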
Manifold Learning 
[ISOMAP, LLE 2003]
Sparse Coding 
Objective: reconstruction error + sparsity
Y: Input Vector
B: Basis Matrix
Z: Weight
H. Lee et al. [NIPS 2007]
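One way to compute the weights for a fixed basis is sketched below with ISTA (iterative soft thresholding). The assumed objective is 0.5 * ||y - B z||^2 + lam * ||z||_1, a common form of "reconstruction error + sparsity"; the solver choice, `lam`, and the iteration count are assumptions for illustration, and this is not the algorithm of the cited paper.

```python
import numpy as np

def soft_threshold(v, t):
    """Shrink each entry of v toward zero by t (proximal step for the L1 term)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_code(y, B, lam=0.1, n_iter=200):
    """ISTA for min_z 0.5 * ||y - B z||^2 + lam * ||z||_1 with B held fixed."""
    L = np.linalg.norm(B, 2) ** 2  # Lipschitz constant of the gradient
    z = np.zeros(B.shape[1])
    for _ in range(n_iter):
        grad = B.T @ (B @ z - y)   # gradient of the reconstruction term
        z = soft_threshold(z - grad / L, lam / L)
    return z
```

A larger `lam` pushes more entries of the weight vector to exactly zero, at the cost of a larger reconstruction error.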
Locality Sensitive Hashing Embedding
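The slide does not say which hash family is meant. As one common example, the sketch below uses random hyperplanes (sign of random projections) to embed a vector into a short binary code; `make_hasher` and `hamming` are hypothetical names.

```python
import numpy as np

def make_hasher(dim, n_bits, seed=0):
    """Build a hash function from n_bits random hyperplanes through the origin."""
    rng = np.random.default_rng(seed)
    planes = rng.standard_normal((n_bits, dim))

    def hash_vec(x):
        # Each bit records on which side of one hyperplane x lies.
        return (planes @ x > 0).astype(np.uint8)

    return hash_vec

def hamming(a, b):
    """Number of differing bits between two binary codes."""
    return int(np.count_nonzero(a != b))
```

Vectors separated by a small angle tend to agree on most bits, so Hamming distance between codes can stand in for similarity in the original space.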
Deformable Part Model 
Pedro F. Felzenszwalb et al [PAMI 2010]
Neural Network 
Tanh & Sigmoid nonlinear functions
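For reference, a small sketch of a single neuron with the two nonlinearities named above. The helper names and the weighted-sum-plus-bias form are the standard textbook setup, assumed here rather than taken from the slide.

```python
import math

def sigmoid(x):
    """Logistic function, written to avoid overflow for large negative x."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)

def neuron(inputs, weights, bias, activation=math.tanh):
    """Weighted sum of the inputs followed by a nonlinear activation."""
    s = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(s)
```

Passing `activation=sigmoid` switches the output range from (-1, 1) to (0, 1).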
Convolutional Neural Network
LeCun 1989
Krizhevsky et al. [NIPS 2012]
ReLU
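A toy sketch of one convolution step followed by ReLU, assuming a single-channel image and a single kernel. Real networks stack many channels and layers; the function names are illustrative.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: keep positive values, zero out the rest."""
    return np.maximum(x, 0.0)

def conv2d_valid(image, kernel):
    """'Valid' 2-D cross-correlation, the operation CNN layers usually call convolution."""
    kh, kw = kernel.shape
    oh, ow = image.shape[0] - kh + 1, image.shape[1] - kw + 1
    out = np.empty((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # Dot product between the kernel and the patch under it.
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out
```

A feature map would then be `relu(conv2d_valid(image, kernel))`, with the kernel learnt from data instead of designed by hand.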
State of the Art 
GoogLeNet 2014
MSRA 2014
Deepness Table 
Convolutional Neural Network & Deformable Part Model use max pooling;
others use sum pooling (histogram pooling)
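The difference between the two pooling styles can be sketched on a small 2-D feature map, assuming non-overlapping square windows. The `pool` helper is hypothetical.

```python
import numpy as np

def pool(feature_map, size, reduce):
    """Apply `reduce` (e.g. np.max or np.sum) over non-overlapping size x size windows."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size  # drop rows/columns that do not fill a window
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return reduce(blocks, axis=(1, 3))

fm = np.arange(16).reshape(4, 4)
max_pooled = pool(fm, 2, np.max)  # keeps the strongest response per window
sum_pooled = pool(fm, 2, np.sum)  # accumulates responses, like a histogram count
```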
Image Representation Usage Guide
[Figure: representations plotted against Deepness (axis from 1 to 5), grouped into Real Time Application and Interactive Application]
Representations, as listed on the chart: Convolutional Neural Network; BoW + SPM, Deformable Part Model; Bag of Visual Words; SIFT/HOG; Color Histogram
Hardware: iPhone 5s, 45 GFLOPS; Tegra K1 or PC, 370 GFLOPS; GeForce Titan HPC, 5.1 TFLOPS
GFLOPS are for single precision; PC: i7 3.5 GHz, 4 cores, parallel computing
TRAINING TIME NOT INCLUDED
Some Tips 
❖ GPU ~= 50 CPU Cores
❖ Hand-crafted features are shallow; higher-level feature templates need to be learnt.
❖ Do Dimensionality Reduction
❖ Deeper Features, More Training Data
❖ Handle Invariance: Registration vs Spatial Pooling
❖ The Learnt Deep Representation (CNN) is shareable

