Part 1: Classical Image Classification Methods. Kai Yu, Dept. of Media Analytics, NEC Laboratories America; Andrew Ng, Computer Science Dept., Stanford University
Outline of Part 2 (05/13/11): Local Features, Sampling, Visual Words; Discriminative Methods: Bag-of-Words (BoW) representation, Spatial pyramid matching (SPM); Generative Methods: Part-based methods, Topic models
Local features: distinctive descriptors of local image patches, invariant to local translation, scale, … and sometimes rotation or general affine transformations. The most famous choice is the SIFT feature.
Sampling local features from images yields a set of points. Image credits: F-F. Li, E. Nowak, J. Sivic
Visual words: similar points are grouped into one visual word. Algorithms: k-means, agglomerative clustering, … Points from different images are then more easily compared. Slide credit: Kristen Grauman
Bag-of-words (BoW) representation: analogy to documents. Adapted from tutorial slides by Fei-Fei et al.
BoW for object categorization: works pretty well for whole-image classification. Csurka et al. (2004), Willamowski et al. (2005), Grauman & Darrell (2005), Sivic et al. (2003, 2005). Slide credit: Svetlana Lazebnik
Unsupervised Dictionary Learning: sample local features from the images in an image database, then run k-means or another clustering algorithm to get the dictionary. The dictionary is also called the “codebook”. Figure: SIFT space partitioned into regions R1, R2, R3.
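The dictionary-learning step can be sketched with a plain k-means loop. This is a minimal illustration under assumptions, not the authors' implementation: the function names, the toy 2-D points standing in for 128-dimensional SIFT descriptors, and the iteration count are all hypothetical.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_center(point, centers):
    """Index of the center closest to the given point."""
    return min(range(len(centers)), key=lambda k: squared_distance(point, centers[k]))

def learn_codebook(descriptors, num_words, num_iters=20, seed=0):
    """Plain k-means: returns num_words cluster centers (the visual words)."""
    rng = random.Random(seed)
    # Initialise the centers with randomly chosen descriptors.
    centers = [list(d) for d in rng.sample(descriptors, num_words)]
    for _ in range(num_iters):
        # Assignment step: group descriptors by their nearest center.
        clusters = [[] for _ in range(num_words)]
        for d in descriptors:
            clusters[nearest_center(d, centers)].append(d)
        # Update step: move each center to the mean of its members.
        for k, members in enumerate(clusters):
            if members:  # an empty cluster keeps its previous center
                dim = len(members[0])
                centers[k] = [sum(m[i] for m in members) / len(members)
                              for i in range(dim)]
    return centers

# Toy 2-D "descriptors" (hypothetical data; real SIFT vectors have 128 dimensions).
toy_descriptors = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2),
                   (5.0, 5.1), (5.2, 4.9), (4.9, 5.0),
                   (9.0, 0.1), (9.2, 0.0), (8.9, 0.2)]
codebook = learn_codebook(toy_descriptors, num_words=3)
```

In practice the descriptors would be sampled from many images, and a library k-means with better initialisation would usually replace this loop.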
Compute BoW histogram for each image: assign SIFT features to clusters (regions R1, R2, R3 in the figure), then compute the frequency of each cluster within an image to obtain the BoW histogram representation.
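A small sketch of this step, assuming a codebook is already available: each descriptor is assigned to its nearest visual word and the counts are normalised to frequencies. The function name, the hand-written codebook, and the toy descriptors are illustrative assumptions.

```python
def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def bow_histogram(descriptors, codebook):
    """Normalised histogram of nearest-visual-word assignments for one image."""
    counts = [0] * len(codebook)
    for d in descriptors:
        # Vector quantisation: pick the closest codebook entry.
        nearest = min(range(len(codebook)),
                      key=lambda k: squared_distance(d, codebook[k]))
        counts[nearest] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

# Hypothetical 3-word codebook and the descriptors of one image.
codebook = [(0.0, 0.0), (5.0, 5.0), (9.0, 0.0)]
image_descriptors = [(0.1, 0.2), (4.8, 5.1), (5.2, 4.9), (0.3, 0.1)]
hist = bow_histogram(image_descriptors, codebook)
print(hist)  # [0.5, 0.5, 0.0]
```

The output length depends only on the codebook size, not on how many descriptors the image produced.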
Interpretation of the BoW histogram: it summarizes the entire image by its distribution of visual word occurrences, turns bags of different sizes into a fixed-length vector, and is analogous to the bag-of-words representation commonly used for text categorization.
Image classification based on BoW histogram: learn a classification model to determine the decision boundary (e.g. between dog and bird) in the BoW histogram vector space. Nonlinear SVMs are commonly applied.
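What makes the SVM nonlinear is its kernel. The slide does not name one; the histogram intersection and chi-square kernels below are assumptions, chosen because they are widely used with histogram features. The sketch only computes a Gram matrix and leaves training to any SVM solver that accepts a precomputed kernel.

```python
import math

def intersection_kernel(h1, h2):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(a, b) for a, b in zip(h1, h2))

def chi2_kernel(h1, h2, gamma=1.0):
    """Exponentiated chi-square kernel (one common form; gamma is a free parameter)."""
    dist = sum((a - b) ** 2 / (a + b) for a, b in zip(h1, h2) if a + b > 0)
    return math.exp(-gamma * dist)

def gram_matrix(histograms, kernel):
    """Pairwise kernel values between all histograms."""
    return [[kernel(hi, hj) for hj in histograms] for hi in histograms]

# Hypothetical BoW histograms for three images.
hists = [[0.5, 0.5, 0.0], [0.4, 0.6, 0.0], [0.0, 0.1, 0.9]]
K = gram_matrix(hists, intersection_kernel)
```

Here the first two histograms overlap heavily (kernel value about 0.9), while the third shares little mass with either, which is the kind of similarity structure the classifier works from.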
Issues: Sampling strategy. Learning the codebook: size? supervised? … Classification: which method? Scalability: how to handle millions of data points? How to use spatial information?
Spatial information: BoW removes spatial layout. This increases invariance to scale, translation, and deformation, but sacrifices discriminative power, especially when the spatial layout is important. Slide adapted from Bill Freeman
Spatial pyramid matching: compute BoW for image regions at different locations and at various scales. Figure credit: Svetlana Lazebnik
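One way to picture this: split the image into progressively finer grids, build a word histogram per cell, and concatenate everything. The sketch below assumes features arrive as (x, y, word index) triples with coordinates normalised to [0, 1]; the function name and data are hypothetical, and the per-level weights of the original spatial pyramid matching kernel are deliberately left out.

```python
def spatial_pyramid(features, num_words, num_levels=3):
    """Concatenate per-cell word counts over grids of 1x1, 2x2, 4x4, ... cells."""
    pyramid = []
    for level in range(num_levels):
        cells = 2 ** level  # cells per side at this level
        hists = [[0] * num_words for _ in range(cells * cells)]
        for x, y, word in features:
            # Clamp so that a coordinate of exactly 1.0 falls in the last cell.
            col = min(int(x * cells), cells - 1)
            row = min(int(y * cells), cells - 1)
            hists[row * cells + col][word] += 1
        for h in hists:
            pyramid.extend(h)
    return pyramid

# Hypothetical features: (x, y, visual word index).
features = [(0.1, 0.1, 0), (0.9, 0.1, 1), (0.6, 0.8, 1), (0.2, 0.7, 2)]
vec = spatial_pyramid(features, num_words=3, num_levels=2)
print(len(vec))  # 15 = 3 words * (1 + 4) cells
```

Level 0 is the ordinary whole-image BoW histogram; the finer levels add back the coarse layout information that plain BoW discards.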
A common pipeline for discriminative image classification using BoW. Dictionary learning: Dense/Sparse SIFT → K-means → dictionary. Image classification: Dense/Sparse SIFT → VQ Coding → Spatial Pyramid Pooling → Nonlinear SVM.
Combining multiple descriptors: Multiple Feature Detectors → Multiple Descriptors (SIFT, shape, color, …) → VQ Coding and Spatial Pooling → Nonlinear SVM. Diagram from SurreyUVA_SRKDA, winning team in PASCAL VOC 2008
Topic models for images: Latent Dirichlet Allocation (LDA). Fei-Fei et al., ICCV 2005. Figure: plate diagram with labels w, N, c, z, D; example “beach”. Slide credit: Fei-Fei Li
Part-based Model: Fischler & Elschlager 1973. Rob Fergus, ICCV09 Tutorial
For comprehensive coverage of object categorization models, please visit: Recognizing and Learning Object Categories, Li Fei-Fei (Stanford), Rob Fergus (NYU), Antonio Torralba (MIT), http://people.csail.mit.edu/torralba/shortCourseRLOC/

ECCV2010: feature learning for image classification, part 1
