Part 3: Image Classification using Sparse Coding: Advanced Topics. Kai Yu, Dept. of Media Analytics, NEC Laboratories America; Andrew Ng, Computer Science Dept., Stanford University.
Outline of Part 3 (05/13/11): Why can sparse coding learn good features? (Intuition, topic model view, and geometric view.) A theoretical framework: local coordinate coding. Two practical coding methods. Recent advances in sparse coding for image classification.
Intuition: why does sparse coding help classification? The coding is a nonlinear feature mapping. It represents data in a higher-dimensional space. Sparsity makes prominent patterns more distinctive. (Figure from http://www.dtreg.com/svm.htm)
A “topic model” view of sparse coding: each basis is a “direction” or a “topic”. Sparsity: each datum is a linear combination of only a few bases. Applicable to image denoising, inpainting, and super-resolution. (Both figures adapted from the CVPR10 tutorial by F. Bach, J. Mairal, J. Ponce, and G. Sapiro. Figure labels: Basis 1, Basis 2.)
A geometric view of sparse coding: each basis is somewhat like a pseudo data point, an “anchor point”, on the data manifold. Sparsity: each datum is a sparse combination of neighboring anchors. The coding scheme exploits the manifold structure of the data. (Figure labels: data manifold, basis, data.)
MNIST experiment: classification using sparse coding (SC). 60K training examples, 10K test examples. Dictionary size k = 512. Linear SVM on the sparse codes. Try different values of lambda.
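The objective whose lambda is varied in this experiment is not preserved in the transcript. A common form of the sparse coding problem is sketched below for orientation; the symbols x_i (data points), B (dictionary with k columns b_j), and a_i (codes) are assumed notation rather than the slides' own, and scaling conventions for lambda differ between papers.

```latex
\min_{B,\,\{a_i\}} \; \sum_{i=1}^{n} \Big( \tfrac{1}{2}\,\lVert x_i - B a_i \rVert_2^2 + \lambda\, \lVert a_i \rVert_1 \Big)
\quad \text{s.t. } \lVert b_j \rVert_2 \le 1, \;\; j = 1, \dots, k
```

A larger lambda penalizes nonzero coefficients more heavily, so each datum is explained by fewer bases.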
MNIST experiment, lambda = 0.0005: each basis is like a part or direction.
MNIST experiment, lambda = 0.005: again, each basis is like a part or direction.
MNIST experiment, lambda = 0.05: now each basis is more like a digit!
MNIST experiment, lambda = 0.5: like clustering now!
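The effect of lambda on sparsity can be reproduced on a toy problem. The sketch below is a minimal, assumption-laden illustration: it uses ISTA (iterative soft-thresholding) to solve the coding step for a fixed basis, which is one standard solver and not necessarily the one used for the MNIST experiment. The names `soft_threshold`, `ista`, `basis`, and `lam` are hypothetical.

```python
# Minimal sketch: ISTA for the coding step of sparse coding with a FIXED basis.
# Assumption: objective is 0.5 * ||x - sum_j a[j] * basis[j]||^2 + lam * ||a||_1.

def soft_threshold(v, t):
    """Shrink v toward zero by t (proximal operator of t * |v|)."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0

def ista(x, basis, lam, step, n_iter=200):
    """Return approximate sparse coefficients of x over the rows of `basis`."""
    k, d = len(basis), len(x)
    a = [0.0] * k
    for _ in range(n_iter):
        # Current reconstruction and residual (reconstruction minus data).
        recon = [sum(a[j] * basis[j][i] for j in range(k)) for i in range(d)]
        resid = [recon[i] - x[i] for i in range(d)]
        # Gradient of the quadratic term with respect to each coefficient.
        grad = [sum(basis[j][i] * resid[i] for i in range(d)) for j in range(k)]
        # Gradient step followed by soft-thresholding.
        a = [soft_threshold(a[j] - step * grad[j], step * lam) for j in range(k)]
    return a

def nnz(a, tol=1e-8):
    """Count coefficients that are not (numerically) zero."""
    return sum(1 for v in a if abs(v) > tol)

# Three unit-norm bases in 2-D (an over-complete toy dictionary).
basis = [[1.0, 0.0], [0.0, 1.0], [0.6, 0.8]]
x = [0.9, 1.0]

# step must be below 1 / (largest eigenvalue of the Gram matrix); here that is 1 / 2.
codes_small_lam = ista(x, basis, lam=0.01, step=0.3)
codes_large_lam = ista(x, basis, lam=0.8, step=0.3)
print(nnz(codes_small_lam), nnz(codes_large_lam))
```

On this toy input the larger lambda leaves only the single basis closest in direction to x active, which mirrors the slides' remark that large lambda behaves like clustering.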
Geometric view of sparse coding: when SC achieves the best classification accuracy, the learned bases are like digits, and each basis has a clear local class association. Implication: exploiting data geometry may be useful for classification. (Errors reported under the three basis figures: 4.54%, 3.75%, 2.64%.)
Distribution of coefficients (MNIST): neighboring bases tend to get nonzero coefficients.
Distribution of coefficients (SIFT, Caltech101): a similar observation holds here!
Recap: two different views of sparse coding. View 1, discovering “topic” components: each basis is a “direction”; sparsity means each datum is a linear combination of several bases; related to topic models. View 2, geometric structure of the data manifold: each basis is an “anchor point”; sparsity means each datum is a linear combination of neighboring anchors; somewhat like a soft VQ (link to BoW). Either view can be valid for sparse coding under certain circumstances. View 2 seems to be helpful for classifying sensory data.
Key theoretical question: why can unsupervised feature learning via sparse coding help classification?
The image classification setting for analysis: dense local features, then sparse coding, then linear pooling, then a linear SVM. The resulting function on images is built from a function on patches. Implication: learning an image classifier is a matter of learning nonlinear functions on patches.
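One way to see why the implication holds: with linear (average) pooling followed by a linear classifier, the image-level score decomposes into a sum of per-patch scores. The notation here is assumed for illustration: x_1, ..., x_n are the patches of image X, a(x) is the code of a patch, and w is the classifier weight vector.

```latex
f(X) \;=\; w^\top \Big( \frac{1}{n} \sum_{i=1}^{n} a(x_i) \Big)
     \;=\; \frac{1}{n} \sum_{i=1}^{n} w^\top a(x_i)
     \;=\; \frac{1}{n} \sum_{i=1}^{n} g(x_i),
\qquad g(x) := w^\top a(x)
```

Because a(x) is a nonlinear function of the patch, g is a nonlinear function on patches even though w is learned by a linear model.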
Illustration: nonlinear learning via local coding. (Figure labels: data points, bases, locally linear.)
How to learn a nonlinear function? Step 1: learn the dictionary from unlabeled data.
How to learn a nonlinear function? Step 2: use the dictionary to encode the data.
How to learn a nonlinear function? Step 3: estimate the parameters, that is, the global linear weights to be learned on the sparse codes of the data. Nonlinear local learning is achieved by learning a global linear function.
Local Coordinate Coding (LCC): connecting coding to nonlinear function learning (Yu et al., NIPS-09). If f(x) is (alpha, beta)-Lipschitz smooth, the function approximation error is bounded by a coding error term plus a locality term. The key message: a good coding scheme should (1) have a small coding error and (2) be sufficiently local.
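The inequality itself did not survive in the transcript. As best recalled from Yu et al. (NIPS 2009), for a coding gamma over a set of anchor points C with coefficients summing to one, the bound has roughly the following shape; treat the exact constants and exponent as an assumption to be checked against the paper.

```latex
\Big| f(x) - \sum_{v \in C} \gamma_v(x)\, f(v) \Big|
\;\le\;
\underbrace{\alpha \,\lVert x - \gamma(x) \rVert}_{\text{coding error}}
\;+\;
\underbrace{\beta \sum_{v \in C} \lvert \gamma_v(x) \rvert \,\lVert v - \gamma(x) \rVert^{2}}_{\text{locality term}},
\qquad
\gamma(x) := \sum_{v \in C} \gamma_v(x)\, v
```

The left-hand side is the function approximation error of the global linear model on the codes. The first term on the right is small when x is reconstructed well, and the second is small when the anchors with nonzero coefficients lie close to x.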
Application of LCC theory: a fast implementation with a large dictionary (Wang et al., CVPR 10); a simple geometric way to improve BoW (Zhou et al., ECCV 10).
The larger the dictionary, the higher the accuracy, but also the higher the computation cost (Yu et al., NIPS-09; Yang et al., CVPR 09). The same observation holds for Caltech-256, PASCAL, ImageNet, …
Locality-constrained linear coding (LLC): a fast implementation of LCC (Wang et al., CVPR 10). Dictionary learning: k-means (or hierarchical k-means). Coding for X: Step 1, ensure locality: find the K nearest bases. Step 2, ensure low coding error: [equation not recoverable from the transcript].
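The two coding steps above can be sketched in a few lines. This is a hedged illustration in the spirit of the approximated LLC described by Wang et al., not their implementation: step 2 is written here as a least-squares fit on the K selected bases with weights constrained to sum to one, and the names `llc_code`, `solve`, `n_neighbors`, and `reg` are hypothetical.

```python
# Minimal pure-Python sketch of locality-constrained coding for one descriptor.

def solve(A, b):
    """Solve A z = b by Gaussian elimination with partial pivoting (small square A)."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    z = [0.0] * n
    for r in range(n - 1, -1, -1):
        z[r] = (M[r][n] - sum(M[r][c] * z[c] for c in range(r + 1, n))) / M[r][r]
    return z

def llc_code(x, codebook, n_neighbors=2, reg=1e-4):
    """Return a code with at most n_neighbors nonzero entries that sum to 1."""
    d = len(x)
    # Step 1 (locality): pick the K nearest bases by squared Euclidean distance.
    dist = [sum((b[i] - x[i]) ** 2 for i in range(d)) for b in codebook]
    idx = sorted(range(len(codebook)), key=lambda j: dist[j])[:n_neighbors]
    # Step 2 (low coding error): constrained least squares on the selected bases.
    shifted = [[codebook[j][i] - x[i] for i in range(d)] for j in idx]
    cov = [[sum(p * q for p, q in zip(u, v)) for v in shifted] for u in shifted]
    trace = sum(cov[i][i] for i in range(n_neighbors))
    for i in range(n_neighbors):
        cov[i][i] += reg * trace + 1e-12  # small ridge keeps the system solvable
    w = solve(cov, [1.0] * n_neighbors)
    total = sum(w)
    code = [0.0] * len(codebook)
    for j, wj in zip(idx, w):
        code[j] = wj / total  # normalise so the weights sum to 1
    return code

codebook = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [5.0, 5.0]]
code = llc_code([0.25, 0.0], codebook, n_neighbors=2)
print(code)
```

Only a K-by-K system is solved per descriptor, so the cost depends on K rather than on the dictionary size once the neighbors are found; that is the property that makes large dictionaries affordable.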
Competitive in accuracy, cheap in computation (Wang et al., CVPR 10). Figure annotations relative to sparse coding: comparable with sparse coding; significantly better than sparse coding. This is one of the two major algorithms applied by the NEC-UIUC team to achieve the No. 1 position in the ImageNet challenge 2010!
Interpreting “BoW + linear classifier”: piecewise locally constant (zero-order). (Figure labels: data points, cluster centers.)
Super-vector coding: a simple geometric way to improve BoW (VQ) (Zhou et al., ECCV 10). Piecewise locally linear (first-order). (Figure labels: data points, cluster centers, local tangent.)
Super-vector coding: a simple geometric way to improve BoW (VQ). If f(x) is beta-Lipschitz smooth, and [formula lost in extraction], the function approximation error of the local tangent is bounded by a term that depends on the quantization error.
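The inequality on this slide is missing from the transcript. A standard first-order bound of the kind the slide labels describe is sketched below, under the assumption that "beta-Lipschitz smooth" means the gradient of f is beta-Lipschitz and that v denotes the cluster center nearest to x; this notation is assumed, not the slide's.

```latex
\big| f(x) - f(v) - \nabla f(v)^\top (x - v) \big|
\;\le\; \frac{\beta}{2}\, \lVert x - v \rVert^{2}
```

Here f(v) + grad f(v)^T (x - v) is the local tangent at the cluster center, the left-hand side is the function approximation error, and the squared distance on the right is the quantization error of VQ.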
Super-vector coding: learning a nonlinear function via a global linear model. Let [symbol lost] be the VQ coding of [symbol lost]. The super-vector codes of the data are combined with global linear weights to be learned. This is one of the two major algorithms applied by the NEC-UIUC team to achieve the No. 1 position in PASCAL VOC 2009!
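A super-vector code can be sketched as follows. This is a minimal illustration under assumptions: hard VQ assignment to the nearest center, and a per-cluster block made of a constant s followed by the offset from that center. The function name `super_vector_code` and the default s = 1.0 are hypothetical, and details such as weighting in Zhou et al. may differ.

```python
# Minimal sketch of super-vector coding for one descriptor.

def super_vector_code(x, centers, s=1.0):
    """Return a code of length len(centers) * (len(x) + 1), nonzero in one block only."""
    d = len(x)
    # VQ step: index of the nearest cluster center.
    dist = [sum((c[i] - x[i]) ** 2 for i in range(d)) for c in centers]
    k = min(range(len(centers)), key=lambda j: dist[j])
    code = [0.0] * (len(centers) * (d + 1))
    start = k * (d + 1)
    code[start] = s                                  # zero-order part (plain BoW indicator)
    for i in range(d):
        code[start + 1 + i] = x[i] - centers[k][i]   # first-order part (offset from center)
    return code

centers = [[0.0, 0.0], [4.0, 4.0]]
sv = super_vector_code([3.5, 4.5], centers)
print(sv)  # → [0.0, 0.0, 0.0, 1.0, -0.5, 0.5]
```

A linear model on this code evaluates to a constant plus a linear function of the offset within each cluster, which gives the piecewise locally linear behavior; dropping the offset entries leaves the piecewise constant BoW model.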
Summary of geometric coding methods: vector quantization (BoW), (fast) local coordinate coding, and super-vector coding. All lead to higher-dimensional, sparse, and localized coding. All exploit the geometric structure of data. The new coding methods are suitable for linear classifiers. Their implementations are quite straightforward.
Things not covered here: improved LCC using local tangents (Yu & Zhang, ICML10); mixture of sparse coding (Yang et al., ECCV 10); deep coding network (Lin et al., NIPS 10); pooling methods. On pooling: max-pooling works well in practice but appears to be ad hoc. An interesting analysis of max-pooling is given by Boureau et al., ICML 2010. We are working on a linear pooling method that has a similar effect to max-pooling; some preliminary results are already in the super-vector coding paper (Zhou et al., ECCV 2010).
Fast approximation of sparse coding via neural networks (Gregor & LeCun, ICML-10). The method aims to speed up sparse coding at coding time, not at training time, which could make sparse coding practical for video. Idea: given a trained sparse coding model, use its inputs and outputs as training data to train a feed-forward model. They showed a 20x speedup, but the method was not evaluated on real video data.
Group sparse coding (Bengio et al., NIPS 09). Sparse coding operates on patches, so the image-level representation is unlikely to be sparse. Idea: enforce joint sparsity via an L1/L2 norm on the sparse codes of a group of patches. The resulting image representation becomes sparse, which saves memory, but the classification accuracy decreases.
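For concreteness, an L1/L2 (mixed-norm) penalty of the kind described above can be written as follows. This is a sketch under assumed notation: G is a group of patches from one image, a_i are the columns of the code matrix A, and A_{j.} is row j, holding the coefficients of dictionary word j across the group. Bengio et al. may include further constraints or weights not shown here.

```latex
\min_{A} \;\; \sum_{i \in G} \tfrac{1}{2}\, \lVert x_i - B a_i \rVert_2^2
\;+\; \lambda \sum_{j=1}^{k} \lVert A_{j\cdot} \rVert_2,
\qquad A = [\,a_i\,]_{i \in G} \in \mathbb{R}^{k \times |G|}
```

The L2 norm within each row followed by the sum over rows drives entire rows to zero, so a dictionary word is either used by the group or not at all, which is what makes the pooled image-level representation sparse.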
Learning a hierarchical dictionary (Jenatton, Mairal, Obozinski, and Bach, 2010). A node can be active only if its ancestors are active.
References
Image Classification using Super-Vector Coding of Local Image Descriptors, Xi Zhou, Kai Yu, Tong Zhang, and Thomas Huang. In ECCV 2010.
Efficient Highly Over-Complete Sparse Coding using a Mixture Model, Jianchao Yang, Kai Yu, and Thomas Huang. In ECCV 2010.
Learning Fast Approximations of Sparse Coding, Karol Gregor and Yann LeCun. In ICML 2010.
Improved Local Coordinate Coding using Local Tangents, Kai Yu and Tong Zhang. In ICML 2010.
Sparse Coding and Dictionary Learning for Image Analysis, Francis Bach, Julien Mairal, Jean Ponce, and Guillermo Sapiro. CVPR 2010 Tutorial.
Supervised translation-invariant sparse coding, Jianchao Yang, Kai Yu, and Thomas Huang. In CVPR 2010.
Learning locality-constrained linear coding for image classification, Jinjun Wang, Jianchao Yang, Kai Yu, Fengjun Lv, Thomas Huang, and Yihong Gong. In CVPR 2010.
Group Sparse Coding, Samy Bengio, Fernando Pereira, Yoram Singer, and Dennis Strelow. In NIPS 2009.
Nonlinear learning using local coordinate coding, Kai Yu, Tong Zhang, and Yihong Gong. In NIPS 2009.
Linear spatial pyramid matching using sparse coding for image classification, Jianchao Yang, Kai Yu, Yihong Gong, and Thomas Huang. In CVPR 2009.
Efficient sparse coding algorithms, Honglak Lee, Alexis Battle, Rajat Raina, and Andrew Y. Ng. In NIPS 2007.

ECCV2010: feature learning for image classification, part 3


Editor's Notes

  • #14: Let’s further check what’s happening when the best classification performance is achieved.