Image Analysis & Retrieval
CS/EE 5590 Special Topics (Class Ids: 44873, 44874)
Fall 2016, M/W 4-5:15pm@Bloch 0012
Lec 07
Feature Aggregation and Image Retrieval System
Zhu Li
Dept of CSEE, UMKC
Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346.
http://l.web.umkc.edu/lizhu
Image Analysis & Retrieval, 2016
Outline
• Recap of Lecture 06
  • SIFT
  • Box Filter
• Image Retrieval System
• Why Aggregation?
• Aggregation Schemes
• Summary
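Scale Space Theory - Lindeberg
• Scale-space response via the Laplacian of Gaussian (LoG):
  $\nabla^2 g = \frac{\partial^2 g}{\partial x^2} + \frac{\partial^2 g}{\partial y^2}, \qquad g = e^{-\frac{x^2 + y^2}{2\sigma^2}}$
• The scale is controlled by $\sigma$.
• Characteristic scale: the $\sigma$ at which the LoG response to an image structure of size r peaks.
[Figure: LoG responses to an image structure of radius r at $\sigma = 0.8r$, $\sigma = 1.2r$, $\sigma = 2r$, …, with the characteristic scale marked.]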
SIFT
• Use the Difference of Gaussians (DoG) to approximate the LoG.
• The Gaussian filter is separable.
• Take the difference of Gaussian-filtered images instead of filtering with a difference of Gaussian kernels.
[Figure: scale-space construction by Gaussian filtering and image differencing, compared with the LoG.]
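A minimal sketch of the DoG construction on a synthetic image. The test image, the base scale sigma and the scale ratio k are assumptions chosen for illustration, not values from the lecture. Only base Matlab functions are used.

```matlab
% Separable Gaussian blur and Difference of Gaussians (DoG) by image difference.
im = zeros(64, 64);
im(28:36, 28:36) = 1;                 % synthetic bright square (assumed test image)

sigma = 2.0;                          % base scale (assumed)
k = sqrt(2);                          % ratio between adjacent scales (assumed)

% 1-D Gaussian kernels, truncated at 3 standard deviations and normalized
r1 = ceil(3 * sigma);      x1 = -r1:r1;
r2 = ceil(3 * k * sigma);  x2 = -r2:r2;
g1 = exp(-x1.^2 / (2 * sigma^2));        g1 = g1 / sum(g1);
g2 = exp(-x2.^2 / (2 * (k * sigma)^2));  g2 = g2 / sum(g2);

% separable filtering: conv2(u, v, A) filters the columns with u, then the rows with v
L1 = conv2(g1, g1, im, 'same');
L2 = conv2(g2, g2, im, 'same');

% DoG response: difference of the two blurred images, not of the kernels
dog = L2 - L1;
```

Two 1-D passes cost on the order of 2w multiplications per pixel for a kernel of width w, against w^2 for the equivalent 2-D kernel, which is why separability matters when building a full scale space.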
Peak Strength & Edge Removal
• Peak strength:
  • Interpolate the true DoG response and the sub-pixel location by Taylor expansion.
• Edge removal:
  • Re-run a Harris-type detection on the much reduced pixel set to remove edge responses.
Rotation Invariance through Dominant Orientation Coding
• Vote for the dominant gradient orientation.
  • Votes are weighted by a Gaussian window to give more emphasis to the gradients closer to the center.
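SIFT Matching and Repeatability Prediction
• SIFT distance: accept the match between feature $s_1^1$ in image 1 and its nearest neighbor $s_{k^*}^2$ in image 2 only if the distance ratio to the other candidates $s_k^2$ (presumably the second-nearest one, as in Lowe's ratio test) is small:
  $\frac{d(s_1^1, s_{k^*}^2)}{d(s_1^1, s_k^2)} \le \theta$
• Not all SIFT features are created equal…
  • Peak strength (DoG response at the interpolated position)
[Figure: combined scale/peak strength pmf.]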
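A minimal sketch of the distance-ratio test for one query descriptor, assuming the denominator is the distance to the second-nearest candidate. The descriptors are synthetic 4-D vectors and theta = 0.8 is an assumed threshold.

```matlab
% Ratio test for one query descriptor against the descriptors of a second image.
q  = [1 0 0 0];                       % query descriptor s_1^1 (1 x d), synthetic
S2 = [0.9 0.1 0 0;                    % candidate descriptors of image 2 (n x d)
      0   1   0 0;
      0   0   1 0];
theta = 0.8;                          % ratio threshold (assumed)

% Euclidean distance from q to every row of S2
d = sqrt(sum((S2 - repmat(q, size(S2, 1), 1)).^2, 2));

% nearest and second-nearest candidates
[ds, order] = sort(d, 'ascend');
best = order(1);                      % index k* of the nearest neighbor
ratio = ds(1) / ds(2);
is_match = ratio <= theta;            % keep only distinctive matches
```

A small ratio means the best candidate is much closer than the runner-up, so the match is unlikely to be an accident of a crowded descriptor neighborhood.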
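Box Filter: CABOX Work
• Basic idea:
  • Approximate the DoG kernel $\mathbf{g}$ with a linear combination $B \mathbf{h}$ of box filters:
    $\min_{\mathbf{h}} \; \|\mathbf{g} - B \mathbf{h}\|_{L_2}^{2} + \|\mathbf{h}\|_{L_1}$
  • Solved by LASSO.
[Figure: DoG kernel = h1 * (box filter 1) + h2 * (box filter 2) + …]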
Image Matching/Retrieval System
SIFT is a sub-image-level feature; what we actually care about is how SIFT matching translates into image-level matching/retrieval accuracy.
Say we can compute a single distance from a collection of features:
$d(I_1, I_2) = \sum_k \alpha_k \, d(F_k^1, F_k^2)$
• Then for a database of n images, we can compute an n x n distance matrix:
  $D_{j,k} = d(I_j, I_k)$
• This gives us full information on the performance of this feature/distance system.
• How do we characterize the performance of such an image matching and retrieval system?
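A minimal sketch of the n x n distance matrix. It assumes that each image is already summarized by one feature vector, that the per-component distance is an absolute difference, and that the weights alpha_k are uniform. All three are illustrative choices, not the lecture's definitions.

```matlab
% n x n image distance matrix from per-image feature vectors (synthetic data).
F = [0 0; 3 4; 6 8];                  % n = 3 images, 2 feature components each
alpha = [1 1];                        % weights alpha_k (assumed uniform)

n = size(F, 1);
D = zeros(n, n);
for j = 1:n
    for k = 1:n
        % weighted sum of per-component distances between image j and image k
        D(j, k) = sum(alpha .* abs(F(j, :) - F(k, :)));
    end
end
```

The matrix is symmetric with a zero diagonal, so only the n(n-1)/2 pairs above the diagonal carry information for the evaluation.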
Thresholding for Matching
• Basically, for any pair of images (documents, in IR jargon), we declare:
  $I_j, I_k$ are a match if $d(I_j, I_k) < t$; otherwise they are not a match.
• Then for each possible image pair, or each pair we care about, a given threshold t leads to one of 4 possible outcomes:
  • TP pair: {Ij, Ik} declared a matching pair, d(Ij, Ik) < t, and truly matching;
  • FP pair: {Ij, Ik} declared a matching pair, d(Ij, Ik) < t, but truly non-matching;
  • TN pair: {Ij, Ik} declared a non-matching pair, d(Ij, Ik) >= t, and truly non-matching;
  • FN pair: {Ij, Ik} declared a non-matching pair, d(Ij, Ik) >= t, but truly matching.
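A minimal sketch of the four outcome counts at a single threshold. The distances, ground-truth labels and threshold are synthetic values chosen for illustration.

```matlab
% Confusion counts for image pairs at one threshold t.
d        = [0.2 0.9 0.4 0.7 0.1 0.6];   % pair distances d(Ij, Ik), synthetic
is_match = logical([1 0 1 1 0 0]);       % ground truth: 1 = truly matching pair
t = 0.5;                                 % decision threshold (assumed)

declared = d < t;                        % pairs declared as matches

tp = sum( declared &  is_match);         % declared match, truly matching
fp = sum( declared & ~is_match);         % declared match, truly non-matching
tn = sum(~declared & ~is_match);         % declared non-match, truly non-matching
fn = sum(~declared &  is_match);         % declared non-match, truly matching
```

The declaration depends only on the distance and t, while the true/false label comes from the ground truth; the four counts always add up to the number of pairs.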
Matching System Performance
• True Positive Rate (TPR, the same quantity as recall):
  • Out of all true matching pairs, how many are retrieved, i.e. have distance < t:
    $TPR = \frac{tp}{tp + fn}$
• False Positive Rate (FPR):
  • Out of all true non-matching pairs, how many are wrongly declared matches:
    $FPR = \frac{fp}{fp + tn}$
TPR-FPR
Definition, from the actual-value point of view (each rate is normalized by the number of actual positives or actual negatives):
TP rate = TP/(TP+FN)
FP rate = FP/(FP+TN)
ROC curve (1)
ROC = receiver operating characteristic
• Y axis: TP rate
• X axis: FP rate
ROC curve (2)
Which method (A or B) is better?
• Compute the ROC area, i.e. the area under the ROC curve: the larger the area, the better the method.
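A minimal sketch of the ROC area computed with the trapezoidal rule. The operating points are synthetic; in practice they would come from sweeping the threshold t.

```matlab
% Area under an ROC curve by the trapezoidal rule (synthetic operating points).
fpr = [0 0.1 0.3 0.6 1.0];            % x axis, sorted in ascending order
tpr = [0 0.5 0.8 0.9 1.0];            % y axis at the same thresholds

auc = trapz(fpr, tpr);                % 0.5 is chance level, 1.0 is perfect separation
```

To compare methods A and B, compute the area for each curve from the same set of image pairs and prefer the larger value.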
Precision, Recall, F-measure
• Precision = TP/(TP + FP)
• Recall = TP/(TP + FN)
• F-measure = 2*(precision*recall)/(precision + recall)
• Precision is the probability that a retrieved document is relevant.
• Recall is the probability that a relevant document is retrieved in a search.
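A minimal sketch that evaluates these definitions from the four outcome counts. The counts are made-up numbers for illustration.

```matlab
% Precision, recall, F-measure and the ROC rates from the confusion counts.
tp = 2; fp = 1; tn = 2; fn = 1;       % example counts (assumed)

precision = tp / (tp + fp);           % share of retrieved pairs that are relevant
recall    = tp / (tp + fn);           % share of relevant pairs that are retrieved
f_measure = 2 * (precision * recall) / (precision + recall);

tpr = recall;                         % TP rate is the same quantity as recall
fpr = fp / (fp + tn);                 % FP rate is normalized by the actual negatives
```

Precision is normalized by what was retrieved, whereas the TP and FP rates are normalized by the ground truth, so precision changes with the ratio of matching to non-matching pairs in the test set while the ROC rates do not.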
Matlab Implementation
• We compute all image-pair distances D(j,k).
• How do we compute the TPR-FPR plot?
  • TPR and FPR are actually functions of the threshold t.
  • We just need to parameterize TPR(t) and FPR(t) and evaluate them at a set of meaningful thresholds to generate the plot.
• Matlab implementation:
  • [tp, fp, tn, fn] = getPrecisionRecall()
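% d0: distances of true matching pairs, d1: distances of true non-matching pairs
% (as implied by the tp/fp counts below); npt: number of thresholds; dbg: plot flag
d_min = min(min(d0), min(d1));
d_max = max(max(d0), max(d1));
% sweep the threshold over the full distance range, end points included,
% so that the curve reaches the operating point TPR = FPR = 1
thres = linspace(d_min, d_max, npt);
tp = zeros(1, npt); fp = zeros(1, npt);
tn = zeros(1, npt); fn = zeros(1, npt);
for k = 1:npt
    tp(k) = sum(d0 <= thres(k));   % matching pairs declared matching
    fp(k) = sum(d1 <= thres(k));   % non-matching pairs declared matching
    tn(k) = sum(d1 >  thres(k));   % non-matching pairs declared non-matching
    fn(k) = sum(d0 >  thres(k));   % matching pairs declared non-matching
end
if dbg
    figure(22); grid on; hold on;
    plot(fp./(tn+fp), tp./(tp+fn), '.-r', 'DisplayName', 'tpr-fpr');
    legend();
end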
TPR-FPR
• Image matching performance is characterized by the function
  • TPR(FPR)
• Retrieval set: we want high precision. Short list: we want high recall.
Why Aggregation?
• What do (local) interest point features bring us?
  • Scale and rotation invariance, in the form of an n_k x d set of features per image:
    $S_k \,|\, [x_k, y_k, \theta_k, \sigma_k, h_1, h_2, \ldots, h_{128}], \quad k = 1..n$
  • Uncertainty in the number of detected features n_k at query time.
  • Any permutation of the feature rows is the same representation.
• Problems:
  • The feature has state, so we are not able to draw decision boundaries.
  • Not directly indexable/hashable.
  • Typically very high dimensionality.
Decision Boundary in Matching
• Can we have a decision boundary function for an interest-point-based representation?
Curse of Dimensionality in Retrieval
• What feature dimensionality does to retrieval efficiency…
  • Look at a retrieval that covers 99% of the range (locality) along each dimension, and plot the total volume covered as the dimension grows.
  • Matlab: showDimensionCurse.m
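A minimal sketch of the effect, under one possible reading of the slide: if a retrieval covers 99% of the range along every dimension of a unit hypercube, the covered volume is 0.99^d. This is an illustration only, not the course's showDimensionCurse.m, whose content is not shown here.

```matlab
% Volume of a unit hypercube covered when 99% of each dimension is covered.
p = 0.99;                             % per-dimension coverage
dims = [1 2 8 32 128 512];            % feature dimensions to evaluate (assumed)
vol = p .^ dims;                      % total covered volume fraction

for i = 1:numel(dims)
    fprintf('d = %4d: covered volume fraction = %.4f\n', dims(i), vol(i));
end
```

Even with almost full coverage per dimension, the covered fraction of the space collapses as d grows, which is one way to see why locality-based indexing degrades for high-dimensional features.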
Aggregation: 30,000 ft view
• Bag of Words
  • Compute k centroids in feature space, called visual words
  • Compute the histogram of word assignments
  • k x 1 feature, hard assignment
• VLAD
  • Compute centroids in feature space
  • Compute the aggregated difference w.r.t. the centroids
  • k x d feature, soft assignment
• Fisher Vector
  • Compute a Gaussian Mixture Model (GMM) with 2nd-order info
  • Compute the aggregated feature w.r.t. the mean and covariance of the GMM
  • 2 x k x d feature
• AKULA
  • Adaptive centroids and feature count
  • Improved with covariance?
[Figure: example assignment weights 0.5, 0.4, 0.05, 0.05.]
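A minimal sketch of the Bag of Words step with hard assignment, on synthetic 2-D descriptors. The codebook is assumed to be given already (for example from k-means); real SIFT descriptors would be 128-dimensional.

```matlab
% Bag-of-Words histogram by hard assignment to the nearest visual word.
codebook = [0 0; 10 0; 0 10];              % k = 3 visual words (k x d), assumed given
feats = [1 1; 9 1; 0 9; 1 0; 11 -1];       % n = 5 local descriptors (n x d), synthetic

k = size(codebook, 1);
n = size(feats, 1);

% squared Euclidean distance from every descriptor to every visual word (n x k)
d2 = zeros(n, k);
for j = 1:k
    d2(:, j) = sum((feats - repmat(codebook(j, :), n, 1)).^2, 2);
end

% hard assignment: each descriptor votes for exactly one word
[~, word] = min(d2, [], 2);

% k x 1 histogram of word counts, normalized to sum to one
bow = accumarray(word, 1, [k 1]);
bow = bow / sum(bow);
```

Whatever the number of descriptors n, the result is a fixed-length k x 1 vector, which is what makes the representation indexable.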
Visual Key Words: main idea
• Extract some local features from a number of images…
• E.g., in SIFT descriptor space each point is a 128-dimensional local descriptor (a SIFT vector).
[Figures: local descriptors from several images shown as points in descriptor space. Slide credit: D. Nister]
Visual words
• Example: each group of patches belongs to the same visual word.
[Figure from Sivic & Zisserman, ICCV 2003]
Visual words
• More recently used for describing scenes and objects for the sake of indexing or classification: Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others.
Source credit: K. Grauman, B. Leibe
Bag of Words
[Figure: an object and its bag of ‘words’. ICCV 2005 short course, L. Fei-Fei]
BoW Examples
• Illustration
Bags of visual words
• Summarize the entire image by its distribution (histogram) of word occurrences.
• Analogous to the bag-of-words representation commonly used for documents.
Image credit: Fei-Fei Li
Texture Retrieval
• Textons…
[Figure: universal texton dictionary and texton histograms. Source: Lana Lazebnik]
BoW Distance Metrics
• Rank images by the normalized scalar product between their (possibly weighted) occurrence counts, i.e. a nearest-neighbor search for similar images:
  $sim(d_j, q) = \frac{\langle d_j, q \rangle}{\|d_j\| \, \|q\|}$
[Figure: example occurrence-count vectors [5 1 1 0] and [1 8 1 4] for $d_j$ and $q$.]
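A minimal sketch of ranking by normalized scalar product. It reuses the two count vectors from the slide, takes [1 8 1 4] as the query, and adds one made-up database row; which slide vector is the query is an assumption.

```matlab
% Rank database images by normalized scalar product with the query histogram.
q  = [1 8 1 4];                       % query word counts (assumed to be the query)
Db = [5 1 1 0;                        % database word counts, one row per image
      1 8 1 4;
      0 4 0 2];                       % third row is made up for illustration

% cosine similarity between every database row and the query
sims = (Db * q') ./ (sqrt(sum(Db.^2, 2)) * norm(q));

% best match first
[~, ranking] = sort(sims, 'descend');
```

The normalization removes the effect of how many features each image produced, so an image with twice as many detections of the same words gets the same score.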
Inverted List
• Image retrieval via an inverted list
[Figure: inverted list mapping each visual word number to a list of image numbers. Image credit: A. Zisserman]
When will this give us a significant gain in efficiency?
Indexing local features: inverted file index
• For text documents, an efficient way to find all pages on which a word occurs is to use an index…
• We want to find all images in which a feature occurs.
• We need to index each feature by the images it appears in, and we also keep the number of occurrences.
Source credit: K. Grauman, B. Leibe
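A minimal sketch of an inverted file index over visual words, stored as a cell array with one entry per word. The word lists, the vocabulary size and the layout of each entry are assumptions made for illustration.

```matlab
% Inverted file index: for every visual word, the images that contain it.
words = { [1 3 3], [2 3], [1 1 4] };  % visual word ids found in images 1..3 (synthetic)
k = 4;                                % vocabulary size (assumed)
n_img = numel(words);

% inv_list{w} is a 2-column table: [image id, occurrence count of word w]
inv_list = cell(k, 1);
for i = 1:n_img
    counts = accumarray(words{i}(:), 1, [k 1]);   % word counts in image i
    for w = find(counts)'
        inv_list{w} = [inv_list{w}; i, counts(w)];
    end
end

% query: only images that share at least one word with the query are touched
query_words = [2 4];
hit = false(n_img, 1);
for w = query_words
    if ~isempty(inv_list{w})
        hit(inv_list{w}(:, 1)) = true;
    end
end
candidates = find(hit)';
```

The gain is significant when the histograms are sparse, i.e. the vocabulary is large and each image contains only a small fraction of the words, so most of the database is never visited.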
TF-IDF Weighting
Term Frequency - Inverse Document Frequency
• Describe the image by the frequency of each visual word within it, and down-weight words that appear often in the database (the standard weighting for text retrieval):
  $t_i = \frac{n_{id}}{n_d} \log \frac{N}{n_i}$
  • $n_{id}$: number of occurrences of word i in document d
  • $n_d$: number of words in document d
  • $n_i$: number of occurrences of word i in the whole database
  • $N$: total number of words in the database
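A minimal sketch of the weighting on a tiny count matrix, using the four quantities as labeled on the slide (word occurrences in the document, words in the document, word occurrences in the database, total words in the database). The counts are synthetic.

```matlab
% TF-IDF weights from a document-by-word count matrix.
counts = [2 0 1;                      % rows: documents (images), columns: visual words
          0 3 1;
          1 1 2];
[n_doc, k] = size(counts);

n_d = sum(counts, 2);                 % number of words in each document (n_doc x 1)
n_i = sum(counts, 1);                 % occurrences of each word in the database (1 x k)
N   = sum(counts(:));                 % total number of words in the database

tf  = counts ./ repmat(n_d, 1, k);    % term frequency n_id / n_d
idf = log(N ./ n_i);                  % down-weights words that are frequent overall
tfidf = tf .* repmat(idf, n_doc, 1);  % one weighted histogram per row
```

Each row of tfidf can replace the raw histogram in the normalized scalar product used for ranking.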
BoW Use Case with Spatial Localization
• Collect the words within a query region.
• Query region: pull out only the SIFT descriptors whose positions are within the polygon.
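A minimal sketch of pulling out the descriptors inside a query polygon with the base function inpolygon. The keypoint positions, the polygon and the random descriptors are synthetic placeholders.

```matlab
% Select the descriptors whose keypoint positions fall inside a query polygon.
xy = [1 1; 5 5; 2 3; 9 1];            % keypoint positions (x, y), synthetic
desc = rand(4, 128);                  % one 128-d descriptor per keypoint (placeholder)

poly_x = [0 4 4 0];                   % query region vertices (assumed)
poly_y = [0 0 4 4];

in = inpolygon(xy(:, 1), xy(:, 2), poly_x, poly_y);
idx = find(in);                       % keypoints inside the region
region_desc = desc(in, :);            % descriptors used to build the query BoW
```

The BoW histogram of the query is then computed from region_desc only, so background words outside the region do not contribute.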
BoW Patch Search
Localizing the BoW representation
Localization with BoW
Hierarchical Assignment of Histogram
• Tree construction [Nister & Stewenius, CVPR’06]
Vocabulary Tree
• Training: filling the tree [Nister & Stewenius, CVPR’06]
[Figures: the vocabulary tree being filled step by step. Slide credit: David Nister]
Vocabulary Tree
• Recognition, followed by RANSAC verification [Nister & Stewenius, CVPR’06]
[Figure: recognition with the vocabulary tree. Slide credit: David Nister]
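A minimal sketch of quantizing one descriptor by descending a vocabulary tree. It assumes a hand-built two-level tree with branch factor 2 on 2-D data; a real tree would be trained by hierarchical k-means on SIFT descriptors.

```matlab
% Descend a 2-level vocabulary tree with branch factor 2 (hand-built, assumed).
root = [0 0; 10 10];                          % level-1 centroids (2 x d)
children = { [-1 -1; 1 1], [9 9; 11 11] };    % level-2 centroids under each node
x = [10.6 10.7];                              % descriptor to quantize, synthetic

% level 1: pick the nearest of the 2 root centroids
[~, c1] = min(sum((root - repmat(x, 2, 1)).^2, 2));

% level 2: pick the nearest child of the chosen node only
[~, c2] = min(sum((children{c1} - repmat(x, 2, 1)).^2, 2));

% leaf index in 1..4 acts as the visual word id
leaf = (c1 - 1) * 2 + c2;
```

With branch factor b and depth L the descent costs b*L distance computations instead of b^L for a flat codebook of the same size, which is what makes very large vocabularies practical.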
Vocabulary Tree: Performance
• Evaluated on large databases
  • Indexing with up to 1M images
• Online recognition for a database of 50,000 CD covers
  • Retrieval in ~1s
• Experiments show that large vocabularies can be beneficial for recognition [Nister & Stewenius, CVPR’06]
Larger vocabularies can be advantageous… But what happens if the vocabulary is too large?
Visual Word Vocabulary Size
• Performance w.r.t. vocabulary size
Bags of words: pros and cons
Good:
+ flexible to geometry / deformations / viewpoint
+ compact summary of image content
+ provides vector representation for sets
+ inverted list implementation offers a practical solution for large repositories
Bad:
- loss of information at quantization and histogram generation
- basic model ignores geometry: must verify afterwards, or encode via features
- background and foreground are mixed when the bag covers the whole image
- interest points or sampling: no guarantee to capture object-level parts
Source credit: K. Grauman, B. Leibe
Can we improve BoW ?
• E.g. Why isn’t our Bag of Words classifier at 90%
instead of 70%?
• Training Data
– Huge issue, but not necessarily a variable you can manipulate.
• Learning method
– BoW is on top of any feature scheme
• Representation
– Are we losing too much info in the process ?
Standard K-means Bag of Words
• BoW revisited
http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
Motivation
Bag of Visual Words is only about counting the number of local descriptors assigned to each Voronoi region.
Why not include other statistics/information?
Spatial Pooling
We already looked at the spatial pyramid/pooling.
[Figure: spatial pyramid grids at level 0: 1x1, level 1: 2x2, level 2: 4x4.]
Key takeaway: multiple assignment? Soft assignment?
Motivation
Bag of Visual Words is only about counting the number of local descriptors assigned to each Voronoi region.
Why not include other statistics? For instance:
• mean of local descriptors
• (co)variance of local descriptors
Simple case: Soft Assignment
Called “Kernel codebook encoding” by Chatfield et al. 2011. Cast a weighted vote into the most similar clusters.
This is fast and easy to implement (try it for Project 3!) but it does have some downsides for image retrieval: the inverted file index becomes less sparse.
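A minimal sketch of a soft-assignment histogram. It assumes a Gaussian kernel on the Euclidean distance with bandwidth sigma = 2 and, for brevity, votes into all clusters rather than only the few most similar ones. The data are synthetic 2-D points.

```matlab
% Soft-assignment (kernel codebook) histogram with a Gaussian kernel.
codebook = [0 0; 4 0; 0 4];           % k = 3 visual words, assumed given
feats = [1 0; 2 0; 0 3];              % n = 3 local descriptors, synthetic
sigma = 2;                            % kernel bandwidth (assumed)

n = size(feats, 1);
k = size(codebook, 1);

% squared distances from every descriptor to every word (n x k)
d2 = zeros(n, k);
for j = 1:k
    d2(:, j) = sum((feats - repmat(codebook(j, :), n, 1)).^2, 2);
end

% kernel weights, normalized so that each descriptor casts one vote in total
w = exp(-d2 / (2 * sigma^2));
w = w ./ repmat(sum(w, 2), 1, k);

% k x 1 histogram: accumulated weighted votes, normalized to sum to one
hist_soft = sum(w, 1)';
hist_soft = hist_soft / sum(hist_soft);
```

Every bin receives some weight here, which illustrates the downside for retrieval: the histogram is dense, so an inverted file over it loses its sparsity unless small votes are truncated.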
A first example: the VLAD
Given a codebook $\{\mu_i, \; i = 1 \ldots N\}$, e.g. learned with K-means, and a set of local descriptors $X = \{x_t, \; t = 1 \ldots T\}$:
• ① assign each descriptor to its nearest centroid: $NN(x_t) = \arg\min_{\mu_i} \|x_t - \mu_i\|$
• ② compute the residual $x - \mu_i$ and ③ sum the residuals for each cell i: $v_i = \sum_{x_t : NN(x_t) = \mu_i} (x_t - \mu_i)$
• concatenate the $v_i$'s and normalize
[Figure: a graphical representation of the VLAD, with descriptors x assigned to centroids $\mu_1 \ldots \mu_5$ and the residual sums $v_1 \ldots v_5$ of the cells.]
Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
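A minimal sketch of the three VLAD steps with hard assignment, written with base Matlab only so that it runs without VL_FEAT. The 2-D codebook and descriptors are synthetic, and L2 normalization is assumed for the final step.

```matlab
% VLAD with hard assignment: assign, sum residuals per cell, concatenate, normalize.
codebook = [0 0; 10 10];              % k = 2 centroids mu_i (k x d), assumed given
feats = [1 0; 0 2; 9 10; 12 11];      % n = 4 local descriptors (n x d), synthetic

k = size(codebook, 1);
[n, d] = size(feats);

% step 1: assign every descriptor to its nearest centroid
d2 = zeros(n, k);
for j = 1:k
    d2(:, j) = sum((feats - repmat(codebook(j, :), n, 1)).^2, 2);
end
[~, nn] = min(d2, [], 2);

% steps 2 and 3: residuals x - mu_i, summed over the descriptors of each cell
V = zeros(k, d);
for i = 1:k
    members = feats(nn == i, :);
    V(i, :) = sum(members - repmat(codebook(i, :), size(members, 1), 1), 1);
end

% concatenate v_1 ... v_k into one k*d vector and L2-normalize it
vlad = reshape(V', [], 1);
vlad = vlad / norm(vlad);
```

Unlike the BoW count, each cell contributes a d-dimensional residual sum, so the code has k*d entries and keeps first-order information about where the descriptors fall inside each Voronoi cell.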
VL_FEAT Implementation
• Matlab:
function [vc] = vladSiftEncoding(sift, codebook)
% VLAD encoding of SIFT descriptors with soft assignment, using VL_FEAT.
%   sift:     n x kd SIFT descriptors, one per row
%   codebook: m x kd centroids, one per row
dbg = 1;
if dbg
    if (0) % init VL_FEAT, only need to do once
        run('../../tools/vlfeat-0.9.20/toolbox/vl_setup.m');
    end
    % debug mode: ignore the inputs and build a test case from an image
    im = imread('../pics/flarsheim-2.jpg');
    [f, sift] = vl_sift(single(rgb2gray(im)));
    sift = single(sift');
    [indx, codebook] = kmeans(sift, 16);
    % make sift # smaller (guard against images with fewer than 800 features)
    sift = sift(1:min(800, size(sift, 1)), :);
end
[n, kd] = size(sift);
[m, kd] = size(codebook);
% compute assignment from the m x n centroid-to-descriptor distances
dist = pdist2(codebook, sift);
mdist = mean(mean(dist));
% normalize the heat kernel s.t. mean dist is mapped to 0.5
a = -log(0.5) / mdist;
indx = exp(-a * dist);
vc = vl_vlad(sift', codebook', indx);
if dbg
    figure(41); colormap(gray);
    subplot(2,2,1); imshow(im); title('image');
    subplot(2,2,2); imagesc(dist); title('m x n distance');
    subplot(2,2,3); imagesc(indx); title('m x n assignment');
    subplot(2,2,4); imagesc(reshape(vc, [m, kd])); title('vlad code');
end
VLAD Code
• What are the tweaks?
  • Codebook design
  • Soft assignment options
References
• Vocabulary Tree:
  • David Nistér, Henrik Stewénius: Scalable Recognition with a Vocabulary Tree. CVPR (2) 2006: 2161-2168
• VLAD:
  • Herve Jegou, Matthijs Douze, Cordelia Schmid: Improving Bag-of-Features for Large Scale Image Search. International Journal of Computer Vision 87(3): 316-336 (2010)
• Fisher Vector:
  • Florent Perronnin, Jorge Sánchez, Thomas Mensink: Improving the Fisher Kernel for Large-Scale Image Classification. ECCV (4) 2010: 143-156
• AKULA:
  • Abhishek Nagar, Zhu Li, Gaurav Srivastava, Kyungmo Park: AKULA - Adaptive Cluster Aggregation for Visual Search. DCC 2014: 13-22
Lec 07 Summary
• Image retrieval system metrics
  • What are true positives, false positives, true negatives and false negatives?
  • What are precision, recall and F-score?
• Why aggregation?
  • Decision boundary
  • Indexing/hashing
• Bag of Words
  • A histogram whose bins are visual words
  • Variations: hierarchical assignment with a vocabulary tree
  • Implementation: inverted list
• VLAD
  • Richer encoding of aggregated info
  • Soft assignment of features to codebook bins
  • Vectorized representation: no need for an inverted list
More Related Content

PDF
PDF
Lec12 review-part-i
PDF
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
PDF
Lec17 sparse signal processing & applications
PDF
Lec15 graph laplacian embedding
PDF
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
PDF
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
PDF
Class Weighted Convolutional Features for Image Retrieval
Lec12 review-part-i
Lec-08 Feature Aggregation II: Fisher Vector, AKULA and Super Vector
Lec17 sparse signal processing & applications
Lec15 graph laplacian embedding
Subspace Indexing on Grassmannian Manifold for Large Scale Visual Identification
Image Retrieval with Fisher Vectors of Binary Features (MIRU'14)
Class Weighted Convolutional Features for Image Retrieval

What's hot (20)

PDF
Lec16 subspace optimization
PDF
Object Detection Beyond Mask R-CNN and RetinaNet III
PDF
Deep image retrieval learning global representations for image search
PDF
Convolutional Features for Instance Search
PDF
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
PDF
Mask R-CNN
PPTX
Histogram based Enhancement
PPTX
Deep image retrieval - learning global representations for image search - ub ...
PDF
改进的固定点图像复原算法_英文_阎雪飞
PPTX
Histogram based enhancement
PDF
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
PDF
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
PDF
Optimized linear spatial filters implemented in FPGA
PPTX
Graph R-CNN for Scene Graph Generation
PDF
Digital Image Processing: Image Enhancement in the Spatial Domain
PDF
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
PDF
Digital Image Processing: Digital Image Fundamentals
PDF
A pixel to-pixel segmentation method of DILD without masks using CNN and perl...
PDF
Webinar on Graph Neural Networks
PDF
RP BASED OPTIMIZED IMAGE COMPRESSING TECHNIQUE
Lec16 subspace optimization
Object Detection Beyond Mask R-CNN and RetinaNet III
Deep image retrieval learning global representations for image search
Convolutional Features for Instance Search
Deep Learning for Computer Vision: Image Retrieval (UPC 2016)
Mask R-CNN
Histogram based Enhancement
Deep image retrieval - learning global representations for image search - ub ...
改进的固定点图像复原算法_英文_阎雪飞
Histogram based enhancement
CLIM Program: Remote Sensing Workshop, Statistical Emulation with Dimension R...
Object Segmentation (D2L7 Insight@DCU Machine Learning Workshop 2017)
Optimized linear spatial filters implemented in FPGA
Graph R-CNN for Scene Graph Generation
Digital Image Processing: Image Enhancement in the Spatial Domain
Deep 3D Analysis - Javier Ruiz-Hidalgo - UPC Barcelona 2018
Digital Image Processing: Digital Image Fundamentals
A pixel to-pixel segmentation method of DILD without masks using CNN and perl...
Webinar on Graph Neural Networks
RP BASED OPTIMIZED IMAGE COMPRESSING TECHNIQUE
Ad

Viewers also liked (9)

PDF
Mobile Visual Search: Object Re-Identification Against Large Repositories
PDF
Lec11 rate distortion optimization
DOCX
Scale invariant feature transform
PDF
ICME 2013
PPTX
Scale Invariant Feature Transform
PPT
Feature Matching using SIFT algorithm
DOCX
Scale Invariant Feature Tranform
PPT
Michal Erel's SIFT presentation
PPTX
SIFT vs other Feature Descriptor
Mobile Visual Search: Object Re-Identification Against Large Repositories
Lec11 rate distortion optimization
Scale invariant feature transform
ICME 2013
Scale Invariant Feature Transform
Feature Matching using SIFT algorithm
Scale Invariant Feature Tranform
Michal Erel's SIFT presentation
SIFT vs other Feature Descriptor
Ad

Similar to Lec07 aggregation-and-retrieval-system (20)

PDF
4 image enhancement in spatial domain
PPT
PCA-SIFT: A More Distinctive Representation for Local Image Descriptors
PDF
A comparative analysis of retrieval techniques in content based image retrieval
PDF
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
PPT
JPEG XR objective and subjective evaluations
PDF
final_presentation
PDF
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
PDF
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
DOCX
Computer vision,,summer training programme
PDF
Query Image Searching With Integrated Textual and Visual Relevance Feedback f...
PPTX
Orb feature by nitin
PDF
Final Report for project
PDF
Currency recognition on mobile phones
PDF
Modelling User Interaction utilising Information Foraging Theory (and a bit o...
PDF
Kernel Descriptors for Visual Recognition
PDF
Lec14 eigenface and fisherface
PPT
Semantics In Digital Photos A Contenxtual Analysis
PDF
Video Stitching using Improved RANSAC and SIFT
PPTX
search engine for images
4 image enhancement in spatial domain
PCA-SIFT: A More Distinctive Representation for Local Image Descriptors
A comparative analysis of retrieval techniques in content based image retrieval
A COMPARATIVE ANALYSIS OF RETRIEVAL TECHNIQUES IN CONTENT BASED IMAGE RETRIEVAL
JPEG XR objective and subjective evaluations
final_presentation
Distributed Formal Concept Analysis Algorithms Based on an Iterative MapReduc...
3D Reconstruction from Multiple uncalibrated 2D Images of an Object
Computer vision,,summer training programme
Query Image Searching With Integrated Textual and Visual Relevance Feedback f...
Orb feature by nitin
Final Report for project
Currency recognition on mobile phones
Modelling User Interaction utilising Information Foraging Theory (and a bit o...
Kernel Descriptors for Visual Recognition
Lec14 eigenface and fisherface
Semantics In Digital Photos A Contenxtual Analysis
Video Stitching using Improved RANSAC and SIFT
search engine for images

More from United States Air Force Academy (6)

PDF
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
PDF
Multimedia Communication Lec02: Info Theory and Entropy
PDF
ECE 4490 Multimedia Communication Lec01
PDF
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
PDF
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
PDF
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...
Lec-03 Entropy Coding I: Hoffmann & Golomb Codes
Multimedia Communication Lec02: Info Theory and Entropy
ECE 4490 Multimedia Communication Lec01
Tutorial on MPEG CDVS/CDVA Standardization at ICNITS L3 Meeting
Light Weight Fingerprinting for Video Playback Verification in MPEG DASH
Scaled Eigen Appearance and Likelihood Prunning for Large Scale Video Duplica...

Recently uploaded (20)

PDF
O5-L3 Freight Transport Ops (International) V1.pdf
PPTX
Orientation - ARALprogram of Deped to the Parents.pptx
PDF
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
PPTX
Final Presentation General Medicine 03-08-2024.pptx
PPTX
Pharma ospi slides which help in ospi learning
PPTX
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
PDF
Complications of Minimal Access Surgery at WLH
PPTX
Lesson notes of climatology university.
PDF
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
PDF
VCE English Exam - Section C Student Revision Booklet
PDF
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
PDF
Computing-Curriculum for Schools in Ghana
PDF
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
PDF
FourierSeries-QuestionsWithAnswers(Part-A).pdf
PPTX
Microbial diseases, their pathogenesis and prophylaxis
PDF
Anesthesia in Laparoscopic Surgery in India
PDF
A systematic review of self-coping strategies used by university students to ...
DOC
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
PPTX
GDM (1) (1).pptx small presentation for students
PPTX
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...
O5-L3 Freight Transport Ops (International) V1.pdf
Orientation - ARALprogram of Deped to the Parents.pptx
GENETICS IN BIOLOGY IN SECONDARY LEVEL FORM 3
Final Presentation General Medicine 03-08-2024.pptx
Pharma ospi slides which help in ospi learning
PPT- ENG7_QUARTER1_LESSON1_WEEK1. IMAGERY -DESCRIPTIONS pptx.pptx
Complications of Minimal Access Surgery at WLH
Lesson notes of climatology university.
Chapter 2 Heredity, Prenatal Development, and Birth.pdf
VCE English Exam - Section C Student Revision Booklet
The Lost Whites of Pakistan by Jahanzaib Mughal.pdf
Computing-Curriculum for Schools in Ghana
A GUIDE TO GENETICS FOR UNDERGRADUATE MEDICAL STUDENTS
FourierSeries-QuestionsWithAnswers(Part-A).pdf
Microbial diseases, their pathogenesis and prophylaxis
Anesthesia in Laparoscopic Surgery in India
A systematic review of self-coping strategies used by university students to ...
Soft-furnishing-By-Architect-A.F.M.Mohiuddin-Akhand.doc
GDM (1) (1).pptx small presentation for students
1st Inaugural Professorial Lecture held on 19th February 2020 (Governance and...

Lec07 aggregation-and-retrieval-system

  • 1. Image Analysis & Retrieval CS/EE 5590 Special Topics (Class Ids: 44873, 44874) Fall 2016, M/W 4-5:15pm@Bloch 0012 Lec 07 Feature Aggregation and Image Retrieval System Zhu Li Dept of CSEE, UMKC Office: FH560E, Email: lizhu@umkc.edu, Ph: x 2346. http://l.web.umkc.edu/lizhu p.1Image Analysis & Retrieval, 2016
  • 2. Outline  ReCap of Lecture 06  SIFT  Box Filter  Image Retrieval System  Why Aggregation ?  Aggregation Schemes  Summary Image Analysis & Retrieval, 2016 p.2
  • 3. Scale Space Theory - Lindeberg  Scale Space Response via Laplacian of Gaussian  The scale is controlled by 𝜎  Characteristic Scale: Image Analysis & Retrieval, 2016 p.3 2 2 2 2 2 y g x g g       𝑔 = 𝑒 − 𝑥+𝑦 2 2𝜎 r image 𝜎 = 0.8𝑟 𝜎 = 1.2𝑟 𝜎 = 2𝑟 … characteristic scale
  • 4. SIFT  Use DoG to approximate LoG  Separable Gaussian filter  Difference of image instead of difference of Gaussian kernel Image Analysis & Retrieval, 2016 p.4 L o G Scale space construction By Gaussian Filtering, and Image Difference
  • 5. Peak Strength & Edge Removal  Peak Strength:  Interpolate true DoG response and pixel location by Taylor expansion  Edge Removal:  Re-do Harris type detection to remove edge on much reduced pixel set Image Analysis & Retrieval, 2016 p.5
  • 6. Scale Invariance thru Dominant Orientation Coding  Voting for the dominant orientation  Weighted by a Gaussian window to give more emphasis to the gradients closer to the center Image Analysis & Retrieval, 2016 p.6
  • 7. SIFT Matching and Repeatability Prediction  SIFT Distance Not all SIFT are created equal…  Peak strength (DoG response at interpolated position) Image Analysis & Retrieval, 2016 p.7 Combined scale/peak strength pmf 𝑑(𝑠1 1 , 𝑠 𝑘∗ 2 ) 𝑑(𝑠1 1 , 𝑠 𝑘 2 ) ≤ 𝜃
  • 8. Box Fitler – CABOX work  Basic Idea:  Approximate DoG with linear combination of box filters min. 𝒉 𝒈 − 𝐵 ∙ 𝒉 𝐿2 2 + 𝒉 𝐿1  Solution by LASSO Image Analysis & Retrieval, 2016 p.8 = h1* h2*+ + …
  • 9. Outline  ReCap of Lecture 06  SIFT  Box Filter  Image Retrieval System  Why Aggregation ?  Aggregation Schemes  Summary Image Analysis & Retrieval, 2016 p.9
  • 10. Image Matching/Retrieval System SIFT is a sub-image level feature, we actually care more on how SIFT match will translate into image level matching/retrieval accuracy Say if we can compute a single distance from a collection of features:  Then for a data base of n images, we can compute an n x n distance matrix  This gives us full information of the performance of this feature/distance system  How to characterize the performance of such image matching and retrieval system ? Image Analysis & Retrieval, 2016 p.10 𝑑 𝐼1, 𝐼2 = 𝑘 𝛼 𝑘 𝑑(𝐹𝑘 1 , 𝐹𝑘 2 ) 𝐷𝑖, 𝑘 = 𝑑(𝐼𝑗, 𝐼 𝑘)
  • 11. Thresholding for Matching  Basically, for any pair of Images (documents, in IR jargon), we declare  Then for each possible image pair, or pairs we care, for a given threshold t, there will be 4 possible consequences  TP pair: {Ij, Ik} declared matching pairs, d(Ij, Ik) < t;  FP pair: {Ij, Ik} declared matching pairs, d(Ij, Ik) >= t;  TN pair: {Ij, Ik} declared non-matching pairs, d(Ij, Ik) >= t;  FN pair: {Ij, Ik} declared non- matching pairs, d(Ij, Ik) < t; Image Analysis & Retrieval, 2016 p.11 𝐼𝑗, 𝐼 𝑘 𝑎𝑟𝑒 𝑚𝑎𝑡𝑐ℎ, 𝑖𝑓 𝑑 𝐼𝑗, 𝐼 𝑘 < 𝑡 𝐼𝑗, 𝐼 𝑘 𝑎𝑟𝑒𝑛𝑜𝑡 𝑚𝑎𝑡𝑐ℎ, 𝑜𝑡ℎ𝑒𝑟𝑤𝑖𝑠𝑒
  • 12. Matching System Performance  True Positive Rate/Precision:  Out of retrieved matching pairs, how many are true matching pairs  For all matching pairs with distance < t  False Positive Rate:  Out of retrieved matching pairs, how many are actually negative, false matchings Image Analysis & Retrieval, 2016 p.12 𝑇𝑃𝑅 = 𝑡𝑝 𝑡𝑝 + 𝑓𝑛 𝐹𝑃𝑅 = 𝑓𝑝 𝑓𝑝 + 𝑡𝑛
  • 13. TPR-FPR Definition: TP rate = TP/(TP+FN) FP rate = FP/(FP+TN) From the actual value point of view Image Analysis & Retrieval, 2016 p.13
  • 14. ROC curve(1) ROC = receiver operating characteristic Y:TP rate X:FP rate Image Analysis & Retrieval, 2016 p.14
  • 15. ROC curve(2) Which method (A or B) is better? compute ROC area: area under ROC curve Image Analysis & Retrieval, 2016 p.15
  • 16. Precision, Recall, F-measure Precision = TP/(TP + FP), Recall = TP/(TP + FN) F-measure = 2*(precision*recall)/(precision + recall) Precision: is the probability that a retrieved document is relevant. Recall: is the probability that a relevant document is retrieved in a search. Image Analysis & Retrieval, 2016 p.16
  • 17. Matlab Implementation  We will compute all image pair distances D(j,k)  How do we compute the TPR-FPR plot ?  Understand that TPR and FPR are actually function of threshold t,  Just need to parameterize TPR(t) and FPR(t), and obtaining operating points of meaningful thresholds, to generate the plot.  Matlab Implementation:  [tp, fp, tn, fn]=getPrecisionRecall() Image Analysis & Retrieval, 2016 p.17 d_min = min(min(d0), min(d1)); d_max = max(max(d0), max(d1)); delta = (d_max - d_min) / npt; for k=1:npt thres = d_min + (k-1)*delta; tp(k) = length(find(d0<=thres)); fp(k) = length(find(d1<=thres)); tn(k) = length(find(d1>thres)); fn(k) = length(find(d0>thres)); end if dbg figure(22); grid on; hold on; plot(fp./(tn+fp), tp./(tp+fn), '.-r', 'DisplayName', 'tpr-fpr');legend(); end
  • 18. TPR-FPR  Image Matching performance are characterized by functions  TPR(FPR)  Retrieval set: we want high Precision, Short List: High Recall. Image Analysis & Retrieval, 2016 p.18
  • 19. Outline  ReCap of Lecture 06  SIFT  Box Filter  Image Retrieval System  Why Aggregation ?  Aggregation Schemes  Summary Image Analysis & Retrieval, 2016 p.19
  • 20. Why Aggregation ?  What (Local) Interesting Points features bring us ?  Scale and rotation invariance in the form of nk x d:  Un-cerntainty of the number of detected features nk, at query time  Permutation along rows of features are the same representation.  Problems:  The feature has state, not able to draw decision boundaries,  Not directly indexable/hashable  Typically very high dimensionality Image Analysis & Retrieval, 2016 p.20 𝑆 𝑘| [𝑥 𝑘, 𝑦 𝑘, 𝜃 𝑘, 𝜎 𝑘, ℎ1, ℎ2, … , ℎ128] , 𝑘 = 1. . 𝑛
  • 21. Decision Boundary in Matching Can we have a decision boundary function for an interest-point-based representation?
  • 22. Curse of Dimensionality in Retrieval What do feature dimensions do to retrieval efficiency? Look at retrieving within 99% of the range in each dimension, and plot the total volume covered as the dimension grows. Matlab: showDimensionCurse.m
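One way to read the 99% experiment: if a query region spans 99% of the range in every dimension, the fraction of the unit hypercube it covers is 0.99^d. The sketch below follows that reading. It is not the course's showDimensionCurse.m, and the list of dimensions is an arbitrary choice.

```matlab
% Fraction of the unit hypercube covered when each dimension is covered to 99%.
% Assumption: coverage per dimension is independent, so volumes multiply.
per_dim = 0.99;
d   = [1 2 8 32 128 512];      % feature dimensions to inspect (arbitrary)
vol = per_dim .^ d;            % total volume covered in d dimensions

for t = 1:numel(d)
    fprintf('d = %4d  ->  volume covered = %.4f\n', d(t), vol(t));
end
```

Under this reading, a 128-dimensional SIFT descriptor space is only a little over a quarter covered, even though every single dimension is covered almost completely.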
  • 23. Aggregation – 30,000 ft view
      Bag of Words: compute k centroids in feature space, called visual words; compute a histogram; k x 1 feature, hard assignment.
      VLAD: compute centroids in feature space; compute the aggregated difference w.r.t. the centroids; k x d feature, soft assignment.
      Fisher Vector: compute a Gaussian Mixture Model (GMM) with 2nd-order info; compute the aggregated feature w.r.t. the mean and covariance of the GMM; 2 x k x d feature.
      AKULA: adaptive centroids and feature count; improved with covariance?
  • 24. Visual Key Words: main idea Extract some local features from a number of images, e.g., in SIFT descriptor space, where each point is 128-dimensional. Slide credit: D. Nister
  • 25. Visual Key Words: main idea Slide credit: D. Nister
  • 26. Visual words: main idea Slide credit: D. Nister
  • 27. Visual words: main idea Slide credit: D. Nister
  • 28. Visual Key Words Each point is a local descriptor, e.g., a SIFT vector. Slide credit: D. Nister
  • 29. Slide credit: D. Nister
  • 30. Visual words Example: each group of patches belongs to the same visual word. Figure from Sivic & Zisserman, ICCV 2003
  • 31. Visual words More recently used for describing scenes and objects for the sake of indexing or classification (Sivic & Zisserman 2003; Csurka, Bray, Dance, & Fan 2004; many others). Source credit: K. Grauman, B. Leibe
  • 32. Bag of Words Object; Bag of ‘words’. Source: ICCV 2005 short course, L. Fei-Fei
  • 33. BoW Examples Illustration
  • 34. Bags of visual words Summarize the entire image by its distribution (histogram) of visual word occurrences. This is analogous to the bag-of-words representation commonly used for documents. Image credit: Fei-Fei Li
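The histogram step can be sketched in a few lines of base Matlab. The 2-D descriptors X and the 3-word codebook C below are toy values chosen for illustration; in practice the codebook would come from k-means on real 128-D SIFT descriptors.

```matlab
% Toy 2-D descriptors (one per row) and a hypothetical codebook of k = 3 visual words.
X = [0.1 0.0; 0.2 0.1; 0.9 1.0; 1.1 0.9; 1.0 1.1; 2.0 0.1];
C = [0 0; 1 1; 2 0];
k = size(C, 1);

% Squared Euclidean distance between every descriptor and every word (n x k).
D = bsxfun(@plus, sum(X.^2, 2), sum(C.^2, 2)') - 2 * (X * C');

% Hard assignment: each descriptor votes for its nearest visual word.
[~, word] = min(D, [], 2);

% Bag-of-words histogram (k x 1), normalized to sum to 1.
counts = accumarray(word, 1, [k 1]);
h = counts / sum(counts);
disp(h');
```

Shuffling the rows of X leaves h unchanged, and h always has length k regardless of how many descriptors were detected, which is what makes the representation indexable.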
  • 35. Texture Retrieval Textons: a histogram over a universal texton dictionary. Source: Lana Lazebnik
  • 36. BoW Distance Metrics Rank images by the normalized scalar product between their (possibly weighted) occurrence counts, i.e., a nearest neighbor search for similar images. Example count vectors: dj = [5 1 1 0], q = [1 8 1 4].
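The normalized scalar product for the two count vectors shown on the slide works out as follows. Which vector is the query and which is the database image is assumed here; the measure is symmetric, so the value is the same either way.

```matlab
% Count vectors from the slide (role of each vector is an assumption).
dj = [5 1 1 0];   % database image word counts
q  = [1 8 1 4];   % query image word counts

% Normalized scalar product (cosine similarity): 1 = same direction, 0 = no shared words.
sim = dot(dj, q) / (norm(dj) * norm(q));
fprintf('normalized scalar product = %.4f\n', sim);
% prints: normalized scalar product = 0.2975
```

The raw dot product is 14, and the norms are sqrt(27) and sqrt(82), so the two images share words but weight them very differently.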
  • 37. Inverted List Image retrieval via an inverted list: each visual word number maps to a list of image numbers. When will this give us a significant gain in efficiency? Image credit: A. Zisserman
  • 38. Indexing local features: inverted file index For text documents, an efficient way to find all pages on which a word occurs is to use an index. We want to find all images in which a feature occurs, so we index each feature by the images it appears in, and we also keep the number of occurrences. Source credit: K. Grauman, B. Leibe
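A minimal sketch of an inverted file, assuming the database is small enough to hold as a count matrix H (rows are images, columns are visual words). The matrix, the query q and the variable names are illustrative, not part of the course code.

```matlab
% Hypothetical BoW counts: rows = images, columns = visual words.
H = [5 1 1 0;
     0 0 2 3;
     1 8 1 4];
[nImg, k] = size(H);

% Inverted file: for each visual word, the images containing it and the counts.
invList = cell(k, 1);
for w = 1:k
    imgs = find(H(:, w) > 0);
    invList{w} = [imgs, H(imgs, w)];   % columns: image number, # of occurrences
end

% Query: only images sharing at least one word with the query are touched.
q = [0 0 0 2];                          % this query contains visual word 4 only
score = zeros(nImg, 1);
for w = find(q > 0)
    entry = invList{w};
    score(entry(:, 1)) = score(entry(:, 1)) + q(w) * entry(:, 2);
end
candidates = find(score > 0)';
disp(candidates);                       % images 2 and 3
```

Image 1 is never visited because it does not contain word 4. The saving grows with the sparsity of the histograms: with a large vocabulary, each image contains only a small fraction of the words.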
  • 39. TF-IDF Weighting Term Frequency – Inverse Document Frequency: describe an image by the frequency of each visual word within it, and down-weight words that appear often in the database (the standard weighting for text retrieval). $t_i = \frac{n_{id}}{n_d} \log \frac{N}{n_i}$, where $n_{id}$ is the number of occurrences of word i in document d, $n_d$ is the number of words in document d, $n_i$ is the number of occurrences of word i in the whole database, and $N$ is the total number of words in the database.
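The four quantities labeled on the slide combine into a weight of the form (n_id / n_d) * log(N / n_i). The sketch below assumes that form and follows the slide's labels literally, so N counts words in the database. Many text-retrieval formulations count documents instead. The count matrix H is a toy example.

```matlab
% Hypothetical word counts: rows = documents (images), columns = visual words.
H = [5 1 1 0;
     0 0 2 3;
     1 8 1 4];

n_id = H;              % occurrences of word i in document d
n_d  = sum(H, 2);      % number of words in document d          (nDoc x 1)
n_i  = sum(H, 1);      % occurrences of word i in the database  (1 x k)
N    = sum(H(:));      % total number of words in the database

tf  = bsxfun(@rdivide, n_id, n_d);   % term frequency within each document
idf = log(N ./ n_i);                 % down-weights words frequent in the database
T   = bsxfun(@times, tf, idf);       % TF-IDF weighted BoW vectors, one per row
disp(T);
```

Word 2 is the most frequent word in this toy database (9 of 26 occurrences), so it receives the smallest idf factor.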
  • 40. BoW Use Case with Spatial Localization Collecting words within a query region. Query region: pull out only the SIFT descriptors whose positions are within the polygon.
  • 42. BoW Patch Search Localizing the BoW representation
  • 43. Localization with BoW
  • 44. Hierarchical Assignment of Histogram Tree construction [Nister & Stewenius, CVPR’06]
  • 45. Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR’06]
  • 46. Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR’06] Slide credit: David Nister
  • 47. Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR’06] Slide credit: David Nister
  • 48. Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR’06]
  • 49. Vocabulary Tree Training: Filling the tree [Nister & Stewenius, CVPR’06]
  • 50. Vocabulary Tree Recognition RANSAC verification [Nister & Stewenius, CVPR’06] Slide credit: David Nister
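To make the hierarchical assignment concrete, here is a small sketch of descending a two-level tree. The centers are hand-picked 2-D values rather than trained by hierarchical k-means, and the branch factor and depth in the last lines are illustrative parameters.

```matlab
% Hand-built 2-level tree with branch factor b = 2 (hypothetical 2-D centers).
C1 = [0 0; 10 10];                      % level-1 centers (b x d)
C2 = {[-1 -1; 1 1], [9 9; 11 11]};      % children of each level-1 node
x  = [10.6 11.2];                       % query descriptor (1 x d)
b  = size(C1, 1);

% Level 1: nearest of the b children of the root.
[~, i] = min(sum(bsxfun(@minus, C1, x).^2, 2));
% Level 2: nearest of the b children under the chosen node.
[~, j] = min(sum(bsxfun(@minus, C2{i}, x).^2, 2));

leaf = (i - 1) * b + j;                 % visual word index in 1..b^2
fprintf('leaf = %d\n', leaf);           % prints: leaf = 4

% Distance computations per descriptor: b*L for a tree vs. b^L for a flat codebook.
bBig = 10; L = 6;
fprintf('tree: %d, flat: %d\n', bBig * L, bBig^L);   % prints: tree: 60, flat: 1000000
```

The descent never compares the query against leaves under the branch it rejected, which is why very large vocabularies stay affordable at query time.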
  • 51. Vocabulary Tree: Performance Evaluated on large databases: indexing with up to 1M images. Online recognition for a database of 50,000 CD covers: retrieval in ~1s. Experiments show that large vocabularies can be beneficial for recognition. [Nister & Stewenius, CVPR’06]
  • 52. Visual Word Vocabulary Size Performance w.r.t. vocabulary size: larger vocabularies can be advantageous… But what happens if the vocabulary is too large?
  • 53. Bags of words: pros and cons
      Good:
      + flexible to geometry / deformations / viewpoint
      + compact summary of image content
      + provides vector representation for sets
      + inverted list implementation offers a practical solution for large repositories
      Bad:
      - loss of information at quantization and histogram generation
      - basic model ignores geometry – must verify afterwards, or encode via features
      - background and foreground mixed when bag covers whole image
      - interest points or sampling: no guarantee to capture object-level parts
      Source credit: K. Grauman, B. Leibe
  • 54. Can we improve BoW? E.g., why isn’t our Bag of Words classifier at 90% instead of 70%? Training data: a huge issue, but not necessarily a variable you can manipulate. Learning method: BoW sits on top of any feature scheme. Representation: are we losing too much info in the process?
  • 55. Standard K-means Bag of Words BoW revisited. http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 56. Motivation Bag of Visual Words is only about counting the number of local descriptors assigned to each Voronoi region. Why not include other statistics/information? http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 57. Spatial Pooling We already looked at the Spatial Pyramid/Pooling: level 0: 1x1, level 1: 2x2, level 2: 4x4. Key takeaway: multiple assignment? Soft assignment?
  • 58. Motivation Bag of Visual Words is only about counting the number of local descriptors assigned to each Voronoi region. Why not include other statistics? For instance: the mean of local descriptors. http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 59. Motivation Bag of Visual Words is only about counting the number of local descriptors assigned to each Voronoi region. Why not include other statistics? For instance: the mean of local descriptors; the (co)variance of local descriptors. http://www.cs.utexas.edu/~grauman/courses/fall2009/papers/bag_of_visual_words.pdf
  • 60. Simple case: Soft Assignment Called “kernel codebook encoding” by Chatfield et al. 2011. Cast a weighted vote into the most similar clusters.
  • 61. Simple case: Soft Assignment Called “kernel codebook encoding” by Chatfield et al. 2011. Cast a weighted vote into the most similar clusters. This is fast and easy to implement (try it for Project 3!), but it has a downside for image retrieval: the inverted file index becomes less sparse.
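One common form of the weighted vote uses a Gaussian kernel on the descriptor-to-word distance, normalized so that each descriptor's votes sum to one. The sketch below assumes that form. The bandwidth sigma and the toy data are arbitrary choices, not values from the lecture or from Chatfield et al.

```matlab
% Toy 2-D descriptors and a hypothetical 3-word codebook.
X = [0.1 0.0; 0.6 0.5; 1.0 1.1];
C = [0 0; 1 1; 2 0];
sigma = 0.5;                               % kernel bandwidth (assumed)

% Squared distances (n x k), then Gaussian kernel weights.
D2 = bsxfun(@plus, sum(X.^2, 2), sum(C.^2, 2)') - 2 * (X * C');
W  = exp(-D2 / (2 * sigma^2));
W  = bsxfun(@rdivide, W, sum(W, 2));       % each descriptor's votes sum to 1

% Soft BoW histogram: average the vote vectors over all descriptors.
h = sum(W, 1)' / size(X, 1);
disp(W);
disp(h');
```

Every entry of W is non-zero, so each descriptor now touches all k bins instead of one. This is the loss of sparsity that hurts the inverted file; voting only into the few nearest words is a typical compromise.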
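  • 62. A first example: the VLAD Given a codebook $\{\mu_1, \ldots, \mu_k\}$, e.g. learned with K-means, and a set of local descriptors $\{x_1, \ldots, x_n\}$: ① assign each descriptor to its nearest centroid, $NN(x) = \arg\min_{\mu_i} \|x - \mu_i\|$; ② compute the residual $x - \mu_i$; ③ sum the residuals in each cell, $v_i = \sum_{x: NN(x) = \mu_i} (x - \mu_i)$; then concatenate the $v_i$'s and normalize. Figure: five cells with centroids $\mu_1, \ldots, \mu_5$ and their aggregated vectors $v_1, \ldots, v_5$. Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.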
  • 63. A first example: the VLAD A graphical representation of the VLAD descriptor. Jégou, Douze, Schmid and Pérez, “Aggregating local descriptors into a compact image representation”, CVPR’10.
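The three VLAD steps (assign, compute residuals, sum per cell) can be sketched in base Matlab without VL_FEAT. This version uses hard nearest-centroid assignment and a plain L2 normalization. The 2-D data and codebook are toy values, and the variable names are illustrative.

```matlab
% Toy descriptors (n x d) and a hypothetical codebook (k x d).
X = [0.1 0.0; 0.2 0.1; 0.9 1.0; 1.1 0.9; 2.0 0.1];
C = [0 0; 1 1; 2 0];
[k, d] = size(C);

% Step 1: assign each descriptor to its nearest centroid.
D2 = bsxfun(@plus, sum(X.^2, 2), sum(C.^2, 2)') - 2 * (X * C');
[~, nn] = min(D2, [], 2);

% Steps 2 and 3: residuals x - c_i, summed within each cell.
V = zeros(k, d);
for i = 1:k
    R = bsxfun(@minus, X(nn == i, :), C(i, :));
    V(i, :) = sum(R, 1);
end

% Concatenate v_1 ... v_k into one (k*d) x 1 vector and L2-normalize.
v = reshape(V', [], 1);
v = v / norm(v);
disp(v');
```

The output length is always k*d, independent of the number of descriptors, so two images can be compared with a single dot product.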
  • 64. VL_FEAT Implementation Matlab:
      function [vc]=vladSiftEncoding(sift, codebook)
      dbg=1;
      if dbg
          if (0) % init VL_FEAT, only need to do once
              run('../../tools/vlfeat-0.9.20/toolbox/vl_setup.m');
          end
          im = imread('../pics/flarsheim-2.jpg');
          [f, sift] = vl_sift(single(rgb2gray(im)));
          sift = single(sift');                      % n x 128, one descriptor per row
          [indx, codebook] = kmeans(sift, 16);       % 16 visual words
          % make sift # smaller
          sift = sift(1:min(800, size(sift,1)), :);
      end
      [n, kd]=size(sift);
      [m, kd]=size(codebook);
      % compute assignment
      dist = pdist2(codebook, sift);                 % m x n distances
      mdist = mean(mean(dist));
      % normalize the heat kernel s.t. mean dist is mapped to 0.5
      a = -log(0.5)/mdist;
      indx = exp(-a*dist);                           % m x n soft assignment weights
      vc=vl_vlad(sift', codebook', indx);
      if dbg
          figure(41); colormap(gray);
          subplot(2,2,1); imshow(im); title('image');
          subplot(2,2,2); imagesc(dist); title('m x n distance');
          subplot(2,2,3); imagesc(indx); title('m x n assignment');
          % vc stacks one kd-dim block per codeword, so reshape to kd x m first
          subplot(2,2,4); imagesc(reshape(vc, [kd, m])'); title('vlad code');
      end
  • 65. VLAD Code What are the tweaks? Codebook design; soft assignment options.
  • 66. References
      Vocabulary Tree: David Nistér, Henrik Stewénius: Scalable Recognition with a Vocabulary Tree. CVPR (2) 2006: 2161-2168
      VLAD: Hervé Jégou, Matthijs Douze, Cordelia Schmid: Improving Bag-of-Features for Large Scale Image Search. International Journal of Computer Vision 87(3): 316-336 (2010)
      Fisher Vector: Florent Perronnin, Jorge Sánchez, Thomas Mensink: Improving the Fisher Kernel for Large-Scale Image Classification. ECCV (4) 2010: 143-156
      AKULA: Abhishek Nagar, Zhu Li, Gaurav Srivastava, Kyungmo Park: AKULA - Adaptive Cluster Aggregation for Visual Search. DCC 2014: 13-22
  • 67. Lec 07 Summary
      Image Retrieval System Metric: What are true positive, false positive, true negative, false negative? What are precision, recall, F-score?
      Why Aggregation? Decision boundary; indexing/hashing.
      Bag of Words: a histogram whose bins are visual words. Variations: hierarchical assignment with a vocabulary tree. Implementation: inverted list.
      VLAD: richer encoding of aggregated info; soft assignment of features to codebook bins; vectorized representation, no need for an inverted list.