Fcv rep darrell

Learning visual representations
for unfamiliar environments

Kate Saenko, Brian Kulis,
Trevor Darrell

UC Berkeley EECS & ICSI

The challenge of large-scale visual interaction

The last decade has shown that models learned from data
outperform hand-engineered structures!

Large-scale learning
• “Unsupervised”: Learn models from “found data”;
often exploit multiple modalities (text+image)

… The Tote is the perfect example of
two handbag design principles that ...
The lines of this tote are incredibly
sleek, but ... The semi buckles that
form the handle attachments are ...

E.g., finding visual senses

Artifact sense: “telephone”

Dictionary:
1: (n) telephone, phone, telephone set (electronic equipment that
converts sound into electrical signals that can be transmitted over
distances and then converts received signals back into sounds)
2: (n) telephone, telephony (transmitting speech at a distance)

[Saenko and Darrell ’09]

Large-scale Learning
• Supervised: Crowdsource labels (e.g., ImageNet)

Yet…
• Even the best collections of web images, combined with
strong machine learning methods, often yield poor
classifiers on in-situ data!

• Supervised learning assumption: training distribution
== test distribution
• Unsupervised learning assumption: joint distribution is
stationary w.r.t. online world and real world

Almost never true!

“What You Saw Is Not What You Get”

Accuracy of the same classifiers in two test conditions (figure):
• SVM: 20%, NBNN: 19%
• SVM: 54%, NBNN: 61%

The models fail due to domain shift

Examples of visual domain shifts

• Camera: digital SLR vs. webcam
• Distance: close-up vs. far-away
• Image source: amazon.com, FLICKR, CCTV, consumer images

Examples of domain shift:
change in camera, feature type, dimension
• Camera: digital SLR vs. webcam
• Feature type: SURF vs. SIFT
• Dimension: VQ to 300 vs. a different VQ to 1000 dimensions

Solutions?

• Do nothing (poor performance)
• Collect all types of data (impossible)
• Find out what changed (impractical)
• Learn what changed

Prior Work on Domain Adaptation

• Pre-process the data [Daumé ’07] : replicate
features to also create source-specific and target-specific
versions; re-train the learner on the new features

• SVM-based methods [Yang’07], [Jiang’08],
[Duan’09], [Duan’10] : adapt SVM parameters

• Kernel mean matching [Gretton’09] : re-weight
training data to match test data distribution
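
The feature replication in the first bullet can be sketched as follows. This is a minimal illustration of the general idea (a shared copy plus one domain-specific copy of each feature vector), not code from the cited work; the function name `augment` and the toy vectors are assumptions made for this example.

```python
def augment(features, domain):
    """Replicate a feature vector into shared, source-only and target-only blocks."""
    zeros = [0.0] * len(features)
    if domain == "source":
        # shared copy, source-specific copy, empty target-specific block
        return list(features) + list(features) + zeros
    if domain == "target":
        # shared copy, empty source-specific block, target-specific copy
        return list(features) + zeros + list(features)
    raise ValueError("domain must be 'source' or 'target'")


src = augment([1.0, 2.0], "source")
tgt = augment([3.0, 4.0], "target")
```

A standard learner is then re-trained on the augmented vectors, so it can place weight on the shared block or on either domain-specific block.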

Our paradigm: Transform-based
Domain Adaptation
Example: “green” and “blue” domains
Previous methods’ drawbacks
• cannot transfer learned shift
to new categories
• cannot handle new features
We can do both by learning
domain transformations*

* Saenko, Kulis, Fritz, and Darrell.
Adapting visual category models to
new domains. ECCV, 2010

Limitations of symmetric transforms
Symmetric assumption fails!

Saenko et al. ECCV10 used
metric learning:
• symmetric transforms
• same features
How do we learn more
general shifts?

Latest approach*: asymmetric transforms
Asymmetric transform (rotation)

• Metric learning model no
longer applicable
• We propose to learn
asymmetric transforms
– Map from target to source
– Handle different dimensions

*Kulis, Saenko, and Darrell, What You
Saw is Not What You Get: Domain
Adaptation Using Asymmetric Kernel
Transforms, CVPR 2011

Model Details

• Learn a linear transformation to map points
from one domain to another
– Call this transformation W
– Matrices of source and target:
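
The matrix definitions did not survive extraction. One notation consistent with the rest of the deck (x for source points, y for target points) is given below; the symbols n_A, n_B, d_A and d_B for the number of points and the dimensionality of each domain are assumptions introduced here.

```latex
X = [x_1, \dots, x_{n_A}] \in \mathbb{R}^{d_A \times n_A}, \qquad
Y = [y_1, \dots, y_{n_B}] \in \mathbb{R}^{d_B \times n_B}, \qquad
W \in \mathbb{R}^{d_A \times d_B}
```

With this convention W need not be square, which is what lets the two domains have different dimensions.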

Loss Functions

Choose a point x from the
source and y from the
target, and consider inner
product:

Should be “large” for similar
objects and “small” for dissimilar
objects
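
The inner product referred to here is presumably x^T W y, where W maps the target point y into the source space before it is compared with x. A minimal sketch under that assumption; the function name `similarity` and the toy values are illustrative only.

```python
def similarity(x, W, y):
    """Return x^T W y for a source vector x (length d_A), a target vector y
    (length d_B) and a d_A-by-d_B matrix W given as a list of rows."""
    if len(W) != len(x) or any(len(row) != len(y) for row in W):
        raise ValueError("W must have len(x) rows and len(y) columns")
    # W y: the target point expressed in the source feature space
    mapped = [sum(w * v for w, v in zip(row, y)) for row in W]
    return sum(a * b for a, b in zip(x, mapped))


x = [1.0, 0.0]            # source point, d_A = 2
y = [0.0, 2.0, 1.0]       # target point, d_B = 3
W = [[0.0, 1.0, 0.0],
     [1.0, 0.0, 0.0]]
score = similarity(x, W, y)
```

Because W is rectangular in this example, the source and target points can live in feature spaces of different dimension.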

Loss Functions

• Input to problem includes a collection of m
loss functions

• General assumption: loss functions depend
on data only through inner product matrix
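
The slides do not show the individual loss functions. One plausible family, assumed here purely for illustration, is a squared hinge on single entries of the inner product matrix X^T W Y: same-class pairs are penalized when their score falls below a threshold, different-class pairs when it rises above another. The thresholds `lower` and `upper` and all names below are hypothetical.

```python
def pair_loss(score, same_class, lower=1.0, upper=0.0):
    """Squared hinge penalty on one inner product, score = x^T W y."""
    if same_class:
        return max(0.0, lower - score) ** 2   # similar pair scored too low
    return max(0.0, score - upper) ** 2       # dissimilar pair scored too high


def total_loss(inner_products, constraints):
    """Sum the losses over constraints (i, j, same_class) indexing X^T W Y."""
    return sum(pair_loss(inner_products[i][j], same)
               for i, j, same in constraints)


S = [[0.5, 2.0],
     [1.5, -1.0]]                      # stands in for X^T W Y
constraints = [(0, 0, True), (0, 1, False), (1, 1, False)]
loss = total_loss(S, constraints)
```

Each loss touches the data only through an entry of the inner product matrix, which is the property the general assumption above requires.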

Regularized Objective Function

• Minimize a linear combination of sum of loss
functions and a regularizer:

• We use squared Frobenius norm as a
regularizer
– Not restricted to this choice
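
The objective itself was lost in extraction. A form consistent with the bullets above is sketched below, writing X and Y for the matrices of source and target points, c_i for the m loss functions and r for the regularizer. The trade-off symbol \lambda and the factor 1/2 are assumed conventions, not taken from the slide.

```latex
\min_{W} \; r(W) + \lambda \sum_{i=1}^{m} c_i\!\left(X^{\top} W Y\right),
\qquad
r(W) = \tfrac{1}{2}\,\lVert W \rVert_F^{2}
```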

The Model Has Drawbacks

• A linear transformation may be insufficient
• Cost of optimization grows as the product of
the dimensionalities of the source and target
data

• What to do?

Kernelization

• Main idea: run in kernel space
– Use a non-linear kernel function (e.g., RBF kernel)
to learn non-linear transformations in input space
– Resulting optimization is independent of input
dimensionality
– Additional assumption necessary: regularizer is a
spectral function
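
As a small illustration of the RBF kernel mentioned above, the sketch below builds a kernel matrix for a handful of points. The bandwidth `gamma` and the toy data are assumptions; the point is only that the matrix size depends on the number of points, not on their dimensionality.

```python
import math


def rbf_kernel(a, b, gamma=0.5):
    """RBF kernel exp(-gamma * ||a - b||^2) between two vectors."""
    sq_dist = sum((p - q) ** 2 for p, q in zip(a, b))
    return math.exp(-gamma * sq_dist)


def kernel_matrix(points, gamma=0.5):
    """n-by-n matrix of pairwise kernel values for n points."""
    return [[rbf_kernel(p, q, gamma) for q in points] for p in points]


source = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0]]
K = kernel_matrix(source)
```

Replacing the 2-D points with 1000-D histograms would still give a 3-by-3 matrix, which is why the kernelized optimization is independent of input dimensionality.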

Kernelization
Kernel matrices for source
and target

Original Transformation
Learning Problem

New Kernel Problem

Relationship between
original and new problems
at optimality
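
The four equations behind these labels were lost in extraction. The reconstruction below is an assumption based on the formulation in the cited CVPR 2011 paper, with X and Y the matrices of source and target points, c_i the loss functions, r the regularizer and \lambda a trade-off weight; L denotes the matrix learned in kernel space, and the kernel matrices are written for the linear kernel (a non-linear kernel replaces the inner products).

```latex
\begin{align*}
\text{Kernel matrices:} \quad
  & K_X = X^{\top} X, \qquad K_Y = Y^{\top} Y \\
\text{Original problem:} \quad
  & \min_{W} \; r(W) + \lambda \sum_{i} c_i\!\left(X^{\top} W Y\right) \\
\text{Kernel problem:} \quad
  & \min_{L} \; r(L) + \lambda \sum_{i} c_i\!\left(K_X^{1/2} L K_Y^{1/2}\right) \\
\text{At optimality:} \quad
  & W^{*} = X K_X^{-1/2} L^{*} K_Y^{-1/2} Y^{\top}
\end{align*}
```

As a consistency check, substituting W^{*} into X^{\top} W Y gives K_X K_X^{-1/2} L^{*} K_Y^{-1/2} K_Y = K_X^{1/2} L^{*} K_Y^{1/2}, which is exactly the argument of the losses in the kernel problem.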

Summary of approach

1. Multi-domain data
2. Generate constraints, learn W
3. Map via W
4. Apply to new categories
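
The steps above can be sketched end to end on toy data. This is a simplified linear illustration, not the authors' implementation: it assumes squared hinge losses on x^T W y, a squared Frobenius regularizer, plain gradient descent, and nearest-neighbor classification after mapping. All names, thresholds and data are hypothetical.

```python
def matvec(W, y):
    """Compute W y, mapping a target point into the source space."""
    return [sum(w * v for w, v in zip(row, y)) for row in W]


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def learn_transform(X, Y, constraints, lam=1.0, lr=0.05, steps=500):
    """Gradient descent on 0.5*||W||_F^2 + lam * sum of squared hinge losses.
    X and Y are lists of points; constraints are (i, j, same_class)."""
    d_a, d_b = len(X[0]), len(Y[0])
    W = [[0.0] * d_b for _ in range(d_a)]
    for _ in range(steps):
        grad = [row[:] for row in W]      # gradient of the regularizer is W
        for i, j, same in constraints:
            s = dot(X[i], matvec(W, Y[j]))
            # derivative of the squared hinge with respect to the score s
            coeff = -2.0 * max(0.0, 1.0 - s) if same else 2.0 * max(0.0, s)
            for a in range(d_a):
                for b in range(d_b):
                    grad[a][b] += lam * coeff * X[i][a] * Y[j][b]
        W = [[W[a][b] - lr * grad[a][b] for b in range(d_b)]
             for a in range(d_a)]
    return W


def nearest_source_label(y, W, X, labels):
    """Map y via W, then return the label of the closest source point."""
    mapped = matvec(W, y)
    dists = [sum((m - v) ** 2 for m, v in zip(mapped, x)) for x in X]
    return labels[dists.index(min(dists))]


# 1. Multi-domain data: 2-D source points, 3-D target points
X, x_labels = [[1.0, 0.0], [0.0, 1.0]], ["A", "B"]
Y, y_labels = [[0.0, 1.0, 0.0], [0.0, 0.0, 1.0]], ["A", "B"]
# 2. Generate constraints from the labels, then learn W
constraints = [(i, j, x_labels[i] == y_labels[j])
               for i in range(len(X)) for j in range(len(Y))]
W = learn_transform(X, Y, constraints)
# 3. Map a target test point via W and classify it in the source domain
predicted = nearest_source_label([0.0, 0.9, 0.1], W, X, x_labels)
```

Step 4 would reuse the same W for test points from categories that contributed no constraints, since W models the shift between domains rather than any single class.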

Experimental Setup

• Use a standard bag-of-words model
• Also use different features in the target domain
– SURF vs SIFT
– Different visual word dictionaries

• Baseline for comparing such data: kernel canonical correlation analysis (KCCA)
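
For readers unfamiliar with the bag-of-words setup, the sketch below shows the usual vector quantization step: each local descriptor is assigned to its nearest visual word and the image becomes a normalized histogram over the dictionary. The tiny codebook and descriptors are made up for illustration; a real dictionary would have hundreds of words and SURF or SIFT descriptors as input.

```python
def quantize(descriptor, codebook):
    """Index of the codeword closest (squared Euclidean) to the descriptor."""
    dists = [sum((d - c) ** 2 for d, c in zip(descriptor, word))
             for word in codebook]
    return dists.index(min(dists))


def bag_of_words(descriptors, codebook):
    """Normalized histogram of codeword assignments for one image."""
    hist = [0.0] * len(codebook)
    for desc in descriptors:
        hist[quantize(desc, codebook)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist


codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 4.0]]      # toy dictionary, 3 words
descriptors = [[0.1, 0.0], [0.9, 1.2], [1.1, 0.8], [3.5, 4.2]]
histogram = bag_of_words(descriptors, codebook)
```

Two domains that use different descriptors or different dictionaries produce histograms of different length, which is the setting where a rectangular transform is needed.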

Novel-class experiments

(Plot legend: Our Method (linear), Our Method)

• Test method’s ability to transfer domain shift to unseen
classes
• Train transform on half of the classes, test on the other half

Extreme shift example

Figure panels:
• Query from target
• Nearest neighbors in source using KCCA+KNN
• Nearest neighbors in source using transformation

Conclusion
• We should not rely on hand-engineered features any
more than we rely on hand-engineered models!

• Learn feature transformation across domains

• Developed a domain adaptation method based on
regularized non-linear transforms
– Asymmetric transform achieves best results on more
extreme shifts
– Saenko et al., ECCV 2010 and Kulis et al., CVPR 2011;
journal version forthcoming
