Machine Learning
Introduction & Nonparametric Classifiers
Eric Xing
Lecture 1, August 12, 2010
Reading:
Machine Learning
 Where does it come from:
 http://www.cs.cmu.edu/~epxing/Class/10701/
 http://www.cs.cmu.edu/~epxing/Class/10708/
Logistics
 Text book
 Chris Bishop, Pattern Recognition and Machine Learning (required)
 Tom Mitchell, Machine Learning
 David MacKay, Information Theory, Inference, and Learning Algorithms
 Daphne Koller and Nir Friedman, Probabilistic Graphical Models
 Class resource
 http://bcmi.sjtu.edu.cn/ds/
 Hosts:
 Lu Baoliang, Shanghai JiaoTong University
 Xue Xiangyang, Fudan University
 Instructors:

Local Hosts and Co-instructors
Logistics
 Class mailing list
 dragonstar_machinelearning@googlegroups.com
 Homework
 Exam
 Project
What is Learning?
Learning is about seeking a predictive and/or executable understanding of
natural/artificial subjects, phenomena, or activities from …
 Apoptosis + Medicine
 Grammatical rules
 Manufacturing procedures
 Natural laws
 …
→ Inference
Machine Learning
Fetching a stapler from inside an office --- the Stanford STAIR robot
What is Machine Learning?
Machine Learning seeks to develop theories and computer systems for
 representing;
 classifying, clustering and recognizing;
 reasoning under uncertainty;
 predicting;
 and reacting to
 …
complex, real-world data, based on the system's own experience with data,
and (hopefully) under a unified model or mathematical framework, that
 can be formally characterized and analyzed
 can take into account human prior knowledge
 can generalize and adapt across data and domains
 can operate automatically and autonomously
 and can be interpreted and perceived by humans.
Where Machine Learning is being used or can be useful?
 Speech recognition
 Information retrieval
 Computer vision
 Games
 Robotic control
 Planning
 Evolution
 Pedigree
Natural language processing and speech recognition
 Now most pocket Speech Recognizers or Translators are running on some sort of learning device --- the more you play/use them, the smarter they become!
Object Recognition
 Behind a security camera, most likely there is a computer that is learning and/or checking!
Robotic Control I
 The best helicopter pilot is now a computer!
 it runs a program that learns how to fly and make acrobatic maneuvers by itself!
 no taped instructions, joysticks, or things like …
A. Ng 2005
Text Mining
 We want:
 Reading, digesting, and categorizing a vast text database is too much for humans!
Bioinformatics
[Two slides showing a wall of raw genomic sequence; the string "whereisthegene" is embedded in the text.]
Where is the gene?
Paradigms of Machine Learning
 Supervised Learning
 Given $D = \{X_i, Y_i\}$, learn $f(\cdot)$: $Y_i = f(X_i)$, s.t. $D_{new} = \{X_j\} \rightarrow \{Y_j\}$
 Unsupervised Learning
 Given $D = \{X_i\}$, learn $f(\cdot)$: $Y_i = f(X_i)$, s.t. $D_{new} = \{X_j\} \rightarrow \{Y_j\}$
 Reinforcement Learning
 Given $D = \{\text{env}, \text{actions}, \text{rewards}, \text{simulator/trace/real game}\}$, learn $\text{policy}: e \rightarrow a$ and $\text{utility}: a, e \rightarrow r$, s.t. $\{\text{new env}, \text{real game}\} \rightarrow \{a_1, a_2, a_3, \ldots\}$
 Active Learning
 Given $D \sim G(\cdot)$, learn $\text{policy}: (G'(\cdot), D) \rightarrow \{Y_j\}$ for all $G'$, s.t. $\text{new} \sim G'(\cdot)$ and $f(\cdot)$
(A minimal sketch of these interfaces as code follows.)
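As an illustrative sketch (not from the slides), the four paradigms differ mainly in what the learner is given and what it must return; the stand-in types and function names below are hypothetical:

```python
from typing import Callable, List, Tuple

X, Y, Action = float, int, str   # illustrative stand-in types, not from the slides

def supervised(D: List[Tuple[X, Y]]) -> Callable[[X], Y]: ...       # labeled pairs -> predictor f
def unsupervised(D: List[X]) -> Callable[[X], Y]: ...               # unlabeled data -> labeling f
def reinforcement(env: object, actions: List[Action],
                  rewards: Callable[[Action], float]) -> Callable[[object], Action]: ...  # -> policy
def active(G: Callable[[], X],
           query_label: Callable[[X], Y]) -> Callable[[X], Y]: ...  # learner chooses its own queries
```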
Elements of Learning
 Here are some important elements to consider before you start:
 Task:
 Embedding? Classification? Clustering? Topic extraction? …
 Data and other info:
 Input and output (e.g., continuous, binary, counts, …)
 Supervised or unsupervised, or a blend of everything?
 Prior knowledge? Bias?
 Models and paradigms:
 BN? MRF? Regression? SVM?
 Bayesian/Frequentist? Parametric/Nonparametric?
 Objective/Loss function:
 MLE? MCLE? Max margin?
 Log loss, hinge loss, square loss? …
 Tractability and exactness trade-off:
 Exact inference? MCMC? Variational? Gradient? Greedy search?
 Online? Batch? Distributed?
 Evaluation:
 Visualization? Human interpretability? Perplexity? Predictive accuracy?
 It is better to consider one element at a time!
Theories of Learning
For the learned $F(\cdot\,;\theta)$:
 Consistency (value, pattern, …)
 Bias versus variance
 Sample complexity
 Learning rate
 Convergence
 Error bound
 Confidence
 Stability
 …
Classification
 Representing data:
 Hypothesis (classifier)
Decision-making as dividing a high-dimensional space
 Classification-specific distribution: $P(X \mid Y)$
 $p(X \mid Y=1) = p(X;\, \mu_1, \Sigma_1)$
 $p(X \mid Y=2) = p(X;\, \mu_2, \Sigma_2)$
 Class prior (i.e., "weight"): $P(Y)$
The Bayes Rule
 What we have just done leads to the following general expression:

$$P(Y \mid X) = \frac{P(X \mid Y)\, p(Y)}{P(X)}$$

This is Bayes Rule. (A small numeric example follows.)
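A minimal numeric sketch of the rule; the two Gaussian class-conditionals and the equal priors below are invented for illustration:

```python
from scipy.stats import norm  # p(X|Y=i) modeled as 1-D Gaussians

prior = {1: 0.5, 2: 0.5}                      # P(Y) -- assumed equal priors
lik = {1: norm(0.0, 1.0), 2: norm(3.0, 1.0)}  # p(X|Y=i) -- illustrative parameters

x = 1.2
evidence = sum(lik[i].pdf(x) * prior[i] for i in prior)              # P(X)
posterior = {i: lik[i].pdf(x) * prior[i] / evidence for i in prior}  # Bayes rule
print(posterior)  # class 1 has the larger posterior at x = 1.2
```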
The Bayes Decision Rule for Minimum Error
 The a posteriori probability of a sample:

$$q_i(X) \;=\; P(Y=i \mid X) \;=\; \frac{p(X \mid Y=i)\,P(Y=i)}{p(X)} \;=\; \frac{p(X \mid Y=i)\,\pi_i}{\sum_i p(X \mid Y=i)\,\pi_i}$$

 Bayes Test:
 Likelihood Ratio: $\ell(X)$
 Discriminant function: $h(X)$
Example of Decision Rules
 When each class is a normal …
 We can write the decision boundary analytically in some cases … homework!! (One such case is sketched below.)
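For reference, a sketch of one such case (equal covariances; a standard result, not worked out on the slide): with $p(X \mid Y=i) = \mathcal{N}(X; \mu_i, \Sigma)$ and priors $\pi_i$, the log posterior ratio is linear in $X$, so the decision boundary is a hyperplane:

```latex
\log\frac{P(Y=1 \mid X)}{P(Y=2 \mid X)}
  = (\mu_1-\mu_2)^{\top}\Sigma^{-1}X
  \;-\; \tfrac{1}{2}(\mu_1+\mu_2)^{\top}\Sigma^{-1}(\mu_1-\mu_2)
  \;+\; \log\frac{\pi_1}{\pi_2} \;=\; 0
```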
Bayes Error
 We must calculate the probability of error
 the probability that a sample is assigned to the wrong class
 Given a datum X, what is the risk?
 The Bayes error (the expected risk):
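The slide's formula was an image; for reference, the standard two-class form of the Bayes error is:

```latex
P(\mathrm{error}) \;=\; \int \min\!\big[\,P(Y=1)\,p(X \mid Y=1),\;\; P(Y=2)\,p(X \mid Y=2)\,\big]\, dX
```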
More on Bayes Error
 The Bayes error is the lower bound of the probability of classification error
 The Bayes classifier is the theoretically best classifier, i.e., the one that minimizes the probability of classification error
 Computing the Bayes error is in general a very complex problem. Why?
 Density estimation:
 Integrating the density function:
Learning Classifier
 The decision rule:
 Learning strategies
 Generative Learning
 Discriminative Learning
 Instance-based Learning (store all past experience in memory)
 A special case of nonparametric classifier
Supervised Learning
 K-Nearest-Neighbor Classifier:
where h(X) is represented by all the data, and by an algorithm
Recall: Vector Space Representation
 Each document is a vector, one component for each term (= word).

        Doc 1  Doc 2  Doc 3  ...
Word 1    3      0      0    ...
Word 2    0      8      1    ...
Word 3   12      1     10    ...
...       0      1      3    ...
...       0      0      0    ...

 Normalize to unit length. (A code sketch follows below.)
 High-dimensional vector space:
 Terms are axes, 10,000+ dimensions, or even 100,000+
 Docs are vectors in this space
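A minimal sketch of this representation; the counts mirror the toy table above:

```python
import numpy as np

# term-count vectors for Doc 1..3 over a 3-word toy vocabulary (rows = docs)
docs = np.array([[3., 0., 12.],
                 [0., 8., 1.],
                 [0., 1., 10.]])
# normalize each document vector to unit length
unit_docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
print(np.linalg.norm(unit_docs, axis=1))  # -> [1. 1. 1.]
```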
Test Document = ?
Sports  Science  Arts

1-Nearest Neighbor (kNN) classifier
Sports  Science  Arts

2-Nearest Neighbor (kNN) classifier
Sports  Science  Arts

3-Nearest Neighbor (kNN) classifier
Sports  Science  Arts

K-Nearest Neighbor (kNN) classifier
Voting kNN
Sports  Science  Arts

Classes in a Vector Space
Sports  Science  Arts
kNN Is Close to Optimal
 Cover and Hart 1967
 Asymptotically, the error rate of 1-nearest-neighbor classification is less than twice the Bayes rate [the error rate of a classifier knowing the model that generated the data]
 In particular, the asymptotic error rate is 0 if the Bayes rate is 0. (The precise bound is given below.)
 Where does kNN come from?
 Nonparametric density estimation
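For reference, the precise Cover & Hart bound (a standard result, stated here rather than on the slide): with $c$ classes and Bayes risk $R^*$, the asymptotic 1-NN risk $R$ satisfies

```latex
R^* \;\le\; R \;\le\; R^*\left(2 \;-\; \frac{c}{c-1}\,R^*\right)
```

so $R \le 2R^*$, and $R = 0$ whenever $R^* = 0$.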
Nearest-Neighbor Learning Algorithm
 Learning is just storing the representations of the training examples in D.
 Testing instance x:
 Compute similarity between x and all examples in D.
 Assign x the category of the most similar example in D.
 Does not explicitly compute a generalization or category prototypes.
 Also called:
 Case-based learning
 Memory-based learning
 Lazy learning
(A minimal code sketch follows.)
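A minimal runnable sketch of this algorithm (voting kNN with a Euclidean metric; the toy points and class names are invented for illustration):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=3):
    """Classify x by majority vote among its k nearest stored examples."""
    dists = np.linalg.norm(X_train - x, axis=1)   # distance from x to every stored example
    nearest = np.argsort(dists)[:k]               # indices of the k closest training points
    votes = Counter(y_train[i] for i in nearest)  # vote over their labels
    return votes.most_common(1)[0][0]

# "learning" is just storing D; all the work happens at query time
X_train = np.array([[0., 0.], [0., 1.], [1., 0.], [5., 5.], [5., 6.], [6., 5.]])
y_train = np.array(["Arts", "Arts", "Arts", "Sports", "Sports", "Sports"])
print(knn_predict(X_train, y_train, np.array([4.5, 5.2]), k=3))  # -> "Sports"
```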
kNN is an instance of Instance-Based Learning
 What makes an Instance-Based Learner?
 A distance metric
 How many nearby neighbors to look at?
 A weighting function (optional)
 How to relate to the local points?
Euclidean Distance Metric

$$D(x, x') = \sqrt{\sum_i \sigma_i^2\,(x_i - x'_i)^2}$$

 Or equivalently,

$$D(x, x') = \sqrt{(x - x')^{\top}\,\Sigma\,(x - x')}$$

 Other metrics:
 L1 norm: $|x - x'|$
 L∞ norm: $\max |x - x'|$ (elementwise …)
 Mahalanobis: where $\Sigma$ is full, and symmetric (a code sketch follows)
 Correlation
 Angle
 Hamming distance, Manhattan distance
 …
Case Study: kNN for Web Classification
 Dataset
 20 News Groups (20 classes)
 Download: http://people.csail.mit.edu/jrennie/20Newsgroups/
 61,118 words, 18,774 documents
 Class labels descriptions
Results: Binary Classes
[Plots of accuracy vs. k for binary tasks among alt.atheism, comp.graphics, rec.autos, rec.motorcycles, comp.windows.x, and rec.sport.baseball]
Results: Multiple Classes
Randomly select 5 out of 20 classes, repeat 10 runs and average
[Plots of accuracy vs. k for the 5-class subsets and for all 20 classes]
Is kNN ideal?

Is kNN ideal? … more later
Effect of Parameters
 Sample size
 The more the better
 Need an efficient search algorithm for NN
 Dimensionality
 Curse of dimensionality
 Density
 How smooth?
 Metric
 The relative scalings in the distance metric affect region shapes.
 Weight
 Spurious or less relevant points need to be downweighted
 K
Summary
 Machine Learning is Cool and Useful!!
 Paradigms of Machine Learning
 Design elements of learning
 Theories on learning
 Fundamental theory of classification
 Bayes optimal classifier
 Instance-based learning: kNN – a nonparametric classifier
 A nonparametric method does not rely on any assumption concerning the structure of the underlying density function.
 Very little “learning” is involved in these methods
 Good news:
 Simple and powerful methods; flexible and easy to apply to many problems.
 The kNN classifier asymptotically approaches the Bayes classifier, which is theoretically the best classifier, minimizing the probability of classification error.
 Bad news:
 High memory requirements
 Very dependent on the scale factor for a specific problem.
Learning Based Approaches for Visual Recognition
L. Fei‐Fei
Computer Science Dept.
Stanford University
As legend goes…

CVPR: 1985 ‐ 2010
What is vision?
Real world → pixel world: “forming” pictures
Pixel world → real world: “understanding” pictures

What is vision?
“understanding” pictures
• edges
• intensity
• texture
• …
Low‐Level Vision

What is vision?
“understanding” pictures
• groupings of similar pixels
• geometry
• …
Low‐Level Vision → Mid‐Level Vision

What is vision?
“understanding” pictures
“This is a story of love and friendship among three young hamsters. On a sunny day in the garden…”
Low‐Level Vision → Mid‐Level Vision → High‐Level Vision
Humans are extremely good at high‐level semantic understanding
[Rapid image presentation over time]
Fei‐Fei et al. JoV 2007
PT = 27ms: “This was a picture with some dark splotches in it. Yeah. . . that's about it.” (Subject: KM)

PT = 40ms: “I think I saw two people on a field.” (Subject: RW)

PT = 67ms: “Outdoor scene. There were some kind of animals, maybe dogs or horses, in the middle of the picture. It looked like they were running in the middle of a grassy field.” (Subject: IV)

PT = 107ms: “two people, whose profile was toward me. looked like they were on a field of some sort and engaged in some sort of sport (their attire suggested soccer, but it looked like there was too much contact for that).” (Subject: AI)

PT = 500ms: “Some kind of game or fight. Two groups of two men? The foreground pair looked like one was getting a fist in the face. Outdoors seemed like because I have an impression of grass and maybe lines on the grass? That would be why I think perhaps a game, rough game though, more like rugby than football because the pairs weren't in pads and helmets, though I did get the impression of similar clothing. maybe some trees? in the background.” (Subject: SM)

Fei‐Fei et al. JoV 2007
Visual recognition
[Diagram: recognition tasks arranged by level of image representation and by processing time in the human visual system (90 msec → 150 msec → 1 sec), across Object / Scene / Event‐activity. Low‐level: texture, shape, features and descriptors, segmentation, tracking. Mid‐level: parts and attributes, basic-level classification, action classification, 3D geometry, geometric layout. High‐level: recognition, scene understanding, activity understanding, roles and functions, causality, functionality (human-object interaction), situation, goals and intentions, social roles.]

Visual recognition: 1990 ‐ early 2000
[Same diagram; highlights the tasks emphasized in that period.]

Visual recognition: early 2000 ‐ now
[Same diagram; highlights the tasks emphasized in that period.]
Why machine learning?
• 80,000,000,000+ images
• 5,000,000,000+ images
• 120,000,000 videos (upload: 13 hours/min)
• 20Gb of images
• My mom’s hard‐drive: 220+Gb of images

Why machine learning?
• Today: lots of data, complex tasks
Internet images, personal photo albums
Movies, news, sports
Surveillance and security
Medical and scientific images
Machine learning in computer vision
• Aug 12, Lecture 1: Nearest Neighbor
– Large‐scale image classification
– Scene completion
• Aug 12, Lecture 3: Neural Network
– Convolutional Nets for object recognition
– Unsupervised feature learning via Deep Belief Net
• Aug 13, Lecture 7: Dimensionality reduction, Manifold learning
– Eigen‐ and Fisher‐ faces
– Applications to object representation
• Aug 15, Lecture 13: Conditional Random Field
– Image segmentation, object recognition & image annotation
• Aug 15 & 16, Lecture 14 + 17: Topic models
– Object recognition
– Scene classification, image annotation, large‐scale image clustering
– Total scene understanding
Nearest Neighbor approaches: two case studies
L. Fei‐Fei
Computer Science Dept.
Stanford University

Machine learning in computer vision
• Aug 12, Lecture 1: Nearest Neighbor
– Large‐scale image classification
– Scene completion
http://www.image-net.org
Large‐scale image classification and retrieval is an unaddressed problem in vision
[Plot: # of clean images per category (log_10) vs. # of visual concept categories (log_10), comparing MSRC, Caltech101/256, PASCAL(1), LabelMe(3), Tiny Images(2), and humans]
1. Excluding the Caltech101 datasets from PASCAL
2. Images in this dataset are not human annotated. The # of clean images per category is a rough estimation
3. Only categories with more than 100 images are considered
kNN for image classification: basic set‐up
[Labeled training images: Antelope, Kangaroo, Jellyfish, German Shepherd, Trombone; plus a query image “?”]

kNN for image classification: basic set‐up
[5‐NN classification: count the class votes among the query’s 5 nearest neighbors (bar chart over Kangaroo, Antelope, Jellyfish, German Shepherd, Trombone) and assign the winner: Kangaroo]
10K classes, 4.5M queries, 4.5M training images
[Background image courtesy: Antonio Torralba]
kNN on 10K classes
• 10K classes
• 4.5M queries
• 4.5M training
• Features
– BOW
– GIST
Deng, Berg, Li & Fei‐Fei, ECCV 2010
How fast is kNN?
• Brute force linear scan
– E.g. scanning 4.5M images!
• Can we be faster?
– Yes, if feature dimensionality is low (e.g. <= 16)
– K‐D tree. (A usage sketch follows.)
http://graphics.stanford.edu/courses/cs368‐00‐spring/TA/manuals/CGAL/ref‐manual2/SearchStructures/kdtree.gif
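A usage sketch with SciPy's k-d tree (the sizes are illustrative; exact NN search, effective here because the dimensionality is low):

```python
import numpy as np
from scipy.spatial import cKDTree

rng = np.random.default_rng(0)
X = rng.standard_normal((1_000_000, 8))   # 1M points in a low-dimensional feature space
tree = cKDTree(X)                         # build the tree once
dists, idx = tree.query(rng.standard_normal(8), k=5)  # exact 5-NN query, no linear scan
print(idx, dists)
```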
Curse of dimensionality
• For high dimensionality, in both theory and practice there is little improvement over brute‐force linear scan.
• E.g. a KD‐tree on 128‐dimensional SIFT features is not much faster than linear scan.
Locality sensitive hashing
• Approximate kNN
– Good enough in practice
– Can get around the curse of dimensionality
• Locality sensitive hashing
– Near feature points → (likely) same hash values
[Hash table]
Example: Random projection
• h(x) = sgn(x ∙ r), r is a random unit vector
• h(x) gives 1 bit. Repeat and concatenate.
• Prob[h(x) = h(y)] = 1 − θ(x, y) / π
[Diagram: a random hyperplane with normal r; when x and y fall on opposite sides, h(x) = 0, h(y) = 1; when they fall on the same side, h(x) = 0, h(y) = 0. Each point's concatenated bits index a hash-table bucket, e.g. 000 … 101.]
(A code sketch follows.)
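A minimal sketch of random-projection LSH as described above (the sizes and the candidate-ranking step are illustrative assumptions, not the exact system of Deng et al.):

```python
import numpy as np

rng = np.random.default_rng(0)

def signature(X, R):
    """One sign bit per random hyperplane: h(x) = sgn(x . r), repeated and concatenated."""
    return (X @ R > 0).astype(np.uint8)

d, n_bits = 128, 16
R = rng.standard_normal((d, n_bits))      # n_bits random hyperplanes
X = rng.standard_normal((50_000, d))      # database of feature vectors

buckets = {}                              # hash table: signature bytes -> point ids
for i, sig in enumerate(signature(X, R)):
    buckets.setdefault(sig.tobytes(), []).append(i)

q = X[0] + 0.05 * rng.standard_normal(d)  # a query near a stored point
cand = buckets.get(signature(q[None, :], R)[0].tobytes(), [])
# rank only the candidate bucket (a tiny fraction of the database) by exact distance
best = min(cand, key=lambda i: np.linalg.norm(X[i] - q), default=None)
print(len(cand), best)                    # best is very likely point 0
```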
Locality sensitive hashing
[A query “?” hashes to a bucket in the hash table; the retrieved NNs are the points stored in that bucket]
Locality sensitive hashing
• 1000X speed‐up with 50% recall of top 10‐NN
• 1.2M images + 1000 dimensions
[Plot: recall of L1Prod at top-10 exact NN retrieved vs. percentage of points scanned (scan cost), for “L1Prod LSH + L1Prod ranking” and “RandHP LSH + L1Prod ranking”]
10K classes, 4.5M queries, 4.5M training images
Machine learning in computer vision
• Aug 12, Lecture 1: Nearest Neighbor
– Large‐scale image classification
– Scene completion
(slides courtesy: Alyosha Efros (CMU))
[Hays and Efros. Scene Completion Using Millions of Photographs. SIGGRAPH 2007 and CACM October 2008.]
Diffusion Result

Efros and Leung result

Scene Matching for Image Completion

Scene Completion Result
Scene Matching

… 200 total

Context Matching
“The Internet is the world’s largest library.”

Nearest neighbors from a collection of 20 thousand images

Nearest neighbors from a collection of 2 million images

[Scene-completion result slides]
Hays and Efros, SIGGRAPH 2007

More Related Content

PDF
Mobile Learning Design - not just for ILIAS
PPT
P1151439345
PDF
Pixel Matching from Stereo Images (Callan seminar)
PDF
01_introduction.pdfbnmelllleitrthnjjjkkk
PPTX
introduction to machine learning education.pptx
PDF
Lecture 1: What is Machine Learning?
PPTX
Introduction to Machine Learning and AI.pptx
PPTX
introduction to machine learning and ai.pptx
Mobile Learning Design - not just for ILIAS
P1151439345
Pixel Matching from Stereo Images (Callan seminar)
01_introduction.pdfbnmelllleitrthnjjjkkk
introduction to machine learning education.pptx
Lecture 1: What is Machine Learning?
Introduction to Machine Learning and AI.pptx
introduction to machine learning and ai.pptx

Similar to Lecture1 xing fei-fei (20)

PPTX
ppt on introduction to Machine learning tools
PDF
01_introduction to machine learning algorithms and basics .pdf
PDF
ML All Chapter PDF.pdf
PDF
01_introduction_ML.pdf
PPTX
INTRO TO ML.pptx
PPTX
Machine Learning GDSC DCE Darbhanga.pptx
PDF
Seminar(Pattern Recognition)
PPTX
Machine learning
PDF
MachineLearningTomMitchell.pdf
PPT
Demystifying AI AND ml and its applications
PPT
chapter1-introduction1.ppt
PPTX
Machine learning
PDF
L1_Introduction - part 1.pdf
PPTX
Machine learning
PDF
Unit 1_Introduction to ML_Types_Applications.pdf
PPT
2.17Mb ppt
PPTX
L 8 introduction to machine learning final kirti.pptx
PPTX
machine Learning subject of third year information technology unit 1.pptx
PDF
IRJET - Review on Machine Learning
PPTX
Machine learning
ppt on introduction to Machine learning tools
01_introduction to machine learning algorithms and basics .pdf
ML All Chapter PDF.pdf
01_introduction_ML.pdf
INTRO TO ML.pptx
Machine Learning GDSC DCE Darbhanga.pptx
Seminar(Pattern Recognition)
Machine learning
MachineLearningTomMitchell.pdf
Demystifying AI AND ml and its applications
chapter1-introduction1.ppt
Machine learning
L1_Introduction - part 1.pdf
Machine learning
Unit 1_Introduction to ML_Types_Applications.pdf
2.17Mb ppt
L 8 introduction to machine learning final kirti.pptx
machine Learning subject of third year information technology unit 1.pptx
IRJET - Review on Machine Learning
Machine learning

More from Tianlu Wang (20)

PDF
L7 er2
PDF
L8 design1
PDF
L9 design2
PDF
14 pro resolution
PDF
13 propositional calculus
PDF
12 adversal search
PDF
11 alternative search
PDF
10 2 sum
PDF
22 planning
PDF
21 situation calculus
PDF
20 bayes learning
PDF
19 uncertain evidence
PDF
18 common knowledge
PDF
17 2 expert systems
PDF
17 1 knowledge-based system
PDF
16 2 predicate resolution
PDF
16 1 predicate resolution
PDF
15 predicate
PDF
09 heuristic search
PDF
08 uninformed search
L7 er2
L8 design1
L9 design2
14 pro resolution
13 propositional calculus
12 adversal search
11 alternative search
10 2 sum
22 planning
21 situation calculus
20 bayes learning
19 uncertain evidence
18 common knowledge
17 2 expert systems
17 1 knowledge-based system
16 2 predicate resolution
16 1 predicate resolution
15 predicate
09 heuristic search
08 uninformed search

Recently uploaded (20)

PDF
higher edu open stores 12.5.24 (1).pdf forreal
PDF
Renesas R-Car_Cockpit_overview210214-Gen4.pdf
PDF
Honda Dealership SNS Evaluation pdf/ppts
PPT
ACCOMPLISHMENT REPOERTS AND FILE OF GRADE 12 2021.ppt
PPT
Mettal aloys and it's application and theri composition
PPT
Your score increases as you pick a category, fill out a long description and ...
PDF
3-REasdfghjkl;[poiunvnvncncn-Process.pdf
PDF
Volvo EC290C NL EC290CNL Excavator Service Repair Manual Instant Download.pdf
PPTX
laws of thermodynamics with diagrams details
PDF
Physics class 12thstep down transformer project.pdf
PDF
EC290C NL EC290CNL Volvo excavator specs.pdf
PPTX
capstoneoooooooooooooooooooooooooooooooooo
PPTX
Paediatric History & Clinical Examination.pptx
PDF
Todays Technician Automotive Heating & Air Conditioning Classroom Manual and ...
PDF
Caterpillar CAT 311B EXCAVATOR (8GR00001-UP) Operation and Maintenance Manual...
PPTX
IMMUNITY TYPES PPT.pptx very good , sufficient
PDF
Caterpillar Cat 315C Excavator (Prefix ANF) Service Repair Manual Instant Dow...
DOCX
lp of food hygiene.docxvvvvvvvvvvvvvvvvvvvvvvv
PDF
MANDIBLE (1).pdffawffffffffffffffffffffffffffffffffffffffffff
PDF
Volvo EC290C NL EC290CNL engine Manual.pdf
higher edu open stores 12.5.24 (1).pdf forreal
Renesas R-Car_Cockpit_overview210214-Gen4.pdf
Honda Dealership SNS Evaluation pdf/ppts
ACCOMPLISHMENT REPOERTS AND FILE OF GRADE 12 2021.ppt
Mettal aloys and it's application and theri composition
Your score increases as you pick a category, fill out a long description and ...
3-REasdfghjkl;[poiunvnvncncn-Process.pdf
Volvo EC290C NL EC290CNL Excavator Service Repair Manual Instant Download.pdf
laws of thermodynamics with diagrams details
Physics class 12thstep down transformer project.pdf
EC290C NL EC290CNL Volvo excavator specs.pdf
capstoneoooooooooooooooooooooooooooooooooo
Paediatric History & Clinical Examination.pptx
Todays Technician Automotive Heating & Air Conditioning Classroom Manual and ...
Caterpillar CAT 311B EXCAVATOR (8GR00001-UP) Operation and Maintenance Manual...
IMMUNITY TYPES PPT.pptx very good , sufficient
Caterpillar Cat 315C Excavator (Prefix ANF) Service Repair Manual Instant Dow...
lp of food hygiene.docxvvvvvvvvvvvvvvvvvvvvvvv
MANDIBLE (1).pdffawffffffffffffffffffffffffffffffffffffffffff
Volvo EC290C NL EC290CNL engine Manual.pdf

Lecture1 xing fei-fei

  • 1. Machine LearningMachine Learninggg Introduction &Introduction & Nonparametric ClassifiersNonparametric ClassifiersNonparametric ClassifiersNonparametric Classifiers Eric XingEric Xing Lecture 1, August 12, 2010 © Eric Xing @ CMU, 2006-2010 Reading:
  • 2. Machine Learning  Where does it come from: Machine Learning  http://www.cs.cmu.edu/~epxing/Class/10701/  http://www.cs.cmu.edu/~epxing/Class/10708/ © Eric Xing @ CMU, 2006-2010
  • 3. LogisticsLogistics  Text book  Chris Bishop, Pattern Recognition and Machine Learning (required)  Tom Mitchell, Machine Learning  David Mackay, Information Theory, Inference, and Learning Algorithms  Daphnie Koller and Nir Friedman, Probabilistic Graphical Models  Class resource  http://bcmi sjtu edu cn/ds/ http://bcmi.sjtu.edu.cn/ds/  Host:  Lu Baoliang, Shanghai JiaoTong University  Xue Xiangyang, Fudan University  Instructors: © Eric Xing @ CMU, 2006-2010 Instructors:
  • 4. Local Hosts and co instructorsco-instructors © Eric Xing @ CMU, 2006-2010
  • 5. LogisticsLogistics  Class mailing listg  dragonstar_machinelearning@googlegroups.com  Home work Home work  Exam  Project © Eric Xing @ CMU, 2006-2010
  • 6. What is LearningWhat is Learning Learning is about seeking a predictive and/or executable understanding of natural/artificial subjects phenomena or activities from Apoptosis + Medicine natural/artificial subjects, phenomena, or activities from … Grammatical rules Manufacturing procedures Inferenceg p Natural laws … Inference © Eric Xing @ CMU, 2006-2010
  • 7. Machine LearningMachine Learning © Eric Xing @ CMU, 2006-2010
  • 8. Fetching a stapler from inside an office the Stanford STAIR robotoffice --- the Stanford STAIR robot © Eric Xing @ CMU, 2006-2010
  • 9. What is Machine Learning?What is Machine Learning? Machine Learning seeks to develop theories and computer systems for  representing;  classifying, clustering and recognizing; i d t i t reasoning under uncertainty;  predicting;  and reacting to  … complex, real world data, based on the system's own experience with data, and (hopefully) under a unified model or mathematical framework, that  can be formally characterized and analyzed  can take into account human prior knowledge  can generalize and adapt across data and domains © Eric Xing @ CMU, 2006-2010  can operate automatically and autonomously  and can be interpreted and perceived by human.
  • 10. Where Machine Learning is being used or can be useful?used or can be useful? Speech recognitionSpeech recognition Information retrievalInformation retrieval Computer visionComputer vision GamesGames Robotic controlRobotic control GamesGames © Eric Xing @ CMU, 2006-2010 PlanningPlanning EvolutionEvolution PedigreePedigree
  • 11. Natural language processing and speech recognitionspeech recognition  Now most pocket Speech Recognizers or Translators are running on some sort of learning device --- the more you play/use them, the smarter they become! © Eric Xing @ CMU, 2006-2010
  • 12. Object RecognitionObject Recognition  Behind a security camera, most likely there is a computer that is learning and/or checking! © Eric Xing @ CMU, 2006-2010
  • 13. Robotic Control IRobotic Control I  The best helicopter pilot is now a computer!p p p  it runs a program that learns how to fly and make acrobatic maneuvers by itself!  no taped instructions, joysticks, or things like … © Eric Xing @ CMU, 2006-2010 A. Ng 2005
  • 14. Text MiningText Mining  We want:  Reading, digesting, and categorizing a vast text database is too much fordatabase is too much for human! © Eric Xing @ CMU, 2006-2010
  • 15. Bioinformatics g g g g ggg g ggg g g g gg g g g g g g gg g g gg g gg g gg g cacatcgctgcgtttcggcagctaattgccttttagaaattattttcccatttcgagaaactcgtgtgggatgccggatgcggctttcaatcacttctggcccgggatcggattgggtcacattgtctgcgggctctattgtctcgatccgc ggcgcagttcgcgtgcttagcggtcagaaaggcagagattcggttcggattgatgcgctggcagcagggcacaaagatctaatgactggcaaatcgctacaaataaattaaagtccggcggctaattaatgagcggactgaagccactttgg attaaccaaaaaacagcagataaacaaaaacggcaaagaaaattgccacagagttgtcacgctttgttgcacaaacatttgtgcagaaaagtgaaaagcttttagccattattaagtttttcctcagctcgctggcagcacttgcgaatgta ctgatgttcctcataaatgaaaattaatgtttgctctacgctccaccgaactcgcttgtttgggggattggctggctaatcgcggctagatcccaggcggtataaccttttcgcttcatcagttgtgaaaccagatggctggtgttttggca cagcggactcccctcgaacgctctcgaaatcaagtggctttccagccggcccgctgggccgctcgcccactggaccggtattcccaggccaggccacactgtaccgcaccgcataatcctcgccagactcggcgctgataaggcccaatgtc actccgcaggcgtctatttatgccaaggaccgttcttcttcagctttcggctcgagtatttgttgtgccatgttggttacgatgccaatcgcggtacagttatgcaaatgagcagcgaataccgctcactgacaatgaacggcgtcttgtca tattcatgctgacattcatattcattcctttggttttttgtcttcgacggactgaaaagtgcggagagaaacccaaaaacagaagcgcgcaaagcgccgttaatatgcgaactcagcgaactcattgaagttatcacaacaccatatccata catatccatatcaatatcaatatcgctattattaacgatcatgctctgctgatcaagtattcagcgctgcgctagattcgacagattgaatcgagctcaatagactcaacagactccactcgacagatgcgcaatgccaaggacaattgccg Bioinformaticsg g g g g g g g g g g g g g g g g g g g g g g gg g g tggagtaaacgaggcgtatgcgcaacctgcacctggcggacgcggcgtatgcgcaatgtgcaattcgcttaccttctcgttgcgggtcaggaactcccagatgggaatggccgatgacgagctgatctgaatgtggaaggcgcccagcaggc aagattactttcgccgcagtcgtcatggtgtcgttgctgcttttatgttgcgtactccgcactacacggagagttcaggggattcgtgctccgtgatctgtgatccgtgttccgtgggtcaattgcacggttcggttgtgtaaccttcgtgt tctttttttttagggcccaataaaagcgcttttgtggcggcttgatagattatcacttggtttcggtggctagccaagtggctttcttctgtccgacgcacttaattgaattaaccaaacaacgagcgtggccaattcgtattatcgctgtt tacgtgtgtctcagcttgaaacgcaaaagcttgtttcacacatcggtttctcggcaagatgggggagtcagtcggtctagggagaggggcgcccaccagtcgatcacgaaaacggcgaattccaagcgaaacggaaacggagcgagcactat agtactatgtcgaacaaccgatcgcggcgatgtcagtgagtcgtcttcggacagcgctggcgctccacacgtatttaagctctgagatcggctttgggagagcgcagagagcgccatcgcacggcagagcgaaagcggcagtgagcgaaagc gagcggcagcgggtgggggatcgggagccccccgaaaaaaacagaggcgcacgtcgatgccatcggggaattggaacctcaatgtgtgggaatgtttaaatattctgtgttaggtagtgtagtttcatagactatagattctcatacagatt gagtccttcgagccgattatacacgacagcaaaatatttcagtcgcgcttgggcaaaaggcttaagcacgactcccagtccccccttacatttgtcttcctaagcccctggagccactatcaaacttgttctacgcttgcactgaaaataga accaaagtaaacaatcaaaaagaccaaaaacaataacaaccagcaccgagtcgaacatcagtgaggcattgcaaaaatttcaaagtcaagtttgcgtcgtcatcgcgtctgagtccgatcaagccgggcttgtaattgaagttgttgatgag ttactggattgtggcgaattctggtcagcatacttaacagcagcccgctaattaagcaaaataaacatatcaaattccagaatgcgacggcgccatcatcctgtttgggaattcaattcgcgggcagatcgtttaattcaattaaaaggtag aaaagggagcagaagaatgcgatcgctggaatttcctaacatcacggaccccataaatttgataagcccgagctcgctgcgttgagtcagccaccccacatccccaaatccccgccaaaagaagacagctgggttgttgactcgccagattg attgcagtggagtggacctggtcaaagaagcaccgttaatgtgctgattccattcgattccatccgggaatgcgataaagaaaggctctgatccaagcaactgcaatccggatttcgattttctctttccatttggttttgtatttacgtac aagcattctaatgaagacttggagaagacttacgttatattcagaccatcgtgcgatagaggatgagtcatttccatatggccgaaatttattatgtttactatcgtttttagaggtgttttttggacttaccaaaagaggcatttgttttc ttcaactgaaaagatatttaaattttttcttggaccattttcaaggttccggatatatttgaaacacactagctagcagtgttggtaagttacatgtatttctataatgtcatattcctttgtccgtattcaaatcgaatactccacatctc ttgtacttgaggaattggcgatcgtagcgatttcccccgccgtaaagttcctgatcctcgttgtttttgtacatcataaagtccggattctgctcgtcgccgaagatgggaacgaagctgccaaagctgagagtctgcttgaggtgctggtc 
gtcccagctggataaccttgctgtacagatcggcatctgcctggagggcacgatcgaaatccttccagtggacgaacttcacctgctcgctgggaatagcgttgttgtcaagcagctcaaggagcgtattcgagttgacgggctgcaccacg ctgctccttcgctggggattcccctgcgggtaagcgccgcttgcttggactcgtttccaaatcccatagccacgccagcagaggagtaacagagctcwhereisthegenetgattaaaaatatcctttaagaaagcccatgggtataactt actgcgtcctatgcgaggaatggtctttaggttctttatggcaaagttctcgcctcgcttgcccagccgcggtacgttcttggtgatctttaggaagaatcctggactactgtcgtctgcctggcttatggccacaagacccaccaagagcg aggactgttatgattctcatgctgatgcgactgaagcttcacctgactcctgctccacaattggtggcctttatatagcgagatccacccgcatcttgcgtggaatagaaatgcgggtgactccaggaattagcattatcgatcggaaagtg ataaaactgaactaacctgacctaaatgcctggccataattaagtgcatacatacacattacattacttacatttgtataagaactaaattttatagtacataccacttgcgtatgtaaatgcttgtcttttctcttatatacgttttataa cccagcatattttacgtaaaaacaaaacggtaatgcgaacataacttatttattggggcccggaccgcaaaccggccaaacgcgtttgcacccataaaaacataagggcaacaaaaaaattgttaagctgttgtttatttttgcaatcgaaa cgctcaaatagctgcgatcactcgggagcagggtaaagtcgcctcgaaacaggaagctgaagcatcttctataaatacactcaaagcgatcattccgaggcgagtctggttagaaatttacatggactgcaaaaaggtatagccccacaaac cacatcgctgcgtttcggcagctaattgccttttagaaattattttcccatttcgagaaactcgtgtgggatgccggatgcggctttcaatcacttctggcccgggatcggattgggtcacattgtctgcgggctctattgtctcgatccgc ggcgcagttcgcgtgcttagcggtcagaaaggcagagattcggttcggattgatgcgctggcagcagggcacaaagatctaatgactggcaaatcgctacaaataaattaaagtccggcggctaattaatgagcggactgaagccactttgg attaaccaaaaaacagcagataaacaaaaacggcaaagaaaattgccacagagttgtcacgctttgttgcacaaacatttgtgcagaaaagtgaaaagcttttagccattattaagtttttcctcagctcgctggcagcacttgcgaatgta ctgatgttcctcataaatgaaaattaatgtttgctctacgctccaccgaactcgcttgtttgggggattggctggctaatcgcggctagatcccaggcggtataaccttttcgcttcatcagttgtgaaaccagatggctggtgttttggca cagcggactcccctcgaacgctctcgaaatcaagtggctttccagccggcccgctgggccgctcgcccactggaccggtattcccaggccaggccacactgtaccgcaccgcataatcctcgccagactcggcgctgataaggcccaatgtc actccgcaggcgtctatttatgccaaggaccgttcttcttcagctttcggctcgagtatttgttgtgccatgttggttacgatgccaatcgcggtacagttatgcaaatgagcagcgaataccgctcactgacaatgaacggcgtcttgtca tattcatgctgacattcatattcattcctttggttttttgtcttcgacggactgaaaagtgcggagagaaacccaaaaacagaagcgcgcaaagcgccgttaatatgcgaactcagcgaactcattgaagttatcacaacaccatatccata catatccatatcaatatcaatatcgctattattaacgatcatgctctgctgatcaagtattcagcgctgcgctagattcgacagattgaatcgagctcaatagactcaacagactccactcgacagatgcgcaatgccaaggacaattgccg tggagtaaacgaggcgtatgcgcaacctgcacctggcggacgcggcgtatgcgcaatgtgcaattcgcttaccttctcgttgcgggtcaggaactcccagatgggaatggccgatgacgagctgatctgaatgtggaaggcgcccagcaggc aagattactttcgccgcagtcgtcatggtgtcgttgctgcttttatgttgcgtactccgcactacacggagagttcaggggattcgtgctccgtgatctgtgatccgtgttccgtgggtcaattgcacggttcggttgtgtaaccttcgtgt tctttttttttagggcccaataaaagcgcttttgtggcggcttgatagattatcacttggtttcggtggctagccaagtggctttcttctgtccgacgcacttaattgaattaaccaaacaacgagcgtggccaattcgtattatcgctgtt Wh i h ?Wh i h ?tacgtgtgtctcagcttgaaacgcaaaagcttgtttcacacatcggtttctcggcaagatgggggagtcagtcggtctagggagaggggcgcccaccagtcgatcacgaaaacggcgaattccaagcgaaacggaaacggagcgagcactat agtactatgtcgaacaaccgatcgcggcgatgtcagtgagtcgtcttcggacagcgctggcgctccacacgtatttaagctctgagatcggctttgggagagcgcagagagcgccatcgcacggcagagcgaaagcggcagtgagcgaaagc gagcggcagcgggtgggggatcgggagccccccgaaaaaaacagaggcgcacgtcgatgccatcggggaattggaacctcaatgtgtgggaatgtttaaatattctgtgttaggtagtgtagtttcatagactatagattctcatacagatt gagtccttcgagccgattatacacgacagcaaaatatttcagtcgcgcttgggcaaaaggcttaagcacgactcccagtccccccttacatttgtcttcctaagcccctggagccactatcaaacttgttctacgcttgcactgaaaataga accaaagtaaacaatcaaaaagaccaaaaacaataacaaccagcaccgagtcgaacatcagtgaggcattgcaaaaatttcaaagtcaagtttgcgtcgtcatcgcgtctgagtccgatcaagccgggcttgtaattgaagttgttgatgag 
ttactggattgtggcgaattctggtcagcatacttaacagcagcccgctaattaagcaaaataaacatatcaaattccagaatgcgacggcgccatcatcctgtttgggaattcaattcgcgggcagatcgtttaattcaattaaaaggtag aaaagggagcagaagaatgcgatcgctggaatttcctaacatcacggaccccataaatttgataagcccgagctcgctgcgttgagtcagccaccccacatccccaaatccccgccaaaagaagacagctgggttgttgactcgccagattg attgcagtggagtggacctggtcaaagaagcaccgttaatgtgctgattccattcgattccatccgggaatgcgataaagaaaggctctgatccaagcaactgcaatccggatttcgattttctctttccatttggttttgtatttacgtac Where is the gene?Where is the gene? © Eric Xing @ CMU, 2006-2010 aagcattctaatgaagacttggagaagacttacgttatattcagaccatcgtgcgatagaggatgagtcatttccatatggccgaaatttattatgtttactatcgtttttagaggtgttttttggacttaccaaaagaggcatttgttttc ttcaactgaaaagatatttaaattttttcttggaccattttcaaggttccggatatatttgaaacacactagctagcagtgttggtaagttacatgtatttctataatgtcatattcctttgtccgtattcaaatcgaatactccacatctc ttgtacttgaggaattggcgatcgtagcgatttcccccgccgtaaagttcctgatcctcgttgtttttgtacatcataaagtccggattctgctcgtcgccgaagatgggaacgaagctgccaaagctgagagtctgcttgaggtgctggtc gtcccagctggataaccttgctgtacagatcggcatctgcctggagggcacgatcgaaatccttccagtggacgaacttcacctgctcgctgggaatagcgttgttgtcaagcagctcaaggagcgtattcgagttgacgggctgcaccacg ctgctccttcgctggggattcccctgcgggtaagcgccgcttgcttggactcgtttccaaatcccatagccacgccagcagaggagtaacagagctctgaaaacagttcatggtttaaaaatatcctttaagaaagcccatgggtataactt actgcgtcctatgcgaggaatggtctttaggttctttatggcaaagttctcgcctcgcttgcccagccgcggtacgttcttggtgatctttaggaagaatcctggactactgtcgtctgcctggcttatggccacaagacccaccaagagcg aggactgttatgattctcatgctgatgcgactgaagcttcacctgactcctgctccacaattggtggcctttatatagcgagatccacccgcatcttgcgtggaatagaaatgcgggtgactccaggaattagcattatcgatcggaaagtg ataaaactgaactaacctgacctaaatgcctggccataattaagtgcatacatacacattacattacttacatttgtataagaactaaattttatagtacataccacttgcgtatgtaaatgcttgtcttttctcttatatacgttttataa
  • 16. Paradigms of Machine LearningParadigms of Machine Learning  Supervised Learningp g  Given , learn , s.t.  Unsupervised Learning  iiD YX ,  ii XY f:)f(     jjD YXnew   Unsupervised Learning  Given , learn , s.t. R i f t L i  iD X  ii XY f:)f(     jjD YXnew   Reinforcement Learning  Given  gametrace/realsimulator/rewards,,actions,envD are :policy  learn , s.t.  Active Learning rea are   ,:utility ,:policy   321 aaa ,,gamerealnew,env  © Eric Xing @ CMU, 2006-2010  Active Learning  Given , learn , s.t.)(G~ D  jD Ypolicy,),(G'all )f(and)(G'~new D
  • 17. Elements of LearningElements of Learning  Here are some important elements to consider before you start:  Task: Task:  Embedding? Classification? Clustering? Topic extraction? …  Data and other info:  Input and output (e.g., continuous, binary, counts, …) S i d i d f bl d f thi ? Supervised or unsupervised, of a blend of everything?  Prior knowledge? Bias?  Models and paradigms:  BN? MRF? Regression? SVM?  Bayesian/Frequents ? Parametric/Nonparametric?  Objective/Loss function:  MLE? MCLE? Max margin?  Log loss, hinge loss, square loss? … Log loss, hinge loss, square loss? …  Tractability and exactness trade off:  Exact inference? MCMC? Variational? Gradient? Greedy search?  Online? Batch? Distributed? E l ti © Eric Xing @ CMU, 2006-2010  Evaluation:  Visualization? Human interpretability? Perperlexity? Predictive accuracy?  It is better to consider one element at a time!
  • 18. Theories of LearningTheories of Learning For the learned F(; )  Consistency (value, pattern, …) Bi i Bias versus variance  Sample complexity  Learning rateg  Convergence  Error bound  Confidence  Stability  © Eric Xing @ CMU, 2006-2010  …
  • 19. ClassificationClassification  Representing data:p g  Hypothesis (classifier) © Eric Xing @ CMU, 2006-2010
  • 20. Decision-making as dividing a high dimensional spacehigh-dimensional space  Classification-specific Dist.: P(X|Y)p ( | ) );( )|( 1     Xp YXp ),;( 111  Xp ),;( )|( 222 2     Xp YXp  Class prior (i.e., "weight"): P(Y) © Eric Xing @ CMU, 2006-2010 p ( , g ) ( )
  • 21. The Bayes RuleThe Bayes Rule  What we have just did leads to the following generalj g g expression: )()|( YpYXP )( )()|( )|( XP YpYXP XYP  This is Bayes Rule © Eric Xing @ CMU, 2006-2010
  • 22. The Bayes Decision Rule for Minimum ErrorMinimum Error  The a posteriori probability of a samplep p y p )( )|( )|( )( )()|( )|( Xq iYXp iYXp Xp iYPiYXp XiYP i i ii ii           Bayes Test:  Likelihood Ratio: )(X  Discriminant function: )(X © Eric Xing @ CMU, 2006-2010 )(Xh
  • 23. Example of Decision RulesExample of Decision Rules  When each class is a normal …  We can write the decision boundary analytically in some © Eric Xing @ CMU, 2006-2010  We can write the decision boundary analytically in some cases … homework!!
  • 24. Bayes ErrorBayes Error  We must calculate the probability of errorp y  the probability that a sample is assigned to the wrong class  Given a datum X, what is the risk?  The Bayes error (the expected risk): The Bayes error (the expected risk): © Eric Xing @ CMU, 2006-2010
  • 25. More on Bayes ErrorMore on Bayes Error  Bayes error is the lower bound of probability of classification error  Bayes classifier is the theoretically best classifier that minimize probability of classification error  Computing Bayes error is in general a very complex problem Why? Computing Bayes error is in general a very complex problem. Why?  Density estimation:  Integrating density function: © Eric Xing @ CMU, 2006-2010  Integrating density function:
  • 26. Learning ClassifierLearning Classifier  The decision rule:  Learning strategies  Generative Learning  Discriminative Learning Discriminative Learning  Instance-based Learning (Store all past experience in memory)  A special case of nonparametric classifier © Eric Xing @ CMU, 2006-2010
  • 27. Supervised LearningSupervised Learning  K-Nearest-Neighbor Classifier: where the h(X) is represented by all the data, and by an algorithm © Eric Xing @ CMU, 2006-2010
  • 28. Recall: Vector Space Representation. Each document is a vector, one component for each term (= word):

             Doc 1   Doc 2   Doc 3   ...
    Word 1     3       0       0     ...
    Word 2     0       8       1     ...
    Word 3    12       1      10     ...
    ...        0       1       3     ...
    ...        0       0       0     ...

    Normalize each vector to unit length. High-dimensional vector space: terms are axes (10,000+ dimensions, or even 100,000+); docs are vectors in this space.
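A minimal sketch of the unit-length normalization on the toy term-count table above:

```python
import numpy as np

# Term counts from the table: rows = Doc 1..3, columns = Word 1..3.
docs = np.array([[3.0, 0.0, 12.0],
                 [0.0, 8.0, 1.0],
                 [0.0, 1.0, 10.0]])

# Normalize each document vector to unit (L2) length.
unit_docs = docs / np.linalg.norm(docs, axis=1, keepdims=True)
print(np.linalg.norm(unit_docs, axis=1))  # [1. 1. 1.]
```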
  • 29. Test Document = ? [figure: a test document among the classes Sports, Science, Arts in the vector space]
  • 33. K-Nearest Neighbor (kNN) classifier: voting kNN. [figure: Sports, Science, Arts]
  • 34. Classes in a Vector Space. [figure: Sports, Science, Arts regions]
  • 35. kNN Is Close to Optimal. Cover and Hart 1967: asymptotically, the error rate of 1-nearest-neighbor classification is less than twice the Bayes rate [the error rate of a classifier knowing the model that generated the data]. In particular, the asymptotic error rate is 0 if the Bayes rate is 0. Where does kNN come from? Nonparametric density estimation.
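For reference, the Cover–Hart asymptotic bound is usually written as follows, for M classes with Bayes error \( P^{*} \):

```latex
P^{*} \;\le\; P_{1\text{-NN}} \;\le\; P^{*}\!\left(2 - \frac{M}{M-1}\,P^{*}\right) \;\le\; 2P^{*}
```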
  • 36. Nearest-Neighbor Learning Algorithm. Learning is just storing the representations of the training examples in D. Testing instance x: compute the similarity between x and all examples in D; assign x the category of the most similar example in D (see the sketch below). Does not explicitly compute a generalization or category prototypes. Also called: case-based learning; memory-based learning; lazy learning.
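A minimal voting-kNN sketch in Python, illustrating the store-then-compare procedure above (function names and data are my own, not the course's):

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x, k=5):
    """Voting kNN: label x by the majority class among its k nearest
    training examples under Euclidean distance."""
    dists = np.linalg.norm(X_train - x, axis=1)  # distance to every stored example
    nearest = np.argsort(dists)[:k]              # indices of the k closest
    votes = Counter(y_train[i] for i in nearest)
    return votes.most_common(1)[0][0]

# Toy usage: two 2-D clusters labeled 0 and 1.
X = np.array([[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]], dtype=float)
y = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(X, y, np.array([0.5, 0.5])))   # 0
print(knn_predict(X, y, np.array([5.5, 5.5])))   # 1
```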
  • 37. kNN is an instance of Instance-Based Learning. What makes an instance-based learner? A distance metric; how many nearby neighbors to look at; a weighting function (optional); how to relate to the local points.
  • 38. Euclidean Distance Metric. \( D(x, x') = \sqrt{\sum_i (x_i - x'_i)^2} \), or equivalently \( D(x, x') = \sqrt{(x - x')^\top (x - x')} \). Other metrics: L1 norm \( \sum_i |x_i - x'_i| \); L∞ norm \( \max_i |x_i - x'_i| \) (elementwise …); Mahalanobis \( \sqrt{(x - x')^\top \Sigma^{-1} (x - x')} \), where Σ is full and symmetric; correlation; angle; Hamming distance, Manhattan distance; …
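A short sketch computing these metrics with NumPy (the vectors and the covariance Σ are made-up values for illustration):

```python
import numpy as np

x  = np.array([1.0, 2.0, 3.0])
xp = np.array([2.0, 0.0, 3.5])
d  = x - xp

euclidean = np.sqrt(d @ d)              # L2: sqrt(sum_i (x_i - x'_i)^2)
l1        = np.abs(d).sum()             # L1 (Manhattan)
linf      = np.abs(d).max()             # L-infinity
Sigma     = np.array([[2.0, 0.3, 0.0],  # a full, symmetric covariance
                      [0.3, 1.0, 0.0],  # (illustrative values)
                      [0.0, 0.0, 0.5]])
mahalanobis = np.sqrt(d @ np.linalg.solve(Sigma, d))
print(euclidean, l1, linf, mahalanobis)
```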
  • 39. Case Study: kNN for Web Classification. Dataset: 20 News Groups (20 classes); download: http://people.csail.mit.edu/jrennie/20Newsgroups/ ; 61,118 words, 18,774 documents; class label descriptions.
  • 40. Results: Binary Classes. [plots: accuracy vs. k for alt.atheism vs. comp.graphics; rec.autos vs. rec.motorcycles; comp.windows.x vs. rec.sport.baseball]
  • 41. Results: Multiple Classes. Randomly select 5 out of 20 classes, repeat 10 runs and average; also all 20 classes. [plot: accuracy vs. k]
  • 42. Is kNN ideal?
  • 43. Is kNN ideal? … more later
  • 44. Effect of Parameters.
    – Sample size: the more the better; need an efficient search algorithm for NN.
    – Dimensionality: curse of dimensionality.
    – Density: how smooth?
    – Metric: the relative scalings in the distance metric affect region shapes.
    – Weight: spurious or less relevant points need to be downweighted (see the sketch below).
    – K.
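One common choice of weighting function is to let each neighbor vote with weight inversely proportional to its distance; a sketch (my own illustration, not the course's code):

```python
import numpy as np
from collections import Counter

def weighted_knn_predict(X_train, y_train, x, k=5, eps=1e-12):
    """Distance-weighted kNN: each of the k neighbors votes with weight
    1/d, so closer (more relevant) points count more and far, possibly
    spurious neighbors are downweighted."""
    dists = np.linalg.norm(X_train - x, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = Counter()
    for i in nearest:
        votes[y_train[i]] += 1.0 / (dists[i] + eps)
    return votes.most_common(1)[0][0]
```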
  • 45. Summary. Machine learning is cool and useful!! Paradigms of machine learning; design elements of learning; theories on learning. Fundamental theory of classification: the Bayes optimal classifier. Instance-based learning: kNN, a nonparametric classifier. A nonparametric method does not rely on any assumption concerning the structure of the underlying density function; very little "learning" is involved in these methods. Good news: simple and powerful methods, flexible and easy to apply to many problems; the kNN classifier asymptotically approaches the Bayes classifier, which is theoretically the best classifier that minimizes the probability of classification error. Bad news: high memory requirements; very dependent on the scale factor for a specific problem.
  • 46. Learning Based Approaches for Visual Recognition. L. Fei-Fei, Computer Science Dept., Stanford University
  • 47. As legend goes…
  • 48. CVPR: 1985–2010
  • 49. What is vision? Real world → "forming" pictures → pixel world → "understanding" pictures.
  • 50. What is vision? "Understanding" pictures. Low-Level Vision: edges, intensity, texture, …
  • 51. What is vision? "Understanding" pictures. Mid-Level Vision (on top of Low-Level Vision): groupings of similar pixels, geometry, …
  • 52. What is vision? "Understanding" pictures. High-Level Vision (on top of Mid- and Low-Level Vision): "This is a story of love and friendship among three young hamsters. On a sunny day in the garden…"
  • 55. [Subjects' scene descriptions after brief presentation times (PT); Fei-Fei et al. JoV 2007]
    – PT = 27ms: "This was a picture with some dark splotches in it. Yeah. . . that's about it." (Subject: KM)
    – PT = 40ms: "I think I saw two people on a field." (Subject: RW)
    – PT = 67ms: "Outdoor scene. There were some kind of animals, maybe dogs or horses, in the middle of the picture. It looked like they were running in the middle of a grassy field." (Subject: IV)
    – PT = 107ms: "Two people, whose profile was toward me. Looked like they were on a field of some sort and engaged in some sort of sport (their attire suggested soccer, but it looked like there was too much contact for that)." (Subject: AI)
    – PT = 500ms: "Some kind of game or fight. Two groups of two men? The foreground pair looked like one was getting a fist in the face. Outdoors seemed like because I have an impression of grass and maybe lines on the grass? That would be why I think perhaps a game, rough game though, more like rugby than football because the pairs weren't in pads and helmets, though I did get the impression of similar clothing. Maybe some trees? in the background." (Subject: SM)
  • 56. Visual recognition. [diagram: levels of recognition tasks vs. processing time in the human visual system (~90 msec, ~150 msec, ~1 sec), for Object / Scene / Event-activity. Low-level: texture, shape, features and descriptors, segmentation, tracking. Mid-level: parts and attributes, basic-level classification, action classification, 3D geometry, geometric layout. High-level: scene understanding, activity understanding, social roles, situation, goals and intentions, functionality (human-object interaction), roles and functions, causality.]
  • 57. Visual recognition, 1990–early 2000. [same diagram, highlighting the sub-problems studied in that period]
  • 58. Visual recognition, early 2000–now. [same diagram, highlighting the sub-problems studied in that period]
  • 59. Why machine learning?
  • 60. Why machine learning? 80,000,000,000+ images; 5,000,000,000+ images; 120,000,000 videos (upload: 13 hours/min); 20Gb of images; my mom's hard drive: 220+Gb of images.
  • 62. Machine learning in computer vision:
    – Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.
    – Aug 12, Lecture 3: Neural Network. Convolutional nets for object recognition; unsupervised feature learning via Deep Belief Net.
    – Aug 13, Lecture 7: Dimensionality reduction, manifold learning. Eigen- and Fisher-faces; applications to object representation.
    – Aug 15, Lecture 13: Conditional Random Field. Image segmentation, object recognition & image annotation.
    – Aug 15 & 16, Lectures 14 + 17: Topic models. Object recognition; scene classification, image annotation, large-scale image clustering; total scene understanding.
  • 64. Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.
  • 65. Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion.
  • 68. Large-scale image classification and retrieval is an unaddressed problem in vision.
  • 69. [plot: # of clean images per category (log_10) vs. # of visual concept categories (log_10) for MSRC, Caltech101/256, PASCAL(1), LabelMe(3), Tiny Images(2), and humans] Notes: 1. Excluding the Caltech101 datasets from PASCAL. 2. Images in this dataset are not human annotated; the # of clean images per category is a rough estimation. 3. Only categories of more than 100 images are considered.
  • 72. Classification: 5-NN voting example; the query image is labeled Kangaroo. [bar chart: vote counts among the 5 nearest neighbors over the classes Antelope, Jellyfish, German Shepherd, Trombone, Kangaroo]
  • 74. kNN on 10K classes. 10K classes; 4.5M queries; 4.5M training images. Features: BOW, GIST. Deng, Berg, Li & Fei-Fei, ECCV 2010.
  • 75. How fast is kNN? Brute-force linear scan: e.g., scanning 4.5M images! Can we be faster? Yes, if feature dimensionality is low (e.g. <= 16): the K-D tree. [figure: http://graphics.stanford.edu/courses/cs368‐00‐spring/TA/manuals/CGAL/ref‐manual2/SearchStructures/kdtree.gif]
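A minimal sketch of exact NN search with a k-d tree, here via SciPy's cKDTree (my own example on random data, not the lecture's pipeline):

```python
import numpy as np
from scipy.spatial import cKDTree

# Exact nearest-neighbor queries with a k-d tree; effective in low
# dimensions, which is exactly the regime the slide describes.
rng = np.random.default_rng(0)
data = rng.random((100_000, 8))     # 100k points in 8-D
tree = cKDTree(data)

query = rng.random(8)
dist, idx = tree.query(query, k=5)  # 5 nearest neighbors
print(idx, dist)
```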
  • 76. Curse of dimensionality. For high dimensionality, in both theory and practice, there is little improvement over brute-force linear scan. E.g., a KD-tree on 128-dimensional SIFT is not much faster than linear scan.
  • 77. Locality sensitive hashing. Approximate kNN: good enough in practice; can get around the curse of dimensionality. Locality sensitive hashing: nearby feature points (likely) get the same hash values. [figure: hash table]
  • 78. Example: Random projection. \( h(x) = \mathrm{sgn}(x \cdot r) \), where r is a random unit vector; h(x) gives 1 bit; repeat and concatenate. \( \Pr[h(x) = h(y)] = 1 - \theta(x, y)/\pi \). [figure: a random hyperplane with normal r separating x and y, so h(x) = 0, h(y) = 1; hash-table buckets 000 and 101]
  • 79. Example: Random projection. \( h(x) = \mathrm{sgn}(x \cdot r) \), where r is a random unit vector; h(x) gives 1 bit; repeat and concatenate. \( \Pr[h(x) = h(y)] = 1 - \theta(x, y)/\pi \). [figure: a random hyperplane with x and y on the same side, so h(x) = 0, h(y) = 0; hash-table buckets 000 and 101]
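A sketch of signed random projections following the formula on the slides; the bit count and data are arbitrary choices for illustration:

```python
import numpy as np

# Random-projection LSH: each bit is h(x) = sgn(x . r) for a random
# unit vector r; concatenating n_bits bits gives the hash key.
rng = np.random.default_rng(0)
dim, n_bits = 128, 16
R = rng.normal(size=(n_bits, dim))
R /= np.linalg.norm(R, axis=1, keepdims=True)  # random unit vectors

def lsh_key(x):
    """Concatenate n_bits sign bits into a hash key for x."""
    return tuple((R @ x > 0).astype(int))

x = rng.normal(size=dim)
y = x + 0.05 * rng.normal(size=dim)  # a nearby point
print(lsh_key(x) == lsh_key(y))      # likely True: close points collide
```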
  • 81. Locality sensitive hashing. 1000X speed-up with 50% recall of the top 10-NN; 1.2M images + 1000 dimensions. [plots: recall of L1Prod top-10 exact NN retrieved, and percentage of exact NN retrieved, vs. scan cost (percentage of points scanned), comparing L1Prod LSH + L1Prod ranking against RandHP LSH + L1Prod ranking]
  • 83. Machine learning in computer vision. Aug 12, Lecture 1: Nearest Neighbor. Large-scale image classification; scene completion. (Slides courtesy: Alyosha Efros (CMU))
  • 86. Diffusion Result
  • 87. Efros and Leung result
  • 89. Scene Matching for Image Completion
  • 90. Scene Completion Result
  • 92. Scene Matching
  • 93. … 200 total
  • 94. Context Matching
  • 99. “The Internet is the world’s largest library.”
  • 103–108. [result images] Hays and Efros, SIGGRAPH 2007