Introduction
Generative Learning vs Discriminative Learning
Linear Discriminant Analysis
Quadratic Discriminant Analysis
GLDA and QDA, Another point of view
Naive Bayes
Lecture Summary
Statistical Pattern Recognition
Lecture 4: Bayesian Learning
Dr Zohreh Azimifar
School of Electrical and Computer Engineering
Shiraz University
Fall 2014
Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
Table of contents
1 Introduction
2 Generative Learning vs Discriminative Learning
    Generative Learning and Discriminative Learning
3 Linear Discriminant Analysis
    Gaussian Linear Discriminant Analysis
    Boundary Decision for GLDA
    Analysis of GLDA
4 Quadratic Discriminant Analysis
    Quadratic Discriminant Analysis
    Analysis of QDA
5 GLDA and QDA, Another point of view
    GLDA and QDA, Another point of view
6 Naive Bayes
    Naive Bayes
    Naive Bayes: An Example
7 Lecture Summary
    Summary
Introduction
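Classification is based on the theory of Bayesian learning: choose the class with the larger posterior probability.
$$P(y = 0 \mid X) \;\underset{y=1}{\overset{y=0}{\gtrless}}\; P(y = 1 \mid X)$$
Classification therefore involves determining $P(y \mid X)$, which can be approached from different perspectives.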
Generative Learning and Discriminative Learning
1 Discriminative Learning:
    Learns $P(y \mid X)$ directly.
    Models the decision boundary and assigns a new sample according to the side on which it falls.
    Logistic regression and softmax regression are discriminative learners.
2 Generative Learning:
    Models each class explicitly and separately.
    Compares a new sample against each class model using Bayes' rule:
$$P(y \mid X) = \frac{\overbrace{P(X \mid y)}^{\text{likelihood}} \; \overbrace{P(y)}^{\text{prior}}}{\underbrace{P(X)}_{\text{normalizing factor}}}$$
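As a small numeric illustration of Bayes' rule, the sketch below computes the posterior for a two-class problem. The prior and likelihood values are made-up numbers chosen for illustration, not values from the lecture.

```python
# Hypothetical numbers for a two-class problem (assumed, not from the lecture).
prior = {0: 0.7, 1: 0.3}         # P(y)
likelihood = {0: 0.2, 1: 0.6}    # P(X | y), evaluated at one fixed sample X

# Normalizing factor: P(X) = sum over y of P(X | y) P(y)
evidence = sum(likelihood[y] * prior[y] for y in prior)

# Posterior P(y | X) from Bayes' rule
posterior = {y: likelihood[y] * prior[y] / evidence for y in prior}

# Decision rule: pick the class with the larger posterior
y_hat = max(posterior, key=posterior.get)
print(posterior, y_hat)
```

Although class 0 has the larger prior here, the likelihood favours class 1 strongly enough that the posterior picks class 1.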
Gaussian Linear Discriminant Analysis
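Model $P(y)$ and $P(X \mid y)$ for each class $y$:
$$P(y) = \phi_1^{1\{y=1\}} \, \phi_2^{1\{y=2\}} \cdots \phi_c^{1\{y=c\}}$$
$$P(X \mid y = i) = \frac{1}{(2\pi)^{n/2} \, |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma^{-1} (X - \mu_i)\right)$$
Parameter set: $\theta = \{\phi_1, \phi_2, \ldots, \phi_c, \mu_1, \mu_2, \ldots, \mu_c, \Sigma\}$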
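The class-conditional density can be evaluated numerically as in the following minimal sketch, which assumes NumPy is available. The function name `gaussian_density` and the test values are illustrative choices, not part of the lecture.

```python
import numpy as np

def gaussian_density(x, mu, sigma):
    """Multivariate normal density N(x; mu, sigma) for a single sample x."""
    n = x.shape[0]
    diff = x - mu
    # Solve sigma * z = diff instead of forming the explicit inverse.
    maha = diff @ np.linalg.solve(sigma, diff)
    norm = (2.0 * np.pi) ** (n / 2.0) * np.sqrt(np.linalg.det(sigma))
    return np.exp(-0.5 * maha) / norm

# Illustrative values: a standard 2-D Gaussian evaluated at its mean.
mu = np.array([0.0, 0.0])
sigma = np.eye(2)
p0 = gaussian_density(np.array([0.0, 0.0]), mu, sigma)
print(p0)
```

For a standard Gaussian in two dimensions the density at the mean is $1/(2\pi)$, which gives a quick sanity check on the normalizing constant.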
Estimate the parameters $\theta = \{\phi_1, \phi_2, \ldots, \phi_c, \mu_1, \mu_2, \ldots, \mu_c, \Sigma\}$ by maximizing the joint log-likelihood of the $m$ training samples:
$$l(\theta) = \log \prod_{j=1}^{m} P(X^{(j)}, y^{(j)}) = \log \prod_{j=1}^{m} P(X^{(j)} \mid y^{(j)}) \, P(y^{(j)})$$
Take the partial derivative with respect to each parameter and set it to zero:
$$\phi_i^{MLE} = \frac{\sum_{j=1}^{m} 1\{y^{(j)} = i\}}{m}$$
$$\mu_i^{MLE} = \frac{\sum_{j=1}^{m} 1\{y^{(j)} = i\} \, X^{(j)}}{\sum_{j=1}^{m} 1\{y^{(j)} = i\}}$$
$$\Sigma^{MLE} = \frac{1}{m} \sum_{j=1}^{m} (X^{(j)} - \mu_{y^{(j)}})(X^{(j)} - \mu_{y^{(j)}})^T$$
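The three closed-form estimates translate almost line by line into NumPy. The sketch below is one possible implementation under stated assumptions: the function name `fit_glda` is hypothetical, classes are labelled 0 to c-1 rather than 1 to c, and the four training points are toy data.

```python
import numpy as np

def fit_glda(X, y, num_classes):
    """Closed-form MLE for priors, class means, and the shared covariance."""
    m, n = X.shape
    phi = np.zeros(num_classes)
    mu = np.zeros((num_classes, n))
    for i in range(num_classes):
        mask = (y == i)
        phi[i] = mask.mean()            # fraction of samples in class i
        mu[i] = X[mask].mean(axis=0)    # average of the samples in class i
    centered = X - mu[y]                # subtract each sample's own class mean
    sigma = centered.T @ centered / m   # pooled covariance, divided by m
    return phi, mu, sigma

# Toy data (assumed for illustration): two classes, two samples each.
X = np.array([[0.0, 0.0], [2.0, 0.0], [4.0, 4.0], [6.0, 4.0]])
y = np.array([0, 0, 1, 1])
phi, mu, sigma = fit_glda(X, y, 2)
print(phi, mu, sigma, sep="\n")
```

Note that the covariance is pooled over all classes and divided by $m$, matching the maximum-likelihood formula rather than the unbiased estimator.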
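Determine the class label of a new sample $X^{new}$:
$$y^{new} = \arg\max_y P(y \mid X^{new}) = \arg\max_y \frac{P(X^{new} \mid y) P(y)}{P(X^{new})} = \arg\max_y P(X^{new} \mid y) P(y)$$
The normalizing factor $P(X^{new})$ does not depend on $y$, so it can be dropped from the maximization.
Note that $P(X \mid y)$ is a class-dependent density.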
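The argmax rule can be sketched in code by comparing log scores, which avoids underflow and lets the terms common to all classes be dropped. The function name `predict`, the 0-based class labels, and the parameter values are assumptions made for this example.

```python
import numpy as np

def predict(x, phi, mu, sigma):
    """Return argmax over y of log P(x | y) + log P(y) with a shared covariance."""
    sigma_inv = np.linalg.inv(sigma)
    scores = []
    for i in range(len(phi)):
        diff = x - mu[i]
        # Terms shared by all classes (the Gaussian normalizer and P(X)) are dropped.
        scores.append(-0.5 * diff @ sigma_inv @ diff + np.log(phi[i]))
    return int(np.argmax(scores))

# Illustrative parameters (assumed): equal priors, identity covariance.
phi = np.array([0.5, 0.5])
mu = np.array([[0.0, 0.0], [4.0, 0.0]])
sigma = np.eye(2)
label_a = predict(np.array([1.0, 0.0]), phi, mu, sigma)
label_b = predict(np.array([3.0, 1.0]), phi, mu, sigma)
print(label_a, label_b)
```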
Boundary Decision for GLDA
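The decision boundary is a line, a plane, or a hyperplane. Why? On the boundary between classes $i$ and $j$ the two posteriors are equal:
$$P(y = i \mid X) = P(y = j \mid X)$$
$$\frac{P(X \mid y = i) P(y = i)}{P(X)} = \frac{P(X \mid y = j) P(y = j)}{P(X)}$$
$$P(X \mid y = i) P(y = i) = P(X \mid y = j) P(y = j)$$
$$\frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma^{-1} (X - \mu_i)\right) P(y = i) = \frac{1}{(2\pi)^{n/2} |\Sigma|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_j)^T \Sigma^{-1} (X - \mu_j)\right) P(y = j)$$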
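Decision boundary, continued. The common factor $(2\pi)^{n/2} |\Sigma|^{1/2}$ cancels on both sides:
$$\exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma^{-1} (X - \mu_i)\right) P(y = i) = \exp\left(-\frac{1}{2}(X - \mu_j)^T \Sigma^{-1} (X - \mu_j)\right) P(y = j)$$
Taking logarithms:
$$-\frac{1}{2}(X - \mu_i)^T \Sigma^{-1} (X - \mu_i) + \log P(y = i) = -\frac{1}{2}(X - \mu_j)^T \Sigma^{-1} (X - \mu_j) + \log P(y = j)$$
$$\Rightarrow \log \frac{P(y = i)}{P(y = j)} - \frac{1}{2}\left[X^T \Sigma^{-1} X - 2 X^T \Sigma^{-1} \mu_i + \mu_i^T \Sigma^{-1} \mu_i\right] + \frac{1}{2}\left[X^T \Sigma^{-1} X - 2 X^T \Sigma^{-1} \mu_j + \mu_j^T \Sigma^{-1} \mu_j\right] = 0$$
The quadratic term $X^T \Sigma^{-1} X$ cancels because both classes share the same $\Sigma$, leaving an equation that is linear in $X$:
$$\Rightarrow \underbrace{X^T \Sigma^{-1} (\mu_i - \mu_j)}_{a^T X} \; \underbrace{- \frac{1}{2} \mu_i^T \Sigma^{-1} \mu_i + \frac{1}{2} \mu_j^T \Sigma^{-1} \mu_j + \log \frac{P(y = i)}{P(y = j)}}_{b} = 0$$
where $a = \Sigma^{-1} (\mu_i - \mu_j)$.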
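To make the linear form concrete, the sketch below computes the hyperplane coefficients from the expanded equation: after $X^T \Sigma^{-1} X$ cancels, the linear part is $a = \Sigma^{-1}(\mu_i - \mu_j)$ and the constant is $b = -\frac{1}{2}\mu_i^T \Sigma^{-1} \mu_i + \frac{1}{2}\mu_j^T \Sigma^{-1} \mu_j + \log \frac{P(y=i)}{P(y=j)}$. The function name `linear_boundary` and the numeric values are assumptions for illustration.

```python
import numpy as np

def linear_boundary(mu_i, mu_j, sigma, p_i, p_j):
    """Coefficients (a, b) of the boundary a^T x + b = 0 between classes i and j."""
    sigma_inv = np.linalg.inv(sigma)
    a = sigma_inv @ (mu_i - mu_j)
    b = (-0.5 * mu_i @ sigma_inv @ mu_i
         + 0.5 * mu_j @ sigma_inv @ mu_j
         + np.log(p_i / p_j))
    return a, b

# Illustrative case (assumed): identity covariance and equal priors.
a, b = linear_boundary(np.array([0.0, 0.0]), np.array([4.0, 0.0]),
                       np.eye(2), 0.5, 0.5)
midpoint = np.array([2.0, 0.0])
print(a, b, a @ midpoint + b)
```

With equal priors and an identity covariance, the boundary passes through the midpoint of the two means and is perpendicular to the line joining them, which is what the check on `midpoint` confirms.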
Analysis of GLDA when $\Sigma = \sigma^2 I$
Classes share the same distribution shape but have different means.
Cross-sections of the class densities are spherical.
The decision boundary is linear.
This is called the nearest-Euclidean-distance (to the class mean) classifier. When?
$$\Sigma = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
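One way to explore the question above: with $\Sigma = \sigma^2 I$ the score $\log P(X \mid y = i) + \log P(y = i)$ reduces, up to a constant, to $-\lVert X - \mu_i \rVert^2 / (2\sigma^2) + \log \phi_i$, so with equal priors the rule picks the class mean nearest in Euclidean distance. The sketch below checks this on hypothetical means and priors (not from the lecture) and shows that sufficiently unequal priors can override the nearest mean.

```python
import numpy as np

def nearest_mean(x, mu):
    """Index of the class mean closest to x in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(mu - x, axis=1)))

def bayes_label(x, mu, phi, sigma2):
    """argmax of the log posterior score when Sigma = sigma2 * I."""
    scores = -np.sum((mu - x) ** 2, axis=1) / (2.0 * sigma2) + np.log(phi)
    return int(np.argmax(scores))

# Illustrative values (assumed): x is slightly closer to the first mean.
mu = np.array([[0.0, 0.0], [3.0, 0.0]])
x = np.array([1.4, 0.0])
closest = nearest_mean(x, mu)
equal = bayes_label(x, mu, np.array([0.5, 0.5]), 1.0)
skewed = bayes_label(x, mu, np.array([0.1, 0.9]), 1.0)
print(closest, equal, skewed)
```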
Analysis of GLDA when $\Sigma$ is not the identity
Classes share the same distribution shape but have different means.
Cross-sections of the class densities are ellipsoidal.
The decision boundary is linear.
$$\Sigma = \begin{bmatrix} 0.5 & 0 \\ 0 & 1.5 \end{bmatrix}$$
Analysis of GLDA with arbitrary $\Sigma$
Classes share the same distribution shape but have different means.
Cross-sections of the class densities are ellipsoidal, and the decision boundary is linear.
The class distributions are aligned with the directions of the covariance eigenvectors.
$$\Sigma = \begin{bmatrix} 0.5 & 0.2 \\ 0.2 & 1.5 \end{bmatrix}$$
This is called the nearest-Mahalanobis-distance (to the class mean) classifier. When?
$$\mathrm{Dist}(X, \mu_i) = (X - \mu_i)^T \Sigma^{-1} (X - \mu_i)$$
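The distance above can be computed as in the following sketch, which uses the covariance matrix from the slide. The function name `mahalanobis_sq`, the zero mean, and the two test points are assumptions for illustration. The quantity is what is usually called the squared Mahalanobis distance, hence the name.

```python
import numpy as np

sigma = np.array([[0.5, 0.2], [0.2, 1.5]])   # covariance from the slide

def mahalanobis_sq(x, mu, sigma):
    """(x - mu)^T Sigma^{-1} (x - mu), computed without an explicit inverse."""
    diff = x - mu
    return float(diff @ np.linalg.solve(sigma, diff))

# Illustrative points (assumed): unit steps along each axis from a zero mean.
mu = np.zeros(2)
d_axis1 = mahalanobis_sq(np.array([1.0, 0.0]), mu, sigma)
d_axis2 = mahalanobis_sq(np.array([0.0, 1.0]), mu, sigma)
print(d_axis1, d_axis2)
```

Both points are at Euclidean distance 1 from the mean, but the step along the first axis, where the variance is smaller, counts as farther under this distance.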
Quadratic Discriminant Analysis
Another generative learning model; a Bayesian classifier.
Here the class labels follow a multinomial distribution, and the likelihoods are multivariate Gaussians with a separate covariance $\Sigma_i$ for each class.
The decision boundary becomes non-linear. Why?
$$P(X \mid y = i) = \frac{1}{(2\pi)^{n/2} \, |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right)$$
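The decision boundary is quadratic (in two dimensions a conic section, such as a parabola, ellipse, or hyperbola). On the boundary between classes $i$ and $j$:
$$P(y = i \mid X) = P(y = j \mid X)$$
$$\frac{P(X \mid y = i) P(y = i)}{P(X)} = \frac{P(X \mid y = j) P(y = j)}{P(X)}$$
$$P(X \mid y = i) P(y = i) = P(X \mid y = j) P(y = j)$$
$$\frac{1}{(2\pi)^{n/2} |\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right) P(y = i) = \frac{1}{(2\pi)^{n/2} |\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_j)^T \Sigma_j^{-1} (X - \mu_j)\right) P(y = j)$$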
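Decision boundary, continued. The common factor $(2\pi)^{n/2}$ cancels on both sides:
$$\frac{1}{|\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i)\right) P(y = i) = \frac{1}{|\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(X - \mu_j)^T \Sigma_j^{-1} (X - \mu_j)\right) P(y = j)$$
Taking logarithms:
$$-\frac{1}{2} \log |\Sigma_i| - \frac{1}{2}(X - \mu_i)^T \Sigma_i^{-1} (X - \mu_i) + \log P(y = i) = -\frac{1}{2} \log |\Sigma_j| - \frac{1}{2}(X - \mu_j)^T \Sigma_j^{-1} (X - \mu_j) + \log P(y = j)$$
$$\Rightarrow \log \frac{P(y = i)}{P(y = j)} - \frac{1}{2} \log \frac{|\Sigma_i|}{|\Sigma_j|} - \frac{1}{2}\left[X^T \Sigma_i^{-1} X + \mu_i^T \Sigma_i^{-1} \mu_i - 2 X^T \Sigma_i^{-1} \mu_i - X^T \Sigma_j^{-1} X - \mu_j^T \Sigma_j^{-1} \mu_j + 2 X^T \Sigma_j^{-1} \mu_j\right] = 0$$
Because $\Sigma_i \neq \Sigma_j$ in general, the terms $X^T \Sigma_i^{-1} X$ and $X^T \Sigma_j^{-1} X$ do not cancel, so the boundary is quadratic in $X$.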
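The effect of separate covariances can be seen in a small numeric sketch. Two hypothetical classes (chosen for illustration, not taken from the lecture) share the same mean but have covariances $I$ and $4I$; the function names `qda_score` and `g` are likewise assumptions. The boundary $g(X) = 0$ then works out to a circle, which no linear classifier could produce.

```python
import numpy as np

def qda_score(x, mu, sigma, prior):
    """log P(x | y) + log P(y), up to the constant -(n/2) log(2 pi)."""
    diff = x - mu
    _, logdet = np.linalg.slogdet(sigma)
    return -0.5 * logdet - 0.5 * diff @ np.linalg.solve(sigma, diff) + np.log(prior)

# Hypothetical classes: same mean, different covariances, equal priors.
mu_i, sigma_i = np.array([0.0, 0.0]), np.eye(2)
mu_j, sigma_j = np.array([0.0, 0.0]), 4.0 * np.eye(2)

def g(x):
    """Positive where class i wins, negative where class j wins."""
    return qda_score(x, mu_i, sigma_i, 0.5) - qda_score(x, mu_j, sigma_j, 0.5)

# For this setup g(x) = 2 log 2 - (3/8) ||x||^2, so the boundary is a circle.
radius = np.sqrt(16.0 / 3.0 * np.log(2.0))
print(g(np.array([0.0, 0.0])), g(np.array([3.0, 0.0])), g(np.array([-3.0, 0.0])))
```

Class $i$ wins inside the circle and class $j$ wins outside it on every side.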
More Related Content

PDF
A method for constructing fuzzy test statistics with application
PDF
Pattern-based classification of demographic sequences
PDF
A lattice-based consensus clustering
PDF
Linear models2
PDF
MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...
PDF
PDF
Numerical Evidence for Darmon Points
PDF
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...
A method for constructing fuzzy test statistics with application
Pattern-based classification of demographic sequences
A lattice-based consensus clustering
Linear models2
MUMS Undergraduate Workshop - A Biased Introduction to Global Sensitivity Ana...
Numerical Evidence for Darmon Points
Wei Yang - 2015 - Sampling-based Alignment and Hierarchical Sub-sentential Al...

What's hot (10)

PDF
A STUDY ON L-FUZZY NORMAL SUBl -GROUP
PDF
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
PDF
RGA of Cayley graphs
PDF
Introduction to second gradient theory of elasticity - Arjun Narayanan
PDF
On the proof theory for Description Logics
PDF
A new generalized lindley distribution
PDF
A Gentle Introduction to Bayesian Nonparametrics
PPTX
Lec 17
PDF
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
PDF
A Gentle Introduction to Bayesian Nonparametrics
A STUDY ON L-FUZZY NORMAL SUBl -GROUP
Bayesian Nonparametric Topic Modeling Hierarchical Dirichlet Processes
RGA of Cayley graphs
Introduction to second gradient theory of elasticity - Arjun Narayanan
On the proof theory for Description Logics
A new generalized lindley distribution
A Gentle Introduction to Bayesian Nonparametrics
Lec 17
Linear Discriminant Analysis (LDA) Under f-Divergence Measures
A Gentle Introduction to Bayesian Nonparametrics
Ad

Similar to Dr azimifar pattern recognition lect4 (20)

PDF
Model complexity
PDF
Reading "Bayesian measures of model complexity and fit"
PDF
Nec 602 unit ii Random Variables and Random process
PDF
DIC
PDF
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
PDF
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
PDF
Type-based Dependency Analysis
PDF
Hypothesis testings on individualized treatment rules
PDF
Workshop in honour of Don Poskitt and Gael Martin
PDF
asymptotics of ABC
PDF
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
PDF
Laplace's Demon: seminar #1
PDF
graphicascaaaaaaaaaa dsaaaaal models.pdf
PDF
PAWL - GPU meeting @ Warwick
PDF
Improved semdefinite programming for entanglement
PDF
Self-organizing Network for Variable Clustering and Predictive Modeling
PDF
Gradient Estimation Using Stochastic Computation Graphs
PDF
prior selection for mixture estimation
PDF
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
PDF
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Model complexity
Reading "Bayesian measures of model complexity and fit"
Nec 602 unit ii Random Variables and Random process
DIC
RSS Annual Conference, Newcastle upon Tyne, Sept. 03, 2013
Deep Learning Opening Workshop - Horseshoe Regularization for Machine Learnin...
Type-based Dependency Analysis
Hypothesis testings on individualized treatment rules
Workshop in honour of Don Poskitt and Gael Martin
asymptotics of ABC
PMED Opening Workshop - Inference on Individualized Treatment Rules from Obse...
Laplace's Demon: seminar #1
graphicascaaaaaaaaaa dsaaaaal models.pdf
PAWL - GPU meeting @ Warwick
Improved semdefinite programming for entanglement
Self-organizing Network for Variable Clustering and Predictive Modeling
Gradient Estimation Using Stochastic Computation Graphs
prior selection for mixture estimation
MUMS: Bayesian, Fiducial, and Frequentist Conference - Spatially Informed Var...
Accelerating Pseudo-Marginal MCMC using Gaussian Processes
Ad

More from Zahra Amini (12)

PDF
Dip azimifar enhancement_l05_2020
PDF
Dip azimifar enhancement_l04_2020
PDF
Dip azimifar enhancement_l03_2020
PDF
Dip azimifar enhancement_l02_2020
PDF
Dip azimifar enhancement_l01_2020
PDF
Ch 1-3 nn learning 1-7
PDF
Ch 1-2 NN classifier
PDF
Ch 1-1 introduction
PDF
Lecture 8
PDF
Kernel estimation(ref)
PDF
Dr azimifar pattern recognition lect2
PDF
Dr azimifar pattern recognition lect1
Dip azimifar enhancement_l05_2020
Dip azimifar enhancement_l04_2020
Dip azimifar enhancement_l03_2020
Dip azimifar enhancement_l02_2020
Dip azimifar enhancement_l01_2020
Ch 1-3 nn learning 1-7
Ch 1-2 NN classifier
Ch 1-1 introduction
Lecture 8
Kernel estimation(ref)
Dr azimifar pattern recognition lect2
Dr azimifar pattern recognition lect1

Recently uploaded (20)

PPTX
CYBER-CRIMES AND SECURITY A guide to understanding
PDF
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
DOCX
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
PPTX
Fundamentals of safety and accident prevention -final (1).pptx
PPTX
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
PDF
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
PPT
Project quality management in manufacturing
PPT
Mechanical Engineering MATERIALS Selection
PPTX
OOP with Java - Java Introduction (Basics)
PPTX
Sustainable Sites - Green Building Construction
PDF
Operating System & Kernel Study Guide-1 - converted.pdf
PDF
Automation-in-Manufacturing-Chapter-Introduction.pdf
DOCX
573137875-Attendance-Management-System-original
PDF
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
PPTX
Artificial Intelligence
PDF
III.4.1.2_The_Space_Environment.p pdffdf
PPTX
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
PPTX
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
PDF
Well-logging-methods_new................
PDF
Model Code of Practice - Construction Work - 21102022 .pdf
CYBER-CRIMES AND SECURITY A guide to understanding
Enhancing Cyber Defense Against Zero-Day Attacks using Ensemble Neural Networks
ASol_English-Language-Literature-Set-1-27-02-2023-converted.docx
Fundamentals of safety and accident prevention -final (1).pptx
MET 305 2019 SCHEME MODULE 2 COMPLETE.pptx
Evaluating the Democratization of the Turkish Armed Forces from a Normative P...
Project quality management in manufacturing
Mechanical Engineering MATERIALS Selection
OOP with Java - Java Introduction (Basics)
Sustainable Sites - Green Building Construction
Operating System & Kernel Study Guide-1 - converted.pdf
Automation-in-Manufacturing-Chapter-Introduction.pdf
573137875-Attendance-Management-System-original
keyrequirementskkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkkk
Artificial Intelligence
III.4.1.2_The_Space_Environment.p pdffdf
Engineering Ethics, Safety and Environment [Autosaved] (1).pptx
Infosys Presentation by1.Riyan Bagwan 2.Samadhan Naiknavare 3.Gaurav Shinde 4...
Well-logging-methods_new................
Model Code of Practice - Construction Work - 21102022 .pdf

Dr azimifar pattern recognition lect4

  • 1. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Statistical Pattern Recognition Lecture4 Bayesian Learning Dr Zohreh Azimifar School of Electrical and Computer Engineering Shiraz University Fall2014 Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 1 / 22
  • 2. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Table of contents 1 Introduction 2 Generative Learning vs Discriminative Learning Generative Learning and Discriminative Learning 3 Linear Discriminant Analysis Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA 4 Quadratic Discriminant Analysis Quadratic Discriminant Analysis Analysis of QDA 5 GLAD and QDA, Another point of view GLAD and QDA, Another point of view 6 Naive Bayes Naive Bayes Naive Bayes: An Example 7 Lecture Summary Summary Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 2 / 22
  • 3. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Introduction Classification based on the theory Bayesian Learning P(y = 0|X) ≷y=0 y=1 P(y = 1|X) Classification involves determining P(y|X), from different perspectives. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 3 / 22
  • 4. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Generative Learning and Discriminative Learning Generative Learning and Discriminative Learning 1 Discriminative Learning: Direct learning of P(y|X). Modelling of decision boundary, to which side a new sample is assigned. Logistic and softmax regression are called discriminative learners. 2 Generative Learning Explicit modelling of each class separately. Compare new sample with each class probability, based on Bayesian rule: P(y|X) = Likelihood z }| { P(X|y) Prior z }| { P(y) P(X) | {z } normalizing factor Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 4 / 22
  • 5. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Model P(y) and P(X|y) for each class y: P(y) = φ 1{y=1} 1 φ 1{y=2} 2 · · · φ1{y=c} c P(X|y = i) = 1 ( √ 2π)n|Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi )) Parameter set: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 5 / 22
  • 6. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 7. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 8. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 9. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Estimate parameters: θ = {φ1, φ2, ..., φc , µ1, µ2, ..., µc , Σ} l(θ) = log m Y j=1 P(X(j) , y(j) ) = log m Y j=1 P(X(j) |P(y(j) ))P(y(j) ) Take partial derivative in terms of each individual parameter: φMLE i = Pm j=1 1{y(j) = i} m µMLE i = Pm j=1 1{y(j) = i}X(j) Pm j=1 1{y(j) = i} ΣMLE = 1 m m X j=1 (X(j) − µy(j) )(X(j) − µy(j) )T Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 6 / 22
  • 10. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Gaussian Linear Discriminant Analysis Determine class label of a new sample Xnew : ynew = argmaxy P(y|X) = argmaxy P(X|y)P(y) P(X) = argmaxy P(X|y)P(y) Note that P(X|y) is a class dependent density. Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 7 / 22
  • 11. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 12. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 13. Introduction Generative Learning vs Discriminative Learning Linear Discriminant Analysis Quadratic Discriminant Analysis GLAD and QDA, Another point of view Naive Bayes Lecture Summary Gaussian Linear Discriminant Analysis Boundary Decision for GLDA Analysis of GLDA Boundary Decision for GLDA Decision boundary is a line, a plane, or a hyper-plane. Why? P(y = i|X) = P(y = j|X) P(X|y = i)P(y = i) P(X) = P(X|y = j)P(y = j) P(X) P(X|y = i)P(y = i) = P(X|y = j)P(y = j) 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µi )T Σ−1 (X − µi ))P(y = i) = 1 (2π) n 2 |Σ| 1 2 exp( −1 2 (X − µj )T Σ−1 (X − µj ))P(y = j) Dr Zohreh Azimifar, 2014 Statistical Pattern Recognition Lecture4 Bayesian Learning 8 / 22
  • 17. Boundary Decision for GLDA
    Decision boundary, continued:
    $$\exp\left(-\frac{1}{2}(X-\mu_i)^T\Sigma^{-1}(X-\mu_i)\right)P(y=i) = \exp\left(-\frac{1}{2}(X-\mu_j)^T\Sigma^{-1}(X-\mu_j)\right)P(y=j)$$
    $$-\frac{1}{2}(X-\mu_i)^T\Sigma^{-1}(X-\mu_i) + \log P(y=i) = -\frac{1}{2}(X-\mu_j)^T\Sigma^{-1}(X-\mu_j) + \log P(y=j)$$
    $$\Rightarrow \log\frac{P(y=i)}{P(y=j)} - \frac{1}{2}\left[X^T\Sigma^{-1}X - 2X^T\Sigma^{-1}\mu_i + \mu_i^T\Sigma^{-1}\mu_i\right] + \frac{1}{2}\left[X^T\Sigma^{-1}X - 2X^T\Sigma^{-1}\mu_j + \mu_j^T\Sigma^{-1}\mu_j\right] = 0$$
    $$\Rightarrow \underbrace{X^T\Sigma^{-1}(\mu_i-\mu_j)}_{a^T X} \underbrace{- \frac{1}{2}\mu_i^T\Sigma^{-1}\mu_i + \frac{1}{2}\mu_j^T\Sigma^{-1}\mu_j + \log\frac{P(y=i)}{P(y=j)}}_{b} = 0$$
    The quadratic terms $X^T\Sigma^{-1}X$ cancel because both classes share $\Sigma$, so the boundary $a^T X + b = 0$ is linear in $X$.
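As a quick numerical check of the linear boundary, the sketch below computes $a = \Sigma^{-1}(\mu_i - \mu_j)$ and $b = -\frac{1}{2}\mu_i^T\Sigma^{-1}\mu_i + \frac{1}{2}\mu_j^T\Sigma^{-1}\mu_j + \log\frac{P(y=i)}{P(y=j)}$, which is what expanding the two quadratic forms yields. The function name `glda_boundary`, the means, and the priors are assumptions made for illustration. With equal priors the midpoint of the two means should lie on the boundary, and each mean should fall on its own side.

```python
import numpy as np

def glda_boundary(mu_i, mu_j, cov, prior_i, prior_j):
    """Coefficients (a, b) of the shared-covariance boundary a^T x + b = 0.

    a^T x + b > 0 favours class i, a^T x + b < 0 favours class j.
    """
    cov_inv_mu_i = np.linalg.solve(cov, mu_i)
    cov_inv_mu_j = np.linalg.solve(cov, mu_j)
    a = cov_inv_mu_i - cov_inv_mu_j
    b = (-0.5 * mu_i @ cov_inv_mu_i
         + 0.5 * mu_j @ cov_inv_mu_j
         + np.log(prior_i / prior_j))
    return a, b

# Hypothetical means and priors; the covariance is an arbitrary shared matrix.
mu_i = np.array([0.0, 0.0])
mu_j = np.array([2.0, 2.0])
cov = np.array([[0.5, 0.2],
                [0.2, 1.5]])
a, b = glda_boundary(mu_i, mu_j, cov, 0.5, 0.5)

midpoint = 0.5 * (mu_i + mu_j)
boundary_value_at_midpoint = a @ midpoint + b
side_of_mu_i = a @ mu_i + b
side_of_mu_j = a @ mu_j + b
```

Unequal priors only change $b$, which shifts the hyperplane toward the less likely class without rotating it.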
  • 18. Analysis of GLDA when $\Sigma = \sigma^2 I$
    Classes have identically shaped distributions but different means.
    Cross-sections of the class distributions are spherical.
    The decision boundary is linear.
    Called the classifier with nearest Euclidean distance to the class mean. When?
    $$\Sigma = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
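One way to explore the "when?" on this slide is to compare the Bayes rule for $\Sigma = \sigma^2 I$ with a plain nearest-mean rule. The sketch below assumes three hypothetical class means and uses made-up function names; it suggests that the two rules agree when the class priors are equal, and that a strongly skewed prior can override the Euclidean distance.

```python
import numpy as np

def nearest_mean_label(x, means):
    """Index of the class mean closest to x in Euclidean distance."""
    distances = np.linalg.norm(means - x, axis=1)
    return int(np.argmin(distances))

def spherical_discriminant_label(x, means, sigma2, priors):
    """argmax_y of -||x - mu_y||^2 / (2 sigma^2) + log P(y), i.e. Bayes with Sigma = sigma^2 I."""
    scores = -0.5 * np.sum((means - x) ** 2, axis=1) / sigma2 + np.log(priors)
    return int(np.argmax(scores))

# Hypothetical class means (not from the lecture).
means = np.array([[0.0, 0.0], [4.0, 0.0], [0.0, 4.0]])
equal_priors = np.full(3, 1.0 / 3.0)

rng = np.random.default_rng(0)
samples = rng.normal(scale=3.0, size=(200, 2))
agree = all(
    nearest_mean_label(x, means)
    == spherical_discriminant_label(x, means, 1.0, equal_priors)
    for x in samples
)

# With a skewed prior the Bayes label can differ from the nearest mean.
x = np.array([2.1, 0.0])
skewed_priors = np.array([0.9, 0.05, 0.05])
euclidean_label = nearest_mean_label(x, means)
bayes_label = spherical_discriminant_label(x, means, 1.0, skewed_priors)
```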
  • 19. Analysis of GLDA when $\Sigma$ is not identity
    Classes have identically shaped distributions but different means.
    Cross-sections of the class distributions are ellipsoidal.
    The decision boundary is linear.
    $$\Sigma = \begin{pmatrix} 0.5 & 0 \\ 0 & 1.5 \end{pmatrix}$$
  • 21. Analysis of GLDA with arbitrary $\Sigma$
    Classes have identically shaped distributions but different means.
    Cross-sections of the class distributions are ellipsoidal.
    The decision boundary is linear.
    The class contours are aligned with the directions of the covariance eigenvectors.
    $$\Sigma = \begin{pmatrix} 0.5 & 0.2 \\ 0.2 & 1.5 \end{pmatrix}$$
    Called the classifier with nearest Mahalanobis distance to the class mean. When?
    $$\mathrm{Dist}(X, \mu_i) = (X-\mu_i)^T\Sigma^{-1}(X-\mu_i)$$
    (As written, this is the squared Mahalanobis distance; squaring does not change which mean is nearest.)
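The distance on this slide can be computed directly, as in the sketch below. It uses the slide's covariance matrix, while the two class means, the test point, and the function name `squared_mahalanobis` are hypothetical. The point is chosen so that the Euclidean and Mahalanobis rules pick different means, which illustrates why the covariance matters.

```python
import numpy as np

def squared_mahalanobis(x, mean, cov):
    """(x - mean)^T cov^{-1} (x - mean), computed without forming the inverse."""
    diff = x - mean
    return float(diff @ np.linalg.solve(cov, diff))

cov = np.array([[0.5, 0.2],
                [0.2, 1.5]])
# Hypothetical class means and test point.
means = [np.array([0.0, 0.0]), np.array([2.0, 0.0])]
x = np.array([1.1, 2.0])

maha = [squared_mahalanobis(x, m, cov) for m in means]
eucl = [float(np.sum((x - m) ** 2)) for m in means]

mahalanobis_label = int(np.argmin(maha))
euclidean_label = int(np.argmin(eucl))
```

Here the point is slightly closer to the second mean in Euclidean terms, but closer to the first once the correlated, unequal variances are taken into account.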
  • 22. Quadratic Discriminant Analysis
    Another generative learning model; a Bayesian classifier.
    Here, the class labels follow a multinomial distribution, and the likelihoods are multivariate Gaussians with a separate covariance $\Sigma_i$ per class.
    The decision boundary becomes non-linear. Why?
    $$P(X|y=i) = \frac{1}{(2\pi)^{n/2}|\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu_i)^T\Sigma_i^{-1}(X-\mu_i)\right)$$
  • 24. Quadratic Discriminant Analysis
    The decision boundary is quadratic (a conic section in two dimensions):
    $$P(y=i|X) = P(y=j|X)$$
    $$\frac{P(X|y=i)P(y=i)}{P(X)} = \frac{P(X|y=j)P(y=j)}{P(X)}$$
    $$P(X|y=i)P(y=i) = P(X|y=j)P(y=j)$$
    $$\frac{1}{(2\pi)^{n/2}|\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu_i)^T\Sigma_i^{-1}(X-\mu_i)\right)P(y=i) = \frac{1}{(2\pi)^{n/2}|\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu_j)^T\Sigma_j^{-1}(X-\mu_j)\right)P(y=j)$$
  • 27. Quadratic Discriminant Analysis
    Decision boundary, continued:
    $$\frac{1}{|\Sigma_i|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu_i)^T\Sigma_i^{-1}(X-\mu_i)\right)P(y=i) = \frac{1}{|\Sigma_j|^{1/2}} \exp\left(-\frac{1}{2}(X-\mu_j)^T\Sigma_j^{-1}(X-\mu_j)\right)P(y=j)$$
    $$-\frac{1}{2}\log|\Sigma_i| - \frac{1}{2}(X-\mu_i)^T\Sigma_i^{-1}(X-\mu_i) + \log P(y=i) = -\frac{1}{2}\log|\Sigma_j| - \frac{1}{2}(X-\mu_j)^T\Sigma_j^{-1}(X-\mu_j) + \log P(y=j)$$
    $$\Rightarrow \log\frac{P(y=i)}{P(y=j)} - \frac{1}{2}\log\frac{|\Sigma_i|}{|\Sigma_j|} - \frac{1}{2}\left[X^T\Sigma_i^{-1}X + \mu_i^T\Sigma_i^{-1}\mu_i - 2X^T\Sigma_i^{-1}\mu_i - X^T\Sigma_j^{-1}X - \mu_j^T\Sigma_j^{-1}\mu_j + 2X^T\Sigma_j^{-1}\mu_j\right] = 0$$
  • 41. Quadratic Discriminant Analysis
    Decision boundary, continued:
    $$\log\frac{P(y=i)}{P(y=j)} - \frac{1}{2}\log\frac{|\Sigma_i|}{|\Sigma_j|} - \frac{1}{2}\left[X^T(\Sigma_i^{-1} - \Sigma_j^{-1})X + \mu_i^T\Sigma_i^{-1}\mu_i - \mu_j^T\Sigma_j^{-1}\mu_j - 2X^T(\Sigma_i^{-1}\mu_i - \Sigma_j^{-1}\mu_j)\right] = 0$$
    $$\Rightarrow X^T a X + b^T X + c = 0$$
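Reading the coefficients off the final equation gives $a = -\frac{1}{2}(\Sigma_i^{-1} - \Sigma_j^{-1})$, $b = \Sigma_i^{-1}\mu_i - \Sigma_j^{-1}\mu_j$, and $c = \log\frac{P(y=i)}{P(y=j)} - \frac{1}{2}\log\frac{|\Sigma_i|}{|\Sigma_j|} - \frac{1}{2}(\mu_i^T\Sigma_i^{-1}\mu_i - \mu_j^T\Sigma_j^{-1}\mu_j)$. The sketch below is a numerical sanity check of that reading, not part of the lecture: the function names, means, covariances, and priors are assumptions. It compares the quadratic form with the difference of the two log joint densities at an arbitrary point.

```python
import numpy as np

def qda_boundary(mu_i, cov_i, prior_i, mu_j, cov_j, prior_j):
    """Coefficients (A, b, c) of the boundary x^T A x + b^T x + c = 0."""
    inv_i = np.linalg.inv(cov_i)
    inv_j = np.linalg.inv(cov_j)
    A = -0.5 * (inv_i - inv_j)
    b = inv_i @ mu_i - inv_j @ mu_j
    _, logdet_i = np.linalg.slogdet(cov_i)
    _, logdet_j = np.linalg.slogdet(cov_j)
    c = (np.log(prior_i / prior_j)
         - 0.5 * (logdet_i - logdet_j)
         - 0.5 * (mu_i @ inv_i @ mu_i - mu_j @ inv_j @ mu_j))
    return A, b, c

def log_joint(x, mu, cov, prior):
    """log P(X|y) + log P(y) for a Gaussian class-conditional density."""
    n = mu.shape[0]
    diff = x - mu
    _, logdet = np.linalg.slogdet(cov)
    maha = diff @ np.linalg.solve(cov, diff)
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * logdet - 0.5 * maha + np.log(prior)

# Hypothetical class parameters chosen only for illustration.
mu_i, mu_j = np.array([0.0, 0.0]), np.array([2.0, 1.0])
cov_i = np.array([[1.5, 0.1], [0.1, 0.5]])
cov_j = np.array([[1.0, -0.2], [-0.2, 2.0]])
prior_i, prior_j = 0.4, 0.6

A, b, c = qda_boundary(mu_i, cov_i, prior_i, mu_j, cov_j, prior_j)
x = np.array([0.7, -1.3])
quadratic_value = x @ A @ x + b @ x + c
log_ratio = log_joint(x, mu_i, cov_i, prior_i) - log_joint(x, mu_j, cov_j, prior_j)
```

If the two covariances are set equal, `A` becomes the zero matrix and the boundary reduces to the linear GLDA case.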
  • 54. Analysis of QDA when $\Sigma_i = \sigma_i^2 I$
    Classes have different distributions and different means.
    Cross-sections of the class distributions are spherical but of different sizes.
    $$\Sigma_1 = \begin{pmatrix} 1.5 & 0 \\ 0 & 1.5 \end{pmatrix}, \quad \Sigma_2 = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}, \quad \Sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}$$
  • 55. Analysis of QDA with arbitrary $\Sigma_i \neq \Sigma_j$
    Classes have different distributions and different means.
    Cross-sections of the class distributions are ellipsoidal and of different sizes.
    $$\Sigma_1 = \begin{pmatrix} 1.5 & 0.1 \\ 0.1 & 0.5 \end{pmatrix}, \quad \Sigma_2 = \begin{pmatrix} 1 & -0.2 \\ -0.2 & 2 \end{pmatrix}, \quad \Sigma_3 = \begin{pmatrix} 2 & -0.25 \\ -0.25 & 1.5 \end{pmatrix}$$
  • 56. GLDA and QDA, Another point of view
  • 57. Naive Bayes
  • 58. Naive Bayes: An Example
  • 59. Summary