Principal Component Analysis
October 9, 2019
Feature Extraction
Feature selection loses information, because entire features are discarded.
Even though care is taken to remove dimensions that are unlikely to contribute much to data mining, we would still prefer to retain all of the input data in one form or another.
So, how do we retain all of the input data and still reduce the number of dimensions?
Feature Extraction maps the data from a higher-dimensional feature space to a lower-dimensional feature space without much loss of information.
The most common feature extraction technique is Principal Component Analysis (PCA).
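As a rough, concrete illustration (not from the slides), scikit-learn's PCA maps a 4-dimensional dataset down to 2 dimensions in a couple of lines; the rest of these slides work out how those new dimensions are actually chosen.

```python
import numpy as np
from sklearn.decomposition import PCA

X = np.random.default_rng(0).normal(size=(100, 4))  # 100 samples, 4 features

Z = PCA(n_components=2).fit_transform(X)            # same 100 samples, 2 extracted features
print(Z.shape)                                       # (100, 2)
```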
The Idea behind PCA
(Figure-only slides: illustrations omitted.)
Data as Vectors
A vector is a geometric object that has magnitude and direction.
For example, consider the vector v = [3 4]ᵀ.
It has a magnitude of ||v|| = √(3² + 4²) = 5.
Its direction is the angle tan⁻¹(4/3) anticlockwise from the x-axis.
A unit vector is one which has magnitude 1 and is often used to specify directions.
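A quick NumPy check of these numbers (an illustrative sketch, not part of the slides):

```python
import numpy as np

v = np.array([3.0, 4.0])

magnitude = np.linalg.norm(v)                 # sqrt(3^2 + 4^2) = 5.0
angle = np.degrees(np.arctan2(v[1], v[0]))    # tan^-1(4/3), about 53.13 degrees from the x-axis
unit_v = v / magnitude                        # unit vector in the same direction

print(magnitude, angle, unit_v)
```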
Vector Spaces
A vector space is a collection of vectors, together with the associated operations of vector addition and scalar multiplication.
For example, all two-dimensional vectors belong to the vector space denoted R².
Basis of a Vector Space
A basis for a vector space is a minimal set of vectors from which all other vectors in the space can be generated.
For example, every vector in R² can be written as a linear combination of [1 0] and [0 1], so these two vectors form a basis for R².
For a set of vectors {v1, v2, ..., vn} to form a basis for a vector space V:
They must be linearly independent.
They must be able to generate every vector in V.
No proper subset of {v1, v2, ..., vn} may itself be a basis for V.
A vector space can have many different bases. For example, [1 2] and [1 −1] also form a basis for R², as the sketch below illustrates.
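For instance, the coordinates of any vector in the basis {[1 2], [1 −1]} can be found by solving a small linear system; a minimal NumPy sketch (the vector x is my own example):

```python
import numpy as np

# The basis vectors [1, 2] and [1, -1] placed as the columns of B
B = np.array([[1.0,  1.0],
              [2.0, -1.0]])

x = np.array([3.0, 4.0])           # a vector written in the standard basis

coords = np.linalg.solve(B, x)     # its coordinates in the new basis
assert np.allclose(B @ coords, x)  # the linear combination reconstructs x
print(coords)
```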
Change of Basis
Oftentimes, vectors need to be converted from one basis to another for ease of representation or processing. This is done by specifying a Change of Basis Matrix.
For example, consider the vector space R² with the standard basis B = {[1 0], [0 1]} and an alternate basis B′ = {[1 2], [1 −1]}.
The change of basis matrix from B′ to B is simply the matrix whose columns are the new basis vectors, M = [1 1; 2 −1].
The change of basis matrix from B to B′ is its inverse, M⁻¹ = [1 1; 2 −1]⁻¹.
If the new basis B′ is orthonormal (so that M is an orthogonal matrix), the inverse is simply the transpose: M⁻¹ = Mᵀ.
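A NumPy sketch of this change of basis (illustrative; it assumes the new basis vectors are stored as the columns of M):

```python
import numpy as np

# Columns of M are the new basis vectors [1, 2] and [1, -1]
M = np.array([[1.0,  1.0],
              [2.0, -1.0]])

x_std = np.array([3.0, 4.0])           # a vector in the standard basis

x_new = np.linalg.inv(M) @ x_std       # standard basis -> new basis (uses M^-1)
x_back = M @ x_new                     # new basis -> standard basis (uses M)
assert np.allclose(x_back, x_std)

# If the new basis were orthonormal, M^-1 would simply equal M^T.
```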
Linear Transformations
A linear transformation is a function from one vector space to another (or to itself) that transforms the vectors in the space.
Linear transformations can be represented as matrices; for example, [2 1; 1.5 1] represents a linear transformation.
Linear transformations change the magnitude and/or direction of vectors. For example,
[2 1; 1.5 1] [3 4]ᵀ = [10 8.5]ᵀ
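The multiplication above, checked with NumPy (illustrative sketch):

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.5, 1.0]])   # the linear transformation
v = np.array([3.0, 4.0])

print(A @ v)                 # [10.   8.5] -- both magnitude and direction change
```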
Another Example
M = [2 2; 5 −1]
(Two figures omitted: plots in the x–y plane illustrating the effect of the transformation M.)
Eigenvectors and Eigenvalues
Eigenvectors of a matrix A representing a linear transformation are vectors which, when acted on by A, change only in magnitude, not in direction.
Mathematically, a vector v is said to be an eigenvector of the matrix A iff
Av = λv
where λ is the factor by which v is stretched, and is called the eigenvalue corresponding to v.
Think of a linear transformation (or the matrix representing it) as a force; the eigenvectors are the directions in which the force acts.
The eigenvalues represent the strength of the force.
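A NumPy sketch (not from the slides) that computes the eigenvalues and eigenvectors of the matrix from the previous example and verifies Av = λv:

```python
import numpy as np

A = np.array([[2.0,  2.0],
              [5.0, -1.0]])               # matrix from the "Another Example" slide

eigvals, eigvecs = np.linalg.eig(A)       # columns of eigvecs are the eigenvectors

for i in range(len(eigvals)):
    v = eigvecs[:, i]
    # A only stretches v by the factor eigvals[i]; its direction is unchanged
    assert np.allclose(A @ v, eigvals[i] * v)

print(eigvals)                            # 4 and -3 (in some order)
```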
What is so special about Eigenvectors?
Eigenvectors are important in a number of fields because they possess some useful properties.
The eigenvectors of a symmetric matrix can be chosen to be orthogonal (and hence linearly independent), which makes them a very convenient basis in which to represent vectors.
Eigenvectors can also be used to convert a matrix into its diagonal form, i.e. a square matrix A can be decomposed as
A = PDP⁻¹, or equivalently D = P⁻¹AP
where P is a square matrix whose columns are the eigenvectors of A and D is a diagonal matrix of the corresponding eigenvalues.
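A quick check of this diagonalization in NumPy (an illustrative sketch with a symmetric matrix of my own choosing, so that P is orthogonal):

```python
import numpy as np

A = np.array([[4.0, 2.0],
              [2.0, 3.0]])              # a symmetric matrix

eigvals, P = np.linalg.eigh(A)          # eigh is the symmetric eigensolver; P is orthogonal
D = np.diag(eigvals)

assert np.allclose(A, P @ D @ P.T)      # A = P D P^-1 (here P^-1 = P^T)
assert np.allclose(D, P.T @ A @ P)      # D = P^-1 A P
```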
Change of Basis and Linear Transformations
If A is a data vector, it can be converted to a new basis by simply multiplying it with the change of basis matrix.
If A is a linear transformation, something more involved is required.
Let A be a linear transformation in the standard (canonical) basis, and let V be the new basis. Then A_V, the same linear transformation expressed in V, is given by
A_V = V⁻¹AV
An Example
Let A = [5 −3; 2 −2] be a linear transformation in R² with the canonical basis.
Let V = [3 1; 1 2] be a new basis in R².
Also let a = [1 3] be a data vector in the canonical basis.
a after applying the linear transformation A is
a_A = a Aᵀ = [1 3] [5 2; −3 −2] = [−4 −4]
An Example
A transformed into the new basis is given by
A_V = V⁻¹AV
A_V = [3 1; 1 2]⁻¹ [5 −3; 2 −2] [3 1; 1 2]
A_V = (1/5) [2 −1; −1 3] [5 −3; 2 −2] [3 1; 1 2]
A_V = [4 0; 0 −1]
An Example
a transformed into the new basis is given by
a_V = a V⁻¹ = (1/5) [1 3] [2 −1; −1 3] = (1/5) [−1 8]
a_V transformed by the linear transformation A_V is given by
a_trans = a_V A_V = (1/5) [−1 8] [4 0; 0 −1] = (1/5) [−4 −8]
Transforming a_trans back into the canonical basis,
a_trans,back = a_trans V = (1/5) [−4 −8] [3 1; 1 2] = [−4 −4] = a_A
which is the same as the result of applying the transformation directly in the canonical basis.
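The whole example can be reproduced in a few lines of NumPy (a sketch; vectors are kept as rows, as in the slides, and V happens to be symmetric so no extra transposes are needed):

```python
import numpy as np

A = np.array([[5.0, -3.0],
              [2.0, -2.0]])        # the linear transformation in the canonical basis
V = np.array([[3.0, 1.0],
              [1.0, 2.0]])         # the new basis
a = np.array([1.0, 3.0])           # the data vector in the canonical basis

a_A = a @ A.T                      # apply A directly: [-4, -4]
A_V = np.linalg.inv(V) @ A @ V     # A expressed in the new basis: [[4, 0], [0, -1]]
a_V = a @ np.linalg.inv(V)         # a expressed in the new basis: (1/5)[-1, 8]
a_trans = a_V @ A_V                # apply the transformation inside the new basis
a_back = a_trans @ V               # convert the result back to the canonical basis

assert np.allclose(a_back, a_A)    # both routes give [-4, -4]
```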
Coming Back to PCA...
Our basic question is: how do we find the new basis in which to represent our data?
What are the new dimensions/axes/features of the transformed data?
The Covariance Matrix
The first step in PCA is to compute the covariance matrix of our data.
If X is our input data matrix, with one observation per row and each column (variable) mean-centered, then the covariance matrix is given by
Cx = (1/n) XᵀX
Think of the covariance matrix as describing the interplay of the forces of covariance between the different variables in the system.
Ideally, we want the variance within each variable to be high, and the covariance between different variables to be nearly zero.
In other words, we want our covariance matrix to be a diagonal matrix.
The Maths...
Let our input data matrix be X. We want to transform X into new coordinates Y such that the covariance between the variables of Y is zero. Let P be the matrix whose columns form the new orthogonal basis, so that
Y = XP
Let Cx be the covariance matrix of X and Cy be the covariance matrix of Y.
Cx = (1/n) XᵀX
Cy = (1/n) YᵀY
Cy = (1/n) (XP)ᵀ(XP)
Cy = Pᵀ ((1/n) XᵀX) P
Cy = Pᵀ Cx P
The Maths...
Our previous result:
Cy = Pᵀ Cx P
Ideally we want Cy to be a diagonal matrix, and we know that the matrix that can diagonalize Cx is the matrix of its eigenvectors!
In other words, P is the eigenvector matrix of Cx, and since Cx is symmetric, P is orthogonal, i.e. P⁻¹ = Pᵀ, so Cy = Pᵀ Cx P = D is diagonal.
So finally, the new data points in our new basis (which is nothing but the eigenvectors of the covariance matrix) can be obtained by
Y = XP
where the columns of P are the eigenvectors of the covariance matrix of X.
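Putting the recipe together as a NumPy sketch (my own illustration of the steps above, not code from the slides): center the data, form the covariance matrix, take its eigenvectors as the new basis P, and project with Y = XP.

```python
import numpy as np

def pca(X, k):
    """Project the rows of X onto the top-k principal components (illustrative sketch)."""
    Xc = X - X.mean(axis=0)                 # center each variable
    Cx = (Xc.T @ Xc) / X.shape[0]           # covariance matrix, (1/n) X^T X
    eigvals, eigvecs = np.linalg.eigh(Cx)   # Cx is symmetric, so its eigenvectors are orthogonal
    order = np.argsort(eigvals)[::-1]       # sort eigenvalues from largest to smallest
    P = eigvecs[:, order[:k]]               # keep the k directions with the most variance
    return Xc @ P                           # Y = XP, the data in the new basis

# Example usage on random data
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 5))
Y = pca(X, k=2)
print(Y.shape)                              # (50, 2)
```

Keeping only the top k eigenvectors in this sketch is exactly the dimensionality reduction discussed on the next slide.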
But wait... we still have n dimensions...
If X is a matrix with n columns, then Cov(X) is an n × n matrix.
Cov(X) has n eigenvectors, and thus Y also has n features...
By itself, PCA only re-expresses an n-dimensional data space as another n-dimensional data space.
However, in the new space we know which dimensions carry the most variance (given by the eigenvalues).
We can therefore retain only the top k eigenvalues and their eigenvectors to get a new k-dimensional feature space.
Example
Let X hold the marks of five students in three subjects:

Student   Maths   English   Art
1         90      60        90
2         90      90        30
3         60      60        60
4         60      60        90
5         30      30        30

The covariance matrix is
Cx = [504 360 180; 360 360 0; 180 0 720]
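The covariance matrix above can be reproduced with a short NumPy sketch (centering each subject's marks and using the 1/n normalization):

```python
import numpy as np

X = np.array([[90., 60., 90.],     # columns: Maths, English, Art
              [90., 90., 30.],
              [60., 60., 60.],
              [60., 60., 90.],
              [30., 30., 30.]])

Xc = X - X.mean(axis=0)            # subtract each subject's mean mark
Cx = (Xc.T @ Xc) / X.shape[0]
print(Cx)
# [[504. 360. 180.]
#  [360. 360.   0.]
#  [180.   0. 720.]]
```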
Example
To find the eigenvectors and eigenvalues of Cx, we have to solve |Cx − λI| = 0:
det [504−λ 360 180; 360 360−λ 0; 180 0 720−λ] = 0
which gives −λ³ + 1584λ² − 641520λ + 25660800 = 0.
Solving, λ = 44.8, 629.11, 910.06, which are the eigenvalues of Cx.
The eigenvectors corresponding to these eigenvalues are
[−3.75 4.28 1]ᵀ, [−0.5 −0.675 1]ᵀ, [1.055 0.69 1]ᵀ
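The same numbers can be obtained numerically; a NumPy sketch (eigh returns unit-norm eigenvectors, so they are rescaled here to have a last component of 1, matching the slides):

```python
import numpy as np

Cx = np.array([[504., 360., 180.],
               [360., 360.,   0.],
               [180.,   0., 720.]])

eigvals, eigvecs = np.linalg.eigh(Cx)   # symmetric eigensolver; eigenvalues in ascending order
print(eigvals)                          # approximately [ 44.8  629.1  910.1]

# Rescale each eigenvector (column) so that its last component is 1, as on the slide
print(eigvecs / eigvecs[-1, :])
```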
Example
We select the two eigenvectors with the highest eigenvalues to build our eigenvector matrix P:
P = [1.055 −0.5; 0.69 −0.675; 1 1]
The new data matrix is
Y = XP = [90 60 90; 90 90 30; 60 60 60; 60 60 90; 30 30 30] [1.055 −0.5; 0.69 −0.675; 1 1]
which is now in a 2-dimensional feature space.
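A NumPy sketch of the final projection. As on the slide, the raw marks are projected; projecting the mean-centered marks instead would only shift every row of Y by the same constant offset.

```python
import numpy as np

X = np.array([[90., 60., 90.],
              [90., 90., 30.],
              [60., 60., 60.],
              [60., 60., 90.],
              [30., 30., 30.]])

P = np.array([[1.055, -0.5  ],
              [0.69 , -0.675],
              [1.0  ,  1.0  ]])    # the two chosen eigenvectors as columns (slide scaling)

Y = X @ P                          # each student is now described by just 2 features
print(Y)
```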