Principal Component Analysis (PCA)
Outline
1. What is PCA?
2. Dimensionality Reduction.
3. Why PCA?
4. Important Terminologies.
5. How does PCA Work?
6. Applications of PCA
7. Advantages and Limitations
Introduction
Principal Component Analysis, commonly referred to as
PCA, is a powerful mathematical technique used in data
analysis and statistics. At its core, PCA is designed to
simplify complex datasets by transforming them into a
more manageable form while retaining the most critical
information.
- Reducing the dimensionality of a dataset
- Increasing interpretability while minimizing information loss
Dimensionality Reduction
Dimensionality reduction refers to the techniques that reduce the number of input variables in a
dataset.
Why DR?
- Fewer dimensions mean less computation and shorter training time
- Redundant, highly similar features are removed from the dataset
- Data compression (reduced storage space)
- It helps identify the most significant features and discard the rest
- It leads to easier human interpretation
Why PCA?
- Dimensionality Reduction
- Noise Reduction
- Visualization
- Feature Engineering
- Mitigating Overfitting
- Data Compression
- Faster Machine Learning Processing
Important Terminologies
- Variance
- Covariance
- Eigenvalues
- Eigenvectors
- Principal Component
Important Terminologies (Variance)
- Variance is the average of the squared differences between each value and the mean.
- Variance (σ²) = (Sum of the squared differences from the mean) / (Total number of values)
- In mathematical notation: σ² = Σ(x - μ)² / n
Here:
- μ is the mean of the feature
- Mean (μ) = (Sum of all values) / (Total number of values)
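As a quick illustration, a plain-Python sketch of this formula (population version, dividing by n; the worked example later in these slides uses the sample version with n - 1):

values = [2, 3, 5, 7, 10]   # small illustrative sample

mean = sum(values) / len(values)                               # μ = Σx / n
variance = sum((x - mean) ** 2 for x in values) / len(values)  # σ² = Σ(x - μ)² / n

print(mean, variance)   # 5.4 and 8.24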
Important Terminologies (Variance)
- The variance indicates how widely the data points scatter around the mean.
Important Terminologies (Covariance)
1. Covariance describes the relationship between a pair of random variables: how a change in one variable is associated with a change in the other.
2. It can take any value from -infinity to +infinity, where a negative value indicates an inverse relationship and a positive value indicates a direct relationship.
3. It captures only the linear relationship between variables.
4. It gives the direction of the relationship between variables, but not its strength on a standardized scale.
Important Terminologies (Covariance)
The formula for the covariance (Cov) between two random variables X and Y, each with N data points, is:
Cov(X, Y) = Σ(Xi - μX)(Yi - μY) / N
Where:
- Cov(X, Y) is the covariance between X and Y.
- N is the number of data points (the sample covariance divides by N - 1 instead).
- Xi and Yi are the individual data points of X and Y, and μX and μY are their means.
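A small plain-Python sketch of this formula (population version, dividing by N; the function name is only for demonstration):

def covariance(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(X, Y) = Σ(Xi - μX)(Yi - μY) / N
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

print(covariance([10, 12, 14, 8], [40, 48, 56, 21]))   # 28.25 for the data on the next slide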
Important Terminologies (Covariance)
X Y
10 40
12 48
14 56
8 21
Covariance Matrix
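The resulting covariance matrix appeared as a figure in the original slide; a NumPy sketch reproduces it from the table above (np.cov treats each row as one variable and uses the N - 1 denominator by default, so its values differ slightly from the population version):

import numpy as np

X = np.array([10, 12, 14, 8], dtype=float)
Y = np.array([40, 48, 56, 21], dtype=float)

cov_matrix = np.cov(np.stack([X, Y]))   # 2x2 covariance matrix of X and Y
print(cov_matrix)
# approximately [[  6.67  37.67]
#                [ 37.67 224.92]]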
Compute Eigenvalues/EigenVectors
Let A be a square n×n matrix and x a non-zero vector for which:
Ax = λx
for some scalar value λ. Then:
λ = an eigenvalue of matrix A
x = the corresponding eigenvector of matrix A
Eigenvalues are found by solving the characteristic equation:
det(A - λI) = 0 [which yields up to n eigenvalues]
Compute Eigenvalues / Eigenvectors
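A minimal NumPy sketch (using a small symmetric matrix chosen only for illustration) shows how each eigenpair satisfies Ax = λx:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # illustrative 2x2 symmetric matrix

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # 3 and 1 for this matrix (order not guaranteed)
print(eigenvectors)   # each column is the eigenvector for the matching eigenvalue

x = eigenvectors[:, 0]
print(np.allclose(A @ x, eigenvalues[0] * x))   # True: A x = λ x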
How does PCA work?
Step 1: Standardize the data.
Step 2: Calculate the covariance matrix.
Step 3: Compute the eigenvectors and
eigenvalues.
Step 4: Select the principal components.
Step 5: Project data onto the new basis.
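These five steps fit in a few lines of code. A minimal NumPy sketch (the dataset X and the number of components k are placeholders chosen for illustration):

import numpy as np

def pca(X, k):
    # Step 1: standardize the data (zero mean, unit variance per feature)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the standardized features
    C = np.cov(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors (eigh is suited to symmetric matrices)
    eigvals, eigvecs = np.linalg.eigh(C)
    # Step 4: keep the k eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]
    # Step 5: project the standardized data onto the new basis
    return Z @ components

# usage on the small two-variable example used later in these slides
X = np.array([[2, 4], [3, 5], [5, 7], [7, 8], [10, 11]], dtype=float)
print(pca(X, k=1))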
Step-By-Step Explanation of PCA (Principal Component Analysis)
Step 1: Standardization
The main aim of this step is to standardize the range of the attributes so that each of them lies within similar boundaries:
z = (x - μ) / σ
- μ is the mean of the feature
- σ is the standard deviation of the feature
σ = √[ Σ(x - μ)² / N ]
(The worked example below uses the sample standard deviation, dividing by N - 1.)
Standardization
Dataset:
Consider a small dataset with two variables, X and Y, represented by the following data points:
X: [2, 3, 5, 7, 10]
Y: [4, 5, 7, 8, 11]
- For variable X:
- Mean (μX) = (2 + 3 + 5 + 7 + 10) / 5 = 5.4
- Standard Deviation (σX) = √[Σ(Xi - μX)² / (n - 1)] = √[(11.56 + 5.76 + 0.16 + 2.56 + 21.16) / 4] ≈ 3.21
- For variable Y:
- Mean (μY) = (4 + 5 + 7 + 8 + 11) / 5 = 7
- Standard Deviation (σY) = √[Σ(Yi - μY)² / (n - 1)] = √[(9 + 4 + 0 + 1 + 16) / 4] ≈ 2.74
Standardized X: [-1.06, -0.75, -0.12, 0.50, 1.43]
Standardized Y: [-1.10, -0.73, 0.00, 0.37, 1.46]
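These values can be reproduced with a short NumPy sketch (ddof=1 selects the sample standard deviation with n - 1 in the denominator):

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)

Zx = (X - X.mean()) / X.std(ddof=1)   # standardized X
Zy = (Y - Y.mean()) / Y.std(ddof=1)   # standardized Y

print(np.round(Zx, 2))   # approximately [-1.06 -0.75 -0.12  0.50  1.43]
print(np.round(Zy, 2))   # approximately [-1.10 -0.73  0.00  0.37  1.46]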
Covariance Matrix Computation
The covariance matrix expresses how every pair of attributes in a multidimensional dataset varies together.
- Variance is denoted by Var
- Covariance is denoted by Cov
Covariance Matrix Computation
Cov(X, X)  Cov(X, Y)
Cov(Y, X)  Cov(Y, Y)
Using the covariance formula on the standardized data:
- Cov(X, X) = Σ(Standardized X * Standardized X) / (n - 1) = (1.12 + 0.56 + 0.02 + 0.25 + 2.05) / 4 ≈ 1.000
- Cov(X, Y) = Σ(Standardized X * Standardized Y) / (n - 1) = (1.16 + 0.55 + 0.00 + 0.18 + 2.09) / 4 ≈ 0.996
- Cov(Y, X) = Cov(X, Y) ≈ 0.996 (the covariance matrix is symmetric)
- Cov(Y, Y) = Σ(Standardized Y * Standardized Y) / (n - 1) = (1.20 + 0.53 + 0.00 + 0.13 + 2.13) / 4 ≈ 1.000
Covariance Matrix:
1.000  0.996
0.996  1.000
(The diagonal entries are exactly 1 because each standardized variable has unit sample variance.)
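A self-contained NumPy check (using the sample convention with n - 1, which is NumPy's default) reproduces this matrix:

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])

print(np.cov(Z, rowvar=False))   # approximately [[1.0, 0.996], [0.996, 1.0]]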
Compute Eigenvalues and Eigenvectors of Covariance Matrix to
Identify Principal Components
For this covariance matrix, the eigenvalues and corresponding eigenvectors are:
Eigenvalue 1 (λ1) ≈ 1.996
Eigenvector 1 (v1) = [0.707, 0.707]
Eigenvalue 2 (λ2) ≈ 0.004
Eigenvector 2 (v2) = [-0.707, 0.707]
The first principal component therefore explains about 99.8% of the total variance (1.996 / 2.000).
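This result can be checked with NumPy's symmetric eigensolver; a minimal sketch on the covariance matrix computed above:

import numpy as np

C = np.array([[1.000, 0.996],
              [0.996, 1.000]])   # covariance matrix of the standardized data

eigenvalues, eigenvectors = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
print(eigenvalues)    # approximately [0.004 1.996]
print(eigenvectors)   # columns are the eigenvectors, with entries of about ±0.707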
Select the Principal Components.
1. The first principal component is the direction of greatest variability (variance) in the data.
2. The second is the next orthogonal (uncorrelated) direction of greatest variability.
Project Data onto Principal Components
To transform the data into the new principal component space, we take the dot product of the standardized data with the eigenvectors:
- PC1 = (Standardized X, Standardized Y) · v1
- PC2 = (Standardized X, Standardized Y) · v2
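In matrix form this is a single multiplication. A minimal NumPy sketch continuing the same example (eigenvector signs are a convention; flipping them only flips the sign of the scores):

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])   # standardized data, one row per point

V = np.array([[0.707, -0.707],
              [0.707,  0.707]])   # columns are v1 and v2

scores = Z @ V   # each row holds the (PC1, PC2) coordinates of one data point
print(np.round(scores, 2))   # PC2 stays near zero, so PC1 carries almost all the variance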
Applications of PCA
- Netflix Movie Recommendations
- Grocery Shopping
- Fitness Trackers
- Car Shopping
- Real Estate
- Manufacturing and Quality Control
- Sports Analytics
- Renewable Energy
- Smart Cities
Advantages of PCA
- Helps Prevent Overfitting
- Speeds Up Other Machine Learning Algorithms
- Improves Visualization
- Dimensionality Reduction
- Noise Reduction
Limitations of PCA
- Linearity Assumption
- Loss of Interpretability
- Loss of Information
- Sensitivity to Scaling
- Orthogonal Components
Some Mathematical Problem
Given the following data, use PCA to reduce the dimensionality from 2 to 1.
Feature   Example 1   Example 2   Example 3   Example 4
X         4           8           13          7
Y         11          4           5           14
Reference
1. https://www.simplilearn.com/tutorials/machine-learning-tutorial/principal-component-analysis
2. https://www.geeksforgeeks.org/principal-component-analysis-pca/
3. https://www.cuemath.com/algebra/covariance-matrix/
Thank You
Q&A
