Principal Component Analysis (PCA)
Outline
1. What is PCA?
2. Dimensionality Reduction.
3. Why PCA?
4. Important Terminologies.
5. How does PCA Work?
6. Applications of PCA
7. Advantages and Limitations
Introduction
Principal Component Analysis, commonly referred to as
PCA, is a powerful mathematical technique used in data
analysis and statistics. At its core, PCA is designed to
simplify complex datasets by transforming them into a
more manageable form while retaining the most critical
information.
- Reducing the dimensionality of a dataset
- Increasing interpretability while minimizing information loss
Dimensionality Reduction
Dimensionality reduction refers to the techniques that reduce the number of input variables in a
dataset.
Why DR?
- Fewer dimensions mean less computation and shorter training time
- Redundant, highly similar features are removed from the dataset
- Data compression (reduced storage space)
- It helps identify the most significant features and discard the rest
- It leads to easier human interpretation
Why PCA?
- Dimensionality Reduction
- Noise Reduction
- Visualization
- Feature Engineering
- Mitigating Overfitting
- Data Compression
- Faster Machine Learning Processing
Important Terminologies
- Variance
- Covariance
- Eigenvalues
- Eigenvectors
- Principal Component
Important Terminologies (Variance)
- Variance is the average of the squared differences between each value and the mean.
- Variance (σ²) = (Sum of the squared differences from the mean) / (Total number of values)
- In mathematical notation: σ² = Σ(x - μ)² / n
Here:
- μ is the mean of the feature
- Mean (μ) = (Sum of all values) / (Total number of values)
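As a quick illustration, a plain-Python sketch of this formula (population version, dividing by n; the worked example later in these slides uses the sample version with n - 1):

values = [2, 3, 5, 7, 10]   # small illustrative sample

mean = sum(values) / len(values)                               # μ = Σx / n
variance = sum((x - mean) ** 2 for x in values) / len(values)  # σ² = Σ(x - μ)² / n

print(mean, variance)   # 5.4 and 8.24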
Important Terminologies (Variance)
- The variance indicates how widely the data points scatter around the mean.
Important Terminologies (Covariance)
1. Covariance describes the relationship between a pair of random variables: how a change in one variable is associated with a change in the other.
2. It can take any value from -infinity to +infinity, where a negative value indicates an inverse relationship and a positive value indicates a direct relationship.
3. It captures only the linear relationship between variables.
4. It gives the direction of the relationship between variables, but not its strength on a standardized scale.
Important Terminologies (Covariance)
The formula for the covariance (Cov) between two random variables X and Y, each with N data points, is:
Cov(X, Y) = Σ(Xi - μX)(Yi - μY) / N
Where:
- Cov(X, Y) is the covariance between X and Y.
- N is the number of data points (the sample covariance divides by N - 1 instead).
- Xi and Yi are the individual data points of X and Y, and μX and μY are their means.
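A small plain-Python sketch of this formula (population version, dividing by N; the function name is only for demonstration):

def covariance(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Cov(X, Y) = Σ(Xi - μX)(Yi - μY) / N
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

print(covariance([10, 12, 14, 8], [40, 48, 56, 21]))   # 28.25 for the data on the next slide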
Important Terminologies (Covariance)
X Y
10 40
12 48
14 56
8 21
Covariance Matrix
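The resulting covariance matrix appeared as a figure in the original slide; a NumPy sketch reproduces it from the table above (np.cov treats each row as one variable and uses the N - 1 denominator by default, so its values differ slightly from the population version):

import numpy as np

X = np.array([10, 12, 14, 8], dtype=float)
Y = np.array([40, 48, 56, 21], dtype=float)

cov_matrix = np.cov(np.stack([X, Y]))   # 2x2 covariance matrix of X and Y
print(cov_matrix)
# approximately [[  6.67  37.67]
#                [ 37.67 224.92]]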
Compute Eigenvalues/EigenVectors
Let A be a square n×n matrix and x a non-zero vector for which:
Ax = λx
for some scalar value λ. Then:
λ = an eigenvalue of matrix A
x = the corresponding eigenvector of matrix A
Eigenvalues are found by solving the characteristic equation:
det(A - λI) = 0 [which yields up to n eigenvalues]
Compute Eigenvalues / Eigenvectors
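A minimal NumPy sketch (using a small symmetric matrix chosen only for illustration) shows how each eigenpair satisfies Ax = λx:

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])   # illustrative 2x2 symmetric matrix

eigenvalues, eigenvectors = np.linalg.eig(A)
print(eigenvalues)    # 3 and 1 for this matrix (order not guaranteed)
print(eigenvectors)   # each column is the eigenvector for the matching eigenvalue

x = eigenvectors[:, 0]
print(np.allclose(A @ x, eigenvalues[0] * x))   # True: A x = λ x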
How does PCA work?
Step 1: Standardize the data.
Step 2: Calculate the covariance matrix.
Step 3: Compute the eigenvectors and
eigenvalues.
Step 4: Select the principal components.
Step 5: Project data onto the new basis.
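These five steps fit in a few lines of code. A minimal NumPy sketch (the dataset X and the number of components k are placeholders chosen for illustration):

import numpy as np

def pca(X, k):
    # Step 1: standardize the data (zero mean, unit variance per feature)
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Step 2: covariance matrix of the standardized features
    C = np.cov(Z, rowvar=False)
    # Step 3: eigenvalues and eigenvectors (eigh is suited to symmetric matrices)
    eigvals, eigvecs = np.linalg.eigh(C)
    # Step 4: keep the k eigenvectors with the largest eigenvalues
    order = np.argsort(eigvals)[::-1][:k]
    components = eigvecs[:, order]
    # Step 5: project the standardized data onto the new basis
    return Z @ components

# usage on the small two-variable example used later in these slides
X = np.array([[2, 4], [3, 5], [5, 7], [7, 8], [10, 11]], dtype=float)
print(pca(X, k=1))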
Step-By-Step Explanation of PCA (Principal Component Analysis)
Step 1: Standardization
The main aim of this step is to standardize the range of the attributes so that each of them lies within similar boundaries:
z = (x - μ) / σ
- μ is the mean of the feature
- σ is the standard deviation of the feature
σ = √[ Σ(x - μ)² / N ]
(The worked example below uses the sample standard deviation, dividing by N - 1.)
Standardization
Dataset:
Consider a small dataset with two variables, X and Y, represented by the following data points:
X: [2, 3, 5, 7, 10]
Y: [4, 5, 7, 8, 11]
- For variable X:
- Mean (μX) = (2 + 3 + 5 + 7 + 10) / 5 = 5.4
- Standard Deviation (σX) = √[Σ(Xi - μX)² / (n - 1)] = √[(11.56 + 5.76 + 0.16 + 2.56 + 21.16) / 4] ≈ 3.21
- For variable Y:
- Mean (μY) = (4 + 5 + 7 + 8 + 11) / 5 = 7
- Standard Deviation (σY) = √[Σ(Yi - μY)² / (n - 1)] = √[(9 + 4 + 0 + 1 + 16) / 4] ≈ 2.74
Standardized X: [-1.06, -0.75, -0.12, 0.50, 1.43]
Standardized Y: [-1.10, -0.73, 0.00, 0.37, 1.46]
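These values can be reproduced with a short NumPy sketch (ddof=1 selects the sample standard deviation with n - 1 in the denominator):

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)

Zx = (X - X.mean()) / X.std(ddof=1)   # standardized X
Zy = (Y - Y.mean()) / Y.std(ddof=1)   # standardized Y

print(np.round(Zx, 2))   # approximately [-1.06 -0.75 -0.12  0.50  1.43]
print(np.round(Zy, 2))   # approximately [-1.10 -0.73  0.00  0.37  1.46]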
Covariance Matrix Computation
The covariance matrix expresses how every pair of attributes in a multidimensional dataset varies together.
- Variance is denoted by Var
- Covariance is denoted by Cov
Covariance Matrix Computation
Cov(X, X)  Cov(X, Y)
Cov(Y, X)  Cov(Y, Y)
Using the covariance formula on the standardized data:
- Cov(X, X) = Σ(Standardized X * Standardized X) / (n - 1) = (1.12 + 0.56 + 0.02 + 0.25 + 2.05) / 4 ≈ 1.000
- Cov(X, Y) = Σ(Standardized X * Standardized Y) / (n - 1) = (1.16 + 0.55 + 0.00 + 0.18 + 2.09) / 4 ≈ 0.996
- Cov(Y, X) = Cov(X, Y) ≈ 0.996 (the covariance matrix is symmetric)
- Cov(Y, Y) = Σ(Standardized Y * Standardized Y) / (n - 1) = (1.20 + 0.53 + 0.00 + 0.13 + 2.13) / 4 ≈ 1.000
Covariance Matrix:
1.000  0.996
0.996  1.000
(The diagonal entries are exactly 1 because each standardized variable has unit sample variance.)
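A self-contained NumPy check (using the sample convention with n - 1, which is NumPy's default) reproduces this matrix:

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])

print(np.cov(Z, rowvar=False))   # approximately [[1.0, 0.996], [0.996, 1.0]]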
Compute Eigenvalues and Eigenvectors of Covariance Matrix to
Identify Principal Components
For this covariance matrix, the eigenvalues and corresponding eigenvectors are:
Eigenvalue 1 (λ1) ≈ 1.996
Eigenvector 1 (v1) = [0.707, 0.707]
Eigenvalue 2 (λ2) ≈ 0.004
Eigenvector 2 (v2) = [-0.707, 0.707]
The first principal component therefore explains about 99.8% of the total variance (1.996 / 2.000).
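This result can be checked with NumPy's symmetric eigensolver; a minimal sketch on the covariance matrix computed above:

import numpy as np

C = np.array([[1.000, 0.996],
              [0.996, 1.000]])   # covariance matrix of the standardized data

eigenvalues, eigenvectors = np.linalg.eigh(C)   # eigh returns eigenvalues in ascending order
print(eigenvalues)    # approximately [0.004 1.996]
print(eigenvectors)   # columns are the eigenvectors, with entries of about ±0.707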
Select the Principal Components.
1. The first principal component is the direction of greatest variability (variance) in the data.
2. The second is the next orthogonal (uncorrelated) direction of greatest variability.
Project Data onto Principal Components
To transform the data into the new principal component space, we take the dot product of the standardized data with the eigenvectors:
- PC1 = (Standardized X, Standardized Y) · v1
- PC2 = (Standardized X, Standardized Y) · v2
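In matrix form this is a single multiplication. A minimal NumPy sketch continuing the same example (eigenvector signs are a convention; flipping them only flips the sign of the scores):

import numpy as np

X = np.array([2, 3, 5, 7, 10], dtype=float)
Y = np.array([4, 5, 7, 8, 11], dtype=float)
Z = np.column_stack([(X - X.mean()) / X.std(ddof=1),
                     (Y - Y.mean()) / Y.std(ddof=1)])   # standardized data, one row per point

V = np.array([[0.707, -0.707],
              [0.707,  0.707]])   # columns are v1 and v2

scores = Z @ V   # each row holds the (PC1, PC2) coordinates of one data point
print(np.round(scores, 2))   # PC2 stays near zero, so PC1 carries almost all the variance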
Applications of PCA
- Netflix Movie Recommendations
- Grocery Shopping
- Fitness Trackers
- Car Shopping
- Real Estate
- Manufacturing and Quality Control
- Sports Analytics
- Renewable Energy
- Smart Cities
Advantages of PCA
- Helps Prevent Overfitting
- Speeds Up Other Machine Learning Algorithms
- Improves Visualization
- Dimensionality Reduction
- Noise Reduction
Limitations of PCA
- Linearity Assumption
- Loss of Interpretability
- Loss of Information
- Sensitivity to Scaling
- Orthogonal Components
Some Mathematical Problem
Given the following data, use PCA to reduce the dimensionality from 2 to 1.
Feature   Example 1   Example 2   Example 3   Example 4
X         4           8           13          7
Y         11          4           5           14
Reference
1. https://www.simplilearn.com/tutorials/machine-learning-tutorial/principal-component-analysis
2. https://www.geeksforgeeks.org/principal-component-analysis-pca/
3. https://www.cuemath.com/algebra/covariance-matrix/
Thank You
Q&A
