A principled way to principal
components analysis
Teaching activity objectives
• Visualize large data sets.
• Transform the data to aid in this
visualization.
• Cluster data.
• Implement basic linear algebra operations.
• Connect these operations to neuronal
models and brain function.
Context for the activity
• Homework assignment in 9.40 Intro to
Neural Computation (Sophomore/Junior).
• In-class activity in 9.014 Quantitative
Methods and Computational Models in
Neuroscience (1st-year PhD).
Data visualization and
performing PCA:
MNIST data set
28-by-28-pixel, 8-bit grayscale images.
These images live in
a 784-dimensional space.
http://yann.lecun.com/exdb/mnist/
Can we cluster images in the
pixel space?
One possible visualization
There are more than 300,000 possible pairwise pixel plots!
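Where that count comes from: it is the number of ways to choose 2 of the 784 pixels. A quick sanity check (in Python for illustration; the course itself works in MATLAB):

```python
from math import comb

n_pixels = 28 * 28             # each MNIST image has 784 pixels
n_pairs = comb(n_pixels, 2)    # unordered pairs of pixels
print(n_pairs)                 # 306936: over 300,000 scatter plots
```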
Is there a more principled way?
• Represent the data in a new basis set.
• Aids in visualization and potentially in
clustering and dimensionality reduction.
• PCA provides such a basis set by finding
the directions that capture the most variance.
• The directions are ranked by decreasing
variance.
• It diagonalizes the covariance matrix.
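The claim that PCA diagonalizes the covariance matrix can be checked on a small example (hypothetical 2-D synthetic data; a NumPy sketch for illustration, not the course's MATLAB code):

```python
import numpy as np

rng = np.random.default_rng(0)
# Correlated 2-D toy data, standing in for real pixel data
X = rng.normal(size=(1000, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])
X -= X.mean(axis=0)                 # center the data

C = X.T @ X / (len(X) - 1)          # covariance matrix
evals, evecs = np.linalg.eigh(C)    # eigendecomposition (ascending)
order = np.argsort(evals)[::-1]     # rank directions by decreasing variance
evals, evecs = evals[order], evecs[:, order]

Y = X @ evecs                       # data expressed in the PCA basis
C_new = Y.T @ Y / (len(Y) - 1)
# Off-diagonals vanish: the covariance is diagonal in the new basis
print(np.round(C_new, 6))
```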
Pedagogical approach
• Guide students step by step through implementing PCA.
• Emphasize visualizations and geometrical
approach/intuition.
• We don’t use MATLAB’s canned PCA
function.
• We want students to get their hands “dirty”.
This helps build confidence and deep
understanding.
PCA Mantra
• Reshape the data into the proper format for PCA.
• Center the data by subtracting the mean.
• Construct the data covariance matrix.
• Perform SVD to obtain the eigenvalues and
eigenvectors of the covariance matrix.
• Compute the variance explained per component
and plot it.
• Reshape the eigenvectors and visualize their
images.
• Project the mean-subtracted data onto the
eigenvector basis.
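The mantra above, as a minimal NumPy sketch on synthetic 8x8 "images" (stand-ins for the 28x28 MNIST digits; the steps are identical, only smaller):

```python
import numpy as np

rng = np.random.default_rng(42)
n_samples, side = 200, 8
images = rng.random((n_samples, side, side))   # fake 8x8 images

# 1. Reshape: one image per row, one pixel per column.
X = images.reshape(n_samples, side * side)

# 2. Center: subtract the mean image.
Xc = X - X.mean(axis=0)

# 3. Covariance matrix of the pixels.
C = Xc.T @ Xc / (n_samples - 1)

# 4. SVD of the symmetric covariance: U holds eigenvectors,
#    S holds eigenvalues in decreasing order.
U, S, _ = np.linalg.svd(C)

# 5. Variance explained per component.
var_explained = S / S.sum()

# 6. Reshape an eigenvector back into image form to visualize it.
first_eigenimage = U[:, 0].reshape(side, side)

# 7. Project the mean-subtracted data onto the eigenvector basis.
scores = Xc @ U
print(var_explained[:3], scores.shape)
```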
First 9 Eigenvectors
Projections onto the first 2 axes
• The first two PCs capture ~37% of the variance.
• The data forms clear clusters that are almost linearly separable.
Building models: Synapses and
PCA
Hebbian Learning (Donald Hebb)
• 1949 book, 'The Organization
of Behavior': a theory of the
neural bases of learning.
• Learning takes place at
synapses.
• Synapses get modified: they
get stronger when the pre- and
post-synaptic cells fire
together.
• "Cells that fire together, wire
together."
Unstable: the plain Hebbian rule lets the weights grow without bound.
Building Hebbian synapses
Oja’s rule (Erkki Oja)
A simplified neuron model as a principal component analyzer. Journal of Mathematical Biology,
15:267-273 (1982).
Feedback, forgetting term, or regularizer:
• Stabilizes the Hebbian rule.
• Leads to a covariance learning rule: the weights
converge to the first eigenvector of the covariance
matrix.
• Similar to the power iteration method.
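A minimal simulation of Oja's rule (hypothetical toy data and learning rate, chosen for illustration), showing the weight vector converging to the top eigenvector of the input covariance:

```python
import numpy as np

rng = np.random.default_rng(1)
# Correlated 2-D inputs (toy data standing in for neural activity)
X = rng.normal(size=(5000, 2)) @ np.array([[2.0, 0.0], [1.0, 0.5]])

w = rng.normal(size=2)          # initial synaptic weights
eta = 0.01                      # learning rate (assumed)
for x in X:
    y = w @ x                   # post-synaptic activity
    # Hebbian term eta*y*x plus Oja's forgetting term -eta*y^2*w
    w += eta * y * (x - y * w)

# Compare with the top eigenvector of the input covariance
C = X.T @ X / len(X)
evals, evecs = np.linalg.eigh(C)
v1 = evecs[:, -1]               # eigenvector of the largest eigenvalue
print(abs(w @ v1))              # near 1: w aligns with v1 (up to sign)
```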
Learning outcomes
• Visualize and manipulate a relatively large and
complex data set.
• Perform PCA by building it step by step.
• Gain an intuition of the geometry involved in a
change of basis and projections.
• Start thinking about basic clustering
algorithms.
• Discuss dimensionality reduction and other
PCA applications.
Learning outcomes (cont)
• Discuss the assumptions, limitations and
shortcomings of applying PCA in different
contexts.
• Build a model of how PCA might actually
take place in neural circuits.
• Follow-up: eigenfaces. Is the brain doing
PCA to recognize faces?
