This document discusses feature extraction and dimensionality reduction techniques. It begins by defining feature extraction as a mapping from the original set of features to a reduced feature set chosen to maximize classification ability. It then explains principal component analysis (PCA), which finds orthogonal directions along which the variance of the data is largest. PCA is unsupervised, however: it ignores class labels, so the directions of largest variance are not necessarily the ones that separate the classes. Linear discriminant analysis (LDA) is then introduced as a supervised technique that finds projections by maximizing between-class distance while minimizing within-class distance, so that the projected classes are better separated. LDA thus provides a "good projection" for classification tasks.
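The contrast between PCA and LDA can be illustrated with a small NumPy sketch. It is not taken from the document: the synthetic two-class data, the function names (`pca_directions`, `fisher_lda_direction`, `fisher_ratio`), and the use of the two-class Fisher form of LDA (direction proportional to the inverse within-class scatter times the difference of class means) are all assumptions made for illustration. The data are built so that the largest variance lies along the first axis while the classes differ only along the second.

```python
import numpy as np

def pca_directions(X):
    """Return principal directions (as columns), sorted by decreasing variance."""
    Xc = X - X.mean(axis=0)                    # center the data
    cov = np.cov(Xc, rowvar=False)             # sample covariance matrix
    eigvals, eigvecs = np.linalg.eigh(cov)     # eigh returns ascending eigenvalues
    order = np.argsort(eigvals)[::-1]          # reorder to descending
    return eigvecs[:, order], eigvals[order]

def fisher_lda_direction(X, y):
    """Two-class Fisher direction: w proportional to Sw^{-1} (m1 - m0)."""
    X0, X1 = X[y == 0], X[y == 1]
    m0, m1 = X0.mean(axis=0), X1.mean(axis=0)
    # Within-class scatter: sum of the per-class scatter matrices
    Sw = (X0 - m0).T @ (X0 - m0) + (X1 - m1).T @ (X1 - m1)
    w = np.linalg.solve(Sw, m1 - m0)
    return w / np.linalg.norm(w)

def fisher_ratio(X, y, w):
    """Squared distance between projected class means over projected within-class variance."""
    z = X @ w
    z0, z1 = z[y == 0], z[y == 1]
    return (z1.mean() - z0.mean()) ** 2 / (z0.var() + z1.var())

# Assumed synthetic data: wide spread along axis 0, class offset along axis 1
rng = np.random.default_rng(0)
n = 200
X0 = rng.normal(loc=[0.0, 0.0], scale=[5.0, 0.5], size=(n, 2))
X1 = rng.normal(loc=[0.0, 3.0], scale=[5.0, 0.5], size=(n, 2))
X = np.vstack([X0, X1])
y = np.array([0] * n + [1] * n)

pcs, variances = pca_directions(X)
w_pca = pcs[:, 0]                      # first principal component
w_lda = fisher_lda_direction(X, y)     # Fisher discriminant direction

print("PCA direction:", w_pca, "Fisher ratio:", fisher_ratio(X, y, w_pca))
print("LDA direction:", w_lda, "Fisher ratio:", fisher_ratio(X, y, w_lda))
```

On data of this shape, the first principal component follows the high-variance axis and mixes the two classes when projected, whereas the Fisher direction follows the axis along which the class means differ. Comparing the two printed Fisher ratios gives a rough numeric sense of what a "good projection" means for classification.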