UNIT – III: CLASSIFICATION
Topic 8 FEATURE SELECTION OR
DIMENSIONALITY REDUCTION
AALIM MUHAMMED SALEGH COLLEGE OF ENGINEERING
DEPARTMENT OF COMPUTER SCIENCE AND ENGINEERING
SEMESTER – VIII
PROFESSIONAL ELECTIVE – IV
CS8080- INFORMATION RETRIEVAL TECHNIQUES
UNIT III : TEXT CLASSIFICATION AND CLUSTERING
1. A Characterization of Text Classification
2. Unsupervised Algorithms: Clustering
3. Naïve Text Classification
4. Supervised Algorithms
5. Decision Tree
6. k-NN Classifier
7. SVM Classifier
8. Feature Selection or Dimensionality Reduction
9. Evaluation metrics
10. Accuracy and Error
11. Organizing the classes
12. Indexing and Searching
13. Inverted Indexes
14. Sequential Searching
15. Multi-dimensional Indexing
FEATURE SELECTION OR DIMENSIONALITY REDUCTION
• Feature selection and dimensionality reduction allow us to minimize the number of features in a dataset by keeping only the features that are important.
• In other words, we want to retain the features that carry the most useful information for our model to make accurate predictions, while discarding redundant features that contain little to no information.
• There are several benefits to performing feature selection and dimensionality reduction, including:
• improved model interpretability,
• reduced overfitting, and
• a smaller training set and, consequently, shorter training time.
Dimensionality Reduction
• The number of input variables or features for a dataset is referred to
as its dimensionality.
• Dimensionality reduction refers to techniques that reduce the number
of input variables in a dataset.
• More input features often make a predictive modeling task more challenging, a difficulty generally referred to as the curse of dimensionality.
• High-dimensionality statistics and dimensionality reduction techniques are often used for data visualization.
• Nevertheless, these techniques can be used in applied machine learning to simplify a classification or regression dataset in order to better fit a predictive model.
Problem With Many Input Variables
• If your data is represented using rows and columns, such as in a spreadsheet, then the input variables are the columns that are fed as input to a model to predict the target variable. Input variables are also called features.
• We can consider the columns of data as the dimensions of an n-dimensional feature space and the rows of data as points in that space.
• This is a useful geometric interpretation of a dataset (sketched below).
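A minimal sketch of this rows-as-points view in NumPy (the dataset and its values are purely illustrative):

```python
import numpy as np

# Toy dataset: each row is one sample (a point), each column one feature
# (a dimension of the feature space). Values are made up for illustration.
X = np.array([
    [5.1, 3.5, 1.4],  # sample 1
    [4.9, 3.0, 1.4],  # sample 2
    [6.2, 3.4, 5.4],  # sample 3
])

n_samples, n_features = X.shape
print(f"{n_samples} points in a {n_features}-dimensional feature space")
```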
• Having a large number of dimensions in the feature space can mean that the volume of that space is very large, and in turn, the points that we have in that space (rows of data) often represent a small and non-representative sample.
• This can dramatically impact the performance of machine learning algorithms fit on data with many input features, generally referred to as the “curse of dimensionality.”
• Therefore, it is often desirable to reduce the number of input features. This reduces the number of dimensions of the feature space, hence the name “dimensionality reduction.”
Dimensionality Reduction
• Dimensionality reduction refers to techniques for reducing the number of input variables in training data.
• When dealing with high-dimensional data, it is often useful to reduce the dimensionality by projecting the data onto a lower-dimensional subspace that captures the “essence” of the data (a brief sketch follows).
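A minimal sketch of this projection idea using scikit-learn's PCA (the random data and the choice of 2 components are assumptions for illustration):

```python
import numpy as np
from sklearn.decomposition import PCA

# Toy data: 100 samples in a 10-dimensional feature space.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))

# Project onto the 2-dimensional subspace that preserves the most variance.
pca = PCA(n_components=2)
X_reduced = pca.fit_transform(X)

print(X.shape, "->", X_reduced.shape)  # (100, 10) -> (100, 2)
print(pca.explained_variance_ratio_)   # variance captured per component
```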
• Fewer input dimensions often mean correspondingly fewer
parameters or a simpler structure in the machine learning model,
referred to as degrees of freedom.
• A model with too many degrees of freedom is likely to overfit the
training dataset and therefore may not perform well on new data.
• It is desirable to have simple models that generalize well, and in turn,
input data with few input variables.
• This is particularly true for linear models where the number of inputs
and the degrees of freedom of the model are often closely related.
Techniques for Dimensionality Reduction
• There are many techniques that can be used for dimensionality reduction, including:
• Feature Selection Methods
• Matrix Factorization (see the sketch below)
• Manifold Learning
• Autoencoder Methods
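As a sketch of the matrix factorization route, here is truncated SVD via scikit-learn (the matrix shape and component count are illustrative assumptions):

```python
import numpy as np
from sklearn.decomposition import TruncatedSVD

# Toy term-document-style matrix: 50 documents x 500 term features.
rng = np.random.default_rng(1)
X = rng.random((50, 500))

# Factorize X and keep the top 20 latent dimensions.
svd = TruncatedSVD(n_components=20, random_state=1)
X_reduced = svd.fit_transform(X)

print(X_reduced.shape)  # (50, 20)
```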
Feature Selection Methods
• Feature selection is also called variable selection or attribute
selection.
• It is the automatic selection of attributes in your data (such as
columns in tabular data) that are most relevant to the predictive
modeling problem you are working on.
• In other words, feature selection is the process of selecting a subset of relevant features for use in model construction.
• Feature selection is different from dimensionality reduction.
• Both methods seek to reduce the number of attributes in the dataset,
• but dimensionality reduction methods do so by creating new combinations of attributes,
• whereas feature selection methods include and exclude attributes present in the data without changing them (the sketch below contrasts the two).
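A minimal sketch of that contrast, assuming a labeled toy dataset (all sizes and the scoring function are illustrative):

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.feature_selection import SelectKBest, f_classif

rng = np.random.default_rng(2)
X = rng.normal(size=(100, 8))     # 100 samples, 8 original features
y = rng.integers(0, 2, size=100)  # binary labels

# Feature selection: keeps 3 of the original columns, unchanged.
selected = SelectKBest(score_func=f_classif, k=3).fit_transform(X, y)

# Dimensionality reduction: builds 3 new columns (linear combinations).
projected = PCA(n_components=3).fit_transform(X)

print(selected.shape, projected.shape)  # (100, 3) (100, 3)
```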
• Examples of dimensionality reduction methods include
• Principal Component Analysis,
• Singular Value Decomposition and
• Sammon’s Mapping.
• Feature selection is useful in itself, but it mostly acts as a filter, muting out features that aren’t useful; unlike dimensionality reduction, it does not create new features from your existing ones.
Feature Selection Algorithms
Filter Methods
• Filter feature selection methods apply a statistical measure to assign a score to each feature.
• The features are ranked by score and either kept or removed from the dataset.
• The methods are often univariate and consider each feature independently, or with regard to the dependent variable.
• Examples of filter methods include the Chi-squared test, information gain and correlation coefficient scores (a Chi-squared sketch follows).
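A minimal Chi-squared filter sketch with scikit-learn (the Iris data and k=2 are assumptions for illustration; chi2 requires non-negative feature values):

```python
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2

X, y = load_iris(return_X_y=True)  # 150 samples, 4 non-negative features

# Score every feature against the class label and keep the top 2.
selector = SelectKBest(score_func=chi2, k=2)
X_new = selector.fit_transform(X, y)

print(selector.scores_)  # chi-squared score per feature
print(X_new.shape)       # (150, 2)
```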
Wrapper Methods
• Wrapper methods consider the selection of a set of features as a search problem, where different combinations are prepared, evaluated and compared to other combinations.
• A predictive model is used to evaluate each combination of features and assign a score based on model accuracy.
• The search process may be
• methodical, such as a best-first search,
• stochastic, such as a random hill-climbing algorithm, or
• heuristic, like forward and backward passes to add and remove features.
• An example of a wrapper method is the recursive feature elimination (RFE) algorithm (a sketch follows).
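A minimal RFE sketch with scikit-learn (the estimator choice, synthetic data and number of features to keep are assumptions):

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

# Synthetic task: 10 features, only 4 of which are informative.
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=4, random_state=0)

# Recursively fit the model and drop the weakest feature until 4 remain.
rfe = RFE(estimator=LogisticRegression(max_iter=1000), n_features_to_select=4)
rfe.fit(X, y)

print(rfe.support_)  # boolean mask of kept features
print(rfe.ranking_)  # 1 = selected; larger = eliminated earlier
```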
Embedded Methods
• Embedded methods learn which features best contribute to the accuracy of the model while the model is being created.
• The most common type of embedded feature selection method is regularization.
• Regularization methods, also called penalization methods, introduce additional constraints into the optimization of a predictive algorithm (such as a regression algorithm) that bias the model toward lower complexity (fewer coefficients).
• Examples of regularization algorithms are LASSO, Elastic Net and Ridge Regression (a LASSO sketch follows).
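A minimal LASSO sketch with scikit-learn (the synthetic data and alpha value are assumptions), showing how the L1 penalty drives some coefficients to exactly zero and thereby selects features during training:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic regression task: 10 features, only 3 informative.
X, y = make_regression(n_samples=200, n_features=10,
                       n_informative=3, noise=1.0, random_state=0)

# The L1 penalty (alpha) shrinks uninformative coefficients to exactly 0.
lasso = Lasso(alpha=1.0)
lasso.fit(X, y)

print(lasso.coef_)  # zero entries correspond to discarded features
```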
Any Questions?