Machine Learning:LEC-2 (Unit 1)
Course Code: PCCAIML-502
Prof D. Chakraborty (PhD, J.U.)
Dept: Computer Sc. & Engg.
Asansol Engineering College, WB
FEATURE ENGINEERING
• Feature engineering is the pre-processing step of machine learning that
transforms raw data into features suitable for building a predictive model
with machine learning or statistical modelling.
• Since around 2016, automated feature engineering has also appeared in
machine learning software, helping to extract features from raw data
automatically. Feature engineering in ML mainly comprises four processes:
Feature Creation, Transformations, Feature Extraction, and Feature Selection.
• Feature engineering in machine learning aims to improve the performance of
models. In this topic, we will cover feature engineering in machine learning
in detail. But first, let's understand what a feature is and why feature
engineering is needed.
FEATURE ENGINEERING
• What is a feature?
• Generally, machine learning algorithms take input data to generate
output. The input data is usually in tabular form, consisting of rows
(instances or observations) and columns (variables or attributes), and these
attributes are often called features. For example, in computer vision an
image is an instance, and a line in the image could be a feature.
Similarly, in NLP a document can be an observation, and a word count
could be a feature. So a feature is an attribute that affects a
problem or is useful for solving it.
EXAMPLE OF DATASET
IMAGE DATASET
CONTD..
FEATURE ENGINEERING
• Feature Creation: Feature creation is finding the most useful
variables to use in a predictive model. The process is subjective,
and it requires human creativity and intervention. New features
are created by combining existing features through operations such as
addition, subtraction, and ratios, which gives great flexibility.
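Feature creation as described above can be sketched in a few lines of Python. The record fields and derived features here (BMI, debt-to-income) are illustrative examples of ratio features, not from the slides:

```python
# A minimal sketch of feature creation: deriving new variables by
# combining existing ones with arithmetic (addition, subtraction, ratios).
records = [
    {"weight_kg": 70, "height_m": 1.75, "income": 50000, "debt": 10000},
    {"weight_kg": 60, "height_m": 1.60, "income": 40000, "debt": 20000},
]

for r in records:
    # Ratio feature: body-mass index from weight and height
    r["bmi"] = r["weight_kg"] / r["height_m"] ** 2
    # Ratio feature: debt relative to income
    r["debt_to_income"] = r["debt"] / r["income"]

print(records[0]["bmi"])  # ≈ 22.86
```

Which combinations are worth creating is the subjective, human-driven part; the arithmetic itself is trivial.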
• Transformations: The transformation step of feature engineering
involves adjusting the predictor variables to improve the accuracy
and performance of the model. For example, it ensures that the
model can flexibly take a variety of input data and that
all the variables are on the same scale, making the model easier to
understand. It improves the model's accuracy and keeps all
the features within an acceptable range to avoid
computational errors.
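A common transformation that puts all variables on the same scale is standardization (zero mean, unit variance). A minimal sketch, using illustrative age data:

```python
# Standardize a feature: subtract the mean, divide by the standard deviation,
# so every variable ends up on the same scale.
def standardize(values):
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = var ** 0.5
    return [(v - mean) / std for v in values]

ages = [20, 30, 40, 50, 60]
scaled = standardize(ages)
print(scaled)  # centered on 0, with unit spread
```

In practice a library routine (e.g. a standard scaler fitted on training data only) would be used, but the arithmetic is exactly this.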
CONTD..
• Feature Extraction: Feature extraction is an automated feature engineering process
that generates new variables by extracting them from the raw data. The main aim of
this step is to reduce the volume of data so that it can be easily used and managed
for data modelling. Feature extraction methods include cluster analysis, text
analytics, edge detection algorithms, and principal component analysis (PCA).
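One of the extraction methods named above, PCA, can be sketched with NumPy alone: project two correlated variables onto their first principal component, reducing the data volume while keeping most of the variance. The synthetic data here is illustrative:

```python
import numpy as np

# PCA sketch: two correlated columns -> one extracted feature.
rng = np.random.default_rng(0)
x = rng.normal(size=100)
X = np.column_stack([x, 2 * x + rng.normal(scale=0.1, size=100)])

Xc = X - X.mean(axis=0)                  # center the data
cov = np.cov(Xc, rowvar=False)           # 2x2 covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigenvalues in ascending order
pc1 = eigvecs[:, -1]                     # direction of maximum variance
reduced = Xc @ pc1                       # 100 values instead of 100x2
```

Because the second column is almost a multiple of the first, the first component captures nearly all the variance, so one extracted feature stands in for two raw ones.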
• Feature Selection: While developing a machine learning model, only a few
variables in the dataset are useful for building it; the remaining features are
either redundant or irrelevant. Feeding the model all of these redundant and
irrelevant features may reduce its overall performance and
accuracy. Hence it is very important to identify and select the most
appropriate features from the data and remove the irrelevant or less important
ones, which is done with the help of feature selection in machine
learning: "Feature selection is a way of selecting the subset of the most relevant
features from the original feature set by removing the redundant, irrelevant, or
noisy features."
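A simple filter-style selection illustrates the idea: drop features whose variance falls below a threshold, since near-constant columns carry little information. The column names and threshold are illustrative:

```python
# Minimal variance-threshold feature selection.
def variance(col):
    m = sum(col) / len(col)
    return sum((v - m) ** 2 for v in col) / len(col)

columns = {
    "age":       [21, 35, 48, 52, 29],
    "height_cm": [170, 172, 171, 170, 171],  # nearly constant
    "all_same":  [1, 1, 1, 1, 1],            # carries no information
}

selected = [name for name, col in columns.items() if variance(col) > 1.0]
print(selected)  # ['age']
```

Real selection methods also consider relevance to the target and redundancy between features, but the shape of the procedure (score each feature, keep a subset) is the same.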
Machine Learning Paradigms
• Machine learning is commonly separated into three main
learning paradigms: supervised learning, unsupervised
learning, and reinforcement learning.
1. Supervised Learning
Supervised learning is the most common learning paradigm. In
supervised learning, the computer learns from a set of input-output
pairs, which are called labeled examples.
CONTD..
CONTD..
• Our goal is to predict the weight of an animal from its other
characteristics, so we rewrite this dataset as a set of input-output
pairs:
CONTD..
• The input variables (here, age and gender) are generally
called features, and the set of features representing an
example is called a feature vector. From this dataset, we can
learn a predictor in a supervised way.
Predicted output: 3.65 kg
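Learning a predictor from labeled pairs can be sketched with a least-squares line fit. The (age, weight) pairs below are a toy dataset, not the one on the slides:

```python
# Supervised learning sketch: fit a least-squares line to (input, output)
# pairs, then use it as a predictor for new inputs.
pairs = [(1, 2.0), (2, 3.1), (3, 3.9), (4, 5.2)]  # (age, weight_kg)

n = len(pairs)
sx = sum(a for a, _ in pairs)
sy = sum(w for _, w in pairs)
sxy = sum(a * w for a, w in pairs)
sxx = sum(a * a for a, _ in pairs)

slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

def predict(age):
    return slope * age + intercept

print(round(predict(2.5), 2))  # 3.55
```

The fitted line is the "predictor learned in a supervised way": it generalizes from the labeled examples to inputs it has never seen.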
SUPERVISED LEARNING
CONTD..
• Unsupervised Learning
• Unsupervised learning is the second most used learning
paradigm. It is not used as much as supervised learning, but it
unlocks different types of applications. In unsupervised
learning, the data is not split into inputs and outputs; it is just
a set of examples:
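A typical unsupervised task on such a set of examples is clustering. A bare-bones 1-D k-means sketch, with toy data and naive initialization:

```python
# Unsupervised learning sketch: group unlabeled examples into two clusters
# with a minimal k-means loop (1-D data for brevity).
data = [1.0, 1.2, 0.8, 8.0, 8.3, 7.9]

centers = [data[0], data[3]]  # naive initialization from two examples
for _ in range(10):
    groups = [[], []]
    for x in data:
        # Assign each example to its nearest center
        idx = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
        groups[idx].append(x)
    # Move each center to the mean of its assigned examples
    centers = [sum(g) / len(g) for g in groups]

print(sorted(round(c, 2) for c in centers))  # [1.0, 8.07]
```

No labels are ever supplied; the structure (two well-separated groups) is discovered from the examples alone.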
UNSUPERVISED LEARNING
REINFORCEMENT LEARNING
• The third classic learning paradigm is called reinforcement learning,
which is a way for autonomous agents to learn. Reinforcement learning is
fundamentally different from supervised and unsupervised learning in the
sense that the data is not provided as a fixed set of examples. Rather, the
data to learn from is obtained by interacting with an external system
called the environment. The name "reinforcement learning" originates
from behavioral psychology, but it could just as well be called "interactive
learning."
• Reinforcement learning is often used to teach agents, such as robots, to
learn a given task. The agent learns by taking actions in the environment
and receiving observations from this environment:
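This act-then-observe loop can be sketched with tabular Q-learning on a tiny corridor environment (states 0..4, reward only at the right end). The environment and parameters are illustrative, not from the slides:

```python
import random

# Reinforcement learning sketch: the agent is not given a fixed dataset;
# it generates experience by acting in the environment and observing
# the resulting state and reward.
N_STATES, ACTIONS = 5, [-1, +1]          # actions: move left or right
alpha, gamma, eps = 0.5, 0.9, 0.1        # learning rate, discount, exploration
Q = [[1.0, 1.0] for _ in range(N_STATES)]  # optimistic init drives exploration

random.seed(0)
for _ in range(500):                     # episodes of interaction
    s = 0
    while s != N_STATES - 1:
        # epsilon-greedy action choice
        a = random.randrange(2) if random.random() < eps \
            else max((0, 1), key=lambda i: Q[s][i])
        s2 = min(max(s + ACTIONS[a], 0), N_STATES - 1)
        r = 1.0 if s2 == N_STATES - 1 else 0.0
        # Q-learning update from the observed transition (s, a, r, s2)
        Q[s][a] += alpha * (r + gamma * max(Q[s2]) - Q[s][a])
        s = s2

# The greedy policy learned from interaction (1 = move right)
policy = [max((0, 1), key=lambda i: Q[s][i]) for s in range(N_STATES - 1)]
print(policy)
```

Note that no (input, output) pairs exist here: every update uses a transition the agent itself produced by acting, which is exactly what separates this paradigm from the previous two.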
Reinforcement learning