14/04/2025 1
Department of Computer Science & Engineering (SB-ET)
III B. Tech -I Semester
MACHINE LEARNING
SUBJECT CODE: 22PCOAM16
Academic Year: 2023-2024
by
Dr. M.Gokilavani
GNITC
Department of CSE (SB-ET)
22PCOAM16 MACHINE LEARNING
UNIT – III
Syllabus
Learning with Trees – Decision Trees – Constructing Decision Trees –
Classification and Regression Trees – Ensemble Learning – Boosting –
Bagging – Different ways to Combine Classifiers – Basic Statistics –
Gaussian Mixture Models – Nearest Neighbor Methods – Unsupervised
Learning – K means Algorithms
TEXTBOOK:
• Stephen Marsland, Machine Learning: An Algorithmic Perspective, Second Edition, Chapman and Hall/CRC Machine Learning and Pattern Recognition Series, 2014.
REFERENCES:
• Tom M Mitchell, Machine Learning, First Edition, McGraw Hill Education, 2013.
• Ethem Alpaydin, Introduction to Machine Learning, Third Edition (Adaptive Computation and Machine Learning series), MIT Press.
No of Hours Required: 13
Department of CSE (SB-ET)
UNIT - III LECTURE – 19
Constructing Decision Trees
• Starting at the Root: The algorithm begins at the top, called the “root
node,” representing the entire dataset.
• Asking the Best Questions: It looks for the most important feature or
question that splits the data into the most distinct groups.
• Branching Out: Based on the answer to that question, it divides the data
into smaller subsets, creating new branches. Each branch represents a
possible route through the tree.
• Repeating the Process: The algorithm continues asking questions and
splitting the data at each branch until it reaches the final “leaf nodes,”
representing the predicted outcomes or classifications.
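The four steps above can be sketched as a small recursive procedure. This is a minimal illustration in Python, not the course's reference implementation; it assumes categorical attributes stored as dictionaries, a toy weather dataset invented for the example, and uses the entropy-based information gain introduced later in this lecture to pick each split:

```python
import math
from collections import Counter

def entropy(labels):
    """Impurity of a list of class labels: -sum(p * log2(p))."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def best_attribute(rows, labels, attributes):
    """Step 2: pick the attribute whose split yields the largest drop in entropy."""
    base = entropy(labels)
    def gain(attr):
        weighted = 0.0
        for value in {r[attr] for r in rows}:
            subset = [l for r, l in zip(rows, labels) if r[attr] == value]
            weighted += len(subset) / len(labels) * entropy(subset)
        return base - weighted
    return max(attributes, key=gain)

def build_tree(rows, labels, attributes):
    """Steps 1, 3, 4: start from all data, branch, and recurse until a leaf."""
    if len(set(labels)) == 1 or not attributes:
        return Counter(labels).most_common(1)[0][0]   # leaf: majority class
    attr = best_attribute(rows, labels, attributes)
    remaining = [a for a in attributes if a != attr]
    branches = {}
    for value in {r[attr] for r in rows}:
        sub_rows = [r for r in rows if r[attr] == value]
        sub_labels = [l for r, l in zip(rows, labels) if r[attr] == value]
        branches[value] = build_tree(sub_rows, sub_labels, remaining)
    return {attr: branches}

# Hypothetical toy dataset: whether to play outside, given the weather.
rows = [{"outlook": "sunny", "windy": "no"}, {"outlook": "sunny", "windy": "yes"},
        {"outlook": "rain",  "windy": "no"}, {"outlook": "rain",  "windy": "yes"}]
labels = ["yes", "yes", "yes", "no"]
tree = build_tree(rows, labels, ["outlook", "windy"])
```

The returned tree is a nested dictionary: each internal node maps an attribute to its branches, and each leaf is a predicted class label.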
Constructing Decision Trees
• While implementing a decision tree, the main issue is how to select the best attribute for the root node and for the sub-nodes.
• This problem is solved by a technique called the Attribute Selection Measure (ASM).
• Using such a measure, we can easily select the best attribute for the nodes
of the tree. There are two popular ASM techniques:
• Information Gain
• Gini Index
INFORMATION GAIN
• Information gain measures the change in entropy after a dataset is split on an attribute.
• It calculates how much information a feature provides us about a class.
• According to the value of information gain, we split the node and build the
decision tree.
• A decision tree algorithm always tries to maximize the value of
information gain, and a node/attribute having the highest information gain
is split first.
• It can be calculated using the below formula:
Information Gain = Entropy(S) − [(Weighted Avg) × Entropy(each feature)]
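The formula can be checked numerically with a small Python sketch. The counts below (9 "yes" / 5 "no" in the parent, split on a hypothetical Wind attribute) are the classic play-tennis example and are used purely as illustration data:

```python
import math

def entropy(labels):
    """Entropy(S) = -sum over classes of p * log2(p)."""
    n = len(labels)
    return -sum((labels.count(c) / n) * math.log2(labels.count(c) / n)
                for c in set(labels))

def information_gain(parent_labels, child_groups):
    """Entropy of the parent minus the size-weighted entropy of each child."""
    n = len(parent_labels)
    weighted = sum(len(g) / n * entropy(g) for g in child_groups)
    return entropy(parent_labels) - weighted

# Hypothetical counts from the classic play-tennis data, split on Wind.
parent = ["yes"] * 9 + ["no"] * 5
weak   = ["yes"] * 6 + ["no"] * 2    # Wind = weak
strong = ["yes"] * 3 + ["no"] * 3    # Wind = strong
gain = information_gain(parent, [weak, strong])   # roughly 0.048 bits
```

A small gain like this tells the algorithm that Wind is a weak splitting attribute compared to alternatives with higher gain.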
Entropy
• Entropy: Entropy is a metric that measures the impurity of a given attribute.
It specifies the randomness in the data.
• Entropy can be calculated as:
Entropy(S) = −P(yes) log2 P(yes) − P(no) log2 P(no)
Where,
• S = the set of samples
• P(yes) = the proportion of "yes" samples in S
• P(no) = the proportion of "no" samples in S
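A minimal sketch of this formula in Python; the guard for p = 0 follows the usual convention that 0 · log2(0) = 0:

```python
import math

def entropy(p_yes, p_no):
    """Entropy(S) = -P(yes) log2 P(yes) - P(no) log2 P(no)."""
    h = 0.0
    for p in (p_yes, p_no):
        if p > 0:                      # convention: 0 * log2(0) = 0
            h -= p * math.log2(p)
    return h

pure  = entropy(1.0, 0.0)   # a pure node: entropy 0
mixed = entropy(0.5, 0.5)   # a 50/50 node: maximum entropy of 1 bit
```

Entropy is 0 when a node contains a single class and reaches its maximum of 1 bit when the two classes are equally likely.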
Gini Index
• Gini index is a measure of impurity or purity used while creating a decision
tree in the CART (Classification and Regression Tree) algorithm.
• An attribute with a low Gini index should be preferred over one with a high
Gini index.
• The CART algorithm creates only binary splits, and it uses the Gini index to
choose them.
• Gini index can be calculated using the below formula:
Gini Index = 1 − ∑j Pj²
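The formula translates directly into Python; the label lists below are hypothetical example data:

```python
def gini(labels):
    """Gini index = 1 - sum over classes j of Pj**2."""
    n = len(labels)
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

pure  = gini(["yes"] * 4)                  # 0.0: a pure node
mixed = gini(["yes", "yes", "no", "no"])   # 0.5: the worst case for two classes
```

Like entropy, the Gini index is 0 for a pure node, but it peaks at 0.5 (for two classes) rather than 1, which is why the two measures usually pick similar but not always identical splits.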
Pruning
• Pruning is the process of deleting unnecessary nodes from a tree in order
to obtain the optimal decision tree.
• A tree that is too large increases the risk of overfitting, while a tree that
is too small may not capture all the important features of the dataset.
• Pruning is therefore the technique of decreasing the size of the learned tree
without reducing its accuracy.
• There are two main tree pruning techniques:
• Cost Complexity Pruning
• Reduced Error Pruning
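Reduced Error Pruning can be illustrated with a small hedged sketch: replace a subtree with its majority-class leaf whenever that does not hurt accuracy on a held-out validation set. The nested-dict tree shape, helper names, and example data are assumptions for illustration only, not the textbook's implementation:

```python
def predict(tree, row):
    """Walk a nested-dict tree of the form {attribute: {value: subtree_or_label}}."""
    while isinstance(tree, dict):
        attr = next(iter(tree))
        tree = tree[attr][row[attr]]
    return tree

def prune_subtree(tree, majority_label, val_rows, val_labels):
    """Reduced-error rule: replace the subtree with its majority-class leaf
    unless doing so lowers accuracy on the held-out validation set."""
    def correct(t):
        return sum(predict(t, r) == l for r, l in zip(val_rows, val_labels))
    return majority_label if correct(majority_label) >= correct(tree) else tree

# Hypothetical subtree and validation data.
tree = {"windy": {"no": "yes", "yes": "no"}}
val_rows = [{"windy": "no"}, {"windy": "yes"}, {"windy": "no"}]
val_labels = ["yes", "yes", "yes"]
pruned = prune_subtree(tree, "yes", val_rows, val_labels)   # collapses to a leaf
```

Here the validation set says "yes" regardless of wind, so the split adds nothing and the subtree is replaced by the single leaf "yes"; a full implementation applies this test bottom-up at every internal node.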
Types of Decision Tree
• Classification Trees: Used when the target variable is categorical. For
example, predicting whether an email is spam or not spam.
• Regression Trees: Used when the target variable is continuous, like
predicting house prices.
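The two tree types differ mainly in what a leaf predicts. A minimal sketch of just that difference; the helper names and example values are hypothetical:

```python
from collections import Counter
from statistics import mean

def classification_leaf(labels):
    """A classification-tree leaf predicts the majority class of its samples."""
    return Counter(labels).most_common(1)[0][0]

def regression_leaf(targets):
    """A regression-tree leaf predicts the mean of the targets reaching it."""
    return mean(targets)

spam_call  = classification_leaf(["spam", "spam", "not spam"])  # categorical target
price_call = regression_leaf([200.0, 250.0, 300.0])             # continuous target
```

The splitting machinery is shared; only the leaf prediction (and the impurity measure, e.g. variance instead of Gini for regression) changes between the two.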
Topics to be covered in next session (Session 20):
• ID3 Algorithm
Thank you!!!