DECISION TREES
Why Decision Tree Structure in ML?
• A decision tree is a supervised learning algorithm used for
both classification and regression tasks. It models decisions
as a tree-like structure where internal nodes represent
attribute tests, branches represent attribute values, and
leaf nodes represent final decisions or predictions.
Decision trees are versatile, interpretable, and widely used in
machine learning for predictive modeling.
https://mlu-explain.github.io/decision-tree/
Intuition behind the Decision Tree
Here’s a simple example to help understand the intuition behind a decision tree:
Imagine you’re deciding whether to buy an umbrella:
1. Step 1 – Ask a Question (Root Node):
Is it raining?
If yes, you might decide to buy an umbrella. If no, you move to the next question.
2. Step 2 – More Questions (Internal Nodes):
If it’s not raining, you might ask:
Is it likely to rain later?
If yes, you buy an umbrella; if no, you don’t.
3. Step 3 – Decision (Leaf Node):
Based on your answers, you either buy or skip the umbrella.
Approach in Decision Tree
• A decision tree uses a tree representation to solve the problem:
each leaf node corresponds to a class label, and attributes are
tested at the internal nodes of the tree. Any Boolean function on
discrete attributes can be represented with a decision tree, as
sketched below.
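As a toy illustration (my own example, not from the slides), the Boolean function f(A, B) = A AND (NOT B) can be written as a decision tree of attribute tests, with one test per internal node and a class label at each leaf:

# Toy sketch: the Boolean function f(A, B) = A AND (NOT B) as a decision tree.
def f(a: bool, b: bool) -> bool:
    if a:                  # internal node: test attribute A
        if b:              # internal node: test attribute B
            return False   # leaf: class "False"
        return True        # leaf: class "True"
    return False           # leaf: class "False"

# Each root-to-leaf path tests every attribute at most once,
# and each leaf outputs a class label.
print(f(True, False))   # True
print(f(True, True))    # False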
Example: Predicting Whether a Person Likes
Computer Games
Imagine you want to predict if a person enjoys
computer games based on their age and gender.
Here’s how the decision tree works:
1. Start with the Root Question (Age):
 1. The first question is: “Is the person’s age less than 15?”
 2. If Yes, move to the left.
 3. If No, move to the right.
• Branch Based on Age: If the person is younger than 15, they are
likely to enjoy computer games (+2 prediction score).
• If the person is 15 or older, ask the next question: “Is the
person male?”
• Branch Based on Gender (For Age 15+): If the person is male,
they are somewhat likely to enjoy computer games (+0.1 prediction
score).
• If the person is not male, they are less likely to enjoy
computer games (-1 prediction score).
Example: Predicting Whether a Person Likes
Computer Games Using Two Decision Trees
Tree 1: Age and Gender
1. The first tree asks two questions:
 1. “Is the person’s age less than 15?”
  1. If Yes, they get a score of +2.
  2. If No, proceed to the next question.
 2. “Is the person male?”
  1. If Yes, they get a score of +0.1.
  2. If No, they get a score of -1.
Tree 2: Computer Usage
1. The second tree focuses on daily computer usage:
 1. “Does the person use a computer daily?”
  1. If Yes, they get a score of +0.9.
  2. If No, they get a score of -0.9.
Combining Trees: Final Prediction
The final prediction score is the sum of the scores from both
trees, as sketched below.
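A minimal sketch of this additive scoring scheme (the thresholds and scores are the ones given on the slides; the function and parameter names are mine, and this does not correspond to any particular library):

# Tree 1: age and gender
def tree1_score(age, is_male):
    if age < 15:
        return 2.0                     # young -> likely to enjoy games
    return 0.1 if is_male else -1.0    # 15 or older -> depends on gender

# Tree 2: daily computer usage
def tree2_score(uses_computer_daily):
    return 0.9 if uses_computer_daily else -0.9

# Final prediction: the sum of the scores from both trees.
def predict(age, is_male, uses_computer_daily):
    return tree1_score(age, is_male) + tree2_score(uses_computer_daily)

print(predict(age=12, is_male=True, uses_computer_daily=True))    # 2.0 + 0.9 = 2.9
print(predict(age=40, is_male=False, uses_computer_daily=False))  # -1.0 - 0.9 = -1.9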
Information Gain and Gini Index in Decision Tree
So far we have covered the basic intuition and approach behind
how a decision tree works; let us now move on to the attribute
selection measures used in decision trees.
Two attribute selection measures are commonly used:
1. Information Gain
2. Gini Index
1. Information Gain:
Information Gain tells us how useful a question (or feature) is
for splitting data into groups. It measures how much the
uncertainty decreases after the split. A good question will
create clearer groups, and the feature with the highest
Information Gain is chosen to make the decision.
• For example, if we split a dataset of people into “Young” and
“Old” based on age, and all young people bought the product
while all old people did not, the Information Gain would be
high because the split perfectly separates the two groups with
no uncertainty left.
• Suppose S is a set of instances, A is an attribute, Values(A)
is the set of all possible values of A, and Sv is the subset of S
for which attribute A has value v. Then:
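With this notation, the standard definition of information gain (with entropy as the impurity measure) is:

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \cdot Entropy(S_v)

Entropy(S) = - \sum_{i} p_i \log_2 p_i

where p_i is the proportion of instances in S that belong to class i.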
Building a Decision Tree using Information Gain
The essentials:
• Start with all training instances associated with the root node.
• Use information gain to choose which attribute to label each
node with.
• Note: no root-to-leaf path should contain the same discrete
attribute twice.
• Recursively construct each subtree on the subset of training
instances that would be classified down that path in the tree.
• If only positive or only negative training instances remain,
label that node “yes” or “no” accordingly.
• If no attributes remain, label with a majority vote of the
training instances left at that node.
• If no instances remain, label with a majority vote of the
parent’s training instances.
• Example: Now, let us draw a Decision Tree for the following
data using Information gain. Training set: 3 features and 2
classes
From the gain values computed above, we can see that the
information gain is maximum when we split on feature Y, so
feature Y is the best-suited feature for the root node. We can
also see that after splitting the dataset by feature Y, each
child contains a pure subset of the target variable, so no
further splitting is needed: the final tree for this dataset is
a single split on feature Y with two pure leaves. A sketch of
the underlying computation follows below.
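A minimal sketch of the entropy and information-gain computation behind this choice (the tiny dataset with binary features X, Y, Z and two classes is illustrative only, since the slide’s training table is not reproduced in this export):

from collections import Counter
from math import log2

def entropy(labels):
    # Entropy(S) = - sum_i p_i * log2(p_i)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)
    total = len(labels)
    gain = entropy(labels)
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Illustrative training set: 3 binary features, 2 classes (not the slide’s table).
rows = [
    {"X": 1, "Y": 1, "Z": 0}, {"X": 0, "Y": 1, "Z": 1},
    {"X": 1, "Y": 0, "Z": 0}, {"X": 0, "Y": 0, "Z": 1},
]
labels = ["yes", "yes", "no", "no"]

# The feature with the highest information gain becomes the root split.
gains = {f: information_gain(rows, labels, f) for f in ("X", "Y", "Z")}
print(gains)                       # Y separates the classes perfectly -> gain 1.0
print(max(gains, key=gains.get))   # 'Y'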
2. Gini Index
• The Gini Index is a metric that measures how often a randomly
chosen element would be incorrectly classified. An attribute with
a lower Gini index should therefore be preferred.
• Scikit-learn supports the “gini” criterion for the Gini Index,
and “gini” is its default value.
• For example, if we have a group of people who all bought the
product (100% “Yes”), the Gini Index is 0, indicating perfect
purity. But if the group has an equal mix of “Yes” and “No”, the
Gini Index is 0.5, showing higher impurity or uncertainty.
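The formula behind these numbers is the standard Gini impurity, Gini(S) = 1 − Σ_i p_i², where p_i is the proportion of class i in S. A minimal check of the two cases above:

def gini(class_proportions):
    # Gini impurity: the probability of misclassifying a randomly chosen
    # element if it were labelled according to the class distribution.
    return 1.0 - sum(p * p for p in class_proportions)

print(gini([1.0, 0.0]))  # 0.0 -> pure node (all "Yes")
print(gini([0.5, 0.5]))  # 0.5 -> maximally impure for two classes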
Compared to other impurity measures like entropy, the Gini Index
is faster to compute and more sensitive to changes in class
probabilities. One disadvantage of the Gini Index is that it
tends to favour splits that create equally sized child nodes,
even if they are not optimal for classification accuracy. In
practice, the choice between the Gini Index and other impurity
measures depends on the specific problem and dataset, and often
requires experimentation and tuning; a sketch of such an
experiment is shown below.
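A minimal sketch of this kind of experiment with scikit-learn (the iris dataset, the train/test split, and the max_depth value are illustrative choices; criterion="gini" and criterion="entropy" are actual parameter values of DecisionTreeClassifier):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small benchmark dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Compare the two impurity criteria with otherwise identical settings.
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=42)
    clf.fit(X_train, y_train)
    print(criterion, clf.score(X_test, y_test))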
DECISION TREES for Machine Learning Beginners
This is how a decision tree works: by splitting data
step-by-step based on the best questions and stopping
when a clear decision is made!
• PRACTICE: https://www.kaggle.com/code/kashnitsky/a3-demo-decision-trees