DECISION TREES
Why Decision Tree Structure in ML?
• A decision tree is a supervised learning algorithm used for
both classification and regression tasks. It models decisions
as a tree-like structure where internal nodes represent
attribute tests, branches represent attribute values, and
leaf nodes represent final decisions or predictions.
Decision trees are versatile, interpretable, and widely used in
machine learning for predictive modeling.
https://mlu-explain.github.io/decision-tree/
Intuition behind the Decision Tree
Here’s a simple example to help understand the intuition behind a decision tree:
Imagine you’re deciding whether to buy an umbrella:
1. Step 1 – Ask a Question (Root Node):
Is it raining?
If yes, you might decide to buy an umbrella. If no, you move to the next question.
2. Step 2 – More Questions (Internal Nodes):
If it’s not raining, you might ask:
Is it likely to rain later?
If yes, you buy an umbrella; if no, you don’t.
3. Step 3 – Decision (Leaf Node):
Based on your answers, you either buy or skip the umbrella.
Approach in Decision Tree
• A decision tree uses a tree representation to solve the problem:
each leaf node corresponds to a class label, and attributes are
tested at the internal nodes of the tree. Any Boolean function on
discrete attributes can be represented with a decision tree, as
sketched below.
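As a toy illustration (my own example, not from the slides), the Boolean function f(A, B) = A AND (NOT B) can be written as a decision tree of attribute tests, with one test per internal node and a class label at each leaf:

# Toy sketch: the Boolean function f(A, B) = A AND (NOT B) as a decision tree.
def f(a: bool, b: bool) -> bool:
    if a:                  # internal node: test attribute A
        if b:              # internal node: test attribute B
            return False   # leaf: class "False"
        return True        # leaf: class "True"
    return False           # leaf: class "False"

# Each root-to-leaf path tests every attribute at most once,
# and each leaf outputs a class label.
print(f(True, False))   # True
print(f(True, True))    # False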
Example: Predicting Whether a Person Likes
Computer Games
Imagine you want to predict if a person enjoys
computer games based on their age and gender.
Here’s how the decision tree works:
1. Start with the Root Question (Age):
 1. The first question is: “Is the person’s age less than 15?”
 2. If Yes, move to the left.
 3. If No, move to the right.
• Branch Based on Age: If the person is younger than 15, they are
likely to enjoy computer games (+2 prediction score).
• If the person is 15 or older, ask the next question: “Is the
person male?”
• Branch Based on Gender (For Age 15+): If the person is male,
they are somewhat likely to enjoy computer games (+0.1 prediction
score).
• If the person is not male, they are less likely to enjoy
computer games (-1 prediction score).
Example: Predicting Whether a Person Likes
Computer Games Using Two Decision Trees
Tree 1: Age and Gender
1. The first tree asks two questions:
 1. “Is the person’s age less than 15?”
  1. If Yes, they get a score of +2.
  2. If No, proceed to the next question.
 2. “Is the person male?”
  1. If Yes, they get a score of +0.1.
  2. If No, they get a score of -1.
Tree 2: Computer Usage
1. The second tree focuses on daily computer usage:
 1. “Does the person use a computer daily?”
  1. If Yes, they get a score of +0.9.
  2. If No, they get a score of -0.9.
Combining Trees: Final Prediction
The final prediction score is the sum of the scores from both
trees, as sketched below.
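A minimal sketch of this additive scoring scheme (the thresholds and scores are the ones given on the slides; the function and parameter names are mine, and this does not correspond to any particular library):

# Tree 1: age and gender
def tree1_score(age, is_male):
    if age < 15:
        return 2.0                     # young -> likely to enjoy games
    return 0.1 if is_male else -1.0    # 15 or older -> depends on gender

# Tree 2: daily computer usage
def tree2_score(uses_computer_daily):
    return 0.9 if uses_computer_daily else -0.9

# Final prediction: the sum of the scores from both trees.
def predict(age, is_male, uses_computer_daily):
    return tree1_score(age, is_male) + tree2_score(uses_computer_daily)

print(predict(age=12, is_male=True, uses_computer_daily=True))    # 2.0 + 0.9 = 2.9
print(predict(age=40, is_male=False, uses_computer_daily=False))  # -1.0 - 0.9 = -1.9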
Information Gain and Gini Index in Decision Tree
So far we have covered the basic intuition and approach behind
how a decision tree works; let us now move on to the attribute
selection measures used in decision trees.
Two attribute selection measures are commonly used:
1. Information Gain
2. Gini Index
1. Information Gain:
Information Gain tells us how useful a question (or feature) is
for splitting data into groups. It measures how much the
uncertainty decreases after the split. A good question will
create clearer groups, and the feature with the highest
Information Gain is chosen to make the decision.
• For example, if we split a dataset of people into “Young” and
“Old” based on age, and all young people bought the product
while all old people did not, the Information Gain would be
high because the split perfectly separates the two groups with
no uncertainty left.
• Suppose S is a set of instances, A is an attribute, Values(A)
is the set of all possible values of A, and Sv is the subset of S
for which attribute A has value v. Then:
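With this notation, the standard definition of information gain (with entropy as the impurity measure) is:

Gain(S, A) = Entropy(S) - \sum_{v \in Values(A)} \frac{|S_v|}{|S|} \cdot Entropy(S_v)

Entropy(S) = - \sum_{i} p_i \log_2 p_i

where p_i is the proportion of instances in S that belong to class i.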
Building a Decision Tree using Information Gain
The essentials:
• Start with all training instances associated with the root node.
• Use information gain to choose which attribute to label each
node with.
• Note: no root-to-leaf path should contain the same discrete
attribute twice.
• Recursively construct each subtree on the subset of training
instances that would be classified down that path in the tree.
• If only positive or only negative training instances remain,
label that node “yes” or “no” accordingly.
• If no attributes remain, label with a majority vote of the
training instances left at that node.
• If no instances remain, label with a majority vote of the
parent’s training instances.
• Example: Now, let us draw a Decision Tree for the following
data using Information gain. Training set: 3 features and 2
classes
From the gain values computed above, we can see that the
information gain is maximum when we split on feature Y, so
feature Y is the best-suited feature for the root node. We can
also see that after splitting the dataset by feature Y, each
child contains a pure subset of the target variable, so no
further splitting is needed: the final tree for this dataset is
a single split on feature Y with two pure leaves. A sketch of
the underlying computation follows below.
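A minimal sketch of the entropy and information-gain computation behind this choice (the tiny dataset with binary features X, Y, Z and two classes is illustrative only, since the slide’s training table is not reproduced in this export):

from collections import Counter
from math import log2

def entropy(labels):
    # Entropy(S) = - sum_i p_i * log2(p_i)
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, labels, feature):
    # Gain(S, A) = Entropy(S) - sum_v (|S_v| / |S|) * Entropy(S_v)
    total = len(labels)
    gain = entropy(labels)
    for value in set(row[feature] for row in rows):
        subset = [lab for row, lab in zip(rows, labels) if row[feature] == value]
        gain -= (len(subset) / total) * entropy(subset)
    return gain

# Illustrative training set: 3 binary features, 2 classes (not the slide’s table).
rows = [
    {"X": 1, "Y": 1, "Z": 0}, {"X": 0, "Y": 1, "Z": 1},
    {"X": 1, "Y": 0, "Z": 0}, {"X": 0, "Y": 0, "Z": 1},
]
labels = ["yes", "yes", "no", "no"]

# The feature with the highest information gain becomes the root split.
gains = {f: information_gain(rows, labels, f) for f in ("X", "Y", "Z")}
print(gains)                       # Y separates the classes perfectly -> gain 1.0
print(max(gains, key=gains.get))   # 'Y'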
2. Gini Index
• The Gini Index is a metric that measures how often a randomly
chosen element would be incorrectly classified. An attribute with
a lower Gini index should therefore be preferred.
• Scikit-learn supports the “gini” criterion for the Gini Index,
and “gini” is its default value.
• For example, if we have a group of people who all bought the
product (100% “Yes”), the Gini Index is 0, indicating perfect
purity. But if the group has an equal mix of “Yes” and “No”, the
Gini Index is 0.5, showing higher impurity or uncertainty.
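The formula behind these numbers is the standard Gini impurity, Gini(S) = 1 − Σ_i p_i², where p_i is the proportion of class i in S. A minimal check of the two cases above:

def gini(class_proportions):
    # Gini impurity: the probability of misclassifying a randomly chosen
    # element if it were labelled according to the class distribution.
    return 1.0 - sum(p * p for p in class_proportions)

print(gini([1.0, 0.0]))  # 0.0 -> pure node (all "Yes")
print(gini([0.5, 0.5]))  # 0.5 -> maximally impure for two classes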
Compared to other impurity measures like entropy, the Gini Index
is faster to compute and more sensitive to changes in class
probabilities. One disadvantage of the Gini Index is that it
tends to favour splits that create equally sized child nodes,
even if they are not optimal for classification accuracy. In
practice, the choice between the Gini Index and other impurity
measures depends on the specific problem and dataset, and often
requires experimentation and tuning; a sketch of such an
experiment is shown below.
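A minimal sketch of this kind of experiment with scikit-learn (the iris dataset, the train/test split, and the max_depth value are illustrative choices; criterion="gini" and criterion="entropy" are actual parameter values of DecisionTreeClassifier):

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

# Load a small benchmark dataset and hold out a test split.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# Compare the two impurity criteria with otherwise identical settings.
for criterion in ("gini", "entropy"):
    clf = DecisionTreeClassifier(criterion=criterion, max_depth=3, random_state=42)
    clf.fit(X_train, y_train)
    print(criterion, clf.score(X_test, y_test))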
DECISION TREES for Machine Learning Beginners
This is how a decision tree works: by splitting data
step-by-step based on the best questions and stopping
when a clear decision is made!
• PRACTICE: https://www.kaggle.com/code/kashnitsky/a3-demo-decision-trees