Day 4 of 30 – Decision Trees in Machine Learning


#MachineLearning #AI #MLRoadmap #30DaysOfML #LearningTogether #DecisionTree #SupervisedLearning #PythonML #HandsOnML #GrowEveryday #SaileshWrites

What is a Decision Tree?

A Decision Tree is like a flowchart. Imagine asking a series of yes/no questions to arrive at a decision – like a game of “20 Questions”. Decision Trees follow a similar logic.

Each internal node represents a question (decision), each branch represents the outcome, and each leaf node represents the final result (prediction).

It's like asking:

  • Is the weather sunny?

      • Yes → Is it hot?

          • Yes → Stay indoors

          • No → Go outside

      • No → Carry an umbrella

This kind of structure is great for both classification (Yes/No, True/False, categories) and regression (predicting numbers).

No Feature Scaling Needed!

Unlike logistic or linear regression, Decision Trees are not affected by features with very different ranges, so you don’t need to scale them. That’s one less preprocessing step!
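As a quick illustration, here is a minimal sketch (my own example with scikit-learn on a tiny made-up dataset, not part of today’s notebook) showing a tree fit directly on raw features with very different ranges:

from sklearn.tree import DecisionTreeClassifier

# Feature 1 lives in 0-1, feature 2 in the thousands; no StandardScaler needed.
X = [[0.2, 1500], [0.9, 3000], [0.4, 1200], [0.8, 4000]]
y = [0, 1, 0, 1]

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X, y)
print(clf.predict([[0.5, 2000]]))  # prediction depends on the learned splits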

Advantages

  • Easy to understand

  • Inbuilt feature selection

  • Requires little data preprocessing and preparation

  • Works for both classification and regression

  • No need for scaling

  • Performs well with large datasets

Disadvantages

  • Can overfit if not pruned

  • Sensitive to small changes in the data (a slight change can produce a very different tree)

  • An unbalanced dataset can create problems (splits become biased toward the majority class)

Decision Tree Metrics (Split Criteria)

  1. Gini Impurity

  2. Entropy

  3. Information Gain

What is Gini Impurity?

Gini Impurity is one of the popular metrics used to decide how to split a node in a Decision Tree. It helps us measure how “pure” or “impure” a node is. In other words, it tells us how mixed up the classes are in a group of data.

Imagine This:

Let’s say you have a basket of fruits — apples and oranges.

  • If your basket contains only apples, it is pure.

  • If it contains 50% apples and 50% oranges, it is impure.

The Gini Impurity measures this impurity.

Gini Formula

The formula for Gini Impurity is:

Gini = 1 - Σ (pi^2)

Where:

  • pi is the probability (or proportion) of class i in the node, and the sum runs over all classes.

Example 1: Pure Node (All Apples)

Suppose:

  • You have 10 fruits — all apples.

So, the probability of apple is p1 = 1, and of orange is p2 = 0

Gini = 1 - (1^2 + 0^2) = 1 - (1 + 0) = 0

A Gini score of 0 means the node is completely pure.

Example 2: Half Apples, Half Oranges

Suppose:

  • You have 10 fruits — 5 apples, 5 oranges.

So, p1 = 0.5; p2 = 0.5

Gini = 1 - (0.5^2 + 0.5^2) = 1 - (0.25 + 0.25) = 0.5

A Gini score of 0.5 means the node is maximally impure (as mixed up as two classes can be).

Example 3: 80% Apples, 20% Oranges

So, p1 = 0.8; p2 = 0.2

Gini = 1 - (0.8^2 + 0.2^2) = 1 - (0.64 + 0.04) = 1 - 0.68 = 0.32

A Gini score of 0.32 — better than 0.5 — means this node is less impure.
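A quick way to check these three numbers yourself is a small helper function (a standalone sketch, not from the linked notebook):

def gini(proportions):
    # Gini Impurity: 1 - sum of squared class proportions
    return 1 - sum(p ** 2 for p in proportions)

print(gini([1.0, 0.0]))  # Example 1: all apples  -> 0.0
print(gini([0.5, 0.5]))  # Example 2: 50/50 split -> 0.5
print(gini([0.8, 0.2]))  # Example 3: 80/20 split -> ~0.32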

Gini in Decision Tree

The Decision Tree algorithm chooses the split that gives us:

  • Lower Gini Impurity after the split

  • Because lower Gini means the resulting groups are purer (more “certain” class predictions); the short sketch below shows how a candidate split is scored
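Here is that scoring in code, using a hypothetical split of the 5-apples/5-oranges basket purely for illustration: each child’s Gini is weighted by its share of the samples, and the split with the lowest weighted Gini wins.

def gini(proportions):
    return 1 - sum(p ** 2 for p in proportions)

def weighted_gini(children):
    # children: list of (sample_count, class_proportions) for each child node
    total = sum(n for n, _ in children)
    return sum((n / total) * gini(p) for n, p in children)

# Parent: 5 apples + 5 oranges -> Gini = 0.5
# Candidate split: child A gets 4 apples + 1 orange, child B gets 1 apple + 4 oranges
children = [(5, [0.8, 0.2]), (5, [0.2, 0.8])]
print(weighted_gini(children))  # 0.32 < 0.5, so this split reduces impurity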

What is Entropy?

Entropy is a metric that tells us how pure or impure a dataset is. It comes from information theory and is used in decision trees to decide which attribute to split on.

  • If all elements in a dataset belong to the same class → Entropy = 0 (pure)

  • If the data is split evenly between two classes → Entropy = 1 (maximally impure)

Think of it like:

"How much disorder or uncertainty is in this group?"

Entropy Formula:

Entropy(S) = - Σ (pi * log2(pi))

Where:

  • S = dataset

  • c = number of classes (the sum runs from i = 1 to c)

  • pi = proportion of class i

✅ Example:

Suppose you have 10 samples:

  • 6 are "Yes" (positive)

  • 4 are "No" (negative)

Then,

Entropy = -(0.6 * log2(0.6) + 0.4 * log2(0.4)) ≈ 0.971

So the entropy is 0.971, which means the data is somewhat impure.
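You can verify this with a couple of lines of Python (standalone, using only the math module):

from math import log2

def entropy(proportions):
    # Entropy: -sum(p_i * log2(p_i)); classes with p = 0 contribute nothing
    return -sum(p * log2(p) for p in proportions if p > 0)

print(entropy([0.6, 0.4]))  # ~0.971 for 6 "Yes" and 4 "No" out of 10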

What is Information Gain?

Information Gain tells us how much entropy is reduced after we split the data on a particular feature.

"How much better did we make our dataset by splitting it on this feature?"

Formula:

IG(S, A) = Entropy(S) - Σ (|Sv| / |S|) * Entropy(Sv)

where the sum runs over each value v of feature A, and Sv is the subset of S where A takes value v.

  • It calculates the reduction in entropy.

  • The higher the Information Gain, the better that feature is for splitting.

Example:

Suppose we want to decide whether to play outside based on the weather (Sunny, Rainy). We have:

Parent set:

  • 6 Yes

  • 4 No

Entropy = 0.971 (from earlier)

Split on “Weather”:

  • Sunny (5 samples) → 4 Yes, 1 No; Entropy ≈ 0.722

  • Rainy (5 samples) → 2 Yes, 3 No; Entropy ≈ 0.971

Weighted average of child entropies:

Entropy = (5/10 * 0.722) + (5/10 * 0.971) = 0.847

Information Gain:

IG = 0.971 − 0.847 = 0.124

So, splitting on "Weather" reduces impurity by 0.124.
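Putting the pieces together, this short sketch (my own reproduction of the example above, not the linked notebook) recovers the same Information Gain:

from math import log2

def entropy(proportions):
    return -sum(p * log2(p) for p in proportions if p > 0)

parent = entropy([6/10, 4/10])   # ~0.971
sunny  = entropy([4/5, 1/5])     # ~0.722
rainy  = entropy([2/5, 3/5])     # ~0.971

weighted = (5/10) * sunny + (5/10) * rainy   # ~0.847
print(parent - weighted)                     # ~0.1245, i.e. the 0.124 above up to rounding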

Hands on

Click here to access the dataset

Click here to access the working code

TIP: To see how to run the code, or where the code is written and executed, here is a short video to help: https://www.youtube.com/watch?v=RLYoEyIHL6A
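If you want something self-contained to play with alongside the linked notebook, here is a rough sketch of the same kind of workflow. It uses scikit-learn’s built-in Iris data as a stand-in, not the linked dataset, so treat it purely as a template:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import confusion_matrix, classification_report

# Stand-in data; swap in the linked dataset for the real exercise.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

clf = DecisionTreeClassifier(criterion="gini", random_state=42)
clf.fit(X_train, y_train)
y_pred = clf.predict(X_test)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))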

Learning Recap

Today we learned:

  • What Decision Trees are and how they work

  • Why they are intuitive and visual

  • How to train and test a Decision Tree model

  • How to interpret its accuracy using a confusion matrix and classification report


What’s Coming Next?

Next up – Day 5: K-Nearest Neighbors (KNN). A model that learns by “looking around” – quite literally!


Tip: Try changing the tree’s maximum depth (the max_depth parameter) and see how the accuracy changes. A deeper tree may fit better but can overfit – watch out!
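One quick way to try that, continuing the Iris stand-in from the hands-on sketch above:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

for depth in [1, 2, 3, 5, None]:  # None lets the tree grow until the leaves are pure
    clf = DecisionTreeClassifier(max_depth=depth, random_state=42).fit(X_train, y_train)
    print(depth, clf.score(X_test, y_test))  # test accuracy at each depth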

Share this with a friend who's curious about how machines make decisions!

Repost to your network — let’s build a powerful ML community together. #MachineLearning #AI #Python #DataScience #SaileshWrites #MLCommunity #30DaysChallenge #TechLearning #GrowEveryday
