Pruning in Decision Trees
Pruning is a technique used in decision tree algorithms to reduce overfitting and improve how well the model generalizes. Overfitting occurs when a decision tree is so complex that it captures noise in the training data rather than the underlying patterns. Pruning removes parts of the tree that add little predictive power, making the tree simpler and more interpretable.
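There are two main types of pruning: pre-pruning (early stopping), which halts tree growth before the tree fits the training data perfectly, for example by limiting the depth or requiring a minimum number of samples per leaf; and post-pruning, which grows the full tree first and then removes branches. The method described here, cost-complexity pruning, is a post-pruning method.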
Cost-Complexity Pruning:
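Cost-complexity pruning is a common method for post-pruning decision trees. Each candidate subtree T of the fully grown tree (obtained by collapsing internal nodes into leaves) is assigned a cost R_alpha(T) = R(T) + alpha * |T|, where R(T) is the error of T on the training data (for example, its misclassification rate), |T| is the number of leaf nodes in T, and alpha >= 0 is a complexity parameter. For a given alpha, the subtree with the smallest cost is kept as the pruned tree. With alpha = 0 the full tree minimizes the cost; larger values of alpha penalize leaves more heavily and yield smaller trees.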
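To make the selection rule concrete, here is a minimal sketch in Python. The three candidate subtrees and their training errors are made-up numbers for illustration only, and the names Candidate, cost and best_subtree are hypothetical helpers, not part of any library.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    training_error: float  # R(T): error rate on the training data (made-up values)
    n_leaves: int          # |T|: number of leaf nodes


def cost(candidate, alpha):
    # Cost-complexity measure: training error plus a penalty of alpha per leaf.
    return candidate.training_error + alpha * candidate.n_leaves


def best_subtree(candidates, alpha):
    # The subtree with the smallest cost is the one that is kept.
    return min(candidates, key=lambda c: cost(c, alpha))


# Hypothetical candidates: the full tree and two progressively smaller subtrees.
candidates = [
    Candidate("full tree", 0.05, 4),
    Candidate("one split collapsed", 0.08, 3),
    Candidate("root only", 0.40, 1),
]

for alpha in (0.0, 0.02, 0.05, 0.2):
    chosen = best_subtree(candidates, alpha)
    print(f"alpha={alpha}: keep '{chosen.name}' (cost {cost(chosen, alpha):.2f})")
```

With these numbers, small values of alpha keep the full tree, a moderate value keeps the three-leaf subtree, and a large value collapses everything to the root. In each case the subtree with the least cost is the one that is kept; the others are discarded.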
Example:
Consider a decision tree for predicting whether a customer will buy a product based on two features: age and income. The fully grown tree might look like this:
IF age < 30 AND income < 50000
    THEN Classify as "Not Buy"
ELSE IF age >= 30 AND income >= 50000
    THEN Classify as "Buy"
ELSE IF age >= 30 AND income < 50000
    THEN Classify as "Not Buy"
ELSE
    Classify as "Buy"
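In this example, suppose the income split for customers under 30 mostly fits noise, for instance because very few training customers are under 30 with an income of at least 50000. Pruning would then collapse that branch into a single "Not Buy" leaf, giving: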
IF age < 30 AND income < 50000
    THEN Classify as "Not Buy"
ELSE IF age >= 30 AND income >= 50000
    THEN Classify as "Buy"
ELSE
    Classify as "Not Buy"
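The two rule sets above can be written as small Python functions to see exactly which customers are affected. This is only a sketch: it assumes the tree splits on age at the root and on income below it, which is one tree shape consistent with the rules, and the sample customers are invented.

```python
def full_tree(age, income):
    # Fully grown tree: four leaves, one per age/income combination.
    if age < 30:
        return "Not Buy" if income < 50000 else "Buy"
    return "Buy" if income >= 50000 else "Not Buy"


def pruned_tree(age, income):
    # Pruned tree: the income split under age < 30 is collapsed into one leaf.
    if age < 30:
        return "Not Buy"
    return "Buy" if income >= 50000 else "Not Buy"


# Invented sample customers covering each of the four regions.
customers = [(25, 40000), (25, 60000), (45, 60000), (45, 40000)]

# Customers whose prediction changes after pruning.
changed = [c for c in customers if full_tree(*c) != pruned_tree(*c)]
print(changed)
```

Only customers under 30 with an income of at least 50000 receive a different prediction; every other region of the feature space is classified as before.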
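By pruning, the complexity of the tree is reduced (here from four rules to three), and it becomes less likely to overfit the training data. Pruning too aggressively carries the opposite risk of underfitting, so the amount of pruning (for example, the value of alpha) is usually chosen with a validation set or cross-validation. The goal is a tree that generalizes well to unseen data.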
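In practice a library can enumerate the candidate values of alpha. The sketch below assumes scikit-learn and NumPy are installed and uses the bundled iris data set as a stand-in (it is not the customer data from the example above). It fits one pruned tree per candidate alpha and compares them by 5-fold cross-validated accuracy; the choice of 5 folds and of accuracy as the score is an assumption made for illustration.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# Stand-in data set; replace with your own features X and labels y.
X, y = load_iris(return_X_y=True)

# Candidate alphas at which the fully grown tree loses another subtree.
path = DecisionTreeClassifier(random_state=0).cost_complexity_pruning_path(X, y)
# Guard against tiny negative values from floating-point error, drop duplicates.
alphas = np.unique(np.clip(path.ccp_alphas, 0.0, None))

results = []
for alpha in alphas:
    tree = DecisionTreeClassifier(random_state=0, ccp_alpha=float(alpha))
    # Mean accuracy over 5 cross-validation folds for this amount of pruning.
    cv_accuracy = cross_val_score(tree, X, y, cv=5).mean()
    # Size of the pruned tree when fitted on all of the data.
    n_leaves = tree.fit(X, y).get_n_leaves()
    results.append((float(alpha), n_leaves, cv_accuracy))

for alpha, n_leaves, cv_accuracy in results:
    print(f"alpha={alpha:.4f}  leaves={n_leaves}  cv accuracy={cv_accuracy:.3f}")

best_alpha, best_leaves, best_accuracy = max(results, key=lambda r: r[2])
print(f"selected alpha={best_alpha:.4f} with {best_leaves} leaves")
```

Larger values of alpha give trees with fewer leaves, and the cross-validated score shows where further pruning starts to hurt rather than help.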
4 months ago: I didn't get this: do we keep the tree with the least cost, or do we remove it and keep the rest? I'll attach the image of the tree that I made; can somebody explain what the pruned tree will look like?
1 year ago: How are the error values obtained?