Supervised Learning
Understanding Bagging and Boosting
Both are ensemble techniques, in which a set of weak learners is combined to create a strong learner that obtains better performance than any single one.
Error = Bias + Variance + Noise
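For squared error, this shorthand corresponds to the standard decomposition of expected prediction error; spelled out (a standard textbook form, not given in the deck):

\mathbb{E}\big[(y - \hat f(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat f(x)] - f(x)\big)^2}_{\text{Bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat f(x) - \mathbb{E}[\hat f(x)])^2\big]}_{\text{Variance}}
  + \underbrace{\sigma^2}_{\text{Noise}}

Bagging attacks the variance term, boosting mainly the bias term; the noise term is irreducible.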
Bagging, short for Bootstrap Aggregating
It's a way to increase accuracy by decreasing variance.
Done by generating additional datasets through sampling with replacement (combinations with repetition), producing multisets of the same cardinality/size as the original dataset.
Example: Random Forest
Develops fully grown decision trees (low bias, high variance) that are kept uncorrelated to maximize the decrease in variance.
Since bagging cannot reduce bias, it requires large, unpruned trees.
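A minimal sketch of the idea with scikit-learn, on a synthetic placeholder dataset (names and numbers are illustrative, not from the deck): BaggingClassifier draws bootstrap multisets of the original size and aggregates unpruned trees, and RandomForestClassifier adds per-split feature sampling to decorrelate them.

# Minimal bagging sketch (illustrative placeholder data, not the deck's dataset).
from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier, RandomForestClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

# Plain bagging: bootstrap multisets of the same size as the original data,
# each fitted with the default base learner (an unpruned decision tree:
# low bias, high variance); predictions are aggregated by majority vote.
bagging = BaggingClassifier(n_estimators=200, max_samples=1.0,
                            bootstrap=True, random_state=0)

# Random Forest: same idea, plus a random subset of features at each split,
# which decorrelates the trees and increases the variance reduction.
forest = RandomForestClassifier(n_estimators=200, max_depth=None, random_state=0)

for name, model in (("bagging", bagging), ("random forest", forest)):
    print(name, cross_val_score(model, X, y, cv=5).mean())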
Boosting
It's a way to increase accuracy by reducing bias.
A 2-step process:
Develop averagely performing models over subsets of the original data.
Boost their performance by combining them using a cost function (e.g., majority vote).
Note: every subset contains the elements that were misclassified (or nearly misclassified) by the previous model.
Example: Gradient Boosted Trees
Develops shallow decision trees (high bias, low variance), a.k.a. weak learners.
Reduces error mainly by reducing bias, developing each new learner taking the previous learner into account (sequentially).
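A comparable sketch for boosting, again on placeholder data: GradientBoostingClassifier fits shallow trees sequentially, each one on the gradients (shortcomings) of the ensemble built so far, and staged_predict exposes that sequential improvement.

# Minimal gradient-boosting sketch (illustrative placeholder data).
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

# Shallow trees = weak learners (high bias, low variance); each new tree is
# fit to the gradients of the loss of the ensemble built so far (sequential).
gbt = GradientBoostingClassifier(n_estimators=300, max_depth=3,
                                 learning_rate=0.05, random_state=0)
gbt.fit(X_tr, y_tr)

# Accuracy as trees are added: bias falls as the sequence grows.
for i, y_pred in enumerate(gbt.staged_predict(X_te), start=1):
    if i % 100 == 0:
        print(i, "trees:", accuracy_score(y_te, y_pred))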
Understanding Graphically
(Figures: graphical illustrations of bagging vs. boosting, not reproduced in this text.)
Comparison: Similarities and Differences
Both are ensemble methods to get N learners from 1 learner...
... but while they are built independently for Bagging, Boosting tries to add new models that do well where previous models fail.
Both generate several training datasets by random sampling...
... but only Boosting determines weights for the data, to tip the scales in favor of the most difficult cases.
Both make the final decision by averaging the N learners (or taking a majority vote)...
... but it is an equally weighted average for Bagging and a weighted average for Boosting, with more weight given to the learners that perform better on the training data.
Both are good at reducing variance and provide higher stability...
... but only Boosting tries to reduce bias. On the other hand, Bagging may solve the overfitting problem, while Boosting can increase it.
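To make the weighting difference concrete, a small sketch under the same placeholder-data assumption: bagging counts every member equally, while AdaBoost (used here as a stand-in for boosting) learns a weight per member and per sample.

# Equal-weight vs. performance-weighted voting (illustrative data).
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier, BaggingClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=2000, n_features=20, random_state=0)

bag = BaggingClassifier(n_estimators=100, random_state=0)   # equal vote per learner
ada = AdaBoostClassifier(n_estimators=100, random_state=0)  # vote weighted by training performance

for name, model in (("bagging", bag), ("adaboost", ada)):
    print(name, cross_val_score(model, X, y, cv=5).mean())

# Boosting exposes the per-learner weights it computed; bagging has no
# analogue, because every member contributes with the same weight.
print(ada.fit(X, y).estimator_weights_[:5])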
Exploring the Scope of Supervised Learning in the Current Setup
Areas where Supervised Learning can be useful:
Feature Selection for Clustering
Evaluating Features
Increasing the Aggressiveness of the Current Setup
Bringing New Rules Idea
Feature Selection / Feature Importance & Model Accuracy and Threshold Evaluation
Algorithm Used | Feature Importance Metric
XGBoost | F Score
Random Forest | Gini Index, Entropy
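A hedged sketch of how those metrics can be read out in code, assuming xgboost and scikit-learn are available and using a synthetic placeholder dataset (feature names are illustrative): the F score is XGBoost's split-count ("weight") importance, while the Random Forest numbers are impurity-based importances under the chosen criterion.

# Pulling both kinds of importance scores (illustrative data and names).
import pandas as pd
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=5000, n_features=15, random_state=0)
feature_names = [f"f{i}" for i in range(X.shape[1])]

# XGBoost F score: how many times each feature is used in a split, summed over trees.
xgb_model = xgb.XGBClassifier(n_estimators=200, max_depth=4).fit(X, y)
f_score = xgb_model.get_booster().get_score(importance_type="weight")

# Random Forest: mean impurity decrease, with Gini or entropy as the impurity.
rf = {crit: RandomForestClassifier(n_estimators=200, criterion=crit,
                                   random_state=0).fit(X, y)
      for crit in ("gini", "entropy")}

importances = pd.DataFrame(
    {crit: model.feature_importances_ for crit, model in rf.items()},
    index=feature_names,
)
print(importances.sort_values("gini", ascending=False).head(15))
print(sorted(f_score.items(), key=lambda kv: kv[1], reverse=True)[:15])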
Feature Selection / Importance: XGBoost - F Score (importance chart)
Feature Selection / Importance: RF - Gini Index (importance chart)
Feature Selection / Importance: RF - Entropy (importance chart)
Feature Selection / Importance
Comparison between important features by Random Forest & XGBoost
Analysis of the top 15 important variables, as ranked by each model:

XGBoost - F Score: feature_1a2, feature_2c3, feature_hhs, feature_nrp, feature_urh, feature_nub, feature_nup, feature_psc, feature_sncp, feature_3e1, feature_tpa, feature_snc, feature_bst, feature_tbu, feature_nub
RF - Gini: feature_sut, feature_sc3, feature_21w, feature_sc18, feature_du1, feature_sc1, feature_drh, feature_drl, feature_1a2, feature_snc, feature_npb, feature_3e1, feature_tbu, feature_nub, feature_bst
RF - Entropy: feature_21w, feature_sut, feature_du1, feature_sc3, feature_drh, feature_1a2, feature_sc18, feature_drl, feature_snc, feature_sc1, feature_2c3, feature_npb, feature_3e1, feature_bst, feature_nub
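One way to summarize the comparison above is the overlap between the top-15 sets; a tiny sketch (the lists here are hypothetical stand-ins, not the real rankings):

# Overlap between top-15 feature rankings (placeholder lists, not the real ones).
top15 = {
    "xgb_f_score": ["feature_a", "feature_b", "feature_c"],  # ...truncated for brevity
    "rf_gini": ["feature_b", "feature_c", "feature_d"],
    "rf_entropy": ["feature_b", "feature_d", "feature_e"],
}

names = list(top15)
for i, a in enumerate(names):
    for b in names[i + 1:]:
        shared = set(top15[a]) & set(top15[b])
        print(f"{a} vs {b}: {len(shared)} shared -> {sorted(shared)}")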
Feature Selection / Importance
Comparison between important features by Random Forest & XGBoost
Reason for the difference in feature importance between XGBoost & RF
When there are several correlated features, boosting tends to choose one of them and use it in several trees (if necessary). The other correlated features won't be used much (or at all): they can no longer help in the split process, because they bring no new information beyond the feature already used, and the learning is done serially.
Each tree of a Random Forest, by contrast, is not built from the same features (there is a random selection of features for each tree), so each correlated feature gets a chance to be selected in one of the trees. Looking at the whole model, it has therefore used all of the features. The learning is done in parallel, so no tree is aware of what the other trees have used.
Tree growth in XGBoost
When you grow too many trees, the trees start to look very similar (once there is little loss left to learn), so the dominant feature becomes even more important. Shallow trees reinforce this trend, because only a few features can appear at the root of a tree (and the features shared between trees are most often the ones at the root). So these results are not surprising.
In this case, random selection of columns (a rate around 0.8) may give interesting results. Decreasing eta (the learning rate) may also help, since it keeps more loss to explain after each iteration.
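These two remedies map, as far as I can tell (an assumption about the tooling behind the deck), onto XGBoost's column-subsampling and learning-rate parameters; a short illustrative sketch:

# Spreading splits across correlated features in XGBoost (illustrative values).
import xgboost as xgb

model = xgb.XGBClassifier(
    n_estimators=500,
    max_depth=4,
    colsample_bytree=0.8,  # "random selection of columns (rate around 0.8)"
    learning_rate=0.05,    # eta; lower than the 0.3 default, keeps loss to explain
    random_state=0,
)
# model.fit(X_train, y_train)  # X_train / y_train: your own data (not defined here)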
Model Accuracy and Threshold Evaluation: XGBoost
(Figures: class A vs. class B score distributions at successive probability thresholds.)
Model Accuracy and Threshold Evaluation: XGBoost
Threshold Accuracy TN FP FN TP
0 0.059% 0 46990 0 2936
0.1 87.353% 42229 4761 1553 1383
0.2 93.881% 46075 915 2140 796
0.3 94.722% 46691 299 2336 600
0.4 94.894% 46866 124 2425 511
0.5 94.902% 46923 67 2478 458
0.6 94.866% 46956 34 2529 407
0.7 94.856% 46973 17 2551 385
0.8 94.824% 46977 13 2571 365
0.9 94.776% 46982 8 2600 336
1 94.119% 46990 0 2936 0
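A sketch of how a table like this can be generated from predicted probabilities; everything below is a synthetic stand-in (roughly 94% negative class, to mimic the imbalance), so the counts will not match the deck's.

# Accuracy / confusion-matrix sweep over probability thresholds (illustrative).
import numpy as np
import xgboost as xgb
from sklearn.datasets import make_classification
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50000, weights=[0.94], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

clf = xgb.XGBClassifier(n_estimators=200, max_depth=4).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]

print("Threshold  Accuracy     TN     FP     FN     TP")
for t in np.linspace(0.0, 1.0, 11):
    pred = (proba >= t).astype(int)
    tn, fp, fn, tp = confusion_matrix(y_te, pred, labels=[0, 1]).ravel()
    print(f"{t:9.1f} {(tn + tp) / len(y_te):9.3%} {tn:6d} {fp:6d} {fn:6d} {tp:6d}")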
Model Accuracy and Threshold Evaluation
Random Forest, Criterion: Gini Index vs. Random Forest, Criterion: Entropy
Criteria Accuracy TN FP FN TP
Gini 94.800% 46968 22 2574 362
Entropy 94.788% 46967 23 2579 357
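The Gini-vs-entropy rows can be produced the same way, at the forest's default majority-vote decision; again a sketch on placeholder data, not a reproduction of the deck's numbers.

# Random Forest: Gini vs. entropy split criterion (illustrative data).
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import confusion_matrix
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=50000, weights=[0.94], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)

print("Criteria   Accuracy     TN     FP     FN     TP")
for criterion in ("gini", "entropy"):
    rf = RandomForestClassifier(n_estimators=200, criterion=criterion,
                                random_state=0).fit(X_tr, y_tr)
    tn, fp, fn, tp = confusion_matrix(y_te, rf.predict(X_te), labels=[0, 1]).ravel()
    print(f"{criterion:9s} {(tn + tp) / len(y_te):9.3%} {tn:6d} {fp:6d} {fn:6d} {tp:6d}")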
Model Accuracy and Threshold Evaluation
Comparison between Random Forest & XGBoost
Random Forest (criterion comparison):
Criteria Accuracy TN FP FN TP
Gini 94.800% 46968 22 2574 362
Entropy 94.788% 46967 23 2579 357
XGBoost (threshold sweep):
Threshold Accuracy TN FP FN TP
0 0.059% 0 46990 0 2936
0.1 87.353% 42229 4761 1553 1383
0.2 93.881% 46075 915 2140 796
0.3 94.722% 46691 299 2336 600
0.4 94.894% 46866 124 2425 511
0.5 94.902% 46923 67 2478 458
0.6 94.866% 46956 34 2529 407
0.7 94.856% 46973 17 2551 385
0.8 94.824% 46977 13 2571 365
0.9 94.776% 46982 8 2600 336
1 94.119% 46990 0 2936 0
Bringing New Rules Idea
Comparison between Random Forest & XGBoost