1
DTS304TC: Machine Learning
Lecture 4: Boosting
Dr Kang Dang
D-5032, Taicang Campus
Kang.Dang@xjtlu.edu.cn
Tel: 88973341
2
Ensembling Methods
• Ensembles involve a group of prediction models working
together to improve classification outcomes.
• Bagging Revisited: Bagging develops multiple models
using randomly selected subsets of the training dataset.
• Introduction to Boosting : Our focus today shifts to
Boosting, a technique where models are trained in
sequence, with an increased focus on examples that were
incorrectly predicted in previous rounds.
3
Decision Stump
• A weak learner is a model that predicts only marginally better than random
guessing, for example reaching an accuracy just above 50% (such as 55%) on a
binary task.
• Boosting leverages these weak models, preferring ones that are
computationally simple, such as a decision stump — a decision tree with only
one decision node.
4
Visualizing Decision Stump Classifiers
Decision stump classifiers delineate the feature space using
horizontal and vertical boundaries, creating distinct half-spaces
for classification.
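A quick sketch of this idea in Python (assuming scikit-learn and a synthetic 2-D dataset; none of this comes from the slides): a depth-1 decision tree splits on a single feature at a single threshold, so its decision boundary is one horizontal or vertical line.

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

# Synthetic 2-D data so the stump's single axis-aligned split is easy to read off
X, y = make_classification(n_samples=100, n_features=2, n_informative=2,
                           n_redundant=0, random_state=0)

stump = DecisionTreeClassifier(max_depth=1).fit(X, y)
print("split feature index:", stump.tree_.feature[0])   # which axis the boundary is perpendicular to
print("split threshold:", stump.tree_.threshold[0])     # where the vertical/horizontal boundary sits
```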
5
Understanding AdaBoost Algorithm
6
Understanding the AdaBoost
Algorithm
• Key steps of AdaBoost:
1. In every iteration, AdaBoost adjusts the weights of the training instances, emphasizing those
that were previously misclassified.
2. A fresh weak classifier is then trained on these reweighted instances.
3. Newly developed classifiers are merged into the current ensemble with an assigned weight
based on their accuracy, thereby strengthening the collective decision-making power.
4. Repeat this process many times.
• Each weak learner is tasked with minimizing the error on the weighted data.
• AdaBoost's strategy of focusing on prior errors reduces the overall bias of the model, sharpening
accuracy with each step.
Example of AdaBoost
Original dataset: equal weights for all the samples.
8
AdaBoost Round-1
Train a decision stump h1 on the equally weighted samples.
Weighted training error of the stump: ε1 = 0.3 => weighting of the current decision stump: α1 ≈ 0.42.
Question: why is the weighted training error ε = 0.3 here?
9
AdaBoost Round-1
Weighted training error of h1: ε1 = 0.3 => weighting of the current decision stump: α1 ≈ 0.42.
Combined classifier so far: H(x) = α1 · h1(x); the weights of the misclassified samples are then increased for the next round.
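As a quick check of these round-1 numbers, plugging ε1 = 0.3 into the stump-weighting formula from the key-steps slide (assuming the natural logarithm) gives:

\alpha_1 = \tfrac{1}{2}\log\!\left(\tfrac{1-\epsilon_1}{\epsilon_1}\right) = \tfrac{1}{2}\log\!\left(\tfrac{0.7}{0.3}\right) \approx 0.42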
10
Question
• If the decision stump is well optimized, the error rate ε should always be
less than 0.5. Why?
• α (the confidence/trust level) will then be greater than 0. When will the
confidence level be high?
• α (the confidence/trust) is used as the weight of each individual weak
learner/classifier.
11
Question
• In AdaBoost, after each boosting round, we update the weight
of each individual sample using this formula.
• What does this formula mean?
• How do we explain the figure?
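The formula referenced above appears only as an image in the original slide; a standard form of the AdaBoost per-sample weight update, shown here as a sketch rather than the slide's exact notation, is:

w_i \leftarrow w_i \cdot \exp\!\big(\alpha_t \, \mathbb{I}\{h_t(x^{(i)}) \neq t^{(i)}\}\big)

Misclassified samples have their weights multiplied by e^{\alpha_t} > 1 (since \alpha_t > 0 whenever \epsilon_t < 0.5), while correctly classified samples keep their weights; the weights are then typically renormalised to sum to 1, so the next weak learner concentrates on the previous round's mistakes.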
12
AdaBoost Round-2
Train a second decision stump h2 on the re-weighted samples.
Weighted training error of h2: ε2 = 0.21 => weighting of the current decision stump: α2 ≈ 0.66.
Combined classifier so far: H(x) = α1 · h1(x) + α2 · h2(x).
13
AdaBoost Round-3
Train a third decision stump h3 on the re-weighted samples.
Weighted training error of h3: ε3 = 0.14 => weighting of the current decision stump: α3 ≈ 0.92.
Combined classifier so far: H(x) = α1 · h1(x) + α2 · h2(x) + α3 · h3(x).
14
AdaBoost Final Classifier
15
Key steps of the above AdaBoost process
\epsilon = \frac{\sum_{i=1}^{N} w_i \,\mathbb{I}\{h_t(x^{(i)}) \neq t^{(i)}\}}{\sum_{i=1}^{N} w_i}
\qquad
\alpha = \frac{1}{2}\log\!\left(\frac{1-\epsilon}{\epsilon}\right)
Data samples with weights:
1. Train a decision stump (see the decision tree building lecture slides).
2. Calculate the weighted classification error of the decision stump.
3. Calculate the weighting factor of the current decision stump.
4. Increase the data sample weights for wrongly classified samples.
5. Proceed to the next round of boosting.
Final classifier after T rounds of boosting: H(x) = \sum_{t=1}^{T} \alpha_t h_t(x), with the predicted class given by the sign of H(x).
16
AdaBoost Algorithm
17
AdaBoost Algorithm Key Steps
1. Train a classifier on the weighted data samples
2. Calculate the weighted error of the current classifier
3. Calculate the classifier coefficient α
4. Update the data weights
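To make these four steps concrete, below is a minimal sketch of the training loop in Python. It assumes scikit-learn's DecisionTreeClassifier(max_depth=1) as the decision stump, labels in {-1, +1}, and illustrative helper names (adaboost_train, adaboost_predict) that are not from the lecture.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

def adaboost_train(X, t, T=10):
    """Minimal AdaBoost sketch; labels t must be in {-1, +1}."""
    n = len(t)
    w = np.full(n, 1.0 / n)                      # equal sample weights to start
    stumps, alphas = [], []
    for _ in range(T):
        # 1. Train a classifier (decision stump) on the weighted data samples
        stump = DecisionTreeClassifier(max_depth=1).fit(X, t, sample_weight=w)
        pred = stump.predict(X)
        # 2. Weighted error of the current classifier
        miss = (pred != t).astype(float)
        eps = np.clip(np.sum(w * miss) / np.sum(w), 1e-10, 1 - 1e-10)
        # 3. Classifier coefficient (confidence)
        alpha = 0.5 * np.log((1.0 - eps) / eps)
        # 4. Update data weights: boost the misclassified samples, then renormalise
        w = w * np.exp(alpha * miss)
        w = w / np.sum(w)
        stumps.append(stump)
        alphas.append(alpha)
    return stumps, alphas

def adaboost_predict(stumps, alphas, X):
    # Final classifier: sign of the alpha-weighted vote of all stumps
    scores = sum(a * s.predict(X) for a, s in zip(alphas, stumps))
    return np.sign(scores)
```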
18
AdaBoost Examples
• The decision made by the latest added learner is depicted with a dashed black line.
• The collective decision boundary formed by the entire ensemble is illustrated in green.
19
Control AdaBoost Overfitting
• AdaBoost's training error is theoretically demonstrated to approach zero as the
algorithm progresses.
• With an increasing number of classifiers, there's a risk of overfitting the model
to the training data.
• To prevent overfitting, it's crucial to adjust the number of boosting rounds. This
is best achieved by utilizing a separate validation set to fine-tune the process.
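One practical way to pick the number of rounds with a validation set is sketched below, using scikit-learn's AdaBoostClassifier and its staged_predict method; the synthetic dataset and split sizes are placeholders, not part of the lecture material.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# Synthetic data standing in for the real training set
X, y = make_classification(n_samples=2000, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.25, random_state=0)

# Train with a generous number of rounds (the default weak learner is a depth-1 tree, i.e. a stump)
model = AdaBoostClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

# staged_predict yields the ensemble's predictions after 1, 2, ..., 200 rounds,
# so validation accuracy can be tracked round by round
val_acc = [np.mean(pred == y_val) for pred in model.staged_predict(X_val)]
best_T = int(np.argmax(val_acc)) + 1
print("Best number of boosting rounds on the validation set:", best_T)
```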
20
AdaBoost Application: Face Detection
• AdaBoost has gained prominence for its effective use in the
realm of facial detection.
• Achieved real-time facial detection capabilities as early as
2001.
21
AdaBoost Application: Face Detection
• The fundamental classifier, or weak learner, operates by evaluating the sum of
pixel intensities within a specified rectangular area, using efficient
computational techniques (notably the integral image, which makes any rectangle
sum a constant-time operation).
• As the number of boosting iterations increases, the selected features increasingly
concentrate on specific facial regions, improving detection accuracy.
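As an aside on why these rectangle-sum weak learners are so cheap to evaluate, here is a small sketch of the integral-image trick in NumPy; the function names are illustrative and not from the slides.

```python
import numpy as np

def integral_image(img):
    # ii[r, c] = sum of img[:r, :c]; padded with a leading row/column of zeros
    ii = np.cumsum(np.cumsum(img, axis=0), axis=1)
    return np.pad(ii, ((1, 0), (1, 0)))

def rect_sum(ii, top, left, height, width):
    # Sum of pixel intensities inside the rectangle, using only four table lookups
    return (ii[top + height, left + width] - ii[top, left + width]
            - ii[top + height, left] + ii[top, left])

img = np.arange(16, dtype=float).reshape(4, 4)
ii = integral_image(img)
print(rect_sum(ii, 1, 1, 2, 2) == img[1:3, 1:3].sum())   # True
```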
22
AdaBoost Face Detection Experiments in Lab
2
• Please DO attend Lab 2 this week, as we will study face detection
with AdaBoost in detail through experiments there.
23
Course Summary
• Boosting strategically lowers bias by creating a collective of
weak classifiers, where each subsequent classifier is fine-tuned
to address the errors of the preceding ensemble.
• Careful calibration of the number of boosting iterations is
helpful to avert the risk of overfitting.
• A practical application of boosting is face detection,
showcasing the effectiveness of this ensemble method.
Editor's Notes
  • #2: Simple Idea: Ensembles use many different models together to get better results than any single model alone. Bagging, in Short: How It Works: Bagging creates several models by taking random parts of the training data for each one. Learning About Boosting: What's Boosting: Boosting is when you train models one after the other. Each new model pays more attention to the training examples that the previous models got wrong.
  • #3: The Basics: A weak learner is a simple model that doesn't predict very well, just a bit better than if you were guessing without any information. Think of it like getting a grade just over passing, like 55% when 50% is a pass. How Boosting Uses Them: Boosting takes these basic models that don't do much better than guessing and uses them to build a stronger prediction. It often starts with really simple models, like a decision stump, which is a decision tree with just one question.
  • #4: In Simple Terms: A decision stump classifier is a very simple way of making decisions. It uses straight lines to split up a space into different areas. Each area represents a different group or category.
  • #6: Adjusting Weights: AdaBoost changes the importance (weights) of the examples in the training set. Examples that were wrong before get more attention. Training a New Model: It then makes a weak classifier based on the examples that now have different weights. Combining Models: Each new model is added to the ensemble with a certain importance based on how well it performs. This helps the ensemble make better decisions together. Repeat: Do these steps over and over, many times. Focus on Mistakes: Every simple model tries to get better at predicting the examples that have bigger weights (the ones that were wrong before). Learning from Errors: By always focusing on what it got wrong before, AdaBoost becomes more accurate with each step, reducing overall mistakes (bias).
  • #8: Understanding AdaBoost Terms and Training: ε (Error): This is the mistake rate of the model, considering how important (weighted) each example is. α (Model Weight): This is how much trust we put in the model's decisions. Step-by-Step Example: Starting Point: Imagine we have 10 examples to learn from, and we treat each one as equally important, so each one gets a weight of 1/10. First Model: We train our first simple model (let's call it h1) using these equal weights. Calculate Error (ε): We find out how often this model h1 makes mistakes, weighted by our initial equal weights. Let's say the error rate (ε) is 0.3, which means the model gets 30% of the weighted examples wrong. Calculate Model Weight (α): We use a formula to decide how much to trust h1. The formula uses the error rate (ε) we just found. For our error of 0.3, the trust level (α) comes out to be 0.42. Combine for Final Model (H(x)): We then create a combined model, H(x), which is just our first model h1 weighted by the trust level (α) we calculated. In math terms, it looks like this: H(x) = α1 * h1(x), where α1 is 0.42 in this example.
  • #12: Using Updated Weights to Train a New Classifier: Updated Weights (w): We start with new importance levels (weights) for each data point. Train Classifier h2: We train a second decision stump (h2) using these updated weights. Calculate New Error (ε): We find the error rate of h2, which is 0.21 (21% mistake rate). Calculate New Trust Level (α2): We work out the trust level for h2 with a formula. Here, it's 0.66. Create Combined Model (H(x)): We combine the predictions of the first and second decision stumps, each weighted by their respective trust levels (α1 for h1 and α2 for h2). So, the formula for the combined model is: H(x) = α1 * h1(x) + α2 * h2(x) Here, α1 is the weight from the first stump, h1 is the first decision stump, α2 is 0.66 from our second stump, and h2 is the second decision stump.
  • #13: Training the Third Classifier with New Weights: Updated Weights (w): Use the new weights for each data point that were adjusted from the last round. Train Classifier h3: Now use these weights to train a third decision stump (h3). Find Error Rate (ε): The error for this third stump is 0.14 (14% mistakes). Calculate Trust Level (α3): The formula tells us to trust this stump's predictions with a level of 0.92. Combine All Stumps for Final Model (H(x)): The final model adds up the predictions from all the stumps, each one multiplied by its trust level.
  • #15: Training a Decision Stump Step-by-Step: Train the Stump: First, make a simple decision stump (a basic model) to classify data. (Look at the lecture slides on building decision trees for help.) Find Mistakes: Next, figure out the decision stump's error rate, but make sure to consider the importance (weight) of each data sample. Set the Trust Level: After that, decide how much trust you can put in the decision stump's results. This is called the 'weighting factor'. Focus on Errors: Increase the importance (weights) of the samples that the decision stump got wrong. Repeat: Now, you're ready for another round of boosting, where you'll make a new decision stump that tries to correct the errors of the first one.
  • #19: Getting Better: As AdaBoost keeps going, it's supposed to make fewer and fewer mistakes on the training data, almost reaching a point where it makes no errors. Too Specific: If you keep adding more classifiers, AdaBoost might get too good at the training data and not good at new, unseen data. That's like memorizing the answers to a test without understanding the subject. Finding the Balance: It's important to not go too far with boosting. You can find the right amount by testing the model on a different set of data (validation set) that it hasn't seen before. This helps you figure out when to stop adding new classifiers.
  • #20: AdaBoost and Facial Detection: Success Story: AdaBoost is famous for being really good at recognizing faces. Fast Results: It was even able to detect faces in real-time back in 2001.
  • #23: How Boosting Works to Recognize Faces: Teamwork: Boosting builds a team of simple models. Each new model is made to fix the mistakes the team made before. Just Enough Practice: You have to choose the right number of times to repeat the process. Too many, and the model might get too picky and not work well on new data it hasn't seen. Real-World Use: Boosting is really good at finding faces in pictures, which proves it's a strong tool for building smart models.