Clarkson Honors Program Thesis Proposal

                   Altering the AdaBoost algorithm to produce a new boosting method
                  yielding more accurate results under the same number of repetitions.

April 5, 2000
Daniel Lawry
Professor Christino Tamon, Advisor

Topics:


          Boosting is a general method for improving the accuracy of learning algorithms. Boosting's
roots lie in a theoretical framework for studying machine learning called the "probably approximately
correct" (PAC) learning model. Working within this model, Kearns and Valiant posed the question of
whether a "weak" learning algorithm, one that produces results only slightly better than random
guessing, can be boosted into a "strong," highly accurate learning algorithm. Currently, a boosting
algorithm called AdaBoost produces the desired increase in accuracy given a weak learning algorithm.
AdaBoost takes the weak learning method it is given and a training set (x1,y1), ..., (xm,ym), where each
xi belongs to a domain X and each label yi belongs to a label set Y. AdaBoost then calls the weak
learning algorithm repeatedly in a series of T rounds, maintaining a weight for each training example
and updating those weights each round using the output of the most recent weak predictor together with
the current weights. Examples the weak predictor misclassifies gain weight and correctly classified
examples lose weight, so each subsequent run of the weak learning algorithm concentrates on the
examples that are hardest to classify. It is believed that eliminating the last k runs of the weak
learning algorithm, where k < t and t is the number of times the weak learning algorithm has been run
so far, will force this method to produce more accurate results with the same number of repetitions.
Eliminating the last k runs forces the current round to draw on a smaller set of output from the weak
hypothesis repetitions. The hope is that the algorithm will then place more emphasis on the runs left
in the hypothesis repetitions, forcing it to become more accurate faster. The parameters to investigate
include the appropriate value of k, based on the weak learning algorithm and the number of repetitions,
T, of that algorithm. The investigation will also involve developing this new boosting method and
testing it against the AdaBoost method.


Methodology:


The new boosting method will be developed and constructed for testing purposes in the C
programming language; the AdaBoost method will likewise be implemented in C. A formula for choosing
the value of k from the number of repetitions, T, and the accuracy of the weak learning method will be
derived. This formula will then be tested in conjunction with the new boosting method and with
variations of the variable k. Once an optimal value of k is found, the two methods will be run on the
same training sets and the resulting data will be compared to see which method yields more accurate
results.
