Random forest
 Random forest is a classifier
 An ensemble classifier that combines many decision tree models
 Can be used for both classification and regression
 Accuracy and variable-importance information are provided with the result
 A random forest is a collection of unpruned CART-like trees following specific
rules for
 Tree growing
 Tree combination
 Self-testing
 Post-processing
 Trees are grown using binary partitioning
 Similar to a decision tree, with a few differences
 For each split point, the search is not over all variables but only over a subset of the variables
 No pruning is necessary; trees can be grown until each node contains just a few observations
 Advantages over a single decision tree
 Better prediction (in general)
 No parameter tuning necessary with RF
 Terminology
 Training size (N)
 Total number of attributes (M)
 Number of attributes used (m)
 Total number of trees (n)
 A random seed is chosen, which pulls out at random a collection of samples from the training dataset while maintaining the class distribution
 From this selected dataset, a random set of attributes from the original dataset is chosen, based on user-defined values. Not all input variables are considered, because doing so is computationally expensive and increases the risk of overfitting
 If M is the total number of input attributes in the dataset, only m attributes are chosen at random for each tree, where m < M
 The attribute in this subset that produces the best possible split, measured by the Gini index, is used to grow the decision tree. This process repeats on each branch until the termination condition is met: the leaves are nodes that are too small to split (a minimal sketch of this procedure is given below)
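A minimal Python sketch of the growing procedure just described (not from the original slides): it assumes a NumPy feature matrix X with N rows and M attribute columns and a label vector y, and the helper names grow_forest and majority_vote are hypothetical. For brevity the subset of m attributes is drawn once per tree; most implementations re-draw it at every split point, as the earlier slide notes.

```python
# Illustrative sketch only: grow n unpruned CART-like trees, each on a bootstrap
# sample of the N training rows and a random subset of m of the M attributes.
import numpy as np
from collections import Counter
from sklearn.tree import DecisionTreeClassifier  # CART-style trees with Gini splits

def grow_forest(X, y, n_trees=100, m=None, seed=0):
    rng = np.random.default_rng(seed)                # the "random seed" from the slide
    N, M = X.shape
    m = m or max(1, int(np.sqrt(M)))                 # a common default choice, with m < M
    forest = []
    for _ in range(n_trees):
        rows = rng.integers(0, N, size=N)            # plain bootstrap sample (the slides also
                                                     # mention maintaining the class distribution)
        cols = rng.choice(M, size=m, replace=False)  # random subset of m attributes
        tree = DecisionTreeClassifier(criterion="gini")  # unpruned: grown until leaves are tiny/pure
        tree.fit(X[rows][:, cols], y[rows])
        forest.append((tree, cols))
    return forest

def majority_vote(forest, X):
    # Classify new rows by majority vote over all trees in the forest.
    votes = np.array([tree.predict(X[:, cols]) for tree, cols in forest])
    return np.array([Counter(v).most_common(1)[0][0] for v in votes.T])
```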
 Information from random forest
 Classification accuracy
 Variable importance
 Outlier detection (classification)
 Missing data estimation
 Error rates for the random forest object (illustrated in the sketch after this list)
 Advantages
 No need for pruning trees
 Accuracy and variable importance generated automatically
 Overfitting is not a problem
 Not very sensitive to outliers in training data
 Easy to set parameters
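The automatically generated accuracy and variable-importance information mentioned above can be illustrated with scikit-learn's RandomForestClassifier (the library choice and the synthetic data are my assumptions); the out-of-bag (OOB) score serves as the built-in error estimate for the random forest object.

```python
# Variable importance and an out-of-bag (OOB) error estimate come with the fitted forest.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

X, y = make_classification(n_samples=500, n_features=10, n_informative=4, random_state=0)

rf = RandomForestClassifier(n_estimators=200, oob_score=True, random_state=0)
rf.fit(X, y)

print("OOB error rate:", 1 - rf.oob_score_)             # error estimate without a separate test set
print("Variable importance:", rf.feature_importances_)  # one importance score per attribute
```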
 Limitations
 Regression cannot predict beyond the range of the training data
 Extreme values are not predicted accurately
 Applications
 Classification
 Land cover classification
 Cloud screening
 Regression
 Continuous field mapping
 Biomass mapping
 Efficient use of multi-core technology
 Although this is OS dependent, using Hadoop helps ensure efficient use of multiple cores
 Winnow is a technique from machine learning for learning a linear classifier from labelled examples
 Similar to the perceptron algorithm
 While the perceptron algorithm uses an additive weight-update scheme, Winnow uses a multiplicative weight-update scheme
 Performs well when many of the features given to the learner turn out to be irrelevant
 During training, it is shown a sequence of positive and negative examples. From these it learns a decision hyperplane which can be used to classify novel examples as positive or negative
 Uses a linear threshold function (like the perceptron training algorithm) as its hypothesis and performs incremental updates to the current hypothesis
 Initialize the weights w1, …, wn to 1
 Both Winnow and the perceptron algorithm use the same classification scheme
 Winnow differs from the perceptron algorithm in its update scheme
 When misclassifying a positive training example x (i.e. the prediction was negative because w·x was too small), the weights of the active features are increased
 When misclassifying a negative training example x (i.e. the prediction was positive because w·x was too large), the weights of the active features are decreased
SPAM example – each email is represented as a Boolean vector indicating which phrases appear and which do not
An email is labelled SPAM if at least one of the phrases in the set S is present
 Initialize the weights w1, …, wn to 1 on the n variables
 Given an example x = (x1, …, xn), output 1 if w1x1 + … + wnxn ≥ θ, where θ is a threshold (commonly θ = n); otherwise output 0
 If the algorithm makes a mistake:
 On a positive example – if it predicts 0 when f(x) = 1, then for each xi equal to 1, double the value of wi
 On a negative example – if it predicts 1 when f(x) = 0, then for each xi equal to 1, cut the value of wi in half
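A small runnable sketch of this update rule follows (my own illustration; the threshold θ = n is the common choice and is not stated explicitly in the slides). The toy data reuses the SPAM setting, with each example a Boolean phrase-indicator vector.

```python
# Winnow: multiplicative weight updates on Boolean feature vectors.
def winnow_train(examples, n, epochs=10):
    """examples: list of (x, label) pairs, x a list of n values in {0, 1}, label in {0, 1}."""
    w = [1.0] * n                                    # initialize the weights w1..wn to 1
    theta = n                                        # classification threshold (commonly theta = n)
    for _ in range(epochs):
        for x, label in examples:
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) >= theta else 0
            if pred == 0 and label == 1:             # mistake on a positive example: promote
                w = [wi * 2 if xi else wi for wi, xi in zip(w, x)]
            elif pred == 1 and label == 0:           # mistake on a negative example: demote
                w = [wi / 2 if xi else wi for wi, xi in zip(w, x)]
    return w

# Toy SPAM data: x[i] = 1 if phrase i appears in the email, label 1 = SPAM.
emails = [([1, 0, 0, 1], 1), ([0, 1, 0, 0], 0), ([1, 0, 1, 0], 1), ([0, 1, 1, 0], 0)]
weights = winnow_train(emails, n=4)
print(weights)
```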
 The principle of maximum entropy states that, subject to precisely stated prior
data, the probability distribution which best represents the current state of
knowledge is the one with the largest entropy.
 Commonly used in natural language processing, speech processing, and information retrieval
 What is a maximum entropy classifier?
 A probabilistic classifier which belongs to the class of exponential models
 Does not assume that the features are conditionally independent of each other
 Based on the principle of maximum entropy: among all models that fit the training data, it selects the one with the largest entropy
 A piece of information is testable if it can be determined whether a given distribution is consistent with it
 "The expectation of the variable x is 2.87"
 and "p2 + p3 > 0.6"
 are statements of testable information
 The maximum entropy procedure consists of seeking the probability distribution which maximizes the information entropy, subject to the constraints imposed by that information
 Entropy maximization always takes place under one universal constraint: the sum of the probabilities must be one
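As a small numerical illustration of the procedure (my own worked example built from the first testable statement above, with SciPy as an assumed tool): among all distributions over the values 1 to 6, find the one of largest entropy whose expectation is 2.87.

```python
# Maximum entropy subject to testable information: sum(p) = 1 and E[x] = 2.87.
import numpy as np
from scipy.optimize import minimize

values = np.arange(1, 7)                             # the variable x takes values 1..6

def neg_entropy(p):
    p = np.clip(p, 1e-12, 1.0)                       # avoid log(0)
    return np.sum(p * np.log(p))                     # minimizing this maximizes entropy

constraints = [
    {"type": "eq", "fun": lambda p: p.sum() - 1.0},              # probabilities sum to one
    {"type": "eq", "fun": lambda p: (p * values).sum() - 2.87},  # testable info: E[x] = 2.87
]
p0 = np.full(6, 1.0 / 6)                             # start from the uniform distribution
result = minimize(neg_entropy, p0, bounds=[(0.0, 1.0)] * 6, constraints=constraints)
print(result.x)                                      # the maximum-entropy distribution
```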
 When to use maximum entropy?
 Since it makes minimal assumptions, we use it when we do not know the prior distribution
 Used when we cannot assume conditional independence of the features
 The principle of maximum entropy is commonly applied in two ways to inferential problems
 Prior probabilities: it is often used to obtain the prior probability distribution for Bayesian inference
 Maximum entropy models: model specifications widely used in natural language processing, e.g. logistic regression (see the sketch below)
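As one concrete instance of a maximum entropy model in NLP, the sketch below trains a logistic-regression text classifier; the use of scikit-learn and the toy documents are assumptions made for illustration, not taken from the slides.

```python
# Logistic regression as a maximum entropy classifier over word-count features.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

docs   = ["cheap pills buy now", "meeting at noon", "win money now", "project review notes"]
labels = ["spam", "ham", "spam", "ham"]

# Word-count features; note the model does not assume they are conditionally independent.
maxent = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
maxent.fit(docs, labels)
print(maxent.predict(["buy cheap pills now"]))
```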