Decision Tree,
Random Forest
Example Decision Tree – Retail Data
Decision Tree - Ruleset Model
Terminology
Best Binary Partitioning
Best Binary Partitioning
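A minimal sketch of what a best binary partition search can look like for one numeric predictor. The slides do not say which split criterion they use, so Gini impurity is an assumption here, and the function names and toy data are hypothetical.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means the node is pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def best_binary_split(values, labels):
    """Return (threshold, weighted_impurity) for the best rule 'value <= threshold'."""
    n = len(values)
    best_threshold, best_score = None, float("inf")
    distinct = sorted(set(values))
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(distinct, distinct[1:]):
        threshold = (lo + hi) / 2
        left = [y for x, y in zip(values, labels) if x <= threshold]
        right = [y for x, y in zip(values, labels) if x > threshold]
        # Weight each child's impurity by its share of the records.
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_threshold, best_score = threshold, score
    return best_threshold, best_score

# Hypothetical toy data: one numeric predictor and a binary label.
ages = [22, 25, 30, 35, 40, 45]
bought = ["no", "no", "no", "yes", "yes", "yes"]
threshold, impurity = best_binary_split(ages, bought)
```

A full tree would repeat this search over every predictor at each node and recurse into the two resulting partitions.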
Tree Depth = 1 (Decision Stump)
Tree Depth = 3
Tree Depth = 20 (Complex Tree)
Two Predictor Decision Boundaries
Two Predictor Decision Boundaries
Minimize overfitting: Early stopping
Minimize overfitting: Pruning
Decision Tree - Strengths & Weaknesses
The Problem with Single Decision Trees
Bagging (Bootstrap Aggregating): wisdom of the crowd
1. Sample records with replacement (aka "bootstrap" the training data).
   Sampling is the process of selecting a subset of items from a larger collection of items.
   Bootstrap = sampling with replacement. A data point that has already been drawn can be drawn again, so it may appear more than once in the same sample and may reappear in later samples as well.
2. Fit an overgrown tree to each resampled data set.
3. Average the predictions.
As we add more trees, our average prediction error tends to decrease.
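The three bagging steps above can be sketched in plain Python. This is an illustration under stated assumptions, not the deck's implementation: the tree is a tiny one-predictor regression tree grown until every leaf holds a single distinct x value (standing in for an "overgrown" tree), the split criterion is squared error, and the data and function names are hypothetical.

```python
import random
from statistics import mean

def bootstrap_sample(rows, rng):
    """Step 1: draw len(rows) rows with replacement."""
    return [rng.choice(rows) for _ in rows]

def sum_sq(rows):
    """Sum of squared deviations of y from its mean within a node."""
    m = mean(y for _, y in rows)
    return sum((y - m) ** 2 for _, y in rows)

def fit_tree(rows):
    """Step 2: grow an unpruned 1-D regression tree on (x, y) rows."""
    xs = sorted({x for x, _ in rows})
    if len(xs) == 1:  # nothing left to split on: make a leaf
        return {"leaf": mean(y for _, y in rows)}
    best = None
    for lo, hi in zip(xs, xs[1:]):
        t = (lo + hi) / 2
        left = [r for r in rows if r[0] <= t]
        right = [r for r in rows if r[0] > t]
        sse = sum_sq(left) + sum_sq(right)
        if best is None or sse < best[0]:
            best = (sse, t, left, right)
    _, t, left, right = best
    return {"threshold": t, "left": fit_tree(left), "right": fit_tree(right)}

def predict(tree, x):
    """Walk from the root to a leaf and return the leaf value."""
    while "leaf" not in tree:
        tree = tree["left"] if x <= tree["threshold"] else tree["right"]
    return tree["leaf"]

def bagged_predict(trees, x):
    """Step 3: average the predictions of all trees."""
    return mean(predict(tree, x) for tree in trees)

rng = random.Random(0)
# Hypothetical noisy training data: y is roughly x squared.
rows = [(x, x * x + rng.gauss(0, 1)) for x in range(20)]
trees = [fit_tree(bootstrap_sample(rows, rng)) for _ in range(25)]
prediction = bagged_predict(trees, 10.5)
```

Each tree sees a different resample, so individual trees disagree; averaging them smooths out much of that disagreement.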
Random Forest
• A random forest is a collection of decision trees. Each tree predicts a class, and this prediction is called a "vote". We collect the votes from every tree and choose the class with the most votes (majority voting).
• Random forests follow the same bagging process as bagged decision trees, but each time a split is to be performed, the search for the split variable is limited to a random subset of m of the p attributes (variables or features), aka split-attribute randomization. Common defaults:
• classification trees: m = √p
• regression trees: m = p/3
• m is commonly referred to as mtry
• Random forests produce many unique trees.
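The m = √p and m = p/3 defaults and the majority vote described above can be sketched as follows. Rounding m down to a whole number (with a floor of 1) is an assumption, since the slide does not say how to round, and the function names are hypothetical.

```python
import math
from collections import Counter

def default_mtry(p, task):
    """Default number of attributes tried at each split, rounded down (assumption)."""
    if task == "classification":
        return max(1, math.isqrt(p))   # m = sqrt(p)
    if task == "regression":
        return max(1, p // 3)          # m = p / 3
    raise ValueError("task must be 'classification' or 'regression'")

def majority_vote(votes):
    """Return the class predicted by the most trees.

    Ties go to the class that was seen first in the list of votes.
    """
    return Counter(votes).most_common(1)[0][0]

m_classification = default_mtry(16, "classification")
m_regression = default_mtry(9, "regression")
# Hypothetical votes from five trees for one record.
winner = majority_vote(["churn", "stay", "churn", "churn", "stay"])
```

In practice mtry is treated as a tuning parameter, with these values as starting points.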
Bagging vs Random Forest
• Bagging introduces randomness into the rows of the data.
• Random forest introduces randomness into the rows and columns of the data.
• Combined, this provides a more diverse set of trees that almost always lowers our prediction error.
Split-Attribute Randomization : Prediction Error
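A small sketch of the row versus column contrast above, assuming a table with four hypothetical columns and m = 2. It only shows which rows and columns each method would look at, not a full tree fit.

```python
import random

def bootstrap_row_indices(n_rows, rng):
    """Row randomness: draw n_rows row indices with replacement."""
    return [rng.randrange(n_rows) for _ in range(n_rows)]

def split_candidates(columns, m, rng):
    """Column randomness: pick m distinct columns to search at one split."""
    return rng.sample(columns, m)

rng = random.Random(42)
columns = ["age", "income", "visits", "basket_size"]  # hypothetical column names
n_rows = 10

# Both bagging and random forest resample the rows for each tree.
row_idx = bootstrap_row_indices(n_rows, rng)

# Bagging searches every column at every split.
bagging_candidates = list(columns)

# Random forest draws a fresh random subset of columns at every split.
forest_candidates = split_candidates(columns, 2, rng)
```

Because a strong predictor is not available at every split, the trees in a random forest end up less alike than bagged trees.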
Random Forest : Out-of-Bag (OOB) Observations
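A brief sketch of where out-of-bag observations come from, assuming the usual bootstrap of n rows drawn with replacement. Rows never drawn for a given tree are that tree's OOB rows; the chance that a particular row is left out is (1 - 1/n)^n, which approaches 1/e (about 37%) as n grows. The function name is hypothetical.

```python
import math
import random

def in_bag_and_oob(n_rows, rng):
    """Return (in_bag, oob) row indices for one bootstrap sample."""
    in_bag = [rng.randrange(n_rows) for _ in range(n_rows)]
    oob = sorted(set(range(n_rows)) - set(in_bag))
    return in_bag, oob

rng = random.Random(1)
n = 1000
in_bag, oob = in_bag_and_oob(n, rng)

# Probability that a given row is never drawn in n draws with replacement.
expected_oob_fraction = (1 - 1 / n) ** n
observed_oob_fraction = len(oob) / n
```

Since each tree never saw its OOB rows, predicting them gives an error estimate without a separate validation set.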
Random Forest : Tuning
Random Forest - Strengths & Weaknesses


DecisionTree_RandomForest good for data science
