Bagging and Random Forest
Theory and Applications in Machine Learning
Agenda
• Introduction to Bias-Variance Tradeoff
• Overfitting and Tree Pruning
• Ensemble Learning Overview
• Reduction in Variance
• Bagging and Bootstrapping
• Random Forest Algorithm
• Sampling Features at Each Node
• Extensions and Practical Applications
The Bias-Variance Tradeoff
• Bias: Overly simplistic assumptions lead to underfitting.
• Variance: Overly complex models overfit the training data.
• Tradeoff: The goal is to balance the two to minimize total error (see the decomposition below).
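A compact way to state the tradeoff is the standard decomposition of expected squared error at a point x, where σ² is the irreducible noise variance:

E[(y − f̂(x))²] = Bias[f̂(x)]² + Var[f̂(x)] + σ²

Simple models tend to have high bias and low variance; flexible models the reverse.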
Overfitting in Decision Trees
• Deep decision trees capture noise, leading to overfitting.
• Overfitting decreases test set accuracy despite high training accuracy.
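A minimal sketch of this effect (assuming scikit-learn and an illustrative synthetic dataset): an unconstrained tree typically scores near-perfectly on the data it was grown on while lagging on held-out data.

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=1000, n_features=20, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

deep = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)  # no depth limit
print(deep.score(X_train, y_train))  # typically ~1.0 on training data
print(deep.score(X_test, y_test))    # noticeably lower on unseen data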
Tree Pruning
• Pre-pruning: Stops tree growth early.
• Post-pruning: Removes non-essential branches.
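In scikit-learn terms (an illustrative sketch, not part of the slides), pre-pruning corresponds to growth limits such as max_depth or min_samples_leaf, and post-pruning to cost-complexity pruning via ccp_alpha:

from sklearn.tree import DecisionTreeClassifier

# Pre-pruning: stop growth early with depth and leaf-size limits.
pre_pruned = DecisionTreeClassifier(max_depth=4, min_samples_leaf=10)

# Post-pruning: grow fully, then prune back using the cost-complexity
# parameter alpha (candidate values come from cost_complexity_pruning_path).
post_pruned = DecisionTreeClassifier(ccp_alpha=0.01)  # 0.01 is an illustrative value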
Ensemble Learning Overview
• Bagging: Primarily reduces variance.
• Boosting: Primarily reduces bias.
Reduction in Variance
• Variance is reduced by averaging predictions across models.
• Bagging and Random Forest are designed to reduce variance.
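The intuition can be checked numerically (a hedged NumPy sketch): averaging B noisy estimates of the same quantity shrinks the variance, by a factor of B when the estimates are independent.

import numpy as np

rng = np.random.default_rng(0)
B = 50                                # number of models being averaged
preds = rng.normal(size=(10_000, B))  # simulated independent predictions

print(preds[:, 0].var())         # variance of a single model: ~1.0
print(preds.mean(axis=1).var())  # variance of the average:  ~1/B = 0.02

In practice bagged trees are correlated, so the reduction is smaller than 1/B; shrinking that correlation is exactly what Random Forest's feature sampling targets.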
Bagging and Bootstrapping
• Bagging: Combines Bootstrapping (sampling with replacement) and Aggregation (averaging or voting over predictions).
Workflow
• 1. Create multiple bootstrapped datasets.
• 2. Train base models on each dataset.
• 3. Aggregate results.
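These three steps map directly onto scikit-learn's BaggingClassifier (a sketch on illustrative data; the base estimator and counts are assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# bootstrap=True resamples each training set with replacement (step 1),
# n_estimators base trees are fit on those samples (step 2), and
# predict() aggregates them by majority vote (step 3).
bag = BaggingClassifier(estimator=DecisionTreeClassifier(),  # base_estimator in scikit-learn < 1.2
                        n_estimators=25, bootstrap=True, random_state=0)
bag.fit(X, y)
print(bag.predict(X[:5]))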
Random Forest Algorithm
• An extension of Bagging using decision trees.
• Randomly selects features for each split, decorrelating trees.
• Aggregates predictions via voting or averaging.
Sampling Features at Each Node
• Feature Selection: Random subset of features at each split.
Benefits:
• Reduces correlation among trees.
• Increases diversity and accuracy.
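In scikit-learn this per-split subsampling is controlled by max_features (an illustrative sketch; "sqrt" is the usual choice for classification):

from sklearn.ensemble import RandomForestClassifier

# At every split, each tree considers only sqrt(n_features) candidate
# features, which decorrelates the trees in the ensemble.
rf = RandomForestClassifier(n_estimators=100, max_features="sqrt", random_state=0)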
Extensions to Random Forest
• Extra Trees: Trains each tree on the full dataset (no bootstrapping by default) and picks split thresholds at random.
• Gradient Boosted Trees: Sequentially builds trees to reduce errors.
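Both extensions are available in scikit-learn (an illustrative sketch):

from sklearn.ensemble import ExtraTreesClassifier, GradientBoostingClassifier

# Extra Trees: full dataset per tree, randomized split thresholds.
extra = ExtraTreesClassifier(n_estimators=100, random_state=0)

# Gradient boosting: trees are fit sequentially, each one correcting
# the residual errors of the ensemble built so far.
gbt = GradientBoostingClassifier(n_estimators=100, learning_rate=0.1, random_state=0)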
Practical Applications of Random Forests
• Classification: Fraud detection, medical diagnostics.
• Regression: Sales forecasting, stock price prediction.
• Time Series: Modeling temporal trends.
Performance Comparison
• Decision Trees: High interpretability but prone to overfitting.
• Random Forest: Robust and accurate, less interpretable.
Python Implementation Overview
• Load data and preprocess.
• Train RandomForestClassifier.
• Evaluate feature importance.
Code Walkthrough
# Illustrative dataset; substitute your own X and y.
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestClassifier(n_estimators=100)
model.fit(X_train, y_train)
print(model.feature_importances_)  # one importance score per feature
Tuning Random Forests
Key Parameters:
• n_estimators: Number of trees.
• max_depth: Maximum depth of trees.
• Tools: Grid search, cross-validation.
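A minimal tuning sketch (the parameter grid and data are illustrative assumptions):

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = make_classification(n_samples=500, random_state=0)

param_grid = {"n_estimators": [100, 300], "max_depth": [None, 5, 10]}
search = GridSearchCV(RandomForestClassifier(random_state=0), param_grid, cv=5)
search.fit(X, y)
print(search.best_params_)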
Limitations of Random Forest
• Computationally intensive for large datasets.
• Less interpretable than single decision trees.
Conclusion and Q&A
• Summary of key points.
• Thank the audience and invite questions.
Future Work
Future work includes hyperparameter tuning for Bagging and Random
Forest, testing on larger datasets, and exploring advanced ensemble
methods like Gradient Boosting.
Thank You
