XGBoost: A Scalable Tree Boosting System
Simon Lia-Jonassen
Motivation
• Used by the majority of winning solutions on Kaggle; the 2nd most popular method after DNNs.
• Also used by the top 10 teams in KDDCup'15.
• Applies to classification, regression and learning-to-rank tasks.
• Usually outperforms alternatives in an out-of-the-box setting.
• Combines a good theoretical foundation with a highly efficient implementation.
• So, how does it work?
Decision Tree Boosting
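$$\hat{y}_i = \phi(x_i) = \sum_{k=1}^{K} f_k(x_i), \qquad f_k \in \mathcal{F}$$
Where:
• $K$ is the number of trees;
• $f_k$ is a tree function, which maps an instance to one of a set of leaf weights;
• $x_i$ are the instance features.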
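A minimal sketch of the additive model: the prediction for an instance is the sum of the leaf weights that each tree assigns to it. The `stump` and `predict` helpers and the toy thresholds are hypothetical, chosen only for illustration; they are not XGBoost's API.

```python
from typing import Callable, List, Sequence

# A tree function maps instance features to a leaf weight.
Tree = Callable[[Sequence[float]], float]


def stump(feature: int, threshold: float, left_w: float, right_w: float) -> Tree:
    """Build a depth-1 tree: route on one feature, return a leaf weight."""
    def f(x: Sequence[float]) -> float:
        return left_w if x[feature] < threshold else right_w
    return f


def predict(trees: List[Tree], x: Sequence[float]) -> float:
    """Additive prediction: sum the outputs of all K tree functions."""
    return sum(f(x) for f in trees)


# Two toy trees (K = 2); values are illustrative assumptions.
trees = [stump(0, 2.0, -1.0, 1.0), stump(1, 0.5, 0.25, -0.25)]
y_hat = predict(trees, [3.0, 0.1])  # 1.0 from the first tree + 0.25 from the second
```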
Regularized Learning Objective
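$$\mathcal{L}(\phi) = \sum_{i} l(y_i, \hat{y}_i) + \sum_{k} \Omega(f_k), \qquad \Omega(f) = \gamma T + \tfrac{1}{2} \lambda \lVert w \rVert^2$$
Where:
• $\sum_i l(y_i, \hat{y}_i)$ is the prediction loss;
• $\sum_k \Omega(f_k)$ is the complexity penalty;
• $T$ is the number of leaves in a tree;
• $\tfrac{1}{2} \lambda \lVert w \rVert^2$ is the L2 regularization on the leaf weights $w$.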
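As a rough illustration, the regularized objective can be evaluated by adding a per-tree complexity penalty (a cost `gamma` per leaf plus an L2 term `lam` on the leaf weights) to the prediction loss. The sketch assumes squared error as the loss; the function names and default values are hypothetical.

```python
def omega(leaf_weights, gamma, lam):
    """Complexity penalty of one tree: gamma * T + 0.5 * lam * sum(w^2)."""
    return gamma * len(leaf_weights) + 0.5 * lam * sum(w * w for w in leaf_weights)


def objective(y_true, y_pred, trees_leaf_weights, gamma=1.0, lam=1.0):
    """Prediction loss (squared error, an assumption) plus the penalty of every tree."""
    loss = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    penalty = sum(omega(w, gamma, lam) for w in trees_leaf_weights)
    return loss + penalty


# One tree with two leaves of weight +1 and -1.
total = objective([1.0, 0.0], [0.5, 0.5], [[1.0, -1.0]])
```

Raising `gamma` penalizes trees with many leaves, while raising `lam` shrinks large leaf weights.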
Regularized Learning Objective
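By additive definition, $\hat{y}_i^{(t)} = \hat{y}_i^{(t-1)} + f_t(x_i)$, so the objective at step $t$ is:
$$\mathcal{L}^{(t)} = \sum_{i=1}^{n} l\big(y_i, \hat{y}_i^{(t-1)} + f_t(x_i)\big) + \Omega(f_t)$$
A second-order approximation gives:
$$\mathcal{L}^{(t)} \simeq \sum_{i=1}^{n} \Big[ l\big(y_i, \hat{y}_i^{(t-1)}\big) + g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \Omega(f_t)$$
Where:
• $g_i = \partial_{\hat{y}^{(t-1)}} \, l\big(y_i, \hat{y}^{(t-1)}\big)$ is the first-order gradient of the loss function;
• $h_i = \partial^2_{\hat{y}^{(t-1)}} \, l\big(y_i, \hat{y}^{(t-1)}\big)$ is the second-order gradient of the loss function.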
However, for example:
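The slide's own example did not survive extraction, so the following is only an illustrative sketch of what the first- and second-order gradients look like for two common losses. It assumes squared error written as 1/2 (ŷ - y)² and logistic loss on a raw margin; the function names are hypothetical.

```python
import math


def squared_loss_grad_hess(y, y_hat):
    """l = 0.5 * (y_hat - y)^2  ->  g = y_hat - y, h = 1."""
    return y_hat - y, 1.0


def logistic_loss_grad_hess(y, margin):
    """Logistic loss on a raw margin: g = p - y, h = p * (1 - p), p = sigmoid(margin)."""
    p = 1.0 / (1.0 + math.exp(-margin))
    return p - y, p * (1.0 - p)


g_sq, h_sq = squared_loss_grad_hess(1.0, 0.25)
g_lg, h_lg = logistic_loss_grad_hess(1.0, 0.0)
```

Only these two per-instance numbers are needed by the rest of the derivation, which is why the same tree-building procedure works for any twice-differentiable loss.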
Regularized Learning Objective
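By expansion (dropping the constant term, with $q(x)$ mapping an instance to its leaf and $I_j = \{ i \mid q(x_i) = j \}$ the set of instances in leaf $j$):
$$\tilde{\mathcal{L}}^{(t)} = \sum_{i=1}^{n} \Big[ g_i f_t(x_i) + \tfrac{1}{2} h_i f_t^2(x_i) \Big] + \gamma T + \tfrac{1}{2} \lambda \sum_{j=1}^{T} w_j^2 = \sum_{j=1}^{T} \Big[ \Big( \sum_{i \in I_j} g_i \Big) w_j + \tfrac{1}{2} \Big( \sum_{i \in I_j} h_i + \lambda \Big) w_j^2 \Big] + \gamma T$$
The first form sums over each instance; the second sums over each leaf and, within it, over each instance in the leaf.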
Regularized Learning Objective
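Optimal leaf weight for a fixed structure:
$$w_j^* = -\frac{\sum_{i \in I_j} g_i}{\sum_{i \in I_j} h_i + \lambda}$$
By substitution:
$$\tilde{\mathcal{L}}^{(t)}(q) = -\frac{1}{2} \sum_{j=1}^{T} \frac{\big( \sum_{i \in I_j} g_i \big)^2}{\sum_{i \in I_j} h_i + \lambda} + \gamma T$$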
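A small numeric sketch of these two results, assuming each leaf is given as a pair of lists holding the gradients and second-order gradients of its instances. With G and H denoting their sums in a leaf, the optimal weight is -G / (H + lambda) and each leaf contributes G² / (H + lambda) to the score. The function names and toy values are hypothetical.

```python
def leaf_weight(g, h, lam):
    """Optimal weight of one leaf: -G / (H + lam)."""
    return -sum(g) / (sum(h) + lam)


def structure_score(leaves, lam, gamma):
    """Objective of a fixed tree structure: -0.5 * sum(G^2 / (H + lam)) + gamma * T."""
    total = 0.0
    for g, h in leaves:
        G, H = sum(g), sum(h)
        total += G * G / (H + lam)
    return -0.5 * total + gamma * len(leaves)


# Two leaves: (gradients, second-order gradients) of the instances in each.
leaves = [([-1.0, -1.0], [1.0, 1.0]), ([2.0], [1.0])]
weights = [leaf_weight(g, h, lam=1.0) for g, h in leaves]
score = structure_score(leaves, lam=1.0, gamma=0.5)
```

A lower score means a better structure, so the score can be used to compare candidate trees the way an impurity measure is used in ordinary decision trees.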
Gradient Tree Boosting
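$$\mathcal{L}_{split} = \frac{1}{2} \left[ \frac{\big( \sum_{i \in I_L} g_i \big)^2}{\sum_{i \in I_L} h_i + \lambda} + \frac{\big( \sum_{i \in I_R} g_i \big)^2}{\sum_{i \in I_R} h_i + \lambda} - \frac{\big( \sum_{i \in I} g_i \big)^2}{\sum_{i \in I} h_i + \lambda} \right] - \gamma$$
Where, with $I = I_L \cup I_R$:
• the first term is the score of the left split;
• the second term is the score of the right split;
• the third term is the score before we split;
• $\gamma$ is the split penalty.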
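A hedged sketch of how the split gain can drive an exact greedy search over a single feature: sort the instances, sweep left to right while accumulating the left-side sums, and keep the split with the largest positive gain. The function name, the midpoint threshold convention, and the toy data are assumptions for illustration, not XGBoost's implementation.

```python
def best_split(x, g, h, lam=1.0, gamma=0.0):
    """Return (gain, threshold) of the best split on one feature, or (0.0, None)."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    G, H = sum(g), sum(h)
    parent = G * G / (H + lam)          # score before we split
    best_gain, best_thr = 0.0, None
    GL = HL = 0.0
    for pos in range(len(order) - 1):
        i, nxt = order[pos], order[pos + 1]
        GL += g[i]
        HL += h[i]
        if x[i] == x[nxt]:              # cannot split between equal feature values
            continue
        GR, HR = G - GL, H - HL
        gain = 0.5 * (GL * GL / (HL + lam) + GR * GR / (HR + lam) - parent) - gamma
        if gain > best_gain:
            best_gain, best_thr = gain, 0.5 * (x[i] + x[nxt])
    return best_gain, best_thr


x = [1.0, 2.0, 3.0, 4.0]
g = [-1.0, -1.0, 1.0, 1.0]
h = [1.0, 1.0, 1.0, 1.0]
gain, threshold = best_split(x, g, h)
```

With a large enough `gamma` no candidate has positive gain and the function reports no split, which is how the split penalty prunes the tree.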
Optimizations
• Shrinkage
  - Reduces each tree's influence, leaving room for more trees
• Column subsampling
  - Prevents over-fitting
• Approximate split finding
  - Faster AUC convergence
• Sparsity-aware split finding
  - Visits only non-missing values
• Cache-aware parallel column block access
  - Fewer cache misses on large datasets
• Block compression and sharding
  - Faster I/O for out-of-core computation
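Some of these optimizations surface as training parameters in the XGBoost library. The dictionary below is a sketch of one plausible mapping: `eta`, `colsample_bytree` and `tree_method` are XGBoost parameter names, while the values and the pairing with the items above are assumptions. The cache-aware and block-level optimizations are internal to the library and are not represented here.

```python
# Illustrative parameter choices; the values are assumptions, not recommendations.
params = {
    "eta": 0.1,               # shrinkage: scale each new tree's contribution
    "colsample_bytree": 0.8,  # column subsampling: fraction of features per tree
    "tree_method": "approx",  # approximate split finding instead of exact greedy
}

# With the library installed, a dictionary like this would typically be passed
# to training, e.g. xgboost.train(params, dtrain, num_boost_round=...).
# A smaller eta usually needs a larger number of boosting rounds.
```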
Further reading
• The paper: https://arxiv.org/pdf/1603.02754.pdf
• XGBoost tutorial: http://xgboost.readthedocs.io/en/latest/model.html
• A great deck of slides: https://homes.cs.washington.edu/~tqchen/pdf/BoostedTree.pdf
• A simple usage example: https://www.kaggle.com/kevalm/xgboost-implementation-on-iris-dataset-python
• DataCamp mini-course: https://campus.datacamp.com/courses/extreme-gradient-boosting-with-xgboost
