Volodymyrk
Bayesian Model Averaging
Bayesian Mixer, 27.09.2016
London, UK
Bayesian Model Averaging (BMA) - 1 minute version
New Project - how much is it worth?
CFO (Model M1): Net Present Value = $50m
VP of Growth (Model M2): Net Present Value = $100m
CEO belief, after evaluating both models and market data: 30% in M1, 70% in M2
BMA estimate: 30% x $50m + 70% x $100m = $15m + $70m = $85m
K = 2
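The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration: the pairing of the 30% belief with the CFO's model is inferred from the $15m + $70m sum, and the function name `bma_estimate` is a hypothetical helper, not something from the talk.

```python
# Point estimates (in $m) from the two models on the slide.
npv = {"M1 (CFO)": 50.0, "M2 (VP of Growth)": 100.0}

# CEO's belief in each model (assumed pairing, inferred from $15m + $70m).
belief = {"M1 (CFO)": 0.30, "M2 (VP of Growth)": 0.70}


def bma_estimate(estimates, weights):
    """Weighted average of per-model estimates; weights must sum to 1."""
    total = sum(weights.values())
    if abs(total - 1.0) > 1e-9:
        raise ValueError("model weights must sum to 1")
    return sum(weights[name] * estimates[name] for name in estimates)


bma_npv = bma_estimate(npv, belief)
print(f"BMA estimate: ${bma_npv:.0f}m")
```

The averaged estimate sits between the two models and moves towards whichever model the CEO trusts more.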
Bayesian Model Averaging (BMA) - 3 minute version
Sensitivity Analysis for M2 (VP of Growth)
NPV ($m) by CAC (rows) and CLV assumption (columns):

CAC \ CLV    $10    $12    $15
$4            72    129    149
$6            62    112    133
$8            51     92    101

Average = $100.11m
DATA
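A minimal sketch of the averaging step behind the sensitivity table, assuming every CAC/CLV scenario is treated as equally likely (which is what a plain average implies). The variable names are illustrative only.

```python
# NPV in $m for each (CAC, CLV) scenario, copied from the sensitivity table.
clv_assumptions = [10, 12, 15]
npv_by_cac = {
    4: [72, 129, 149],
    6: [62, 112, 133],
    8: [51, 92, 101],
}

# Equal weight on every scenario: a uniform prior over the parameter grid.
values = [v for row in npv_by_cac.values() for v in row]
average_npv = sum(values) / len(values)
print(f"Average = ${average_npv:.2f}m")
```

Replacing the equal weights with scenario probabilities would turn this into a weighted average over parameter values, which is the same idea as averaging over models.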
Bayesian Model Averaging (BMA) - 5 minute version
Bayesian Model Averaging: A Tutorial
Jennifer A. Hoeting, David Madigan, Adrian E. Raftery and Chris T. Volinsky
How much do you trust your VP and CFO, before you look at models?
Scary normalising term that you can ignore
Prior probability for model parameters
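The formulas on this slide did not survive extraction. For reference, the standard BMA equations from the cited tutorial are given below, with \Delta the quantity of interest, D the data, M_k the k-th model and \theta_k its parameters. Matching the three annotations to P(M_k), the denominator, and P(\theta_k \mid M_k) respectively is an assumption based on their wording.

```latex
\begin{align}
P(\Delta \mid D) &= \sum_{k=1}^{K} P(\Delta \mid M_k, D)\, P(M_k \mid D) \\
P(M_k \mid D) &= \frac{P(D \mid M_k)\, P(M_k)}{\sum_{l=1}^{K} P(D \mid M_l)\, P(M_l)} \\
P(D \mid M_k) &= \int P(D \mid \theta_k, M_k)\, P(\theta_k \mid M_k)\, d\theta_k
\end{align}
```

The denominator in the second line is the same for every model, so it can be ignored when only ranking models or comparing them pairwise.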
Bayesian answer to overfitting
Frequentist:
- model selection
- regularisation
Bayesian:
- BMA
- marginalisation
Case Study
You just got the best job in the galaxy
Your new boss
Business domain
Modelling case
Always test your models on synthetic data that you understand and control
Use Cases:
- Fraud Detection
- Inventory Sourcing
Data
Modelling goals
- A prediction range is needed, so that you can identify fraudulent transactions (Sand People under-reporting the real transaction size and pocketing the profit)
- Sale price should be easily explainable as a function of various droid features, so that Jabba can invest in appropriate scavenging/sourcing projects
- You want the lowest prediction error possible, so that you are not fed to the Sarlacc
Data Generation
[Diagram labels: Class-1, Class-2, Class-3, Class-4; durability, circuitry, height, weight, age, ...; price]
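A minimal sketch of what a synthetic data generator for this case study might look like. The actual generator script is not shown in the deck, so every range, coefficient and feature choice below is a hypothetical placeholder. The point is only that the "true" price function is known, including which features are irrelevant.

```python
import random


def generate_droids(n, seed=0):
    """Generate n synthetic droid sales with a known (hypothetical) price function."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        droid_class = rng.choice([1, 2, 3, 4])
        height = rng.uniform(0.5, 2.5)     # hypothetical range
        weight = rng.uniform(20.0, 200.0)  # hypothetical range
        age = rng.uniform(0.0, 50.0)       # hypothetical range
        colour_code = rng.randint(0, 2)    # deliberately irrelevant to price
        # Hypothetical ground truth: only class, weight and age drive price.
        noise = rng.gauss(0.0, 50.0)
        price = 500.0 * droid_class + 3.0 * weight - 4.0 * age + noise
        rows.append({
            "droid_class": droid_class,
            "height": height,
            "weight": weight,
            "age": age,
            "colour_code": colour_code,
            "price": price,
        })
    return rows


droids = generate_droids(1000)
print(droids[0])
```

Because height and colour_code never enter the price formula, a sound modelling method should assign them little or no importance, which gives a direct check on any technique before it touches real data.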
Data Collection
Model Selection - classical method
credits ~ height + weight + power + dents + rad + wheels + legs + red + blue + black + temperature + lat + long + ir_emit + dents_log + height_log + weight_log + power_log + rad_log
Adj. R2: 0.884974385182
Model Selection - backward elimination
Final Model
credits ~ weight + power + dents + rad + wheels + blue + black + temperature + lat + dents_log + height_log + weight_log + power_log
Adj. R2: 0.903544333611
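For readers unfamiliar with the metric, here is a small sketch of the standard adjusted R2 formula, which penalises the number of predictors. The sample size and raw R2 below are hypothetical; the deck reports neither.

```python
def adjusted_r2(r2, n_obs, n_features):
    """Adjusted R^2 for a linear model with an intercept and n_features predictors."""
    return 1.0 - (1.0 - r2) * (n_obs - 1) / (n_obs - n_features - 1)


# Hypothetical inputs: the same raw R^2 = 0.90 on 100 observations,
# once with 19 predictors (full formula) and once with 13 (final model).
full = adjusted_r2(0.90, n_obs=100, n_features=19)
reduced = adjusted_r2(0.90, n_obs=100, n_features=13)
print(f"full: {full:.4f}, reduced: {reduced:.4f}")
```

With the fit held equal, the model with fewer predictors gets the higher adjusted R2, which is why dropping weak features can improve the score.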
Model Evaluation (out-of-sample)
Ridge regression (L2 regularisation)
Bayesian Model Averaging for Linear Models - a special case
Regression coefficients and their inclusion probabilities are weighted across all possible models
Number of models = all include/exclude combinations of the K features = 2^K
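To make the model count concrete, the sketch below enumerates every include/exclude combination for a three-feature subset and then computes the size of the model space for the 19 candidate features in the full formula from this deck. The helper name `all_models` is illustrative.

```python
from itertools import combinations

# A small subset of the deck's features, for illustration.
features = ["height", "weight", "power"]


def all_models(features):
    """Return every subset of features, including the intercept-only model."""
    models = []
    for size in range(len(features) + 1):
        models.extend(combinations(features, size))
    return models


models = all_models(features)
print(len(models))  # one model per include/exclude combination

# The full formula lists 19 candidate features.
n_full = 2 ** 19
print(n_full)
```

At 19 features, exhaustive enumeration is still feasible for linear models, but the count doubles with every added feature.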
How to actually do BMA? (in R)
- BMA (cran.r-project.org/web/packages/BMA): Mature. A.k.a. "the original"
- BMS (cran.r-project.org/web/packages/BMS): Developed by a PhD student during research. Not maintained
- BAS (cran.r-project.org/web/packages/BAS): Newest. Maintained by the Chair of the Department of Statistical Science at Duke
BMA using BMS (R) package
Method               MSE
Model Selection      9736.49
L2 Regularisation    7782.21
BMA                  7329.44
It worked!
But you can find inputs to the data generator script for which it will not work as well!
Nice things you get from BMA
Posterior Inclusion Probability!
How cool is that!
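A posterior inclusion probability (PIP) for a feature is the total posterior probability of all models that contain it. The sketch below shows that definition on made-up numbers: the model probabilities are hypothetical, not output from the BMS run in the talk.

```python
# Hypothetical posterior model probabilities P(M_k | D); they sum to 1.
posterior = {
    ("weight", "power"): 0.50,
    ("weight", "power", "dents"): 0.30,
    ("weight",): 0.15,
    ("power", "dents"): 0.05,
}


def inclusion_probabilities(posterior):
    """Sum posterior model probabilities over the models containing each feature."""
    pip = {}
    for model, prob in posterior.items():
        for feature in model:
            pip[feature] = pip.get(feature, 0.0) + prob
    return pip


pip = inclusion_probabilities(posterior)
for feature, p in sorted(pip.items(), key=lambda kv: -kv[1]):
    print(f"{feature}: {p:.2f}")
```

A feature that appears in most of the high-probability models ends up with a PIP near 1, which gives a direct, interpretable measure of how much the data support including it.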
Model ranking!
MCMC can be used if the number of features is large
Best model, according to BMA
Can we use it for more complex models?
normalising term that you can ignore
http://www.ssc.wisc.edu/~bhansen/718/NonParametrics15.pdf
http://www.ejwagenmakers.com/2004/aic.pdf
Warning: Very questionable math. Does not work
Can we use BMA to combine complex (incl. hierarchical) models?
Model order is somewhat similar. Relative probabilities are not.
We need a working Reversible-Jump MCMC or something more sophisticated.
Not available in common Bayesian MCMC packages yet.
In Summary
- BMA is a Bayesian version of ML model ensembles
- The math behind it is quite beautiful
- Model averaging is useful for interpretation, not only prediction
- Invest in synthetic data generation before applying new modelling techniques to real-world data
- Even if you are not using BMA, fit different models, and combine them if your goal is prediction
- BMA works very well for common GLMs, but does not work yet for arbitrary models
- Do try it next time you need to fit OLS, though!
Of course we are hiring!
● (Snr, Mid) Data Scientists
● Solutions Architect
● Ruby Developer
● Data Engineer
● Senior Artist
● Technical Artist
● Unity Developers
● Senior Product Manager
● Product Director
http://jobs.productmadness.com/
