Are you sure about that!?
Uncertainty Quantification in AI
Florian Wilhelm Berlin, October 10th 2019
Dr. Florian Wilhelm
Principal Data Scientist @ inovex
@FlorianWilhelm
FlorianWilhelm
florianwilhelm.info
Mathematical Modelling
Data Science to Production
Recommender Systems
Uncertainty Quantification & Causality
Python Data Stack
Maintainer PyScaffold
Simon Bachstein
Data Scientist @ inovex
2018/07 – 2019/01 Master Thesis at inovex:
Uncertainty Quantification in Deep Learning
• Blogpost:
http://inovex.de/blog/uncertainty-quantification-deep-learning
• Master Thesis:
https://sbachstein.de/master_thesis.pdf
@simonbachstein
sbachstein
sbachstein.de
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Deep Networks cannot look beyond their horizon
Motivation
90% cat
10% dog
Deep Networks cannot look beyond their horizon
Motivation
40% cat
60% dog
Deep Networks cannot look beyond their horizon
Motivation
?
Boult, T. E., Cruz, S., Dhamija, A., Gunther, M., Henrydoss, J., & Scheirer, W. (2019). Learning and the Unknown: Surveying Steps Toward Open World Recognition. AAAI, 1–8. Retrieved from www.aaai.org
Learning and the Unknown
Simple Regression Problem
Interpolation
Simple Regression Problem
Deep Networks don’t extrapolate
Neural Arithmetic Logic Units, NIPS'18, Andrew Trask et al.
Simple Regression Problem
Deep Networks don’t extrapolate
Simple Regression Problem
Uncertainty about interpolation and extrapolation
Types of Uncertainty [figure: aleatoric vs. epistemic uncertainty]
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Methods for Uncertainty Quantification
Spectrum: relaxation of mathematical assumptions about the data
› Gaussian Processes
› Monte Carlo Dropout
› Deep Ensembles / Dropout Ensembles
› Quantile Regression
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
A Gaussian Process can be thought of as a random function
which is defined by its mean and covariance functions
Gaussian Processes
Definition
Gaussian Processes (figure slides)
› Example
› Inference
› Inference with perfect interpolation
› Inference with noisy observations
Gaussian Processes
Inference
Inference using given data points can be done analytically. For example, when assuming the (prior) mean function to be zero everywhere, we get the posterior mean and covariance at test inputs X*:
μ* = K(X*, X) [K(X, X) + σₙ²I]⁻¹ y
Σ* = K(X*, X*) − K(X*, X) [K(X, X) + σₙ²I]⁻¹ K(X, X*)
where K denotes the covariance function evaluated between the given point sets and σₙ² is the observation noise.
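A minimal numpy sketch of these posterior equations (illustrative: a squared exponential kernel with fixed hyperparameters is assumed, and names like gp_posterior are our own):

```python
import numpy as np

def rbf_kernel(a, b, length_scale=1.0, variance=1.0):
    """Squared exponential covariance k(a, b) for 1-d inputs."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return variance * np.exp(-0.5 * d2 / length_scale**2)

def gp_posterior(x_train, y_train, x_test, noise=0.1):
    """Posterior mean and covariance, assuming a zero prior mean."""
    K = rbf_kernel(x_train, x_train) + noise**2 * np.eye(len(x_train))
    K_s = rbf_kernel(x_test, x_train)                 # K(X*, X)
    K_ss = rbf_kernel(x_test, x_test)                 # K(X*, X*)
    mean = K_s @ np.linalg.solve(K, y_train)          # K(X*,X) [K + s^2 I]^-1 y
    cov = K_ss - K_s @ np.linalg.solve(K, K_s.T)
    return mean, cov

x = np.linspace(-3, 3, 20)
y = np.sin(x) + 0.1 * np.random.randn(20)
mu, cov = gp_posterior(x, y, np.linspace(-5, 5, 100))
std = np.sqrt(np.diag(cov))   # uncertainty band; grows away from the data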
Good introduction:
Bayesian Non-parametric Models for Data Science using PyMC by Christopher Fonnesbeck
• https://www.youtube.com/watch?v=-sIOMs4MSuA
• https://de.slideshare.net/mlreview/bayesian-nonparametric-models-for-data-science-using-pymc
Computationally intense: exact inference requires solving linear systems with the n × n covariance matrix, which scales as O(n³) in the number of training points.
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
MC Dropout
Dropout as a Bayesian Approximation: Representing Model Uncertainty in Deep Learning, ICML 2016, Yarin Gal et al.
Idea: keep dropout active at prediction time; averaging many stochastic forward passes approximates Bayesian inference over the network's weights.
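In code, the idea reduces to sampling several stochastic forward passes with dropout still enabled; a minimal PyTorch-style sketch (the toy network and sample count are our own choices, not from the paper):

```python
import torch
import torch.nn as nn

model = nn.Sequential(                       # toy regression network
    nn.Linear(1, 20), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(20, 20), nn.ReLU(), nn.Dropout(p=0.1),
    nn.Linear(20, 1),
)

def mc_dropout_predict(model, x, n_samples=100):
    """Keep dropout active at inference and sample the predictive distribution."""
    model.train()                            # train mode keeps dropout stochastic
    with torch.no_grad():
        preds = torch.stack([model(x) for _ in range(n_samples)])
    return preds.mean(dim=0), preds.std(dim=0)   # predictive mean and spread

x = torch.linspace(-5, 5, 200).unsqueeze(1)
mean, std = mc_dropout_predict(model, x)
```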
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Deep Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et al.
Custom loss function: capture uncertainty directly at training time by letting each network predict a mean μ(x) and a variance σ²(x), trained with the Gaussian negative log likelihood
−log p(y|x) = log σ²(x) / 2 + (y − μ(x))² / (2 σ²(x)) + const.
Deep Ensembles
Simple and Scalable Predictive Uncertainty Estimation using Deep Ensembles, NIPS 2017, Balaji Lakshminarayanan et al.
Combine an ensemble of M networks as a uniform Gaussian mixture:
μ*(x) = (1/M) Σₘ μₘ(x),  σ*²(x) = (1/M) Σₘ (σₘ²(x) + μₘ²(x)) − μ*²(x)
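A sketch of both ingredients under these definitions (assuming each member network has a mean head and a positive variance head; the function names are our own):

```python
import torch

def gaussian_nll(mu, var, y):
    """Gaussian negative log likelihood (constant dropped); var must be
    positive, e.g. produced by a softplus on the variance head."""
    return (0.5 * torch.log(var) + 0.5 * (y - mu) ** 2 / var).mean()

def combine_ensemble(mus, vars_):
    """Combine M member predictions as a uniform Gaussian mixture.
    mus, vars_: tensors of shape (M, batch, 1)."""
    mu_star = mus.mean(dim=0)                              # (1/M) sum mu_m
    var_star = (vars_ + mus ** 2).mean(dim=0) - mu_star ** 2
    return mu_star, var_star
```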
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Dropout Ensembles
The best of both worlds? Train an ensemble of networks that each use dropout, and pool the Monte Carlo dropout samples of all members (see the sketch below).
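A minimal sketch of this pooling, reusing the MC Dropout idea from above (helper name and sample counts are illustrative):

```python
import torch

def dropout_ensemble_predict(models, x, n_passes=100):
    """Pool MC Dropout samples across all ensemble members."""
    samples = []
    for model in models:
        model.train()                        # keep dropout stochastic
        with torch.no_grad():
            samples += [model(x) for _ in range(n_passes)]
    samples = torch.stack(samples)           # (M * n_passes, batch, 1)
    return samples.mean(dim=0), samples.std(dim=0)
```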
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Using the cumulative distribution function (cdf) F_Y of a random variable Y, we define the τ-quantile:
q_τ(Y) = inf{ y : F_Y(y) ≥ τ }
Loss function to estimate the quantile (pinball loss):
L_τ(y, q̂) = τ (y − q̂) if y ≥ q̂, and (1 − τ) (q̂ − y) otherwise
Quantile Regression
Intuition behind Quantile Regression
[number line from 0.0 to 1.0]
Assume the median estimate q̂_τ(x) (τ = 0.5) lies below all four observations: nothing on the side y ≤ q̂_τ(x); the observations with y > q̂_τ(x) lie at distances 0.1, 0.2, 0.5 and 0.8.
Error: 0.0 + 1.6 = 1.6
Intuition behind Quantile Regression
[number line from 0.0 to 1.0]
Move q̂_τ(x) up to the lowest observation (distance 0.0); the remaining observations with y > q̂_τ(x) lie at distances 0.1, 0.4 and 0.7.
Error: 0.0 + 1.2 = 1.2
Intuition behind Quantile Regression
[number line from 0.0 to 1.0]
Move q̂_τ(x) further up: observations with y ≤ q̂_τ(x) at distances 0.0 and 0.1, observations with y > q̂_τ(x) at distances 0.3 and 0.6.
Error: 0.1 + 0.9 = 1.0
Intuition behind Quantile Regression
[number line from 0.0 to 1.0]
Move q̂_τ(x) up by another 0.1: the distances below (0.1 and 0.2) each grow by +0.1, while the distances above (0.2 and 0.5) each shrink by -0.1.
Error: 0.3 + 0.7 = 1.0
No change due to the linearity of the error! With equally many observations on both sides, shifting the estimate leaves the total error constant; this is exactly the defining property of the median.
Now the 0.75th Quantile
[number line from 0.0 to 1.0]
Assume the estimate q̂_τ(x) for τ = 0.75 is at the same position: observations with y ≤ q̂_τ(x) at distances 0.1 and 0.2, observations with y > q̂_τ(x) at distances 0.2 and 0.5.
Error: (1 − 0.75) ⋅ 0.3 + 0.75 ⋅ 0.7 = 0.6
The right-side error weighs 3 times as much as the left-side error.
Now the 0.75th Quantile
[number line from 0.0 to 1.0]
Move q̂_τ(x) (τ = 0.75) upwards: observations with y ≤ q̂_τ(x) at distances 0.1, 0.4 and 0.5, one observation with y > q̂_τ(x) at distance 0.2.
Error: (1 − 0.75) ⋅ 1.0 + 0.75 ⋅ 0.2 = 0.4
A change in the right-side error also weighs 3 times as much as one in the left-side error, so the optimum settles where 75% of the observations lie below the estimate.
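The pinball loss makes this bookkeeping mechanical; a small numpy sketch reproducing the τ = 0.75 example above (the absolute position of the estimate is an arbitrary choice, only the distances matter):

```python
import numpy as np

def pinball_loss(y, q_hat, tau):
    """Quantile (pinball) loss, summed over observations."""
    diff = y - q_hat
    return np.sum(np.where(diff >= 0, tau * diff, (tau - 1) * diff))

# Distances from the tau = 0.75 example: two observations below the
# estimate (0.1 and 0.2 away), two above (0.2 and 0.5 away).
q_hat = 0.3                                   # illustrative estimate position
y = np.array([0.2, 0.1, 0.5, 0.8])
print(pinball_loss(y, q_hat, tau=0.75))       # (1-0.75)*0.3 + 0.75*0.7 = 0.6
```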
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
Experiments
Dataset: samples are generated according to a given function with added noise.
Uncertainty in Deep Learning (PhD thesis), Yarin Gal, http://mlg.eng.cam.ac.uk/yarin/blog_2248.html
Experiments: Dataset [plot of the generated samples]
Neural networks
› 2 hidden layers with 20 ReLU neurons each
› 5 networks for Deep Ensembles
› 100 iterations for Dropout predictions
› Adam optimizer with batch size of 128
› LR, weight decay, dropout probability are optimized
Gaussian Processes
› squared exponential covariance and zero mean function prior
› covariance function parameters and aleatory noise are optimized
Experiments
Network setup and hyperparameters
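A sketch of the training loop matching this setup (the concrete learning rate and weight decay values are placeholders, since these were tuned per method):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def train(model, x, y, lr=1e-3, weight_decay=1e-4, epochs=100):
    """Adam with batch size 128, as described above; lr, weight decay
    (and the dropout probability) were optimized in the experiments."""
    opt = torch.optim.Adam(model.parameters(), lr=lr, weight_decay=weight_decay)
    loader = DataLoader(TensorDataset(x, y), batch_size=128, shuffle=True)
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            loss = torch.mean((model(xb) - yb) ** 2)  # or gaussian_nll for Deep Ensembles
            loss.backward()
            opt.step()
```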
Mean squared error (MSE)
Mean negative log likelihood (MNLL)
Mean Kullback-Leibler (KL) divergence
Experiments
Measures for generalization quality
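A sketch of these three measures, assuming Gaussian predictive distributions so that the KL divergence against the true sampling distribution has a closed form (function names are our own):

```python
import numpy as np

def mse(y, mu):
    """Mean squared error of the predictive mean."""
    return np.mean((y - mu) ** 2)

def mnll(y, mu, var):
    """Mean negative log likelihood of a Gaussian predictive distribution."""
    return np.mean(0.5 * np.log(2 * np.pi * var) + 0.5 * (y - mu) ** 2 / var)

def mean_gaussian_kl(mu_p, var_p, mu_q, var_q):
    """Mean KL(p || q) between two Gaussians, pointwise over the test set."""
    return np.mean(np.log(np.sqrt(var_q / var_p))
                   + (var_p + (mu_p - mu_q) ** 2) / (2 * var_q) - 0.5)
```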
Experiments: results (plots)
› Interpolation
› Extrapolation: they still don't extrapolate, and they don't quite realize it
› Gaussian Process
› Convergence
› Heteroscedastic noise
› Non-Gaussian noise
› Uncertainty split into aleatoric and epistemic parts
Summary
                       GP    MCD   DeepE    DropoutE  QR
Homoscedastic noise    ++    o     +        o         o
Heteroscedastic noise  --    -     ++       +         +
Non-Gaussian noise     +     o     +        +         -
Convergence            ++    -     +        -         +
Speed                  (--)  +     - / (+)  +         ++
Uncertainty split      yes   no    yes      yes       no
1. Motivation
2. Methods
a. Gaussian Processes
b. Monte-Carlo Dropout
c. Deep Ensembles
d. Dropout Ensembles
e. Quantile Regression
3. Experiments
4. Conclusion & Outlook
Agenda
› The neural network approaches discussed here are well aware of aleatory uncertainty, but they are not capable of correctly estimating epistemic uncertainty
› Gaussian Processes give clear signals about their own ignorance, but they do not scale
A combined solution needs to be developed, because uncertainty estimation is essential in critical applications
Conclusion
There is work to be done
› Bayesian Neural Networks (e.g. with PyMC)
› Sparse Gaussian Process approximations
› Gaussian Processes on top of neural networks
Outlook
Other approaches
Thank You!
Florian Wilhelm
Principal Data Scientist
inovex GmbH
Schanzenstraße 6-20
Kupferhütte 1.13
51063 Köln
florian.wilhelm@inovex.de