Understanding Black-box Predictions via Influence Functions
Introduction
A key question often asked of machine learning systems is: "Why did the system make this prediction?" How can we explain where the model's predictions come from? In this paper, we tackle this question by tracing a model's predictions through its learning algorithm and back to the training data, from which the model parameters ultimately derive.
Answering this question by perturbing the data and retraining the model
can be prohibitively expensive. To overcome this problem, we use
influence functions, a classic technique from robust statistics (Cook &
Weisberg, 1980) that tells us how the model parameters change as we
upweight a training point by an infinitesimal amount.
Methods
Approach
We are given training points z_1, …, z_n, where z_i = (x_i, y_i) ∈ X × Y. For a point z and parameters θ ∈ Θ, let L(z, θ) be the loss, and let R(θ) ≝ (1/n) Σ_{i=1}^n L(z_i, θ) be the empirical risk. Assume that the empirical risk is twice-differentiable and strictly convex in θ, with minimizer θ̂ ≝ argmin_θ R(θ).
Approach
Model param. by training w/o z :
Model param. by upweighting z :
Model param. by perturbing z :
Approach
Let us begin by studying the change in model parameters due to removing a point z from the training set. Formally, this change is θ̂_{−z} − θ̂. The analogous changes for upweighting and perturbing z are θ̂_{ε,z} − θ̂ and θ̂_{ε,z_δ,−z} − θ̂, respectively.
Influence function - derivation of I_up,params
I_up,params influence

I_up,params(z) ≝ dθ̂_{ε,z}/dε |_{ε=0} = −H_θ̂⁻¹ ∇_θ L(z, θ̂),

where H_θ̂ ≝ (1/n) Σ_{i=1}^n ∇_θ² L(z_i, θ̂) is the Hessian and is positive definite (PD) by assumption. In essence, we form a quadratic approximation to the empirical risk around θ̂ and take a single Newton step; see appendix A for a derivation. Since removing a point z is the same as upweighting it by ε = −1/n, we can linearly approximate the parameter change due to removing z by computing θ̂_{−z} − θ̂ ≈ −(1/n) I_up,params(z), without retraining the model.
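To make this concrete, here is a small sketch (not from the slides; a hypothetical numpy example using an L2-regularized logistic loss, with the regularizer folded into each per-example loss so the formulas above apply directly). It fits θ̂ by Newton's method, computes I_up,params(z) = −H_θ̂⁻¹ ∇_θ L(z, θ̂), and checks the linear approximation −(1/n)·I_up,params(z) against actually retraining without z:

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def fit_logreg(X, y, lam=0.1, iters=50):
    """L2-regularized logistic regression (labels in {-1,+1}) via Newton's method.
    The regularizer is folded into each per-example loss:
    L(z_i, theta) = log(1 + exp(-y_i x_i^T theta)) + (lam/2) ||theta||^2."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(y * (X @ theta))                       # P(correct label)
        grad = -(1.0 / n) * X.T @ (y * (1 - p)) + lam * theta
        H = (1.0 / n) * (X.T * (p * (1 - p))) @ X + lam * np.eye(d)
        theta -= np.linalg.solve(H, grad)
    return theta

rng = np.random.default_rng(0)
n, d, lam = 40, 3, 0.1
X = rng.normal(size=(n, d))
y = np.sign(X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=n))

theta = fit_logreg(X, y, lam=lam)

# Hessian of the empirical risk at theta_hat (PD thanks to the L2 term).
p = sigmoid(y * (X @ theta))
H = (1.0 / n) * (X.T * (p * (1 - p))) @ X + lam * np.eye(d)

# I_up,params(z_k) = -H^{-1} grad_theta L(z_k, theta_hat)
k = 0
grad_k = -y[k] * (1 - p[k]) * X[k] + lam * theta
I_up_params = -np.linalg.solve(H, grad_k)

# Linear approximation to leave-one-out retraining vs. the real thing.
approx_change = -(1.0 / n) * I_up_params
mask = np.arange(n) != k
actual_change = fit_logreg(X[mask], y[mask], lam=lam) - theta
```

On a small strictly convex problem like this, the approximation should track the true leave-one-out parameter change closely without any retraining.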
Influence of upweighting on the loss (I_up,loss)

Applying the chain rule gives the influence of upweighting z on the loss at a test point z_test:

I_up,loss(z, z_test) ≝ dL(z_test, θ̂_{ε,z})/dε |_{ε=0} = −∇_θ L(z_test, θ̂)ᵀ H_θ̂⁻¹ ∇_θ L(z, θ̂)
Perturbing a training input
For a training point z = (x, y), define z_δ ≝ (x + δ, y). Consider the perturbation z → z_δ, and let θ̂_{z_δ,−z} be the empirical risk minimizer on the training points with z_δ in place of z. To approximate its effects, define the parameters resulting from moving ε mass from z onto z_δ.
Perturbing a training input
If x is continuous and δ is small, we can apply a first-order Taylor approximation: for small h, F(x + h) − F(x) ≈ F′(x)·h.
Efficient calculation
How can we compute s_test ≝ H_θ̂⁻¹ ∇_θ L(z_test, θ̂) efficiently?
We discuss two techniques for approximating s_test, both relying on the fact that the HVP (Hessian-vector product) of a single term in H_θ̂, [∇_θ² L(z, θ̂)]v, can be computed for arbitrary v in the same time that ∇_θ L(z, θ̂) would take, which is typically O(p) (Pearlmutter, 1994).
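As a sanity check on this claim (a hypothetical numpy sketch, not from the slides): for the logistic loss, the Hessian has the form H = (1/n) Xᵀ diag(w) X, so an HVP reduces to two matrix-vector products, costing O(np), and H never needs to be formed explicitly:

```python
import numpy as np

rng = np.random.default_rng(1)
n, p_dim = 200, 10
X = rng.normal(size=(n, p_dim))          # design matrix
theta = rng.normal(size=p_dim)           # some parameter vector
y = np.sign(rng.normal(size=n))          # labels in {-1, +1}

# For the logistic loss, H = (1/n) X^T diag(w) X,
# where w_i = s_i (1 - s_i) and s_i is the predicted probability of label y_i.
s = 1.0 / (1.0 + np.exp(-y * (X @ theta)))
w = s * (1 - s)

def hvp(v):
    # Hessian-vector product via two matrix-vector products: O(np), H never formed.
    return X.T @ (w * (X @ v)) / n

v = rng.normal(size=p_dim)
H_explicit = (X.T * w) @ X / n           # O(n p^2): built only to check the HVP
```

The same factored structure is what makes the CG and stochastic estimators on the next slides practical at scale.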
Efficient calculation - Conjugate gradients (CG)
Since H_θ̂ ≻ 0 by assumption, H_θ̂⁻¹ v ≡ argmin_t { (1/2) tᵀ H_θ̂ t − vᵀ t }. We can solve this with CG approaches that only require the evaluation of H_θ̂ t, which takes O(np) time, without explicitly forming H_θ̂. This yields s_test ≝ H_θ̂⁻¹ ∇_θ L(z_test, θ̂).
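A minimal CG solver built on that idea might look as follows (a hedged numpy sketch, not the authors' implementation; the dense matrix H below is a small stand-in for the true Hessian, which in practice would only be accessed through HVPs):

```python
import numpy as np

def conjugate_gradient(hvp, v, iters=50, tol=1e-20):
    """Solve H t = v using only Hessian-vector products (H assumed PD)."""
    t = np.zeros_like(v)
    r = v - hvp(t)            # residual
    d = r.copy()              # search direction
    for _ in range(iters):
        rr = r @ r
        if rr < tol:
            break
        Hd = hvp(d)
        alpha = rr / (d @ Hd)
        t += alpha * d
        r -= alpha * Hd
        d = r + (r @ r / rr) * d
    return t

# Hypothetical small PD matrix standing in for H_theta_hat.
rng = np.random.default_rng(2)
A = rng.normal(size=(6, 6))
H = A @ A.T + 6 * np.eye(6)               # symmetric positive definite
grad_test = rng.normal(size=6)            # stands in for grad_theta L(z_test, theta_hat)

s_test = conjugate_gradient(lambda t: H @ t, grad_test)
```

In exact arithmetic, CG recovers H⁻¹v in at most p iterations; in practice it is truncated early, trading accuracy for time.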
Efficient calculation - Stochastic estimation

s_test ≝ H_θ̂⁻¹ ∇_θ L(z_test, θ̂)

Dropping the θ̂ subscript for clarity, let H_j⁻¹ ≝ Σ_{i=0}^{j} (I − H)^i, the first j terms in the Taylor expansion of H⁻¹ (valid when ‖I − H‖ < 1, which can be ensured by scaling the loss). Rewrite this recursively as H_j⁻¹ = I + (I − H) H_{j−1}⁻¹. From the validity of the Taylor expansion, H_j⁻¹ → H⁻¹ as j → ∞. The key is that at each iteration, we can substitute the full H with a draw from any unbiased (and faster-to-compute) estimator of H to form H̃_j. Since E[H̃_j⁻¹] = H_j⁻¹, we still have E[H̃_j⁻¹] → H⁻¹.
Efficient calculation - Stochastic estimation

At each iteration, sample a training point z_{s_j} and update:

H̃_j⁻¹ v = v + (I − ∇_θ² L(z_{s_j}, θ̂)) H̃_{j−1}⁻¹ v

Empirically, we found this significantly faster than CG.
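The recursion can be sketched as follows (a hypothetical numpy example; for clarity it uses the full H at every step, whereas the stochastic version substitutes a single-example Hessian ∇_θ² L(z_{s_j}, θ̂). The matrix is constructed so that ‖I − H‖ < 1, as the Taylor expansion requires):

```python
import numpy as np

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(5, 5)))
# Symmetric H with eigenvalues in (0, 2), so ||I - H|| < 1 and the series converges.
H = Q @ np.diag(np.linspace(0.2, 1.5, 5)) @ Q.T
v = rng.normal(size=5)

# Recursion: H_j^{-1} v = v + (I - H) H_{j-1}^{-1} v, starting from H_0^{-1} v = v.
ihvp = v.copy()
for _ in range(500):
    ihvp = v + ihvp - H @ ihvp    # same as v + (I - H) @ ihvp
```

Each step costs one (stochastic) HVP, which is what makes this attractive when p is large.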
Non-convexity and non-convergence
Our approach is to form a convex quadratic approximation of the loss around θ̃, i.e.,

L̃(z, θ) = L(z, θ̃) + ∇L(z, θ̃)ᵀ(θ − θ̃) + (1/2)(θ − θ̃)ᵀ(H_θ̃ + λI)(θ − θ̃)

Here, λ is a damping term that we add if H_θ̃ has negative eigenvalues; this corresponds to adding L2 regularization on θ. We then calculate I_up,loss using L̃. If θ̃ is close to a local minimum, this is correlated with the result of taking a Newton step from θ̃ after removing ε weight from z.
Why damping works: let X ∈ R^{m×m} be a symmetric matrix with eigendecomposition X = UΣUᵀ. Since I = UIUᵀ, adding the damping term shifts every eigenvalue by λ:

X + λI = U(Σ + λI)Uᵀ
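This eigenvalue shift is easy to verify numerically (a small numpy illustration, not from the slides):

```python
import numpy as np

# Symmetric matrix with a negative eigenvalue (an indefinite "Hessian").
X = np.array([[1.0, 2.0],
              [2.0, 1.0]])             # eigenvalues: 3 and -1
lam = 1.5                              # damping term

eigs_before = np.linalg.eigvalsh(X)
eigs_after = np.linalg.eigvalsh(X + lam * np.eye(2))
# Each eigenvalue is shifted up by exactly lam, so X + lam*I becomes PD
# once lam exceeds the magnitude of the most negative eigenvalue.
```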
IHVP via the LiSSA algorithm
Applications
Applications - Understanding model behavior
Influence functions reveal insights into how models rely on and extrapolate from the training data.
Inception-v3 vs. an RBF SVM (with smooth hinge loss):
• The Inception network (a DNN) picked up on the distinctive characteristics of the fish.
• The RBF SVM pattern-matched training images superficially.
Application - Adversarial training examples
Training datasets are vulnerable to attack. Can we create adversarial training examples?
Application - Debugging domain mismatch
If a model makes a mistake, can we find out why?

Domain mismatch — where the training distribution does not match the test distribution — can cause models with high training accuracy to do poorly on test data.

(…) we predicted whether a patient would be re-admitted to the hospital. We used logistic regression to predict re-admission with a balanced training dataset of 20K diabetic patients from 100+ US hospitals, each represented by 127 features. (…) This caused the model to wrongly classify many children in the test set.

Training set composition (Original -> Modified):
  Adults (healthy + re-admitted):  ~20k -> ~20k  (same)
  Healthy children:                21   -> 1     (-20)
  Re-admitted children:            3    -> 3     (same)
Application - Debugging domain mismatch
True test label: healthy children. Model predicts: re-admitted children.

[Figure: influence values (y-axis from -0.1 to 0.1) of the top 20 influential training examples.]
Application - Fixing mislabeled examples
Training labels are noisy, and we have a small budget to manually inspect them. Can we prioritize which labels to try to fix?

Even if a human expert could recognize wrongly labeled examples, it is impossible in many applications to manually review all of the training data. We show that influence functions can help human experts prioritize their attention, allowing them to inspect only the examples that actually matter.

We flipped the labels of a random 10% of the training data.

[Figure: rows of Ham/Spam training emails, some with flipped labels.]
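One way to realize this prioritization (a hypothetical numpy sketch, not the paper's code: training points are ranked by the magnitude of their self-influence, here g_iᵀ H⁻¹ g_i, on a small separable logistic-regression problem with a few flipped labels):

```python
import numpy as np

def sigmoid(u):
    return 1.0 / (1.0 + np.exp(-u))

def fit_logreg(X, y, lam=0.1, iters=30):
    """L2-regularized logistic regression (labels in {-1,+1}) via Newton's method."""
    n, d = X.shape
    theta = np.zeros(d)
    for _ in range(iters):
        p = sigmoid(y * (X @ theta))
        grad = -(1.0 / n) * X.T @ (y * (1 - p)) + lam * theta
        H = (1.0 / n) * (X.T * (p * (1 - p))) @ X + lam * np.eye(d)
        theta -= np.linalg.solve(H, grad)
    return theta

rng = np.random.default_rng(4)
n, lam = 60, 0.1
X = rng.normal(size=(n, 2))
y_clean = np.sign(X[:, 0])
X[:, 0] += y_clean                        # create a margin: data is now separable
flip_idx = rng.choice(n, size=6, replace=False)
y = y_clean.copy()
y[flip_idx] *= -1                         # simulate 10% mislabeled points

theta = fit_logreg(X, y, lam=lam)
p = sigmoid(y * (X @ theta))
H = (1.0 / n) * (X.T * (p * (1 - p))) @ X + lam * np.eye(2)

# Per-example gradients (regularizer folded into each per-example loss).
G = -(y * (1 - p))[:, None] * X + lam * theta
# Self-influence magnitude g_i^T H^{-1} g_i: large values flag likely mislabels.
self_influence = np.einsum('id,id->i', G, np.linalg.solve(H, G.T).T)
ranking = np.argsort(-self_influence)     # inspect the most influential points first
```

Under this kind of setup, the flipped points should cluster near the top of the ranking, so an inspector checking only the head of the list finds most of the label errors.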
Application - Fixing mislabeled examples
Plots of how test accuracy (left) and the fraction of flipped data detected (right) change with the fraction of training data checked.
References
Pang Wei Koh and Percy Liang. "Understanding Black-box Predictions via Influence Functions." ICML 2017 (Best Paper). Paper link: https://arxiv.org/abs/1703.04730
Microsoft Research talk: "Understanding Black-box Predictions via Influence Functions" (by Pang Wei Koh). YouTube: https://youtu.be/0w9fLX_T6tY