Understanding black-box
predictions via influence
functions
XIE Ruiming
Outline
• Background
• Taylor's Formula
• Newton's Method
• Introduction
• Influence Function
• Definition
• Efficiently Calculating Influence
• Validation and Extensions
• Use cases of influence functions
Background : Taylor
• Taylor's theorem gives an approximation of a k-times
differentiable function around a given point by a k-th
order Taylor polynomial
• Linear approximation
• Quadratic approximation
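As a small illustration (not from the slides), the linear and quadratic Taylor approximations can be compared numerically, here for exp around 0:

```python
import math

def taylor_exp(x, order):
    # k-th order Taylor polynomial of exp around 0: sum of x^i / i!
    return sum(x**i / math.factorial(i) for i in range(order + 1))

x = 0.1
linear = taylor_exp(x, 1)      # 1 + x
quadratic = taylor_exp(x, 2)   # 1 + x + x^2/2
exact = math.exp(x)
# the quadratic approximation is closer to exp(x) than the linear one
```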
Background : Newton
• Find x such that F(x) = 0 by iteration.
• Recall Taylor's formula:
• F(a) ≈ F(x_n) + F'(x_n)(a – x_n)
• Setting F(a) = 0 gives a = x_n – F(x_n)/F'(x_n)
• Newton's method in optimization
• If x* = argmin F(x), then F'(x*) = 0
• So apply Newton's method to F'(x):
• x_{n+1} = x_n – F'(x_n)/F''(x_n)
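The two uses of the update above can be sketched in a few lines (the example functions are invented for illustration):

```python
def newton(f, fprime, x0, iters=30):
    # Newton iteration: x_{n+1} = x_n - f(x_n) / f'(x_n)
    x = x0
    for _ in range(iters):
        x = x - f(x) / fprime(x)
    return x

# root finding: F(x) = x^2 - 2 has root sqrt(2)
root = newton(lambda x: x * x - 2, lambda x: 2 * x, 1.0)

# optimization: minimize F(x) = (x - 3)^2 by finding the root of F'(x) = 2(x - 3)
minimum = newton(lambda x: 2 * (x - 3), lambda x: 2.0, 0.0)
```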
Background : Newton
• x = the model's parameter vector θ; in the multivariate case the update uses the gradient and the Hessian: θ_{n+1} = θ_n – H^{-1} ∇F(θ_n)
Introduction
• Why did the model make this prediction?
• Retrieving images that maximally activate a neuron [Girshick et al. 2014]
• Finding the most influential part of the image [Zhou et al. 2016]
• But these methods assume a fixed model
Introduction
• Existing methods
• Treat the model as fixed
• Explain the prediction w.r.t. the parameters or the test input
• Our method
• Treat the model as a function of its training data
• Explain the prediction w.r.t. the training data "most responsible" for it
• How would the prediction change if we up-weighted or modified a training point?
Influence Function
• Introduction
• Efficient calculation
• Validation and extensions
Influence Function
• The original loss function: R(θ) = (1/n) Σ_i L(z_i, θ)
• The optimized parameters: θ̂ = argmin_θ R(θ)
• If we up-weight a point z by ε, the new loss function is R(θ) + ε·L(z, θ)
• The new optimized parameters: θ̂_ε = argmin_θ { R(θ) + ε·L(z, θ) }
Influence Function
• We are interested in the parameter change and the test-loss change.
• Parameter change: θ̂_ε – θ̂
• Loss change: L(z_test, θ̂_ε) – L(z_test, θ̂)
Influence Function
• We define two influence functions:
• I_up,params(z) = dθ̂_ε/dε |_{ε=0} = –H^{-1} ∇_θ L(z, θ̂), where H is the Hessian of R at θ̂
• I_up,loss(z, z_test) = dL(z_test, θ̂_ε)/dε |_{ε=0} = –∇_θ L(z_test, θ̂)ᵀ H^{-1} ∇_θ L(z, θ̂)
• Both come from the first-order expansion F(ε) ≈ F(0) + ε·F'(0)
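These definitions can be sanity-checked numerically. Below is a minimal sketch on toy least-squares data (the data and loss are invented for illustration): up-weight one training point by a small ε, re-solve, and compare the parameter shift against ε · I_up,params(z).

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.normal(size=50)
n = len(X)

# mean squared loss R(w) = (1/n) sum 0.5*(x_i @ w - y_i)^2
w_hat = np.linalg.solve(X.T @ X, X.T @ y)        # minimizer of R
H = X.T @ X / n                                   # Hessian of R
grad = lambda x, t, w: (x @ w - t) * x            # per-example loss gradient

z = 0                                             # training point to up-weight
I_params = -np.linalg.solve(H, grad(X[z], y[z], w_hat))

# re-solve the eps-up-weighted least-squares problem and compare
eps = 1e-4
A = X.T @ X / n + eps * np.outer(X[z], X[z])
b = X.T @ y / n + eps * y[z] * X[z]
w_eps = np.linalg.solve(A, b)
# (w_eps - w_hat) / eps should match I_params to first order
```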
Deriving
• Apply a Taylor expansion to the optimality condition.
• Let G(θ) = ∇_θ [ R(θ) + ε·L(z, θ) ]; since θ̂_ε minimizes the new loss, G(θ̂_ε) = 0
• Expanding around θ̂: 0 = G(θ̂_ε) ≈ G(θ̂) + ∇G(θ̂)(θ̂_ε – θ̂)
• With ∇R(θ̂) = 0 we get G(θ̂) = ε ∇_θ L(z, θ̂), and (dropping the O(ε) term in ∇G) θ̂_ε – θ̂ ≈ –ε H^{-1} ∇_θ L(z, θ̂)
Deriving
• Finally, by the chain rule: I_up,loss(z, z_test) = dL(z_test, θ̂_ε)/dε = –∇_θ L(z_test, θ̂)ᵀ H^{-1} ∇_θ L(z, θ̂)
Deriving other functions
Perturbing a training input
• If we change (x, y) to (x + δ, y), how does the test loss change?
• Changing (x, y) to (x + δ, y) is equivalent to removing (x, y) and adding (x + δ, y)
Efficient calculation
• Two challenges:
• computing the inverse Hessian
• computing the influence function for all training points
• n training points, p parameters
• Inverting the Hessian explicitly: O(np² + p³)
• Conjugate gradients (see paper): O(np) per Hessian-vector product
• Stochastic estimation (see paper)
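The conjugate-gradient trick can be sketched as follows: H is never materialized, only Hessian-vector products, which for an average squared loss cost O(np) each. The data and plain-numpy CG here are illustrative, not the paper's implementation:

```python
import numpy as np

def conjugate_gradient(hvp, b, iters=100, tol=1e-12):
    """Solve H x = b using only Hessian-vector products hvp(v) = H @ v."""
    x = np.zeros_like(b)
    r = b - hvp(x)               # residual
    p_dir = r.copy()
    rs = r @ r
    for _ in range(iters):
        Hp = hvp(p_dir)
        alpha = rs / (p_dir @ Hp)
        x += alpha * p_dir
        r -= alpha * Hp
        rs_new = r @ r
        if rs_new < tol:
            break
        p_dir = r + (rs_new / rs) * p_dir
        rs = rs_new
    return x

# For an average squared loss, H v = X^T (X v) / n: each product is O(np)
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 5))
n = len(X)
hvp = lambda v: X.T @ (X @ v) / n
v = rng.normal(size=5)
s = conjugate_gradient(hvp, v)   # s approximates H^{-1} v
```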
Validation and Extensions
• The derivation makes some assumptions and approximations:
• the model parameters minimize the loss
• the loss is twice-differentiable
• We want to check how influence functions perform when these assumptions are violated.
Validation and Extensions
• Influence function vs leave-one-out retraining
• actually retrain a linear regression model after removing each training point
Validation and Extensions
• Non-convexity and non-convergence
• When θ̂ is not an exact minimizer, the estimated loss change is slightly different (see paper)
• Even with a non-convex loss, the approximation holds up well
• Pearson's correlation = 0.86
Validation and Extensions
• Non-differentiable losses
• Hinge loss: we can approximate it with a smooth surrogate
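One such smoothing (a common softplus-style choice; the exact form used in the paper may differ) replaces max(0, 1 − s) with t·log(1 + exp((1 − s)/t)), which is twice-differentiable and approaches the hinge as t → 0:

```python
import math

def hinge(s):
    return max(0.0, 1.0 - s)

def smooth_hinge(s, t=0.1):
    # softplus smoothing: t * log(1 + exp((1 - s) / t)),
    # written stably to avoid overflow for large (1 - s) / t
    u = (1.0 - s) / t
    return t * (max(u, 0.0) + math.log1p(math.exp(-abs(u))))
```

With a small temperature t, the smooth version tracks the hinge closely while keeping the gradients and Hessians that the influence derivation needs.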
Use cases of influence functions
• Understanding model behavior
• Fixing mislabeled examples
• Adversarial training examples
• Debugging domain mismatch (see paper)
Understanding model behaviors
• Model 1: Inception v3 with all but the top layer frozen
• Model 2: SVM with an RBF kernel
• Task: binary image classification, fish vs. dog
Fixing mislabeled examples
• We only have the training set.
• What do we usually do?
• Inspect the examples with the largest training loss
Fixing mislabeled examples
• Experiment:
• spam email data; randomly flip 10% of the labels
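A sketch of the influence-based alternative on synthetic data (the dataset, model, and hyperparameters here are all invented): rank training points by self-influence, the quadratic form gᵀH⁻¹g of each point's loss gradient, and inspect the top of the list first.

```python
import numpy as np

rng = np.random.default_rng(3)
n, p = 200, 2
X = rng.normal(size=(n, p))
y = (X @ np.array([3.0, -3.0]) > 0).astype(float)
flipped = rng.choice(n, size=20, replace=False)
y[flipped] = 1 - y[flipped]              # simulate 10% label noise

sigmoid = lambda t: 1.0 / (1.0 + np.exp(-t))

# fit an L2-regularized logistic regression by gradient descent
lam, w = 0.1, np.zeros(p)
for _ in range(2000):
    w -= 0.5 * (X.T @ (sigmoid(X @ w) - y) / n + lam * w)

probs = sigmoid(X @ w)
H = (X.T * (probs * (1 - probs))) @ X / n + lam * np.eye(p)
grads = (probs - y)[:, None] * X         # per-example loss gradients

# self-influence g^T H^{-1} g: large values flag the points the model
# "fights" hardest to fit -- good candidates for label checking
self_infl = np.einsum('ij,ij->i', grads, np.linalg.solve(H, grads.T).T)
ranked = np.argsort(-self_infl)          # inspect highest self-influence first
```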
Adversarial training examples
• Prior work generates adversarial test images that are visually indistinguishable from real ones yet fool a classifier.
• We demonstrate that we can instead craft adversarial training images that flip a model's test predictions.
• The idea is to iteratively perturb training images in the direction given by the influence function.
Adversarial training examples
• Same data as fish vs. dog
• The original model correctly classified 591/600 test images.
• For each test image, attack a single training image with 100 iterations.
• 335 (57%) of those test predictions were flipped.
• An attack on one training image can also influence multiple test images.
Thank you
code: http://bit.ly/gt-influence
