Interpretation and Explanation Methods For DNN Models
Presented by:
• Subhashis Hazarika, The Ohio State University
CSE 6559: Seminar Presentation
Topics Covered
• Activation Maximization
• Sensitivity Analysis
• Simple Taylor Decomposition
• Layer-wise Relevance Propagation
• Guided Backpropagation
• LRP and Deep Taylor Decomposition
• Explanation Evaluation Techniques
• Applications
• Concept Activation Vectors (CAV)
• Testing with Concept Activation Vectors (TCAV)
• TCAV Applications
Methods for Interpreting and Understanding
Deep Neural Networks
Authors:
• Grégoire Montavon, TU Berlin
• Wojciech Samek, Fraunhofer Heinrich Hertz Institute, Berlin
• Klaus-Robert Müller, TU Berlin
Presented by:
• Subhashis Hazarika, The Ohio State University
CSE 6559: Seminar Presentation
Outline
• Motivation
• Interpretation vs Explanation
• Interpretation
• Activation Maximization
• Explanation
• Gradient vs Decomposition
• Relevance Propagation (deep Taylor decomposition)
• Evaluating Explanations
• Applications
Motivation
1. Verify that the classifier works as expected
2. Improve the classifier
3. Learn from the learning machine
4. Interpretability in the sciences
5. Compliance with legislation
Understanding Deep Nets: Two Views
• Model Analysis → Interpretation
• Decision Analysis → Explanation
Overview of Techniques
Interpreting a DNN Model
• Build a prototype in the input domain of the DNN (e.g., an image or a piece of text) that is
interpretable and representative of the abstract learned concept
• Activation Maximization (AM): an analysis framework that searches for an input
pattern that produces a maximum model response for a quantity of interest (see the sketch below)
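A minimal sketch of activation maximization by plain gradient ascent, assuming a pretrained PyTorch classifier `model` that maps images to class logits (the input shape, step count, and ℓ2 penalty weight are illustrative choices, not the paper's):

```python
import torch

def activation_maximization(model, class_idx, shape=(1, 3, 224, 224),
                            steps=200, lr=0.1, l2_weight=1e-3):
    """Gradient ascent on the logit of `class_idx`; the l2 penalty keeps
    the prototype from drifting toward extreme pixel values."""
    model.eval()
    x = torch.zeros(shape, requires_grad=True)      # start from a neutral input
    opt = torch.optim.Adam([x], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        logit = model(x)[0, class_idx]
        loss = -logit + l2_weight * x.pow(2).sum()  # minimizing -logit maximizes the logit
        loss.backward()
        opt.step()
    return x.detach()
```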
Improving AM
• Simple AM: maxₓ log p(ω_c | x) − λ‖x‖²
• Improving AM with an expert: maxₓ log p(ω_c | x) + log p(x)
• Perform AM in code space: max_z log p(ω_c | g(z)) + λ‖z‖²
• The latter two techniques require an unsupervised model of the data, either a density model p(x)
or a generator model g(z).
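A hedged sketch of the code-space variant: the search runs over a latent code z, so the decoded prototype g(z) stays on the data manifold. `model` and `generator` are placeholder names for a pretrained classifier and a pretrained generator:

```python
import torch

def activation_maximization_code_space(model, generator, class_idx,
                                       z_dim=128, steps=200, lr=0.05, lam=1e-3):
    """Optimize a latent code z so that the decoded image g(z)
    maximizes the logit of `class_idx`."""
    model.eval(); generator.eval()
    z = torch.randn(1, z_dim, requires_grad=True)
    opt = torch.optim.Adam([z], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        x = generator(z)                       # decode the current prototype
        loss = -model(x)[0, class_idx] + lam * z.pow(2).sum()
        loss.backward()
        opt.step()
    with torch.no_grad():
        return generator(z)
```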
Comparison of AM variants
Enhanced AM on Natural Images
Limitation of Global Interpretations
From Global Interpretations to Individual Explanations
Explainability
Explaining DNN decisions
• “why” does the model arrive at a certain prediction?
• “why” is the image below classified as a shark?
Explainability: Basic Techniques
• Sensitivity Analysis (Gradient Approach)
• Simple Taylor Decomposition (Function-decomposition Approach)
• Backward Propagation Techniques
• Layer-wise Relevance Propagation (decomposition)
• Guided Backpropagation (gradient)
Sensitivity Analysis
• Relevance is the squared local sensitivity: Rᵢ = (∂f/∂xᵢ)²
What does Sensitivity Analysis decompose?
• The relevance scores sum to the squared gradient norm, Σᵢ Rᵢ = ‖∇f(x)‖²
• i.e., sensitivity analysis explains a variation of the function, not the function value f(x) itself
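A minimal PyTorch sketch of a sensitivity heatmap, assuming `model` maps a batch of inputs to class logits:

```python
import torch

def sensitivity_map(model, x, class_idx):
    """R_i = (df/dx_i)^2: explains what makes f vary, not f itself."""
    x = x.clone().requires_grad_(True)
    model(x)[0, class_idx].backward()          # populate x.grad with df/dx
    return x.grad.pow(2)                       # entries sum to ||grad f(x)||^2
```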
Decomposing the Correct Quantity
• We want to decompose the prediction itself: 𝒇(𝒙) = Σᵢ Rᵢ
• Taylor series at a root point x̃ (a point where f(x̃) = 0):
  f(x) = f(x̃) + Σᵢ [∂f/∂xᵢ]|ₓ₌ₓ̃ · (xᵢ − x̃ᵢ) + higher-order terms
Simple Taylor Decomposition
• A closed-form root point is achievable for linear models and for deep ReLU networks
without biases, which are positively homogeneous: f(tx) = t·f(x), hence f(0) = 0
• We can therefore choose the root point x̃ = lim_{t→0} t·x, i.e., the origin approached
from the direction of x
• Final relevance: Rᵢ = [∂f/∂xᵢ]|ₓ̃₌₀ · xᵢ
• i.e., gradient × input
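This reduces to a one-liner in practice; a sketch assuming a bias-free ReLU classifier `model` in PyTorch:

```python
import torch

def gradient_times_input(model, x, class_idx):
    """Simple Taylor relevance for bias-free ReLU nets: R_i = (df/dx_i) * x_i."""
    x = x.clone().requires_grad_(True)
    model(x)[0, class_idx].backward()          # df/dx at the data point x
    return (x.grad * x).detach()               # for such nets, relevances sum to f(x)
```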
Why doesn't Simple Taylor work?
1. The root point is hard to find, or lies too far from x (losing the context of x)
2. Gradient shattering → the gradient of deep nets has low informative value [Balduzzi '17]
Backward Propagation Techniques
• Make explicit use of the graph structure of the network
• Layer-wise Relevance Propagation (conserving)
• Guided Backpropagation (non-conserving)
• Backward propagation techniques were shown empirically to scale better to complex DNN models
• They facilitate filtering and break the explanation into multiple simpler subtasks
Layer-wise Relevance Propagation
• Each neuron receives a share of the network output, and redistributes it to its
predecessors in equal amount, until the input variables are reached
Conservation property: Σᵢ Rᵢ = … = Σⱼ Rⱼ = Σₖ Rₖ = … = f(x)
LRP-α₁β₀ rule: Rⱼ = Σₖ ( aⱼwⱼₖ⁺ / Σⱼ aⱼwⱼₖ⁺ ) · Rₖ
LRP propagation rules
• LRP-αβ rule [Landecker '13, Bach '15, Zhang '16, Montavon '17]
• General rule, with contributions zⱼₖ = aⱼwⱼₖ and the constraint α − β = 1:
  Rⱼ = Σₖ ( α · zⱼₖ⁺ / Σⱼ zⱼₖ⁺ − β · zⱼₖ⁻ / Σⱼ zⱼₖ⁻ ) · Rₖ
[Figure: example heatmaps for different (α, β) choices; colorbars range from −1 to 1]
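A hedged numpy sketch of the αβ rule for a single linear(+ReLU) layer; `a` holds the lower-layer activations and `R_out` the relevances arriving from the layer above:

```python
import numpy as np

def lrp_alphabeta(a, W, R_out, alpha=1.0, beta=0.0, eps=1e-9):
    """LRP alpha-beta rule for one layer. a: (J,) activations, W: (J, K)
    weights, R_out: (K,) upper-layer relevances. alpha - beta must equal 1
    so that R_in.sum() == R_out.sum() (conservation)."""
    z = a[:, None] * W                          # contributions z_jk = a_j * w_jk
    zp = np.where(z > 0, z, 0.0)                # positive contributions
    zn = np.where(z < 0, z, 0.0)                # negative contributions
    frac_p = zp / (zp.sum(axis=0, keepdims=True) + eps)
    frac_n = zn / (zn.sum(axis=0, keepdims=True) - eps)
    return (alpha * frac_p - beta * frac_n) @ R_out

# alpha=1, beta=0 recovers the LRP-alpha1beta0 (z+) rule from the previous slide.
```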
LRP vs Guided Backpropagation
LRP and Deep Taylor Decomposition
• LRP-α₁β₀ rule → deep Taylor decomposition [Montavon '17]
• Assumption: a neuron's relevance can be modeled as the product Rₖ = aₖ·cₖ of its
activation aₖ and an approximately constant, positive term cₖ
1. Build the relevance neuron
2. Expand the relevance neuron (Taylor expansion at a root point)
3. Decompose the relevance over the input neurons
4. Pool relevance over all outgoing neurons
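Putting the four steps together: a self-contained sketch that propagates relevance through a small bias-free ReLU MLP with the z⁺ (α₁β₀ / deep Taylor) rule. Input-layer subtleties (e.g., the z^B rule for pixel domains) are deliberately ignored here:

```python
import numpy as np

def deep_taylor_explain(weights, x):
    """weights: list of (in, out) arrays of a bias-free ReLU MLP with a
    scalar output; x: (D,) input. Returns per-input relevance scores."""
    acts = [x]
    for W in weights:                           # forward pass, keep activations
        acts.append(np.maximum(0.0, acts[-1] @ W))
    R = acts[-1]                                # relevance starts at the output
    for W, a in zip(reversed(weights), reversed(acts[:-1])):
        z = a[:, None] * np.clip(W, 0, None)    # positive contributions a_j * w_jk+
        z = z / (z.sum(axis=0, keepdims=True) + 1e-9)
        R = z @ R                               # pool over all outgoing neurons k
    return R                                    # approximately conserves: R.sum() ~ f(x)
```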
Evaluating Explanation Quality
• Explanation Continuity
• Explanation Selectivity
Explanation Continuity
• "If two data points are nearly equivalent, then the explanations of their predictions
should also be nearly equivalent."
• Explanation continuity (or the lack of it) can be quantified by looking for the strongest
variation of the explanation R(x) in the input domain:
  max_{x ≠ x′} ‖R(x) − R(x′)‖₁ / ‖x − x′‖₂
• Illustration: for a simple 2-layer ReLU network, gradient-based explanations can jump
abruptly at the ReLU kinks even where the function itself varies smoothly
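A crude empirical probe of this quantity, assuming `explain_fn` maps a flat numpy input to a relevance vector of the same shape (the probe count and perturbation radius are arbitrary choices):

```python
import numpy as np

def continuity_score(explain_fn, x, n_probes=100, radius=0.01, seed=0):
    """Empirical proxy for max ||R(x) - R(x')||_1 / ||x - x'||_2 over nearby x'."""
    rng = np.random.default_rng(seed)
    R_x, worst = explain_fn(x), 0.0
    for _ in range(n_probes):
        x_p = x + radius * rng.standard_normal(x.shape)
        ratio = np.abs(explain_fn(x_p) - R_x).sum() / np.linalg.norm(x_p - x)
        worst = max(worst, ratio)
    return worst                                # large value = discontinuous explanation
```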
Explanation Selectivity
• Can be quantified by measuring how fast f(x) goes down when removing the features with
the highest relevance scores
• "Pixel-flipping" test for image data: repeatedly remove the most relevant pixel and
track the decline of f(x); the steeper the decline, the more selective the explanation
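A minimal pixel-flipping sketch over a flattened input; `f` is any callable scoring function, and `fill_value` stands in for whatever feature-removal scheme is chosen:

```python
import numpy as np

def pixel_flipping_curve(f, x, R, n_steps=100, fill_value=0.0):
    """Remove pixels in decreasing order of relevance R, recording f's decline."""
    order = np.argsort(-R)                      # most relevant pixels first
    x_cur, scores = x.copy(), [f(x)]
    for i in order[:n_steps]:
        x_cur[i] = fill_value                   # "flip" (remove) one pixel
        scores.append(f(x_cur))
    return np.array(scores)                     # steep drop = selective explanation
```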
Applications
• Model Validation Procedure
• Analysis of scientific data
Model Validation
• Traditional validation → use a validation set (a subset of the initial training data)
• Human-interpretable results can add to basic validation procedures
Analysis of Scientific Data
• The concepts of interpretability and explainability can be extended to domain-specific
knowledge, useful for scientific inference
• Example domains: atomistic simulations, human brain patterns, DNA sequences, and
human face analysis
Summary so far
• ML model transparency:
• Interpreting the concepts learned by a model by building prototypes
• Explaining the model’s decisions by identifying the relevant input variables
• Crucial distinction between sensitivity analysis and decomposition approaches
• Evaluating Explanations
• Continuity
• Selectivity
• Applications
Interpretability Beyond Feature Attribution: Quantitative
Testing with Concept Activation Vectors (TCAV)
Authors:
• Been Kim, Martin Wattenberg, Justin Gilmer, Carrie Cai, James Wexler, Fernanda Viégas, and Rory Sayres
• Affiliation: Google Brain
CSE 6559: Seminar Presentation
Presented by:
• Subhashis Hazarika, The Ohio State University
Goal of TCAV
• Go beyond per-feature attribution: quantify how important a human-friendly, user-defined
concept (e.g., "striped") is to a trained model's prediction for a class (e.g., "zebra")
Overview
Testing with CAV
• A Concept Activation Vector (CAV) is the normal to a hyperplane separating the activations
of a concept's examples from those of random examples at a chosen layer
• TCAV score: the fraction of a class's inputs whose directional derivative along the CAV
is positive (see the sketch below)
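A hedged sketch of both steps, assuming the layer activations and the gradients of the class logit with respect to those activations are precomputed as numpy arrays; sklearn's LogisticRegression stands in for the paper's linear classifier:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def compute_cav(acts_concept, acts_random):
    """Fit a linear classifier (concept vs. random) in activation space;
    the CAV is the unit normal of its decision hyperplane."""
    X = np.vstack([acts_concept, acts_random])
    y = np.r_[np.ones(len(acts_concept)), np.zeros(len(acts_random))]
    v = LogisticRegression(max_iter=1000).fit(X, y).coef_.ravel()
    return v / np.linalg.norm(v)

def tcav_score(grads_class, cav):
    """Fraction of class examples whose logit increases along the CAV,
    i.e. whose directional derivative (grad . v) is positive."""
    return float(np.mean(grads_class @ cav > 0))
```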
Other Analysis
• TCAV extension: Relative TCAV (compare two related concepts against each other rather
than against random examples)
• Validating the learned CAVs: e.g., sorting a concept's images by similarity to the CAV,
and testing statistical significance against CAVs trained on random data
TCAV for a Medical Application
• Diabetic Retinopathy (DR), graded on levels 0–4
• Validating the trained model against diagnostic concepts known to doctors
Thank you!