WeightWatcher:
A Diagnostic Tool for Deep Neural Networks
Charles H. Martin PhD, Calculation Consulting
pip install weightwatcher
Martin WeightWatcher March 2021 1 / 9
Open source tool: weightwatcher
https://github.com/CalculatedContent/WeightWatcher
WeightWatcher (WW) is an open-source diagnostic tool for analyzing
Deep Neural Networks (DNNs) without needing access to training or even
test data. It can be used to:
• analyze pretrained PyTorch and Keras models
• inspect models that are difficult to train
• gauge improvements in model performance
• predict test accuracies across different models
• detect potential problems when compressing or fine-tuning pretrained models
It is based on theoretical research (done jointly with UC Berkeley) into
Why Deep Learning Works, using ideas from Random Matrix Theory
(RMT), Statistical Mechanics, and Strongly Correlated Systems.
Shape and Scale Metrics
WeightWatcher (WW) analyzes the shape and scale of the correlations
in the layer weight matrices.
WW extracts, plots, and fits the Empirical Spectral Density (ESD, the
distribution of eigenvalues) for each layer weight matrix (or tensor slice).
The tail of the ESD contains the most informative components.
The shape of the tail carries useful information!
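As a rough illustration of what an ESD is, the sketch below computes the eigenvalues of the correlation matrix X = W^T W / N for a single weight matrix using NumPy. It is not WeightWatcher's implementation: the function name, the 1/N normalization, and the random stand-in matrix are assumptions made for the example.

```python
import numpy as np

def empirical_spectral_density(W):
    """Eigenvalues of X = W^T W / N, sorted ascending (N = larger dimension of W)."""
    N = max(W.shape)  # assumed normalization, chosen for this sketch
    # The squared singular values of W are the eigenvalues of W^T W.
    singular_values = np.linalg.svd(W, compute_uv=False)
    return np.sort(singular_values ** 2 / N)

# Hypothetical stand-in for one layer's weight matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(300, 100))

evals = empirical_spectral_density(W)
print(evals[-5:])  # the largest eigenvalues form the tail of the ESD
```

A histogram of `evals` gives the ESD for this layer; for a real model the same computation would be repeated per layer (or per tensor slice).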
WeightWatcher: Usage
import weightwatcher as ww

# model: an already-loaded PyTorch or Keras model
watcher = ww.WeightWatcher(model=model)
details = watcher.analyze(plot=True)    # per-layer metrics
summary = watcher.get_summary(details)  # metrics averaged over layers
Layer-by-Layer Analysis
WW layer metrics can detect potential problems in the ESD shapes.
Poorly trained models (orange) can have unusually large layer α's.
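To make the layer α concrete: it is the exponent of a power law fitted to the tail of a layer's ESD. The sketch below uses the standard maximum-likelihood (Hill-type) estimator for a continuous power law with a fixed lower cutoff. This is a swapped-in textbook estimator, not necessarily the fitting procedure WeightWatcher uses; the function name, the fixed `xmin`, and the synthetic data are assumptions.

```python
import numpy as np

def hill_alpha(evals, xmin):
    """MLE of alpha for a density p(x) ~ x^(-alpha) on x >= xmin."""
    tail = np.asarray(evals, dtype=float)
    tail = tail[tail >= xmin]  # keep only the tail above the cutoff
    return 1.0 + tail.size / np.sum(np.log(tail / xmin))

# Synthetic "eigenvalues" drawn from a power law with a known exponent,
# via inverse-transform sampling; purely illustrative data.
rng = np.random.default_rng(0)
true_alpha, xmin = 3.0, 1.0
samples = xmin * (1.0 - rng.random(20000)) ** (-1.0 / (true_alpha - 1.0))

alpha_est = hill_alpha(samples, xmin)
print(alpha_est)  # should land close to true_alpha
```

Applied per layer, an estimate like this gives one α per weight matrix, and layers whose α is much larger than the rest are the ones to inspect.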
Layer-by-Layer Analysis
WW layer metrics can detect potential problems in the ESD scales.
Compressed models (red) can show unexpected scale changes
(example from the Intel Distiller Group Regularization technique).
α: a regularization metric
The WW ⟨α⟩ metric predicts test accuracy for a given model (i.e., same
depth) when varying the regularization hyperparameters (such as batch
size, weight decay, momentum, etc.), without access to the test or
training data.
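Here ⟨α⟩ denotes the average of the per-layer α values. A minimal sketch of how two training runs of the same architecture might be compared is shown below; the run names and α values are invented for illustration and do not come from any real model.

```python
import numpy as np

def average_alpha(layer_alphas):
    """Average of the per-layer power-law exponents."""
    return float(np.mean(layer_alphas))

# Hypothetical per-layer alphas for two runs that differ only in a
# regularization hyperparameter (names and numbers are made up).
runs = {
    "weight_decay_1e-4": [2.5, 3.0, 3.5],
    "weight_decay_1e-2": [4.0, 5.0, 6.0],
}

avg = {name: average_alpha(alphas) for name, alphas in runs.items()}
ranking = sorted(avg, key=avg.get)  # runs ordered by increasing average alpha
print(avg, ranking)
```

The authors' related papers report that smaller ⟨α⟩ tends to accompany better test accuracy; that direction is context from those papers, not something stated on this slide.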
α̂: a multi-purpose metric
The WW α̂ metric predicts test accuracy for models in the same
architecture series across varying depth, other architecture parameters,
and regularization hyperparameters, without access to the test or
training data.
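The slide gives no formula for α̂. In the authors' related papers it appears as a weighted average of the layer exponents, with each α weighted by the log of that layer's largest eigenvalue, which combines the shape (α) and scale (λ_max) information. The sketch below assumes that form, α̂ = (1/L) Σ α_l · log10(λ_max,l); the function name and input numbers are hypothetical.

```python
import numpy as np

def alpha_hat(alphas, lambda_maxes):
    """Assumed form: mean over layers of alpha_l * log10(lambda_max_l)."""
    alphas = np.asarray(alphas, dtype=float)
    lambda_maxes = np.asarray(lambda_maxes, dtype=float)
    return float(np.mean(alphas * np.log10(lambda_maxes)))

# Made-up per-layer values for a two-layer example.
layer_alphas = [2.0, 3.0]
layer_lambda_max = [10.0, 100.0]

score = alpha_hat(layer_alphas, layer_lambda_max)
print(score)
```

Because the scale term enters through λ_max, two models with identical layer α's can still receive different α̂ values, which is what lets the metric compare models of different depth.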
