PRIOR ON MODEL SPACE
What makes a model simple?
MEIR MAOR
Chief Architect @ SparkBeyond
Outline
Why are simple models desirable
Traditional approaches for simplicity
Alternative approaches
Why Simple models?
PAC model (Leslie Valiant, 1984)
No Free Lunch
Better generalization
In reality:
Transfer learning and non-stationary distributions
Robust against correlated samples, etc.
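The link between the PAC model and simplicity can be sketched with the standard sample-complexity bound for a finite hypothesis class H and a learner that returns a hypothesis consistent with the training data: m >= (1/epsilon) * (ln|H| + ln(1/delta)). The function name and the two class sizes below are illustrative assumptions, not taken from the talk.

```python
import math


def pac_sample_bound(hypothesis_count: int, epsilon: float, delta: float) -> int:
    """Number of samples sufficient for a consistent learner over a finite
    hypothesis class to reach error <= epsilon with probability >= 1 - delta."""
    needed = (math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon
    return math.ceil(needed)


# Two hypothetical search spaces: a small one and a much larger one.
small = pac_sample_bound(2**10, epsilon=0.05, delta=0.05)
large = pac_sample_bound(2**100, epsilon=0.05, delta=0.05)
print(small, large)
```

The bound grows with ln|H|, so a smaller hypothesis space needs fewer samples for the same guarantee.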
Why Simple Models? Cont.
Understandable
Trustworthy
Explainable (also for Regulatory reasons)
Understandable models are ultimately more accurate
Traditional complexity control
Bias / Variance tradeoff → We must limit our search space
Shrink the hypothesis space:
limit boosting iterations
tree size
minimum samples per leaf
number of hidden nodes
impose sparsity constraint
...
Traditional complexity control cont.
Penalize “less favorable” models:
Lasso / ridge regularization
Bagging / bootstrap sampling
Dropout
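A minimal sketch of how the two penalties differ, for a single coefficient w with no intercept. It minimizes 0.5 * sum((y - w*x)^2) plus either 0.5 * penalty * w^2 (ridge) or penalty * |w| (lasso), both of which have closed forms in one dimension. The function names and the toy data are assumptions made for illustration.

```python
def soft_threshold(value: float, threshold: float) -> float:
    """Shrink value toward zero by threshold, clipping at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def fit_one_coefficient(xs, ys, penalty: float, kind: str) -> float:
    """Closed-form penalized least squares for a single coefficient."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    if kind == "ridge":
        return sxy / (sxx + penalty)
    if kind == "lasso":
        return soft_threshold(sxy, penalty) / sxx
    raise ValueError(f"unknown penalty kind: {kind}")


# Toy data with a weak true slope of 0.1.
xs = [1.0, 2.0, 3.0]
ys = [0.1, 0.2, 0.3]
ridge_w = fit_one_coefficient(xs, ys, penalty=2.0, kind="ridge")
lasso_w = fit_one_coefficient(xs, ys, penalty=2.0, kind="lasso")
print(ridge_w, lasso_w)
```

Ridge shrinks the weak coefficient but keeps it nonzero, while lasso sets it exactly to zero, which is why lasso is associated with sparsity.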
Which is more likely?
Coefficients from two feed-forward ReLU NNs (single output, single hidden layer)
[Figure: coefficients of Network A and Network B]
Which feature is more likely?
Both have R² = 0.1
Math.ulp(x) - The positive distance between this floating-point value and the
double value next larger in magnitude
vs.
Math.log(x) - Natural logarithm
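For readers who have not met ulp before, here is a short sketch of what the two candidate features compute. It uses Python's `math.ulp` (available from Python 3.9) and `math.log` as stand-ins for the Java methods named on the slide; the wrapper names are assumptions.

```python
import math
import sys


def ulp_feature(x: float) -> float:
    """Gap between x and the next representable float larger in magnitude."""
    return math.ulp(x)


def log_feature(x: float) -> float:
    """Natural logarithm of x."""
    return math.log(x)


for x in (1.0, 1.5, 2.0, 1000.0):
    print(f"x={x:>7}  ulp={ulp_feature(x):.3e}  log={log_feature(x):.4f}")
```

Between powers of two the ulp value stays constant and then doubles, so it behaves like a staircase over the magnitude of x, whereas the logarithm varies smoothly.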
Which is a more likely feature? #2
The distance to the nearest railway station?
vs.
arctan(latitude * longitude)
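Both candidate features are easy to compute, which is the point: the difference lies in how plausible each one is, not in how hard it is to code. The sketch below uses the haversine great-circle formula for the first feature. The station coordinates are made up for illustration and the function names are assumptions.

```python
import math

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = math.sin(d_phi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_nearest_station_km(lat: float, lon: float, stations) -> float:
    """Feature 1: distance to the closest (lat, lon) pair in stations."""
    return min(haversine_km(lat, lon, s_lat, s_lon) for s_lat, s_lon in stations)


def arctan_feature(lat: float, lon: float) -> float:
    """Feature 2: arctan(latitude * longitude), with no obvious real-world meaning."""
    return math.atan(lat * lon)


# Hypothetical station coordinates, for illustration only.
stations = [(51.50, -0.12), (51.53, -0.10)]
print(distance_to_nearest_station_km(51.51, -0.11, stations))
print(arctan_feature(51.51, -0.11))
```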
Everyone is a domain expert!
We are experts in the world we live in.
Currently, humans have a much better prior than machines.
Many ideas repeat themselves across domains.
For example, you don’t have to be a rocket scientist to be familiar with second
derivatives.
Transfer Learning to the rescue
We can and must learn from previous problems
How can a child learn to identify a ring-tailed lemur from a single photo?
Becoming common
Pre-Trained Neural networks
Pre-Trained embeddings
A lot of work on Images and Text
Much less research on other data:
TimeNet - RNN for embedding time series data
Most real-life problems tend to have a more complicated shape
A different approach
Use already codified human knowledge
Explicitly look for patterns similar to things you have seen before
Extraordinary claims require extraordinary evidence
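One way to read "extraordinary claims require extraordinary evidence" is Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio. The sketch below applies the same evidence to a plausible and an implausible hypothesis. The prior values and the likelihood ratio are illustrative assumptions.

```python
def posterior_probability(prior: float, likelihood_ratio: float) -> float:
    """Update a prior probability with a likelihood ratio (Bayes factor)."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Same evidence (data 20 times more likely if the hypothesis is true),
# applied to a familiar-looking hypothesis and to a far-fetched one.
plausible = posterior_probability(prior=0.1, likelihood_ratio=20.0)
implausible = posterior_probability(prior=0.0001, likelihood_ratio=20.0)
print(plausible, implausible)
```

With identical evidence the plausible hypothesis ends up more likely than not, while the far-fetched one remains very unlikely and would need a much larger likelihood ratio to be believed.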
At SparkBeyond
Find the best hypotheses, using simple compositions of tried-and-true building blocks
A building block may require a lot of code to implement, yet it will be useful across domains
Incorporate pre-trained embeddings
Use external knowledge
Prioritize simple hypotheses
Always meta-learn how to learn
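The idea of searching simple compositions of building blocks while prioritizing simple hypotheses can be sketched as follows. This is a toy illustration under stated assumptions, not SparkBeyond's actual method: the four building blocks, the per-block penalty, and the correlation-based score are all hypothetical choices.

```python
import itertools
import math

# Hypothetical building blocks; real ones could be arbitrarily complex to implement.
BLOCKS = {
    "log": math.log,
    "sqrt": math.sqrt,
    "square": lambda v: v * v,
    "neg": lambda v: -v,
}
PENALTY_PER_BLOCK = 0.01  # assumed cost of each extra block in a composition


def pearson(a, b) -> float:
    """Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b))
    var_a = sum((p - mean_a) ** 2 for p in a)
    var_b = sum((q - mean_b) ** 2 for q in b)
    return cov / math.sqrt(var_a * var_b)


def apply_composition(names, x: float) -> float:
    """Apply the named blocks in order; the first name is the innermost function."""
    for name in names:
        x = BLOCKS[name](x)
    return x


def best_hypothesis(xs, target, max_blocks: int = 2):
    """Return the composition with the best fit minus complexity penalty."""
    scored = []
    for size in range(1, max_blocks + 1):
        for names in itertools.product(BLOCKS, repeat=size):
            try:
                values = [apply_composition(names, x) for x in xs]
            except ValueError:
                continue  # composition undefined on this data, e.g. log of a negative
            score = abs(pearson(values, target)) - PENALTY_PER_BLOCK * len(names)
            scored.append((score, names))
    return max(scored)[1]


xs = [float(i) for i in range(1, 21)]
target = [3.0 * math.log(x) + 1.0 for x in xs]
print(best_hypothesis(xs, target))
```

Several compositions fit this target equally well (for example the log of a square root), and the penalty is what makes the search return the single-block hypothesis.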
Shops near recreational parks are more successful
● Domain expert can review such a finding
● True phenomenon
● Insightful and actionable
Colorful gadgets tend to be cheaper
Becomes intuitive when you see concrete examples
External data: language models, news, social media, Wikipedia, dictionaries, maps
Simple = compressible = common
A simple model or feature is one which can be expressed briefly.
MDL (minimum description length): prefer the model that gives the shortest description of the data, i.e. the best compression.
Better compression leads to better model performance.
But should we be using a vanilla Turing machine for MDL?
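One possible answer to the question above is a description language whose primitives are cheaper the more often they have proved useful. In the sketch below each building block costs -log2(p) bits, where p is its relative frequency, so common blocks give short descriptions and a high prior. The usage counts are invented for illustration and the function names are assumptions.

```python
import math

# Hypothetical counts of how often each building block appeared in past useful models.
USAGE_COUNTS = {"log": 500, "sqrt": 300, "distance": 150, "arctan": 45, "ulp": 5}
TOTAL_USES = sum(USAGE_COUNTS.values())


def description_length_bits(blocks) -> float:
    """Code length of a hypothesis: sum of -log2(relative frequency) per block."""
    return sum(-math.log2(USAGE_COUNTS[name] / TOTAL_USES) for name in blocks)


def unnormalized_prior(blocks) -> float:
    """Prior weight proportional to 2 ** (-description length)."""
    return 2.0 ** (-description_length_bits(blocks))


for hypothesis in (["log"], ["ulp"], ["arctan", "distance"]):
    print(hypothesis, round(description_length_bits(hypothesis), 2), unnormalized_prior(hypothesis))
```

Under these assumed counts a hypothesis built from `log` is far cheaper to describe than one built from `ulp`, even though both are a single function call in code.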
Benefits of a better prior
Learn with less data
More robust to change
More robust to data issues
Understandable and explainable
Actionability without a complete model
Open questions and challenges
What is simple?
What makes an insight insightful?
What makes a feature likely to generalize?
Efficient search over insightful hypothesis space