The neglected importance
of complexity
in statistics and Metascience

Daniele Fanelli
In this talk:
1) what’s missing in the current paradigm
2) what a new paradigm might look like
3) evidence in support of this proposal
The elephant in the room of:
1) metascience
● e.g. reproducibility
2) statistics
● e.g. model “complexity”
● [see SW seminar 2021, Fanelli 2019, 2022]
What is “complex”?
Level of complexity
Many, diverse, interacting parts.
Long to describe, difficult to predict.
example 1) complexity deflates the
“reproducibility crisis”.
year  project         discipline                         N                             result
2014  Many Labs 1     psychology, misc.                  13, 36 labs                   77%
2016  COS             social+cognitive psychology        100                           36-68%
2016  Camerer et al.  experimental economics             18                            61-78%
2018  Many Labs 2     social+cognitive psychology        28, 62 samples, 36 countries  54%
2018  Camerer et al.  social studies in Nature, Science  21                            57-67%
2021  RPCB            cancer biology                     188, 50 exper., 23 papers     3-82%
Lower reproducibility:
1) complex phenomena
2) complex methods
● not just random noise
● structured, systematic differences
example 2) complexity confuses
statistical results
$\mathrm{AIC} = -2\log(L) + 2k$
models vary by “fitting propensity”
(“complexity” beyond n. of parameters)
(Bonifay and Cai 2017)
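The point about AIC can be made concrete with a small sketch. The function below simply evaluates the AIC formula from the slide; the two log-likelihood values are hypothetical numbers chosen for illustration. It shows that two models with the same number of parameters k receive the same penalty, whatever their fitting propensity.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: -2 * log-likelihood + 2 * k."""
    return -2.0 * log_likelihood + 2 * k


def complexity_penalty(k):
    """The only complexity term in AIC: it depends on k alone."""
    return 2 * k


# Hypothetical log-likelihoods for two models with the same k.
aic_model_a = aic(-100.0, 3)
aic_model_b = aic(-95.0, 3)
same_penalty = complexity_penalty(3) == complexity_penalty(3)
```

Any difference between the two AIC values here comes from the likelihood term alone, which is why a model that fits almost any data well is not penalized for it.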
universe of possible data
When is a theory actually supported?
How might the elephant appear?
1) integrate complexity of phenomena & methods in measuring, forecasting,
correcting reproducibility
2) penalize statistical models for complexity beyond number of parameters
integrating “complexity” ~ severity of testing
https://www.wallpaperflare.com/elephant-during-daytime-grayscale-photo-of-elephant-portrait-wallpaper-zhdat
Part 2: A candidate alternative
K theory in a nutshell
(Fanelli 2019, Fanelli 2022)
$K = \frac{H(Y) - H(Y|X,\tau)}{H(Y) + \frac{n_X}{n_Y} H(X) + D(\tau)}$
K = consilience
explain/predict/control more/diverse phenomena with fewer/simpler theories/methods
what the variables represent
more information Y makes K GROW
(Fanelli 2019, Fanelli 2022)
$K = \frac{[H(Y_1) + H(Y_2) + H(Y_3)] - H(Y|X,\tau)}{[H(Y_1) + H(Y_2) + H(Y_3)] + \frac{n_X}{n_Y} H(X) + D(\tau)}$
more info. Y|X, X, τ makes K small
(Fanelli 2019, Fanelli 2022)
$K = \frac{H(Y) - [H(Y_1|X,\tau) + H(Y_2|X,\tau) + H(Y_3|X,\tau)]}{H(Y) + \frac{n_X}{n_Y} H(X) + [D(\tau_1) + D(\tau_2) + D(\tau_3)]}$
K theory in a nutshell
(Fanelli 2019, Fanelli 2022)
K ≈ 1: full consilience
K = 0: no knowledge
K < 0: wrong
K of a regression model
(Fanelli 2019, Fanelli 2022)
Y = α + β X + error
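A minimal sketch of how K might be computed for discrete data, under stated assumptions: the denominator is read off the slide as H(Y) + (n_X/n_Y) H(X) + D(τ), the conditional entropy H(Y|X,τ) is estimated empirically as H(X,Y) − H(X), and D(τ) is passed in as a number of bits. The function names and the toy data are hypothetical and are not taken from Fanelli 2019 or 2022.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy, in bits, of a sequence of discrete values."""
    n = len(values)
    counts = Counter(values)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def conditional_entropy(y, x):
    """Empirical H(Y|X) in bits, computed as H(X,Y) - H(X)."""
    return entropy(list(zip(x, y))) - entropy(x)


def k_value(y, x, d_tau, n_x=1, n_y=1):
    """K under the assumed form (H(Y) - H(Y|X)) / (H(Y) + (n_x/n_y) H(X) + D(tau))."""
    h_y = entropy(y)
    h_x = entropy(x)
    gain = h_y - conditional_entropy(y, x)
    return gain / (h_y + (n_x / n_y) * h_x + d_tau)


# Toy data: X determines Y completely in the first case, not at all in the second.
k_dependent = k_value(y=[0, 0, 1, 1], x=[0, 0, 1, 1], d_tau=2.0)
k_independent = k_value(y=[0, 0, 1, 1], x=[0, 1, 0, 1], d_tau=2.0)
```

In the first case one bit of Y is explained at a cost of four bits in the denominator, so K is 0.25; in the second X carries no information about Y and K is 0. Raising `n_y` while keeping `n_x` fixed shrinks the weight of H(X) and pushes K up.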
this has been said before
key theoretical innovations
key methodological innovations
(Fanelli 2019, Fanelli 2022)
graphs are everywhere in science
methodologies
theories
www.protocols.io/view/an-optimized-protocol-for-in-vivo-analysis-of-tumo-3byl471m2lo5/v1
(Mueller 2015, ICSS)
https://www.wallpaperflare.com/elephant-in-black-and-white-elephant-photo-animal-grey-wallpaper-tkqgk
Part 3: supporting evidence
1) K predicts perceived
and actual reproducibility
“tau” of biological experiments
(Fanelli, Tan, Amaral & Neves, 2022, MetaArxiv)
K vs. perceived reproducibility
(Fanelli, Tan, Amaral & Neves, 2022, MetaArxiv)
K vs. actual reproducibility
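$K_r = K_o \, 2^{-\lambda \cdot d}$
$k_r h_r = k_o h_o \, 2^{-\lambda \cdot d}$
$\log_2 \frac{k_r}{k_o} = \log_2 \frac{h_o}{h_r} - \lambda \cdot d$
$R \equiv \log_2 \frac{H(Y) - H(Y|X,\tau_r)}{H(Y) - H(Y|X,\tau_o)} = \alpha + \beta \log_2 \frac{1}{D(\tau_r)/N}$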
(Fanelli, Tan, Amaral & Neves, in prep)
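The decay relation between the original and the replicated K can be checked numerically. This is only a sketch: it reads λ as a decay rate and d as a distance between the original study and its replication, which the slide does not spell out, and the input values are hypothetical.

```python
import math


def replicated_k(k_original, decay_rate, distance):
    """K of the replication under the assumed decay K_r = K_o * 2**(-lambda * d)."""
    return k_original * 2.0 ** (-decay_rate * distance)


def log2_ratio(k_replication, k_original):
    """log2(K_r / K_o), which equals -lambda * d under the same assumption."""
    return math.log2(k_replication / k_original)


# Hypothetical values: K_o = 0.5, lambda = 1, d = 1.
k_r = replicated_k(0.5, 1.0, 1.0)
log_ratio = log2_ratio(k_r, 0.5)
```

With these inputs K halves, from 0.5 to 0.25, and the base-2 log ratio is −1, that is −λ·d.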
1) K predicts actual reproducibility
(part of collaboration with Brazilian Reproducibility Initiative)
(Fanelli, Tan, Amaral & Neves, in prep)
independent new predictor
multiple regression, Y=reproducibility
(Fanelli, Tan, Amaral & Neves, in prep)
multiple regression, Y=reproducibility
(here D(τ) based on sentences in replication protocol!)
very easy to measure, automatize
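As a rough illustration of how easy such a measure is to automate, the sketch below counts sentences in a protocol text. The splitting rule (whitespace after ".", "!" or "?") and the example protocol are assumptions made for illustration, not the procedure used in the study.

```python
import re


def count_sentences(text):
    """Count sentences by splitting on whitespace that follows '.', '!' or '?'."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return sum(1 for part in parts if part)


# Hypothetical protocol text, used only to exercise the function.
protocol = "Thaw the cells. Plate at low density. Incubate overnight."
n_sentences = count_sentences(protocol)
```

A rule this simple miscounts abbreviations such as "e.g." and decimal-free numbered steps, so a real pipeline would need a more careful sentence splitter.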
better than just P-values and N
(Fanelli, Tan, Amaral & Neves, in prep)
multiple regression, Y=reproducibility R
leads to progress in metascience
(Fanelli, Tan, Amaral & Neves, in prep)
multiple regression, Y=reproducibility
2) D(τ) might
predict fitting propensity
preregistered test
(Fanelli & Bonifay, in prep)
1) generated N=20 models, all with 36 parameters
2) derived D(τ), predicted their fitting propensity
3) tested them on 20,000 random covariance matrices
(Fanelli & Bonifay, in prep)
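Fitting propensity can be illustrated with a toy simulation. This is not the preregistered test described above, which used 36-parameter models and 20,000 random covariance matrices. It is a sketch that compares two hypothetical two-parameter models, y = a + b·x and y = a·sin(b·x), on pure-noise data and records how much of the sum of squares each one absorbs on average.

```python
import math
import random


def linear_share(x, y):
    """Share of sum(y^2) explained by the least-squares fit of y = a + b*x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    return 1.0 - sse / sum(yi ** 2 for yi in y)


def sine_share(basis, y):
    """Best share of sum(y^2) explained by y = a*sin(b*x) over a grid of b."""
    total = sum(yi ** 2 for yi in y)
    best = 0.0
    for s, s_norm in basis:
        dot = sum(si * yi for si, yi in zip(s, y))
        best = max(best, dot * dot / s_norm)  # optimal a is dot / s_norm
    return best / total


def compare_fitting_propensity(n_datasets=300, n_points=10, seed=0):
    """Average explained share of each model across random-noise datasets."""
    rng = random.Random(seed)
    x = [i / (n_points - 1) for i in range(n_points)]
    freqs = [0.5 + 0.25 * j for j in range(200)]  # grid for b
    basis = []
    for b in freqs:
        s = [math.sin(b * xi) for xi in x]
        basis.append((s, sum(si ** 2 for si in s)))
    lin_total = 0.0
    sine_total = 0.0
    for _ in range(n_datasets):
        y = [rng.gauss(0.0, 1.0) for _ in range(n_points)]
        lin_total += linear_share(x, y)
        sine_total += sine_share(basis, y)
    return lin_total / n_datasets, sine_total / n_datasets


mean_linear, mean_sine = compare_fitting_propensity()
```

Both models have two free parameters, yet the sine model absorbs a larger share of random data on average, which is the sense in which complexity goes beyond the number of parameters.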
D(τ) uniquely reflects model complexity
(pre-registered test)
(Fanelli & Bonifay, in progress)
(Fanelli & Bonifay, in progress)
NO alternative theory explains this!
(pre-registered test)
multiple regression: Y=fitting propensity
Spearman’s ρ = 0.64, P<0.002
Spearman’s ρ = 0.73, P<0.001
(Bonifay and Cai 2017)
Bifactor: confirmatory (theoretical), 20 parameters
EIFA: exploratory (a-theoretical), 20 parameters
(Fanelli & Bonifay, in progress)
how K improves theory testing
example of application
theories invoking a general factor (e.g. IQ, stress, psychosis)
Bifactor widely used to “test” theories
(Bonifay and Cai 2017)
Bifactor: confirmatory (theoretical), 20 parameters
EIFA: exploratory (a-theoretical), 20 parameters
K(Bifactor) ≥ K(EIFA)
“The [bifactor-encoded] theory is specifically supported by the data”
(Fanelli & Bonifay, in progress)
P<0.05
when is a theory actually supported?
example of application
Summary of this talk:
1) what’s missing in the current paradigm?
● we pretend complexity is irrelevant
2) what might a new paradigm look like?
● measuring D(τ), integrating/penalizing with K
3) evidence in support of this proposal?
● increasingly promising
● any alternative, better suggestions?

