DHARMa - Residual diagnostics for
hierarchical statistical models
Talk at ISEC 2018 @florianhartig, Uni Regensburg
Dharma wheel, Sun Temple, Konark, photo credit: Lisa Davis, via Wikimedia commons
Motivation – standard residuals for 2 Poisson regressions
[Figure: standard residual plots for two Poisson regressions, panels "Model 1" and "Model 2"]
Issues in interpreting residuals for GLMMs + beyond
§ GLMM distributions are typically asymmetric and change
their shape with the mean – this is not removed by
simply dividing by the expected sd (Pearson residuals)
§ Problems get worse for more complicated GLMMs and
hierarchical models, where the effective distribution of the
residuals arises from a mix of distributions / random
effects
§ Consequence: GLMMs(+) are in practice rarely checked,
although they can have all the same problems we teach for
LMs (e.g. misfit, heteroskedasticity, outliers, …)
Solution: simulation-based residual diagnostics
For any statistical model, we can simulate new data based on the
fitted model. Based on this, we can
§ Compare simulated to observed data, either
– Via summary statistics
– Per data point
§ Or refit (aka parametric bootstrap), and compare refitted to
observed residuals
Not a new idea, but the challenge is to make this user-friendly, and
to understand how to best calculate residuals / tests
– Various methods for simulated residual checks implemented in the
DHARMa package (Hartig, 2017, on CRAN)
– DHARMa = Diagnostics for HierArchical Regression Models, but also
broadly “natural order / law” in Eastern philosophies
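The "simulate new data from the fitted model" step can be sketched in a few lines. This is a plain-Python illustration of the idea, not DHARMa's R interface; the Poisson model, the `fitted_means` values, and the helper names `rpois` and `simulate_from_fit` are assumptions made for the example.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) count (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_from_fit(fitted_means, n_sim, rng):
    """Return n_sim simulated datasets, each with one count per fitted mean."""
    return [[rpois(mu, rng) for mu in fitted_means] for _ in range(n_sim)]

rng = random.Random(1)
fitted_means = [0.5, 1.0, 2.0, 4.0, 8.0]  # hypothetical fitted values
sims = simulate_from_fit(fitted_means, n_sim=250, rng=rng)

# Per data point: the simulated distribution for the last observation
last_obs_sims = [s[-1] for s in sims]
# Via summary statistics: e.g. the total count in each simulated dataset
totals = [sum(s) for s in sims]
```

Both comparison routes start from the same object, a set of datasets the fitted model considers plausible.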
Teaser: example workflow in DHARMa
How does this work?
Assume new data simulated …
Option 1: “global” p-values
§ Calculate p-values for “global” (= all data) summary
statistics:
– Zero-inflation test - calculate simulated number of zeros vs.
observed number of zeros
– Dispersion test - calculate simulated vs. observed variance
around model predictions
[Figure: simulated residual variance vs. observed residual variance]
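A minimal sketch of such "global" tests, assuming a Poisson model with known fitted means and a simple two-sided simulation p-value. The helper names (`sim_p_value`, `count_zeros`, `residual_variance`) and the zero-inflated toy data are made up for illustration; DHARMa's own tests may define the statistics and p-values differently.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) count (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def sim_p_value(observed_stat, simulated_stats):
    """Two-sided p-value: where does the observed statistic fall among the simulated ones?"""
    n = len(simulated_stats)
    upper = (sum(s >= observed_stat for s in simulated_stats) + 1) / (n + 1)
    lower = (sum(s <= observed_stat for s in simulated_stats) + 1) / (n + 1)
    return min(1.0, 2 * min(upper, lower))

def count_zeros(y):
    return sum(v == 0 for v in y)

def residual_variance(y, mu):
    """Mean squared deviation of the data from the model predictions."""
    return sum((yi - mi) ** 2 for yi, mi in zip(y, mu)) / len(y)

rng = random.Random(42)
mu = [2.0] * 200  # hypothetical fitted means of a Poisson model

# Toy "observed" data with 30% structural zeros, which the Poisson model ignores
observed = [0 if rng.random() < 0.3 else rpois(m, rng) for m in mu]
sims = [[rpois(m, rng) for m in mu] for _ in range(250)]

p_zero = sim_p_value(count_zeros(observed), [count_zeros(s) for s in sims])
p_disp = sim_p_value(residual_variance(observed, mu),
                     [residual_variance(s, mu) for s in sims])
```

Any other summary statistic can be plugged into `sim_p_value` in the same way, which is what makes the approach generic.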
What we gain: a generalized dispersion test
§ Omnibus dispersion tests for any statistical model
(including observation-level REs, and terms for var / cor
structures, zi terms)
– Simulations show good power, also compared to parametric tests
(disclaimer: depends a bit on the model structure, for some model
structures refit = T required to get proper power)
Option 2: calculate p-values per observation
§ Goal: measure how far each data point deviates from the expected
distribution
§ Idea: express this in terms of the cumulative distribution of the
simulated data / residuals → standardizes residual to [0,1]
Translation: residual = p(x >= X0),
x = observed value, X0 = null distribution
simulated from the fitted model
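For a single observation, this amounts to evaluating the empirical cumulative distribution of the simulated values at the observed value. A minimal sketch (the function name `quantile_residual` and the toy numbers are hypothetical):

```python
def quantile_residual(observed, simulated):
    """Fraction of simulated values that do not exceed the observed value."""
    return sum(s <= observed for s in simulated) / len(simulated)

simulated = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]  # toy simulations for one data point

r_mid = quantile_residual(7.5, simulated)   # 0.7: observation in the upper part of the simulations
r_low = quantile_residual(0.0, simulated)   # 0.0: lower than every simulated value
r_high = quantile_residual(11.0, simulated) # 1.0: higher than every simulated value
```

Values near 0 or 1 mark observations in the tails of what the fitted model expects.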
Key property for these residuals
§ Each residual [0,1] is essentially a p-value: p(x >= X0)
§ Thus: if the fitted model is correct (H0), the residual
distribution p(x >= X0) should be uniform
– Side note: for discrete distributions, it is essential to add some
additional noise to x and X0 to make the distribution flat
(Dunn & Smyth, 1996)
§ Consequence: for ANY hierarchical model structure, if
the model is correct in structure and parameters,
residuals should be uniform!
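The uniformity property can be checked numerically. The sketch below assumes Poisson data, uniform noise on (-0.5, 0.5) added to both observed and simulated counts, and a Kolmogorov-Smirnov distance to the uniform distribution as the check; all names and numbers are illustrative choices, not DHARMa's implementation.

```python
import math
import random

def rpois(lam, rng):
    """Draw one Poisson(lam) count (Knuth's method, adequate for small lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def randomized_residual(observed, simulated, rng):
    """Empirical-CDF residual after jittering integer values to break ties."""
    x = observed + rng.uniform(-0.5, 0.5)
    jittered = [s + rng.uniform(-0.5, 0.5) for s in simulated]
    return sum(s <= x for s in jittered) / len(jittered)

def ks_distance_to_uniform(values):
    """Largest gap between the empirical CDF of values and the uniform CDF on [0,1]."""
    v = sorted(values)
    n = len(v)
    return max(max((i + 1) / n - u, u - i / n) for i, u in enumerate(v))

rng = random.Random(7)
n_obs, n_sim, model_mean = 500, 250, 1.5  # hypothetical fitted Poisson mean

def residuals_for(data):
    return [randomized_residual(y, [rpois(model_mean, rng) for _ in range(n_sim)], rng)
            for y in data]

data_ok = [rpois(model_mean, rng) for _ in range(n_obs)]  # model is correct
data_bad = [rpois(3.0, rng) for _ in range(n_obs)]        # model mean is wrong

ks_ok = ks_distance_to_uniform(residuals_for(data_ok))    # small: residuals look flat
ks_bad = ks_distance_to_uniform(residuals_for(data_bad))  # large: residuals pile up near 1
```

Without the jitter, the residuals of integer data would cluster on a few values and look non-uniform even for a correct model.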
Now we can understand the teaser
DHARMa implements this idea for a wide range of models
A range of further options
§ DHARMa can read in Bayesian posterior predictive
simulations
§ Calculate residuals / dispersion / other arbitrary
summary statistics also per grouping variable
§ Plot / test spatial / temporal autocorrelation
Experience with students and research: extremely helpful,
because it allows one to query / examine the model in much
more detail and to understand possible problems
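Residuals per grouping variable can be sketched by aggregating observed and simulated values within each group before computing the residual. Summing within groups is an assumption made here for illustration, and the function names are hypothetical; DHARMa's own aggregation may differ.

```python
from collections import defaultdict

def group_sums(values, groups):
    """Sum the values within each group."""
    sums = defaultdict(float)
    for v, g in zip(values, groups):
        sums[g] += v
    return dict(sums)

def grouped_residuals(observed, sims, groups):
    """Empirical-CDF residual of each group's observed sum among its simulated sums."""
    obs_sums = group_sums(observed, groups)
    sim_sums = [group_sums(s, groups) for s in sims]
    return {g: sum(s[g] <= obs_sums[g] for s in sim_sums) / len(sim_sums)
            for g in obs_sums}

observed = [1, 2, 3, 4]
groups = ["a", "a", "b", "b"]
sims = [[0, 0, 0, 0], [1, 1, 1, 1], [2, 2, 2, 2], [5, 5, 5, 5]]  # toy simulations

res = grouped_residuals(observed, sims, groups)
```

In this toy case group "a" gets residual 0.5 and group "b" gets 0.75, since two and three of the four simulated sums, respectively, do not exceed the observed sum.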
Statistical details and challenges I
§ Simulate conditional / unconditional on fitted REs?
– DHARMa allows changing the conditional structure for the
RE simulations, but the default is to re-simulate the entire model
structure (including all REs)
§ Simulate from point estimate, or include uncertainty
of the parameter estimates, as in Bayesian p-values?
– DHARMa is currently based on point estimates, and I’m
leaning towards keeping this for frequentist residuals. With
informative priors, the MLE (not MAP) could even be preferred
for Bayesian model checking (to avoid prior influences), but I
acknowledge that this is philosophically controversial.
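The difference between unconditional and conditional simulation can be illustrated with a random-intercept Poisson model. In this sketch the parameter values, the `fitted_re` dictionary and the function names are hypothetical; it only shows what "re-simulate the REs" versus "condition on the fitted REs" means.

```python
import math
import random
import statistics

def rpois(lam, rng):
    """Draw one Poisson(lam) count (Knuth's method, adequate for moderate lam)."""
    threshold = math.exp(-lam)
    k, prod = 0, rng.random()
    while prod > threshold:
        k += 1
        prod *= rng.random()
    return k

def simulate_random_intercept(beta0, re_sd, group_ids, rng, fitted_re=None):
    """Simulate counts with log-mean beta0 + random intercept per group.

    fitted_re=None: redraw the random intercepts (unconditional).
    fitted_re=dict: hold the given random intercepts fixed (conditional).
    """
    if fitted_re is None:
        re = {g: rng.gauss(0.0, re_sd) for g in sorted(set(group_ids))}
    else:
        re = fitted_re
    return [rpois(math.exp(beta0 + re[g]), rng) for g in group_ids]

rng = random.Random(3)
group_ids = [0] * 20 + [1] * 20
fitted_re = {0: -1.0, 1: 1.0}  # hypothetical fitted random intercepts

def group0_mean(y):
    return sum(y[:20]) / 20

uncond = [group0_mean(simulate_random_intercept(1.0, 1.0, group_ids, rng))
          for _ in range(200)]
cond = [group0_mean(simulate_random_intercept(1.0, 1.0, group_ids, rng, fitted_re))
        for _ in range(200)]

var_uncond = statistics.pvariance(uncond)  # includes between-group variation
var_cond = statistics.pvariance(cond)      # only observation-level variation
```

Unconditional simulations spread much more widely for a given group, so the two choices test different aspects of the model.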
Statistical details and challenges II
§ What are the expected distributions of the calculated
summaries / residuals?
– The devil is in the detail. A common question in forums: plot
residuals (DHARMa or others) against mixed model
predictions including REs → you will see a pattern like this
– This pattern is perfectly normal for a structurally correct
model and (I think) originates from the shrinkage on the REs
– remember: plot residuals against fixed effect predictions
only!
§ Many further examples like this
Statistical details and challenges III
§ How do we display the residuals?
§ I prefer the [0,1] (cdf) scaling because it is neutral, and I
believe that uniformity is easier to check visually than
normality
§ However, the [0,1] scaling hides outliers, i.e. it is not
necessarily proportional to leverage on the fit – it would be
desirable to additionally highlight outliers / leverage.
Summary
§ Simulation-based / quantile-based residuals create a
very general and flexible framework for checking any
hierarchical statistical model / GLMM
§ Many open questions that would warrant further
attention:
– Tests, expected distributions / patterns under H0, how to
best display the residuals …
§ Worth doing this work, because in the end our results
are only as good as our ability to choose the right
model - without proper checks, we are operating blind!
Thank you!
And if you want to check the Dharma of your
models, run install.packages("DHARMa")
