QUANTIFYING THE IMPACT OF DIFFERENT
APPROACHES FOR HANDLING CONTINUOUS
PREDICTORS ON THE PERFORMANCE OF A
PROGNOSTIC MODEL
Gary Collins, Emmanuel Ogundimu, Jonathan Cook,
Yannick Le Manach, Doug Altman
Centre for Statistics in Medicine
University of Oxford
20-July-2016
gary.collins@csm.ox.ac.uk
Outline
 Existing guidance
 What’s done in practice?
 Brief overview of the study sample & simulation set-up
 Findings & Discussion
Basis of this presentation
Not a new idea…
It’s all in the title…(1994-2006)
1. Problems in dichotomizing continuous variables (Altman 1994)
2. Dangers of using "optimal" cutpoints in the evaluation of prognostic
factors. (Altman et al 1994)
3. How bad is categorization? (Weinberg; 1995)
4. Seven reasons why you should NOT categorize continuous data
(Dinero; 1996)
5. Breaking Up is Hard to Do: The Heartbreak of Dichotomizing
Continuous Data (Streiner; 2002)
6. Negative consequences of dichotomizing continuous predictor
variables (Irwin & McClelland; 2003)
7. Why carve up your continuous data? (Owen 2005)
8. Chopped liver? OK. Chopped data? Not OK. (Butts & Ng 2005)
9. Categorizing continuous variables resulted in different
predictors in a prognostic model for nonspecific neck pain
(Schellingerhout et al 2006)
It’s all in the title…(2006-2014)
10. Dichotomizing continuous predictors in multiple regression: a bad idea (Royston et al 2006)
11. The cost of dichotomising continuous variables (Altman & Royston; 2006)
12. Leave 'em alone - why continuous variables should be analyzed as such (van Walraven & Hart; 2008)
13. Dichotomization of continuous data - a pitfall in prognostic factor studies (Metze; 2008)
14. Analysis by categorizing or dichotomizing continuous variables is inadvisable: an example from the natural history of unruptured aneurysms (Naggara et al 2011)
15. Against quantiles: categorization of continuous variables in epidemiologic research, and its discontents (Bennette & Vickers; 2012)
16. Dichotomizing continuous variables in statistical analysis: a practice to avoid (Dawson & Weiss; 2012)
17. The danger of dichotomizing continuous variables: A visualization (Kuss 2013)
18. The “anathema” of arbitrary categorization of continuous predictors (Vintzileos et al; 2014)
19. Ophthalmic statistics note: the perils of dichotomising continuous variables (Cumberland et al 2014)
Prognostic factor (PF)
(Figure: risk plotted against a continuous prognostic factor, dichotomised at a cut-point into “PF not present (low risk)” and “PF present (high risk)”; the implied step in risk at the cut-point is biologically implausible.)
“Convoluted Reasoning and Anti-intellectual Pomposity”: “C.R.A.P.”
(Norman & Streiner; Biostatistics: the Bare Essentials, 2008)
Slide adapted from Michael Babyak (‘Modeling with Observational Data’)
Still, what happens in practice…?
 Breast cancer models (Altman 2009)
– Categorised some/all: 34/53 (64%)
 Diabetes models (Collins et al 2011)
– Categorised some/all: 21/43 (49%)
 General medical journals (Bouwmeester et al 2012)
– Categorised: 30/64 (47%)
– Dichotomised: 21/64 (33%)
 Cancer models (Mallett et al 2010)
– All categorised/dichotomised: 24/47 (51%)
Aim of the study
 Investigate the impact of different approaches for
handling continuous predictors on the
– apparent performance (same data)
– validation performance (different data; geographical validation)
 Investigate the influence sample size has on each
approach for handling continuous predictors
Sample characteristics (THIN)
– Development data: 80,800 CVD events; 7,721 hip fractures
– Validation data (geographically distinct): 4,688 CVD events; 565 hip fractures
Models
 Cox models to predict
– 10-year risk of CVD (men & women)
– 10-year risk of hip fracture (women only)
 CVD model contained 7 predictors
– Age, sex, family history, cholesterol, SBP, BMI, hypertension
 Hip fracture model contained 5 predictors
– Age, BMI, Townsend score, asthma, antidepressants
Resampling strategy
 MODEL DEVELOPMENT
– The number of events in each sample was fixed at
25, 50, 100, and 1000 events
– Samples were drawn from those with and without the event
(separately)
– 200 samples randomly drawn (with replacement)
 MODEL VALIDATION
– All available data were used
• CVD: n=110,934 (4688 CVD events)
• Hip fracture: n=61,563 (565 hip fractures)
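The resampling scheme above can be sketched in Python (the original analyses were done in R; the function name and the rule for scaling the number of non-events to the cohort's event fraction are illustrative assumptions, as the slide does not state how non-events were scaled):

```python
import numpy as np

rng = np.random.default_rng(2016)

def draw_development_sample(event_idx, nonevent_idx, n_events, event_fraction):
    """Draw one development sample with the number of events fixed,
    sampling events and non-events separately, both with replacement."""
    # Assumption: non-events scaled so the sample keeps the cohort's event fraction
    n_nonevents = int(round(n_events * (1 - event_fraction) / event_fraction))
    events = rng.choice(event_idx, size=n_events, replace=True)
    nonevents = rng.choice(nonevent_idx, size=n_nonevents, replace=True)
    return np.concatenate([events, nonevents])

# Toy cohort: 500 events among 10,500 subjects (event fraction ~4.8%)
event_idx = np.arange(500)
nonevent_idx = np.arange(500, 10_500)

# One development sample, here with 25 events
sample = draw_development_sample(event_idx, nonevent_idx,
                                 n_events=25, event_fraction=500 / 10_500)
```

In the study this draw was repeated 200 times at each of the four event counts, and every fitted model was then validated once on the full held-out data.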
Approaches considered
 Dichotomised at the
– Median predictor value
– ‘optimal’ cut-point based on the logrank test
 Categorised into
– 3 groups (using tertile predictor values)
– 4 groups (using quartile predictor values)
– 5 groups (using quintile predictor values)
– 5-year age categories
– 10-year age categories
 Linear relationship
 Nonlinear relationship
– fractional polynomials (FP2; 4 degrees of freedom per predictor)
– restricted cubic splines (3 knots)
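A rough Python sketch of three of the codings compared above: dichotomising at the median, quartile grouping, and a 3-knot restricted cubic spline (the study itself used R; the spline below follows Harrell's truncated power basis, and the 10th/50th/90th-percentile knot placement is an assumed default, not stated on the slide):

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(60, 10, size=1_000)          # a continuous predictor, e.g. age

# Dichotomised at the median
x_dich = (x > np.median(x)).astype(int)

# Categorised into 4 groups at the quartiles
quartiles = np.quantile(x, [0.25, 0.5, 0.75])
x_cat4 = np.digitize(x, quartiles)          # group codes 0..3

# Restricted cubic spline with 3 knots: x plus one nonlinear column,
# constrained to be linear beyond the outer knots
def rcs_basis(x, knots):
    t1, t2, t3 = knots
    p = lambda u: np.clip(u, 0.0, None) ** 3            # (u)_+^3
    s = (p(x - t1)
         - p(x - t2) * (t3 - t1) / (t3 - t2)
         + p(x - t3) * (t2 - t1) / (t3 - t2)) / (t3 - t1) ** 2
    return np.column_stack([x, s])

knots = np.quantile(x, [0.10, 0.50, 0.90])
X_rcs = rcs_basis(x, knots)                 # 1000 x 2 design columns
```

Each coding would then enter the Cox model in place of the raw predictor; the dichotomised and grouped versions throw away all within-group risk variation, which is what the simulations quantify.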
Performance measures calculated
 Calibration
– Calibration plot
– Harrell’s “val.surv” function; hazard regression with linear
splines
 Discrimination
– Harrell’s c-index
 Clinical utility
– Decision curve analysis (Vickers & Elkin 2006)
– Net benefit;
• weighted difference between true positives and false positives
 D-statistic, Brier score, and R-squared were also examined
– Not reported here; see the supplementary material of
Collins et al, Stat Med 2016
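Harrell's c-index, used for discrimination above, can be sketched in Python as a naive O(n²) pass over pairs (a minimal illustrative version: ties in follow-up time are simply skipped, and censored subjects contribute only as the longer-surviving member of a pair; Harrell's implementations in R handle ties and weighting more carefully):

```python
import numpy as np

def harrell_c(time, event, risk):
    """Harrell's c-index for right-censored survival data.
    A pair is usable when the subject with the shorter follow-up
    had the event; it is concordant when that subject also has
    the higher predicted risk (ties in risk count 1/2)."""
    conc = usable = 0.0
    n = len(time)
    for i in range(n):
        if event[i] != 1:
            continue                      # censored: cannot anchor a pair
        for j in range(n):
            if time[i] < time[j]:
                usable += 1
                if risk[i] > risk[j]:
                    conc += 1.0
                elif risk[i] == risk[j]:
                    conc += 0.5
    return conc / usable

time  = np.array([2.0, 4.0, 6.0, 8.0])
event = np.array([1, 1, 0, 1])            # subject 3 censored at t=6
risk  = np.array([0.9, 0.7, 0.5, 0.2])    # risk ordering matches event times
c = harrell_c(time, event, risk)          # 1.0: every usable pair is concordant
```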
Net benefit (recap)
 pt is the probability threshold used to denote ‘high
risk’
– Used to weight the FP and FN counts
 TP and FP calculated using Kaplan-Meier
estimates of the percentage surviving at 10
years among those with predicted risks
greater than pt
 Bottom line: the model with the highest NB ‘wins’
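The definition above, in a minimal Python sketch for a binary outcome (an illustrative simplification: the talk's version instead uses Kaplan-Meier estimates of 10-year survival among those flagged high risk, which handles censoring; that refinement is omitted here):

```python
import numpy as np

def net_benefit(y, risk, pt):
    """Net benefit at threshold pt:
    NB = TP/n - FP/n * pt/(1 - pt),
    so false positives are down-weighted by the odds of the threshold."""
    flagged = risk >= pt                  # treated as 'high risk'
    n = len(y)
    tp = np.sum(flagged & (y == 1))
    fp = np.sum(flagged & (y == 0))
    return tp / n - (fp / n) * pt / (1.0 - pt)

y    = np.array([1, 1, 0, 0, 0, 0, 0, 0, 0, 0])        # 20% event rate
risk = np.array([0.8, 0.6, 0.3, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1, 0.1])
nb = net_benefit(y, risk, pt=0.2)        # ~0.175 (2 TP, 1 FP at pt = 0.2)
```

Plotting NB against a range of pt gives the decision curve; the model whose curve is highest at clinically sensible thresholds ‘wins’.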
Age & CVD
Total serum cholesterol & CVD
Age, cholesterol, BMI, SBP & CVD
Age, BMI & Hip fracture
RESULTS: CVD 25 events
RESULTS: CVD 50 events
RESULTS: CVD 100 events
RESULTS: CVD 1000 events
RESULTS: Hip fracture 25 events
RESULTS: Hip fracture 50 events
RESULTS: Hip fracture 100 events
RESULTS: Hip fracture 1000 events
RESULTS: Discrimination CVD
 At small sample sizes (25 events)
– Large difference between apparent performance and
validation performance for ‘optimal’ dichotomisation
• 0.84 (apparent); 0.72 (validation)
– Smaller differences observed for FP/RCS/Linear
• 0.84 (apparent); 0.78 (validation)
 Observed difference between dichotomisation (at
the median) and linear/FP/RCS
– Apparent performance: difference of 0.05
– Validation performance: difference of 0.05
– Observed over all 4 sample sizes examined
 Negligible differences between linear/FP/RCS
RESULTS: Discrimination Hip Fracture
 At small sample sizes (25 events)
– Large difference between apparent performance and
validation performance for ‘optimal’ dichotomisation
• 0.86 (apparent); 0.76 (validation)
– FP/RCS/Linear
• 0.90 (apparent); 0.87 (validation)
 Observed difference between dichotomisation (at
the median) and linear/FP/RCS
– Apparent performance: difference of 0.1
– Validation performance: difference of 0.1
– Observed over all 4 sample sizes examined
 Negligible differences between linear/FP/RCS
RESULTS: Discrimination Hip Fracture (figure)
RESULTS: Decision Curve Analysis (CVD only) [higher NB = better model]
(Figure: decision curves; the FP/RCS models lie above the dichotomised models.)
RESULTS: Net cases found per 1000
Conclusions
 Systematic reviews show that dichotomising or
categorising continuous predictors is routinely done
when developing a prediction model
 Dichotomising, either at the median or at the ‘optimal’
predictor value, leads to models with substantially
poorer performance
– Poor discrimination; poor calibration; poor clinical utility
 Large discrepancies between apparent performance
and validation performance were observed for ‘optimal’
split dichotomising
 The impact of how continuous predictors are
handled is more pronounced at smaller sample
sizes
