Biostatistics
Lecture 10
Lecture 9 Review – Measures of association
Measures of association
– Risk difference
– Risk ratio
– Odds ratio
Calculation & interpretation of confidence interval for each measure of association
2×2 table - Measures of association
Outcome - binary
Measure of Effect Formula
Risk difference p1-p0
Risk ratio p1 / p0
Odds ratio (d1/h1) / (d0/h0)
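As a rough illustration of the formulas above, the sketch below computes the three measures from the four cells of a 2×2 table. The function name `measures_of_association` is an assumption made for this example. Here d and h denote the numbers with and without the outcome in each group, and the counts are those of the TBM trial table that appears later in this lecture.

```python
def measures_of_association(d1, h1, d0, h0):
    """Return (risk difference, risk ratio, odds ratio) for a 2x2 table.

    d1, h1: numbers with / without the outcome in group 1 (exposed)
    d0, h0: numbers with / without the outcome in group 0 (unexposed)
    """
    p1 = d1 / (d1 + h1)  # risk in group 1
    p0 = d0 / (d0 + h0)  # risk in group 0
    risk_difference = p1 - p0
    risk_ratio = p1 / p0
    odds_ratio = (d1 / h1) / (d0 / h0)
    return risk_difference, risk_ratio, odds_ratio


# Counts from the TBM trial (dexamethasone = group 1, placebo = group 0)
rd, rr, or_ = measures_of_association(d1=87, h1=187, d0=112, h0=159)
print(f"RD = {rd:.3f}, RR = {rr:.2f}, OR = {or_:.2f}")
```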
Differences in measures of association
• When there is no association between exposure and outcome,
– risk difference = 0
– risk ratio (RR) = 1
– odds ratio (OR) = 1
• Risk difference can be negative or positive
• RR & OR are always positive
• For rare outcomes, OR ~ RR
• OR is always further from 1 than corresponding RR
– If RR > 1 then OR > RR
– If RR < 1 then OR < RR
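The last two bullets follow from a short identity, sketched here with p1 and p0 denoting the risks in groups 1 and 0 (both assumed to lie strictly between 0 and 1):

$$OR = \frac{p_1/(1-p_1)}{p_0/(1-p_0)} = RR \times \frac{1-p_0}{1-p_1}$$

If RR > 1 then p1 > p0, so the factor (1−p0)/(1−p1) is greater than 1 and OR > RR. If RR < 1 the factor is less than 1 and OR < RR. When the outcome is rare, both 1−p0 and 1−p1 are close to 1, so the factor is close to 1 and OR ≈ RR.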
Interpretation of measures of association
• RR & OR < 1, associated with a reduced risk / odds (may be protective)
– RR = 0.8 (reduced risk of 20%)
• RR & OR > 1, associated with an increased risk / odds
– RR = 1.2 (increased risk of 20%)
• RR & OR – the further the ratio is from 1, the stronger the association
between exposure and outcome (e.g. RR=3 indicates a stronger association than RR=2).
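Comparing the outcome measure of two exposure groups
(groups 1 & 0)
Outcome variable – data type: Categorical
Population parameter: Population risk ratio
Estimate of population parameter from sample: p1/p0
Standard error of loge(parameter):
$$\text{s.e.}(\log_e RR) = \sqrt{\frac{1}{d_1} - \frac{1}{n_1} + \frac{1}{d_0} - \frac{1}{n_0}}$$
95% Confidence interval of loge(population parameter):
$$\log_e RR \pm 1.96 \times \text{s.e.}(\log_e RR)$$
Outcome variable – data type: Categorical
Population parameter: Population odds ratio
Estimate of population parameter from sample: (d1/h1) / (d0/h0)
Standard error of loge(parameter):
$$\text{s.e.}(\log_e OR) = \sqrt{\frac{1}{d_1} + \frac{1}{h_1} + \frac{1}{d_0} + \frac{1}{h_0}}$$
95% Confidence interval of loge(population parameter):
$$\log_e OR \pm 1.96 \times \text{s.e.}(\log_e OR)$$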
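A minimal sketch of these interval calculations follows. The interval is built on the log scale and then exponentiated back to the ratio scale. The function names are assumptions for illustration, and the counts are those of the TBM trial table later in this lecture.

```python
import math


def log_scale_ci(estimate, se_log, z=1.96):
    """95% CI computed on the log scale, then exponentiated back."""
    log_est = math.log(estimate)
    return math.exp(log_est - z * se_log), math.exp(log_est + z * se_log)


def risk_ratio_ci(d1, n1, d0, n0):
    """Risk ratio and its 95% CI from events (d) and group sizes (n)."""
    rr = (d1 / n1) / (d0 / n0)
    se = math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
    return rr, log_scale_ci(rr, se)


def odds_ratio_ci(d1, h1, d0, h0):
    """Odds ratio and its 95% CI from events (d) and non-events (h)."""
    odds_ratio = (d1 / h1) / (d0 / h0)
    se = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    return odds_ratio, log_scale_ci(odds_ratio, se)


rr, rr_ci = risk_ratio_ci(d1=87, n1=274, d0=112, n0=271)
or_, or_ci = odds_ratio_ci(d1=87, h1=187, d0=112, h0=159)
print(f"RR = {rr:.2f}, 95% CI {rr_ci[0]:.2f} to {rr_ci[1]:.2f}")
print(f"OR = {or_:.2f}, 95% CI {or_ci[0]:.2f} to {or_ci[1]:.2f}")
```

Working on the log scale keeps both limits positive, which matches the fact that RR and OR cannot be negative.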
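Calculation of p-values for comparing two groups
Outcome variable – data type: Categorical
Population parameter: π1-π0
Population parameter under null hypothesis: π1-π0=0
Test statistic:
$$z = \frac{p_1 - p_0}{\text{s.e.}(p_1 - p_0)}$$
Population parameter: Population risk ratio
Population parameter under null hypothesis: Population risk ratio=1
Test statistic:
$$z = \frac{\log_e(RR)}{\text{s.e.}(\log_e(RR))}$$
Population parameter: Population odds ratio
Population parameter under null hypothesis: Population odds ratio=1
Test statistic:
$$z = \frac{\log_e(OR)}{\text{s.e.}(\log_e(OR))}$$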
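The three test statistics can be sketched as below, using the TBM trial counts from later in this lecture. Two points are assumptions of this example rather than statements from the slides: the standard error of p1 − p0 is taken to be the pooled form used under the null hypothesis, and the two-sided p-value is obtained from the standard normal distribution via `math.erfc`. Results may differ slightly from tabulated p-values depending on which standard error is used.

```python
import math


def two_sided_p(z):
    """Two-sided p-value for a standard normal test statistic."""
    return math.erfc(abs(z) / math.sqrt(2))


d1, h1, d0, h0 = 87, 187, 112, 159
n1, n0 = d1 + h1, d0 + h0
p1, p0 = d1 / n1, d0 / n0

# Risk difference: pooled proportion for the s.e. under the null (assumption)
p = (d1 + d0) / (n1 + n0)
z_rd = (p1 - p0) / math.sqrt(p * (1 - p) * (1 / n1 + 1 / n0))

# Risk ratio and odds ratio: tests are carried out on the log scale
z_rr = math.log(p1 / p0) / math.sqrt(1 / d1 - 1 / n1 + 1 / d0 - 1 / n0)
z_or = math.log((d1 / h1) / (d0 / h0)) / math.sqrt(
    1 / d1 + 1 / h1 + 1 / d0 + 1 / h0
)

for name, z in (("RD", z_rd), ("RR", z_rr), ("OR", z_or)):
    print(f"{name}: z = {z:.2f}, two-sided p = {two_sided_p(z):.3f}")
```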
Comparing the outcome measure of two exposure groups
(TBM trial: dexamethasone versus placebo)
Outcome variable – data type: Categorical (all rows)
Population parameter          Estimate of population         95% confidence interval     Two-sided
under null hypothesis         parameter from sample          for population parameter    p-value
Population risk difference    p1-p0 = -0.095                 -0.175, -0.015              0.020
= 0
Population risk ratio         p1/p0 = 0.77                   0.62, 0.96                  0.016
= 1
Population odds ratio         (d1/h1) / (d0/h0) = 0.66       0.46, 0.93                  0.021
= 1
2×2 table – TBM trial example
Odds ratio for death = (d1/h1) / (d0/h0) = 0.465 / 0.704 = 0.66
Odds ratio for exposure to dexamethasone = (d1/d0) / (h1/h0) = 0.777 / 1.176 = 0.66
Odds ratio for not dying = (h1/d1) / (h0/d0) = 2.149 / 1.420 = 1.51 = (1/0.66)
Odds ratio for exposure to placebo = (d0/d1) / (h0/h1) = 1.287 / 0.850 = 1.51 = (1/0.66)
                          Death during 9 months post start of treatment
Treatment group           Yes         No          Total
Dexamethasone (group 1)   87 (d1)     187 (h1)    274 (n1)
Placebo (group 0)         112 (d0)    159 (h0)    271 (n0)
Total                     199         346         545
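The four odds ratios above can be checked with a few lines of arithmetic. This is only a sketch using the counts in the table; the variable names are chosen for this example. It illustrates that the odds ratio is the same whether the table is read by outcome or by exposure, and that swapping the reference category gives the reciprocal.

```python
d1, h1, d0, h0 = 87, 187, 112, 159  # TBM trial counts

or_death = (d1 / h1) / (d0 / h0)          # outcome = death
or_dexamethasone = (d1 / d0) / (h1 / h0)  # "outcome" = exposure to dexamethasone
or_not_dying = (h1 / d1) / (h0 / d0)      # outcome = survival
or_placebo = (d0 / d1) / (h0 / h1)        # "outcome" = exposure to placebo

# All four reduce to the cross-product ratio or to its reciprocal
cross_product = (d1 * h0) / (h1 * d0)

print(f"{or_death:.2f} {or_dexamethasone:.2f} {or_not_dying:.2f} {or_placebo:.2f}")
```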
Measure of association
Study Design                  Risk difference   Risk Ratio   Odds Ratio
Randomised controlled trial   √                 √            √
Cohort Study                  √                 √            √
Case-control Study            ×                 ×            √
Lecture 10 – Controlling for confounding:
stratification and regression
• A description of confounding
• How to control for confounding in statistical analysis by
– Stratification
– Regression modelling
• A brief description of the role of multiple
linear or logistic regression in adjusting for
confounding
Outcome and exposure variables
(RECAP)
• Outcomes are variables of interest (population
health relevance) whose patterns and
determinants we wish to learn about from data
• Exposures are the variables we think might
explain observed variation in the outcomes
• Statistical analysis can be used to quantify the
association between outcomes and exposures
What is confounding?
A confounding variable
1) is associated with the outcome variable;
2) is associated with the exposure variable;
3) does not lie on the causal pathway between exposure and outcome.
[Diagram: confounding variable linked to both the exposure variable and the outcome variable]
Failing to control for confounding may result in a
biased estimate of the magnitude of the association
between exposure and outcome
Example of confounding
Exposure variable: Alcohol intake
Outcome variable: Heart disease
Confounding variable: Cigarette smoking
Control of confounding
Design of Study
• Randomisation
(randomised controlled trial: e.g. TBM trial)
• Restriction
(only include those with one value of confounder)
• Matching
Control of confounding
Statistical analysis
• Stratification
• Regression modelling
Hypothetical example of a case-control study
Association between energy intake and heart disease
Odds of heart disease in high energy intake group = 730/600 = 1.22
Odds of heart disease in low energy intake group = 700/540 = 1.30
Odds ratio = 1.22 / 1.30 = 0.94
95% confidence interval: 0.80 up to 1.10
                 Heart disease
Energy intake    Yes        No         Total
High (group 1)   730 (d1)   600 (h1)   1330 (n1)
Low (group 0)    700 (d0)   540 (h0)   1240 (n0)
Total            1430       1140       2570
Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Stratify by physical activity…
Calculate the stratum-specific odds ratios…
                 High physical activity    Low physical activity
                 Heart disease             Heart disease
Energy intake    Yes      No               Yes      No
High (group 1)   500      510              230      90
Low (group 0)    100      150              600      390
Stratify by physical activity…..
For high physical activity group:
OR (95% CI) = 1.47 (1.11, 1.95)
For low physical activity group:
OR (95% CI) = 1.66 (1.26, 2.19)
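The crude and stratum-specific odds ratios can be placed side by side with a short sketch such as the one below. The helper name `odds_ratio_with_ci` is an assumption for this example; it uses the s.e.(logeOR) formula from the Lecture 9 review, and the counts come from the crude and stratified tables of this example.

```python
import math


def odds_ratio_with_ci(d1, h1, d0, h0, z=1.96):
    """Odds ratio with a 95% CI built on the log scale."""
    odds_ratio = (d1 / h1) / (d0 / h0)
    se = math.sqrt(1 / d1 + 1 / h1 + 1 / d0 + 1 / h0)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


# (d1, h1, d0, h0): heart disease yes/no for high, then low, energy intake
tables = {
    "crude (all subjects)": (730, 600, 700, 540),
    "high physical activity": (500, 510, 100, 150),
    "low physical activity": (230, 90, 600, 390),
}

results = {name: odds_ratio_with_ci(*cells) for name, cells in tables.items()}
for name, (or_, lower, upper) in results.items():
    print(f"{name}: OR = {or_:.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

The crude odds ratio lies below 1 while both stratum-specific odds ratios lie above 1, which is the pattern that prompts the check for confounding.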
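Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Association between physical activity and energy intake: ???
Association between physical activity and heart disease: ???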
Confounding – condition 1
Association between physical activity and heart disease
** Look particularly in those who are not exposed to the factor of interest**
For low energy intake group:
OR (95% CI) = 0.43 (0.33, 0.58)
For high energy intake group:
OR (95% CI) = 0.38 (0.29, 0.50)
                    High energy intake    Low energy intake
                    Heart disease         Heart disease
Physical activity   Yes      No           Yes      No
High (group 1)      500      510          100      150
Low (group 0)       230      90           600      390
Confounding – condition 2
Association between energy intake and physical activity
• In a case-control study: examine the association in the controls
• In a cohort study: use the whole cohort
Confounding – condition 2
Association between energy intake and physical activity for those
without heart disease (n=1140)
Proportion in high energy intake group who report high physical activity =
510/600 = 0.85 (85%)
Proportion in low energy intake group who report high physical activity =
150/540 = 0.28 (28%)
Odds Ratio = (510/90) / (150/390) = 14.7; 95% CI: 11.0 up to 19.7
                 Physical activity
Energy intake    High     Low      Total
High (group 1)   510      90       600
Low (group 0)    150      390      540
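Both conditions can be checked numerically with a small sketch like the following. The helper `odds_ratio` and the variable names are assumptions for this example; the counts are those in the condition 1 and condition 2 tables.

```python
def odds_ratio(a, b, c, d):
    """Odds ratio (a/b) / (c/d) for a 2x2 table with rows (a, b) and (c, d)."""
    return (a / b) / (c / d)


# Condition 1: physical activity vs heart disease within each energy intake group
# rows: high / low physical activity; columns: heart disease yes / no
or_pa_hd_low_energy = odds_ratio(100, 150, 600, 390)
or_pa_hd_high_energy = odds_ratio(500, 510, 230, 90)

# Condition 2: energy intake vs physical activity among the controls (n=1140)
# rows: high / low energy intake; columns: high / low physical activity
or_energy_pa_controls = odds_ratio(510, 90, 150, 390)

print(f"Condition 1, low energy intake:  OR = {or_pa_hd_low_energy:.2f}")
print(f"Condition 1, high energy intake: OR = {or_pa_hd_high_energy:.2f}")
print(f"Condition 2, controls only:      OR = {or_energy_pa_controls:.1f}")
```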
Is this association confounded by physical activity?
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Physical activity and energy intake:
– High energy intake associated with high physical activity
Physical activity and heart disease:
– High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
– Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)
So physical activity is a potential confounder
Control for confounding - Stratified analyses
1) Start with stratum specific estimates of odds ratios, risk ratios, risk
differences, rate ratios
2) Calculate a weighted average of the stratum-specific estimates:
the ‘pooled’ estimate
Usual method is Mantel-Haenszel method
– Weights assigned according to amount of information in each
stratum
Calculate a pooled OR
For high physical activity:
OR = 1.47
w = (d0×h1)/n = (100×510)/1260 = 40.5
For low physical activity:
OR = 1.66
w = (d0×h1)/n = (600×90)/1310 = 41.2
                 High physical activity (n=1260)   Low physical activity (n=1310)
                 Heart disease                     Heart disease
Energy intake    Yes        No                     Yes        No
High (group 1)   500 (d1)   510 (h1)               230 (d1)   90 (h1)
Low (group 0)    100 (d0)   150 (h0)               600 (d0)   390 (h0)
Calculate a pooled OR
Mantel-Haenszel estimate of pooled odds ratio:
$$OR_{MH} = \frac{\sum_i (w_i \times OR_i)}{\sum_i w_i}$$
where i indexes the strata and w_i = (d0×h1)/n is calculated within stratum i.
Calculate a pooled OR
Mantel-Haenszel estimate of pooled odds ratio:
$$OR_{MH} = \frac{(40.5 \times 1.47) + (41.2 \times 1.66)}{40.5 + 41.2} = 1.57$$
95% CI: 1.29 up to 1.91
Recall that the crude OR was 0.94 (95% CI 0.80-1.10)
Is there a difference between crude
and adjusted measures of effect?
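The pooled estimate can be reproduced with a short sketch. The function name `mantel_haenszel_or` is an assumption for this example; the weights are w = (d0×h1)/n as defined on the slides, and the confidence interval is not computed here.

```python
def mantel_haenszel_or(strata):
    """Pooled OR = sum(w_i * OR_i) / sum(w_i), with w_i = d0_i * h1_i / n_i.

    Each stratum is a tuple (d1, h1, d0, h0).
    """
    numerator = 0.0
    denominator = 0.0
    for d1, h1, d0, h0 in strata:
        n = d1 + h1 + d0 + h0
        weight = d0 * h1 / n
        stratum_or = (d1 * h0) / (d0 * h1)
        numerator += weight * stratum_or
        denominator += weight
    return numerator / denominator


strata = [
    (500, 510, 100, 150),  # high physical activity (n=1260)
    (230, 90, 600, 390),   # low physical activity (n=1310)
]
or_mh = mantel_haenszel_or(strata)
print(f"Mantel-Haenszel pooled OR = {or_mh:.2f}")
```

Because each term w × OR simplifies to (d1×h0)/n, the same estimate can also be written as the ratio of the sums of (d1×h0)/n and (d0×h1)/n across strata.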
Association between energy intake & heart
disease adjusting for physical activity
ORMH = 1.57
95% CI: 1.29, 1.91
Exposure variable: Energy intake
Outcome variable: Heart disease
Confounding variable: Physical activity
Physical activity and energy intake:
– High energy intake associated with high physical activity
Physical activity and heart disease:
– High energy intake: OR = 0.38 (95% CI: 0.29, 0.50)
– Low energy intake: OR = 0.43 (95% CI: 0.33, 0.58)
Multiple logistic regression
Outcome variable (y-variable) – binary
e.g. dead or alive; treatment failure or success;
disease or no disease..
Measure of association – Odds ratio
Multiple logistic regression model –
loge(odds of outcome) = β0 + β1X1 + β2X2 + β3X3 +…. + βkXk
β1,…βk – loge(odds ratios)
X1, …..Xk – k different exposure variables (do not need to
be binary but can be categorical with more than 2 categories
or numerical)
Useful when there are many confounding variables…
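Why each β is a loge(odds ratio) can be seen by comparing two individuals who differ by one unit in X1 and share the same values of X2, …, Xk. A brief sketch in the model's own notation:

$$\log_e(\text{odds} \mid X_1 = x + 1) - \log_e(\text{odds} \mid X_1 = x) = \beta_1 \quad\Rightarrow\quad \frac{\text{odds} \mid X_1 = x + 1}{\text{odds} \mid X_1 = x} = e^{\beta_1}$$

So exp(β1) is the odds ratio for a one-unit increase in X1, adjusted for the other variables in the model. For a binary exposure coded 0/1 it is the odds ratio comparing the two groups.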
Logistic regression
Example – Association between energy intake and heart disease
Outcome variable (y-variable) – heart disease (coded as yes-1 & no-0)
Logistic regression model –
loge(odds of outcome) = β0 + β1X1
β1 – loge(odds ratios)
X1 – energy intake (high versus low)
Exposure                      Odds Ratio (exp βi)   95% Confidence Interval
Energy intake (high vs low)   0.94                  0.80, 1.10
Multiple logistic regression
Example – Association between energy intake and heart disease
Outcome variable (y-variable) – heart disease (coded as yes-1 & no-0)
Multiple logistic regression model –
loge(odds of outcome) = β0 + β1X1 + β2X2
β1, β2 – loge(odds ratios)
X1 – energy intake (high versus low)
X2 – physical activity (high versus low)
Exposure                          Odds Ratio (exp βi)   95% Confidence Interval
Energy intake (high vs low)       1.57                  1.29, 1.91
Physical activity (high vs low)   0.41                  0.33, 0.49
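To show what fitting this model involves, here is a minimal pure-Python sketch that maximises the logistic likelihood by Newton-Raphson on the grouped counts from the stratified tables. The slides do not say which software produced the table above, so this fitting routine is an assumption for illustration; in practice a statistics package would normally be used. The estimates should come out close to the tabulated values.

```python
import math

# (energy_high, activity_high, cases, controls) from the stratified tables
cells = [(1, 1, 500, 510), (0, 1, 100, 150), (1, 0, 230, 90), (0, 0, 600, 390)]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [a[i][:] + [b[i]] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def score_and_information(beta):
    """Gradient and Fisher information of the grouped binomial log-likelihood."""
    score = [0.0, 0.0, 0.0]
    info = [[0.0] * 3 for _ in range(3)]
    for energy, activity, cases, controls in cells:
        x = [1.0, energy, activity]  # intercept, X1, X2
        n = cases + controls
        p = 1 / (1 + math.exp(-sum(b * xi for b, xi in zip(beta, x))))
        for i in range(3):
            score[i] += x[i] * (cases - n * p)
            for j in range(3):
                info[i][j] += n * p * (1 - p) * x[i] * x[j]
    return score, info


beta = [0.0, 0.0, 0.0]
for _ in range(25):  # Newton-Raphson updates: beta <- beta + info^-1 * score
    score, info = score_and_information(beta)
    beta = [b + s for b, s in zip(beta, solve(info, score))]

_, info = score_and_information(beta)
labels = ((1, "Energy intake (high vs low)"), (2, "Physical activity (high vs low)"))
for j, name in labels:
    unit = [1.0 if i == j else 0.0 for i in range(3)]
    se = math.sqrt(solve(info, unit)[j])  # j-th diagonal of the inverse information
    lower, upper = math.exp(beta[j] - 1.96 * se), math.exp(beta[j] + 1.96 * se)
    print(f"{name}: OR = {math.exp(beta[j]):.2f} (95% CI {lower:.2f}, {upper:.2f})")
```

Dropping the physical activity column from the model would return the crude odds ratio for energy intake, which is the comparison the two regression tables are making.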
Multiple linear regression
Outcome variable (y-variable) – numerical
e.g. blood pressure, forced expiratory volume in 1 sec (FEV1)
Linear regression model –
y = β0 + β1X1 + β2X2 + β3X3 +…. + βkXk
y – numerical outcome variable,
β1,…βk – change in y for every unit increase in the corresponding X, holding the other X variables fixed
X1, …..Xk – k different exposure variables (can be numerical
or categorical with 2+ categories)
Useful when there are many confounding variables…
Lecture 10 - Objectives
• Understand confounding
• Calculate the Mantel-Haenszel estimate of the
pooled odds ratio
• Understand the difference between linear and
logistic regression
Thank You
www.HelpWithAssignment.com

Measures of association - Biostatistics
