Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Introduction to logistic regression
Tuan V. Nguyen
Professor and NHMRC Senior Research Fellow
Garvan Institute of Medical Research
University of New South Wales
Sydney, Australia
What we are going to learn
•  Uses of logistic regression model
•  Probability, odds, logit
•  Estimation and interpretation of parameters
Consider a case-control study
              Lung cancer   Controls
Smokers               647        622
Non-smokers             2         27
R Doll and B Hill. BMJ 1950; ii:739-748
•  How can we show the association between smoking
and lung cancer risk?
Risk factors for fracture: prospective study
id sex fx durfx age wt ht bmi Tscores fnbmd lsbmd fall priorfx death
3 M 0 0.55 73 98 175 32 0.33 1.08 1.458 1 0 1
8 F 0 15.38 68 72 166 26 -0.25 0.97 1.325 0 0 0
9 M 0 5.06 68 87 184 26 -0.25 1.01 1.494 0 0 1
10 F 0 14.25 62 72 173 24 -1.33 0.84 1.214 0 0 0
23 M 0 15.07 61 72 173 24 -1.92 0.81 1.144 0 0 0
24 F 0 12.3 76 57 156 23 -2.17 0.74 0.98 1 0 1
26 M 0 11.47 63 97 173 32 -0.25 1.01 1.376 1 0 1
27 F 0 15.13 64 85 167 30 -1.17 0.86 1.073 0 0 0
28 F 0 15.08 76 48 153 21 -2.92 0.65 0.874 0 0 0
29 F 0 14.72 64 89 166 32 -0.17 0.98 1.088 0 0 0
32 F 0 14.92 60 105 165 39 -0.33 0.96 1.154 3 0 0
33 F 0 14.67 75 52 156 21 -1.42 0.83 0.852 0 0 0
34 F 1 1.64 75 70 160 27 -1.75 0.79 1.186 0 0 0
36 M 0 15.32 62 97 171 33 1 1.16 1.441 0 0 0
37 F 0 15.32 60 60 161 23 -1.75 0.79 0.909 0 0 0
•  Dubbo Osteoporosis Epidemiology Study
•  Question: what are the predictors of fracture risk?
Uses of logistic regression
•  To describe relationships between outcome
(dependent variable) and risk factors (independent
variables)
•  Controlling for confounders
•  Developing prognostic models
Logistic regression model
Professor David R. Cox
Imperial College, London
1970
Some examples of logistic regression
ARTICLE
Identification of undiagnosed type 2 diabetes by systolic
blood pressure and waist-to-hip ratio
M. T. T. Ta & K. T. Nguyen & N. D. Nguyen &
L. V. Campbell & T. V. Nguyen
Received: 3 May 2010 /Accepted: 11 June 2010 /Published online: 2 July 2010
# Springer-Verlag 2010
Abstract
Aims/hypothesis We estimated the current prevalence of
type 2 diabetes in the Vietnamese population and developed
simple diagnostic models for identifying individuals at high
risk of undiagnosed type 2 diabetes.
Methods The study was designed as a cross-sectional
investigation with 721 men and 1,421 women, who were
aged between 30 and 72 years and were randomly sampled
from Ho Chi Minh City (formerly Saigon) in Vietnam. A
75 g oral glucose tolerance test to assess fasting and 2 h
Results The prevalence of type 2 diabetes was
men and 11.7% in women. Higher WHR a
pressure were independently associated with a g
of type 2 diabetes. Compared with participan
central obesity and hypertension, the odds of dia
increased by 6.4-fold (95% CI 3.2–13.0) in me
fold (2.2–7.6) in women with central obesity and
sion. Two nomograms were developed that he
men and women at high risk of type 2 diabetes.
Conclusions/interpretation The current prevalenc
DOI 10.1007/s00125-010-1841-6
Table 2 Association between risk factor and type 2 diabetes: univariate logistic regression analysis

                          Comparison  Men                           Women
Risk factor               unit (a)    OR (95% CI)       c statistic OR (95% CI)       c statistic
Age (years)               5           1.28 (1.05–1.56)  0.58        1.19 (1.05–1.36)  0.56
Weight (kg)               10          1.57 (1.26–1.96)  0.64        1.53 (1.30–1.81)  0.61
Waist circumference (cm)  10          1.89 (1.48–2.40)  0.69        1.60 (1.37–1.86)  0.63
WHR                       0.07        2.54 (1.85–3.50)  0.71        1.72 (1.46–2.03)  0.64
Lean mass (kg)            7           1.46 (1.08–1.96)  0.59        1.36 (1.00–1.85)  0.55
Fat mass (kg)             7           1.84 (1.43–2.38)  0.66        1.60 (1.36–1.88)  0.62
Per cent body fat         10          2.29 (1.61–3.28)  0.66        2.01 (1.54–2.65)  0.62
Abdominal fat (kg)        4           1.77 (1.38–2.27)  0.65        1.58 (1.35–1.84)  0.63
Systolic BP (mmHg)        20          1.62 (1.32–2.00)  0.65        1.50 (1.31–1.73)  0.63
Diastolic BP (mmHg)       12          1.44 (1.16–1.79)  0.62        1.40 (1.21–1.61)  0.61

(a) The comparison unit was set to be close to the standard deviation of each risk factor
Diabetologia (2010) 53:2139–2146
Some examples of logistic regression
•  “This study identified behavioral and psychosocial/interpersonal
factors in young adolescence that are associated with handgun
carrying in later adolescence.”
When to use logistic regression?
•  Logistic regression:
–  outcome is a categorical variable (usually binary – yes/no)
–  risk factors are either continuous or categorical variables
•  Linear regression:
–  outcome is a continuous variable
–  risk factors are either continuous or categorical variables
Logistic regression and Odds
•  Linear regression models a continuous outcome directly
•  Logistic regression models the log odds of a binary outcome
Risk, probability and odds
•  Risk: probability (P) of an event [during a period]
•  Odds: ratio of probability of having an event to the
probability of not having the event
Odds = P / (1 – P)
•  One out of 5 patients suffers a stroke …
P = 1/5 = 0.20
Odds = 0.2 / 0.8 = 0.25, i.e. 1 to 4
Probability and odds
•  P = 1/5 = 0.2 or 20%
•  Odds = (P) / (1-P)
•  Odds = 0.2 / 0.8 = 0.25, or 1:4 (“one to four”)
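The conversion between probability and odds can be sketched in a few lines of R. The helper names `prob_to_odds` and `odds_to_prob` are made up for this illustration; they are not part of any package.

```r
# Convert a probability to odds and back (illustrative helper names).
prob_to_odds <- function(p) p / (1 - p)
odds_to_prob <- function(odds) odds / (1 + odds)

p <- 1 / 5                  # one out of 5 patients has the event
odds <- prob_to_odds(p)     # 0.2 / 0.8

odds                        # odds of the event, "1 to 4"
odds_to_prob(odds)          # back to the probability
prob_to_odds(0.5)           # probability 0.5 corresponds to odds of 1
```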
Probability, odds, and logit
•  Probability: from 0 to 1
•  Odds: continuous variable, from 0 to infinity
–  When Probability = 0.5, odds = 1
•  Logit = log odds
$$\operatorname{logit}(p) = \log\left(\frac{p}{1-p}\right)$$
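As a small sketch, the logit can be computed by hand and checked against base R, which provides it as `qlogis()` (with `plogis()` as the inverse). The function name `logit` below is defined here only for illustration.

```r
# logit = log odds of a probability; defined by hand for illustration.
logit <- function(p) log(p / (1 - p))

p <- c(0.1, 0.2, 0.5, 0.8, 0.9)
data.frame(p = p, odds = p / (1 - p), logit = logit(p))

# Base R equivalents: qlogis() is the logit, plogis() its inverse.
all.equal(logit(p), qlogis(p))
all.equal(plogis(logit(p)), p)
```

Probabilities below 0.5 give negative logits, 0.5 gives 0, and probabilities above 0.5 give positive logits.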
The logistic regression model
•  Let X be a risk factor
•  Let p be the probability of an event (outcome)
•  The logistic regression model is defined as:
$$\operatorname{logit}(p) = \alpha + \beta X$$
or
$$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
The logistic regression model
That also means:
$$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
$$p = \frac{e^{\alpha+\beta X}}{1+e^{\alpha+\beta X}}$$
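The step from the logit form to the probability form is plain algebra. Written out with the same symbols (p, α, β, X), it goes roughly as follows:

```latex
\begin{align*}
\log\left(\frac{p}{1-p}\right) &= \alpha + \beta X \\
\frac{p}{1-p} &= e^{\alpha+\beta X} \\
p &= (1-p)\, e^{\alpha+\beta X} \\
p \left(1 + e^{\alpha+\beta X}\right) &= e^{\alpha+\beta X} \\
p &= \frac{e^{\alpha+\beta X}}{1+e^{\alpha+\beta X}}
\end{align*}
```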
Relationship between X, p and logit(p)
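[Figure: Logistic Regression Model, two panels. Nonlinear form: P(x) plotted against x, on a vertical axis from 0 to 1. Linear form: log[P(x) / (1 - P(x))] plotted against x.]
$$P(x) = \frac{\exp(\beta_0+\beta_1 x)}{1+\exp(\beta_0+\beta_1 x)} \qquad \log\left[\frac{P(x)}{1-P(x)}\right] = \beta_0+\beta_1 x$$
$$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X \qquad p = \frac{e^{\alpha+\beta X}}{1+e^{\alpha+\beta X}}$$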
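The two forms can be compared numerically. This is only a sketch: the values of `alpha` and `beta` are hypothetical, chosen for illustration, and do not come from any dataset in this lecture.

```r
# Hypothetical coefficients, chosen only for illustration.
alpha <- -1
beta  <- 0.8

x <- seq(-6, 8, by = 2)
p <- plogis(alpha + beta * x)   # nonlinear form: probability
logit_p <- qlogis(p)            # linear form: log odds

round(data.frame(x, p, logit_p), 3)

diff(logit_p)   # equal steps: the log odds is linear in x
diff(p)         # unequal steps: the probability follows an S-shaped curve

# Optional plots of the two panels:
# curve(plogis(alpha + beta * x), from = -6, to = 8, ylab = "P(x)")
# curve(alpha + beta * x, from = -6, to = 8, ylab = "log odds")
```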
Meaning of logistic regression parameters
•  α is the log odds of the outcome for X = 0
•  β is the log odds ratio associated with a unit increase
in X
•  Odds ratio = exp(β)
$$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
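Why exp(β) is an odds ratio can be seen by comparing the odds at two values of X that differ by one unit. A short worked derivation, using only the model above:

```latex
\begin{align*}
\text{odds}(X = x)   &= e^{\alpha + \beta x} \\
\text{odds}(X = x+1) &= e^{\alpha + \beta (x+1)} \\
\text{OR} = \frac{\text{odds}(X = x+1)}{\text{odds}(X = x)}
  &= \frac{e^{\alpha + \beta x + \beta}}{e^{\alpha + \beta x}} = e^{\beta}
\end{align*}
```

The ratio does not depend on x, so under this model the odds ratio per unit increase is the same at every value of X.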
Assumptions of logistic regression model
•  Model provides an appropriate representation for the
dependence of outcome probability on predictor(s)
•  Outcomes are independent
•  Predictors measured without error
Advantages of logistic regression model
•  Outcome probability changes smoothly with
increasing values of predictor, valid for arbitrary
predictor values
•  Coefficients are interpreted as log odds ratios
•  Can be applied to a range of study designs (including
case-control)
•  Software widely available
Analysis of a case-control study
Consider a case-control study
              Lung cancer   Controls
Smokers               647        622
Non-smokers             2         27
R Doll and B Hill. BMJ 1950; ii:739-748
Manual calculation of odds ratio
              Disease   No disease
Risk +ve         a          b
Risk –ve         c          d

$$OR = \frac{ad}{bc}$$
$$LOR = \log(OR)$$
$$SE(LOR) = \sqrt{\frac{1}{a} + \frac{1}{b} + \frac{1}{c} + \frac{1}{d}}$$
$$95\%\ CI(LOR) = LOR \pm 1.96 \times SE(LOR)$$
$$95\%\ CI(OR) = e^{LOR \pm 1.96 \times SE(LOR)}$$

              Lung cancer   Controls
Smoking               647        622
No smoking              2         27

$$OR = \frac{647 \times 27}{622 \times 2} = 14.04$$
$$LOR = \log(14.04) = 2.64$$
$$SE(LOR) = \sqrt{\frac{1}{647} + \frac{1}{622} + \frac{1}{2} + \frac{1}{27}} = 0.735$$
$$95\%\ CI(LOR) = 2.642 \pm 1.96 \times 0.735$$
$$95\%\ CI(OR) = e^{2.64 \pm 1.96 \times 0.735}$$
= 3.33 to 59.30 (using the unrounded LOR = 2.6421 and SE = 0.7350)
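The manual calculation can be sketched as a small R function. The function name `or_2x2` and its argument names are made up for this illustration; the cell counts are those of the smoking and lung cancer table.

```r
# Odds ratio and Wald confidence interval from a 2x2 table.
# Arguments (illustrative names): exposed cases (a), exposed controls (b),
# unexposed cases (c), unexposed controls (d).
or_2x2 <- function(exp_case, exp_ctrl, unexp_case, unexp_ctrl, level = 0.95) {
  z   <- qnorm(1 - (1 - level) / 2)                    # about 1.96 for 95%
  lor <- log((exp_case * unexp_ctrl) / (exp_ctrl * unexp_case))
  se  <- sqrt(1 / exp_case + 1 / exp_ctrl + 1 / unexp_case + 1 / unexp_ctrl)
  ci  <- exp(lor + c(-1, 1) * z * se)
  c(OR = exp(lor), LOR = lor, SE = se, lower = ci[1], upper = ci[2])
}

res_manual <- or_2x2(exp_case = 647, exp_ctrl = 622,
                     unexp_case = 2, unexp_ctrl = 27)
round(res_manual, 3)   # OR about 14.04, LOR about 2.64, SE about 0.735
```

The interval is wide because the standard error is dominated by the smallest cell (only 2 non-smoking cases).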
Analysis by logistic regression model
•  p = probability of cancer (outcome coded 0 = No cancer, 1 = Cancer)
•  X = smoking status (0 = No, 1 = Yes)
•  Logistic regression model
$$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
•  We want to estimate α and β
R codes
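noyes = c(1, 0) # labels for the two levels: 1 = yes, 0 = no
smoking = relevel(gl(2, 1, 4, noyes), ref = "0") # smoking: 1, 0, 1, 0; reference = non-smoker
cancer = relevel(gl(2, 2, 4, noyes), ref = "0") # cancer: 1, 1, 0, 0; reference = control, so P(cancer) is modelled
ntotal = c(647, 2, 622, 27) # actual number of patients in each cell
res = glm(cancer ~ smoking, family=binomial, weights=ntotal) # cell counts used as frequency weights
summary(res)

              Lung cancer   Controls
Smoking               647        622
No smoking              2         27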
R codes (longer way)
cancer = c(1, 1, 0, 0) # outcome: 1 = lung cancer, 0 = control
smoking = c(1, 0, 1, 0) # exposure: 1 = smoker, 0 = non-smoker
ntotal = c(647, 2, 622, 27) # actual number of patients in each cell
res = glm(cancer ~ smoking, family=binomial, weights=ntotal)
summary(res)
R codes (rms package)
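library(rms) # provides lrm()
cancer = c(1, 1, 0, 0) # outcome: 1 = lung cancer, 0 = control
smoking = c(1, 0, 1, 0) # exposure: 1 = smoker, 0 = non-smoker
ntotal = c(647, 2, 622, 27) # actual number of patients in each cell
res = lrm(cancer ~ smoking, weights=ntotal)
print(res) # summary(res) would first need datadist() to be set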
R results
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.6027 0.7320 -3.556 0.000377 ***
smoking 2.6421 0.7341 3.599 0.000319 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
(Dispersion parameter for binomial family taken to be 1)
Null deviance: 1799.4 on 3 degrees of freedom
Residual deviance: 1773.3 on 2 degrees of freedom
AIC: 1777.3
R results
Coefficients:
Estimate Std. Error z value Pr(>|z|)
(Intercept) -2.6027 0.7320 -3.556 0.000377 ***
smoking 2.6421 0.7341 3.599 0.000319 ***
•  The model is:
$$\log\left(\frac{p}{1-p}\right) = -2.60 + 2.64 \times \text{smoking}$$
•  Note that the coefficient for smoking is 2.64 (the same as in the manual calculation)
•  That is, log(odds ratio) = 2.64
•  Odds ratio = exp(2.64) = 14.01 (14.04 with the unrounded coefficient, 2.6421)
Calculating odds ratio (OR)
cancer = c(1, 1, 0, 0)
smoking = c(1, 0, 1, 0)
ntotal = c(647, 2, 622, 27) # actual number of patients
res = glm(cancer ~ smoking, family=binomial, weights=ntotal)
library(epicalc)
logistic.display(res)
Calculating odds ratio (OR) and 95% CI
> logistic.display(res)
Logistic regression predicting cancer
                  OR(95%CI)           P(Wald's test)   P(LR-test)
smoking: 1 vs 0   14.04 (3.33,59.2)   < 0.001          < 0.001
Log-likelihood = -886.6352
No. of observations = 4
AIC value = 1777.2704
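In case the epicalc package is not available in your R installation, a similar table can be sketched with base R only. This is an alternative, not the method used on the slide: it exponentiates the coefficients and their Wald confidence limits from `confint.default()`.

```r
cancer  <- c(1, 1, 0, 0)          # 1 = lung cancer, 0 = control
smoking <- c(1, 0, 1, 0)          # 1 = smoker, 0 = non-smoker
ntotal  <- c(647, 2, 622, 27)     # number of patients in each cell

res <- glm(cancer ~ smoking, family = binomial, weights = ntotal)

# Wald 95% CI on the log odds scale, then back-transform to odds ratios.
or_table <- exp(cbind(OR = coef(res), confint.default(res)))
round(or_table, 2)
```

Only the `smoking` row is an odds ratio. The `(Intercept)` row is the odds of being a case among non-smokers in this sample, which in a case-control study reflects the sampling design rather than a population risk.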
Analysis of raw data
Formal description of logistic regression
•  Let Y be a binary response variable
–  Yi = 1 if the trait is present in observation (person, unit,
etc...) i
–  Yi = 0 if the trait is NOT present in observation i
•  Let X = (X1, X2, ..., Xk) be a set of explanatory variables
which can be discrete, continuous, or a combination.
xi is the observed value of the explanatory variables
for observation i.
Formal description of logistic regression
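•  The logistic regression model (shown for a single explanatory variable) is:
$$\pi_i = \Pr(Y_i = 1 \mid X_i = x_i) = \frac{\exp(\beta_0 + \beta_1 x_i)}{1+\exp(\beta_0 + \beta_1 x_i)}$$
•  Or, in logit expression, with several explanatory variables:
$$\operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots$$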
Assumptions of logistic regression
•  The data Y1, Y2, ..., Yn are independently distributed
•  Distribution of Yi is Bin(ni, πi), i.e., binary logistic
regression model assumes binomial distribution of
the response
•  Linear relationship between the logit of the response
probability and the explanatory variables: logit(π) = β0
+ βX.
•  The homogeneity of variance does NOT need to be
satisfied
•  Errors need to be independent but NOT normally
distributed
Assessment of goodness-of-fit
•  Overall goodness-of-fit statistics of the model;
•  Pearson chi-square statistic, χ²
•  Deviance, G²
•  Likelihood ratio test, and statistic, ΔG²
•  Hosmer-Lemeshow test and statistic
•  Residual analysis: Pearson, deviance, adjusted
residuals, etc
•  Overdispersion
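Several of these statistics can be obtained from a `glm` fit with base R. The sketch below uses simulated data (the sample size and the coefficients -1 and 0.8 are arbitrary assumptions), and it does not cover the Hosmer-Lemeshow test, which needs an add-on package or a manual grouping step.

```r
set.seed(1)
n <- 500
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-1 + 0.8 * x))   # simulated binary outcome

fit0 <- glm(y ~ 1, family = binomial)   # null model (intercept only)
fit1 <- glm(y ~ x, family = binomial)   # model with the predictor

G2  <- deviance(fit1)                              # deviance, G2
X2  <- sum(residuals(fit1, type = "pearson")^2)    # Pearson chi-square
dG2 <- deviance(fit0) - deviance(fit1)             # likelihood ratio statistic
p_lr <- pchisq(dG2, df = 1, lower.tail = FALSE)    # LR test p-value, 1 d.f.

c(G2 = G2, X2 = X2, dG2 = dG2, p_lr = p_lr)
anova(fit0, fit1, test = "Chisq")                  # the same LR test via anova()

# Rough overdispersion check; informative mainly for grouped (binomial count) data.
X2 / df.residual(fit1)
```

With one row per subject (0/1 outcomes), G2 and X2 are not reliable as overall goodness-of-fit chi-square tests; the likelihood ratio comparison between nested models remains valid.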
Parameter estimation
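•  The maximum likelihood estimator (MLE) for (β0, β1)
is obtained by finding the values (β̂0, β̂1) that maximize
$$L(\beta_0,\beta_1) = \prod_{i=1}^{N} \pi_i^{y_i} (1-\pi_i)^{n_i-y_i} = \prod_{i=1}^{N} \frac{\exp\{y_i(\beta_0+\beta_1 x_i)\}}{\left[1+\exp(\beta_0+\beta_1 x_i)\right]^{n_i}}$$
(for a binary outcome, n_i = 1 for every observation)
•  This is implemented in the R functions “glm” and “lrm” (the latter in the rms package)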
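To make the idea of maximizing the likelihood concrete, here is a minimal sketch that maximizes the log-likelihood directly with `optim()` and compares the result with `glm()`. The data are simulated and the true coefficients (-0.5 and 1.2) are arbitrary assumptions; `glm()` itself uses iteratively reweighted least squares rather than `optim()`.

```r
set.seed(42)
n <- 300
x <- rnorm(n)
y <- rbinom(n, size = 1, prob = plogis(-0.5 + 1.2 * x))   # simulated binary data

# Negative log-likelihood for binary data (n_i = 1):
# log L = sum( y_i * eta_i - log(1 + exp(eta_i)) ), with eta_i = b0 + b1 * x_i
negloglik <- function(beta) {
  eta <- beta[1] + beta[2] * x
  -sum(y * eta - log1p(exp(eta)))
}

opt <- optim(c(0, 0), negloglik, method = "BFGS")   # general-purpose optimizer
fit <- glm(y ~ x, family = binomial)                # standard fit

rbind(optim = opt$par, glm = coef(fit))             # the two rows should nearly agree
```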
Function glm in R
•  General format
res= glm(outcome ~ riskfactor, family=binomial)
•  outcome has values (0, 1)
•  riskfactor can be a continuous or a categorical variable
•  To get odds ratio and 95% CI
library(epicalc)
logistic.display(res)
Function glm in R
•  To get goodness of fit of a model, use rms package
library(rms)
res = lrm(outcome ~ riskfactor)
print(res) # fit statistics (LR chi2, R2, C, ...); summary(res) needs datadist() to be set first
An example of analysis: fracture data
id sex fx durfx age wt ht bmi Tscores fnbmd lsbmd fall priorfx death
3 M 0 0.55 73 98 175 32 0.33 1.08 1.458 1 0 1
8 F 0 15.38 68 72 166 26 -0.25 0.97 1.325 0 0 0
9 M 0 5.06 68 87 184 26 -0.25 1.01 1.494 0 0 1
10 F 0 14.25 62 72 173 24 -1.33 0.84 1.214 0 0 0
23 M 0 15.07 61 72 173 24 -1.92 0.81 1.144 0 0 0
24 F 0 12.3 76 57 156 23 -2.17 0.74 0.98 1 0 1
26 M 0 11.47 63 97 173 32 -0.25 1.01 1.376 1 0 1
27 F 0 15.13 64 85 167 30 -1.17 0.86 1.073 0 0 0
28 F 0 15.08 76 48 153 21 -2.92 0.65 0.874 0 0 0
29 F 0 14.72 64 89 166 32 -0.17 0.98 1.088 0 0 0
32 F 0 14.92 60 105 165 39 -0.33 0.96 1.154 3 0 0
33 F 0 14.67 75 52 156 21 -1.42 0.83 0.852 0 0 0
34 F 1 1.64 75 70 160 27 -1.75 0.79 1.186 0 0 0
36 M 0 15.32 62 97 171 33 1 1.16 1.441 0 0 0
37 F 0 15.32 60 60 161 23 -1.75 0.79 0.909 0 0 0
•  Filename: fracture.csv
•  Question: what are the effects of age, weight and sex on
fracture risk?
R analysis
setwd("/Users/tuannguyen/Documents/_Vietnam2012/Can Tho /Datasets") # can also use file.choose()
fract = read.csv("fracture.csv", na.strings=".", header=TRUE) # "." marks missing values
attach(fract)
names(fract)
library(rms)
dat = datadist(fract)
options(datadist="dat")
res = lrm(fx ~ sex)
summary(res)
Effect of sex on fracture risk
> res = lrm(fx ~ sex)
> summary(res)
Effects Response : fx
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
sex - M:F 1 2 NA -0.78 0.11 -0.99 -0.57
Odds Ratio 1 2 NA 0.46 NA 0.37 0.57
•  Men had lower ODDS of fracture than women (OR
0.46; 95% CI: 0.37 to 0.57)
More on R output …
> res
Model Likelihood Discrimination Rank Discrim.
Ratio Test Indexes Indexes
Obs 2216 LR chi2 55.76 R2 0.036 C 0.586
0 1641 d.f. 1 g 0.369 Dxy 0.173
1 575 Pr(> chi2) <0.0001 gr 1.446 gamma 0.370
max |deriv| 1e-11 gp 0.066 tau-a 0.066
Brier 0.187
Coef S.E. Wald Z Pr(>|Z|)
Intercept -0.7829 0.0585 -13.39 <0.0001
sex=M -0.7770 0.1074 -7.23 <0.0001
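The coefficients in this output can be turned into an odds ratio and into predicted probabilities by hand. The sketch below copies the numbers from the printed output rather than refitting the model, so it runs without the fracture dataset.

```r
# Values copied from the lrm output above.
b0     <- -0.7829   # Intercept: log odds of fracture for women (reference level F)
b_men  <- -0.7770   # sex=M: log odds ratio, men vs women
se_men <- 0.1074    # standard error of the sex=M coefficient

or_men_vs_women <- exp(b_men)                                 # about 0.46
ci_or <- exp(b_men + c(-1, 1) * qnorm(0.975) * se_men)        # about 0.37 to 0.57

p_women <- plogis(b0)           # predicted probability of fracture, women
p_men   <- plogis(b0 + b_men)   # predicted probability of fracture, men

round(c(OR = or_men_vs_women, lower = ci_or[1], upper = ci_or[2],
        p_women = p_women, p_men = p_men), 3)
```

The odds ratio (about 0.46) is not the same as the ratio of the two probabilities; the two only get close when the outcome is rare.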
Effect of bone mineral density on fracture risk
•  Bone mineral density measured at the
femoral neck (fnbmd)
•  Values: 0.28 to 1.51 g/cm²
•  Lower FNBMD increases the risk of
fracture
•  We want to estimate the odds ratio of
fracture associated with FNBMD
R analysis
> res = lrm(fx ~ fnbmd)
> summary(res)
Effects Response : fx
Factor Low High Diff. Effect S.E. Lower 0.95 Upper 0.95
fnbmd 0.73 0.93 0.2 -0.96 0.08 -1.11 -0.81
Odds Ratio 0.73 0.93 0.2 0.38 NA 0.33 0.45
•  Each 0.2 g/cm² increase in FNBMD (from 0.73 to 0.93, the
quartiles that summary() compares by default) is associated with
a 62% reduction in the odds of fracture (OR 0.38; 95% CI 0.33 to 0.45)
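An odds ratio for a continuous predictor always refers to a specific comparison unit, here the 0.2 g/cm² difference shown in the Diff. column. A small sketch of how to rescale it to other units, using only the numbers printed in the output (the function name `or_for_change` is made up for this illustration):

```r
# Values copied from the summary output above.
effect <- -0.96   # log odds ratio for the Low-to-High comparison
delta0 <- 0.2     # size of that comparison in g/cm2 (column "Diff.")

beta_per_unit <- effect / delta0   # log odds ratio per 1 g/cm2

# Odds ratio for any change in FNBMD (illustrative helper).
or_for_change <- function(delta) exp(beta_per_unit * delta)

or_for_change(0.2)        # about 0.38, as reported in the output
1 - or_for_change(0.2)    # about 0.62: a 62% reduction in the odds
or_for_change(-0.1)       # about 1.62: odds ratio for a 0.1 g/cm2 lower FNBMD
```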
Summary
•  Logistic regression model is very useful for
–  Describing the relationship between an outcome and risk
factors
–  Developing prognostic models in medicine
•  Logistic regression model is applied when
–  Outcome is a categorical variable (usually binary)
•  Logistic regression model is applicable to many study
designs, including case-control studies

  • 1. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Introduction to logistic regression Tuan V. Nguyen Professor and NHMRC Senior Research Fellow Garvan Institute of Medical Research University of New South Wales Sydney, Australia
  • 2. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 What we are going to learn •  Uses of logistic regression model •  Probability, odds, logit •  Estimation and interpretation of parameters
  • 3. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Consider a case-control study Lung Cancer Controls Smokers 647 622 Non-smokers 2 27 R Doll and B Hill. BMJ 1950; ii:739-748 •  How can we show the association between smoking and lung cancer risk?
  • 4. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Risk factors for fracture: prospective study id sex fx durfx age wt ht bmi Tscores fnbmd lsbmd fall priorfx death 3 M 0 0.55 73 98 175 32 0.33 1.08 1.458 1 0 1 8 F 0 15.38 68 72 166 26 -0.25 0.97 1.325 0 0 0 9 M 0 5.06 68 87 184 26 -0.25 1.01 1.494 0 0 1 10 F 0 14.25 62 72 173 24 -1.33 0.84 1.214 0 0 0 23 M 0 15.07 61 72 173 24 -1.92 0.81 1.144 0 0 0 24 F 0 12.3 76 57 156 23 -2.17 0.74 0.98 1 0 1 26 M 0 11.47 63 97 173 32 -0.25 1.01 1.376 1 0 1 27 F 0 15.13 64 85 167 30 -1.17 0.86 1.073 0 0 0 28 F 0 15.08 76 48 153 21 -2.92 0.65 0.874 0 0 0 29 F 0 14.72 64 89 166 32 -0.17 0.98 1.088 0 0 0 32 F 0 14.92 60 105 165 39 -0.33 0.96 1.154 3 0 0 33 F 0 14.67 75 52 156 21 -1.42 0.83 0.852 0 0 0 34 F 1 1.64 75 70 160 27 -1.75 0.79 1.186 0 0 0 36 M 0 15.32 62 97 171 33 1 1.16 1.441 0 0 0 37 F 0 15.32 60 60 161 23 -1.75 0.79 0.909 0 0 0 •  Dubbo Osteoporosis Epidemiology Study •  Question: what are predictors of fracture risk
  • 5. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Uses of logistic regression •  To describe relationships between outcome (dependent variable) and risk factors (independent variables) •  Controlling for confounders •  Developing prognostic models
  • 6. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Logistic regression model Professor David R. Cox Imperial College, London 1970
  • 7. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Some examples of logistic regression ARTICLE Identification of undiagnosed type 2 diabetes by systolic blood pressure and waist-to-hip ratio M. T. T. Ta & K. T. Nguyen & N. D. Nguyen & L. V. Campbell & T. V. Nguyen Received: 3 May 2010 /Accepted: 11 June 2010 /Published online: 2 July 2010 # Springer-Verlag 2010 Abstract Aims/hypothesis We estimated the current prevalence of type 2 diabetes in the Vietnamese population and developed simple diagnostic models for identifying individuals at high risk of undiagnosed type 2 diabetes. Methods The study was designed as a cross-sectional investigation with 721 men and 1,421 women, who were aged between 30 and 72 years and were randomly sampled from Ho Chi Minh City (formerly Saigon) in Vietnam. A 75 g oral glucose tolerance test to assess fasting and 2 h Results The prevalence of type 2 diabetes was men and 11.7% in women. Higher WHR a pressure were independently associated with a g of type 2 diabetes. Compared with participan central obesity and hypertension, the odds of dia increased by 6.4-fold (95% CI 3.2–13.0) in me fold (2.2–7.6) in women with central obesity and sion. Two nomograms were developed that he men and women at high risk of type 2 diabetes. 
Conclusions/interpretation The current prevalenc DOI 10.1007/s00125-010-1841-6 Table 2 Association between risk factor and type 2 diabetes: univariate logistic regression analysis Risk factor Comparison unita Men Women OR (95% CI) c statistic OR (95% CI) c statistic Age (years) 5 1.28 (1.05–1.56) 0.58 1.19 (1.05–1.36) 0.56 Weight (kg) 10 1.57 (1.26–1.96) 0.64 1.53 (1.30–1.81) 0.61 Waist circumference (cm) 10 1.89 (1.48–2.40) 0.69 1.60 (1.37–1.86) 0.63 WHR 0.07 2.54 (1.85–3.50) 0.71 1.72 (1.46–2.03) 0.64 Lean mass (kg) 7 1.46 (1.08–1.96) 0.59 1.36 (1.00–1.85) 0.55 Fat mass (kg) 7 1.84 (1.43–2.38) 0.66 1.60 (1.36–1.88) 0.62 Per cent body fat 10 2.29 (1.61–3.28) 0.66 2.01 (1.54–2.65) 0.62 Abdominal fat (kg) 4 1.77 (1.38–2.27) 0.65 1.58 (1.35–1.84) 0.63 Systolic BP (mmHg) 20 1.62 (1.32–2.00) 0.65 1.50 (1.31–1.73) 0.63 Diastolic BP (mmHg) 12 1.44 (1.16–1.79) 0.62 1.40 (1.21–1.61) 0.61 a The comparison unit was set to be close to the standard deviation of each risk factor Diabetologia (2010) 53:2139–2146 2143
  • 8. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Some examples of logistic regression •  “This study identified behavioral and psychosocial/ interpersonal factors in young adolescence that are associated with handgun carrying in later adolescence.” Results
  • 9. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 When to use logistic regression? •  Logistic regression: –  outcome is a categorical variable (usually binary – yes/no) –  risk factors are either continuous or categorical variables •  Linear regression: –  outcome is a continuous variable –  risk factors are either continuous or categorical variables
  • 10. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Logistic regression and Odds •  Linear regression works on continuous data •  Logistic regression works on odds of an outcome
  • 11. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Risk, probability and odds •  Risk: probability (P) of an event [during a period] •  Odds: ratio of probability of having an event to the probability of not having the event Odds = P / (1 – P) •  One out of 5 patients suffer a stroke … P = 1/ 5 = 0.20 Odds = 0.2 / 0.8 = 1 to 4
  • 12. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Probability and odds •  P = 1/5 = 0.2 or 20% •  Odds = (P) / (1-P) •  Odds = 0.2 / 0.8 or 1:4 or “one to four”
  • 13. Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012 Probability, odds, and logit •  Probability: from 0 to 1 •  Odds: continuous variable –  When Probability = 0.5, odds = 1 •  Logit = log odds logit p ( )= log p 1− p " # $ % & '
  • 14. The logistic regression model
      •  Let X be a risk factor
      •  Let p be the probability of an event (outcome)
      •  The logistic regression model is defined as:
          $$\operatorname{logit}(p) = \alpha + \beta X$$
          or
          $$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
  • 15. The logistic regression model
      That also means:
          $$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
          $$p = \frac{e^{\alpha+\beta X}}{1+e^{\alpha+\beta X}}$$
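The two forms of the model are inverses of each other, which can be verified numerically. The sketch below uses made-up values for α and β (they are assumptions for illustration, not estimates from any data) and compares the hand-written formulas with base R's `plogis()` and `qlogis()`.

```r
# Hypothetical parameter values, chosen only to illustrate the algebra
alpha <- -2
beta  <- 0.5
x     <- c(0, 2, 4, 6)

eta <- alpha + beta * x            # linear predictor: alpha + beta * X
p   <- exp(eta) / (1 + exp(eta))   # nonlinear form: probability
logit_p <- log(p / (1 - p))        # linear form: log odds

# Base R provides the same transformations
same_prob  <- isTRUE(all.equal(p, plogis(eta)))
same_logit <- isTRUE(all.equal(logit_p, qlogis(p)))
back_to_eta <- isTRUE(all.equal(logit_p, eta))

print(data.frame(x, eta, p, logit_p))
```

Taking the logit of the probabilities returns the linear predictor, so the probability scale is S-shaped in x while the logit scale is a straight line.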
  • 16. Relationship between X, p and logit(p)
      [Figure: two panels, P(x) plotted against x (nonlinear form) and log[P(x) / (1 − P(x))] plotted against x (linear form)]
      Linear form:
          $$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
      Nonlinear form:
          $$p = \frac{e^{\alpha+\beta X}}{1+e^{\alpha+\beta X}}$$
  • 17. Meaning of logistic regression parameters
      •  α is the log odds of the outcome for X = 0
      •  β is the log odds ratio associated with a unit increase in X
      •  Odds ratio = exp(β)
          $$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
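The reason exp(β) is an odds ratio follows directly from the model. Writing odds(x) for the odds of the outcome at X = x (notation introduced here for illustration), a one-unit increase in X gives:

$$\log \text{odds}(x) = \alpha + \beta x, \qquad \log \text{odds}(x+1) = \alpha + \beta (x+1)$$

$$\log \text{odds}(x+1) - \log \text{odds}(x) = \beta \quad\Longrightarrow\quad \frac{\text{odds}(x+1)}{\text{odds}(x)} = e^{\beta}$$

The ratio does not depend on x, so a single odds ratio summarises the effect of X across its whole range. Setting x = 0 in the first expression shows that α is the log odds when X = 0.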
  • 18. Assumptions of logistic regression model
      •  Model provides an appropriate representation for the dependence of outcome probability on predictor(s)
      •  Outcomes are independent
      •  Predictors measured without error
  • 19. Advantages of logistic regression model
      •  Outcome probability changes smoothly with increasing values of predictor, valid for arbitrary predictor values
      •  Coefficients are interpreted as log odds ratios
      •  Can be applied to a range of study designs (including case-control)
      •  Software widely available
  • 20. Analysis of case control study
  • 21. Consider a case-control study
                       Lung cancer   Controls
      Smokers              647          622
      Non-smokers            2           27
      R Doll and B Hill. BMJ 1950; ii:739-748
  • 22. Manual calculation of odds ratio
                       Disease   No disease
      Risk +ve            a          b
      Risk –ve            c          d
      General formulas:
          $$OR = \frac{ad}{bc}$$
          $$LOR = \log(OR)$$
          $$SE(LOR) = \sqrt{\frac{1}{a}+\frac{1}{b}+\frac{1}{c}+\frac{1}{d}}$$
          $$95\%\ CI(LOR) = LOR \pm 1.96 \times SE(LOR)$$
          $$95\%\ CI(OR) = e^{LOR \pm 1.96 \times SE(LOR)}$$
                       Lung cancer   Control
      Smoking              647         622
      No smoking             2          27
      Applied to the data:
          $$OR = \frac{647 \times 27}{622 \times 2} = 14.04$$
          $$LOR = \log(14.04) = 2.64$$
          $$SE(LOR) = \sqrt{\frac{1}{647}+\frac{1}{622}+\frac{1}{2}+\frac{1}{27}} = 0.735$$
          $$95\%\ CI(LOR) = 2.642 \pm 1.96 \times 0.735$$
          $$95\%\ CI(OR) = e^{2.64 \pm 1.96 \times 0.735} = 3.32 \text{ to } 59.18$$
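The manual calculation can be reproduced in R without any modelling function. This is a sketch using the 2×2 counts from the slide; the variable names are chosen here for illustration, and `qnorm(0.975)` is used in place of the rounded 1.96.

```r
# Cell counts from the Doll and Hill table
n_a <- 647  # smokers with lung cancer
n_b <- 622  # smokers among controls
n_c <- 2    # non-smokers with lung cancer
n_d <- 27   # non-smokers among controls

or  <- (n_a * n_d) / (n_b * n_c)                  # odds ratio ad / bc
lor <- log(or)                                    # log odds ratio
se  <- sqrt(1 / n_a + 1 / n_b + 1 / n_c + 1 / n_d) # standard error of the log odds ratio

ci_lor <- lor + c(-1, 1) * qnorm(0.975) * se      # 95% CI on the log scale
ci_or  <- exp(ci_lor)                             # back-transform to the odds ratio scale

print(round(c(OR = or, LOR = lor, SE = se, lower = ci_or[1], upper = ci_or[2]), 3))
```

Because the code carries full precision through every step, the interval limits can differ in the second decimal from a hand calculation that rounds the log odds ratio first.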
  • 23. Analysis by logistic regression model
      •  p = probability of cancer (outcome coded 0 = No cancer, 1 = Cancer)
      •  X = smoking status (0 = No, 1 = Yes)
      •  Logistic regression model
          $$\log\left(\frac{p}{1-p}\right) = \alpha + \beta X$$
      •  We want to estimate α and β
  • 24. R codes
      noyes = c(1, 0)                 # labels for the two levels: 1 = yes, 0 = no
      smoking = gl(2, 1, 4, noyes)    # smoking: 1, 0, 1, 0
      cancer = gl(2, 2, 4, noyes)     # cancer: 1, 1, 0, 0
      ntotal = c(647, 2, 622, 27)     # actual number of patients in each cell
      # note: both factors have "1" as their first (reference) level, so the slope
      # matches the numeric coding on the next slide but the intercept does not
      res = glm(cancer ~ smoking, family=binomial, weights=ntotal)
      summary(res)
      Data:
                       Lung cancer   Control
      Smoking              647         622
      No smoking             2          27
  • 25. R codes (longer way)
      cancer = c(1, 1, 0, 0)          # 1 = lung cancer, 0 = control
      smoking = c(1, 0, 1, 0)         # 1 = smoker, 0 = non-smoker
      ntotal = c(647, 2, 622, 27)     # actual number of patients in each cell
      res = glm(cancer ~ smoking, family=binomial, weights=ntotal)
      summary(res)
      Data:
                       Lung cancer   Control
      Smoking              647         622
      No smoking             2          27
  • 26. R codes (rms package)
      library(rms)
      cancer = c(1, 1, 0, 0)          # 1 = lung cancer, 0 = control
      smoking = c(1, 0, 1, 0)         # 1 = smoker, 0 = non-smoker
      ntotal = c(647, 2, 622, 27)     # actual number of patients in each cell
      res = lrm(cancer ~ smoking, weights=ntotal)
      res                             # prints the coefficients; summary(res) also needs datadist() to be set
      Data:
                       Lung cancer   Control
      Smoking              647         622
      No smoking             2          27
  • 27. R results
      Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
      (Intercept)  -2.6027     0.7320  -3.556 0.000377 ***
      smoking       2.6421     0.7341   3.599 0.000319 ***
      ---
      Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
      (Dispersion parameter for binomial family taken to be 1)
          Null deviance: 1799.4  on 3  degrees of freedom
      Residual deviance: 1773.3  on 2  degrees of freedom
      AIC: 1777.3
  • 28. R results
      Coefficients:
                  Estimate Std. Error z value Pr(>|z|)
      (Intercept)  -2.6027     0.7320  -3.556 0.000377 ***
      smoking       2.6421     0.7341   3.599 0.000319 ***
      •  The model is:
          $$\log\left(\frac{p}{1-p}\right) = -2.60 + 2.64 \times \text{smoking}$$
      •  Note that the coefficient for smoking is 2.64 (the same as in the manual calculation)
      •  That is, log(odds ratio) = 2.64
      •  Odds ratio = exp(2.64) = 14.01
  • 29. Calculating odds ratio (OR)
      cancer = c(1, 1, 0, 0)          # 1 = lung cancer, 0 = control
      smoking = c(1, 0, 1, 0)         # 1 = smoker, 0 = non-smoker
      ntotal = c(647, 2, 622, 27)     # actual number of patients in each cell
      res = glm(cancer ~ smoking, family=binomial, weights=ntotal)
      library(epicalc)
      logistic.display(res)           # odds ratio with 95% CI
  • 30. Calculating odds ratio (OR) and 95% CI
      > logistic.display(res)
      Logistic regression predicting cancer
                        OR(95%CI)           P(Wald's test)  P(LR-test)
      smoking: 1 vs 0   14.04 (3.33,59.2)   < 0.001         < 0.001
      Log-likelihood = -886.6352
      No. of observations = 4
      AIC value = 1777.2704
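If the epicalc package is not available, the same kind of table can be assembled from base R. The following is a sketch, not the method used on the slides: it exponentiates the coefficients and the Wald limits returned by `confint.default()`, and the object name `or_table` is chosen here for illustration.

```r
cancer  <- c(1, 1, 0, 0)        # 1 = lung cancer, 0 = control
smoking <- c(1, 0, 1, 0)        # 1 = smoker, 0 = non-smoker
ntotal  <- c(647, 2, 622, 27)   # number of patients in each cell

res <- glm(cancer ~ smoking, family = binomial, weights = ntotal)

# Wald 95% limits are computed on the log-odds scale, then exponentiated
or_table <- exp(cbind(OR = coef(res), confint.default(res)))
print(round(or_table, 2))
```

Only the `smoking` row is an odds ratio. The exponentiated intercept is the odds of being a case among non-smokers in this sample, which in a case-control design reflects the sampling scheme rather than population risk.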
  • 31. Analysis of raw data
  • 32. Formal description of logistic regression
      •  Let Y be a binary response variable
          –  Yi = 1 if the trait is present in observation (person, unit, etc.) i
          –  Yi = 0 if the trait is NOT present in observation i
      •  Let X = (X1, X2, ..., Xk) be a set of explanatory variables, which can be discrete, continuous, or a combination. xi is the observed value of the explanatory variables for observation i.
  • 33. Formal description of logistic regression
      •  The logistic regression model is:
          $$\pi_i = \Pr(Y_i = 1 \mid X_i = x_i) = \frac{\exp(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots)}{1 + \exp(\beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots)}$$
      •  Or, in logit expression:
          $$\operatorname{logit}(\pi_i) = \log\left(\frac{\pi_i}{1-\pi_i}\right) = \beta_0 + \beta_1 x_{i1} + \beta_2 x_{i2} + \dots$$
  • 34. Assumptions of logistic regression
      •  The data Y1, Y2, ..., Yn are independently distributed
      •  Distribution of Yi is Bin(ni, πi), i.e., the binary logistic regression model assumes a binomial distribution of the response
      •  Linear relationship between the logit of the response probability and the explanatory variables: logit(π) = β0 + βX
      •  The homogeneity of variance does NOT need to be satisfied
      •  Errors need to be independent but NOT normally distributed
  • 35. Assessment of goodness-of-fit
      •  Overall goodness-of-fit statistics of the model:
          –  Pearson chi-square statistic, χ²
          –  Deviance, G²
          –  Likelihood ratio test and statistic, ΔG²
          –  Hosmer-Lemeshow test and statistic
      •  Residual analysis: Pearson, deviance, adjusted residuals, etc.
      •  Overdispersion
  • 36. Parameter estimation
      •  The maximum likelihood estimator (MLE) for (β0, β1) is obtained by finding the values of (β0, β1) that maximize
          $$L(\beta_0,\beta_1) = \prod_{i=1}^{N} \pi_i^{y_i}(1-\pi_i)^{n_i-y_i} = \prod_{i=1}^{N} \frac{\exp\{y_i(\beta_0+\beta_1 x_i)\}}{\{1+\exp(\beta_0+\beta_1 x_i)\}^{n_i}}$$
      •  This is implemented in the R functions “glm” and “lrm”
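To see what the fitting functions are doing, the likelihood can be maximized directly. The sketch below is an illustration, not how `glm` or `lrm` work internally (they use iterative algorithms of their own): it expands the smoking table into one row per person, so that each n_i = 1, and minimizes the negative log-likelihood with `optim()`. The function names `negloglik` and `neggrad` are made up for this example.

```r
# One row per person, reconstructed from the 2x2 table
counts  <- c(647, 2, 622, 27)
cancer  <- rep(c(1, 1, 0, 0), times = counts)
smoking <- rep(c(1, 0, 1, 0), times = counts)

# Negative log-likelihood for binary data (n_i = 1):
# log L = sum( y_i * eta_i - log(1 + exp(eta_i)) )
negloglik <- function(b) {
  eta <- b[1] + b[2] * smoking
  -sum(cancer * eta - log(1 + exp(eta)))
}

# Gradient of the negative log-likelihood
neggrad <- function(b) {
  p <- plogis(b[1] + b[2] * smoking)
  -c(sum(cancer - p), sum((cancer - p) * smoking))
}

fit <- optim(c(0, 0), negloglik, gr = neggrad, method = "BFGS")
print(round(fit$par, 3))   # estimates of (beta0, beta1)

# Same model fitted by glm, for comparison
print(round(coef(glm(cancer ~ smoking, family = binomial)), 3))
```

The two sets of estimates should agree closely with each other and with the coefficients reported on the R results slide.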
  • 37. Function glm in R
      •  General format
          res = glm(outcome ~ riskfactor, family=binomial)
      •  outcome has values (0, 1)
      •  riskfactor has any value
      •  To get odds ratio and 95% CI
          library(epicalc)
          logistic.display(res)
  • 38. Function lrm in R
      •  To get goodness of fit of a model, use the rms package
          library(rms)
          res = lrm(outcome ~ riskfactor)
          res             # prints the fit indexes (LR chi2, R2, C, ...)
          summary(res)    # effects and odds ratios; requires datadist() to be set first
  • 39. An example of analysis: fracture data
      id sex fx durfx age wt ht bmi Tscores fnbmd lsbmd fall priorfx death
      3 M 0 0.55 73 98 175 32 0.33 1.08 1.458 1 0 1
      8 F 0 15.38 68 72 166 26 -0.25 0.97 1.325 0 0 0
      9 M 0 5.06 68 87 184 26 -0.25 1.01 1.494 0 0 1
      10 F 0 14.25 62 72 173 24 -1.33 0.84 1.214 0 0 0
      23 M 0 15.07 61 72 173 24 -1.92 0.81 1.144 0 0 0
      24 F 0 12.3 76 57 156 23 -2.17 0.74 0.98 1 0 1
      26 M 0 11.47 63 97 173 32 -0.25 1.01 1.376 1 0 1
      27 F 0 15.13 64 85 167 30 -1.17 0.86 1.073 0 0 0
      28 F 0 15.08 76 48 153 21 -2.92 0.65 0.874 0 0 0
      29 F 0 14.72 64 89 166 32 -0.17 0.98 1.088 0 0 0
      32 F 0 14.92 60 105 165 39 -0.33 0.96 1.154 3 0 0
      33 F 0 14.67 75 52 156 21 -1.42 0.83 0.852 0 0 0
      34 F 1 1.64 75 70 160 27 -1.75 0.79 1.186 0 0 0
      36 M 0 15.32 62 97 171 33 1 1.16 1.441 0 0 0
      37 F 0 15.32 60 60 161 23 -1.75 0.79 0.909 0 0 0
      •  Filename: fracture.csv
      •  Question: what are the effects of age, weight, and sex on fracture risk?
  • 40. R analysis
      setwd("/Users/tuannguyen/Documents/_Vietnam2012/Can Tho /Datasets")
      # can also use file.choose()
      fract = read.csv("fracture.csv", na.strings=".", header=TRUE)  # "." marks missing values
      attach(fract)
      names(fract)
      library(rms)
      dat = datadist(fract)       # variable ranges needed by summary()
      options(datadist="dat")
      res = lrm(fx ~ sex)
      summary(res)
  • 41. Effect of sex on fracture risk
      > res = lrm(fx ~ sex)
      > summary(res)
                   Effects              Response : fx
       Factor       Low  High  Diff.  Effect  S.E.  Lower 0.95  Upper 0.95
       sex - M:F    1    2     NA     -0.78   0.11  -0.99       -0.57
        Odds Ratio  1    2     NA      0.46     NA   0.37        0.57
      •  Men had lower odds of fracture than women (OR 0.46; 95% CI: 0.37 to 0.57)
  • 42. More on R output …
      > res
                            Model Likelihood     Discrimination    Rank Discrim.
                               Ratio Test           Indexes           Indexes
      Obs          2216    LR chi2      55.76    R2       0.036    C       0.586
       0           1641    d.f.             1    g        0.369    Dxy     0.173
       1            575    Pr(> chi2) <0.0001    gr       1.446    gamma   0.370
      max |deriv| 1e-11                          gp       0.066    tau-a   0.066
                                                 Brier    0.187
                Coef    S.E.   Wald Z Pr(>|Z|)
      Intercept -0.7829 0.0585 -13.39 <0.0001
      sex=M     -0.7770 0.1074  -7.23 <0.0001
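The coefficients in this output can be turned back into odds and probabilities by hand. A short sketch, using the two estimates printed above (the variable names are chosen here for illustration):

```r
b0     <- -0.7829   # intercept: log odds of fracture for women (reference level)
b_male <- -0.7770   # sex=M: log odds ratio, men versus women

odds_women <- exp(b0)               # odds of fracture in women
odds_men   <- exp(b0 + b_male)      # odds of fracture in men
or_male    <- exp(b_male)           # odds ratio, men versus women

p_women <- plogis(b0)               # probability of fracture in women
p_men   <- plogis(b0 + b_male)      # probability of fracture in men

print(round(c(odds_women = odds_women, odds_men = odds_men, OR = or_male,
              p_women = p_women, p_men = p_men), 3))
```

The odds ratio of about 0.46 matches the summary output for sex. Because this is a prospective study, the fitted probabilities (roughly 0.31 for women and 0.17 for men) can be read as estimated risks over the follow-up period.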
  • 43. Effect of bone mineral density on fracture risk
      •  Bone mineral density measured at the femoral neck (fnbmd)
      •  Values: 0.28 to 1.51 g/cm²
      •  Lower FNBMD increases the risk of fracture
      •  We want to estimate the odds ratio of fracture associated with FNBMD
  • 44. R analysis
      > res = lrm(fx ~ fnbmd)
      > summary(res)
                   Effects              Response : fx
       Factor       Low   High  Diff.  Effect  S.E.  Lower 0.95  Upper 0.95
       fnbmd        0.73  0.93  0.2    -0.96   0.08  -1.11       -0.81
        Odds Ratio  0.73  0.93  0.2     0.38     NA   0.33        0.45
      •  Each 0.2 g/cm² increase in FNBMD (from 0.73 to 0.93, the contrast reported in the output) is associated with a 62% reduction in the odds of fracture (OR 0.38; 95% CI 0.33 to 0.45)
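The reported effect can be converted into a percentage change in odds, or rescaled to a different increment of FNBMD. The sketch below uses only the numbers printed in the output; the 0.1 g/cm² increment is an arbitrary choice made here for illustration.

```r
effect <- -0.96   # change in log odds for the reported contrast
diff   <- 0.2     # size of the contrast in g/cm2 (0.73 to 0.93)

or_contrast <- exp(effect)                 # odds ratio for a 0.2 g/cm2 increase
pct_change  <- (or_contrast - 1) * 100     # percentage change in odds

beta_per_unit <- effect / diff             # log odds ratio per 1 g/cm2
or_per_tenth  <- exp(beta_per_unit * 0.1)  # odds ratio per 0.1 g/cm2 (illustrative increment)

print(round(c(OR = or_contrast, pct_change = pct_change,
              beta_per_unit = beta_per_unit, OR_per_0.1 = or_per_tenth), 3))
```

An odds ratio of about 0.38 corresponds to odds that are roughly 62% lower. Rescaling works on the log odds scale because the model is linear in FNBMD there; odds ratios themselves multiply rather than add.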
  • 45. Summary
      •  The logistic regression model is very useful for
          –  Describing the relationship between an outcome and risk factors
          –  Developing prognostic models in medicine
      •  The logistic regression model is applied when
          –  The outcome is a categorical variable
      •  The logistic regression model is applicable to all study designs, but is used mainly in case-control studies