Workshop on Analysis of Clinical Studies – Can Tho University of Medicine and Pharmacy – April 2012
Simple linear regression analysis
Tuan V. Nguyen
Professor and NHMRC Senior Research Fellow
Garvan Institute of Medical Research
University of New South Wales
Sydney, Australia
What we are going to learn …
•  Examples
•  Purposes of linear regression analysis
•  Questions of interest
•  Model parameters
•  R analysis
•  Interpretation
Femoral neck bone density and age
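# keep women only (sex coded 2 in the data frame vd)
women = subset(vd, sex==2)
# scatter plot of femoral neck BMD against age, with the fitted line
plot(fnbmd ~ age, data=women, pch=16)
abline(lm(fnbmd ~ age, data=women))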
[Figure: scatter plot of fnbmd (0.4 to 1.1) against age (20 to 80) with the fitted regression line]
Weight and femoral neck bone density
plot(fnbmd ~ weight, pch=16)
abline(lm(fnbmd ~ weight))
[Figure: scatter plot of fnbmd (0.4 to 1.1) against weight (30 to 80) with the fitted regression line]
Correlation analysis
•  Assessment of a relationship
•  The coefficient of correlation: a measure of the strength of the linear relationship
•  We want to know more …
–  The magnitude of effect of a predictor variable on the outcome
–  Prediction of outcome by using the predictor variable(s)
Our interests
•  Finding a statistical model that describes the relationship between age, weight, and BMD
•  Adjustment of effect
•  Prediction
Linear regression model
Weight and femoral neck bone density
[Figure: scatter plot of fnbmd (0.4 to 1.1) against weight (30 to 80)]
We can also describe the line in terms of a slope and an intercept
•  The slope is the change in the y-value for a unit change in the x-value. In this simple situation we can think of this as the change in the height of the line as we progress along the x-axis
•  The intercept is the height of the line when x = 0
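[Figure: a straight line through the points (x1, y1) and (x2, y2), drawn on an x-axis and y-axis with origin 0]
$$\text{slope} = \frac{\Delta y}{\Delta x} = \frac{y_2 - y_1}{x_2 - x_1}$$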
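As a small worked illustration of the slope and intercept definitions, the following R sketch computes both from two points on a line. The two points and all variable names are made up for illustration and are not taken from the workshop data.

```r
# Two hypothetical points on a straight line (illustrative values only)
x1 <- 40; y1 <- 0.67
x2 <- 60; y2 <- 0.77

# Slope: change in y for a unit change in x
slope <- (y2 - y1) / (x2 - x1)

# Intercept: height of the line when x = 0, from y1 = intercept + slope * x1
intercept <- y1 - slope * x1

slope
intercept
```

With these two points the line rises by 0.10 over 20 units of x, so the slope is 0.005 and the line crosses the y-axis at 0.47.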
Linear regression: model
•  Y : random variable representing a response
•  X : random variable representing a predictor variable
(predictor, risk factor)
–  Both Y and X can be a categorical variable (e.g., yes / no) or a
continuous variable (e.g., age).
–  If Y is categorical, the model is a logistic regression model; if Y is
continuous, a simple linear regression model.
•  Model
Y = a + bX + e	

a : intercept
b : slope / gradient
e  : random error (variation between subjects in y even if x is constant, e.g.,
variation in cholesterol for patients of the same age.)
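One way to get a feel for the model Y = a + bX + e is to generate data from it and check that a fitted line recovers a and b. The sketch below is only an illustration: the "true" values of a and b, the error standard deviation, the sample size, and the seed are all assumptions chosen for the example, not values from the study.

```r
set.seed(2012)  # fixed seed so the sketch is reproducible

n <- 558                              # sample size chosen for illustration
a <- 0.47                             # assumed true intercept
b <- 0.005                            # assumed true slope
x <- runif(n, min = 30, max = 80)     # predictor, e.g. weight in kg
e <- rnorm(n, mean = 0, sd = 0.115)   # random error with constant variance
y <- a + b * x + e                    # response generated by the model

fit <- lm(y ~ x)                      # least-squares estimates of a and b
coef(fit)
```

The estimates should land close to the assumed a and b, but not exactly on them, because of the random error term e.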
Linear regression: assumptions
•  The relationship is linear in terms of the parameters;
•  X is measured without error;
•  The values of Y are independent of each other (e.g., Y1 is not correlated with Y2);
•  The random error term (e) is normally distributed with mean 0 and constant variance.
Criteria of estimation
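[Figure: plot of Y against X showing an observed value y_i, its fitted value on the line, and the difference d_i between them]
$$\hat{y}_i = a + b x_i$$
$$d_i = y_i - \hat{y}_i$$
The goal of the least squares estimator (LSE) is to find a and b such that the sum of squared differences, $\sum_i d_i^2$, is minimal.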
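The least squares criterion can be sketched in a few lines of R. The closed-form expressions used for a and b below are the standard least-squares solution for a single predictor; they are not shown on the slide. The built-in `anscombe` data set is used purely as example input, and the variable names are illustrative.

```r
# Built-in Anscombe data; the first X-Y pair is used purely as example input
x <- anscombe$x1
y <- anscombe$y1

# Least-squares solution: the a and b that minimise sum((y - (a + b * x))^2)
b_hat <- sum((x - mean(x)) * (y - mean(y))) / sum((x - mean(x))^2)
a_hat <- mean(y) - b_hat * mean(x)

# Differences d_i and the minimised sum of squares
d <- y - (a_hat + b_hat * x)
sse <- sum(d^2)

# lm() should give the same estimates
fit <- lm(y ~ x)
c(a_hat = a_hat, b_hat = b_hat)
coef(fit)
```

Any other choice of a and b, for example nudging the slope up by 0.1, gives a larger sum of squared differences than `sse`.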
•  We could try fitting a line “by eye”
•  But everyone’s best guess would probably be different
•  We want consistency
[Figure: scatter plot of y (0 to 100) against x (0 to 25)]
Estimating parameters by R
•  Our interest: relationship between BMD and weight
•  Model:
BMD = a + b*weight + e
•  We want to estimate a and b
•  R language
lm(fnbmd ~ weight)
R analysis
> m1 = lm(fnbmd ~ weight)
> summary(m1)
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4699822  0.0310144   15.15  < 2e-16 ***
weight      0.0049416  0.0006041    8.18 1.95e-15 ***
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Residual standard error: 0.1152 on 556 degrees of freedom
Multiple R-squared: 0.1074, Adjusted R-squared: 0.1058
F-statistic: 66.9 on 1 and 556 DF, p-value: 1.945e-15
Interpretation of outputs
Coefficients:
             Estimate Std. Error t value Pr(>|t|)
(Intercept) 0.4699822  0.0310144   15.15  < 2e-16 ***
weight      0.0049416  0.0006041    8.18 1.95e-15 ***
•  Remember our model:
BMD = a + b*weight
•  Our equation:
BMD = 0.47 + 0.0049*weight
•  Interpretation: each 1 kg increase in weight was associated with a 0.0049 g/cm² increase in BMD. The association is statistically significant (P < 0.0001).
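The t value and p-value in the coefficient table can be reproduced from the estimate and its standard error. The sketch below does this for the weight slope using the numbers printed in the output; the 95% confidence interval is an extra step that does not appear on the slide, and the variable names are illustrative.

```r
# Values copied from the summary(m1) output
b_hat <- 0.0049416   # estimated slope for weight
se_b  <- 0.0006041   # its standard error
df    <- 556         # residual degrees of freedom

t_value <- b_hat / se_b                              # t statistic for H0: b = 0
p_value <- 2 * pt(-abs(t_value), df)                 # two-sided p-value
ci_95   <- b_hat + c(-1, 1) * qt(0.975, df) * se_b   # 95% confidence interval

t_value
p_value
ci_95
```

The interval lies entirely above zero, which agrees with the very small p-value for the slope.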
BMD = 0.47 + 0.0049*weight
[Figure: FNBMD (0.4 to 1.1) plotted against Weight (30 to 80)]
Analysis of variance
•  BMD = a + b*weight + e
•  Observed variation = model + random
“Variation” = sum of squares
•  SST = total sum of squares
SSR = sum of squares due to the regression model
SSE = sum of squares due to the random component
•  SST = SSR + SSE
•  R² = SSR / SST
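The partition SST = SSR + SSE can be checked numerically on any fitted simple regression. The sketch below uses the built-in `anscombe` data only as example input (not the BMD data), with illustrative variable names.

```r
# Built-in Anscombe data, used only as example input
x <- anscombe$x1
y <- anscombe$y1
fit <- lm(y ~ x)

sst <- sum((y - mean(y))^2)             # total variation around the mean
ssr <- sum((fitted(fit) - mean(y))^2)   # variation explained by the regression
sse <- sum(residuals(fit)^2)            # variation left in the random component

r2 <- ssr / sst
c(SST = sst, SSR = ssr, SSE = sse, R2 = r2)
```

The computed `r2` should agree with `summary(fit)$r.squared`, and `ssr + sse` should equal `sst` up to rounding error.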
Partitioning of variations: geometry
[Figure: BMD (y-axis) against Weight (x-axis), showing the mean and the deviations labelled SSR, SSE and SST]
SST = SSR + SSE
Partitioning of variation by R
•  Total SS = 0.8883 + 7.3819 = 8.2702
•  R² = 0.8883 / 8.2702 = 0.107
> m1 = lm(fnbmd ~ weight)
> anova(m1)
Analysis of Variance Table
Response: fnbmd
           Df Sum Sq Mean Sq F value    Pr(>F)
weight      1 0.8883 0.88829  66.905 1.945e-15 ***
Residuals 556 7.3819 0.01328
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
Interpretation of outputs
Residual standard error: 0.1152 on 556 degrees of freedom
Multiple R-squared: 0.1074, Adjusted R-squared: 0.1058
F-statistic: 66.9 on 1 and 556 DF, p-value: 1.945e-15
•  R² = 0.107
•  Interpretation: approximately 11% of the variance in BMD could be accounted for by body weight
Variance of BMD after adjusting for weight
•  Mean square (MS) = sum of squares / (degrees of freedom)
•  MS(residuals) = 7.3819 / 556 = 0.01328
⇒  Variance of BMD after adjusting for weight is 0.01328
(variance of BMD before the adjustment: 0.01485)
> m1 = lm(fnbmd ~ weight)
> anova(m1)
Analysis of Variance Table
Response: fnbmd
           Df Sum Sq Mean Sq F value    Pr(>F)
weight      1 0.8883 0.88829  66.905 1.945e-15 ***
Residuals 556 7.3819 0.01328
---
Signif. codes: 0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
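The variances before and after adjustment can be recomputed from the sums of squares in the ANOVA table. The sketch below uses only the numbers printed in the table; the variable names are illustrative, and the sample size is inferred from the residual degrees of freedom plus the two estimated parameters.

```r
# Sums of squares and degrees of freedom copied from the anova(m1) table
ss_weight   <- 0.8883
ss_residual <- 7.3819
df_residual <- 556
n           <- df_residual + 2          # two estimated parameters: a and b

ss_total <- ss_weight + ss_residual

var_before <- ss_total / (n - 1)            # variance of BMD ignoring weight
var_after  <- ss_residual / df_residual     # residual mean square
rse        <- sqrt(var_after)               # residual standard error
f_value    <- (ss_weight / 1) / var_after   # F statistic on 1 and 556 df

round(c(var_before = var_before, var_after = var_after, rse = rse), 5)
f_value
```

The square root of the residual mean square reproduces the residual standard error (about 0.1152) reported by `summary(m1)`, and the ratio of mean squares reproduces the F statistic.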
Prediction of BMD by weight
•  The model: BMD = 0.47 + 0.0049*weight
•  Without knowledge of weight, our best estimate is the mean BMD of 0.72 g/cm²
•  With knowledge of weight, we know that BMD is dependent on weight
•  Weight = 50 kg, BMD = 0.47 + 0.0049*50 = 0.72 g/cm²
Weight = 40 kg, BMD = 0.47 + 0.0049*40 = 0.67 g/cm²
Weight = 60 kg, BMD = 0.47 + 0.0049*60 = 0.76 g/cm²
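The same predictions can be computed in R. This is a minimal sketch using the rounded coefficients from the fitted equation; `predict_bmd` is a hypothetical helper written for this example, not a function from the slides or from any package.

```r
# Rounded coefficients from the fitted equation BMD = 0.47 + 0.0049 * weight
a <- 0.47
b <- 0.0049

# Hypothetical helper (not from the slides): predicted BMD in g/cm^2
predict_bmd <- function(weight_kg) a + b * weight_kg

weights <- c(40, 50, 60)
predicted <- predict_bmd(weights)
data.frame(weight = weights, bmd = predicted)
```

With the fitted model object available, `predict(m1, newdata = data.frame(weight = c(40, 50, 60)))` gives the same kind of prediction from the unrounded coefficients, so the third decimal place may differ slightly.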
Checking model assumptions
par(mfrow=c(2,2))
plot(m1)
[Figure: four diagnostic panels produced by plot(m1): Residuals vs Fitted; Normal Q-Q; Scale-Location; Residuals vs Leverage (with Cook's distance contours). Labelled observations: 390, 3 and 141 in the first three panels; 40, 27 and 13 in the last.]
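The quantities behind the four diagnostic panels can also be extracted directly, which helps when a flagged observation needs a closer look. The sketch below uses the third X-Y pair of the built-in `anscombe` data as example input (not the BMD data); the data frame and its column names are illustrative.

```r
# Built-in Anscombe data (third X-Y pair), used only as example input;
# it contains one observation that sits far from the line
fit <- lm(y3 ~ x3, data = anscombe)

diagnostics <- data.frame(
  fitted    = fitted(fit),          # x-axis of "Residuals vs Fitted"
  residual  = residuals(fit),       # y-axis of "Residuals vs Fitted"
  std_resid = rstandard(fit),       # used in Normal Q-Q and Scale-Location
  leverage  = hatvalues(fit),       # x-axis of "Residuals vs Leverage"
  cooks_d   = cooks.distance(fit)   # contours in "Residuals vs Leverage"
)

# Observation with the largest standardized residual
worst <- which.max(abs(diagnostics$std_resid))
diagnostics[worst, ]
```

In this example the third observation stands out, in the same way that `plot()` labels the most extreme observations in each panel.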
Be careful! Anscombe’s data
Frank Anscombe devised 4 sets of X-Y pairs
Mean and SD of Anscombe’s data
Correlation between X and Y: Anscombe’s data
Regression analysis: Anscombe’s data
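Anscombe's four data sets ship with R as the built-in data frame `anscombe`, so the summaries named on these slides can be reproduced directly. The sketch below is one way to do it; `summarise_set` is a hypothetical helper written for this example.

```r
# anscombe is a built-in R data frame with columns x1..x4 and y1..y4
summarise_set <- function(i) {
  x <- anscombe[[paste0("x", i)]]
  y <- anscombe[[paste0("y", i)]]
  fit <- lm(y ~ x)
  c(mean_x = mean(x), sd_x = sd(x),
    mean_y = mean(y), sd_y = sd(y),
    r = cor(x, y),
    intercept = unname(coef(fit)[1]),
    slope = unname(coef(fit)[2]))
}

# One row per data set; the summaries are nearly identical across the four sets
anscombe_summary <- t(sapply(1:4, summarise_set))
round(anscombe_summary, 2)
```

Plotting each pair, for example with `plot(y2 ~ x2, data = anscombe)`, shows how different the four relationships are despite the matching summaries.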
But …
Summary
•  The simple linear regression model is used for
–  Understanding the effect of a risk factor or determinant on an outcome variable
–  Predicting an outcome variable
•  It’s appropriate when the functional relationship is linear
•  Always check assumptions!
