Topic 3: Simple Linear Regression
Outline
• Simple linear regression model
– Model parameters
– Distribution of error terms
• Estimation of regression parameters
– Method of least squares
– Maximum likelihood
Data for Simple Linear Regression
• Observe i = 1, 2, ..., n pairs of variables
• Each pair is often called a case
• Yi = ith response variable
• Xi = ith explanatory variable
Simple Linear Regression Model
• Yi = β0 + β1Xi + ei
• β0 is the intercept
• β1 is the slope
• ei is a random error term
– E(ei) = 0 and σ2(ei) = σ2
– ei and ej are uncorrelated
Simple Linear Normal Error Regression Model
• Yi = β0 + β1Xi + ei
• β0 is the intercept
• β1 is the slope
• ei is a Normally distributed random error with mean 0 and variance σ2
• ei and ej are uncorrelated → independent (for Normal errors, uncorrelated implies independent)
Model Parameters
• β0 : the intercept
• β1 : the slope
• σ2 : the variance of the error term
Features of Both Regression Models
• Yi = β0 + β1Xi + ei
• E(Yi) = β0 + β1Xi + E(ei) = β0 + β1Xi
• Var(Yi) = 0 + Var(ei) = σ2
– The mean of Yi is determined by the value of Xi
– All possible means fall on a line
– The Yi vary about this line
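These two facts can be checked by simulation. A minimal Python sketch, using made-up parameter values β0 = 0.14, β1 = 1.96, σ = 2 and a fixed Xi = 5 (none of these numbers come from the slides):

```python
import random
import statistics

# Hypothetical parameter values, for illustration only
beta0, beta1, sigma = 0.14, 1.96, 2.0
x = 5.0

random.seed(42)
# Draw many responses Y = beta0 + beta1*x + e with e ~ N(0, sigma^2)
ys = [beta0 + beta1 * x + random.gauss(0.0, sigma) for _ in range(200_000)]

mean_y = statistics.fmean(ys)     # should be near beta0 + beta1*x = 9.94
var_y = statistics.pvariance(ys)  # should be near sigma^2 = 4
print(mean_y, var_y)
```

The simulated mean lands on the line β0 + β1Xi and the simulated variance near σ2, regardless of the Xi chosen.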
Features of Normal Error Regression Model
• Yi = β0 + β1Xi + ei
• If ei is Normally distributed then Yi ~ N(β0 + β1Xi , σ2) (A.36)
• Does not imply the collection of Yi is Normally distributed (each Yi has a different mean)
Fitted Regression Equation and Residuals
• Ŷi = b0 + b1Xi
– b0 is the estimated intercept
– b1 is the estimated slope
• ei : residual for the ith case
• ei = Yi – Ŷi = Yi – (b0 + b1Xi)
Example at X = 82: Ŷ82 = b0 + b1(82) and e82 = Y82 − Ŷ82
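The residual calculation can be sketched in Python. The data and the fitted coefficients below are made-up numbers for illustration, not the pisa data used in the slides:

```python
# Hypothetical data and fitted coefficients (not the pisa data)
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = 0.14, 1.96  # assume these were already estimated

y_hat = [b0 + b1 * xi for xi in x]             # fitted values Ŷi
resid = [yi - yh for yi, yh in zip(y, y_hat)]  # residuals ei = Yi − Ŷi
print(resid)
```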
Plot the residuals
proc gplot data=a2;
  plot resid*year / vref=0;
  where lean ne .;
run;
• Continuation of pisa.sas
• Uses the data set created by the output statement
• vref=0 adds a horizontal reference line to the plot at zero
(Figure: residual plot from pisa.sas, with e82 marked)
Least Squares
• Want to find “best” b0 and b1
• Will minimize Σ(Yi – (b0 + b1Xi))2
• Use calculus: take derivative with
respect to b0 and with respect to b1
and set the two resulting equations
equal to zero and solve for b0 and b1
• See KNNL pgs 16-17
Least Squares Solution
• These are also maximum likelihood estimators
for Normal error model, see KNNL pp 30-32
b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)2
b0 = Ȳ − b1X̄
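A minimal Python sketch of these two formulas, on made-up data:

```python
# Hypothetical data for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

x_bar = sum(x) / n
y_bar = sum(y) / n

# b1 = Σ(Xi − X̄)(Yi − Ȳ) / Σ(Xi − X̄)²
b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)
# b0 = Ȳ − b1·X̄
b0 = y_bar - b1 * x_bar
print(b0, b1)
```

For these numbers the estimates work out to b1 = 1.96 and b0 = 0.14.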
Maximum Likelihood
Yi ~ N(β0 + β1Xi , σ2)
fi = (1 / √(2πσ2)) exp( −(Yi − β0 − β1Xi)2 / (2σ2) )
L = f1 · f2 ⋯ fn (likelihood function)
Find the β0 and β1 which maximize L
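For any fixed σ2, log L = −(n/2) log(2πσ2) − SSE/(2σ2), so maximizing L over β0 and β1 is the same as minimizing SSE; that is why the maximum likelihood estimates agree with least squares. A quick numeric check on made-up data (0.14 and 1.96 are the least squares estimates for this toy data set):

```python
import math

# Hypothetical data for illustration
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)

def log_lik(b0, b1, sigma2):
    """Log of L = f1 · f2 ⋯ fn for the Normal error model."""
    sse = sum((yi - b0 - b1 * xi) ** 2 for xi, yi in zip(x, y))
    return -n / 2 * math.log(2 * math.pi * sigma2) - sse / (2 * sigma2)

sigma2 = 0.05                         # any fixed value works for this comparison
ll_ls = log_lik(0.14, 1.96, sigma2)   # least squares estimates for these data
ll_alt = log_lik(0.30, 1.90, sigma2)  # a nearby alternative
print(ll_ls > ll_alt)
```

The least squares estimates give a strictly higher log-likelihood than any other choice of β0, β1.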
Estimation of σ2
s2 = MSE = SSE / dfE = Σ(Yi − Ŷi)2 / (n − 2) = Σei2 / (n − 2)
s = Root MSE = √MSE
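A minimal Python sketch of this estimate, reusing made-up data and its least squares fit (0.14 + 1.96X is the fitted line for these numbers, not anything from the slides):

```python
import math

# Hypothetical data and least squares fit, for illustration only
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
b0, b1 = 0.14, 1.96
n = len(x)

sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
mse = sse / (n - 2)  # s² = SSE / df_E, with df_E = n − 2
s = math.sqrt(mse)   # Root MSE
print(mse, s)
```

The divisor is n − 2, not n, because two parameters (b0 and b1) were estimated from the data.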
Standard output from Proc REG
Analysis of Variance
Source            DF   Sum of Squares   Mean Square   F Value   Pr > F
Model              1        15804          15804       904.12   <.0001
Error             11          192.28571     17.48052
Corrected Total   12        15997

Root MSE            4.18097   R-Square   0.9880
Dependent Mean    693.69231   Adj R-Sq   0.9869
Coeff Var           0.60271

(17.48052 is s2 = MSE = SSE/dfE ; 4.18097 is s = Root MSE)
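The printed quantities are related by the formulas above, and can be reproduced from each other. A short Python check using the values exactly as displayed (the small mismatches in F Value and R-Square come from rounding in the printed sums of squares):

```python
import math

# Values as printed in the Proc REG output above
ss_model, df_model = 15804.0, 1
ss_error, df_error = 192.28571, 11
ss_total = 15997.0

mse = ss_error / df_error  # Mean Square for Error
root_mse = math.sqrt(mse)
f_value = (ss_model / df_model) / mse
r_square = ss_model / ss_total

print(mse, root_mse, f_value, r_square)
```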
Properties of the Least Squares Line
• The line always goes through (X̄, Ȳ)
• The residuals sum to zero:
  Σei = Σ(Yi − (b0 + b1Xi)) = ΣYi − nb0 − b1ΣXi = n(Ȳ − b0 − b1X̄) = 0
  since b0 = Ȳ − b1X̄
• Other properties on pgs 23-24
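Both properties can be verified numerically; a sketch on made-up data, with b0 and b1 computed from the least squares formulas:

```python
# Hypothetical data; b0, b1 computed by the least squares formulas
x = [1, 2, 3, 4, 5]
y = [2.1, 3.9, 6.2, 8.1, 9.8]
n = len(x)
x_bar, y_bar = sum(x) / n, sum(y) / n

b1 = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y)) / \
    sum((xi - x_bar) ** 2 for xi in x)
b0 = y_bar - b1 * x_bar

# Property 1: the fitted line passes through (X̄, Ȳ)
print(abs((b0 + b1 * x_bar) - y_bar))

# Property 2: the residuals sum to zero
resid = [yi - (b0 + b1 * xi) for xi, yi in zip(x, y)]
print(abs(sum(resid)))
```

Both printed quantities are zero up to floating-point round-off.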
Background Reading
• Chapter 1
– 1.6 : Estimation of regression function
– 1.7 : Estimation of error variance
– 1.8 : Normal regression model
• Chapter 2
– 2.1 and 2.2 : inference concerning the β's
• Appendix A
– A.4, A.5, A.6, and A.7