SIMPLE REGRESSION MODEL
2. Suppose that a variable Y is a linear function of another variable X, with unknown
parameters b1 and b2 that we wish to estimate.
[Figure: the line Y = b1 + b2X plotted against X, with intercept b1.]
3. Suppose that we have a sample of 4 observations with X values as shown.
[Figure: the line Y = b1 + b2X, with the sample values X1, X2, X3, X4 marked on the X axis.]
4. If the relationship were an exact one, the observations would lie on a straight line and we
would have no trouble obtaining accurate estimates of b1 and b2.
[Figure: the points Q1, Q2, Q3, Q4 lying on the line Y = b1 + b2X at X1, X2, X3, X4.]
5. In practice, most economic relationships are not exact and the actual values of Y are
different from those corresponding to the straight line.
[Figure: the actual observations P1, P2, P3, P4 lying off the line Y = b1 + b2X, with Q1, Q2, Q3, Q4 the corresponding points on the line.]
6. To allow for such divergences, we will write the model as Y = b1 + b2X + u, where u is a
disturbance term.
7. Each value of Y thus has a nonrandom component, b1 + b2X, and a random component, u.
The first observation has been decomposed into these two components.
[Figure: the first observation P1 decomposed into the nonrandom component b1 + b2X1 (the height of Q1) and the disturbance term u1 (the vertical distance from Q1 to P1).]
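The decomposition can be sketched numerically. The parameter values, X values, distribution of u, and random seed below are all hypothetical, chosen only for illustration; in practice b1, b2 and u are unknown.

```python
import numpy as np

# Hypothetical "true" parameters (unknown in practice).
b1, b2 = 2.0, 0.5

# Hypothetical sample of 4 observations.
X = np.array([1.0, 2.0, 3.0, 4.0])

# Disturbance term: assumed here to be normal with mean zero (an illustrative choice).
rng = np.random.default_rng(0)
u = rng.normal(loc=0.0, scale=1.0, size=X.size)

nonrandom = b1 + b2 * X   # heights of the Q points
Y = nonrandom + u         # heights of the P points

# Decomposition of the first observation.
print("nonrandom component b1 + b2*X1:", nonrandom[0])
print("random component u1:", u[0])
print("Y1:", Y[0])
```

Only the last printed value, Y1, would be observable in a real sample.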
8. In practice we can see only the P points.
[Figure: the observations P1, P2, P3, P4 alone, plotted at X1, X2, X3, X4.]
9. We can use the P points to draw a line that approximates the line
Y = b1 + b2X. If we write this line as $\hat{Y} = \hat{b}_1 + \hat{b}_2 X$, then $\hat{b}_1$ is an
estimate of b1 and $\hat{b}_2$ is an estimate of b2.
[Figure: the fitted line $\hat{Y} = \hat{b}_1 + \hat{b}_2 X$, with intercept $\hat{b}_1$ and slope $\hat{b}_2$, drawn through the points P1, P2, P3, P4.]
10. The line is called the fitted model and the values of Y predicted by it are called the fitted
values of Y. They are given by the heights of the R points.
[Figure: the fitted line $\hat{Y} = \hat{b}_1 + \hat{b}_2 X$, with the actual values of Y at the points P1, P2, P3, P4 and the fitted values $\hat{Y}$ at the points R1, R2, R3, R4.]
11. The discrepancies between the actual and fitted values of Y are known as the residuals,
denoted $\hat{u}$.
[Figure: the residuals $\hat{u}_1, \hat{u}_2, \hat{u}_3, \hat{u}_4$, where $\hat{u} = Y - \hat{Y}$, shown as the vertical distances between the P points (actual values) and the R points (fitted values).]
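Fitted values and residuals can be computed directly once a line has been drawn. A minimal sketch follows; the data and the coefficients b1_hat and b2_hat are hypothetical (a line drawn by eye), not values taken from the slides.

```python
import numpy as np

# Hypothetical sample: X values and the observed heights of the P points.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([3.1, 2.6, 4.2, 3.9])

# Hypothetical coefficients of a line drawn by eye through the P points.
b1_hat, b2_hat = 2.4, 0.4

Y_fitted = b1_hat + b2_hat * X   # heights of the R points
residuals = Y - Y_fitted         # actual value minus fitted value

for i, (y, yf, r) in enumerate(zip(Y, Y_fitted, residuals), start=1):
    print(f"obs {i}: actual={y:.2f} fitted={yf:.2f} residual={r:+.2f}")
```

A positive residual means that the P point lies above the fitted line, and a negative one that it lies below.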
12. Note that the values of the residuals are not the same as the values of the disturbance term.
The diagram now shows the true unknown relationship as well as the fitted line.
[Figure: the true relationship Y = b1 + b2X and the fitted line $\hat{Y} = \hat{b}_1 + \hat{b}_2 X$ shown together, with the P points and the R points.]
13. The disturbance term in each observation is responsible for the divergence between the
nonrandom component of the true relationship and the actual observation.
[Figure: the true line Y = b1 + b2X and the fitted line, with the P points and the Q points; the disturbance term is the vertical distance from each Q point to the corresponding P point.]
14. The residuals are the discrepancies between the actual and the fitted values.
[Figure: the true line Y = b1 + b2X and the fitted line, with the P points and the R points; each residual is the vertical distance from an R point to the corresponding P point.]
15. If the fit is a good one, the residuals and the values of the disturbance term will be similar,
but they must be kept apart conceptually.
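The distinction can be illustrated with simulated data, where, unlike in practice, the disturbance term is known. This is only a sketch: the parameter values, the normal distribution of u, and the seed are hypothetical, and the line is fitted with NumPy's `polyfit`, which applies the least squares criterion introduced later in this section.

```python
import numpy as np

# Hypothetical true model, known here only because the data are simulated.
b1, b2 = 2.0, 0.5
X = np.array([1.0, 2.0, 3.0, 4.0])
rng = np.random.default_rng(1)
u = rng.normal(loc=0.0, scale=1.0, size=X.size)   # disturbance term
Y = b1 + b2 * X + u

# Fit a straight line; polyfit returns the slope first, then the intercept.
b2_hat, b1_hat = np.polyfit(X, Y, deg=1)
residuals = Y - (b1_hat + b2_hat * X)

print("disturbances:", np.round(u, 3))
print("residuals:   ", np.round(residuals, 3))
```

The two printed rows differ because the residuals are measured from the fitted line and the disturbances from the true line.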
16. Both of these lines will be used in our analysis. Each permits a decomposition of the value
of Y. The decompositions will be illustrated with the fourth observation.
[Figure: the true line Y = b1 + b2X and the fitted line, with the fourth observation P4, the point Q4 on the true line at height b1 + b2X4, and the disturbance term u4.]
17. Using the theoretical relationship, Y can be decomposed into its nonstochastic component
b1 + b2X and its random component u.
18. This is a theoretical decomposition because we do not know the values of b1 or b2, or the
values of the disturbance term. We shall use it in our analysis of the properties of the
regression coefficients.
19. The other decomposition is with reference to the fitted line. In each observation, the actual
value of Y is equal to the fitted value plus the residual. This is an operational
decomposition which we will use for practical purposes.
[Figure: the fourth observation P4 decomposed, relative to the fitted line, into the fitted value $\hat{Y}_4 = \hat{b}_1 + \hat{b}_2 X_4$ (the height of R4) and the residual $\hat{u}_4$, where $\hat{u} = Y - \hat{Y}$.]
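Written side by side for observation i, the two decompositions are as follows. The subscript i and the labels on the right are added here for illustration.

```latex
\begin{align*}
Y_i &= b_1 + b_2 X_i + u_i
    && \text{(theoretical: nonstochastic component plus disturbance term)} \\
Y_i &= \hat{Y}_i + \hat{u}_i = \hat{b}_1 + \hat{b}_2 X_i + \hat{u}_i
    && \text{(operational: fitted value plus residual)}
\end{align*}
```

Every term in the second line can be computed from the sample; none of the terms on the right-hand side of the first line can.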
20. To begin with, we will draw the fitted line so as to minimize the sum of the squares of the
residuals, RSS. This is described as the least squares criterion.
Least squares criterion:
Minimize RSS (residual sum of squares), where
$RSS = \sum_{i=1}^{n} \hat{u}_i^2 = \hat{u}_1^2 + \dots + \hat{u}_n^2$
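As a sketch of how the criterion ranks candidate lines, the function below computes RSS for a given pair of coefficients. The function name `rss`, the data, and the two candidate lines are hypothetical.

```python
import numpy as np

def rss(b1_hat, b2_hat, X, Y):
    """Residual sum of squares for the line Y_hat = b1_hat + b2_hat * X."""
    residuals = Y - (b1_hat + b2_hat * X)
    return float(np.sum(residuals ** 2))

# Hypothetical sample of 4 observations.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([3.1, 2.6, 4.2, 3.9])

# Two hypothetical candidate lines; the one with the smaller RSS fits better
# according to the least squares criterion.
print("RSS for line A:", rss(2.4, 0.4, X, Y))
print("RSS for line B:", rss(2.0, 0.5, X, Y))
```

Least squares estimation amounts to finding the pair of coefficients for which this function takes its smallest value.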
21. Why the squares of the residuals? Why not just minimize the sum of the residuals?
Why not minimize
$\sum_{i=1}^{n} \hat{u}_i = \hat{u}_1 + \dots + \hat{u}_n$?
22. The answer is that you would get an apparently perfect fit by drawing a horizontal line
through the mean value of Y. The sum of the residuals would be zero.
[Figure: a horizontal line drawn through the mean value of Y, with the points P1, P2, P3, P4 above and below it.]
23. You must prevent negative residuals from cancelling positive ones, and one way to do this
is to use the squares of the residuals.
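The cancellation is easy to check numerically. In this sketch the data are hypothetical, and the second line (an arbitrary steep line through the point of sample means) is an added illustration rather than something shown in the slides.

```python
import numpy as np

# Hypothetical sample of 4 observations.
X = np.array([1.0, 2.0, 3.0, 4.0])
Y = np.array([3.1, 2.6, 4.2, 3.9])

# Horizontal line through the mean value of Y.
flat_residuals = Y - Y.mean()

# An arbitrary steep line through the point (mean of X, mean of Y).
steep_fitted = Y.mean() + 5.0 * (X - X.mean())
steep_residuals = Y - steep_fitted

for name, res in [("horizontal", flat_residuals), ("steep", steep_residuals)]:
    print(f"{name}: sum of residuals = {res.sum():.6f}, "
          f"sum of squared residuals = {np.sum(res ** 2):.4f}")
```

Both lines give a sum of residuals of zero, up to rounding error, so that sum cannot distinguish between them. The sums of squared residuals differ and do not vanish.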
24. Of course there are other ways of dealing with the problem. The least squares criterion has
the attraction that the estimators derived with it have desirable properties, provided that
certain conditions are satisfied.
25. The next sequence shows how the least squares criterion is used to calculate the
coefficients of the fitted line.
26. Copyright Christopher Dougherty 2016.
These slideshows may be downloaded by anyone, anywhere for personal use.
Subject to respect for copyright and, where appropriate, attribution, they may be
used as a resource for teaching an econometrics course. There is no need to
refer to the author.
The content of this slideshow comes from Section 1.2 of C. Dougherty,
Introduction to Econometrics, fifth edition 2016, Oxford University Press.
Additional (free) resources for both students and instructors may be
downloaded from the OUP Online Resource Centre
http://www.oxfordtextbooks.co.uk/orc/dougherty5e/
Individuals studying econometrics on their own who feel that they might benefit
from participation in a formal course should consider the London School of
Economics summer school course
EC212 Introduction to Econometrics
http://www2.lse.ac.uk/study/summerSchools/summerSchool/Home.aspx
or the University of London International Programmes distance learning course
EC2020 Elements of Econometrics
www.londoninternational.ac.uk/lse.
2015.12.18