‘Introductory Econometrics for Finance’ © Chris Brooks 2013 1
Chapter 3
A brief overview of the
classical linear regression model
Regression
• Regression is probably the single most important tool at the
econometrician’s disposal.
But what is regression analysis?
• It is concerned with describing and evaluating the relationship between
a given variable (usually called the dependent variable) and one or
more other variables (usually known as the independent variable(s)).
Some Notation
• Denote the dependent variable by y and the independent variable(s) by x1, x2,
... , xk where there are k independent variables.
• Some alternative names for the y and x variables:
y                      x
dependent variable     independent variables
regressand             regressors
effect variable        causal variables
explained variable     explanatory variables
• Note that there can be many x variables but we will limit ourselves to the
case where there is only one x variable to start with. In our set-up, there is
only one y variable.
Regression is different from Correlation
• If we say y and x are correlated, it means that we are treating y and x in
a completely symmetrical way.
• In regression, we treat the dependent variable (y) and the independent
variable(s) (x’s) very differently. The y variable is assumed to be
random or “stochastic” in some way, i.e. to have a probability
distribution. The x variables are, however, assumed to have fixed
(“non-stochastic”) values in repeated samples.
Simple Regression
• For simplicity, say k=1. This is the situation where y depends on only one x
variable.
• Examples of the kind of relationship that may be of interest include:
– How asset returns vary with their level of market risk
– Measuring the long-term relationship between stock prices and
dividends.
– Constructing an optimal hedge ratio
Simple Regression: An Example
• Suppose that we have the following data on the excess returns on a fund
manager’s portfolio (“fund XXX”) together with the excess returns on a
market index:
Year, t    Excess return on fund XXX    Excess return on market index
           = $r_{XXX,t} - rf_t$         = $rm_t - rf_t$
1          17.8                         13.7
2          39.0                         23.2
3          12.8                         6.9
4          24.2                         16.8
5          17.2                         12.3
• We have some intuition that the beta on this fund is positive, and we
therefore want to find whether there appears to be a relationship between
x and y given the data that we have. The first stage would be to form a
scatter plot of the two variables.
Graph (Scatter Diagram)
[Figure: scatter diagram of the excess return on fund XXX (vertical axis, 0 to 45) against the excess return on the market portfolio (horizontal axis, 0 to 25).]
Finding a Line of Best Fit
• We can use the general equation for a straight line,
y=a+bx
to get the line that best “fits” the data.
• However, this equation (y=a+bx) is completely deterministic.
• Is this realistic? No. So what we do is add a random disturbance
term, u, to the equation:
$y_t = \alpha + \beta x_t + u_t$
where t = 1, 2, 3, 4, 5
Why do we include a Disturbance term?
• The disturbance term can capture a number of features:
- We always leave out some determinants of yt
- There may be errors in the measurement of yt that cannot be
modelled.
- Random outside influences on yt which we cannot model
Determining the Regression Coefficients
• So how do we determine what $\alpha$ and $\beta$ are?
• Choose $\alpha$ and $\beta$ so that the (vertical) distances from the data points to the
fitted line are minimised (so that the line fits the data as closely as
possible).
[Figure: data points in the (x, y) plane with a fitted line and the vertical distances from each point to the line.]
Ordinary Least Squares
• The most common method used to fit a line to the data is known as
OLS (ordinary least squares).
• What we actually do is take each distance and square it (i.e. take the
area of each of the squares in the diagram) and minimise the total sum
of the squares (hence least squares).
• Tightening up the notation, let
$y_t$ denote the actual data point t
$\hat{y}_t$ denote the fitted value from the regression line
$\hat{u}_t$ denote the residual, $y_t - \hat{y}_t$
Actual and Fitted Value
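[Figure: for observation i, the actual value $y_i$, the fitted value $\hat{y}_i$ on the regression line at $x_i$, and the residual $\hat{u}_i$ as the vertical distance between them.]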
How OLS Works
• So minimise $\hat{u}_1^2 + \hat{u}_2^2 + \hat{u}_3^2 + \hat{u}_4^2 + \hat{u}_5^2$, or minimise $\sum_{t=1}^{5} \hat{u}_t^2$. This is known
as the residual sum of squares.
• But what was $\hat{u}_t$? It was the difference between the actual point and
the line, $y_t - \hat{y}_t$.
• So minimising $\sum_t (y_t - \hat{y}_t)^2$ is equivalent to minimising $\sum_t \hat{u}_t^2$
with respect to $\hat{\alpha}$ and $\hat{\beta}$.
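The residual sum of squares can be evaluated for any candidate line, which makes the idea of "minimising" concrete. The following is a minimal sketch using the fund XXX data from the earlier example; the function name `rss` and the two hand-picked candidate lines are illustrative assumptions, not part of the original slides.

```python
# Data from the fund XXX example (excess returns, in %).
x = [13.7, 23.2, 6.9, 16.8, 12.3]   # market index
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # fund XXX


def rss(alpha_hat, beta_hat, xs, ys):
    """Residual sum of squares for the line y = alpha_hat + beta_hat * x."""
    return sum((yt - (alpha_hat + beta_hat * xt)) ** 2 for xt, yt in zip(xs, ys))


# Two candidate lines chosen by eye (illustrative guesses, not from the slides).
rss_flat = rss(0.0, 1.5, x, y)
rss_steep = rss(-5.0, 2.0, x, y)
# A line close to the one that least squares selects for these data.
rss_near_ols = rss(-1.74, 1.64, x, y)

print(round(rss_flat, 2), round(rss_steep, 2), round(rss_near_ols, 2))
```

Of the three lines tried here, the last has the smallest sum of squared residuals; OLS finds the line for which no other choice of intercept and slope does better.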
Deriving the OLS Estimator
• But $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$, so let
$L = \sum_t (y_t - \hat{y}_t)^2 = \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t)^2$
• Want to minimise L with respect to (w.r.t.) $\hat{\alpha}$ and $\hat{\beta}$, so differentiate L
w.r.t. $\hat{\alpha}$ and $\hat{\beta}$:
$\frac{\partial L}{\partial \hat{\alpha}} = -2 \sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0$    (1)
$\frac{\partial L}{\partial \hat{\beta}} = -2 \sum_t x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0$    (2)
• From (1), $\sum_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0 \iff \sum_t y_t - T\hat{\alpha} - \hat{\beta} \sum_t x_t = 0$
• But $\sum_t y_t = T\bar{y}$ and $\sum_t x_t = T\bar{x}$.
Deriving the OLS Estimator (cont’d)
• So we can write $T\bar{y} - T\hat{\alpha} - T\hat{\beta}\bar{x} = 0$ or $\bar{y} - \hat{\alpha} - \hat{\beta}\bar{x} = 0$    (3)
• From (2), $\sum_t x_t (y_t - \hat{\alpha} - \hat{\beta} x_t) = 0$    (4)
• From (3), $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$    (5)
• Substitute into (4) for $\hat{\alpha}$ from (5),
$\sum_t x_t (y_t - \bar{y} + \hat{\beta}\bar{x} - \hat{\beta} x_t) = 0$
$\sum_t x_t y_t - \bar{y} \sum_t x_t + \hat{\beta}\bar{x} \sum_t x_t - \hat{\beta} \sum_t x_t^2 = 0$
$\sum_t x_t y_t - T\bar{x}\bar{y} + \hat{\beta} T \bar{x}^2 - \hat{\beta} \sum_t x_t^2 = 0$
Deriving the OLS Estimator (cont’d)
• Rearranging for $\hat{\beta}$,
$\hat{\beta} \left( T\bar{x}^2 - \sum_t x_t^2 \right) = T\bar{x}\bar{y} - \sum_t x_t y_t$
• So overall we have
$\hat{\beta} = \frac{\sum_t x_t y_t - T\bar{x}\bar{y}}{\sum_t x_t^2 - T\bar{x}^2}$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$
• This method of finding the optimum is known as ordinary least squares.
What do We Use $\hat{\alpha}$ and $\hat{\beta}$ For?
• In the CAPM example used above, plugging the 5 observations into
the formulae given above would lead to the estimates
$\hat{\alpha}$ = -1.74 and $\hat{\beta}$ = 1.64. We would write the fitted line as:
$\hat{y}_t = -1.74 + 1.64 x_t$
• Question: If an analyst tells you that she expects the market to yield a return
20% higher than the risk-free rate next year, what would you expect the return
on fund XXX to be?
• Solution: We can say that the expected value of y = “-1.74 + 1.64 * value of x”,
so plug x = 20 into the equation to get the expected value for y:
$\hat{y}_i = -1.74 + 1.64 \times 20 = 31.06$
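The closed-form OLS formulae can be checked against the five observations on fund XXX. This is a minimal sketch; the function name `ols_simple` is an illustrative assumption, and the forecast uses the coefficients rounded to two decimals so that it matches the arithmetic on the slide.

```python
def ols_simple(xs, ys):
    """Return (alpha_hat, beta_hat) for y = alpha + beta * x from the closed-form OLS formulae."""
    T = len(xs)
    x_bar = sum(xs) / T
    y_bar = sum(ys) / T
    sum_xy = sum(xt * yt for xt, yt in zip(xs, ys))
    sum_xx = sum(xt ** 2 for xt in xs)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    alpha_hat = y_bar - beta_hat * x_bar
    return alpha_hat, beta_hat


x = [13.7, 23.2, 6.9, 16.8, 12.3]   # excess return on the market index
y = [17.8, 39.0, 12.8, 24.2, 17.2]  # excess return on fund XXX
alpha_hat, beta_hat = ols_simple(x, y)

# Forecast for an excess market return of 20%, using the rounded coefficients.
forecast = round(alpha_hat, 2) + round(beta_hat, 2) * 20
print(round(alpha_hat, 2), round(beta_hat, 2), round(forecast, 2))
```

Rounded to two decimals, the estimates are -1.74 and 1.64, and the forecast is 31.06.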
Accuracy of Intercept Estimate
• Care needs to be exercised when considering the intercept estimate,
particularly if there are no or few observations close to the y-axis:
[Figure: fitted line in the (x, y) plane where the data points lie far from the y-axis, so the intercept is obtained by extrapolating the line back to x = 0.]
The Population and the Sample
• The population is the total collection of all objects or people to be studied,
for example:
Interested in                             Population of interest
predicting the outcome of an election     the entire electorate
• A sample is a selection of just some items from the population.
• A random sample is a sample in which each individual item in the
population is equally likely to be drawn.
The DGP and the PRF
• The population regression function (PRF) is a description of the model that
is thought to be generating the actual data and the true relationship
between the variables (i.e. the true values of $\alpha$ and $\beta$).
• The PRF is $y_t = \alpha + \beta x_t + u_t$
• The SRF is $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$
and we also know that $\hat{u}_t = y_t - \hat{y}_t$.
• We use the SRF to infer likely values of the PRF.
• We also want to know how “good” our estimates of $\alpha$ and $\beta$ are.
Linearity
• In order to use OLS, we need a model which is linear in the parameters ($\alpha$
and $\beta$). It does not necessarily have to be linear in the variables (y and x).
• Linear in the parameters means that the parameters are not multiplied
together, divided, squared or cubed etc.
• Some models can be transformed to linear ones by a suitable substitution
or manipulation, e.g. the exponential regression model
$Y_t = e^{\alpha} X_t^{\beta} e^{u_t} \iff \ln Y_t = \alpha + \beta \ln X_t + u_t$
• Then let $y_t = \ln Y_t$ and $x_t = \ln X_t$
$y_t = \alpha + \beta x_t + u_t$
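The log transformation can be illustrated with a small sketch. The data below are generated from the exponential model with made-up parameter values ($\alpha = 0.5$, $\beta = 1.5$) and no disturbance ($u_t = 0$), purely as an assumption for illustration; `ols_simple` is a hypothetical helper implementing the closed-form OLS formulae.

```python
import math


def ols_simple(xs, ys):
    """Return (alpha_hat, beta_hat) for y = alpha + beta * x from the closed-form OLS formulae."""
    T = len(xs)
    x_bar = sum(xs) / T
    y_bar = sum(ys) / T
    sum_xy = sum(xt * yt for xt, yt in zip(xs, ys))
    sum_xx = sum(xt ** 2 for xt in xs)
    beta_hat = (sum_xy - T * x_bar * y_bar) / (sum_xx - T * x_bar ** 2)
    return y_bar - beta_hat * x_bar, beta_hat


# Made-up data from Y = exp(alpha) * X**beta with alpha = 0.5, beta = 1.5 and u_t = 0.
true_alpha, true_beta = 0.5, 1.5
X = [1.0, 2.0, 4.0, 8.0, 16.0]
Y = [math.exp(true_alpha) * Xt ** true_beta for Xt in X]

# The model is non-linear in the variables, but linear in the parameters after taking logs.
x = [math.log(Xt) for Xt in X]
y = [math.log(Yt) for Yt in Y]
alpha_hat, beta_hat = ols_simple(x, y)
print(round(alpha_hat, 6), round(beta_hat, 6))
```

Because the generated data contain no disturbance, OLS on the logged variables recovers the two parameters up to floating-point error.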
Linear and Non-linear Models
• This is known as the exponential regression model. Here, the coefficients
can be interpreted as elasticities.
• Similarly, if theory suggests that y and x should be inversely related:
$y_t = \alpha + \frac{\beta}{x_t} + u_t$
then the regression can be estimated using OLS by substituting
$z_t = \frac{1}{x_t}$
• But some models are intrinsically non-linear, e.g.
$y_t = \alpha + x_t^{\beta} + u_t$
Estimator or Estimate?
• Estimators are the formulae used to calculate the coefficients
• Estimates are the actual numerical values for the coefficients.
The Assumptions Underlying the
Classical Linear Regression Model (CLRM)
• The model which we have used is known as the classical linear regression model.
• We observe data for xt, but since yt also depends on ut, we must be specific about
how the ut are generated.
• We usually make the following set of assumptions about the ut’s (the
unobservable error terms):
• Technical notation                        Interpretation
1. $E(u_t) = 0$                             The errors have zero mean
2. $\mathrm{Var}(u_t) = \sigma^2$           The variance of the errors is constant and finite
                                            over all values of $x_t$
3. $\mathrm{Cov}(u_i, u_j) = 0$, $i \neq j$ The errors are uncorrelated with (linearly
                                            independent of) one another
4. $\mathrm{Cov}(u_t, x_t) = 0$             No relationship between the error and
                                            corresponding x variate
The Assumptions Underlying the
CLRM Again
• An alternative assumption to 4., which is slightly stronger, is that the
xt’s are non-stochastic or fixed in repeated samples.
• A fifth assumption is required if we want to make inferences about the
population parameters (the actual $\alpha$ and $\beta$) from the sample parameters
($\hat{\alpha}$ and $\hat{\beta}$)
• Additional Assumption
5. $u_t$ is normally distributed
Properties of the OLS Estimator
• If assumptions 1. through 4. hold, then the estimators $\hat{\alpha}$ and $\hat{\beta}$ determined by
OLS are known as Best Linear Unbiased Estimators (BLUE).
What does the acronym stand for?
• “Estimator” - $\hat{\beta}$ is an estimator of the true value of $\beta$.
• “Linear” - $\hat{\beta}$ is a linear estimator
• “Unbiased” - On average, the actual values of $\hat{\alpha}$ and $\hat{\beta}$ will be equal to
the true values.
• “Best” - means that the OLS estimator $\hat{\beta}$ has minimum variance among
the class of linear unbiased estimators. The Gauss-Markov
theorem proves that the OLS estimator is best.
Consistency/Unbiasedness/Efficiency
• Consistent
The least squares estimators $\hat{\alpha}$ and $\hat{\beta}$ are consistent. That is, the estimates will
converge to their true values as the sample size increases to infinity. Need the
assumptions $E(x_t u_t) = 0$ and $\mathrm{Var}(u_t) = \sigma^2 < \infty$ to prove this. Consistency implies that
$\lim_{T \to \infty} \Pr\left[ |\hat{\beta} - \beta| > \delta \right] = 0 \quad \forall\, \delta > 0$
• Unbiased
The least squares estimates of $\hat{\alpha}$ and $\hat{\beta}$ are unbiased. That is $E(\hat{\alpha}) = \alpha$ and $E(\hat{\beta}) = \beta$.
Thus on average the estimated values will be equal to the true values. To prove
this also requires the assumption that $E(u_t) = 0$. Unbiasedness is a stronger
condition than consistency in the sense that it holds for any sample size, not only
as T grows large.
• Efficiency
An estimator $\hat{\beta}$ of parameter $\beta$ is said to be efficient if it is unbiased and no other
unbiased estimator has a smaller variance. If the estimator is efficient, we are
minimising the probability that it is a long way off from the true value of $\beta$.
Precision and Standard Errors
• Any set of regression estimates of $\hat{\alpha}$ and $\hat{\beta}$ are specific to the sample used in
their estimation.
• Recall that the estimators of $\alpha$ and $\beta$ from the sample parameters ($\hat{\alpha}$ and $\hat{\beta}$) are
given by
$\hat{\beta} = \frac{\sum_t x_t y_t - T\bar{x}\bar{y}}{\sum_t x_t^2 - T\bar{x}^2}$ and $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$
• What we need is some measure of the reliability or precision of the estimators
($\hat{\alpha}$ and $\hat{\beta}$). The precision of the estimate is given by its standard error. Given
assumptions 1 - 4 above, then the standard errors can be shown to be given by
$SE(\hat{\alpha}) = s \sqrt{\frac{\sum_t x_t^2}{T \sum_t (x_t - \bar{x})^2}} = s \sqrt{\frac{\sum_t x_t^2}{T \left( \sum_t x_t^2 - T\bar{x}^2 \right)}}$,
$SE(\hat{\beta}) = s \sqrt{\frac{1}{\sum_t (x_t - \bar{x})^2}} = s \sqrt{\frac{1}{\sum_t x_t^2 - T\bar{x}^2}}$
where s is the estimated standard deviation of the residuals.
Estimating the Variance of the Disturbance Term
• The variance of the random variable $u_t$ is given by
$\mathrm{Var}(u_t) = E[(u_t - E(u_t))^2]$
which, since $E(u_t) = 0$, reduces to
$\mathrm{Var}(u_t) = E(u_t^2)$
• We could estimate this using the average of $u_t^2$:
$s^2 = \frac{1}{T} \sum_t u_t^2$
• Unfortunately this is not workable since $u_t$ is not observable. We can use
the sample counterpart to $u_t$, which is $\hat{u}_t$:
$s^2 = \frac{1}{T} \sum_t \hat{u}_t^2$
But this estimator is a biased estimator of $\sigma^2$.
Estimating the Variance of the Disturbance Term
(cont’d)
• An unbiased estimator of $\sigma^2$ is given by $s^2 = \frac{\sum_t \hat{u}_t^2}{T - 2}$, so that
$s = \sqrt{\frac{\sum_t \hat{u}_t^2}{T - 2}}$
where $\sum_t \hat{u}_t^2$ is the residual sum of squares and T is the sample size.
Some Comments on the Standard Error Estimators
1. Both $SE(\hat{\alpha})$ and $SE(\hat{\beta})$ depend on $s^2$ (or s). The greater the variance $s^2$, then
the more dispersed the errors are about their mean value and therefore the
more dispersed y will be about its mean value.
2. The sum of the squares of x about their mean appears in both formulae.
The larger the sum of squares, the smaller the coefficient variances.
Some Comments on the Standard Error Estimators
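Consider what happens if $\sum_t (x_t - \bar{x})^2$ is small or large:
[Figure: two scatter plots of y against x with the means $\bar{x}$ and $\bar{y}$ marked, one where the x values are bunched closely around $\bar{x}$ and one where they are widely spread.]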
Some Comments on the Standard Error Estimators
(cont’d)
3. The larger the sample size, T, the smaller will be the coefficient
variances. T appears explicitly in $SE(\hat{\alpha})$ and implicitly in $SE(\hat{\beta})$.
T appears implicitly since the sum $\sum_t (x_t - \bar{x})^2$ is from t = 1 to T.
4. The term $\sum_t x_t^2$ appears in the $SE(\hat{\alpha})$.
The reason is that $\sum_t x_t^2$ measures how far the points are away from the
y-axis.
Example: How to Calculate the Parameters and
Standard Errors
• Assume we have the following data calculated from a regression of y on a
single variable x and a constant over 22 observations.
• Data:
$\sum_t x_t y_t = 830102$, $T = 22$, $\bar{x} = 416.5$, $\bar{y} = 86.65$, $\sum_t x_t^2 = 3919654$, $RSS = 130.6$
• Calculations:
$\hat{\beta} = \frac{830102 - (22 \times 416.5 \times 86.65)}{3919654 - 22 \times (416.5)^2} = 0.35$
$\hat{\alpha} = 86.65 - 0.35 \times 416.5 = -59.12$
• We write $\hat{y}_t = \hat{\alpha} + \hat{\beta} x_t$, i.e.
$\hat{y}_t = -59.12 + 0.35 x_t$
Example (cont’d)
• SE(regression), $s = \sqrt{\frac{\sum_t \hat{u}_t^2}{T - 2}} = \sqrt{\frac{130.6}{20}} = 2.55$
$SE(\hat{\alpha}) = 2.55 \times \sqrt{\frac{3919654}{22 \times (3919654 - 22 \times 416.5^2)}} = 3.35$
$SE(\hat{\beta}) = 2.55 \times \sqrt{\frac{1}{3919654 - 22 \times 416.5^2}} = 0.0079$
• We now write the results as
$\hat{y}_t = -59.12 + 0.35 x_t$
            (3.35)   (0.0079)
with the standard errors in parentheses below the coefficients.
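The calculations in this example can be reproduced from the summary statistics alone. The sketch below is illustrative (the variable names are assumptions); it carries full precision throughout, so the intercept comes out at about -59.07 rather than -59.12, because the slides round $\hat{\beta}$ to 0.35 before computing $\hat{\alpha}$, and the standard errors differ from the slide values in the last reported digit because the slides round s to 2.55.

```python
import math

# Summary statistics quoted in the example (T = 22 observations).
T = 22
sum_xy = 830102.0
sum_xx = 3919654.0
x_bar = 416.5
y_bar = 86.65
rss = 130.6

sxx = sum_xx - T * x_bar ** 2                  # sum of squares of x about its mean
beta_hat = (sum_xy - T * x_bar * y_bar) / sxx
alpha_hat = y_bar - beta_hat * x_bar
s = math.sqrt(rss / (T - 2))                   # standard error of the regression
se_alpha = s * math.sqrt(sum_xx / (T * sxx))
se_beta = s * math.sqrt(1.0 / sxx)

print(f"beta_hat={beta_hat:.4f}  alpha_hat={alpha_hat:.2f}")
print(f"s={s:.3f}  SE(alpha_hat)={se_alpha:.3f}  SE(beta_hat)={se_beta:.4f}")
```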
An Introduction to Statistical Inference
• We want to make inferences about the likely population values from
the regression parameters.
Example: Suppose we have the following regression results:
$\hat{y}_t = 20.3 + 0.5091 x_t$
           (14.38)  (0.2561)
• $\hat{\beta} = 0.5091$ is a single (point) estimate of the unknown population
parameter, $\beta$. How “reliable” is this estimate?
• The reliability of the point estimate is measured by the coefficient’s
standard error.
Hypothesis Testing: Some Concepts
• We can use the information in the sample to make inferences about the
population.
• We will always have two hypotheses that go together, the null hypothesis
(denoted H0) and the alternative hypothesis (denoted H1).
• The null hypothesis is the statement or the statistical hypothesis that is actually
being tested. The alternative hypothesis represents the remaining outcomes of
interest.
• For example, suppose given the regression results above, we are interested in
the hypothesis that the true value of $\beta$ is in fact 0.5. We would use the notation
$H_0: \beta = 0.5$
$H_1: \beta \neq 0.5$
This would be known as a two-sided test.
One-Sided Hypothesis Tests
• Sometimes we may have some prior information that, for example, we
would expect $\beta > 0.5$ rather than $\beta < 0.5$. In this case, we would do a
one-sided test:
$H_0: \beta = 0.5$
$H_1: \beta > 0.5$
or we could have had
$H_0: \beta = 0.5$
$H_1: \beta < 0.5$
• There are two ways to conduct a hypothesis test: via the test of
significance approach or via the confidence interval approach.
The Probability Distribution of the
Least Squares Estimators
• We assume that $u_t \sim N(0, \sigma^2)$
• The least squares estimators are linear combinations of the random
variables, i.e. $\hat{\beta} = \sum_t w_t y_t$
• The weighted sum of normal random variables is also normally distributed, so
$\hat{\alpha} \sim N(\alpha, \mathrm{Var}(\hat{\alpha}))$
$\hat{\beta} \sim N(\beta, \mathrm{Var}(\hat{\beta}))$
• What if the errors are not normally distributed? Will the parameter estimates
still be normally distributed?
• Yes, approximately, if the other assumptions of the CLRM hold and the sample size is
sufficiently large.
The Probability Distribution of the
Least Squares Estimators (cont’d)
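• Standard normal variates can be constructed from $\hat{\alpha}$ and $\hat{\beta}$:
$\frac{\hat{\alpha} - \alpha}{\sqrt{\mathrm{var}(\hat{\alpha})}} \sim N(0, 1)$ and $\frac{\hat{\beta} - \beta}{\sqrt{\mathrm{var}(\hat{\beta})}} \sim N(0, 1)$
• But $\mathrm{var}(\hat{\alpha})$ and $\mathrm{var}(\hat{\beta})$ are unknown, so
$\frac{\hat{\alpha} - \alpha}{SE(\hat{\alpha})} \sim t_{T-2}$ and $\frac{\hat{\beta} - \beta}{SE(\hat{\beta})} \sim t_{T-2}$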
Testing Hypotheses:
The Test of Significance Approach
• Assume the regression equation is given by $y_t = \alpha + \beta x_t + u_t$,
for t = 1, 2, ..., T
• The steps involved in doing a test of significance are:
1. Estimate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ in the usual way
2. Calculate the test statistic. This is given by the formula
$\text{test statistic} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})}$
where $\beta^*$ is the value of $\beta$ under the null hypothesis.
The Test of Significance Approach (cont’d)
3. We need some tabulated distribution with which to compare the estimated
test statistics. Test statistics derived in this way can be shown to follow a t-
distribution with T-2 degrees of freedom.
As the number of degrees of freedom increases, we need to be less cautious in
our approach since we can be more sure that our results are robust.
4. We need to choose a “significance level”, often denoted $\alpha$. This is also
sometimes called the size of the test and it determines the region where we
will reject or not reject the null hypothesis that we are testing. It is
conventional to use a significance level of 5%, but 10% and 1% are also
commonly used.
The intuitive explanation is that, if the null hypothesis were true, we would only
expect a result as extreme as this or more extreme 5% of the time as a
consequence of chance alone.
Determining the Rejection Region for a Test of
Significance
5. Given a significance level, we can determine a rejection region and non-
rejection region. For a 2-sided test:
[Figure: density f(x) of the test statistic with a central 95% non-rejection region and a 2.5% rejection region in each tail.]
The Rejection Region for a 1-Sided Test (Upper Tail)
[Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the upper tail.]
The Rejection Region for a 1-Sided Test (Lower Tail)
[Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the lower tail.]
The Test of Significance Approach: Drawing
Conclusions
6. Use the t-tables to obtain a critical value or values with which to
compare the test statistic.
7. Finally perform the test. If the test statistic lies in the rejection
region then reject the null hypothesis (H0), else do not reject H0.
A Note on the t and the Normal Distribution
• You should all be familiar with the normal distribution and its
characteristic “bell” shape.
• We can scale a normal variate to have zero mean and unit variance by
subtracting its mean and dividing by its standard deviation.
• There is, however, a specific relationship between the t- and the
standard normal distribution. Both are symmetrical and centred on
zero. The t-distribution has another parameter, its degrees of freedom.
We will always know this (for the time being, it is the number of
observations minus 2).
What Does the t-Distribution Look Like?
[Figure: densities of the normal distribution and the t-distribution plotted together.]
Comparing the t and the Normal Distribution
• In the limit, a t-distribution with an infinite number of degrees of freedom is
a standard normal, i.e. $t(\infty) = N(0, 1)$
• Examples from statistical tables:
Significance level    N(0,1)    t(40)    t(4)
50%                   0         0        0
5%                    1.64      1.68     2.13
2.5%                  1.96      2.02     2.78
0.5%                  2.57      2.70     4.60
• The reason for using the t-distribution rather than the standard normal is that
we had to estimate $\sigma^2$, the variance of the disturbances.
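The tabulated critical values can be checked numerically. The following is a rough sketch that uses only the Python standard library: the normal quantiles come from `statistics.NormalDist`, while the t quantiles are obtained by integrating the t density with Simpson's rule and inverting by bisection. The helper names `t_cdf` and `t_quantile` are illustrative assumptions; in practice a statistics package would normally be used instead.

```python
import math
from statistics import NormalDist


def t_cdf(x, df, n=2000):
    """P(t <= x) for x >= 0 and df degrees of freedom, by Simpson's rule on [0, x] (n even)."""
    if x == 0:
        return 0.5
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))

    def pdf(v):
        return c * (1 + v * v / df) ** (-(df + 1) / 2)

    h = x / n
    total = pdf(0.0) + pdf(x)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_quantile(p, df):
    """Critical value c >= 0 with P(t <= c) = p, for 0.5 <= p < 1, found by bisection on [0, 100]."""
    lo, hi = 0.0, 100.0
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


for tail in (0.05, 0.025, 0.005):
    z = NormalDist().inv_cdf(1 - tail)
    print(tail, round(z, 3), round(t_quantile(1 - tail, 40), 3), round(t_quantile(1 - tail, 4), 3))
```

The t critical values shrink towards the normal ones as the degrees of freedom rise. The normal value for the 0.5% row is about 2.576, which the table above reports as 2.57.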
The Confidence Interval Approach
to Hypothesis Testing
• An example of its usage: We estimate a parameter, say $\hat{\beta}$, to be 0.93, and
a “95% confidence interval” to be (0.77, 1.09). This means that we are
95% confident that the interval contains the true (but unknown)
value of $\beta$.
• Confidence intervals are almost invariably two-sided, although in
theory a one-sided interval can be constructed.
How to Carry out a Hypothesis Test
Using Confidence Intervals
1. Calculate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ as before.
2. Choose a significance level, $\alpha$ (again the convention is 5%). This is equivalent to
choosing a $(1 - \alpha) \times 100\%$ confidence interval, i.e. 5% significance level = 95%
confidence interval
3. Use the t-tables to find the appropriate critical value, which will again have T-2
degrees of freedom.
4. The confidence interval is given by
$\left( \hat{\beta} - t_{crit} \times SE(\hat{\beta}),\; \hat{\beta} + t_{crit} \times SE(\hat{\beta}) \right)$
5. Perform the test: If the hypothesised value of $\beta$ ($\beta^*$) lies outside the confidence
interval, then reject the null hypothesis that $\beta = \beta^*$, otherwise do not reject the null.
Confidence Intervals Versus Tests of Significance
• Note that the Test of Significance and Confidence Interval approaches
always give the same answer.
• Under the test of significance approach, we would not reject $H_0$ that $\beta = \beta^*$
if the test statistic lies within the non-rejection region, i.e. if
$-t_{crit} \leq \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} \leq +t_{crit}$
• Rearranging, we would not reject if
$-t_{crit} \times SE(\hat{\beta}) \leq \hat{\beta} - \beta^* \leq +t_{crit} \times SE(\hat{\beta})$
$\hat{\beta} - t_{crit} \times SE(\hat{\beta}) \leq \beta^* \leq \hat{\beta} + t_{crit} \times SE(\hat{\beta})$
• But this is just the rule under the confidence interval approach.
Constructing Tests of Significance and
Confidence Intervals: An Example
• Using the regression results above,
$\hat{y}_t = 20.3 + 0.5091 x_t$, T = 22
           (14.38)  (0.2561)
• Using both the test of significance and confidence interval approaches,
test the hypothesis that $\beta = 1$ against a two-sided alternative.
• The first step is to obtain the critical value. We want $t_{crit} = t_{20;5\%}$
Determining the Rejection Region
[Figure: density f(x) of the t-distribution with 2.5% rejection regions below -2.086 and above +2.086.]
Performing the Test
• The hypotheses are:
$H_0: \beta = 1$
$H_1: \beta \neq 1$
Test of significance approach:
$\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$
Do not reject $H_0$ since the test stat lies within the non-rejection region.
Confidence interval approach:
$\hat{\beta} \pm t_{crit} \times SE(\hat{\beta}) = 0.5091 \pm 2.086 \times 0.2561 = (-0.0251, 1.0433)$
Since 1 lies within the confidence interval, do not reject $H_0$.
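Both approaches can be written as a few lines of code, which makes it easy to see that they give the same decision. This is a minimal sketch; the function names are illustrative assumptions, and the critical value 2.086 is taken from the t-tables as quoted above rather than computed.

```python
def t_test(beta_hat, se, beta_star, t_crit):
    """Return (test statistic, reject?) for H0: beta = beta_star against a two-sided alternative."""
    stat = (beta_hat - beta_star) / se
    return stat, abs(stat) > t_crit


def confidence_interval(beta_hat, se, t_crit):
    """Return the bounds (beta_hat - t_crit * SE, beta_hat + t_crit * SE)."""
    return beta_hat - t_crit * se, beta_hat + t_crit * se


beta_hat, se_beta = 0.5091, 0.2561   # slope estimate and its standard error
t_crit = 2.086                        # t(20) with 2.5% in each tail, from the t-tables

stat, reject = t_test(beta_hat, se_beta, 1.0, t_crit)
lower, upper = confidence_interval(beta_hat, se_beta, t_crit)
reject_ci = not (lower <= 1.0 <= upper)   # reject if the hypothesised value is outside the interval
print(round(stat, 3), reject, (round(lower, 4), round(upper, 4)), reject_ci)
```

The sketch reproduces the test statistic of -1.917 and the interval (-0.0251, 1.0433), and neither approach rejects the null. Passing a different hypothesised value in place of 1.0 tests other null hypotheses with the same interval.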
Testing other Hypotheses
• What if we wanted to test $H_0: \beta = 0$ or $H_0: \beta = 2$?
• Note that we can test these with the confidence interval approach.
For interest (!), test
$H_0: \beta = 0$ vs. $H_1: \beta \neq 0$
$H_0: \beta = 2$ vs. $H_1: \beta \neq 2$
Changing the Size of the Test
• But note that we looked at only a 5% size of test. In marginal cases
(e.g. $H_0: \beta = 1$), we may get a completely different answer if we use a
different size of test. This is where the test of significance approach is
better than a confidence interval.
• For example, say we wanted to use a 10% size of test. Using the test of
significance approach,
$\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$
as above. The only thing that changes is the critical t-value.
Changing the Size of the Test:
The New Rejection Regions
[Figure: density f(x) of the t-distribution with 5% rejection regions below -1.725 and above +1.725.]
Changing the Size of the Test:
The Conclusion
• t20;10% = 1.725. So now, as the test statistic lies in the rejection region,
we would reject H0.
• Caution should therefore be used when placing emphasis on or making
decisions in marginal cases (i.e. in cases where we only just reject or
not reject).
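The sensitivity of the decision to the size of the test can be seen by holding the test statistic fixed and varying only the critical value. The sketch below is illustrative; the dictionary of critical values simply records the two t(20) values quoted on the slides.

```python
beta_hat, se_beta, beta_star = 0.5091, 0.2561, 1.0
stat = (beta_hat - beta_star) / se_beta   # the same test statistic for every size of test

# Two-sided critical values for t(20) quoted on the slides, keyed by the size of the test.
critical_values = {0.05: 2.086, 0.10: 1.725}

decisions = {size: abs(stat) > crit for size, crit in critical_values.items()}
for size, reject in decisions.items():
    print(f"{size:.0%} test: {'reject' if reject else 'do not reject'} H0")
```

The statistic of about -1.917 lies between the two critical values, which is exactly what makes this a marginal case.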
Some More Terminology
• If we reject the null hypothesis at the 5% level, we say that the result
of the test is statistically significant.
• Note that a statistically significant result may be of no practical
significance. E.g. if a shipment of cans of beans is expected to weigh
450g per tin, but the actual mean weight of some tins is 449g, the
result may be highly statistically significant but presumably nobody
would care about 1g of beans.
The Errors That We Can Make
Using Hypothesis Tests
• We usually reject H0 if the test statistic is statistically significant at a
chosen significance level.
• There are two possible errors we could make:
1. Rejecting H0 when it was really true. This is called a type I error.
2. Not rejecting H0 when it was in fact false. This is called a type II error.
                                                Reality
Result of test                      H0 is true               H0 is false
Significant (reject H0)             Type I error = $\alpha$  ✓
Insignificant (do not reject H0)    ✓                        Type II error = $\beta$
The Trade-off Between Type I and Type II Errors
• The probability of a type I error is just $\alpha$, the significance level or size of test we
chose. To see this, recall what we said significance at the 5% level meant: it is only
5% likely that a result as extreme as this or more extreme could have occurred purely by
chance.
• Note that there is no chance for a free lunch here! What happens if we reduce the size
of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a type
I error ... but we also reduce the probability that we will reject the null hypothesis at
all, so we increase the probability of a type II error:
Reduce size of test → more strict criterion for rejection → reject null hypothesis less often
→ less likely to falsely reject, but more likely to incorrectly not reject
• So there is always a trade off between type I and type II errors when choosing a
significance level. The only way we can reduce the chances of both is to increase the
sample size.
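The trade-off can be illustrated with a small Monte Carlo experiment. Everything in the sketch below is an assumption made for illustration: the data-generating process (intercept 2, unit-variance normal errors, x fixed at 1, ..., 22), the alternative value of 1.05 for the slope, the number of replications, and the helper names. The critical values are the two-sided t(20) values for 5% and 1% tests.

```python
import math
import random


def t_ratio_for_beta(xs, ys, beta_star):
    """t-statistic (beta_hat - beta_star) / SE(beta_hat) from a simple OLS regression."""
    T = len(xs)
    x_bar, y_bar = sum(xs) / T, sum(ys) / T
    sxx = sum((x - x_bar) ** 2 for x in xs)
    beta_hat = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sxx
    alpha_hat = y_bar - beta_hat * x_bar
    rss = sum((y - alpha_hat - beta_hat * x) ** 2 for x, y in zip(xs, ys))
    s = math.sqrt(rss / (T - 2))
    return (beta_hat - beta_star) / (s / math.sqrt(sxx))


def rejection_rate(true_beta, t_crit, reps=2000, T=22, seed=0):
    """Share of simulated samples in which H0: beta = 1 is rejected (two-sided test)."""
    rng = random.Random(seed)
    xs = [float(t) for t in range(1, T + 1)]   # x held fixed in repeated samples
    rejections = 0
    for _ in range(reps):
        ys = [2.0 + true_beta * x + rng.gauss(0.0, 1.0) for x in xs]
        if abs(t_ratio_for_beta(xs, ys, 1.0)) > t_crit:
            rejections += 1
    return rejections / reps


# H0 true (beta = 1): the rejection rate estimates the probability of a type I error.
type1_at_5pct = rejection_rate(1.0, t_crit=2.086)
type1_at_1pct = rejection_rate(1.0, t_crit=2.845)
# H0 false (beta = 1.05): one minus the rejection rate estimates the type II error probability.
type2_at_5pct = 1 - rejection_rate(1.05, t_crit=2.086)
type2_at_1pct = 1 - rejection_rate(1.05, t_crit=2.845)
print(type1_at_5pct, type1_at_1pct, type2_at_5pct, type2_at_1pct)
```

Because the same seed is used for each call, the 1% test is applied to exactly the same simulated samples as the 5% test, so it can never reject more often: the estimated type I error falls while the estimated type II error rises.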
A Special Type of Hypothesis Test: The t-ratio
• Recall that the formula for a test of significance approach to hypothesis
testing using a t-test was
$\text{test statistic} = \frac{\hat{\beta}_i - \beta_i^*}{SE(\hat{\beta}_i)}$
• If the test is $H_0: \beta_i = 0$
$H_1: \beta_i \neq 0$
i.e. a test that the population coefficient is zero against a two-sided
alternative, this is known as a t-ratio test:
Since $\beta_i^* = 0$, $\text{test stat} = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$
• The ratio of the coefficient to its SE is known as the t-ratio or t-statistic.
The t-ratio: An Example
• Suppose that we have the following parameter estimates, standard errors
and t-ratios for an intercept and slope respectively.
               Intercept    Slope
Coefficient    1.10         -4.40
SE             1.35         0.96
t-ratio        0.81         -4.58
Compare this with a $t_{crit}$ with 15-3 = 12 d.f.
(2½% in each tail for a 5% test) = 2.179 at the 5% level
                                 = 3.055 at the 1% level
• Do we reject $H_0: \beta_1 = 0$? (No)
$H_0: \beta_2 = 0$? (Yes)
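The t-ratios follow directly from dividing each coefficient by its standard error. The short sketch below (variable names are illustrative) does this for the two coefficients quoted in the example and compares each ratio with the t(12) critical values; dividing -4.40 by 0.96 gives about -4.58 for the slope.

```python
# Estimates and standard errors from the example: (coefficient, standard error).
coefficients = {"intercept": (1.10, 1.35), "slope": (-4.40, 0.96)}
t_crit_5pct = 2.179   # t(12), 2.5% in each tail
t_crit_1pct = 3.055   # t(12), 0.5% in each tail

results = {}
for name, (estimate, se) in coefficients.items():
    t_ratio = estimate / se
    # Store the rounded t-ratio and whether H0: coefficient = 0 is rejected at 5% and at 1%.
    results[name] = (round(t_ratio, 2), abs(t_ratio) > t_crit_5pct, abs(t_ratio) > t_crit_1pct)
    print(name, results[name])
```

The intercept is not significant at either level, while the slope is significant at both.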
What Does the t-ratio tell us?
• If we reject H0, we say that the result is significant. If the coefficient is not
“significant” (e.g. the intercept coefficient in the last regression above), then
it means that the variable is not helping to explain variations in y. Variables
that are not significant are usually removed from the regression model.
• In practice there are good statistical reasons for always having a constant
even if it is not significant. Look at what happens if no intercept is included:
[Figure: scatter of $y_t$ against $x_t$ illustrating the fitted line when no intercept is included.]
An Example of the Use of a Simple t-test to Test a
Theory in Finance
• Testing for the presence and significance of abnormal returns (“Jensen’s
alpha” - Jensen, 1968).
• The Data: Annual Returns on the portfolios of 115 mutual funds from
1945-1964.
• The model: $R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + u_{jt}$ for j = 1, …, 115
• We are interested in the significance of $\alpha_j$.
• The null hypothesis is $H_0: \alpha_j = 0$.
Frequency Distribution of t-ratios of Mutual Fund
Alphas (gross of transactions costs)
Source Jensen (1968). Reprinted with the permission of Blackwell publishers.
Frequency Distribution of t-ratios of Mutual Fund
Alphas (net of transactions costs)
Source Jensen (1968). Reprinted with the permission of Blackwell publishers.
Can UK Unit Trust Managers “Beat the Market”?
• We now perform a variant on Jensen’s test in the context of the UK market,
considering monthly returns on 76 equity unit trusts. The data cover the
period January 1979 – May 2000 (257 observations for each fund). Some
summary statistics for the funds are:
                                           Mean    Minimum    Maximum    Median
Average monthly return, 1979-2000          1.0%    0.6%       1.4%       1.0%
Standard deviation of returns over time    5.1%    4.3%       6.9%       5.0%
• Jensen Regression Results for UK Unit Trust Returns, January 1979-May
2000
$R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + \epsilon_{jt}$
Can UK Unit Trust Managers “Beat the Market”?: Results
Estimates of            Mean      Minimum    Maximum    Median
$\alpha$                -0.02%    -0.54%     0.33%      -0.03%
$\beta$                 0.91      0.56       1.09       0.91
t-ratio on $\alpha$     -0.07     -2.44      3.11       -0.25
• In fact, gross of transactions costs, 9 funds of the sample of 76 were
able to significantly out-perform the market by providing a significant
positive alpha, while 7 funds yielded significant negative alphas.
The Overreaction Hypothesis and
the UK Stock Market
• Motivation
Two studies by DeBondt and Thaler (1985, 1987) showed that stocks which
experience a poor performance over a 3 to 5 year period tend to outperform
stocks which had previously performed relatively well.
• How Can This be Explained?
Two suggestions:
1. A manifestation of the size effect
DeBondt & Thaler did not believe this a sufficient explanation, but Zarowin
(1990) found that allowing for firm size did reduce the subsequent return on
the losers.
2. Reversals reflect changes in equilibrium required returns
Ball & Kothari (1989) find the CAPM beta of losers to be considerably
higher than that of winners.
The Overreaction Hypothesis and
the UK Stock Market (cont’d)
• Another interesting anomaly: the January effect.
– Another possible reason for the superior subsequent performance
of losers.
– Zarowin (1990) finds that 80% of the extra return available from
holding the losers accrues to investors in January.
• Example study: Clare and Thomas (1995)
Data:
Monthly UK stock returns from January 1955 to 1990 on all firms
traded on the London Stock Exchange.
Methodology
• Calculate the monthly excess return of the stock over the market over a 12,
24 or 36 month period for each stock i:
$U_{it} = R_{it} - R_{mt}$, n = 12, 24 or 36 months
• Calculate the average monthly excess return for stock i over the first 12, 24, or
36 month period:
$\bar{R}_i = \frac{1}{n} \sum_{t=1}^{n} U_{it}$
Portfolio Formation
• Then rank the stocks from highest average return to lowest and form 5
portfolios:
Portfolio 1: Best performing 20% of firms
Portfolio 2: Next 20%
Portfolio 3: Next 20%
Portfolio 4: Next 20%
Portfolio 5: Worst performing 20% of firms.
• Use the same sample length n to monitor the performance of each
portfolio.
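The formation step (market-adjusted returns, their average over the formation period, then ranking into five groups) can be sketched as below. This is an illustration under stated assumptions rather than the procedure used in the study: the function name `form_portfolios`, the input layout and the toy data are hypothetical.

```python
def form_portfolios(stock_returns, market_returns, n_portfolios=5):
    """Rank stocks by average market-adjusted return and split into equal groups.

    stock_returns: dict mapping a stock name to its list of monthly returns R_it
    market_returns: list of monthly market returns R_mt over the same n months
    Returns (average excess return per stock, list of portfolios, best first).
    """
    n = len(market_returns)
    avg_excess = {}
    for name, returns in stock_returns.items():
        u = [r_i - r_m for r_i, r_m in zip(returns, market_returns)]  # U_it
        avg_excess[name] = sum(u) / n  # average of U_it over the n months
    ranked = sorted(avg_excess, key=avg_excess.get, reverse=True)
    count = len(ranked)
    portfolios = [
        ranked[k * count // n_portfolios:(k + 1) * count // n_portfolios]
        for k in range(n_portfolios)
    ]
    return avg_excess, portfolios


# Hypothetical example: 10 stocks, 12 months, each stock earning a constant return.
market = [0.01] * 12
stocks = {f"S{i}": [0.01 * i] * 12 for i in range(10)}
avg_excess, portfolios = form_portfolios(stocks, market)
print("Portfolio 1 (winners):", portfolios[0])
print("Portfolio 5 (losers):", portfolios[-1])
```

The same portfolios would then be held fixed and their returns tracked over the following n months.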
Portfolio Formation and
Portfolio Tracking Periods
• How many samples of length n have we got?
n = 1, 2, or 3 years.
• If n = 1 year:
Estimate for year 1
Monitor portfolios for year 2
Estimate for year 3
...
Monitor portfolios for year 36
• So if n = 1, we have 18 INDEPENDENT (non-overlapping) observation /
tracking periods.
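The count follows from each tracking period needing its own formation period of the same length, so one cycle uses 2n years. This is a small sketch of that arithmetic, assuming 36 full years of data. The helper name is illustrative.

```python
def independent_periods(total_years, n_years):
    """Number of non-overlapping formation-plus-tracking cycles of length 2 * n_years."""
    return total_years // (2 * n_years)


# 36 years of data, with horizons of 1, 2 and 3 years.
for n_years in (1, 2, 3):
    print(f"n = {n_years} year(s): {independent_periods(36, n_years)} tracking periods")
```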
Constructing Winner and Loser Returns
• Similarly, n = 2 gives 9 independent periods and n = 3 gives 6 independent
periods.
• Calculate monthly portfolio returns assuming an equal weighting of stocks in
each portfolio.
• Denote the mean return for each month over the 18, 9 or 6 periods for the
winner and loser portfolios as $\bar{R}_p^W$ and $\bar{R}_p^L$ respectively.
• Define the difference between these as $\bar{R}_{Dt} = \bar{R}_p^L - \bar{R}_p^W$.
• Then perform the regression
$\bar{R}_{Dt} = \alpha_1 + \eta_t$ (Test 1)
• Look at the significance of $\alpha_1$.
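Regressing the return difference on a constant alone is the same as a t-test of whether its mean is zero: the OLS intercept is the sample mean, and its standard error is s divided by the square root of T. The sketch below uses made-up loser-minus-winner differences. The function name and the data are illustrative.

```python
import math


def constant_only_regression(r_d):
    """Regress r_d on a constant; return (alpha1_hat, standard error, t-ratio)."""
    T = len(r_d)
    alpha1_hat = sum(r_d) / T  # OLS intercept equals the sample mean
    rss = sum((r - alpha1_hat) ** 2 for r in r_d)
    s = math.sqrt(rss / (T - 1))  # one estimated parameter
    se = s / math.sqrt(T)
    return alpha1_hat, se, alpha1_hat / se


# Hypothetical monthly loser-minus-winner return differences.
r_d = [0.004, -0.002, 0.006, 0.001, 0.003, -0.001, 0.005, 0.002]
alpha1_hat, se, t_ratio = constant_only_regression(r_d)
print(f"alpha1 = {alpha1_hat:.5f}, t-ratio = {t_ratio:.2f}")
```

A significantly positive estimate would indicate that past losers subsequently outperform past winners.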
Allowing for Differences in the Riskiness
of the Winner and Loser Portfolios
• Problem: A significant and positive $\alpha_1$ could simply reflect a higher
required return on loser stocks because they are more risky.
• Solution: Allow for risk differences by regressing against the market risk
premium:
$\bar{R}_{Dt} = \alpha_2 + \beta (R_{mt} - R_{ft}) + \eta_t$ (Test 2)
where
$R_{mt}$ is the return on the FTA All-share
$R_{ft}$ is the return on a UK government 3 month t-bill.
Is there an Overreaction Effect in the
UK Stock Market? Results
Panel A: All Months
                                           n = 12      n = 24      n = 36
Return on Loser                            0.0033      0.0011      0.0129
Return on Winner                           0.0036      -0.0003     0.0115
Implied annualised return difference       -0.37%      1.68%       1.56%
Coefficient for (3.47): $\hat{\alpha}_1$   -0.00031    0.0014**    0.0013
                                           (0.29)      (2.01)      (1.55)
Coefficients for (3.48): $\hat{\alpha}_2$  -0.00034    0.00147**   0.0013*
                                           (-0.30)     (2.01)      (1.41)
$\hat{\beta}$                              -0.022      0.010       -0.0025
                                           (-0.25)     (0.21)      (-0.06)
Panel B: All Months Except January
Coefficient for (3.47): $\hat{\alpha}_1$   -0.0007     0.0012*     0.0009
                                           (-0.72)     (1.63)      (1.05)
Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels
respectively. Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell
Publishers.
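The "implied annualised return difference" row in Panel A appears to be twelve times the monthly coefficient from the constant-only regression. That is an inference from the reported numbers, not something the table states, and it assumes simple rather than compounded annualisation. A quick check:

```python
# Monthly loser-minus-winner coefficients from Panel A, keyed by horizon in months.
alpha1_hat = {12: -0.00031, 24: 0.0014, 36: 0.0013}


def annualise(monthly_difference):
    """Simple (non-compounded) annualisation, expressed in percent."""
    return 12 * monthly_difference * 100


for horizon, coef in alpha1_hat.items():
    print(f"n = {horizon}: {annualise(coef):.2f}% per year")
```

The three results round to -0.37%, 1.68% and 1.56%, matching the row in the table.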
Testing for Seasonal Effects in Overreactions
• Is there evidence that losers out-perform winners more at one time of the
year than another?
• To test this, calculate the difference between the winner & loser portfolios
as previously, $\bar{R}_{Dt}$, and regress this on 12 month-of-the-year dummies:
$\bar{R}_{Dt} = \sum_{i=1}^{12} \delta_i M_i + \upsilon_t$
• Significant out-performance of losers over winners in
– June (for the 24-month horizon), and
– January, April and October (for the 36-month horizon)
• Winners appear to stay significantly as winners in
– March (for the 12-month horizon).
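A regression on twelve mutually exclusive month dummies with no separate intercept has a simple closed form: X'X is diagonal, so each coefficient is the average return difference in that calendar month. The sketch below uses that shortcut instead of a general matrix solve. The function name and the simulated data are illustrative, and it assumes every month appears at least once and that there are more than 12 observations.

```python
import math


def month_dummy_regression(r_d, months):
    """OLS of r_d on 12 month-of-the-year dummies (no separate intercept).

    months[t] is the calendar month (1-12) of observation t.
    Returns (list of coefficients, list of t-ratios), both of length 12.
    """
    counts = [0] * 12
    sums = [0.0] * 12
    for r, m in zip(r_d, months):
        counts[m - 1] += 1  # diagonal element of X'X
        sums[m - 1] += r    # element of X'y
    deltas = [sums[i] / counts[i] for i in range(12)]
    rss = sum((r - deltas[m - 1]) ** 2 for r, m in zip(r_d, months))
    s2 = rss / (len(r_d) - 12)  # 12 estimated parameters
    t_ratios = [deltas[i] / math.sqrt(s2 / counts[i]) for i in range(12)]
    return deltas, t_ratios


# Hypothetical data: three years of monthly differences with a January effect.
months = [m for _ in range(3) for m in range(1, 13)]
r_d = [(0.02 if m == 1 else 0.0) + 0.001 * ((k % 5) - 2) for k, m in enumerate(months)]
deltas, t_ratios = month_dummy_regression(r_d, months)
print(f"January: coefficient = {deltas[0]:.4f}, t-ratio = {t_ratios[0]:.2f}")
```

Each month's t-ratio tests whether losers out-perform winners in that month specifically.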
Conclusions
• Evidence of overreactions in stock returns.
• Losers tend to be small so we can attribute most of the overreaction in the
UK to the size effect.
Comments
• Small samples
• No diagnostic checks of model adequacy
The Exact Significance Level or p-value
• This is equivalent to choosing an infinite number of critical t-values from
tables. It gives us the marginal significance level where we would be
indifferent between rejecting and not rejecting the null hypothesis.
• If the test statistic is large in absolute value, the p-value will be small, and
vice versa. The p-value gives the plausibility of the null hypothesis.
e.g. a test statistic of 1.47 is distributed as $t_{62}$.
The p-value = 0.12.
We reject the null only when the p-value is below the chosen significance level:
• Do we reject at the 5% level? No (0.12 > 0.05)
• Do we reject at the 10% level? No (0.12 > 0.10)
• Do we reject at the 20% level? Yes (0.12 < 0.20)
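A p-value of this kind can be computed directly instead of being read from tables. The sketch below integrates the t density numerically with Simpson's rule, using only the standard library. The function names are illustrative and a two-sided test is assumed. For 1.47 with 62 degrees of freedom this gives a two-sided p-value of roughly 0.15, somewhat above the 0.12 quoted, but the three decisions are the same.

```python
import math


def t_pdf(x, df):
    """Density of the Student t-distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(x * x / df))


def two_sided_p_value(t_stat, df, steps=2000):
    """P(|T| > |t_stat|), using Simpson's rule on [0, |t_stat|] (steps must be even)."""
    b = abs(t_stat)
    h = b / steps
    total = t_pdf(0.0, df) + t_pdf(b, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = total * h / 3  # P(0 < T < |t_stat|)
    return 2 * (0.5 - central_mass)


p = two_sided_p_value(1.47, 62)
for level in (0.05, 0.10, 0.20):
    decision = "reject" if p < level else "do not reject"
    print(f"{level:.0%} level: {decision} H0")
```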

A brief overview of the classical linear regression model

  • 1. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 1 Chapter 3 A brief overview of the classical linear regression model
  • 2. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 2 Regression • Regression is probably the single most important tool at the econometrician’s disposal. But what is regression analysis? • It is concerned with describing and evaluating the relationship between a given variable (usually called the dependent variable) and one or more other variables (usually known as the independent variable(s)).
  • 3. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 3 Some Notation • Denote the dependent variable by y and the independent variable(s) by x1, x2, ... , xk where there are k independent variables. • Some alternative names for the y and x variables: y x dependent variable independent variables regressand regressors effect variable causal variables explained variable explanatory variable • Note that there can be many x variables but we will limit ourselves to the case where there is only one x variable to start with. In our set-up, there is only one y variable.
  • 4. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 4 Regression is different from Correlation • If we say y and x are correlated, it means that we are treating y and x in a completely symmetrical way. • In regression, we treat the dependent variable (y) and the independent variable(s) (x’s) very differently. The y variable is assumed to be random or “stochastic” in some way, i.e. to have a probability distribution. The x variables are, however, assumed to have fixed (“non-stochastic”) values in repeated samples.
  • 5. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 5 Simple Regression • For simplicity, say k=1. This is the situation where y depends on only one x variable. • Examples of the kind of relationship that may be of interest include: – How asset returns vary with their level of market risk – Measuring the long-term relationship between stock prices and dividends. – Constructing an optimal hedge ratio
  • 6. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 6 Simple Regression: An Example • Suppose that we have the following data on the excess returns on a fund manager’s portfolio (“fund XXX”) together with the excess returns on a market index: • We have some intuition that the beta on this fund is positive, and we therefore want to find whether there appears to be a relationship between x and y given the data that we have. The first stage would be to form a scatter plot of the two variables. Year, t Excess return = rXXX,t – rft Excess return on market index = rmt - rft 1 17.8 13.7 2 39.0 23.2 3 12.8 6.9 4 24.2 16.8 5 17.2 12.3
  • 7. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 7 Graph (Scatter Diagram) 0 5 10 15 20 25 30 35 40 45 0 5 10 15 20 25 Excess return on market portfolio Excess return on fund XXX
  • 8. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 8 Finding a Line of Best Fit • We can use the general equation for a straight line, y=a+bx to get the line that best “fits” the data. • However, this equation (y=a+bx) is completely deterministic. • Is this realistic? No. So what we do is to add a random disturbance term, u into the equation. yt =  + xt + ut where t = 1,2,3,4,5
  • 9. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 9 Why do we include a Disturbance term? • The disturbance term can capture a number of features: - We always leave out some determinants of yt - There may be errors in the measurement of yt that cannot be modelled. - Random outside influences on yt which we cannot model
  • 10. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 10 Determining the Regression Coefficients • So how do we determine what  and  are? • Choose  and  so that the (vertical) distances from the data points to the fitted lines are minimised (so that the line fits the data as closely as possible): y x
  • 11. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 11 Ordinary Least Squares • The most common method used to fit a line to the data is known as OLS (ordinary least squares). • What we actually do is take each distance and square it (i.e. take the area of each of the squares in the diagram) and minimise the total sum of the squares (hence least squares). • Tightening up the notation, let yt denote the actual data point t denote the fitted value from the regression line denote the residual, yt - t ŷ t ŷ t û
  • 12. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 12 Actual and Fitted Value y i x x i y i ŷ i û
  • 13. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 13 How OLS Works • So min. , or minimise . This is known as the residual sum of squares. • But what was ? It was the difference between the actual point and the line, yt - . • So minimising is equivalent to minimising with respect to and . $  $  2 5 2 4 2 3 2 2 2 1 ˆ ˆ ˆ ˆ ˆ u u u u u     t ŷ t û   5 1 2 ˆ t t u  2 ˆ   t t y y  2 ˆt u
  • 14. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 14 Deriving the OLS Estimator • But , so let • Want to minimise L with respect to (w.r.t.) and , so differentiate L w.r.t. and (1) (2) • From (1), • But and . $  $  $  $  t t x y   ˆ ˆ ˆ         t t t x y L 0 ) ˆ ˆ ( 2 ˆ            t t t t x y x L 0 ) ˆ ˆ ( 2 ˆ      0 ˆ ˆ 0 ) ˆ ˆ (           t t t t t x T y x y       y T yt   x T xt        t i t t t t x y y y L 2 2 ) ˆ ˆ ( ) ˆ (  
  • 15. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 15 Deriving the OLS Estimator (cont’d) • So we can write or (3) • From (2), (4) • From (3), (5) • Substitute into (4) for from (5), $  0 ˆ ˆ    x y       t t t t x y x 0 ) ˆ ˆ (   x y   ˆ ˆ                      t t t t t t t t t t t t t t x x T x y T y x x x x x y y x x x y y x 0 ˆ ˆ 0 ˆ ˆ 0 ) ˆ ˆ ( 2 2 2       0 ˆ ˆ    x T T y T  
  • 16. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 16 Deriving the OLS Estimator (cont’d) • Rearranging for , • So overall we have • This method of finding the optimum is known as ordinary least squares. $       t t t y x x y T x x T ) ( ˆ 2 2  x y x T x y x T y x t t t    ˆ ˆ and ˆ 2 2       
  • 17. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 17 What do We Use and For? • In the CAPM example used above, plugging the 5 observations in to make up the formulae given above would lead to the estimates = -1.74 and = 1.64. We would write the fitted line as: • Question: If an analyst tells you that she expects the market to yield a return 20% higher than the risk-free rate next year, what would you expect the return on fund XXX to be? • Solution: We can say that the expected value of y = “-1.74 + 1.64 * value of x”, so plug x = 20 into the equation to get the expected value for y: $  $  $  $  06 . 31 20 64 . 1 74 . 1 ˆ      i y t t x y 64 . 1 74 . 1 ˆ   
  • 18. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 18 Accuracy of Intercept Estimate • Care needs to be exercised when considering the intercept estimate, particularly if there are no or few observations close to the y-axis: y 0 x
  • 19. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 19 The Population and the Sample • The population is the total collection of all objects or people to be studied, for example, • Interested in Population of interest predicting outcome the entire electorate of an election • A sample is a selection of just some items from the population. • A random sample is a sample in which each individual item in the population is equally likely to be drawn.
  • 20. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 20 The DGP and the PRF • The population regression function (PRF) is a description of the model that is thought to be generating the actual data and the true relationship between the variables (i.e. the true values of  and ). • The PRF is • The SRF is and we also know that . • We use the SRF to infer likely values of the PRF. • We also want to know how “good” our estimates of  and  are. t t x y   ˆ ˆ ˆ   t t t u x y      t t t y y u ˆ ˆ  
  • 21. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 21 Linearity • In order to use OLS, we need a model which is linear in the parameters ( and  ). It does not necessarily have to be linear in the variables (y and x). • Linear in the parameters means that the parameters are not multiplied together, divided, squared or cubed etc. • Some models can be transformed to linear ones by a suitable substitution or manipulation, e.g. the exponential regression model • Then let yt=ln Yt and xt=ln Xt t t t u x y      t t t u t t u X Y e X e Y t      ln ln    
  • 22. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 22 Linear and Non-linear Models • This is known as the exponential regression model. Here, the coefficients can be interpreted as elasticities. • Similarly, if theory suggests that y and x should be inversely related: then the regression can be estimated using OLS by substituting • But some models are intrinsically non-linear, e.g. t t t u x y      t t x z 1  t t t u x y     
  • 23. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 23 Estimator or Estimate? • Estimators are the formulae used to calculate the coefficients • Estimates are the actual numerical values for the coefficients.
  • 24. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 24 The Assumptions Underlying the Classical Linear Regression Model (CLRM) • The model which we have used is known as the classical linear regression model. • We observe data for xt, but since yt also depends on ut, we must be specific about how the ut are generated. • We usually make the following set of assumptions about the ut’s (the unobservable error terms): • Technical Notation Interpretation 1. E(ut) = 0 The errors have zero mean 2. Var (ut) = 2 The variance of the errors is constant and finite over all values of xt 3. Cov (ui,uj)=0 The errors are statistically independent of one another 4. Cov (ut,xt)=0 No relationship between the error and corresponding x variate
  • 25. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 25 The Assumptions Underlying the CLRM Again • An alternative assumption to 4., which is slightly stronger, is that the xt’s are non-stochastic or fixed in repeated samples. • A fifth assumption is required if we want to make inferences about the population parameters (the actual  and ) from the sample parameters ( and ) • Additional Assumption 5. ut is normally distributed $  $ 
  • 26. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 26 Properties of the OLS Estimator • If assumptions 1. through 4. hold, then the estimators and determined by OLS are known as Best Linear Unbiased Estimators (BLUE). What does the acronym stand for? • “Estimator” - is an estimator of the true value of . • “Linear” - is a linear estimator • “Unbiased” - On average, the actual value of the and ’s will be equal to the true values. • “Best” - means that the OLS estimator has minimum variance among the class of linear unbiased estimators. The Gauss-Markov theorem proves that the OLS estimator is best. $  $  $  $  $  $  $ 
  • 27. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 27 Consistency/Unbiasedness/Efficiency • Consistent The least squares estimators and are consistent. That is, the estimates will converge to their true values as the sample size increases to infinity. Need the assumptions E(xtut)=0 and Var(ut)=2 <  to prove this. Consistency implies that • Unbiased The least squares estimates of and are unbiased. That is E( )= and E( )= Thus on average the estimated value will be equal to the true values. To prove this also requires the assumption that E(ut)=0. Unbiasedness is a stronger condition than consistency. • Efficiency An estimator of parameter  is said to be efficient if it is unbiased and no other unbiased estimator has a smaller variance. If the estimator is efficient, we are minimising the probability that it is a long way off from the true value of . $  $  $  $  $  $  $    0 0 ˆ Pr lim            T
  • 28. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 28 Precision and Standard Errors • Any set of regression estimates of and are specific to the sample used in their estimation. • Recall that the estimators of  and  from the sample parameters ( and ) are given by • What we need is some measure of the reliability or precision of the estimators ( and ). The precision of the estimate is given by its standard error. Given assumptions 1 - 4 above, then the standard errors can be shown to be given by where s is the estimated standard deviation of the residuals. $  $  $  $  $  $  x y x T x y x T y x t t t    ˆ ˆ and ˆ 2 2                      2 2 2 2 2 2 2 2 2 1 ) ( 1 ) ˆ ( , ) ( ) ˆ ( x T x s x x s SE x T x T x s x x T x s SE t t t t t t  
  • 29. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 29 Estimating the Variance of the Disturbance Term • The variance of the random variable ut is given by Var(ut) = E[(ut)-E(ut)]2 which reduces to Var(ut) = E(ut 2) • We could estimate this using the average of : • Unfortunately this is not workable since ut is not observable. We can use the sample counterpart to ut, which is : But this estimator is a biased estimator of 2. 2 t u   2 2 1 t u T s   2 2 ˆ 1 t u T s t û
  • 30. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 30 Estimating the Variance of the Disturbance Term (cont’d) • An unbiased estimator of  is given by where is the residual sum of squares and T is the sample size. Some Comments on the Standard Error Estimators 1. Both SE( ) and SE( ) depend on s2 (or s). The greater the variance s2, then the more dispersed the errors are about their mean value and therefore the more dispersed y will be about its mean value. 2. The sum of the squares of x about their mean appears in both formulae. The larger the sum of squares, the smaller the coefficient variances. $  $  2 ˆ2    T u s t  2 ˆt u
  • 31. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 31 Some Comments on the Standard Error Estimators Consider what happens if is small or large: y y 0 x x y y 0 x x  2   x xt
  • 32. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 32 Some Comments on the Standard Error Estimators (cont’d) 3. The larger the sample size, T, the smaller will be the coefficient variances. T appears explicitly in SE( ) and implicitly in SE( ). T appears implicitly since the sum is from t = 1 to T. 4. The term appears in the SE( ). The reason is that measures how far the points are away from the y-axis. $  $  $   2   x xt  2 t x  2 t x
  • 33. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 33 Example: How to Calculate the Parameters and Standard Errors • Assume we have the following data calculated from a regression of y on a single variable x and a constant over 22 observations. • Data: • Calculations: • We write $ ( * . * . ) *( . ) .      830102 22 4165 86 65 3919654 22 4165 035 2 $ . . * . .     8665 035 4165 5912 6 . 130 , 3919654 , 65 . 86 , 5 . 416 , 22 , 830102 2         RSS x y x T y x t t t t t x y   ˆ ˆ ˆ   t t x y 35 . 0 12 . 59 ˆ  
  • 34. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 34 Example (cont’d) • SE(regression), • We now write the results as       0079 . 0 5 . 416 22 3919654 1 * 55 . 2 ) ( 35 . 3 5 . 416 22 3919654 22 3919654 * 55 . 2 ) ( 2 2            SE SE ) 0079 . 0 ( 35 . 0 ) 35 . 3 ( 12 . 59 ˆ t t x y    55 . 2 20 6 . 130 2 ˆ2      T u s t
  • 35. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 35 An Introduction to Statistical Inference • We want to make inferences about the likely population values from the regression parameters. Example: Suppose we have the following regression results: • is a single (point) estimate of the unknown population parameter, . How “reliable” is this estimate? • The reliability of the point estimate is measured by the coefficient’s standard error. $ .   05091 ) 2561 . 0 ( 5091 . 0 ) 38 . 14 ( 3 . 20 ˆ t t x y  
  • 36. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 36 Hypothesis Testing: Some Concepts • We can use the information in the sample to make inferences about the population. • We will always have two hypotheses that go together, the null hypothesis (denoted H0) and the alternative hypothesis (denoted H1). • The null hypothesis is the statement or the statistical hypothesis that is actually being tested. The alternative hypothesis represents the remaining outcomes of interest. • For example, suppose given the regression results above, we are interested in the hypothesis that the true value of  is in fact 0.5. We would use the notation H0 :  = 0.5 H1 :   0.5 This would be known as a two sided test.
  • 37. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 37 One-Sided Hypothesis Tests • Sometimes we may have some prior information that, for example, we would expect  > 0.5 rather than  < 0.5. In this case, we would do a one-sided test: H0 :  = 0.5 H1 :  > 0.5 or we could have had H0 :  = 0.5 H1 :  < 0.5 • There are two ways to conduct a hypothesis test: via the test of significance approach or via the confidence interval approach.
  • 38. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 38 The Probability Distribution of the Least Squares Estimators • We assume that ut  N(0,2) • Since the least squares estimators are linear combinations of the random variables i.e. • The weighted sum of normal random variables is also normally distributed, so  N(, Var())  N(, Var()) • What if the errors are not normally distributed? Will the parameter estimates still be normally distributed? • Yes, if the other assumptions of the CLRM hold, and the sample size is sufficiently large. $   w y t t $  $ 
  • 39. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 39 The Probability Distribution of the Least Squares Estimators (cont’d) • Standard normal variates can be constructed from and : and • But var() and var() are unknown, so and $  $      1 , 0 ~ var ˆ N         1 , 0 ~ var ˆ N     2 ~ ) ˆ ( ˆ   T t SE    2 ~ ) ˆ ( ˆ   T t SE   
  • 40. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 40 Testing Hypotheses: The Test of Significance Approach • Assume the regression equation is given by , for t=1,2,...,T • The steps involved in doing a test of significance are: 1. Estimate , and , in the usual way 2. Calculate the test statistic. This is given by the formula where is the value of  under the null hypothesis. test statistic SE   $ * ( $)     * SE( $)  SE( $)  $  $  t t t u x y     
  • 41. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 41 The Test of Significance Approach (cont’d) 3. We need some tabulated distribution with which to compare the estimated test statistics. Test statistics derived in this way can be shown to follow a t- distribution with T-2 degrees of freedom. As the number of degrees of freedom increases, we need to be less cautious in our approach since we can be more sure that our results are robust. 4. We need to choose a “significance level”, often denoted . This is also sometimes called the size of the test and it determines the region where we will reject or not reject the null hypothesis that we are testing. It is conventional to use a significance level of 5%. Intuitive explanation is that we would only expect a result as extreme as this or more extreme 5% of the time as a consequence of chance alone. Conventional to use a 5% size of test, but 10% and 1% are also commonly used.
  • 42. ‘Introductory Econometrics for Finance’ © Chris Brooks 2013 42 Determining the Rejection Region for a Test of Significance 5. Given a significance level, we can determine a rejection region and non- rejection region. For a 2-sided test: f(x) 95% non-rejection region 2.5% rejection region 2.5% rejection region
  • 43. The Rejection Region for a 1-Sided Test (Upper Tail) [Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the upper tail]
  • 44. The Rejection Region for a 1-Sided Test (Lower Tail) [Figure: density f(x) with a 95% non-rejection region and a 5% rejection region in the lower tail]
  • 45. The Test of Significance Approach: Drawing Conclusions 6. Use the t-tables to obtain a critical value or values with which to compare the test statistic. 7. Finally perform the test. If the test statistic lies in the rejection region then reject the null hypothesis (H0), else do not reject H0.
  • 46. A Note on the t and the Normal Distribution • You should all be familiar with the normal distribution and its characteristic “bell” shape. • We can scale a normal variate to have zero mean and unit variance by subtracting its mean and dividing by its standard deviation. • There is a specific relationship between the t- and the standard normal distribution. Both are symmetrical and centred on zero. The t-distribution has another parameter, its degrees of freedom. We will always know this (for the time being, it is the number of observations minus 2).
  • 47. What Does the t-Distribution Look Like? [Figure: the normal distribution and the t-distribution densities plotted together]
  • 48. Comparing the t and the Normal Distribution • In the limit, a t-distribution with an infinite number of degrees of freedom is a standard normal, i.e. $t(\infty) = N(0,1)$ • Examples from statistical tables:
      Significance level   N(0,1)   t(40)   t(4)
      50%                  0        0       0
      5%                   1.64     1.68    2.13
      2.5%                 1.96     2.02    2.78
      0.5%                 2.57     2.70    4.60
    • The reason for using the t-distribution rather than the standard normal is that we had to estimate $\sigma^2$, the variance of the disturbances.
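The tabulated critical values can be checked numerically. The sketch below is only an illustration, not part of the original slides: it uses the Python standard library, integrates the t density with Simpson's rule and inverts it by bisection. The function names `t_pdf`, `t_cdf` and `t_critical` are made up for this example.

```python
import math
from statistics import NormalDist


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 1000) -> float:
    """P(t <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def t_critical(upper_tail: float, df: int) -> float:
    """Value c such that P(t > c) = upper_tail, found by bisection."""
    lo, hi = 0.0, 50.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if 1 - t_cdf(mid, df) > upper_tail:
            lo = mid  # tail area still too large: move right
        else:
            hi = mid
    return (lo + hi) / 2


for tail in (0.05, 0.025, 0.005):
    z = NormalDist().inv_cdf(1 - tail)
    print(f"{tail:>6.1%}  N(0,1)={z:.2f}  t(40)={t_critical(tail, 40):.2f}  "
          f"t(4)={t_critical(tail, 4):.2f}")
```

The fewer the degrees of freedom, the larger the critical value, which is the pattern the table shows. The normal 0.5% value prints as 2.58 when rounded, whereas the table lists 2.57.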
  • 49. The Confidence Interval Approach to Hypothesis Testing • An example of its usage: We estimate a parameter, say $\beta$, to be 0.93, and a “95% confidence interval” to be (0.77, 1.09). This means that we are 95% confident that the interval contains the true (but unknown) value of $\beta$. • Confidence intervals are almost invariably two-sided, although in theory a one-sided interval can be constructed.
  • 50. How to Carry out a Hypothesis Test Using Confidence Intervals 1. Calculate $\hat{\alpha}$, $\hat{\beta}$ and $SE(\hat{\alpha})$, $SE(\hat{\beta})$ as before. 2. Choose a significance level, $\alpha$ (again the convention is 5%). This is equivalent to choosing a $(1-\alpha) \times 100\%$ confidence interval, i.e. 5% significance level = 95% confidence interval 3. Use the t-tables to find the appropriate critical value, which will again have T-2 degrees of freedom. 4. The confidence interval is given by $(\hat{\beta} - t_{crit} \times SE(\hat{\beta}),\ \hat{\beta} + t_{crit} \times SE(\hat{\beta}))$ 5. Perform the test: If the hypothesised value of $\beta$ ($\beta^*$) lies outside the confidence interval, then reject the null hypothesis that $\beta = \beta^*$, otherwise do not reject the null.
  • 51. Confidence Intervals Versus Tests of Significance • Note that the Test of Significance and Confidence Interval approaches always give the same answer. • Under the test of significance approach, we would not reject H0 that $\beta = \beta^*$ if the test statistic lies within the non-rejection region, i.e. if $-t_{crit} \le \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} \le +t_{crit}$ • Rearranging, we would not reject if $-t_{crit} \times SE(\hat{\beta}) \le \hat{\beta} - \beta^* \le +t_{crit} \times SE(\hat{\beta})$, i.e. if $\hat{\beta} - t_{crit} \times SE(\hat{\beta}) \le \beta^* \le \hat{\beta} + t_{crit} \times SE(\hat{\beta})$ • But this is just the rule under the confidence interval approach.
  • 52. Constructing Tests of Significance and Confidence Intervals: An Example • Using the regression results above, $\hat{y}_t = 20.3 + 0.5091 x_t$, with standard errors of 14.38 on the intercept and 0.2561 on the slope, and T=22 • Using both the test of significance and confidence interval approaches, test the hypothesis that $\beta = 1$ against a two-sided alternative. • The first step is to obtain the critical value. We want $t_{crit} = t_{20;5\%}$
  • 53. Determining the Rejection Region [Figure: density f(x) with 2.5% rejection regions below -2.086 and above +2.086]
  • 54. Performing the Test • The hypotheses are: H0: $\beta = 1$, H1: $\beta \ne 1$ • Test of significance approach: $\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$. Do not reject H0 since the test stat lies within the non-rejection region. • Confidence interval approach: $\hat{\beta} \pm t_{crit} \times SE(\hat{\beta}) = 0.5091 \pm 2.086 \times 0.2561 = (-0.0251, 1.0433)$. Since 1 lies within the confidence interval, do not reject H0.
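The arithmetic of both approaches can be reproduced in a few lines. This is a minimal sketch using the numbers from the slides; the function names are illustrative only, and the critical value 2.086 is taken from the t-tables rather than computed.

```python
def significance_test(beta_hat, se, beta_star, t_crit):
    """Return the test statistic and whether H0: beta = beta_star is rejected."""
    stat = (beta_hat - beta_star) / se
    return stat, abs(stat) > t_crit


def confidence_interval(beta_hat, se, t_crit):
    """Two-sided interval: beta_hat plus or minus t_crit * SE(beta_hat)."""
    half_width = t_crit * se
    return beta_hat - half_width, beta_hat + half_width


# Values from the example: slope estimate, its standard error, t(20) 5% two-sided
beta_hat, se, t_crit = 0.5091, 0.2561, 2.086

stat, reject = significance_test(beta_hat, se, beta_star=1.0, t_crit=t_crit)
lower, upper = confidence_interval(beta_hat, se, t_crit)

print(f"test statistic = {stat:.3f}, reject H0: {reject}")
print(f"95% confidence interval = ({lower:.4f}, {upper:.4f})")
```

Both routes lead to the same decision, since the second is only a rearrangement of the first.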
  • 55. Testing other Hypotheses • What if we wanted to test H0: $\beta = 0$ or H0: $\beta = 2$? • Note that we can test these with the confidence interval approach. For interest (!), test H0: $\beta = 0$ vs. H1: $\beta \ne 0$, and H0: $\beta = 2$ vs. H1: $\beta \ne 2$
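Once the confidence interval is available, any hypothesised value can be checked by seeing whether it falls inside. A short sketch with the example's numbers (variable names are illustrative):

```python
# Estimate, standard error and 5% two-sided critical value from the example
beta_hat, se, t_crit = 0.5091, 0.2561, 2.086
lower = beta_hat - t_crit * se
upper = beta_hat + t_crit * se

# Reject H0: beta = beta_star exactly when beta_star lies outside the interval
rejected = {beta_star: (beta_star > upper or lower > beta_star)
            for beta_star in (0.0, 2.0)}

for beta_star, reject in rejected.items():
    print(f"H0: beta = {beta_star}: {'reject' if reject else 'do not reject'}")
```

Here 0 lies just inside the interval while 2 lies well outside it.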
  • 56. Changing the Size of the Test • But note that we looked at only a 5% size of test. In marginal cases (e.g. H0: $\beta = 1$), we may get a completely different answer if we use a different size of test. This is where the test of significance approach is better than a confidence interval. • For example, say we wanted to use a 10% size of test. Using the test of significance approach, $\text{test stat} = \frac{\hat{\beta} - \beta^*}{SE(\hat{\beta})} = \frac{0.5091 - 1}{0.2561} = -1.917$ as above. The only thing that changes is the critical t-value.
  • 57. Changing the Size of the Test: The New Rejection Regions [Figure: density f(x) with 5% rejection regions below -1.725 and above +1.725]
  • 58. Changing the Size of the Test: The Conclusion • $t_{20;10\%} = 1.725$. So now, as the test statistic lies in the rejection region, we would reject H0. • Caution should therefore be used when placing emphasis on or making decisions in marginal cases (i.e. in cases where we only just reject or not reject).
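The effect of changing the size of the test can be seen by holding the test statistic fixed and swapping the critical value. A minimal sketch, with both critical values taken from the slides rather than computed:

```python
beta_hat, se, beta_star = 0.5091, 0.2561, 1.0

# Two-sided critical values for t(20), as quoted from the t-tables
critical_values = {0.05: 2.086, 0.10: 1.725}

test_stat = (beta_hat - beta_star) / se  # unchanged by the choice of size

decisions = {size: abs(test_stat) > t_crit
             for size, t_crit in critical_values.items()}

for size, reject in decisions.items():
    print(f"{size:.0%} test: |{test_stat:.3f}| vs {critical_values[size]}: "
          f"{'reject' if reject else 'do not reject'} H0")
```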
  • 59. Some More Terminology • If we reject the null hypothesis at the 5% level, we say that the result of the test is statistically significant. • Note that a statistically significant result may be of no practical significance. E.g. if a shipment of cans of beans is expected to weigh 450g per tin, but the actual mean weight of some tins is 449g, the result may be highly statistically significant but presumably nobody would care about 1g of beans.
  • 60. The Errors That We Can Make Using Hypothesis Tests • We usually reject H0 if the test statistic is statistically significant at a chosen significance level. • There are two possible errors we could make: 1. Rejecting H0 when it was really true. This is called a type I error. 2. Not rejecting H0 when it was in fact false. This is called a type II error.
                                                          Reality
      Result of test                     H0 is true                  H0 is false
      Significant (reject H0)            Type I error = $\alpha$     Correct
      Insignificant (do not reject H0)   Correct                     Type II error = $\beta$
  • 61. The Trade-off Between Type I and Type II Errors • The probability of a type I error is just $\alpha$, the significance level or size of test we chose. To see this, recall what we said significance at the 5% level meant: it is only 5% likely that a result as extreme as this or more extreme could have occurred purely by chance. • Note that there is no chance for a free lunch here! What happens if we reduce the size of the test (e.g. from a 5% test to a 1% test)? We reduce the chances of making a type I error ... but we also reduce the probability that we will reject the null hypothesis at all, so we increase the probability of a type II error: Reduce size of test → more strict criterion for rejection → reject null hypothesis less often → less likely to falsely reject, but more likely to incorrectly not reject • So there is always a trade-off between type I and type II errors when choosing a significance level. The only way we can reduce the chances of both is to increase the sample size.
  • 62. A Special Type of Hypothesis Test: The t-ratio • Recall that the formula for a test of significance approach to hypothesis testing using a t-test was $\text{test statistic} = \frac{\hat{\beta}_i - \beta_i^*}{SE(\hat{\beta}_i)}$ • If the test is H0: $\beta_i = 0$, H1: $\beta_i \ne 0$, i.e. a test that the population coefficient is zero against a two-sided alternative, this is known as a t-ratio test: Since $\beta_i^* = 0$, $\text{test stat} = \frac{\hat{\beta}_i}{SE(\hat{\beta}_i)}$ • The ratio of the coefficient to its SE is known as the t-ratio or t-statistic.
  • 63. The t-ratio: An Example • Suppose that we have the following parameter estimates, standard errors and t-ratios for an intercept and slope respectively.
                    Intercept   Slope
      Coefficient   1.10        -4.40
      SE            1.35        0.96
      t-ratio       0.81        -4.63
    Compare this with a $t_{crit}$ with 15-3 = 12 d.f. (2½% in each tail for a 5% test): 2.179 at the 5% level, 3.055 at the 1% level • Do we reject H0: $\beta_1 = 0$? (No) H0: $\beta_2 = 0$? (Yes)
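The comparison can be written out as follows. This sketch takes the t-ratios and critical values exactly as reported on the slide (recomputing the ratios from the rounded coefficients and standard errors gives slightly different figures); the dictionary names are illustrative.

```python
# t-ratios as reported on the slide for the intercept and the slope
t_ratios = {"intercept": 0.81, "slope": -4.63}

# Two-sided critical values for 12 degrees of freedom, from the slide
critical = {"5%": 2.179, "1%": 3.055}

significant = {
    name: {level: abs(t) > c for level, c in critical.items()}
    for name, t in t_ratios.items()
}

for name, by_level in significant.items():
    for level, reject in by_level.items():
        print(f"{name}: {'reject' if reject else 'do not reject'} "
              f"H0 at the {level} level")
```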
  • 64. What Does the t-ratio tell us? • If we reject H0, we say that the result is significant. If the coefficient is not “significant” (e.g. the intercept coefficient in the last regression above), then it means that the variable is not helping to explain variations in y. Variables that are not significant are usually removed from the regression model. • In practice there are good statistical reasons for always having a constant even if it is not significant. Look at what happens if no intercept is included: [Figure: plot of $y_t$ against $x_t$]
  • 65. An Example of the Use of a Simple t-test to Test a Theory in Finance • Testing for the presence and significance of abnormal returns (“Jensen’s alpha” - Jensen, 1968). • The Data: Annual Returns on the portfolios of 115 mutual funds from 1945-1964. • The model: $R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + u_{jt}$ for j = 1, …, 115 • We are interested in the significance of $\alpha_j$. • The null hypothesis is H0: $\alpha_j = 0$.
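For a single fund, the regression and the t-ratio on alpha can be sketched with the standard simple-regression OLS formulas. The code below is an illustration only: the function name `jensen_regression` and the eight annual excess returns are hypothetical and are not Jensen's data.

```python
import math


def jensen_regression(fund_excess, market_excess):
    """OLS of fund excess returns on market excess returns (one regressor)."""
    T = len(market_excess)
    x_bar = sum(market_excess) / T
    y_bar = sum(fund_excess) / T
    sxx = sum((x - x_bar) ** 2 for x in market_excess)
    sxy = sum((x - x_bar) * (y - y_bar)
              for x, y in zip(market_excess, fund_excess))
    beta = sxy / sxx
    alpha = y_bar - beta * x_bar
    residuals = [y - alpha - beta * x
                 for x, y in zip(market_excess, fund_excess)]
    s2 = sum(u * u for u in residuals) / (T - 2)  # residual variance estimate
    se_alpha = math.sqrt(s2 * sum(x * x for x in market_excess) / (T * sxx))
    se_beta = math.sqrt(s2 / sxx)
    return {"alpha": alpha, "beta": beta, "se_alpha": se_alpha,
            "se_beta": se_beta, "t_alpha": alpha / se_alpha}


# Hypothetical annual excess returns (R_jt - R_ft and R_mt - R_ft)
market = [0.08, -0.03, 0.12, 0.05, -0.07, 0.10, 0.02, 0.06]
fund = [0.07, -0.02, 0.10, 0.05, -0.08, 0.09, 0.01, 0.06]

result = jensen_regression(fund, market)
print({name: round(value, 4) for name, value in result.items()})
```

The `t_alpha` entry is the t-ratio for H0: $\alpha_j = 0$, to be compared with a t critical value with T-2 degrees of freedom.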
  • 66. [Figure: Frequency Distribution of t-ratios of Mutual Fund Alphas (gross of transactions costs)] Source: Jensen (1968). Reprinted with the permission of Blackwell publishers.
  • 67. [Figure: Frequency Distribution of t-ratios of Mutual Fund Alphas (net of transactions costs)] Source: Jensen (1968). Reprinted with the permission of Blackwell publishers.
  • 68. Can UK Unit Trust Managers “Beat the Market”? • We now perform a variant on Jensen’s test in the context of the UK market, considering monthly returns on 76 equity unit trusts. The data cover the period January 1979 – May 2000 (257 observations for each fund). Some summary statistics for the funds are:
                                                Mean    Minimum   Maximum   Median
      Average monthly return, 1979-2000         1.0%    0.6%      1.4%      1.0%
      Standard deviation of returns over time   5.1%    4.3%      6.9%      5.0%
    • Jensen Regression Results for UK Unit Trust Returns, January 1979-May 2000: $R_{jt} - R_{ft} = \alpha_j + \beta_j (R_{mt} - R_{ft}) + u_{jt}$
  • 69. Can UK Unit Trust Managers “Beat the Market”?: Results
      Estimates of          Mean     Minimum   Maximum   Median
      $\alpha$              -0.02%   -0.54%    0.33%     -0.03%
      $\beta$               0.91     0.56      1.09      0.91
      t-ratio on $\alpha$   -0.07    -2.44     3.11      -0.25
    • In fact, gross of transactions costs, 9 funds of the sample of 76 were able to significantly out-perform the market by providing a significant positive alpha, while 7 funds yielded significant negative alphas.
  • 70. The Overreaction Hypothesis and the UK Stock Market • Motivation: Two studies by DeBondt and Thaler (1985, 1987) showed that stocks which experience a poor performance over a 3 to 5 year period tend to outperform stocks which had previously performed relatively well. • How Can This be Explained? 2 suggestions: 1. A manifestation of the size effect. DeBondt & Thaler did not believe this to be a sufficient explanation, but Zarowin (1990) found that allowing for firm size did reduce the subsequent return on the losers. 2. Reversals reflect changes in equilibrium required returns. Ball & Kothari (1989) find the CAPM beta of losers to be considerably higher than that of winners.
  • 71. The Overreaction Hypothesis and the UK Stock Market (cont’d) • Another interesting anomaly: the January effect. – Another possible reason for the superior subsequent performance of losers. – Zarowin (1990) finds that 80% of the extra return available from holding the losers accrues to investors in January. • Example study: Clare and Thomas (1995). Data: Monthly UK stock returns from January 1955 to 1990 on all firms traded on the London Stock Exchange.
  • 72. Methodology • Calculate the monthly excess return of the stock over the market over a 12, 24 or 36 month period for each stock i: $U_{it} = R_{it} - R_{mt}$, n = 12, 24 or 36 months • Calculate the average monthly return for the stock i over the first 12, 24, or 36 month period: $\bar{R}_i = \frac{1}{n} \sum_{t=1}^{n} U_{it}$
  • 73. Portfolio Formation • Then rank the stocks from highest average return to lowest and form 5 portfolios: Portfolio 1: Best performing 20% of firms; Portfolio 2: Next 20%; Portfolio 3: Next 20%; Portfolio 4: Next 20%; Portfolio 5: Worst performing 20% of firms. • Use the same sample length n to monitor the performance of each portfolio.
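The ranking and portfolio-formation steps can be sketched as below. The ten stocks, their returns and the function names are hypothetical and serve only to show the mechanics; this is not Clare and Thomas's data or code.

```python
def average_excess_return(stock_returns, market_returns):
    """Mean of U_it = R_it - R_mt over the n-month formation period."""
    n = len(market_returns)
    return sum(r - m for r, m in zip(stock_returns, market_returns)) / n


def form_quintiles(avg_excess):
    """Rank stocks from best to worst and split them into 5 portfolios."""
    ranked = sorted(avg_excess, key=avg_excess.get, reverse=True)
    bounds = [round(k * len(ranked) / 5) for k in range(6)]
    return [ranked[bounds[k]:bounds[k + 1]] for k in range(5)]


# Hypothetical data: 3 months of market returns and 10 made-up stocks,
# where stock S<k> beats the market by 0.1% * k every month
market = [0.01, -0.02, 0.03]
stocks = {f"S{k}": [m + 0.001 * k for m in market] for k in range(10)}

avg_excess = {name: average_excess_return(returns, market)
              for name, returns in stocks.items()}
portfolios = form_quintiles(avg_excess)

for number, members in enumerate(portfolios, start=1):
    print(f"Portfolio {number}: {members}")
```

Portfolio 1 holds the past winners and Portfolio 5 the past losers, whose returns are then tracked over the following n months.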
  • 74. Portfolio Formation and Portfolio Tracking Periods • How many samples of length n have we got? n = 1, 2, or 3 years. • If n = 1 year: Estimate for year 1; Monitor portfolios for year 2; Estimate for year 3; ...; Monitor portfolios for year 36 • So if n = 1, we have 18 INDEPENDENT (non-overlapping) observation / tracking periods.
  • 75. Constructing Winner and Loser Returns • Similarly, n = 2 gives 9 independent periods and n = 3 gives 6 independent periods. • Calculate monthly portfolio returns assuming an equal weighting of stocks in each portfolio. • Denote the mean return for each month over the 18, 9 or 6 periods for the winner and loser portfolios as $\bar{R}_p^W$ and $\bar{R}_p^L$ respectively. • Define the difference between these as $\bar{R}_{Dt} = \bar{R}_p^L - \bar{R}_p^W$. • Then perform the regression $\bar{R}_{Dt} = \alpha_1 + \eta_t$ (Test 1) • Look at the significance of $\alpha_1$.
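The count of independent periods follows from pairing each formation window of n years with a tracking window of the same length. A small sketch, assuming the 36 years of data implied by the year numbering on the slides:

```python
total_years = 36  # years of data, as implied by "Monitor portfolios for year 36"

# Each independent period uses n years to form portfolios plus n years to track
periods = {n_years: total_years // (2 * n_years) for n_years in (1, 2, 3)}

for n_years, count in periods.items():
    print(f"n = {n_years} year(s): {count} non-overlapping tracking periods")
```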
  • 76. Allowing for Differences in the Riskiness of the Winner and Loser Portfolios • Problem: Significant and positive $\alpha_1$ could be due to higher return being required on loser stocks due to loser stocks being more risky. • Solution: Allow for risk differences by regressing against the market risk premium: $\bar{R}_{Dt} = \alpha_2 + \beta (R_{mt} - R_{ft}) + \eta_t$ (Test 2) where $R_{mt}$ is the return on the FTA All-share and $R_{ft}$ is the return on a UK government 3 month t-bill.
  • 77. Is there an Overreaction Effect in the UK Stock Market? Results
      Panel A: All Months                          n = 12             n = 24             n = 36
      Return on Loser                              0.0033             0.0011             0.0129
      Return on Winner                             0.0036             -0.0003            0.0115
      Implied annualised return difference         -0.37%             1.68%              1.56%
      Coefficient for (3.47): $\hat{\alpha}_1$     -0.00031 (0.29)    0.0014** (2.01)    0.0013 (1.55)
      Coefficients for (3.48): $\hat{\alpha}_2$    -0.00034 (-0.30)   0.00147** (2.01)   0.0013* (1.41)
                               $\hat{\beta}$       -0.022 (-0.25)     0.010 (0.21)       -0.0025 (-0.06)
      Panel B: All Months Except January
      Coefficient for (3.47): $\hat{\alpha}_1$     -0.0007 (-0.72)    0.0012* (1.63)     0.0009 (1.05)
    Notes: t-ratios in parentheses; * and ** denote significance at the 10% and 5% levels respectively. Source: Clare and Thomas (1995). Reprinted with the permission of Blackwell Publishers.
  • 78. Testing for Seasonal Effects in Overreactions • Is there evidence that losers out-perform winners more at one time of the year than another? • To test this, calculate the difference between the winner & loser portfolios as previously, $\bar{R}_{Dt}$, and regress this on 12 month-of-the-year dummies: $\bar{R}_{Dt} = \sum_{i=1}^{12} \delta_i M_i + \nu_t$ • Significant out-performance of losers over winners in: – June (for the 24-month horizon), and – January, April and October (for the 36-month horizon) • Winners appear to stay significantly as winners in March (for the 12-month horizon).
  • 79. Conclusions • Evidence of overreactions in stock returns. • Losers tend to be small so we can attribute most of the overreaction in the UK to the size effect. Comments • Small samples • No diagnostic checks of model adequacy
  • 80. The Exact Significance Level or p-value • This is equivalent to choosing an infinite number of critical t-values from tables. It gives us the marginal significance level where we would be indifferent between rejecting and not rejecting the null hypothesis. • If the test statistic is large in absolute value, the p-value will be small, and vice versa. The p-value gives the plausibility of the null hypothesis. e.g. a test statistic of 1.47 is distributed as a $t_{62}$. The p-value = 0.12. • Do we reject at the 5% level? No • Do we reject at the 10% level? No • Do we reject at the 20% level? Yes
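A p-value can also be computed directly instead of being bracketed with tables. The sketch below is an illustration under stated assumptions: it applies a two-sided p-value to the earlier slope example (test statistic of about -1.917 with 20 degrees of freedom), using only the Python standard library and a numerically integrated t density. The function names are made up for this example.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t-distribution with df degrees of freedom."""
    log_const = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                 - 0.5 * math.log(df * math.pi))
    return math.exp(log_const - (df + 1) / 2 * math.log1p(x * x / df))


def t_cdf(x: float, df: int, steps: int = 2000) -> float:
    """P(t <= x): Simpson's rule on [0, |x|], then symmetry about zero."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area


def two_sided_p_value(test_stat: float, df: int) -> float:
    """Probability of a result at least this extreme in either tail."""
    return 2 * (1 - t_cdf(abs(test_stat), df))


test_stat = (0.5091 - 1) / 0.2561  # slope example: H0: beta = 1, T = 22
p_value = two_sided_p_value(test_stat, df=20)

for size in (0.05, 0.10):
    decision = "reject" if size > p_value else "do not reject"
    print(f"{size:.0%} level: {decision} H0 (p-value = {p_value:.3f})")
```

The p-value falls between 5% and 10%, which matches the earlier finding that H0 is not rejected with a 5% test but is rejected with a 10% test.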