2. Functional Forms of Regression Models
• Up to now, we have considered models that are linear in parameters and linear in variables, like the linear regression model:
Y_i = β0 + β1 X_1i + β2 X_2i + … + βk X_ki + u_i
• Example:
3. Example
• Suppose you want to understand whether and
how age, gender, and education predict
people’s earnings
• Formal specification of your econometric model:
earnings_i = β0 + β1 age_i + β2 gender_i + β3 education_i + u_i
• Estimated form?
6. Linear regression model
• So far we have assumed that the population
regression function is linear
• What does this mean?
• The slope of the population regression
function is constant and does not depend on
the values of x
9. Today
Two different approaches:
1) Polynomial regression models in X:
The population regression function is approximated by a
quadratic, cubic, or higher-degree polynomial
2) Logarithmic transformation:
Y and/or X is transformed by taking its logarithm, which provides
a “percentages” interpretation of the coefficients that makes
sense in many applications
10. Polynomial regression model
• One way to specify a nonlinear regression function is to use a polynomial of X.
• Let r be the highest power of X. The polynomial regression model of degree r is:
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
• When r = 2, we have a quadratic regression model (second-degree polynomial):
Y_i = β0 + β1 X_i + β2 X_i² + u_i
• When r = 3, we have a cubic regression model (third-degree polynomial):
Y_i = β0 + β1 X_i + β2 X_i² + β3 X_i³ + u_i
• It is a multiple regression model
• Does it suffer from the problem of collinearity, given that the same variable enters the model squared, cubed, and so on?
11. Polynomial regression model (cont.)
• X_i² and X_i³ are not linear functions of X_i
• So the model does not violate the assumption of no perfect collinearity
12. Example
• You estimate the following
population regression model:
• Your estimated model is:
• How does an increase in X affect Y?
13. Example
• You estimate the following population
regression model:
• Your estimated model is:
• How does an increase in X affect Y?
• Now the marginal effect of X on Y depends on the value of X: an additional unit of X increases Y, but at a diminishing rate
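A one-line derivation of this point for the quadratic model Y_i = β0 + β1 X_i + β2 X_i² + u_i (the specific coefficient values are not reproduced on the slide):
\[
\frac{\partial\, E[Y \mid X]}{\partial X} \;=\; \beta_1 + 2\beta_2 X ,
\]
so with β1 > 0 and β2 < 0 the predicted gain from one more unit of X is positive but shrinks as X grows.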
15. Example
• Exper has a diminishing effect on wage
• Every additional year of experience increases the wage, but by a smaller amount each year
16. Example (cont.)
• It is interesting to see when the return to experience becomes zero.
• This is the turning point: setting the marginal effect of experience to zero and solving for exper gives exper ≈ 31 years (i.e., after 31 years an additional year of experience no longer raises the predicted wage)
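The turning point comes from setting that marginal effect to zero; since the slide does not reproduce the estimated coefficients, only the general formula is shown:
\[
\hat\beta_1 + 2\hat\beta_2\, exper^{*} = 0
\quad\Longrightarrow\quad
exper^{*} = -\frac{\hat\beta_1}{2\hat\beta_2} = \frac{\hat\beta_1}{2\,\lvert\hat\beta_2\rvert}\ \ (\hat\beta_2 < 0),
\]
which in this example evaluates to roughly 31 years.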
• Is this realistic?
17. Example (cont.)
• It is not very realistic, but it is one of the consequences of using a quadratic form.
• At some point the function reaches its maximum and then curves downward
• That turning point is usually large enough not to matter much in practice. For example, the mean of experience is 17 (on average, people in our sample have 17 years of experience)
18. Example: The test score, income relation
• Income_i = average district income in the ith district (thousands of dollars per capita)
• Quadratic specification: TestScore_i = β0 + β1 Income_i + β2 Income_i² + u_i
• Cubic specification: TestScore_i = β0 + β1 Income_i + β2 Income_i² + β3 Income_i³ + u_i
In these specifications the marginal effect of income on test score depends on the value of income. To show this, take the derivative with respect to Income: β1 + 2β2 Income (quadratic) or β1 + 2β2 Income + 3β3 Income² (cubic).
19. Estimation of the quadratic specification in
STATA
Test the null hypothesis of linearity against the alternative that the regression function is quadratic; a sketch of the corresponding STATA commands follows.
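This sketch mirrors the commands reproduced in the editor's note for this slide (variable names testscr, avginc, avginc2 come from that note):

* create the squared income term
generate avginc2 = avginc*avginc
* quadratic specification with heteroskedasticity-robust standard errors
regress testscr avginc avginc2, robust
* test linearity: H0 is that the coefficient on avginc2 equals zero
test avginc2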
20. Interpreting the estimated regression
function (1 of 3)
(a) Plot the predicted values
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
(standard errors: 2.9, 0.27, and 0.0048, respectively)
21. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Q: What is the predicted change in TestScore for a change
in income from $5,000 per capita to $6,000 per capita?
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
22. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Predicted change in TestScore for a change in income from
$5,000 per capita to $6,000 per capita:
ΔTestScore^ = [607.3 + 3.85×6 − 0.0423×6²] − [607.3 + 3.85×5 − 0.0423×5²] = 3.4
(using TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²)
23. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Predicted change in TestScore for a change in income from
$5,000 per capita to $6,000 per capita:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
24. Interpreting the estimated regression
function (2 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Q: What is the predicted change in TestScore for a change
in income from $25,000 per capita to $26,000 per capita?
25. Interpreting the estimated regression
function (2 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Q: What is the predicted change in TestScore for a change
in income from $25,000 per capita to $26,000 per capita?
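A worked check of this question, using the same before-and-after calculation as for the $5,000-to-$6,000 change:
\[
\Delta\widehat{TestScore} = \bigl[607.3 + 3.85(26) - 0.0423(26)^2\bigr] - \bigl[607.3 + 3.85(25) - 0.0423(25)^2\bigr]
= 3.85 - 0.0423\times 51 \approx 1.7 .
\]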
26. Interpreting the estimated regression
function (3 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Predicted “effects” for different values of X:
Change in Income ($1000 per capita)    ΔTestScore
from 5 to 6                            3.4
from 25 to 26                          1.7
from 45 to 46                          0.0
The “effect” of a change in income is greater at low than high
income levels
What is the effect of a change from 65 to 66?
27. Interpreting the estimated regression
function (3 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Predicted “effects” for different values of X:
Change in Income ($1000 per capita)    ΔTestScore
from 5 to 6                            3.4
from 25 to 26                          1.7
from 45 to 46                          0.0
The “effect” of a change in income is greater at low than high
income levels
What is the effect of a change from 65 to 66?
Caution! Don’t extrapolate outside the range of the data!
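For illustration, the same calculation for a change from 65 to 66, which lies beyond the income range shown in the scatterplot, gives a negative predicted change; this is exactly why extrapolation is risky here:
\[
\Delta\widehat{TestScore} = 3.85 - 0.0423\times(66^2 - 65^2) = 3.85 - 0.0423\times 131 \approx -1.7 .
\]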
29. Estimation of a cubic specification in STATA
(2 of 2)
Testing the null hypothesis of linearity, against the alternative that the
population regression is quadratic and/or cubic, that is, it is a polynomial of
degree up to 3:
H0: the population coefficients on Income² and Income³ are both zero
H1: at least one of these coefficients is nonzero.
test avginc2 avginc3
(1) avginc2 = 0.0
(2) avginc3 = 0.0
F(2, 416) = 37.69
Prob > F = 0.0000
The hypothesis that the population regression is linear is rejected at the 1%
significance level against the alternative that it is a polynomial of degree up to
3.
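For reference, a sketch of the commands behind this slide, following the editor's note for the cubic output (variable names as in that note):

* create the cubed income term from the squared term
gen avginc3 = avginc*avginc2
* cubic specification with robust standard errors
reg testscr avginc avginc2 avginc3, r
* joint test that the quadratic and cubic terms are both zero
test avginc2 avginc3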
30. Which degree polynomial should I use?
• Plot the data and follow a sequential hypothesis-testing procedure (a STATA sketch follows this list):
1) Pick a maximum value of r and estimate the polynomial regression for that r
2) Use the t-statistic to test whether the coefficient on X^r is zero. If you reject this hypothesis, then keep X^r in the regression
3) If you do not reject, then eliminate X^r and use a polynomial regression of degree r−1; test whether the coefficient on X^(r−1) is zero. If you reject, use the polynomial of degree r−1
4) If you do not reject, continue until the coefficient on the highest remaining power is significant
If you don’t see sharp jumps in the data (which is usually the case in economic data), then start with polynomials of degree 2 to 4.
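A minimal STATA sketch of this sequential procedure, assuming a hypothetical outcome y and regressor x and a starting degree of r = 4:

* generate the powers of x
gen x2 = x^2
gen x3 = x^3
gen x4 = x^4
* step 1: estimate the degree-4 polynomial
reg y x x2 x3 x4, r
* step 2: inspect the t-statistic on x4; if you cannot reject that its
* coefficient is zero, drop it and re-estimate the cubic
reg y x x2 x3, r
* continue with x3, then x2, until the highest remaining power is significant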
31. Summary: polynomial regression functions
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
• Estimation: by OLS after defining new regressors
• To interpret the estimated regression function:
– plot predicted values as a function of x
– compute predicted ΔY/ΔX for different values of X
• Hypotheses concerning the degree r can be tested by t- and F-tests on the appropriate (blocks of) variable(s).
• Choice of degree r
– plot the data; t- and F-tests, check sensitivity of estimated
effects; judgment.
33. Logarithmic functions of Y and/or X
• Another way to specify a non-linear regression
function is to use the natural logarithm of Y
and/or X
• Logarithms convert changes in variables into
percentage changes
• Many relationships are naturally expressed in
terms of percentages
34. Examples where we are interested in
expressing relationships in percentages
• In our previous example, we saw that the relationship between income and test scores is nonlinear. BUT would this relationship be linear if we measured the change in income in percent (1%) rather than in $1,000 increments?
• When we study consumer demand, we are interested in learning how a 1% increase in price leads to a certain percentage decrease in quantity demanded (the price elasticity)
• Wage gap between male and female college graduates: we can
compare wage gaps in terms of dollars but it is easier to compare
wage gaps across professions in percentage terms
35. Logarithms and Percentages
• Logarithmic transformations allow us to model relations in “percentage”
terms (like elasticities), rather than linearly.
• Why? The link between logarithms and percentages relies on the approximation:
ln(X + ΔX) − ln(X) ≈ ΔX/X (when ΔX/X is small)
• Numerically: when X = 100 and ΔX = 1, ΔX/X = 0.01, or 1%; and ln(101) − ln(100) = 0.00995, or 0.995%, so it is also approximately 1%.
• This approximation is only accurate for small changes in X
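The reasoning behind the approximation, written out (a standard first-order expansion of the logarithm):
\[
\ln(X+\Delta X) - \ln(X) \;=\; \ln\!\left(1 + \frac{\Delta X}{X}\right) \;\approx\; \frac{\Delta X}{X}
\quad\text{when } \frac{\Delta X}{X} \text{ is small.}
\]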
37. The three logarithmic regression models
Case             Population regression function
I.   linear-log  Y_i = β0 + β1 ln(X_i) + u_i
II.  log-linear  ln(Y_i) = β0 + β1 X_i + u_i
III. log-log     ln(Y_i) = β0 + β1 ln(X_i) + u_i
38. The three logarithmic regression models
• The interpretation of the slope coefficient
differs in each case.
• The interpretation is found by applying the
general “before and after” rule: “figure out the
change in Y for a given change in X.”
• Each case has a natural interpretation (for
small changes in X )
39. Linear-log Model
• Compute Y before and after changing X:
• Before: Y = β0 + β1 ln(X)   (b)
• Now change X: Y + ΔY = β0 + β1 ln(X + ΔX)   (a)
• Subtract (a) − (b): ΔY = β1[ln(X + ΔX) − ln(X)]
• Since ln(X + ΔX) − ln(X) ≈ ΔX/X,
• ΔY ≈ β1 (ΔX/X): a 1% change in X (ΔX/X = 0.01) is associated with a change in Y of 0.01β1
41. Example: Test Score versus ln(income)
• First define the new regressor, ln(income)
• The model is now linear in ln(income), so the linear-log model can be estimated by OLS (estimated slope on ln(income) ≈ 36.42; a STATA sketch follows)
• Interpretation: a 1% increase in income is associated with an increase in TestScore of 0.01 × 36.42 ≈ 0.36 points on the test
• Standard errors, confidence intervals, all the usual tools
of regression apply here
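A sketch of the estimation in STATA, assuming the same data set as before; the name lnavginc for the new regressor is hypothetical:

* define the new regressor ln(income)
gen lnavginc = ln(avginc)
* linear-log specification with robust standard errors
reg testscr lnavginc, r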
44. Log-linear population regression function
• The model: ln(Y_i) = β0 + β1 X_i + u_i
• For small ΔY, Δln(Y) ≈ ΔY/Y
• Now if X changes by 1 unit, then ln(Y) changes by β1, so ΔY/Y ≈ β1
• Translate it into percentages: when X changes by 1 unit, Y changes by approximately 100·β1 %
• This quantity is called the semi-elasticity of Y with respect to X: it shows the percentage change in Y when X increases by one unit.
48. Log-log population regression function
• The model: ln(Y_i) = β0 + β1 ln(X_i) + u_i, so β1 = Δln(Y)/Δln(X) ≈ (ΔY/Y)/(ΔX/X)
• Now if the percentage change in X is 1%, then β1 is the percentage change in Y associated with a 1% change in X.
• In other words, β1 is the elasticity of Y with respect to X.
50. Example: ln(TestScore) vs. ln(Income)
• First define the new dependent variable, ln(TestScore)
and the new regressor, ln(income)
• The model is now a linear regression of ln(TestScore)
against ln(income), so it can be estimated by OLS:
• Estimated model: ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Interpretation: a 1% increase in income is associated with an increase of 0.0554% in TestScore (income up by a factor of 1.01, TestScore up by a factor of 1.000554)
51. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• For example, suppose income increases from $10,000 to $11,000, or by 10%. Then the test score increases by approximately:
ΔTestScore/TestScore ≈ 0.0554 × 0.10 = 0.00554, i.e., about 0.554%
• If TestScore = 650, this corresponds to an increase of 0.00554 × 650 ≈ 3.6 points.
52. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Q: If there is a 2% increase in income, by what
percentage will test scores increase? Calculate the
increase in test score if the test score is 650.
53. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Q: If there is a 2% increase in income, by what percentage will test scores increase?
ΔTestScore/TestScore ≈ 0.0554 × 0.02 = 0.001108, i.e., about 0.11%
• If TestScore = 650, this corresponds to an increase of 0.001108 × 650 ≈ 0.72 points.
• How does this model compare to the log-linear model?
54. The log-linear and log-log specifications
• Note: the vertical axis is ln(TestScore)
• The log-linear model doesn’t seem to fit as well as the log-log model, based on visual inspection.
55. Summary of main functional forms
Linear-linear (level-level): Y = β0 + β1 X; a one-unit change in X is associated with a β1-unit change in Y
Log-log: ln(Y) = β0 + β1 ln(X); a 1% change in X is associated with a β1% change in Y (β1 is an elasticity)
Log-linear: ln(Y) = β0 + β1 X; a one-unit change in X is associated with a 100·β1 % change in Y
Linear-log: Y = β0 + β1 ln(X); a 1% change in X is associated with a 0.01·β1-unit change in Y
57. Logarithmic Regression Models with multiple regressors
ln(Y_i) = β0 + β1 ln(X_1i) + β2 ln(X_2i) + u_i
• β1 and β2 are partial slope coefficients, or partial elasticities
• β1: measures the elasticity of Y with respect to X_1, holding the influence of X_2 constant
– that is, it measures the percentage change in Y for a percentage change in X_1, holding the influence of X_2 constant
• β2: measures the elasticity of Y with respect to X_2, holding the influence of X_1 constant
– that is, it measures the percentage change in Y for a percentage change in X_2, holding the influence of X_1 constant
58. Application for log-log model: Estimating a Cobb-Douglas production function (1 of 3)
• Y: output; L: labour; K: capital
• Log-log specification: ln(Y_i) = β0 + β1 ln(L_i) + β2 ln(K_i) + u_i
• Using Mexican data over the years 1955-1974, we obtain the estimates interpreted on the next two slides (labour elasticity ≈ 0.34, capital elasticity ≈ 0.85)
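A sketch of how such a log-log Cobb-Douglas regression could be run in STATA; the variable names output, labour, and capital are hypothetical, not those of the original data set:

* take logs of output and inputs (hypothetical variable names)
gen lnoutput  = ln(output)
gen lnlabour  = ln(labour)
gen lncapital = ln(capital)
* log-log specification: coefficients are the input elasticities
reg lnoutput lnlabour lncapital
* optional: test constant returns to scale, H0: elasticities sum to one
test lnlabour + lncapital = 1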
59. Application for log-log model: Estimating a Cobb-Douglas production function (2 of 3)
• β1: measures the elasticity of output with respect to labour, holding capital constant
• Estimate: β̂1 ≈ 0.34
• Interpretation: holding capital constant, if labour (employment) increases by 1%, average output increases by about 0.34%
• β2: measures the elasticity of output with respect to capital, holding labour constant
• Estimate: β̂2 ≈ 0.85
• Interpretation: holding labour constant, if capital increases by 1%, average output goes up by about 0.85%
60. Application: Estimating a Cobb-Douglas production function (3 of 3)
• β1 + β2: returns to scale parameter
• It measures the response of output to a proportional change in inputs
• constant returns to scale (β1 + β2 = 1)
• decreasing returns to scale (β1 + β2 < 1)
• increasing returns to scale (β1 + β2 > 1)
61. Application for log-linear model: Estimating a wage equation (1 of 2)
• Using a sample of 1,801 City graduates, the following earnings equation has been estimated (coefficients interpreted on the next slide; a STATA sketch follows):
ln(earnings_i) = β0 + β1 educ_i + β2 experience_i + β3 female_i + u_i
• earnings: annual earnings
• educ: years of education
• experience: years of working experience
• female: variable equal to 1 if the individual is a female; 0 otherwise
• Standard errors are reported in parentheses
• Q: Interpret the coefficient estimates on educ, experience and female
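A sketch of the log-linear estimation in STATA; the variable names earnings, educ, experience, and female are assumed for illustration:

* dependent variable: the natural log of annual earnings
gen lnearnings = ln(earnings)
* log-linear wage equation with robust standard errors
reg lnearnings educ experience female, r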
62. Application for log-linear model: Estimating a wage equation (2 of 2)
• Interpretation of the coefficients:
• β̂educ ≈ 0.147: indicates that one additional year of education increases average earnings by 14.7%, holding gender and experience constant
• β̂experience ≈ 0.049: indicates that one additional year of experience increases average earnings by 4.9%, holding gender and education constant
• β̂female ≈ −0.201: indicates that average earnings of females are 20.1% lower than those of males, holding experience and education constant
63. Summary
Two different approaches to incorporate
nonlinear relationships in regression models:
1) Polynomial regression models in X:
Interpretation, test
2) Logarithmic transformation:
Interpretation and their application to economics
Editor's Notes
#19: Long Description:
STATA commands and output for the quadratic specification:
generate avginc2 = avginc*avginc
reg testscr avginc avginc2, r
Regression with robust standard errors. Number of obs = 420, F(2, 417) = 428.52, Prob > F = 0.0000, R-squared = 0.5562, Root MSE = 12.724.
testscr    Coef.        Robust Std. Err.   t        P>|t|    [95% Conf. Interval]
avginc     3.850995     0.2680941          14.36    0.000    3.32401      4.377979
avginc2    -0.0423085   0.0047803          -8.85    0.000    -0.051705    -0.0329119
_cons      607.3017     2.901754           209.29   0.000    601.5978     613.0056
#20: Long Description 1:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i², with standard errors 2.9, 0.27, and 0.0048, respectively.
Long Description 2:
Scatterplot of test score (600 to 740) against district income ($0 to $60 thousand), showing a linear regression line rising from about (4, 638) to (55, 736) and a concave quadratic regression curve through about (4, 624), (26, 680), and (55, 690); the points are scattered densely and unevenly around both fits. All values are estimated.
#21: Long Description 1:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i², with standard errors 2.9, 0.27, and 0.0048, respectively.
Long Description 2:
ΔTestScore^ = [607.3 + 3.85×6 − 0.0423×6²] − [607.3 + 3.85×5 − 0.0423×5²] = 3.4
#22-#25: Long Descriptions: identical to #21.
#28: Long Description:
STATA commands and output for the cubic specification:
gen avginc3 = avginc*avginc2
reg testscr avginc avginc2 avginc3, r
Regression with robust standard errors. Number of obs = 420, F(3, 416) = 270.18, Prob > F = 0.0000, R-squared = 0.5584, Root MSE = 12.707.
testscr    Coef.        Robust Std. Err.   t        P>|t|    [95% Conf. Interval]
avginc     5.018677     0.7073505          7.10     0.000    3.628251     6.409104
avginc2    -0.0958052   0.0289537          -3.31    0.001    -0.1527191   -0.0388913
avginc3    0.0006855    0.0003471          1.98     0.049    3.27e-06     0.0013677
_cons      600.079      5.102062           117.61   0.000    590.0499     610.108
#31: Long Description:
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
#42: Long Description:
Scatterplot of test score (600 to 740) against district income ($0 to $60 thousand), showing two concave fitted curves: a linear-log regression from about (5, 619) through (30, 680) to (55, 704) and a cubic regression from about (5, 621) through (30, 682) to (55, 698); the points are scattered densely and unevenly around the curves. All values are estimated.
#49:The elasticity clearly depends on the values of x and is not constant along a demand/supply curve
#54: Long Description:
Scatterplot of ln(Test score) (6.40 to 6.60) against district income ($0 to $60 thousand), showing a rising log-linear regression line from about (4, 6.45) to (55, 6.58) and a concave log-log regression curve from about (5, 6.25) through (30, 6.52) to (55, 6.55); the points are scattered densely and unevenly around both fits. All values are estimated.