2. Functional Forms of Regression Models
• Up to now, we have considered models that are linear in parameters and linear in variables, like the linear regression model:
Y_i = β0 + β1 X_1i + β2 X_2i + … + βk X_ki + u_i
• Example:
3. Example
• Suppose you want to understand whether and
how age, gender, and education predict
people’s earnings
• Formal specification of your econometric model:
earnings_i = β0 + β1 age_i + β2 gender_i + β3 education_i + u_i
• Estimated form?
6. Linear regression model
• So far we have assumed that the population
regression function is linear
• What does this mean?
• The slope of the population regression
function is constant and does not depend on
the values of x
9. Today
Two different approaches:
1) Polynomial regression models in X:
The population regression function is approximated by a
quadratic, cubic, or higher-degree polynomial
2) Logarithmic transformation:
Y and/or X is transformed by taking its logarithm, which provides
a “percentages” interpretation of the coefficients that makes
sense in many applications
10. Polynomial regression model
• One way to specify a nonlinear regression function is to use a polynomial of X.
• Let r be the highest power of X. The polynomial regression model of degree r is:
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
• When r = 2, we have a quadratic regression model (second-degree polynomial):
Y_i = β0 + β1 X_i + β2 X_i² + u_i
• When r = 3, we have a cubic regression model (third-degree polynomial):
Y_i = β0 + β1 X_i + β2 X_i² + β3 X_i³ + u_i
• It is a multiple regression model
• Does it suffer from the problem of collinearity, given that the same variable enters the model squared, cubed, and so on?
11. Polynomial regression model (cont.)
• X_i² and X_i³ are not linear functions of X_i
• So the model does not violate the assumption of no perfect collinearity
12. Example
• You estimate the following
population regression model:
• Your estimated model is:
• How does an increase in X affect Y?
13. Example
• You estimate the following population
regression model:
• Your estimated model is:
• How does an increase in X affect Y?
• Now the marginal effect of X on Y depends on the value of X: an additional unit of X increases Y, but at a diminishing rate
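A one-line derivation of this point for the quadratic model Y_i = β0 + β1 X_i + β2 X_i² + u_i (the specific coefficient values are not reproduced on the slide):
\[
\frac{\partial\, E[Y \mid X]}{\partial X} \;=\; \beta_1 + 2\beta_2 X ,
\]
so with β1 > 0 and β2 < 0 the predicted gain from one more unit of X is positive but shrinks as X grows.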
15. Example
• Exper has a diminishing effect on wage
• Every additional year of experience increases the wage, but by a smaller amount each year
16. Example (cont.)
• It is interesting to see when the return to experience becomes zero.
• This is the turning point: setting the marginal effect of experience to zero and solving for exper gives exper ≈ 31 years (i.e., after 31 years an additional year of experience no longer raises the predicted wage)
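The turning point comes from setting that marginal effect to zero; since the slide does not reproduce the estimated coefficients, only the general formula is shown:
\[
\hat\beta_1 + 2\hat\beta_2\, exper^{*} = 0
\quad\Longrightarrow\quad
exper^{*} = -\frac{\hat\beta_1}{2\hat\beta_2} = \frac{\hat\beta_1}{2\,\lvert\hat\beta_2\rvert}\ \ (\hat\beta_2 < 0),
\]
which in this example evaluates to roughly 31 years.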
• Is this realistic?
17. Example (cont.)
• It is not very realistic, but it is one of the consequences of using a quadratic form.
• At some point the function reaches its maximum and then curves downward
• That turning point is usually large enough not to matter much in practice. For example, the mean of experience is 17 (on average, people in our sample have 17 years of experience)
18. Example: The test score, income relation
• Income_i = average district income in the ith district (thousands of dollars per capita)
• Quadratic specification: TestScore_i = β0 + β1 Income_i + β2 Income_i² + u_i
• Cubic specification: TestScore_i = β0 + β1 Income_i + β2 Income_i² + β3 Income_i³ + u_i
In these specifications the marginal effect of income on test score depends on the value of income. To show this, take the derivative with respect to Income: β1 + 2β2 Income (quadratic) or β1 + 2β2 Income + 3β3 Income² (cubic).
19. Estimation of the quadratic specification in
STATA
Test the null hypothesis of linearity against the alternative that the regression function is quadratic; a sketch of the corresponding STATA commands follows.
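This sketch mirrors the commands reproduced in the editor's note for this slide (variable names testscr, avginc, avginc2 come from that note):

* create the squared income term
generate avginc2 = avginc*avginc
* quadratic specification with heteroskedasticity-robust standard errors
regress testscr avginc avginc2, robust
* test linearity: H0 is that the coefficient on avginc2 equals zero
test avginc2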
20. Interpreting the estimated regression
function (1 of 3)
(a) Plot the predicted values
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
(standard errors: 2.9, 0.27, and 0.0048, respectively)
21. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Q: What is the predicted change in TestScore for a change
in income from $5,000 per capita to $6,000 per capita?
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
22. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Predicted change in TestScore for a change in income from
$5,000 per capita to $6,000 per capita:
ΔTestScore^ = [607.3 + 3.85×6 − 0.0423×6²] − [607.3 + 3.85×5 − 0.0423×5²] = 3.4
(using TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²)
23. Interpreting the estimated regression
function (2 of 3)
(b) Compute the slope, evaluated at various values of X
Predicted change in TestScore for a change in income from
$5,000 per capita to $6,000 per capita:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
24. Interpreting the estimated regression
function (2 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Q: What is the predicted change in TestScore for a change
in income from $25,000 per capita to $26,000 per capita?
25. Interpreting the estimated regression
function (2 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Q: What is the predicted change in TestScore for a change
in income from $25,000 per capita to $26,000 per capita?
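A worked check of this question, using the same before-and-after calculation as for the $5,000-to-$6,000 change:
\[
\Delta\widehat{TestScore} = \bigl[607.3 + 3.85(26) - 0.0423(26)^2\bigr] - \bigl[607.3 + 3.85(25) - 0.0423(25)^2\bigr]
= 3.85 - 0.0423\times 51 \approx 1.7 .
\]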
26. Interpreting the estimated regression
function (3 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Predicted “effects” for different values of X:
Change in Income ($1000 per capita)    ΔTestScore
from 5 to 6                            3.4
from 25 to 26                          1.7
from 45 to 46                          0.0
The “effect” of a change in income is greater at low than high
income levels
What is the effect of a change from 65 to 66?
27. Interpreting the estimated regression
function (3 of 3)
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i²
Predicted “effects” for different values of X:
Change in Income ($1000 per capita)    ΔTestScore
from 5 to 6                            3.4
from 25 to 26                          1.7
from 45 to 46                          0.0
The “effect” of a change in income is greater at low than high
income levels
What is the effect of a change from 65 to 66?
Caution! Don’t extrapolate outside the range of the data!
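For illustration, the same calculation for a change from 65 to 66, which lies beyond the income range shown in the scatterplot, gives a negative predicted change; this is exactly why extrapolation is risky here:
\[
\Delta\widehat{TestScore} = 3.85 - 0.0423\times(66^2 - 65^2) = 3.85 - 0.0423\times 131 \approx -1.7 .
\]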
29. Estimation of a cubic specification in STATA
(2 of 2)
Testing the null hypothesis of linearity, against the alternative that the
population regression is quadratic and/or cubic, that is, it is a polynomial of
degree up to 3:
H0: the population coefficients on Income² and Income³ are both zero
H1: at least one of these coefficients is nonzero.
test avginc2 avginc3
(1) avginc2 = 0.0
(2) avginc3 = 0.0
F(2, 416) = 37.69
Prob > F = 0.0000
The hypothesis that the population regression is linear is rejected at the 1%
significance level against the alternative that it is a polynomial of degree up to
3.
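For reference, a sketch of the commands behind this slide, following the editor's note for the cubic output (variable names as in that note):

* create the cubed income term from the squared term
gen avginc3 = avginc*avginc2
* cubic specification with robust standard errors
reg testscr avginc avginc2 avginc3, r
* joint test that the quadratic and cubic terms are both zero
test avginc2 avginc3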
30. Which degree polynomial should I use?
• Plot the data and follow a sequential hypothesis-testing procedure (a STATA sketch follows this list):
1) Pick a maximum value of r and estimate the polynomial regression for that r
2) Use the t-statistic to test whether the coefficient on X^r is zero. If you reject this hypothesis, then keep X^r in the regression
3) If you do not reject, then eliminate X^r and use a polynomial regression of degree r−1; test whether the coefficient on X^(r−1) is zero. If you reject, use the polynomial of degree r−1
4) If you do not reject, continue until the coefficient on the highest remaining power is significant
If you don’t see sharp jumps in the data (which is usually the case in economic data), then start with polynomials of degree 2 to 4.
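A minimal STATA sketch of this sequential procedure, assuming a hypothetical outcome y and regressor x and a starting degree of r = 4:

* generate the powers of x
gen x2 = x^2
gen x3 = x^3
gen x4 = x^4
* step 1: estimate the degree-4 polynomial
reg y x x2 x3 x4, r
* step 2: inspect the t-statistic on x4; if you cannot reject that its
* coefficient is zero, drop it and re-estimate the cubic
reg y x x2 x3, r
* continue with x3, then x2, until the highest remaining power is significant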
31. Summary: polynomial regression functions
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
• Estimation: by OLS after defining new regressors
• To interpret the estimated regression function:
– plot predicted values as a function of x
– compute predicted ΔY/ΔX for different values of X
• Hypotheses concerning the degree r can be tested by t- and F-tests on the appropriate (blocks of) variable(s).
• Choice of degree r
– plot the data; t- and F-tests, check sensitivity of estimated
effects; judgment.
33. Logarithmic functions of Y and/or X
• Another way to specify a non-linear regression
function is to use the natural logarithm of Y
and/or X
• Logarithms convert changes in variables into
percentage changes
• Many relationships are naturally expressed in
terms of percentages
34. Examples where we are interested in
expressing relationships in percentages
• In our previous example, we saw that the relationship between income and test scores is nonlinear. BUT would this relationship be linear if we measured the change in income in percent (1%) rather than in $1,000 increments?
• When we study consumer demand, we are interested in learning how a 1% increase in price leads to a certain percentage decrease in quantity demanded (the price elasticity)
• Wage gap between male and female college graduates: we can
compare wage gaps in terms of dollars but it is easier to compare
wage gaps across professions in percentage terms
35. Logarithms and Percentages
• Logarithmic transformations allow us to model relations in “percentage”
terms (like elasticities), rather than linearly.
• Why? The link between logarithms and percentages relies on the approximation:
ln(X + ΔX) − ln(X) ≈ ΔX/X (when ΔX/X is small)
• Numerically: when X = 100 and ΔX = 1, ΔX/X = 0.01, or 1%; and ln(101) − ln(100) = 0.00995, or 0.995%, so it is also approximately 1%.
• This approximation is only accurate for small changes in X
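The reasoning behind the approximation, written out (a standard first-order expansion of the logarithm):
\[
\ln(X+\Delta X) - \ln(X) \;=\; \ln\!\left(1 + \frac{\Delta X}{X}\right) \;\approx\; \frac{\Delta X}{X}
\quad\text{when } \frac{\Delta X}{X} \text{ is small.}
\]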
37. The three logarithmic regression models
Case             Population regression function
I.   linear-log  Y_i = β0 + β1 ln(X_i) + u_i
II.  log-linear  ln(Y_i) = β0 + β1 X_i + u_i
III. log-log     ln(Y_i) = β0 + β1 ln(X_i) + u_i
38. The three logarithmic regression models
• The interpretation of the slope coefficient
differs in each case.
• The interpretation is found by applying the
general “before and after” rule: “figure out the
change in Y for a given change in X.”
• Each case has a natural interpretation (for
small changes in X )
39. Linear-log Model
• Compute Y before and after changing X:
• Before: Y = β0 + β1 ln(X)   (b)
• Now change X: Y + ΔY = β0 + β1 ln(X + ΔX)   (a)
• Subtract (a) − (b): ΔY = β1[ln(X + ΔX) − ln(X)]
• Since ln(X + ΔX) − ln(X) ≈ ΔX/X,
• ΔY ≈ β1 (ΔX/X): a 1% change in X (ΔX/X = 0.01) is associated with a change in Y of 0.01β1
41. Example: Test Score versus ln(income)
• First define the new regressor, ln(income)
• The model is now linear in ln(income), so the linear-log model can be estimated by OLS (estimated slope on ln(income) ≈ 36.42; a STATA sketch follows)
• Interpretation: a 1% increase in income is associated with an increase in TestScore of 0.01 × 36.42 ≈ 0.36 points on the test
• Standard errors, confidence intervals, all the usual tools
of regression apply here
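A sketch of the estimation in STATA, assuming the same data set as before; the name lnavginc for the new regressor is hypothetical:

* define the new regressor ln(income)
gen lnavginc = ln(avginc)
* linear-log specification with robust standard errors
reg testscr lnavginc, r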
44. Log-linear population regression function
• The model: ln(Y_i) = β0 + β1 X_i + u_i
• For small ΔY, Δln(Y) ≈ ΔY/Y
• Now if X changes by 1 unit, then ln(Y) changes by β1, so ΔY/Y ≈ β1
• Translate it into percentages: when X changes by 1 unit, Y changes by approximately 100·β1 %
• This quantity is called the semi-elasticity of Y with respect to X: it shows the percentage change in Y when X increases by one unit.
48. Log-log population regression function
• The model: ln(Y_i) = β0 + β1 ln(X_i) + u_i, so β1 = Δln(Y)/Δln(X) ≈ (ΔY/Y)/(ΔX/X)
• Now if the percentage change in X is 1%, then β1 is the percentage change in Y associated with a 1% change in X.
• In other words, β1 is the elasticity of Y with respect to X.
50. Example: ln(TestScore) vs. ln(Income)
• First define the new dependent variable, ln(TestScore)
and the new regressor, ln(income)
• The model is now a linear regression of ln(TestScore)
against ln(income), so it can be estimated by OLS:
• Estimated model: ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Interpretation: a 1% increase in income is associated with an increase of 0.0554% in TestScore (income up by a factor of 1.01, TestScore up by a factor of 1.000554)
51. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• For example, suppose income increases from $10,000 to $11,000, or by 10%. Then the test score increases by approximately:
ΔTestScore/TestScore ≈ 0.0554 × 0.10 = 0.00554, i.e., about 0.554%
• If TestScore = 650, this corresponds to an increase of 0.00554 × 650 ≈ 3.6 points.
52. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Q: If there is a 2% increase in income, by what
percentage will test scores increase? Calculate the
increase in test score if the test score is 650.
53. Example: ln(TestScore) vs. ln(Income)
continued
ln(TestScore)^ = … + 0.0554 ln(Income_i)
• Q: If there is a 2% increase in income, by what percentage will test scores increase?
ΔTestScore/TestScore ≈ 0.0554 × 0.02 = 0.001108, i.e., about 0.11%
• If TestScore = 650, this corresponds to an increase of 0.001108 × 650 ≈ 0.72 points.
• How does this model compare to the log-linear model?
54. The log-linear and log-log specifications
• Note: the vertical axis is ln(TestScore)
• The log-linear model doesn’t seem to fit as well as the log-log model, based on visual inspection.
55. Summary of main functional forms
Linear-linear (level-level): Y = β0 + β1 X; a one-unit change in X is associated with a β1-unit change in Y
Log-log: ln(Y) = β0 + β1 ln(X); a 1% change in X is associated with a β1% change in Y (β1 is an elasticity)
Log-linear: ln(Y) = β0 + β1 X; a one-unit change in X is associated with a 100·β1 % change in Y
Linear-log: Y = β0 + β1 ln(X); a 1% change in X is associated with a 0.01·β1-unit change in Y
57. Logarithmic Regression Models with multiple regressors
ln(Y_i) = β0 + β1 ln(X_1i) + β2 ln(X_2i) + u_i
• β1 and β2 are partial slope coefficients, or partial elasticities
• β1: measures the elasticity of Y with respect to X_1, holding the influence of X_2 constant
– that is, it measures the percentage change in Y for a percentage change in X_1, holding the influence of X_2 constant
• β2: measures the elasticity of Y with respect to X_2, holding the influence of X_1 constant
– that is, it measures the percentage change in Y for a percentage change in X_2, holding the influence of X_1 constant
58. Application for log-log model: Estimating a Cobb-Douglas production function (1 of 3)
• Y: output; L: labour; K: capital
• Log-log specification: ln(Y_i) = β0 + β1 ln(L_i) + β2 ln(K_i) + u_i
• Using Mexican data over the years 1955-1974, we obtain the estimates interpreted on the next two slides (labour elasticity ≈ 0.34, capital elasticity ≈ 0.85)
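A sketch of how such a log-log Cobb-Douglas regression could be run in STATA; the variable names output, labour, and capital are hypothetical, not those of the original data set:

* take logs of output and inputs (hypothetical variable names)
gen lnoutput  = ln(output)
gen lnlabour  = ln(labour)
gen lncapital = ln(capital)
* log-log specification: coefficients are the input elasticities
reg lnoutput lnlabour lncapital
* optional: test constant returns to scale, H0: elasticities sum to one
test lnlabour + lncapital = 1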
59. Application for log-log model: Estimating a Cobb-Douglas production function (2 of 3)
• β1: measures the elasticity of output with respect to labour, holding capital constant
• Estimate: β̂1 ≈ 0.34
• Interpretation: holding capital constant, if labour (employment) increases by 1%, average output increases by about 0.34%
• β2: measures the elasticity of output with respect to capital, holding labour constant
• Estimate: β̂2 ≈ 0.85
• Interpretation: holding labour constant, if capital increases by 1%, average output goes up by about 0.85%
60. Application: Estimating a Cobb-Douglas production function (3 of 3)
• β1 + β2: returns to scale parameter
• It measures the response of output to a proportional change in inputs
• constant returns to scale (β1 + β2 = 1)
• decreasing returns to scale (β1 + β2 < 1)
• increasing returns to scale (β1 + β2 > 1)
61. Application for log-linear model: Estimating a wage equation (1 of 2)
• Using a sample of 1,801 City graduates, the following earnings equation has been estimated (coefficients interpreted on the next slide; a STATA sketch follows):
ln(earnings_i) = β0 + β1 educ_i + β2 experience_i + β3 female_i + u_i
• earnings: annual earnings
• educ: years of education
• experience: years of working experience
• female: variable equal to 1 if the individual is a female; 0 otherwise
• Standard errors are reported in parentheses
• Q: Interpret the coefficient estimates on educ, experience and female
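A sketch of the log-linear estimation in STATA; the variable names earnings, educ, experience, and female are assumed for illustration:

* dependent variable: the natural log of annual earnings
gen lnearnings = ln(earnings)
* log-linear wage equation with robust standard errors
reg lnearnings educ experience female, r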
62. Application for log-linear model: Estimating a wage equation (2 of 2)
• Interpretation of the coefficients:
• β̂educ ≈ 0.147: indicates that one additional year of education increases average earnings by 14.7%, holding gender and experience constant
• β̂experience ≈ 0.049: indicates that one additional year of experience increases average earnings by 4.9%, holding gender and education constant
• β̂female ≈ −0.201: indicates that average earnings of females are 20.1% lower than those of males, holding experience and education constant
63. Summary
Two different approaches to incorporate
nonlinear relationships in regression models:
1) Polynomial regression models in X:
Interpretation, test
2) Logarithmic transformation:
Interpretation and their application to economics
Editor's Notes
#19: Long Description:
STATA commands and output for the quadratic specification:
generate avginc2 = avginc*avginc
reg testscr avginc avginc2, r
Regression with robust standard errors. Number of obs = 420, F(2, 417) = 428.52, Prob > F = 0.0000, R-squared = 0.5562, Root MSE = 12.724.
testscr    Coef.        Robust Std. Err.   t        P>|t|    [95% Conf. Interval]
avginc     3.850995     0.2680941          14.36    0.000    3.32401      4.377979
avginc2    -0.0423085   0.0047803          -8.85    0.000    -0.051705    -0.0329119
_cons      607.3017     2.901754           209.29   0.000    601.5978     613.0056
#20: Long Description 1:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i², with standard errors 2.9, 0.27, and 0.0048, respectively.
Long Description 2:
Scatterplot of test score (600 to 740) against district income ($0 to $60 thousand), showing a linear regression line rising from about (4, 638) to (55, 736) and a concave quadratic regression curve through about (4, 624), (26, 680), and (55, 690); the points are scattered densely and unevenly around both fits. All values are estimated.
#21: Long Description 1:
TestScore^ = 607.3 + 3.85 Income_i − 0.0423 Income_i², with standard errors 2.9, 0.27, and 0.0048, respectively.
Long Description 2:
ΔTestScore^ = [607.3 + 3.85×6 − 0.0423×6²] − [607.3 + 3.85×5 − 0.0423×5²] = 3.4
#22-#25: Long Descriptions: identical to #21.
#28: Long Description:
STATA commands and output for the cubic specification:
gen avginc3 = avginc*avginc2
reg testscr avginc avginc2 avginc3, r
Regression with robust standard errors. Number of obs = 420, F(3, 416) = 270.18, Prob > F = 0.0000, R-squared = 0.5584, Root MSE = 12.707.
testscr    Coef.        Robust Std. Err.   t        P>|t|    [95% Conf. Interval]
avginc     5.018677     0.7073505          7.10     0.000    3.628251     6.409104
avginc2    -0.0958052   0.0289537          -3.31    0.001    -0.1527191   -0.0388913
avginc3    0.0006855    0.0003471          1.98     0.049    3.27e-06     0.0013677
_cons      600.079      5.102062           117.61   0.000    590.0499     610.108
#31: Long Description:
Y_i = β0 + β1 X_i + β2 X_i² + … + βr X_i^r + u_i
#42: Long Description:
Scatterplot of test score (600 to 740) against district income ($0 to $60 thousand), showing two concave fitted curves: a linear-log regression from about (5, 619) through (30, 680) to (55, 704) and a cubic regression from about (5, 621) through (30, 682) to (55, 698); the points are scattered densely and unevenly around the curves. All values are estimated.
#49:The elasticity clearly depends on the values of x and is not constant along a demand/supply curve
#54: Long Description:
Scatterplot of ln(Test score) (6.40 to 6.60) against district income ($0 to $60 thousand), showing a rising log-linear regression line from about (4, 6.45) to (55, 6.58) and a concave log-log regression curve from about (5, 6.25) through (30, 6.52) to (55, 6.55); the points are scattered densely and unevenly around both fits. All values are estimated.