Nonparametric testing for exogeneity with discrete
regressors and instruments
Katarzyna Bech and Grant Hillier
Warsaw School of Economics
and
University of Southampton
July 8, 2016
1/28
Outline
1 Motivation.
2 Simplest nonparametric additive-error model: setup, identification, estimation.
3 Two test statistics and critical value computation.
4 Generalization to several variables of each type: tested, exogenous, instrument.
5 Applications: Card (1995) and Angrist and Krueger (1991).
2/28
Motivation
Endogeneity is one of the most common problems in econometric models.
In nonparametric models with discrete regressors and instruments, the presence of endogenous regressors produces bias (in the identified case) or non-existence of any consistent estimator (in the partially identified case).
IV for nonparametric models with discrete regressors: Das (2005) and Florens and Malavolti (2003).
Nonparametric testing for exogeneity with continuous regressors: Blundell and Horowitz (2007) and Lavergne and Patilea (2008), among others.
3/28
Simple model
Nonparametric additive error model
Y = h(X) + ε,  E[ε | Z = z_j] = 0 for all j,
where we have i.i.d. data (x_i, y_i, z_i) on (X, Y, Z), and
Y is a continuous scalar dependent variable,
X is a single discrete regressor with support {x_k, k = 1, ..., K} that may be endogenous, with associated probabilities p_k > 0,
Z is a discrete instrumental variable with support {z_j, j = 1, ..., J}, with associated probabilities q_j > 0.
4/28
Hypothesis of interest
Null hypothesis (exogeneity):
H_0 : E[ε | X = x_k] = 0, k = 1, ..., K.
Under the null, h(·) can be consistently estimated using standard nonparametric techniques.
Under the alternative, the IV solution to endogeneity is possible only under point identification.
5/28
Identification
Since
Y = Σ_{k=1}^{K} h(x_k) I(X = x_k) + ε,
the conditional expectation of Y given Z = z_j is
E[Y | Z = z_j] = Σ_{k=1}^{K} Pr[X = x_k | Z = z_j] h(x_k).
⇒ the instrument Z supplies the equations
π = Π β,
where β_k = h(x_k), π_j = E[Y | Z = z_j], Π_jk = Pr[X = x_k | Z = z_j].
h(·) is identified at ALL support points of X iff J ≥ K.
6/28
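The mapping from observables to h can be checked in a small simulation. A minimal sketch (the data-generating choices below are hypothetical, not from the paper): estimate π̂_j as the mean of y in the Z = z_j cell and Π̂_jk as the sample share of X = x_k within that cell, then solve π̂ = Π̂ β̂ when Π̂ has rank K.

```python
import numpy as np

def estimate_pi_Pi(y, x, z, x_support, z_support):
    """Empirical pi_j = E[Y | Z = z_j] and Pi_jk = Pr[X = x_k | Z = z_j]."""
    pi = np.array([y[z == zj].mean() for zj in z_support])
    Pi = np.array([[np.mean(x[z == zj] == xk) for xk in x_support]
                   for zj in z_support])
    return pi, Pi

rng = np.random.default_rng(0)
n = 5000
z = rng.integers(0, 2, n)                      # binary instrument: J = 2
x = np.minimum(z + (rng.random(n) < 0.3), 1)   # binary regressor: K = 2
y = 1.0 + 2.0 * x + rng.normal(0, 1, n)        # h(0) = 1, h(1) = 3
pi, Pi = estimate_pi_Pi(y, x, z, [0, 1], [0, 1])
# h is identified at all support points iff rank(Pi) = K (requires J >= K);
# then beta solves pi = Pi @ beta, here by least squares
beta_hat, *_ = np.linalg.lstsq(Pi, pi, rcond=None)
```

With J = K = 2 and distinct conditional distributions of X across instrument cells, Π̂ has full rank and β̂ recovers (h(x_1), h(x_2)).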
Identification when J < K
h(·) is partially identified when J < K:
Theorem 1
Let L(β) = c'β be a linear functional of the elements of β. When rank(Π) = J < K, the following are true:
(1) for any c orthogonal to the null space of Π, L(β) is point-identified; the dimension of this set is J;
(2) for c not orthogonal to the null space of Π, L(β) is completely unconstrained; the dimension of this set is K − J.
That is: when J < K, some linear functionals are point-identified, some completely arbitrary (not even set-identified!).
Point-identifiability of L(β) can be tested (for a given choice of c). Partition c' = (c_1', c_2') and Π̂ = [Π̂_1 : Π̂_2] with Π̂_1 (J × J) nonsingular; then, under the hypothesis that L(β) is point-identified,
G_n = n (c_2' − c_1' Π̂_1^{-1} Π̂_2) V̂^{-1} (c_2' − c_1' Π̂_1^{-1} Π̂_2)' →_d χ²_{K−J},
where V̂ is a consistent estimate of the asymptotic covariance matrix of the bracketed term.
7/28
Linear Model Setup
We define the (0,1) matrix L_X (n × K) with elements (L_X)_ik = I(x_i = x_k). Likewise L_Z (n × J). Then H_0 says:
y = L_X β + ε,  E[ε | X = x_k] = 0 for all k.
β can be consistently estimated by OLS under exogeneity:
β̂ = (L_X' L_X)^{-1} L_X' y = ( Σ_{i=1}^n y_i I(x_i = x_1) / Σ_{i=1}^n I(x_i = x_1), ..., Σ_{i=1}^n y_i I(x_i = x_K) / Σ_{i=1}^n I(x_i = x_K) )',
i.e., the vector of within-cell sample means of y.
Theorem 2
If X is exogenous, the nonparametric (OLS) estimator β̂ is consistent and
√n (β̂ − β) →_d N(0, σ² D_X^{-1}),
where D_X = diag(p_k).
8/28
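Under H_0 the OLS estimator is exactly the vector of cell means, as a quick sketch confirms (the data-generating numbers here are hypothetical):

```python
import numpy as np

def nonparametric_ols(y, x, x_support):
    """(L_X' L_X)^{-1} L_X' y with L_X the n x K indicator matrix:
    component k is the sample mean of y over the cell {x = x_k}."""
    L_X = np.column_stack([(x == xk).astype(float) for xk in x_support])
    return np.linalg.solve(L_X.T @ L_X, L_X.T @ y)

rng = np.random.default_rng(1)
x = rng.integers(0, 3, 4000)                              # K = 3 support points
y = np.array([0.5, 1.5, 2.5])[x] + rng.normal(0, 1, 4000)  # exogenous error
beta_hat = nonparametric_ols(y, x, [0, 1, 2])
# agrees with the explicit cell-mean formula on the slide
cell_means = np.array([y[x == k].mean() for k in range(3)])
```

L_X'L_X is the diagonal matrix of cell counts, so the matrix formula and the ratio-of-sums formula coincide.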
Linear Model Setup
...or by IV (with L_Z as instruments) under endogeneity, when J ≥ K:
β̂_IV = (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} y,
where P_{L_Z} = L_Z (L_Z' L_Z)^{-1} L_Z' is the orthogonal projection onto the columns of L_Z.
Theorem 3
Under the assumptions above, the IV estimator β̂_IV is consistent and
√n (β̂_IV − β) →_d N(0, σ² (P' D_Z^{-1} P)^{-1}),
where P is the matrix of joint probabilities with elements
p_jk = Pr[Z = z_j, X = x_k], j = 1, ..., J, k = 1, ..., K,
and D_Z = diag(q_j).
BUT no consistent estimator exists for K − J linear functionals if X is endogenous and J < K.
9/28
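A sketch of the IV estimator with indicator matrices on simulated endogenous data (the design is hypothetical): ε is correlated with X through a common shock, so OLS cell means are biased, while the IV estimator recovers h. The cross-moment form avoids building the n × n projection matrix.

```python
import numpy as np

def indicator(v, support):
    return np.column_stack([(v == s).astype(float) for s in support])

def nonparametric_iv(y, x, z, x_support, z_support):
    """beta_IV = (L_X' P_{L_Z} L_X)^{-1} L_X' P_{L_Z} y (needs J >= K),
    computed via cross-moments rather than the n x n projection matrix."""
    L_X, L_Z = indicator(x, x_support), indicator(z, z_support)
    Szz = L_Z.T @ L_Z                       # diagonal of instrument cell counts
    Szx = L_Z.T @ L_X
    Szy = L_Z.T @ y
    A = Szx.T @ np.linalg.solve(Szz, Szx)   # L_X' P_{L_Z} L_X
    b = Szx.T @ np.linalg.solve(Szz, Szy)   # L_X' P_{L_Z} y
    return np.linalg.solve(A, b)

rng = np.random.default_rng(2)
n = 20000
z = rng.integers(0, 2, n)
u = rng.normal(0, 1, n)                       # shock driving both X and eps
x = ((z + u) > 0.5).astype(int)               # X endogenous: depends on u
y = 1.0 + 2.0 * x + u + rng.normal(0, 1, n)   # eps = u + noise, E[eps | Z] = 0
beta_iv = nonparametric_iv(y, x, z, [0, 1], [0, 1])  # approx (1, 3)
```

Since u is independent of Z, the instrument validity condition E[ε | Z = z_j] = 0 holds even though E[ε | X = x_k] ≠ 0.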
Test for exogeneity
Test statistics differ depending on whether J ≥ K or J < K.
For J ≥ K, (a modified version of) the Wu-Hausman test:
Theorem 4
Under H_0, and the assumptions above,
T_n →_d χ²_{K−1}.
Theorem 5
Under the sequence of local alternatives, and the assumptions above, the test statistic
T_n →_d Gamma(α, λ, θ),
with shape parameter α = (K − 1)/2, scale parameter θ = 2σ²/σ̄², and noncentrality parameter λ = 2δ², where
δ² = ξ' Σ_11^{-1} ξ / σ².
10/28
Test for exogeneity
For J < K, the test is based on the two SSEs:
Unrestricted: in the model y = L_X β + ε, i.e., y' M_{L_X} y, and
Restricted: minimising the SSE in this model subject to π̂ = Π̂ β.
Test statistic:
R_n = [y' M_{L_X} L_Z (L_Z' P_{L_X} L_Z)^{-1} L_Z' M_{L_X} y] / [n^{-1} y' M_{L_X} y],
where P_{L_X} = L_X (L_X' L_X)^{-1} L_X' and M_{L_X} = I − P_{L_X}.
11/28
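The statistic is computable directly from cell counts and OLS residuals. A sketch transcribing the R_n formula above (the simulated design is hypothetical; it uses an exogenous error, i.e., a draw under H_0):

```python
import numpy as np

def indicator(v, support):
    return np.column_stack([(v == s).astype(float) for s in support])

def exogeneity_Rn(y, x, z, x_support, z_support):
    """R_n = [y' M L_Z (L_Z' P L_Z)^{-1} L_Z' M y] / [y' M y / n],
    with P = P_{L_X} and M = I - P, computed without forming n x n matrices."""
    n = len(y)
    L_X, L_Z = indicator(x, x_support), indicator(z, z_support)
    Sxx = L_X.T @ L_X                             # diagonal of X cell counts
    Py = L_X @ np.linalg.solve(Sxx, L_X.T @ y)    # P_{L_X} y: fitted cell means
    My = y - Py                                   # M_{L_X} y: OLS residuals
    v = L_Z.T @ My                                # L_Z' M_{L_X} y
    Sxz = L_X.T @ L_Z
    W = Sxz.T @ np.linalg.solve(Sxx, Sxz)         # L_Z' P_{L_X} L_Z
    return (v @ np.linalg.solve(W, v)) / (My @ My / n)

rng = np.random.default_rng(3)
n = 3000
z = rng.integers(0, 2, n)                         # J = 2
x = z + rng.integers(0, 2, n)                     # K = 3 > J: partially identified
y = np.array([0.0, 1.0, 2.0])[x] + rng.normal(0, 1, n)  # exogenous design
rn = exogeneity_Rn(y, x, z, [0, 1, 2], [0, 1])
```

Since the residuals M_{L_X} y sum to zero, the J-vector L_Z' M_{L_X} y has only J − 1 free components, matching the J − 1 weighted chi-square terms in Theorem 6 below.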
Test for exogeneity
Theorem 6
Under H_0 and the assumptions above,
R_n →_d z' Ω^{-1} z = Σ_{j=1}^{J−1} ω_j χ²_j(1),
where z ~ N(0, Σ), with Σ as defined above,
Ω := C_J' (P D_X^{-1} P' − p_Z p_Z') C_J,
and the ω_j are the positive eigenvalues satisfying
det[Σ − ω Ω] = 0,
with the χ²_j(1) variables independent copies of a χ²_1 random variable.
12/28
Critical value computation
Using consistent estimates ω̂_j, simulate the distribution of Σ_{j=1}^{J−1} ω̂_j χ²_j(1) to get the appropriate 1 − α quantiles;
simulate the quadratic form z' Ω̂^{-1} z, with z ~ N(0, Σ̂), and compute the quantiles; or
approximate by the distribution of a χ²(v) + b, choosing (a, b, v) to match the first three cumulants.
13/28
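The first simulation method is a one-liner. A sketch (the weights passed in are placeholders for the consistent estimates ω̂_j):

```python
import numpy as np

def weighted_chi2_quantile(omegas, alpha, reps=200_000, seed=0):
    """Simulate sum_j omega_j * chi2_j(1), with independent chi-square(1)
    draws, and return its 1 - alpha quantile as the critical value."""
    rng = np.random.default_rng(seed)
    draws = rng.chisquare(1, size=(reps, len(omegas))) @ np.asarray(omegas, float)
    return np.quantile(draws, 1 - alpha)

# sanity check: a single unit weight gives back the plain chi-square(1)
# 95% quantile, which is about 3.84
cv = weighted_chi2_quantile([1.0], 0.05)
```

The same simulated draws can also feed the three-cumulant a·χ²(v) + b approximation as a cross-check.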
Generalizations: model with two discrete regressors
Y = h(W, X) + ε,  E[ε | Z = z_j, W = w_d] = 0 for all j, d.
We define L_WX (n × DK) with elements (L_WX)_{i,dk} = I(W_i = w_d) I(X_i = x_k). Likewise L_WZ (n × DJ), and H_0 says:
y = L_WX β + ε,  E[ε | W = w_d, X = x_k] = 0 for all d, k.
14/28
Structure of regression matrix
L_WX is a permutation of the rows of the block-diagonal matrix
diag(L_X^1, L_X^2, ..., L_X^D).
Observations corresponding to L_X^d all have W = w_d, and the rows identify which values of X occur where.
Similarly for L_WZ.
We assume that all possible combinations of the K support points of X, the J support points of Z, and the D support points of W occur in the sample.
15/28
Identification in general model
The vector with elements h(w_d, x_k) can be split into D K × 1 vectors, with component vectors h_d(x_k), say, one for each w_d.
So we have D problems of the same type as the case with W absent. [Split y into D subvectors y_d.]
The instrument is valid for each subsample.
For h to be point-identified, each h_d must be, so the condition is again J ≥ K.
16/28
Most general model
Now assume several variables of each type: X_1, ..., X_R, W_1, ..., W_U, Z_1, ..., Z_T, with respective supports of dimensions K_r, S_u, J_t.
We want to test the joint endogeneity of (X_1, ..., X_R).
Label combinations of support points thus:
x_α = (x_{α_1}, ..., x_{α_R}), 1 ≤ α_r ≤ K_r,
w_β = (w_{β_1}, ..., w_{β_U}), 1 ≤ β_u ≤ S_u,
z_γ = (z_{γ_1}, ..., z_{γ_T}), 1 ≤ γ_t ≤ J_t.
Order the sequences lexicographically.
Indistinguishable from the case of one variable of each type, except that
J = Π_{t=1}^T J_t,  K = Π_{r=1}^R K_r,  S = Π_{u=1}^U S_u.
17/28
In particular - Identification
The necessary and sufficient condition remains J ≥ K, but with J and K defined as products of the J_t and K_r.
BUT note:
There is NO requirement that there be at least as many instruments as endogenous variables (T ≥ R);
All that is needed is J ≥ K.
Of course, more instruments increase J = Π_{t=1}^T J_t.
18/28
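The product rule makes the point concrete. A sketch with hypothetical support sizes: two binary endogenous regressors (R = 2, so K = 4) are identified by a single instrument with five support points (T = 1 < R), since all that matters is J ≥ K.

```python
from math import prod

# Identification needs J >= K, with J and K the PRODUCTS of the
# per-variable support sizes, not the counts of variables.
K = prod([2, 2])       # K_r: supports of X_1, X_2 -> K = 4
J = prod([5])          # J_t: one instrument with 5 support points -> J = 5
identified = J >= K    # True: identified despite T = 1 < R = 2
```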
Applications motivation
There are many published applications where a discrete endogenous regressor is instrumented by a variable with insufficient support, e.g. Card (1995), Angrist and Krueger (1991), Bronars and Grogger (1994), Lochner and Moretti (2004).
Point identification is achieved by assuming a parametric (linear) specification.
Parametric vs. nonparametric specification testing (e.g. Horowitz (2006)) is not possible in this case.
19/28
Application - Card (1995) on returns to schooling
We are interested in the relationship between an individual's wage Y and education X (in the presence of exogenous covariates W) in
Y = h(X, W) + ε.
Card (1995) treats education as endogenous and estimates
ln(wage_i) = β_0 + β_1 X_i + Σ_{s=1}^S γ_s W_si + ε_i
by 2SLS using a binary instrument Z, which takes value 1 if there is a college in the neighbourhood, 0 otherwise. Point identification is achieved by imposing the parametric (linear) specification, which is not testable.
20/28
Data
The dataset consists of 3010 observations from the National Longitudinal Survey of Young Men.
The (sample) support of the education variable consists of K = 18 different values, and for a binary instrument, J = 2.
Data limitation: the more exogenous covariates, the less likely it is to observe all possible combinations of the support points.
Educational levels (K = 4): less than high school, high school, some college, post-college education.
Potential labour market experience levels: low and high.
21/28
Results
Covariates                 Rn      cv.1    cv.2    cv.3    α
Educ                       1.765   0.239   0.232   0.238   1%
                                   0.136   0.132   0.138   5%
                                   0.094   0.096   0.097   10%
Educ*, Exp*                4.147   1.221   1.259   1.217   1%
                                   0.715   0.696   0.719   5%
                                   0.511   0.500   0.515   10%
Educ*, Exp*, Race          3.572   1.771   1.692   1.688   1%
                                   1.107   1.131   1.108   5%
                                   0.849   0.871   0.860   10%
Educ*, Exp*, Race, SMSA    2.955   2.382   2.330   2.415   1%
                                   1.702   1.679   1.735   5%
                                   1.399   1.365   1.430   10%
(cv.1-cv.3: critical values from the three computation methods above. Rn exceeds every critical value, so exogeneity is rejected in each specification.)
22/28
Outcome
Education is endogenous, whatever the specification of the W's.
So: linearity is not testable, because there is no consistent estimator for h(·).
Some linear functionals of interest may be point-identified; use the test above to check.
We can consistently estimate any identified linear combination.
23/28
Testing for point-identifiability of linear functionals
As J = 2, only linear functionals of 2 parameters might be point-identified, e.g. the difference in earnings across different years of education.
linear combination    Gn        L̂(β)      Chesher's bounds
h(3) − h(2)           0.1356    0.0040     -
h(7) − h(6)           1.9332    0.1017     (0.0365, 0.2895)
h(8) − h(7)           0.1494    0.2395     (−0.1732, 0.352)
h(9) − h(8)           26.5527   -          (−0.2742, 0.1334)
h(10) − h(9)          75.2217   -          -
h(11) − h(10)         4.7003    0.1317     (−0.057, 0.3187)
h(14) − h(13)         61.5525   -          -
h(17) − h(16)         10.7344   −0.1900    -
h(18) − h(17)         74.1413   -          -
(Functionals with large Gn are not point-identified, so no estimate is reported.)
24/28
Application - Angrist and Krueger (1991) on returns to schooling
Angrist and Krueger (1991) estimate
ln(wage_i) = β X_i + Σ_c δ_c Y_ci + Σ_{s=1}^S γ_s W_si + ε_i
by 2SLS using quarter of birth as an instrument for (assumed) endogenous education.
Data: 1980 Census, split into the 1930-1939 cohort (40-49 year-old men) and the 1940-1949 cohort (30-39 year-old men).
Now K = 21 and J = 4.
25/28
Results: 1930s cohort
                      critical values
             Rn        1%       5%       10%
1930         0.645     17.144   11.026   8.433
1931         1.000     19.806   12.614   9.582
1932         10.843    21.541   14.313   11.184
1933         2.385     18.980   12.685   9.952
1934         6.498     25.674   16.824   13.025
1935         2.728     20.451   13.374   10.340
1936         10.990    29.102   18.465   13.980
1937         1.614     13.467   9.032    7.101
1938         1.344     22.932   15.107   11.737
1939         9.649     22.130   14.837   11.664
full cohort  38.044    85.933   72.138   65.465
(Rn is below every critical value: exogeneity is not rejected for this cohort.)
26/28
Results: 1940s cohort
                      critical values
             Rn        1%        5%       10%
1940         3.528     24.137    15.704   12.096
1941         18.143    24.733    16.005*  12.286*
1942         6.517     34.282    21.810   16.535
1943         99.840    55.818*   35.712*  27.202*
1944         22.665    39.214    24.860   18.823*
1945         31.736    26.623*   17.705*  13.847*
1946         17.181    23.478    15.183*  11.642*
1947         22.803    33.000    21.830*  17.012*
1948         34.116    46.991    29.790*  22.552*
1949         32.952    36.445    23.627*  18.168*
full cohort  278.703   138.344*  114.551* 103.182*
(* marks critical values exceeded by Rn, i.e., exogeneity rejected at that level; for the full 1940s cohort it is rejected at all levels.)
27/28
To conclude...
we propose consistent nonparametric exogeneity test(s) applicable in models with discrete regressors,
the tests confirm endogeneity of the education variable in some classic applied work, but
suggest that linearity of these models might be a bold assumption;
we suggest a nonparametric approach, or finding instruments with more support points!
THE END
28/28

PDF
1a In Search of the Numbers ssrn 1488130 Oct 2009.pdf
PPTX
Basic Concepts of Economics.pvhjkl;vbjkl;ptx
PPT
KPMG FA Benefits Report_FINAL_Jan 27_2010.ppt
PDF
Blockchain Pesa Research by Samuel Mefane
PPT
features and equilibrium under MONOPOLY 17.11.20.ppt
PDF
Statistics for Management and Economics Keller 10th Edition by Gerald Keller ...
PPTX
Maths science sst hindi english cucumber
PPTX
social-studies-subject-for-high-school-globalization.pptx
PDF
Why Ignoring Passive Income for Retirees Could Cost You Big.pdf
PDF
Bitcoin Layer August 2025: Power Laws of Bitcoin: The Core and Bubbles
ML Credit Scoring of Thin-File Borrowers
Fundamentals of Financial Management Chapter 3
The Right Social Media Strategy Can Transform Your Business
How to join illuminati agent in Uganda Kampala call 0782561496/0756664682
OAT_ORI_Fed Independence_August 2025.pptx
Module5_Session1 (mlzrkfbbbbbbbbbbbz1).pptx
discourse-2025-02-building-a-trillion-dollar-dream.pdf
THE EFFECT OF FOREIGN AID ON ECONOMIC GROWTH IN ETHIOPIA
6a Transition Through Old Age in a Dynamic Retirement Distribution Model JFP ...
Financial discipline for educational purpose
1a In Search of the Numbers ssrn 1488130 Oct 2009.pdf
Basic Concepts of Economics.pvhjkl;vbjkl;ptx
KPMG FA Benefits Report_FINAL_Jan 27_2010.ppt
Blockchain Pesa Research by Samuel Mefane
features and equilibrium under MONOPOLY 17.11.20.ppt
Statistics for Management and Economics Keller 10th Edition by Gerald Keller ...
Maths science sst hindi english cucumber
social-studies-subject-for-high-school-globalization.pptx
Why Ignoring Passive Income for Retirees Could Cost You Big.pdf
Bitcoin Layer August 2025: Power Laws of Bitcoin: The Core and Bubbles

Nonparametric testing for exogeneity with discrete regressors and instruments

  • 1. Nonparametric testing for exogeneity with discrete regressors and instruments Katarzyna Bech and Grant Hillier Warsaw School of Economics and University of Southampton July 8, 2016 1/28
  • 6. Outline 1 Motivation. 2 Simplest nonparametric additive error model: setup, identification, estimation. 3 Two test statistics and critical value computation. 4 Generalization to several variables of each type: tested, exogenous, instrument. 5 Applications: Card (1995) and Angrist and Krueger (1991). 2/28
  • 10. Motivation Endogeneity is one of the most common problems in econometric models. In nonparametric models with discrete regressors and instruments, the presence of endogenous regressors produces bias (in the identified case) or non-existence of any consistent estimator (in the partially identified case). IV for nonparametric models with discrete regressors: Das (2005) and Florens and Malavolti (2003). Nonparametric testing for exogeneity with continuous regressors: Blundell and Horowitz (2007) and Lavergne and Patilea (2008), among others. 3/28
  • 14. Simple model Nonparametric additive error model Y = h(X) + ε, E[ε | Z = z_j] = 0 ∀j, where we have i.i.d. data (x_i^s, y_i^s, z_i^s) on (X, Y, Z), and: Y is a continuous scalar dependent variable; X is a single discrete regressor with support {x_k, k = 1, ..., K}, that may be endogenous, with associated probabilities p_k > 0; Z is a discrete instrumental variable with support {z_j, j = 1, ..., J}, with associated probabilities q_j > 0. 4/28
  • 17. Hypothesis of interest Null hypothesis (exogeneity): H_0 : E[ε | X = x_k] = 0, k = 1, ..., K. Under the null, h(·) can be consistently estimated using standard nonparametric techniques. Under the alternative, the IV solution to endogeneity is only possible under point identification. 5/28
  • 20. Identification Since Y = Σ_{k=1}^K h(x_k) I(X = x_k) + ε, the conditional expectation of Y given Z = z_j is E[Y | Z = z_j] = Σ_{k=1}^K Pr[X = x_k | Z = z_j] h(x_k). ⇒ the instrument Z supplies the equations π = Πβ, where β_k = h(x_k), π_j = E[Y | Z = z_j], Π_{jk} = P[X = x_k | Z = z_j]. h(·) is identified at ALL support points of X iff J ≥ K. 6/28
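The rank condition can be checked directly from a sample estimate of Π. A minimal sketch in Python (the data-generating process and all values here are hypothetical, purely for illustration):

```python
import numpy as np

# Hypothetical discrete data: K = 3 support points for X, J = 3 for Z.
rng = np.random.default_rng(0)
n = 5000
z = rng.integers(0, 3, size=n)                 # instrument values
x = (z + rng.integers(0, 2, size=n)) % 3       # regressor, correlated with Z

J, K = 3, 3
# Pi_hat[j, k] estimates P[X = x_k | Z = z_j] by within-cell frequencies
counts = np.zeros((J, K))
np.add.at(counts, (z, x), 1)
Pi_hat = counts / counts.sum(axis=1, keepdims=True)

# h is point-identified at all K support points iff rank(Pi) = K (needs J >= K)
print(np.linalg.matrix_rank(Pi_hat) == K)      # True for this DGP
```

With J < K the matrix Π̂ has more columns than rows, so its rank is at most J and the condition necessarily fails.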
  • 24. Identification when J < K h(·) is partially identified when J < K: Theorem 1. Let L(β) = c′β be a linear functional of the elements of β. When rank(Π) = J < K, the following are true: (1) for any c orthogonal to the null space of Π, L(β) is point-identified; the dimension of this set is J; (2) for c not orthogonal to the null space of Π, L(β) is completely unconstrained; the dimension of this set is K − J. That is: when J < K, some linear functionals are point-identified, some completely arbitrary (not even set-identified!). Point-identifiability of L(β) can be tested (for a given choice of c): G_n = n (c′_2 − c′_1 Π̂_1^{-1} Π̂_2) V̂_P^{-1} (c′_2 − c′_1 Π̂_1^{-1} Π̂_2)′ →_d χ²_{K−J}. 7/28
  • 27. Linear model setup We define the (0,1) matrix L_X (n × K) with elements (L_X)_{ik} = I(x_i^s = x_k); likewise L_Z (n × J). Then H_0 says: y = L_X β + ε, E[ε | X = x_k] = 0 ∀k. β can be consistently estimated by OLS under exogeneity: β̂ = (L′_X L_X)^{-1} L′_X y = (Σ_{i=1}^n y_i I(x_i^s = x_1) / Σ_{i=1}^n I(x_i^s = x_1), ..., Σ_{i=1}^n y_i I(x_i^s = x_K) / Σ_{i=1}^n I(x_i^s = x_K))′, i.e. the vector of cell means. Theorem 2. If X is exogenous, then the nonparametric (OLS) estimator β̂ is consistent and √n(β̂ − β) →_d N(0, σ² D_X^{-1}), where D_X is diag(p_k). 8/28
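Under exogeneity the estimator is nothing more than the vector of within-cell sample means of y. A short self-contained sketch on simulated data (all values hypothetical) confirming that the matrix formula and the cell-mean form agree:

```python
import numpy as np

rng = np.random.default_rng(1)
n, K = 2000, 4
x = rng.integers(0, K, size=n)                      # discrete regressor
beta = np.array([1.0, 2.0, 3.0, 4.0])               # h(x_k) at the K support points
y = beta[x] + rng.normal(0, 0.5, size=n)            # exogenous additive error

# L_X is the n x K indicator matrix with (L_X)_{ik} = I(x_i = x_k)
L_X = (x[:, None] == np.arange(K)).astype(float)
beta_ols = np.linalg.solve(L_X.T @ L_X, L_X.T @ y)  # (L_X' L_X)^{-1} L_X' y

# identical to the vector of cell means, since L_X' L_X is diagonal
beta_cells = np.array([y[x == k].mean() for k in range(K)])
print(np.allclose(beta_ols, beta_cells))            # True
```

The equivalence holds because L′_X L_X = diag(cell counts) and L′_X y stacks the cell sums.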
  • 30. Linear model setup ... or by IV (with L_Z as instruments) under endogeneity, when J ≥ K: β̂_IV = (L′_X P_{L_Z} L_X)^{-1} L′_X P_{L_Z} y. Theorem 3. Under the assumptions above, the IV estimator β̂_IV is consistent and √n(β̂_IV − β) →_d N(0, σ² (P′ D_Z^{-1} P)^{-1}), where P is the matrix of joint probabilities with elements p_{jk} = Pr[Z = z_j, X = x_k], j = 1, ..., J, k = 1, ..., K, and D_Z is diag(q_j). BUT no consistent estimator exists for K − J linear functionals if X is endogenous and J < K. 9/28
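A sketch of the IV form with indicator-matrix instruments, on simulated data. The DGP below is hypothetical, chosen only so that ε is correlated with X but mean-independent of Z; the projection quadratic forms are computed through the small J × J Gram matrix rather than the n × n projection matrix:

```python
import numpy as np

rng = np.random.default_rng(2)
n, K, J = 20_000, 2, 2
z = rng.integers(0, J, size=n)                      # binary instrument
u = rng.normal(size=n)                              # unobservable causing endogeneity
x = (0.8 * z + u > 0.4).astype(int)                 # X depends on both Z and u
beta = np.array([0.0, 1.0])
y = beta[x] + u                                     # error u is correlated with X

L_X = (x[:, None] == np.arange(K)).astype(float)
L_Z = (z[:, None] == np.arange(J)).astype(float)

# L_X' P_{L_Z} L_X and L_X' P_{L_Z} y without forming the n x n projection
A = L_X.T @ L_Z                                     # K x J cross counts
W = np.linalg.inv(L_Z.T @ L_Z)                      # inverse instrument cell counts
beta_ols = np.linalg.solve(L_X.T @ L_X, L_X.T @ y)  # badly biased here
beta_iv = np.linalg.solve(A @ W @ A.T, A @ W @ (L_Z.T @ y))
print(beta_ols, beta_iv)                            # IV lands near (0, 1); OLS does not
```

Here E[u | Z] = 0, so Z is a valid instrument, while E[u | X] ≠ 0, so the cell-mean OLS estimator is inconsistent.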
  • 34. Test for exogeneity Test statistics differ depending on whether J ≥ K or J < K. For J ≥ K, (a modified version of) the Wu-Hausman test: Theorem 4. Under H_0 and the assumptions above, T_n →_d χ²_{K−1}. Theorem 5. Under the sequence of local alternatives and the assumptions above, the test statistic T_n →_d Gamma(α, λ, θ), with shape parameter α = (K − 1)/2, scale parameter θ = 2σ̄²/σ², and noncentrality parameter λ = 2δ², where δ² = ξ′ Σ_{11}^{-1} ξ / σ². 10/28
  • 38. Test for exogeneity For J < K, the test is based on two SSEs: Unrestricted: in the model y = L_X β + ε, i.e., y′ M_{L_X} y; and Restricted: minimising the SSE in this model subject to π̂ = Π̂β. Test statistic: R_n = y′ M_{L_X} L_Z (L′_Z P_{L_X} L_Z)^{-1} L′_Z M_{L_X} y / (n^{-1} y′ M_{L_X} y). 11/28
  • 39. Test for exogeneity Theorem 6. Under H_0 and the assumptions above, R_n →_d z′ Ω^{-1} z ~ Σ_{j=1}^{J−1} ω_j χ²_j(1), where z ~ N(0, Σ), with Σ as defined above, Ω := C′_J (P D_X^{-1} P′ − p_Z p′_Z) C_J, and the ω_j are positive eigenvalues satisfying det[Σ − ωΩ] = 0, with the χ²_j(1) variables independent copies of a χ²_1 random variable. 12/28
  • 42. Critical value computation Using consistent estimates ω̂_j, simulate the distribution of Σ_{j=1}^{J−1} ω̂_j χ²_j(1) to get the appropriate 1 − α quantiles; or simulate the quadratic form z′ Ω̂^{-1} z, with z ~ N(0, Σ̂), and compute the quantiles; or approximate by the distribution of aχ²(v) + b, choosing (a, b, v) to match the first three cumulants. 13/28
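The first and third methods are cheap to sketch. In the snippet below the ω̂_j are stand-in values (hypothetical, not estimates from any dataset); it simulates the quantiles of Σ_j ω̂_j χ²_j(1) directly and then via the three-cumulant aχ²(v) + b approximation:

```python
import numpy as np

rng = np.random.default_rng(3)
w = np.array([0.7, 0.2, 0.1])                 # stand-ins for the estimated omega_j

# Method 1: simulate sum_j w_j chi2_j(1) and read off the 1 - alpha quantiles
draws = rng.chisquare(1, size=(200_000, w.size)) @ w
crit = np.quantile(draws, [0.90, 0.95, 0.99])

# Method 3: match the first three cumulants. For sum_j w_j chi2(1):
#   k1 = sum w_j, k2 = 2 sum w_j^2, k3 = 8 sum w_j^3,
# while a*chi2(v) + b has k1 = a v + b, k2 = 2 a^2 v, k3 = 8 a^3 v.
k1, k2, k3 = w.sum(), 2 * (w**2).sum(), 8 * (w**3).sum()
a = k3 / (4 * k2)
v = k2 / (2 * a**2)
b = k1 - a * v
approx = a * np.quantile(rng.chisquare(v, 200_000), [0.90, 0.95, 0.99]) + b
print(crit, approx)                           # the two sets of quantiles are close
```

Equating the three cumulants gives a = k3/(4 k2), v = k2/(2a²), b = k1 − av, so no numerical solver is needed.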
  • 43. Generalizations: model with two discrete regressors Y = h(W, X) + ε, E[ε | Z = z_j, W = w_d] = 0 ∀j, d. We define L_WX (n × DK) with elements (L_WX)_{i,dk} = I(W = w_d) I(X = x_k); likewise L_WZ (n × DJ), and H_0 says: y = L_WX β + ε, E[ε | W = w_d, X = x_k] = 0 ∀d, k. 14/28
  • 47. Structure of regression matrix L_WX is a permutation of the rows of the block-diagonal matrix diag(L_X^1, L_X^2, ..., L_X^D). Observations corresponding to L_X^d all have W = w_d, and rows identify which values of X occur where. Similarly for L_WZ. We assume that all possible combinations of the K support points of X, the J support points of Z and the D support points of W occur in the sample. 15/28
  • 51. Identification in the general model The vector with elements h(w_d, x_k) can be split into D K × 1 vectors, with component vectors h_d(x_k), say, one for each w_d. So we have D problems of the same type as the case with W absent [split y into D subvectors y_d]. The instrument is valid for each subsample. For h to be point-identified, each h_d must be, so the condition is again J ≥ K. 16/28
  • 56. Most general model Now assume several variables of each type: X_1, ..., X_R, W_1, ..., W_U, Z_1, ..., Z_T, with respective supports of dimensions K_r, S_u, J_t. We want to test the joint endogeneity of (X_1, ..., X_R). Label combinations of support points thus: x_α = (x_{α_1}, ..., x_{α_R}), 1 ≤ α_r ≤ K_r; w_β = (w_{β_1}, ..., w_{β_U}), 1 ≤ β_u ≤ S_u; z_γ = (z_{γ_1}, ..., z_{γ_T}), 1 ≤ γ_t ≤ J_t. Order the sequences lexicographically. This is indistinguishable from the case of one variable of each type, except that J = Π_{t=1}^T J_t, K = Π_{r=1}^R K_r, S = Π_{u=1}^U S_u. 17/28
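The lexicographic bookkeeping is mechanical. A sketch (with hypothetical supports) of how several discrete instruments collapse into a single instrument whose support has J = Π_t J_t points:

```python
import numpy as np

rng = np.random.default_rng(4)
n = 1000
J_t = (2, 3, 2)                    # support sizes of three instruments Z_1, Z_2, Z_3
Z = np.column_stack([rng.integers(0, j, size=n) for j in J_t])

# lexicographic index of each combination (z_{gamma_1}, z_{gamma_2}, z_{gamma_3})
combined = np.ravel_multi_index(Z.T, dims=J_t)

J = int(np.prod(J_t))              # 12 combined support points
print(J, combined.min(), combined.max())
```

The combined index plays exactly the role of a single discrete Z in the earlier slides, which is why the identification condition depends only on the product J, not on the number of instruments T.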
  • 61. In particular: identification The necessary and sufficient condition remains J ≥ K, but with J and K defined as products of the J_t and K_r. BUT note: there is NO requirement that there be at least as many instruments as endogenous variables (T ≥ R); all that is needed is J ≥ K. Of course, more instruments increases J = Π_{t=1}^T J_t. 18/28
  • 64. Applications: motivation There are many published applications where a discrete endogenous regressor is instrumented by a variable with insufficient support, e.g. Card (1995), Angrist and Krueger (1991), Bronars and Grogger (1994), Lochner and Moretti (2004). Point identification is achieved by assuming a parametric (linear) specification. Parametric vs. nonparametric specification testing (e.g. Horowitz (2006)) is not possible in this case. 19/28
  • 66. Application: Card (1995) on returns to schooling We are interested in the relationship between an individual's wage Y and education X (in the presence of exogenous covariates W) in Y = h(X, W) + ε. Card (1995) treats education as endogenous and estimates ln(wage_i) = β_0 + β_1 X_i + Σ_{s=1}^S γ_s W_{si} + ε_i by 2SLS, using a binary instrument Z, which takes value 1 if there is a college in the neighbourhood and 0 otherwise. Point identification is achieved by imposing the parametric (linear) specification, which is not testable. 20/28
  • 71. Data The dataset consists of 3010 observations from the National Longitudinal Survey of Young Men. The (sample) support of the education variable consists of K = 18 different values, and for a binary instrument J = 2. Data limitation: the more exogenous covariates, the less likely it is to get observations for all possible combinations of the support points. Educational levels (K = 4): less than high school, high school, some college, post-college education. Potential labour market experience levels: low and high. 21/28
  • 72. Results

    Covariates                  Rn      cv.1    cv.2    cv.3    α
    Educ                        1.765   0.239   0.232   0.238   1%
                                        0.136   0.132   0.138   5%
                                        0.094   0.096   0.097   10%
    Educ*, Exp*                 4.147   1.221   1.259   1.217   1%
                                        0.715   0.696   0.719   5%
                                        0.511   0.500   0.515   10%
    Educ*, Exp*, Race           3.572   1.771   1.692   1.688   1%
                                        1.107   1.131   1.108   5%
                                        0.849   0.871   0.860   10%
    Educ*, Exp*, Race, SMSA     2.955   2.382   2.330   2.415   1%
                                        1.702   1.679   1.735   5%
                                        1.399   1.365   1.430   10%
    22/28
  • 76. Outcome Education is endogenous, whatever the specification of the W's. So linearity is not testable, because no consistent estimator exists for h(·). Some linear functionals of interest may be point-identified: use the test above to check. We can consistently estimate an identified linear combination. 23/28
  • 77. Testing for point-identifiability of linear functionals As J = 2, linear functionals of only 2 parameters might be point-identified, e.g. the difference in earnings across different years of education.

    linear combination    Gn        L̂(β)      Chesher's bounds
    h(3) − h(2)           0.1356    0.0040     –
    h(7) − h(6)           1.9332    0.1017     (0.0365, 0.2895)
    h(8) − h(7)           0.1494    0.2395     (−0.1732, 0.352)
    h(9) − h(8)           26.5527   –          (−0.2742, 0.1334)
    h(10) − h(9)          75.2217   –          –
    h(11) − h(10)         4.7003    0.1317     (−0.057, 0.3187)
    h(14) − h(13)         61.5525   –          –
    h(17) − h(16)         10.7344   −0.1900    –
    h(18) − h(17)         74.1413   –          –
    24/28
  • 80. Application: Angrist and Krueger (1991) on returns to schooling Angrist and Krueger (1991) estimate ln(wage_i) = β X_i + Σ_c δ_c Y_{ci} + Σ_{s=1}^S γ_s W_{si} + ε_i by 2SLS, using quarter of birth as an instrument for (assumed) endogenous education. Data: 1980 Census, split into a 1930-1939 cohort (40-49 year-old men) and a 1940-1949 cohort (30-39 year-old men). Now K = 21 and J = 4. 25/28
  • 81. Results: 1930s cohort

                           critical values
    Year          Rn       1%        5%        10%
    1930          0.645    17.144    11.026    8.433
    1931          1.000    19.806    12.614    9.582
    1932          10.843   21.541    14.313    11.184
    1933          2.385    18.980    12.685    9.952
    1934          6.498    25.674    16.824    13.025
    1935          2.728    20.451    13.374    10.340
    1936          10.990   29.102    18.465    13.980
    1937          1.614    13.467    9.032     7.101
    1938          1.344    22.932    15.107    11.737
    1939          9.649    22.130    14.837    11.664
    full cohort   38.044   85.933    72.138    65.465
    26/28
  • 82. Results: 1940s cohort

                            critical values
    Year          Rn        1%         5%         10%
    1940          3.528     24.137     15.704     12.096
    1941          18.143    24.733     16.005*    12.286*
    1942          6.517     34.282     21.810     16.535
    1943          99.840    55.818*    35.712*    27.202*
    1944          22.665    39.214     24.860     18.823*
    1945          31.736    26.623*    17.705*    13.847*
    1946          17.181    23.478     15.183*    11.642*
    1947          22.803    33.000     21.830*    17.012*
    1948          34.116    46.991     29.790*    22.552*
    1949          32.952    36.445     23.627*    18.168*
    full cohort   278.703   138.344*   114.551*   103.182*
    27/28
  • 87. To conclude... We propose consistent nonparametric exogeneity tests applicable in models with discrete regressors. The tests confirm endogeneity of the education variable in some classic applied work, but suggest that linearity of these models might be a bold assumption; we suggest a nonparametric approach, or finding instruments with more support points! THE END 28/28