Lecture 5: Instrumental Variables
Applied Micro-Econometrics, Fall 2020
Zhaopeng Qu
Nanjing University
10/29/2020
Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 1 / 99
1 Review Previous Lecture of Internal Validity
2 Instrumental Variable Method
3 Checking Instrument Validity
4 Instrumental Variable for multiple regression
5 Review the last lecture
6 IV with Heterogeneous Causal Effects
7 Some Practical Guides by Angrist and Pischke(2012)
8 A good example: Long live Keju
Section 1
Review Previous Lecture of Internal Validity
Threats to Internal Validity
Three sources of endogeneity in OLS regression are:
Omitted variable bias (a variable that is correlated with X and affects Y but is unobserved)
Simultaneity or reverse causality bias (X causes Y, Y causes X)
Errors-in-variables bias (X is measured with error)
One way to deal with these endogeneity problems is the instrumental variable (IV) method.
Section 2
Instrumental Variable Method
Introduction
The earliest applications involved attempts to estimate demand and supply curves for a product.
A simple but difficult question: how do we find the supply or demand curve?
Difficulty: we can only observe the intersections of supply and demand, which yield price-quantity pairs.
Solution: Wright (1928) used variables that appear in only one equation to shift that equation and trace out the other.
The variables that do the shifting came to be known as instrumental variables.
It is well known that IV can address omitted variable bias, measurement error, and reverse causality.
Terminology: endogeneity and exogeneity
An endogenous variable is a regressor of interest that is correlated with u.
An exogenous variable is one that is uncorrelated with u.
Historical note: “Endogenous” literally means “determined within the system,” that is, a variable that is jointly determined with Y and is therefore subject to simultaneous causality.
However, this definition is narrow: IV regression can be used to address OVB and errors-in-variables bias, not just simultaneous causality bias.
Instrumental variables: 1 endogenous regressor & 1 instrument
Suppose a simple OLS regression like the previous equation
𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖
If 𝐸[𝑢𝑖|𝑋𝑖] ≠ 0, OLS is inconsistent, but we can use an instrumental variable (𝑍𝑖) to obtain a consistent estimate of 𝛽1.
Intuitively, we want to split 𝑋𝑖 into two parts:
1 a part that is correlated with the error term;
2 a part that is uncorrelated with the error term.
If we can isolate the variation in 𝑋𝑖 that is uncorrelated with 𝑢𝑖, then we can use this part to obtain a consistent estimate of the causal effect of 𝑋𝑖 on 𝑌𝑖.
An instrumental variable 𝑍𝑖 must satisfy the following 2 properties:
1 Instrument relevance: 𝑍𝑖 must be correlated with the causal variable of interest, 𝑋𝑖 (the endogenous variable), thus
𝐶𝑜𝑣(𝑋𝑖, 𝑍𝑖) ≠ 0
2 Instrument exogeneity: 𝑍𝑖 is as good as randomly assigned and affects 𝑌𝑖 only through its effect on 𝑋𝑖, thus
𝐶𝑜𝑣(𝑍𝑖, 𝑢𝑖) = 0
IV estimator: Jargon
Our simple OLS regression: Causal relationship of interest
𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖
First-Stage regression: regress endogenous variable on IV
𝑋𝑖 = 𝜋0 + 𝜋1𝑍𝑖 + 𝑣𝑖
Reduced-Form: regress outcome variable on IV
𝑌𝑖 = 𝛿0 + 𝛿1𝑍𝑖 + 𝑒𝑖
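The three equations are linked. As a short worked step using only the symbols defined on this slide, substituting the first stage into the causal relationship gives the reduced form:

$$
\begin{aligned}
Y_i &= \beta_0 + \beta_1(\pi_0 + \pi_1 Z_i + v_i) + u_i \\
    &= (\beta_0 + \beta_1\pi_0) + \beta_1\pi_1 Z_i + (\beta_1 v_i + u_i)
\end{aligned}
$$

so that $\delta_0 = \beta_0 + \beta_1\pi_0$, $\delta_1 = \beta_1\pi_1$ and $e_i = \beta_1 v_i + u_i$. Provided $\pi_1 \neq 0$, the causal coefficient is the reduced-form coefficient divided by the first-stage coefficient, $\beta_1 = \delta_1 / \pi_1$.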
IV estimator: Two-Stage Least Squares (2SLS)
We can estimate the causal effect of $X_i$ on $Y_i$ in two steps:
1 First stage: regress $X_i$ on $Z_i$ and obtain the predicted values
$$ \hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i $$
If $Cov(Z_i, u_i) = 0$, then $\hat{X}_i$ contains only variation in $X_i$ that is uncorrelated with $u_i$.
2 Second stage: regress $Y_i$ on $\hat{X}_i$ to obtain the Two-Stage Least Squares estimator $\hat{\beta}_{2SLS}$
$$ \hat{\beta}_{2SLS} = \frac{\sum (Y_i - \bar{Y})(\hat{X}_i - \bar{\hat{X}})}{\sum (\hat{X}_i - \bar{\hat{X}})^2} $$
Substituting
$$ \hat{X}_i - \bar{\hat{X}} = \hat{\pi}_1 (Z_i - \bar{Z}) $$
we obtain
$$
\begin{aligned}
\hat{\beta}_{2SLS} &= \frac{\sum (Y_i - \bar{Y})(\hat{X}_i - \bar{\hat{X}})}{\sum (\hat{X}_i - \bar{\hat{X}})^2} \\
&= \frac{\sum (Y_i - \bar{Y})\,\hat{\pi}_1 (Z_i - \bar{Z})}{\sum \hat{\pi}_1^2 (Z_i - \bar{Z})^2} \\
&= \frac{1}{\hat{\pi}_1} \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (Z_i - \bar{Z})^2} \\
&= \frac{\sum (Z_i - \bar{Z})^2}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \times \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (Z_i - \bar{Z})^2}
\end{aligned}
$$
The last step uses the first-stage OLS slope $\hat{\pi}_1 = \sum (X_i - \bar{X})(Z_i - \bar{Z}) / \sum (Z_i - \bar{Z})^2$.
This gives the instrumental variable estimator
$$ \hat{\beta}_{2SLS} = \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} = \frac{s_{ZY}}{s_{ZX}} $$
The TSLS estimator of $\beta_1$ is the ratio of the sample covariance between $Z$ and $Y$ to the sample covariance between $Z$ and $X$.
If $Z_i = X_i$, then $\hat{\beta}_{2SLS} = \hat{\beta}_{OLS}$.
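The equivalence between the two-step procedure and the covariance ratio can be checked numerically. The following is a minimal sketch on simulated data; the data-generating process, the coefficient values and the helper name `ols_slope` are illustrative assumptions, not part of the lecture.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Simulated data (all numbers are illustrative assumptions).
z = rng.normal(size=n)                  # instrument
v = rng.normal(size=n)                  # first-stage error
u = 0.8 * v + rng.normal(size=n)        # structural error, correlated with x through v
x = 1.0 + 0.5 * z + v                   # endogenous regressor
y = 2.0 + 1.5 * x + u                   # true beta_1 = 1.5


def ols_slope(dep, reg):
    """Slope from a simple OLS regression of dep on reg (with intercept)."""
    return np.sum((reg - reg.mean()) * (dep - dep.mean())) / np.sum((reg - reg.mean()) ** 2)


# Step 1: first stage, then fitted values.
pi1_hat = ols_slope(x, z)
pi0_hat = x.mean() - pi1_hat * z.mean()
x_hat = pi0_hat + pi1_hat * z

# Step 2: regress y on the fitted values.
beta_two_step = ols_slope(y, x_hat)

# Closed form: ratio of sample covariances s_ZY / s_ZX.
beta_ratio = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]

# With Z = X the same ratio collapses to the OLS slope.
beta_ols = ols_slope(y, x)
beta_self_iv = np.cov(x, y)[0, 1] / np.cov(x, x)[0, 1]

print(beta_two_step, beta_ratio, beta_ols)
```

The two-step estimate and the covariance ratio agree up to floating-point error, while the OLS slope differs because $X$ is correlated with $u$ in this simulation.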
Statistical properties of the 2SLS estimator: Unbiasedness
Consider $E[\hat{\beta}_{2SLS}]$:
$$
\begin{aligned}
E[\hat{\beta}_{2SLS}] &= E\left[ \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] \\
&= E\left[ \frac{\sum [(\beta_0 + \beta_1 X_i + u_i) - (\beta_0 + \beta_1 \bar{X} + \bar{u})](Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] \\
&= E\left[ \frac{\sum \beta_1 (X_i - \bar{X})(Z_i - \bar{Z}) + \sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] \\
&= \beta_1 + E\left[ \frac{\sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] \\
&= \beta_1 + E\left[ \frac{\sum u_i (Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right]
\end{aligned}
$$
Because instrument exogeneity implies only $Cov(Z_i, u_i) = 0$, and not $E[u_i | Z_i, X_i] = 0$, we have
$$ E\left[ \frac{\sum u_i (Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] = E\left[ \frac{\sum E[u_i | X_i, Z_i](Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \right] \neq 0 $$
Then we have
$$ E[\hat{\beta}_{2SLS}] \neq \beta_1 $$
This means that the 2SLS estimator is biased.
Statistical properties of the 2SLS estimator: Consistency
Take the simple regression $Y_i = \beta_0 + \beta_1 X_i + u_i$ and compute the covariance of $Y_i$ and $Z_i$:
$$
\begin{aligned}
Cov(Z_i, Y_i) &= Cov[Z_i, (\beta_0 + \beta_1 X_i + u_i)] \\
&= Cov(Z_i, \beta_0) + \beta_1 Cov(Z_i, X_i) + Cov(Z_i, u_i) \\
&= \beta_1 Cov(Z_i, X_i)
\end{aligned}
$$
Thus if the instrument is valid,
$$ \beta_1 = \frac{Cov(Z_i, Y_i)}{Cov(Z_i, X_i)} $$
The population coefficient is the ratio of the population covariance between $Z$ and $Y$ to the population covariance between $Z$ and $X$.
As discussed in Section 3.7, the sample covariance is a consistent estimator of the population covariance, thus $s_{ZY} \xrightarrow{p} Cov(Z_i, Y_i)$ and $s_{ZX} \xrightarrow{p} Cov(Z_i, X_i)$.
Then the TSLS estimator is consistent:
$$ \hat{\beta}_{2SLS} = \frac{s_{ZY}}{s_{ZX}} \xrightarrow{p} \frac{Cov(Z_i, Y_i)}{Cov(Z_i, X_i)} = \beta_1 $$
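Consistency can be illustrated with a small simulation. This is a sketch under an assumed data-generating process (true $\beta_1 = 2$, a valid instrument and an endogenous regressor); the function names are hypothetical. In this design the OLS probability limit is $2 + Cov(X,u)/Var(X) = 2.4$, while the IV ratio should approach 2 as $n$ grows.

```python
import numpy as np


def simulate(n, rng):
    """Draw (y, x, z) with an endogenous x and a valid instrument z."""
    z = rng.normal(size=n)
    v = rng.normal(size=n)
    u = 0.8 * v + rng.normal(size=n)    # Cov(x, u) = 0.8, Cov(z, u) = 0
    x = z + v                           # Var(x) = 2, Cov(z, x) = 1
    y = 1.0 + 2.0 * x + u               # true beta_1 = 2
    return y, x, z


def iv_estimate(y, x, z):
    """Ratio of sample covariances s_ZY / s_ZX."""
    return np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]


def ols_estimate(y, x):
    """OLS slope s_XY / s_X^2."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)


rng = np.random.default_rng(1)
results = {}
for n in (100, 10_000, 1_000_000):
    y, x, z = simulate(n, rng)
    results[n] = (ols_estimate(y, x), iv_estimate(y, x, z))
    print(n, results[n])
```

In small samples the IV estimate can be far from 2 (it is biased and noisy), but the largest sample should sit close to the true value while OLS stays near 2.4.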
Statistical properties of 2SLS: Sampling distribution
Similar to the expression for the OLS estimator in Equation (4.30), page 183 in S.W.:
$$
\begin{aligned}
\hat{\beta}_{2SLS} &= \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
&= \frac{\sum [(\beta_0 + \beta_1 X_i + u_i) - (\beta_0 + \beta_1 \bar{X} + \bar{u})](Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
&= \frac{\sum \beta_1 (X_i - \bar{X})(Z_i - \bar{Z}) + \sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
&= \beta_1 + \frac{\sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
&= \beta_1 + \frac{\frac{1}{n} \sum u_i (Z_i - \bar{Z})}{\frac{1}{n} \sum (X_i - \bar{X})(Z_i - \bar{Z})}
\end{aligned}
$$
In large samples, $\bar{Z} \cong \mu_Z$. Let $q_i = (Z_i - \mu_Z) u_i$; then the numerator is
$$ \frac{1}{n} \sum u_i (Z_i - \bar{Z}) \cong \frac{1}{n} \sum q_i = \bar{q} $$
Because $Cov(Z_i, u_i) = 0$ and $E(u_i) = 0$,
$$ Cov(Z_i - \mu_Z, u_i) = E[(Z_i - \mu_Z) u_i] = E(q_i) = 0 $$
In addition, the variance of $q_i$ is $\sigma_q^2 = Var[(Z_i - \mu_Z) u_i]$.
We also have
$$ Var(\bar{q}) = \sigma_{\bar{q}}^2 = \frac{\sigma_q^2}{n} = \frac{1}{n} Var[(Z_i - \mu_Z) u_i] $$
By the central limit theorem (C.L.T.), in large samples
$$ \frac{\bar{q}}{\sigma_{\bar{q}}} \xrightarrow{d} N(0, 1) $$
Because the sample covariance is consistent for the population covariance, $s_{ZX} \xrightarrow{p} Cov(Z_i, X_i)$, so we obtain
$$ \hat{\beta}_{2SLS} \cong \beta_1 + \frac{\bar{q}}{Cov(Z_i, X_i)} $$
In addition, because $\bar{q}$ is approximately distributed $N(0, \sigma_{\bar{q}}^2)$ in large samples, $\bar{q} / Cov(Z_i, X_i)$ is approximately distributed
$$ N\left(0, \frac{\sigma_{\bar{q}}^2}{[Cov(Z_i, X_i)]^2}\right) $$
Finally, in large samples $\hat{\beta}_{2SLS}$ is approximately distributed
$$ N(\beta_1, \sigma_{\hat{\beta}_{2SLS}}^2) $$
where
$$ \sigma_{\hat{\beta}_{2SLS}}^2 = \frac{\sigma_{\bar{q}}^2}{[Cov(Z_i, X_i)]^2} = \frac{1}{n} \frac{Var[(Z_i - \mu_Z) u_i]}{[Cov(Z_i, X_i)]^2} \quad (12.8) $$
Statistical properties of 2SLS: Statistical inference
The variance of $\hat{\beta}_{2SLS}$ can be estimated by estimating the variance and covariance terms appearing in Equation (12.8). The standard error of the IV estimator is the square root of this estimate, with $\mu_Z$ replaced by the sample mean $\bar{Z}$ and $u_i$ by the residual $\hat{u}_i$, which is a little complicated:
$$ SE(\hat{\beta}_{2SLS}) = \sqrt{ \frac{\frac{1}{n} \sum (Z_i - \bar{Z})^2 \hat{u}_i^2}{n \left( \frac{1}{n} \sum (Z_i - \bar{Z}) X_i \right)^2} } $$
Fortunately, this is done automatically by TSLS regression commands in econometric software packages.
Because $\hat{\beta}_{2SLS}$ is normally distributed in large samples, hypothesis tests about $\beta_1$ can be performed by computing the t-statistic, and a 95% large-sample confidence interval is given by
$$ \hat{\beta}_{2SLS} \pm 1.96\, SE(\hat{\beta}_{2SLS}) $$
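The standard error formula can be evaluated by hand. The sketch below uses simulated data in which $Var[(Z_i - \mu_Z)u_i] = 1$ and $Cov(Z_i, X_i) = 1$ by construction, so Equation (12.8) implies a standard error of roughly $1/\sqrt{n}$; the data-generating process and all variable names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000

z = rng.normal(size=n)
v = rng.normal(size=n)
u = 0.6 * v + 0.8 * rng.normal(size=n)   # Var(u) = 1, Cov(z, u) = 0
x = z + v                                # Cov(z, x) = 1
y = 1.0 + 2.0 * x + u                    # true beta_1 = 2

zc = z - z.mean()                        # sample mean replaces mu_Z
beta1 = np.sum(zc * (y - y.mean())) / np.sum(zc * (x - x.mean()))
beta0 = y.mean() - beta1 * x.mean()

# TSLS residuals use the actual x, not the first-stage fitted values.
u_hat = y - beta0 - beta1 * x

num = np.mean(zc**2 * u_hat**2)          # estimates Var[(Z - mu_Z) u]
den = np.mean(zc * x) ** 2               # estimates Cov(Z, X)^2
se = np.sqrt(num / (n * den))

ci = (beta1 - 1.96 * se, beta1 + 1.96 * se)   # 95% large-sample interval
t_stat = (beta1 - 2.0) / se                   # t-statistic for H0: beta_1 = 2
print(beta1, se, ci, t_stat)
```

In applied work the same numbers would normally come from a packaged TSLS routine; the hand computation is only meant to make each term of the formula concrete.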
Application: Angrist and Krueger(1991)
Angrist, Joshua D. and Alan B. Krueger. 1991. “Does Compulsory School Attendance Affect Schooling and Earnings?” The Quarterly Journal of Economics 106 (4): 979–1014.
They use quarter of birth as an instrument for education to estimate
the returns to schooling.
Why quarter of birth?
In most of the U.S., students must attend school until age 16 (at least during 1938-1967).
The age at which children start school depends on their birthday, so under compulsory schooling laws the grade at which they can legally drop out also depends on their birthday.
Is schooling related to quarter of birth? (Assumption 1)
Angrist and Krueger(1991): The First Stage
Does quarter of birth affect education?
Regress education outcomes on quarter-of-birth dummy variables:
$$ S_{ijc} = \alpha + \beta_1 Q_{1ic} + \beta_2 Q_{2ic} + \beta_3 Q_{3ic} + \epsilon_{ijc} $$
where $i$ indexes individuals, $c$ indexes cohorts, $S$ is the education outcome, and $Q_j$ is the dummy for birth quarter $j$.
This is the first-stage regression.
It shows that 𝑄𝑗 does impact education outcomes such as total years
of education and high school graduation.
Angrist and Krueger(1991): Exogeneity
Is the relationship due to compulsory schooling laws?
Indirect evidence: look at post-secondary outcomes, which are not expected to be affected by compulsory schooling laws.
Angrist and Krueger(1991): Reduced form
Are earnings related to quarter of birth?
Angrist and Krueger(1991): OLS vs. IV
IV Estimates
Angrist and Krueger(1991): OLS vs. IV with covariates
Section 3
Checking Instrument Validity
Assumption #1 Instrument Relevance
The instrumental strategy seems very robust.
But how should we understand that the IV estimates in Angrist and Krueger (1991) are larger than the OLS estimates?
Bound et al. (1995) show that when instruments have limited explanatory power over the endogenous variable,
1 IV is biased towards OLS in finite samples.
2 This may happen even in very large samples.
Recall 2SLS: the simple regression equation is
$$ Y_i = \beta_0 + \beta_1 X_i + u_i $$
Get the predicted values from the first stage
$$ \hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i $$
and run the second-stage regression
$$ Y_i = \beta_0 + \beta_1 \hat{X}_i + u_i $$
Following the OLS formula in large samples, we obtain
$$ \hat{\beta}_1 \xrightarrow{p} \beta_1 + \frac{Cov(\hat{X}, u)}{Var(\hat{X})} $$
A 2SLS version of OVB:
$$
\begin{aligned}
\hat{\beta}_{2SLS} \xrightarrow{p}\ & \beta_1 + \frac{Cov(\hat{X}, u)}{Var(\hat{X})} \\
=\ & \beta_1 + \frac{Cov(\hat{\pi}_0 + \hat{\pi}_1 Z, u)}{Var(\hat{\pi}_0 + \hat{\pi}_1 Z)} \\
=\ & \beta_1 + \frac{\hat{\pi}_1 Cov(Z, u)}{\hat{\pi}_1^2 Var(Z)} \\
=\ & \beta_1 + \frac{Var(Z)}{Cov(Z, X)} \frac{Cov(Z, u)}{Var(Z)} \\
=\ & \beta_1 + \frac{Cov(Z, u)}{Cov(Z, X)}
\end{aligned}
$$
Weak Instruments
Assumption 1: Instrument Relevance
$$ Cov(X_i, Z_i) \neq 0 $$
Intuition: the more of the variation in $X$ is explained by the instruments, the more information is available for use in IV regression.
Conversely, instruments that explain little of the variation in $X$ are called weak instruments: there is only a very weak correlation between $X$ (the endogenous variable) and $Z$ (the IV).
Because
$$ \hat{\beta}_{2SLS} \xrightarrow{p} \beta_1 + \frac{Cov(Z, u)}{Cov(Z, X)} $$
as $Cov(Z, X)$ approaches 0, that is, as $Z$ becomes irrelevant for $X$, any nonzero $Cov(Z, u)$ makes the bias grow without bound.
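A small numerical illustration (the numbers are invented purely for illustration) shows how a weak first stage magnifies even a slight violation of exogeneity. Hold $Cov(Z, u) = 0.01$ fixed and compare a strong and a weak instrument:

$$ \frac{Cov(Z, u)}{Cov(Z, X)} = \frac{0.01}{0.5} = 0.02 \qquad \text{versus} \qquad \frac{0.01}{0.02} = 0.5 $$

The same small correlation between instrument and error produces an asymptotic bias 25 times larger when $Cov(Z, X)$ falls from 0.5 to 0.02.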
Weak Instruments: How to test for weak instruments?
We should therefore always check whether an instrument is relevant enough.
The first-stage F-statistic provides a measure of the information content contained in the instruments.
Stock and Yogo (2005) showed that
$$ E(\hat{\beta}_{2SLS}) - \beta_1 \cong \frac{E(\hat{\beta}_{OLS}) - \beta_1}{E(F) - 1} $$
where $E(F)$ is the expectation of the first-stage F-statistic. If $E(F) = 10$, the bias of 2SLS relative to the bias of OLS is approximately 1/9, which is small enough to be acceptable.
A rule of thumb: if the first-stage F-statistic exceeds 10, there is no need to worry too much about weak instruments.
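The rule of thumb is easy to apply once the first-stage F-statistic is available. Below is a minimal sketch for the single-instrument case, using the homoskedasticity-only F-statistic and simulated data; the function name `first_stage_f` and the first-stage coefficients (1.0 and 0.02) are assumptions for illustration.

```python
import numpy as np


def first_stage_f(x, z):
    """Homoskedasticity-only F for H0: pi_1 = 0 in x = pi_0 + pi_1 z + v."""
    n = len(x)
    Z = np.column_stack([np.ones(n), z])
    coef, *_ = np.linalg.lstsq(Z, x, rcond=None)
    resid = x - Z @ coef
    ssr_u = np.sum(resid**2)                 # unrestricted sum of squared residuals
    ssr_r = np.sum((x - x.mean()) ** 2)      # restricted model: intercept only
    q = 1                                    # number of restrictions
    return ((ssr_r - ssr_u) / q) / (ssr_u / (n - 2))


rng = np.random.default_rng(3)
n = 1_000
z = rng.normal(size=n)
v = rng.normal(size=n)

f_strong = first_stage_f(1.0 * z + v, z)     # pi_1 = 1
f_weak = first_stage_f(0.02 * z + v, z)      # pi_1 = 0.02

for label, f in (("strong", f_strong), ("weak", f_weak)):
    verdict = "passes rule of thumb" if f > 10 else "weak instrument"
    print(label, round(f, 2), verdict)
```

With several instruments, the same comparison of restricted and unrestricted sums of squares applies with `q` set to the number of excluded instruments.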
Angrist and Krueger(1991): Why IV over OLS?
In Angrist and Krueger (1991), despite the large sample sizes, the F-statistics for a test of the joint statistical significance of the excluded exogenous variables in the first-stage regression do not exceed 2.
Wrap up
If the correlation between the instruments and the endogenous variable is small, then even enormous sample sizes do not guarantee that quantitatively important finite-sample biases will be eliminated from IV estimates.
The first assumption of the IV method, the relevance of the instrument, can be checked with the first-stage F-statistic.
Potential Solutions
If you have many IVs, some strong and some weak, discard the weak ones.
If you only have one weak IV, find another, stronger IV (easy to say, very hard to do).
Employ an estimator other than 2SLS, such as LIML.
Assumption #2 Instrument Exogeneity
If the instruments are not exogenous, then TSLS is inconsistent.
After all, the idea of instrumental variables regression is that the instrument contains information about variation in 𝑋𝑖 that is unrelated to the error term 𝑢𝑖.
Can we statistically test the assumption that the instruments are exogenous?
Answer: in most cases, NO.
Assessing whether the instruments are exogenous necessarily requires making an expert judgment based on personal knowledge and expert opinion of the application (“tell a good story”).
In some cases, you can test it partially, using the overidentification test.
Terminology: the relationship between the number of instruments (𝑚) and the number of endogenous regressors (𝑘)
exactly (just) identified: 𝑚 = 𝑘
overidentified: 𝑚 > 𝑘
underidentified: 𝑚 < 𝑘
When the coefficients are just identified, you can’t do a formal statistical test of the hypothesis that the instruments are in fact exogenous.
If, however, there are more instruments than endogenous regressors, then there is a statistical tool that can be helpful in this process: the so-called test of overidentifying restrictions.
Overidentification test: Intuition
Suppose there are two valid instruments, 𝑍1 and 𝑍2, for a single endogenous regressor (you are very lucky).
Then you could compute two separate TSLS estimates, one with each instrument.
Intuitively,if these 2 TSLS estimates are very different from each
other, then something must be wrong: one or the other (or both) of
the instruments must be invalid.
The overidentifying restrictions test makes this comparison in a
statistically precise way.
Overidentification test
Our model is a multiple regression
$$ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + ... + \beta_k X_{ki} + \beta_{k+1} W_{1i} + ... + \beta_{k+r} W_{ri} + u_i \quad (12.13) $$
where
$Y_i$ is the dependent variable;
$X_{1i}, X_{2i}, ..., X_{ki}$ are the $k$ endogenous regressors;
$W_{1i}, W_{2i}, ..., W_{ri}$ are the additional exogenous variables;
$Z_{1i}, Z_{2i}, ..., Z_{mi}$ are the $m$ instrumental variables;
$u_i$ is the regression error term.
Given a set of $m$ instruments $Z_{1i}, Z_{2i}, ..., Z_{mi}$, run the 2SLS regression
$$ Y_i = \beta_0 + \beta_1 \hat{X}_{1i} + \beta_2 \hat{X}_{2i} + ... + \beta_k \hat{X}_{ki} + \beta_{k+1} W_{1i} + ... + \beta_{k+r} W_{ri} + u_i $$
Then we can obtain the TSLS residuals $\hat{u}_i^{TSLS}$, which are computed with the actual values of the $X$'s rather than the first-stage predicted values.
Regress the TSLS residuals on the instruments and the included exogenous variables:
$$ \hat{u}_i^{TSLS} = \delta_0 + \delta_1 Z_{1i} + ... + \delta_m Z_{mi} + \delta_{m+1} W_{1i} + ... + \delta_{m+r} W_{ri} + e_i $$
Let $F$ denote the homoskedasticity-only F-statistic testing the hypothesis that $\delta_1 = ... = \delta_m = 0$.
Then the overidentifying restrictions test statistic is $J = mF$.
Under the null hypothesis that all the instruments are exogenous,
$$ J \xrightarrow{d} \chi^2_{m-k} $$
where $m - k$ is the “degree of overidentification,” that is, the number of instruments minus the number of endogenous regressors.
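The steps of the J-test can be sketched in a few lines. The example below assumes one endogenous regressor, two instruments and no included exogenous variables $W$, so $m - k = 1$; the simulated data, the helper names `ols`, `j_test` and `chi2_sf_df1`, and the way the second instrument is made invalid are all illustrative assumptions.

```python
import math
import numpy as np


def ols(X, y):
    """OLS coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]


def j_test(y, x, Z):
    """J = mF for one endogenous regressor x and an (n x m) instrument matrix Z."""
    n, m = Z.shape
    const = np.ones(n)
    Zc = np.column_stack([const, Z])

    # 2SLS: first stage, then second stage on the fitted values.
    x_hat = Zc @ ols(Zc, x)
    b = ols(np.column_stack([const, x_hat]), y)

    # TSLS residuals are built from the actual x.
    u_hat = y - b[0] - b[1] * x

    # Regress residuals on the instruments; F tests delta_1 = ... = delta_m = 0.
    e = u_hat - Zc @ ols(Zc, u_hat)
    ssr_u = np.sum(e**2)
    ssr_r = np.sum((u_hat - u_hat.mean()) ** 2)
    F = ((ssr_r - ssr_u) / m) / (ssr_u / (n - m - 1))
    return max(m * F, 0.0)               # guard against tiny negative rounding


def chi2_sf_df1(j):
    """P(chi2 with 1 degree of freedom > j); only valid when m - k = 1."""
    return math.erfc(math.sqrt(j / 2.0))


rng = np.random.default_rng(4)
n = 5_000
Z = rng.normal(size=(n, 2))
v = rng.normal(size=n)
e = rng.normal(size=n)
x = Z[:, 0] + Z[:, 1] + v

errors = {
    "valid": 0.5 * v + e,
    "invalid": 0.5 * Z[:, 1] + 0.5 * v + e,   # second instrument enters the error
}
results = {}
for label, u in errors.items():
    y = 1.0 + 2.0 * x + u
    results[label] = j_test(y, x, Z)
    print(label, round(results[label], 2), chi2_sf_df1(results[label]))
```

A large J (small p-value) only says that at least one instrument appears endogenous; as the lecture stresses, it cannot say which one.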
Application: Demand for Cigarettes
A serious public health issue: huge externalities
One policy tool is to tax cigarettes so heavily that current smokers
cut back and potential new smokers are discouraged from taking up
the habit.
Precisely how big a tax hike is needed to make a dent in cigarette
consumption?
For example, what would the after-tax sales price of cigarettes need
to be to achieve a 20% reduction in cigarette consumption?
The answer to this question depends on the elasticity of demand for
cigarettes.
Because of the interactions between supply and demand, the elasticity of demand for cigarettes cannot be estimated consistently by an OLS regression of log quantity on log price.
Using annual data for the 48 contiguous U.S. states for 1995, we therefore use TSLS to estimate the elasticity of demand for cigarettes.
The instrumental variable, $SalesTax_i$, is the portion of the tax on cigarettes arising from the general sales tax, measured in dollars per pack.
Cigarette consumption, $Q_i^{cigarettes}$, is the number of packs of cigarettes sold per capita in the state,
and the price, $P_i^{cigarettes}$, is the average real price per pack of cigarettes including all taxes.
We consider quantity and price changes that occur over 10-year periods.
Dependent variable:
$$ \Delta \ln(Q_i^{cigarettes}) = \ln(Q_{i,1995}^{cigarettes}) - \ln(Q_{i,1985}^{cigarettes}) $$
Independent variable:
$$ \Delta \ln(P_i^{cigarettes}) = \ln(P_{i,1995}^{cigarettes}) - \ln(P_{i,1985}^{cigarettes}) $$
Control variable:
$$ \Delta \ln(Inc_i) = \ln(Inc_{i,1995}) - \ln(Inc_{i,1985}) $$
Two instruments
1 the change in the sales tax over 10 years,
Δ𝑆𝑎𝑙𝑒𝑠𝑇𝑎𝑥𝑖 = 𝑆𝑎𝑙𝑒𝑠𝑇𝑎𝑥𝑖,1995 − 𝑆𝑎𝑙𝑒𝑠𝑇𝑎𝑥𝑖,1985
2 the change in the cigarette-specific tax over 10 years
Δ𝐶𝑖𝑔𝑇𝑎𝑥𝑖 = 𝐶𝑖𝑔𝑇𝑎𝑥𝑖,1995 − 𝐶𝑖𝑔𝑇𝑎𝑥𝑖,1985
The first stage
The overidentifying restrictions J-test rejects the null hypothesis that both instruments are exogenous at the 5% significance level (p-value = 0.026).
The reason the J-statistic rejects the null hypothesis that both
instruments are exogenous is that the two instruments produce rather
different estimated coefficients.
The J-statistic rejection means that the regression in column (3) is
based on invalid instruments (the instrument exogeneity condition
fails).
The J-statistic rejection says that at least one of the instruments is
endogenous, so there are three logical possibilities
The sales tax is exogenous but the cigarette-specific tax is not, in
which case the column (1) regression is reliable;
the cigarette-specific tax is exogenous but the sales tax is not, so the
column (2) regression is reliable;
or neither tax is exogenous, so neither regression is reliable. The
statistical evidence cannot tell us which possibility is correct, so we
must use our judgement.
We think that the case for the exogeneity of the general sales tax is stronger than that for the cigarette-specific tax,
because the political process can link changes in the cigarette-specific tax to changes in the cigarette market and smoking policy.
If smoking decreases in a state because it falls out of fashion, there will be fewer smokers and a weakened lobby against cigarette-specific tax increases, which in turn could lead to higher cigarette-specific taxes.
This reasoning suggests discarding the estimates that use the cigarette-specific tax as an instrument and adopting the price elasticity estimated using the general sales tax as an instrument, which is more reliable.
The estimate of -0.94 indicates that cigarette consumption is somewhat elastic: an increase in price of 1% leads to a decrease in consumption of 0.94%.
Section 4
Instrumental Variable for multiple regression
IV for multiple regression (Key Concept 12.1)
Our model is a multiple regression
$$ Y_i = \beta_0 + \beta_1 X_{1i} + \beta_2 X_{2i} + ... + \beta_k X_{ki} + \beta_{k+1} W_{1i} + ... + \beta_{k+r} W_{ri} + u_i \quad (12.13) $$
where
$Y_i$ is the dependent variable;
$X_{1i}, X_{2i}, ..., X_{ki}$ are the $k$ endogenous regressors;
$W_{1i}, W_{2i}, ..., W_{ri}$ are the additional exogenous variables;
$Z_{1i}, Z_{2i}, ..., Z_{mi}$ are the $m$ instrumental variables;
$u_i$ is the regression error term.
Two Conditions for Valid Instruments
A set of $m$ instruments $Z_{1i}, Z_{2i}, ..., Z_{mi}$ must satisfy the following two conditions to be valid:
1 Instrument Relevance:
In general, let $\hat{X}^*_{1i}$ be the predicted value of $X_{1i}$ from the population regression of $X_{1i}$ on the instruments ($Z$) and the included exogenous regressors ($W$), and let “1” denote the constant regressor that takes on the value 1 for all observations. Then $(\hat{X}^*_{1i}, ..., \hat{X}^*_{ki}, W_{1i}, ..., W_{ri}, 1)$ are not perfectly multicollinear.
If there is only one $X$, then for the previous condition to hold, at least one $Z$ must have a non-zero coefficient in the population regression of $X$ on the $Z$ and the $W$.
2 Instrument Exogeneity:
The instruments are uncorrelated with the error term,
$$ Cov(Z_{1i}, u_i) = 0, ..., Cov(Z_{mi}, u_i) = 0 $$
The IV Regression Assumptions (Key Concept 12.4)
The variables and errors in the IV regression model in Key Concept 12.1 satisfy the following:
1 $E(u_i | W_{1i}, ..., W_{ri}) = 0$;
2 $(X_{1i}, ..., X_{ki}, W_{1i}, ..., W_{ri}, Z_{1i}, ..., Z_{mi}, Y_i)$ are i.i.d. draws from their joint distribution;
3 Large outliers are unlikely: the $X$, $W$, $Z$, and $Y$ have nonzero finite fourth moments;
4 The two conditions for a valid instrument hold.
Under the IV regression assumptions, the TSLS estimator is consistent and normally distributed in large samples.
Because the sampling distribution of the TSLS estimator is normal in large samples, the general procedures for statistical inference (hypothesis tests and confidence intervals) in regression models extend to TSLS regression.
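For the general model with included exogenous regressors, TSLS can be written compactly with matrices. The following is a minimal sketch, not a replacement for a packaged routine: the function name `tsls`, the simulated design (one endogenous regressor, one control, two instruments) and the coefficient values are assumptions, and no standard errors are computed.

```python
import numpy as np


def tsls(y, X_endog, W, Z):
    """2SLS coefficients for y on [1, X_endog, W], instrumenting X_endog with Z.

    X_endog, W and Z are 2-D arrays with n rows; y is 1-D.
    Returned order: intercept, endogenous regressors, exogenous regressors.
    """
    if Z.shape[1] < X_endog.shape[1]:
        raise ValueError("underidentified: need m >= k instruments")
    n = len(y)
    const = np.ones((n, 1))
    exog = np.hstack([const, W])                 # included exogenous regressors
    instruments = np.hstack([exog, Z])           # first-stage right-hand side
    regressors = np.hstack([const, X_endog, W])

    # First stage: project every regressor on the instrument set.
    # Exogenous columns are reproduced exactly; endogenous ones become fitted values.
    fitted = instruments @ np.linalg.lstsq(instruments, regressors, rcond=None)[0]

    # Second stage: regress y on the fitted regressors.
    return np.linalg.lstsq(fitted, y, rcond=None)[0]


rng = np.random.default_rng(5)
n = 20_000
Z = rng.normal(size=(n, 2))
W = rng.normal(size=(n, 1))
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)                 # endogeneity through v

x = 0.7 * Z[:, 0] + 0.7 * Z[:, 1] + 0.5 * W[:, 0] + v
y = 1.0 + 2.0 * x - 1.0 * W[:, 0] + u            # true coefficients: 1, 2, -1

beta = tsls(y, x.reshape(-1, 1), W, Z)
print(beta)
```

Including $W$ in the first stage matters: the fitted values of $X$ must come from a regression on both the instruments and the included exogenous regressors, as in the relevance condition above.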
Section 5
Review the last lecture
Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 62 / 99
Review the last lecture
Instrumental Variables: Constant Effects
Instrumental variables are a useful method for causal inference. They
can address
Omitted Variable Bias
Measurement Error
Reverse Causality
Two Assumptions
Relevance (Weak Instrument): It can be tested with the first-stage
regression and its F-statistic.
Exogeneity: It cannot be tested formally, but it can be argued for using
subject-matter knowledge.
Estimation and Inference
When the instrument satisfies these two assumptions, the TSLS estimator
of the coefficient of interest, $\hat{\beta}_{TSLS}$, is not unbiased, but it is
consistent.
The sampling distribution of the TSLS estimator is also normal in large
samples.
Section 6
IV with Heterogeneous Causal Effects
Example: Angrist(1990)
Topic: How does veteran status affect earnings?
Methods: Instrumental Variable
Use the lottery outcome as an instrument for veteran status
Example: Angrist(1990) Background
In the 1960s and 70s young men in the US were at risk of being
drafted for military service in Vietnam.
Fairness concerns led to the institution of a draft lottery in 1970 that
was used to determine priority for conscription.
In each year from 1970 to 1972, random sequence numbers were
assigned to each birth date in cohorts of 19-year-olds.
Men with lottery numbers below a cutoff were eligible for the draft.
Men with lottery numbers above the cutoff were not.
Example: Angrist(1990) Instrumental Variables
The instrument(𝑍𝑖) is thus defined as follows:
𝑍𝑖 = 1 if lottery implied individual i would be draft eligible,
𝑍𝑖 = 0 if lottery implied individual i would NOT be draft eligible.
The econometrician observes treatment status(𝐷𝑖) as follows:
𝐷𝑖 = 1 if individual i served in the Vietnam war (veteran)
𝐷𝑖 = 0 if individual i did not serve in the Vietnam war (not veteran)
Example: Angrist(1990): IV's Relevance and Exogeneity
While the lottery didn't completely determine veteran status, it
certainly mattered: relevance.
The lottery outcome was random, and it seems reasonable to suppose
that its only effect on earnings ran through veteran status: exogeneity.
Example: Angrist(1990): heterogeneous effects
We can classify individuals by how their treatment status (X) responds to
assignment (Z) into four groups: compliers, always-takers, never-takers
and defiers.
Local Average Treatment Effect(LATE)
So the IV estimate captures the effect of X on Y only for the
subpopulation of compliers.
Imbens and Angrist (1994) called it the Local Average Treatment
Effect (LATE), that is, the treatment effect on those who change their
behavior in response to the instrument.
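A small simulation can illustrate the LATE idea. It is only a sketch: the type shares, the effect sizes and the absence of defiers are assumptions chosen for illustration, not numbers from Angrist (1990). With a binary instrument, the IV estimate equals the Wald ratio of the difference in mean outcomes to the difference in treatment rates.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical type shares: 30% compliers, 20% always-takers, 50% never-takers
types = rng.choice(["complier", "always", "never"], size=n, p=[0.3, 0.2, 0.5])
z = rng.integers(0, 2, size=n)                     # randomly assigned instrument Z_i

# Treatment D_i: always-takers take it, never-takers do not, compliers follow Z_i
d = np.where(types == "always", 1, np.where(types == "never", 0, z))

# Hypothetical heterogeneous treatment effects by type
effect = np.select(
    [types == "complier", types == "always", types == "never"],
    [-2.0, 1.0, 0.5],
)
y0 = rng.normal(10.0, 1.0, size=n) + np.where(types == "always", 1.0, 0.0)
y = y0 + effect * d                                # observed outcome

# Wald estimator: difference in mean outcomes over difference in treatment rates
wald = (y[z == 1].mean() - y[z == 0].mean()) / (d[z == 1].mean() - d[z == 0].mean())
cov_ratio = np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]   # s_ZY / s_ZD
ate = effect.mean()                                # average effect over everyone

print("Wald / IV estimate:", wald)
print("Average effect in the whole sample:", ate)
```

In this setup the IV estimate should be close to the compliers' effect (set to -2.0 here) rather than to the average effect over all three groups.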
IV with Heterogeneous Causal Effects: Generalization
If the population is heterogeneous, then the $i$th individual now has his
or her own causal effect, $\beta_{1i}$, and the population regression equation
can be written
$$Y_i = \beta_{0i} + \beta_{1i}X_i + u_i \qquad (13.9)$$
$\beta_{1i}$ is a random variable that, just like $u_i$, reflects unobserved variation
across individuals.
The average causal effect is the population mean value of the causal
effect, $E(\beta_{1i})$, which is the expected causal effect of a randomly
selected member of the population.
OLS with Heterogeneous Causal Effects
If there is heterogeneity in the causal effect and if $X_i$ is randomly
assigned, then the differences estimator is a consistent estimator of
the average causal effect.
$$\hat{\beta}_{OLS} = \frac{s_{XY}}{s_X^2} \xrightarrow{p} \frac{Cov(Y_i, X_i)}{Var(X_i)} = \frac{Cov(\beta_{0i} + \beta_{1i}X_i + u_i, X_i)}{Var(X_i)} = \frac{Cov(\beta_{1i}X_i, X_i)}{Var(X_i)} = E(\beta_{1i})$$
Thus, if $X_i$ is randomly assigned, the OLS estimator $\hat{\beta}_1$ is a consistent
estimator of the average causal effect $E(\beta_{1i})$.
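The last two equalities in the OLS limit rely on random assignment, read here as $X_i$ being independent of $(\beta_{0i}, \beta_{1i}, u_i)$. Under that assumption, the missing steps can be sketched as follows:

```latex
\begin{aligned}
Cov(\beta_{0i} + u_i, X_i) &= 0, \\
Cov(\beta_{1i}X_i, X_i) &= E(\beta_{1i}X_i^2) - E(\beta_{1i}X_i)\,E(X_i) \\
&= E(\beta_{1i})\,E(X_i^2) - E(\beta_{1i})\,[E(X_i)]^2 \\
&= E(\beta_{1i})\,Var(X_i).
\end{aligned}
```

Dividing by $Var(X_i)$ gives $E(\beta_{1i})$.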
IV Regression with Heterogeneous Causal Effects
Specifically, suppose that 𝑋𝑖 is related to 𝑍𝑖 by the linear model
𝑋𝑖 = 𝜋0𝑖 + 𝜋1𝑖𝑍𝑖 + 𝑣𝑖
where the coefficients 𝜋0𝑖 and 𝜋1𝑖 vary from one individual to the
next. This is the first-stage equation of TSLS, modified to allow the
effect of 𝑍 on 𝑋 to be heterogeneous.
Then the TSLS estimator becomes
$$\hat{\beta}_{2SLS} = \frac{s_{ZY}}{s_{ZX}} \xrightarrow{p} \frac{\sigma_{ZY}}{\sigma_{ZX}} = \frac{E(\beta_{1i}\pi_{1i})}{E(\pi_{1i})}$$
Exercise: prove it yourself (see Appendix 13.2).
The TSLS estimator converges in probability to the ratio of the
expected value of the product of $\beta_{1i}$ and $\pi_{1i}$ to the expected value of
$\pi_{1i}$.
The probability limit $E(\beta_{1i}\pi_{1i})/E(\pi_{1i})$ is a weighted average of the
individual causal effects $\beta_{1i}$. The weights are $\pi_{1i}/E(\pi_{1i})$, which
measure the relative degree to which the instrument influences whether
the $i$th individual receives treatment.
In other words, the TSLS estimator is a consistent estimator of a weighted
average of the individual causal effects, where the individuals who
receive the most weight are those for whom the instrument is most
influential.
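The weighted-average result can be checked numerically. The sketch below simulates individual effects $\beta_{1i}$ that are positively correlated with the first-stage effects $\pi_{1i}$; the distributions and constants are assumptions for illustration. It compares the TSLS ratio $s_{ZY}/s_{ZX}$ with the sample analogues of $E(\beta_{1i}\pi_{1i})/E(\pi_{1i})$ and $E(\beta_{1i})$.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 400_000

z = rng.integers(0, 2, size=n).astype(float)       # randomly assigned instrument
pi1 = rng.uniform(0.0, 1.0, size=n)                # individual first-stage effects pi_1i
beta1 = 1.0 + 2.0 * pi1 + rng.normal(0.0, 0.5, size=n)  # beta_1i correlated with pi_1i
v = rng.normal(size=n)
u = 0.8 * v + rng.normal(size=n)                   # u correlated with v, so X is endogenous

x = 0.5 + pi1 * z + v                              # first stage with heterogeneous pi_1i
y = 1.0 + beta1 * x + u                            # outcome with heterogeneous beta_1i

tsls = np.cov(z, y)[0, 1] / np.cov(z, x)[0, 1]     # s_ZY / s_ZX
weighted = np.mean(beta1 * pi1) / np.mean(pi1)     # sample analogue of E(beta*pi)/E(pi)
ate = beta1.mean()                                 # sample analogue of E(beta_1i)

print("TSLS estimate:            ", tsls)
print("Weighted average of beta: ", weighted)
print("Average causal effect:    ", ate)
```

Under these assumptions the population values are $E(\beta_{1i}) = 2$ and $E(\beta_{1i}\pi_{1i})/E(\pi_{1i}) = 7/3$, so the TSLS estimate should track the weighted average and sit above the average causal effect.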
Three special cases:
The treatment effect is the same for all individuals.
𝛽1𝑖 = 𝛽1
The instrument affects each individual equally.
𝜋1𝑖 = 𝜋1
The heterogeneity in the treatment effect and heterogeneity in the
effect of the instrument are uncorrelated.
𝐶𝑜𝑣(𝛽1𝑖, 𝜋1𝑖) = 0
LATE equals the ATE: in all three cases we have
$$\frac{E(\beta_{1i}\pi_{1i})}{E(\pi_{1i})} = E(\beta_{1i})$$
which reduces to $\beta_1$ in the first case.
Aside from these three special cases, in general the local average
treatment effect differs from the average treatment effect.
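One way to see why the three special cases work is to expand the expectation of the product. This is a sketch using only the definition of covariance:

```latex
\begin{aligned}
E(\beta_{1i}\pi_{1i}) &= E(\beta_{1i})\,E(\pi_{1i}) + Cov(\beta_{1i}, \pi_{1i}), \\
\frac{E(\beta_{1i}\pi_{1i})}{E(\pi_{1i})} &= E(\beta_{1i}) + \frac{Cov(\beta_{1i}, \pi_{1i})}{E(\pi_{1i})}.
\end{aligned}
```

The covariance term is zero when $\beta_{1i}$ is constant, when $\pi_{1i}$ is constant, or when the two are uncorrelated. Otherwise LATE and ATE differ by $Cov(\beta_{1i}, \pi_{1i})/E(\pi_{1i})$.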
IV Regression with Heterogeneous Causal Effects:
Implications
Different instruments can identify different parameters because they
estimate the impact on different populations.
The difference arises because each instrument implicitly estimates a
different weighted average of the individual causal effects in the
population.
Recall: the J-test of overidentifying restrictions can reject if the two
instruments estimate different local average treatment effects, even if
both instruments are valid. In general, neither estimator is a consistent
estimator of the average causal effect.
In Summary
The IV paradigm provides a powerful and flexible framework for
causal inference.
An alternative to random assignment with a strong claim on internal
validity.
The LATE framework highlights questions of external validity
Can one instrument identify the average effect induced by another
source of variation?
Can we go from average effects on compliers to average effects on the
entire treated population or an unconditional effect?
The answer to these questions is usually: NO, at least without
additional assumptions.
Section 7
Some Practical Guides by Angrist and Pischke(2012)
Practical Guides
1 Check IV relevance
Always report the first stage and think about whether it makes
sense (signs and magnitudes).
Always report the F-statistic on the excluded instruments. The
bigger, the better. Don't forget the rule of thumb (𝐹 > 10).
2 Check exclusion restriction
The exclusion restriction cannot be tested directly, but it can be
falsified.
Run and examine the reduced form (regression of the dependent variable on
the instruments) and look at the coefficients, t-statistics and F-statistics
for the excluded instruments.
Because the reduced form is proportional to the causal effect of interest
and is unbiased (it is estimated by OLS), we should see the causal relation
in the reduced form. If you can't see the causal relation in the reduced
form, it's probably not there.
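The two checks above can be sketched on simulated data. The helper names `ols` and `first_stage_F`, the instrument names and all coefficients are hypothetical. The F-statistic is the homoskedasticity-only version, which is an assumption of this sketch, and applied work often reports a robust variant. In the just-identified case, the reduced-form coefficient divided by the first-stage coefficient reproduces the IV estimate.

```python
import numpy as np

def ols(y, X):
    """Return (coefficients, sum of squared residuals) from OLS of y on X."""
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ coef
    return coef, float(resid @ resid)

def first_stage_F(x, W, Z):
    """Homoskedasticity-only F-statistic on the excluded instruments Z."""
    n = len(x)
    restricted = np.column_stack([np.ones(n), W])      # (1, W)
    unrestricted = np.column_stack([restricted, Z])    # (1, W, Z)
    _, ssr_r = ols(x, restricted)
    _, ssr_u = ols(x, unrestricted)
    m = unrestricted.shape[1] - restricted.shape[1]    # number of excluded instruments
    return ((ssr_r - ssr_u) / m) / (ssr_u / (n - unrestricted.shape[1]))

# Simulated data (illustrative assumptions)
rng = np.random.default_rng(1)
n = 5_000
W = rng.normal(size=n)
z_strong = rng.normal(size=n)                          # strongly related to x
z_weak = rng.normal(size=n)                            # barely related to x
u = rng.normal(size=n)
x = 0.5 * z_strong + 0.01 * z_weak + 0.3 * W + 0.7 * u + rng.normal(size=n)
y = 1.0 + 2.0 * x + 0.5 * W + u                        # true effect of x is 2.0

F_strong = first_stage_F(x, W, z_strong)
F_weak = first_stage_F(x, W, z_weak)

# First stage and reduced form share the same right-hand side (1, W, z_strong)
rhs = np.column_stack([np.ones(n), W, z_strong])
pi_hat, _ = ols(x, rhs)                                # first stage
delta_hat, _ = ols(y, rhs)                             # reduced form
iv_ratio = delta_hat[2] / pi_hat[2]                    # reduced form / first stage

print("First-stage F, strong instrument:", F_strong)
print("First-stage F, weak instrument:  ", F_weak)
print("Reduced form / first stage:      ", iv_ratio)
```

The strong instrument should clear the F > 10 rule of thumb by a wide margin, and the ratio of the reduced-form to the first-stage coefficient should be close to the true effect.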
3 Provide a substantive explanation for the observed difference between
2SLS and OLS
How big is the difference? What does this tell you?
Is the coefficient bigger when the theory of endogeneity suggests it should
be smaller? If so, why?
Measurement error or heterogeneous effects?
4 If you have multiple instruments, report over-identification tests.
Pick your best single instrument and report just-identified estimates
using this one only, because just-identified IV is relatively unlikely to be
subject to weak-instrument bias.
Worry if it is substantially different from what you get using multiple
instruments.
Check over-identified 2SLS estimates with LIML. LIML is less
precise than 2SLS but also less biased. If the results come out similar,
be happy. If not, worry, and try to find stronger instruments.
How to evaluate an IV paper in a simple way?
1 Relevance: The first stage regression
Does the author report the first stage regression?
Does the instrument perform well in the first stage?
Testable: rule of thumb: first stage 𝐹 > 10
2 Exclusion restriction:
Is the instrument exogenous enough? (Random assignment is the
best.)
Would you expect a direct effect of Z on Y?
Not directly testable, except when the equation is overidentified.
3 What LATE is being estimated?
Whose behavior is affected by the instrument?
Is this the LATE you would want? Is it a quantity of theoretical
interest?
Would other LATEs possibly yield different estimates?
Section 8
A good example: Long Live Keju
Chen, Kung and MA(2017)
Title: Long Live Keju! The Persistent Effects of China’s Imperial
Examination System.
Topic: Long term persistence of human capital:the effect of Keju
Dependent Variable: education level in 2010
Independent Variable: the density of jinshi in the Ming-Qing dynasties
Data: 272 prefectures in jinshi.
Chen, Kung and MA(2018)
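The effect of Keju on human capital at present
Run regression
$$\ln Y_i = \beta \ln(Keju_i) + \gamma_1 X_i^c + \gamma_2 X_i^h + \alpha_p + u_i$$
$Y_i$: average years of schooling in prefecture i in 2010.
$Keju_i$: number of jinshi from prefecture i in the Ming-Qing period.
$X_i^c$: contemporary controls, including economic prosperity (nighttime
lights) and geographic factors: distance from the prefecture to the coast
and terrain (protection from natural disasters).
$X_i^h$: historical controls:
historical economic prosperity
basic educational infrastructure
social and political influence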
Chen, Kung and MA(2018): OLS
Chen, Kung and MA(2018): Potential Bias
OVB: unobserved factors that are simultaneously associated with both
historical jinshi density and years of schooling today.
For instance, prefectures that had produced more jinshi may be
associated with unobserved (natural or genetic) endowments.
Chen, Kung and MA(2018): Instrumental Variable
IV: Distance to the Printing Ingredients (Pine and Bamboo) as the
instrumental variable for Keju
A logic chain:
more success in Keju ⟺ more reference books
⟺ reference books printed in print centers
⟺ print centers located nearby some in
Chen, Kung and MA(2018): First Stage
Chen, Kung and MA(2018): Reduced-form and 2SLS
Section 9
Where Do Valid Instruments Come From?
Where do we find an IV?
Generally speaking
"It can be found by chance, but not by searching."
Two main approaches
1 Economic Theory/Logic
2 Exogenous Source of Variation in X (natural experiments)
Example 1: Does putting criminals in jail reduce crime?
Run a regression of crime rates (dependent variable) on incarceration rates
(independent variable) and covariates (economic conditions), using annual
data at a suitable level of jurisdiction (states).
Simultaneous causality bias: when crime rates go up, there are more
prisoners, and more prisoners reduce crime.
IV: it must affect the incarceration rate but be unrelated to any of the
unobserved factors that determine the crime rate.
Levitt (1996) suggested that lawsuits aimed at reducing prison
overcrowding could serve as an instrumental variable.
Result: The estimated effect was three times larger than the effect
estimated using OLS.
Where do we find an IV?: Class Size and Test Score
Example 2: Does cutting class sizes increase test scores?
Omitted variable bias: for example, parental interest in learning, learning
opportunities outside the classroom, quality of the teachers and
school facilities.
IV: correlated with class size (relevance) but uncorrelated with the
omitted determinants of test performance.
Hoxby (2000) suggested biology. Because of random fluctuations in
timings of births, the size of the incoming kindergarten class varies
from one year to the next.
But potential enrollment also fluctuates because parents with young
children choose to move into an improving school district and out of
one in trouble. She used the deviation of potential enrollment from
its long-term trend as her instrument.
Result: the effect on test scores of class size is small.
1 Institutional Background
Angrist (1990), draft lottery: Vietnam-era draft eligibility was randomly
assigned by birth date, which is used to estimate the wage impact of a
shorter work experience.
Acemoglu, Johnson, and Robinson (2001): the death rate from some
diseases in some areas is used to estimate the impact of institutions on
economic growth.
Feng et al.(2012) “The Returns to Education in China: Evidence from
the 1986 Compulsory Education Law”.
Li and Zhang(2007),Liu(2012)- “One Child policy”
2 Natural conditions(geography,weather,disaster)
Rainfall, hurricanes, earthquakes, tsunamis…
The number of rivers: Hoxby (2000)
Ying Bai and Ruixue Jia (2014): "keju" and "the number of small
rivers"
3 Economic theory and Economic logic
To study the relationship between alcohol consumption and income, the
alcohol price in a local market may serve as an instrument for alcohol
consumption.
Angrist & Evans (1998): whether a family's children are of the same sex or
of different sexes is used to estimate the impact of an additional birth on
women's labor supply.
Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 99 / 99

More Related Content

PDF
research on journaling
PDF
Point symmetries of lagrangians
PDF
RESIDUALS AND INFLUENCE IN NONLINEAR REGRESSION FOR REPEATED MEASUREMENT DATA
PDF
Asymptotic properties of bayes factor in one way repeated measurements model
PDF
Asymptotic properties of bayes factor in one way repeated measurements model
PDF
Time Delay and Mean Square Stochastic Differential Equations in Impetuous Sta...
PDF
Numerical Solution of the Nonlocal Singularly Perturbed Problem
PDF
Econometrics 1 Slide from the masters degree 1
research on journaling
Point symmetries of lagrangians
RESIDUALS AND INFLUENCE IN NONLINEAR REGRESSION FOR REPEATED MEASUREMENT DATA
Asymptotic properties of bayes factor in one way repeated measurements model
Asymptotic properties of bayes factor in one way repeated measurements model
Time Delay and Mean Square Stochastic Differential Equations in Impetuous Sta...
Numerical Solution of the Nonlocal Singularly Perturbed Problem
Econometrics 1 Slide from the masters degree 1

Similar to Gradient Metrics for Artificial _2020_Lec4.pdf (20)

PDF
On The Distribution of Non - Zero Zeros of Generalized Mittag – Leffler Funct...
PDF
A System of Estimators of the Population Mean under Two-Phase Sampling in Pre...
PPTX
Basic Concepts of Standard Experimental Designs ( Statistics )
PDF
A Theoretical Framework for Understanding Mutation-Based Testing Methods
PDF
RuleML2015: Input-Output STIT Logic for Normative Systems
PDF
A Two Stage Estimator of Instrumental Variable Quantile Regression for Panel ...
PDF
Exponential decay for the solution of the nonlinear equation induced by the m...
PDF
Project 7
PPSX
Stability analysis of impulsive fractional differential systems with delay
PPTX
Changing the subject of a formula (grouping like terms and factorizing)
PDF
A Moment Inequality for Overall Decreasing Life Class of Life Distributions w...
PDF
AJMS_477_23.pdf
PDF
AJMS_480_23.pdf
PDF
A note on estimation of population mean in sample survey using auxiliary info...
PDF
Vibration analysis and response characteristics of a half car model subjected...
PPT
Finite Element Analysis - UNIT-1
PDF
QTML2021 UAP Quantum Feature Map
PDF
Wigner-Ville Distribution: In Perspective of Fault Diagnosis
On The Distribution of Non - Zero Zeros of Generalized Mittag – Leffler Funct...
A System of Estimators of the Population Mean under Two-Phase Sampling in Pre...
Basic Concepts of Standard Experimental Designs ( Statistics )
A Theoretical Framework for Understanding Mutation-Based Testing Methods
RuleML2015: Input-Output STIT Logic for Normative Systems
A Two Stage Estimator of Instrumental Variable Quantile Regression for Panel ...
Exponential decay for the solution of the nonlinear equation induced by the m...
Project 7
Stability analysis of impulsive fractional differential systems with delay
Changing the subject of a formula (grouping like terms and factorizing)
A Moment Inequality for Overall Decreasing Life Class of Life Distributions w...
AJMS_477_23.pdf
AJMS_480_23.pdf
A note on estimation of population mean in sample survey using auxiliary info...
Vibration analysis and response characteristics of a half car model subjected...
Finite Element Analysis - UNIT-1
QTML2021 UAP Quantum Feature Map
Wigner-Ville Distribution: In Perspective of Fault Diagnosis
Ad

More from Akkal Bahadur Bist (7)

PPTX
Artificial Neural Network8_Practical (1).pptx
ODP
Circos plot
PPTX
Data visualization
PPT
Gopher Protocol
ODP
Data Warehouse Introduction
PDF
Data warehouse
Artificial Neural Network8_Practical (1).pptx
Circos plot
Data visualization
Gopher Protocol
Data Warehouse Introduction
Data warehouse
Ad

Recently uploaded (20)

DOCX
Cambridge-Practice-Tests-for-IELTS-12.docx
PDF
medical_surgical_nursing_10th_edition_ignatavicius_TEST_BANK_pdf.pdf
PPTX
B.Sc. DS Unit 2 Software Engineering.pptx
PPTX
Virtual and Augmented Reality in Current Scenario
PDF
LIFE & LIVING TRILOGY- PART (1) WHO ARE WE.pdf
PDF
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
PDF
CISA (Certified Information Systems Auditor) Domain-Wise Summary.pdf
PDF
Paper A Mock Exam 9_ Attempt review.pdf.
PDF
LEARNERS WITH ADDITIONAL NEEDS ProfEd Topic
PDF
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
PPTX
Unit 4 Computer Architecture Multicore Processor.pptx
PDF
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 2).pdf
PDF
LIFE & LIVING TRILOGY - PART (3) REALITY & MYSTERY.pdf
PDF
Skin Care and Cosmetic Ingredients Dictionary ( PDFDrive ).pdf
PDF
BP 505 T. PHARMACEUTICAL JURISPRUDENCE (UNIT 1).pdf
PDF
English Textual Question & Ans (12th Class).pdf
PPTX
A powerpoint presentation on the Revised K-10 Science Shaping Paper
PDF
Hazard Identification & Risk Assessment .pdf
PDF
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
PDF
Uderstanding digital marketing and marketing stratergie for engaging the digi...
Cambridge-Practice-Tests-for-IELTS-12.docx
medical_surgical_nursing_10th_edition_ignatavicius_TEST_BANK_pdf.pdf
B.Sc. DS Unit 2 Software Engineering.pptx
Virtual and Augmented Reality in Current Scenario
LIFE & LIVING TRILOGY- PART (1) WHO ARE WE.pdf
MBA _Common_ 2nd year Syllabus _2021-22_.pdf
CISA (Certified Information Systems Auditor) Domain-Wise Summary.pdf
Paper A Mock Exam 9_ Attempt review.pdf.
LEARNERS WITH ADDITIONAL NEEDS ProfEd Topic
ChatGPT for Dummies - Pam Baker Ccesa007.pdf
Unit 4 Computer Architecture Multicore Processor.pptx
BP 704 T. NOVEL DRUG DELIVERY SYSTEMS (UNIT 2).pdf
LIFE & LIVING TRILOGY - PART (3) REALITY & MYSTERY.pdf
Skin Care and Cosmetic Ingredients Dictionary ( PDFDrive ).pdf
BP 505 T. PHARMACEUTICAL JURISPRUDENCE (UNIT 1).pdf
English Textual Question & Ans (12th Class).pdf
A powerpoint presentation on the Revised K-10 Science Shaping Paper
Hazard Identification & Risk Assessment .pdf
1.3 FINAL REVISED K-10 PE and Health CG 2023 Grades 4-10 (1).pdf
Uderstanding digital marketing and marketing stratergie for engaging the digi...

Gradient Metrics for Artificial _2020_Lec4.pdf

  • 1. Lecture 5: Instrumental Variables Applied Micro-Econometrics,Fall 2020 Zhaopeng Qu Nanjing University 10/29/2020 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 1 / 99
  • 2. 1 Review Previous Lecture of Internal Validity 2 Instrumental Variable Method 3 Checking Instrument Validity 4 Instrumental Variable for multiple regression 5 Review the last lecture 6 IV with Heterogeneous Causal Effects 7 Some Practical Guides by Angrist and Pischke(2012) 8 An good example: Long live Keju Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 2 / 99
  • 3. Review Previous Lecture of Internal Validity Section 1 Review Previous Lecture of Internal Validity Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 3 / 99
  • 4. Review Previous Lecture of Internal Validity Threatens to Internal Validity Three endogenous in OLS regression are: Omitted Variable Bias(a variable that is correlated with X but is unobserved) Simultaneity or reverse causality Bias (X causes Y,Y causes X) Errors-in-Variables Bias (X is measured with error) One easy way to deal with these endogeneity is using Instrumental Variable method. Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 4 / 99
  • 5. Instrumental Variable Method Section 2 Instrumental Variable Method Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 5 / 99
  • 6. Instrumental Variable Method Introduction The earliest application involved attempts to estimate demand and supply curve for product. A simple but difficult question: How to find the supply or demand curves? Difficulty: We can only observe intersections of supply and demand, yielding pairs. Solution: Wright(1928) use variables that appear in one equation to shift this equation and trace out the other. The variables that do the shifting came to be known as Instrumental Variables method. It is well-known that IV can address the problems of omitted variable bias, measurement error and reverse causality problems. Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 6 / 99
  • 7. Instrumental Variable Method Terminology: endogeneity and exogeneity An endogenous variable is one that both we are interested in and is correlated with u. An exogenous variable is one that is uncorrelated with u. Historical note: “Endogenous” literally means “determined within the system,” that is, a variable that is jointly determined with Y, that is, a variable subject to simultaneous causality. However, this definition is narrow and IV regression can be used to address OVB and errors-in-variable bias, not just to simultaneous causality bias. Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 7 / 99
  • 8. Instrumental Variable Method Instrumental variables: 1 endogenous regressor & 1 instrument suppose a simple OLS regression like previous equation 𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖 Because 𝐸[𝑢𝑖|𝑋𝑖] ≠ 0, then we can use an instrumental variable(𝑍𝑖) to obtain an consistent estimate of coefficient. Intuitively, we want to split 𝑋𝑖 into two parts: 1 part that is correlated with the error term. 2 part that is uncorrelated with the error term. If we can isolate the variation in 𝑋𝑖 that is uncorrelated with 𝑢𝑖,then we can use this part to obtain a consistent estimate of the causal effect of 𝑋𝑖 on 𝑌𝑖. Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 8 / 99
  • 9. Instrumental Variable Method Instrumental variables: 1 endogenous regressor & 1 instrument An instrumental variable 𝑍𝑖 must satisfy the following 2 properties: 1 Instrumental relevance: 𝑍𝑖 should be correlated with the casual variable of interest, 𝑋𝑖 (endogenous variable),thus 𝐶𝑜𝑣(𝑋𝑖, 𝑍𝑖) ≠ 0 . 2 Instumental exogeneity: 𝑍𝑖 is as good as randomly assigned and 𝑍𝑖 only affect on 𝑌𝑖 through 𝑋𝑖 affecting 𝑌𝑖 channel. 𝐶𝑜𝑣(𝑍𝑖, 𝑢𝑖) = 0 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 9 / 99
  • 10. Instrumental Variable Method IV estimator:Jargon Our simple OLS regression: Causal relationship of interest 𝑌𝑖 = 𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖 First-Stage regression: regress endogenous variable on IV 𝑋𝑖 = 𝜋0 + 𝜋1𝑍𝑖 + 𝑣𝑖 Reduced-Form: regress outcome variable on IV 𝑌𝑖 = 𝛿0 + 𝛿1𝑍𝑖 + 𝑒𝑖 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 10 / 99
  • 11. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) We can estimate the causal effect of 𝑋𝑖 on 𝑌𝑖 in two steps 1 First stage: Regress 𝑋𝑖 on 𝑍𝑖 & obtain predicted values of ̂ 𝑋𝑖,if 𝐶𝑜𝑣(𝑍𝑖, 𝑢𝑖) = 0, then ̂ 𝑋𝑖 contains variation in 𝑋𝑖 that is uncorrelated with 𝑢𝑖 ̂ 𝑋𝑖 = ̂ 𝜋0 + ̂ 𝜋1𝑍𝑖 . 2 Second stage: Regress 𝑌𝑖 on ̂ 𝑋𝑖 to obtain the Two Stage Least Squares estimator ̂ 𝛽2𝑆𝐿𝑆 ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )( ̂ 𝑋𝑖 − ̂ 𝑋) ∑( ̂ 𝑋𝑖 − ̂ 𝑋)2 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 11 / 99
  • 12. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) we substitute ̂ 𝑋𝑖 − ̂ 𝑋 = ̂ 𝜋1(𝑍𝑖 − ̄ 𝑍) then we could obtain ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )( ̂ 𝑋𝑖 − ̂ 𝑋) ∑( ̂ 𝑋𝑖 − ̂ 𝑋)2 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 12 / 99
  • 13. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) we substitute ̂ 𝑋𝑖 − ̂ 𝑋 = ̂ 𝜋1(𝑍𝑖 − ̄ 𝑍) then we could obtain ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )( ̂ 𝑋𝑖 − ̂ 𝑋) ∑( ̂ 𝑋𝑖 − ̂ 𝑋)2 = ∑(𝑌𝑖 − ̄ 𝑌 ) ̂ 𝜋1(𝑍𝑖 − 𝑍) ∑ ̂ 𝜋2 1(𝑍𝑖 − 𝑍)2 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 12 / 99
  • 14. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) we substitute ̂ 𝑋𝑖 − ̂ 𝑋 = ̂ 𝜋1(𝑍𝑖 − ̄ 𝑍) then we could obtain ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )( ̂ 𝑋𝑖 − ̂ 𝑋) ∑( ̂ 𝑋𝑖 − ̂ 𝑋)2 = ∑(𝑌𝑖 − ̄ 𝑌 ) ̂ 𝜋1(𝑍𝑖 − 𝑍) ∑ ̂ 𝜋2 1(𝑍𝑖 − 𝑍)2 = 1 ̂ 𝜋1 ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − 𝑍) ∑(𝑍𝑖 − 𝑍)2 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 12 / 99
  • 15. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) we substitute ̂ 𝑋𝑖 − ̂ 𝑋 = ̂ 𝜋1(𝑍𝑖 − ̄ 𝑍) then we could obtain ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )( ̂ 𝑋𝑖 − ̂ 𝑋) ∑( ̂ 𝑋𝑖 − ̂ 𝑋)2 = ∑(𝑌𝑖 − ̄ 𝑌 ) ̂ 𝜋1(𝑍𝑖 − 𝑍) ∑ ̂ 𝜋2 1(𝑍𝑖 − 𝑍)2 = 1 ̂ 𝜋1 ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − 𝑍) ∑(𝑍𝑖 − 𝑍)2 = ∑(𝑍𝑖 − 𝑍)2 ∑(𝑋𝑖 − 𝑋)(𝑍𝑖 − 𝑍) × ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − 𝑍) ∑(𝑍𝑖 − 𝑍)2 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 12 / 99
  • 16. Instrumental Variable Method IV estimator:Two Steps Least Square (2SLS) Which gives the instrumental variable estimator ̂ 𝛽2𝑆𝐿𝑆 = ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) = 𝑠𝑍𝑌 𝑠𝑍𝑋 The TSLS estimator of 𝛽1 is the ratio of the sample covariance between 𝑍 and 𝑌 to the sample covariance between 𝑍 and 𝑋. If 𝑍𝑖 = 𝑋𝑖, then ̂ 𝛽2𝑆𝐿𝑆 = ̂ 𝛽𝑜𝑙𝑠 Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 13 / 99
  • 17. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Consider 𝐸[ ̂ 𝛽𝐼𝑉 ] 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] = 𝐸[ ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 14 / 99
  • 18. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Consider 𝐸[ ̂ 𝛽𝐼𝑉 ] 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] = 𝐸[ ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑[(𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖) − (𝛽0 + 𝛽1 ̄ 𝑋 + ̄ 𝑢)](𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 14 / 99
  • 19. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Consider 𝐸[ ̂ 𝛽𝐼𝑉 ] 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] = 𝐸[ ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑[(𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖) − (𝛽0 + 𝛽1 ̄ 𝑋 + ̄ 𝑢)](𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑ 𝛽1(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) + ∑(𝑢𝑖 − ̄ 𝑢)(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 14 / 99
  • 20. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Consider 𝐸[ ̂ 𝛽𝐼𝑉 ] 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] = 𝐸[ ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑[(𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖) − (𝛽0 + 𝛽1 ̄ 𝑋 + ̄ 𝑢)](𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑ 𝛽1(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) + ∑(𝑢𝑖 − ̄ 𝑢)(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝛽1 + 𝐸[ ∑(𝑢𝑖 − ̄ 𝑢)(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 14 / 99
  • 21. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Consider 𝐸[ ̂ 𝛽𝐼𝑉 ] 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] = 𝐸[ ∑(𝑌𝑖 − ̄ 𝑌 )(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑[(𝛽0 + 𝛽1𝑋𝑖 + 𝑢𝑖) − (𝛽0 + 𝛽1 ̄ 𝑋 + ̄ 𝑢)](𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑ 𝛽1(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) + ∑(𝑢𝑖 − ̄ 𝑢)(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝛽1 + 𝐸[ ∑(𝑢𝑖 − ̄ 𝑢)(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝛽1 + 𝐸[ ∑ 𝑢𝑖(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 14 / 99
  • 22. Instrumental Variable Method Statistical propertise of 2SLS estimator: Unbiasedness Because instrument exogeneity implies 𝐶𝑜𝑣(𝑍𝑖, 𝑢𝑖) = 0,but not 𝐸[𝑢𝑖|𝑍𝑖, 𝑋𝑖] = 0,then 𝐸[ ∑ 𝑢𝑖(𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] = 𝐸[ ∑ 𝐸[𝑢𝑖|𝑋𝑖, 𝑍𝑖](𝑍𝑖 − ̄ 𝑍) ∑(𝑋𝑖 − ̄ 𝑋)(𝑍𝑖 − ̄ 𝑍) ] ≠ 0 Then we have 𝐸[ ̂ 𝛽2𝑆𝐿𝑆] ≠ 𝛽1 It means that 2SLS estimator is biased. Zhaopeng Qu (Nanjing University) Lecture 5: Instrumental Variables 10/29/2020 15 / 99
  • 23. Instrumental Variable Method Statistical properties of the 2SLS estimator: Consistency
    Start from the simple regression $Y_i = \beta_0 + \beta_1 X_i + u_i$ and take the covariance of $Y_i$ with $Z_i$:
    $$
    \begin{aligned}
    Cov(Z_i, Y_i) &= Cov[Z_i, (\beta_0 + \beta_1 X_i + u_i)] \\
    &= Cov(Z_i, \beta_0) + \beta_1 Cov(Z_i, X_i) + Cov(Z_i, u_i) \\
    &= \beta_1 Cov(Z_i, X_i)
    \end{aligned}
    $$
    Thus, if the instrument is valid,
    $$
    \beta_1 = \frac{Cov(Z_i, Y_i)}{Cov(Z_i, X_i)}
    $$
    The population coefficient is the ratio of the population covariance between $Z$ and $Y$ to the population covariance between $Z$ and $X$.
  • 24. Instrumental Variable Method Statistical properties of the 2SLS estimator: Consistency
    As discussed in Section 3.7, the sample covariance is a consistent estimator of the population covariance, so
    $$
    s_{ZY} \xrightarrow{p} Cov(Z_i, Y_i) \quad \text{and} \quad s_{ZX} \xrightarrow{p} Cov(Z_i, X_i)
    $$
    The TSLS estimator is therefore consistent:
    $$
    \hat{\beta}_{2SLS} = \frac{s_{ZY}}{s_{ZX}} \xrightarrow{p} \frac{Cov(Z_i, Y_i)}{Cov(Z_i, X_i)} = \beta_1
    $$
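The consistency result can be checked numerically. The following is a minimal sketch on simulated data using only the Python standard library; the data-generating process, the coefficient values (true $\beta_1 = 2$) and the function names are illustrative assumptions, not part of the lecture.

```python
import random

def cov(a, b):
    """Sample covariance (1/(n-1) convention)."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (n - 1)

def simulate(n, rng, beta0=1.0, beta1=2.0):
    """Hypothetical DGP: X is endogenous (it shares u with Y); Z is a valid instrument."""
    ys, xs, zs = [], [], []
    for _ in range(n):
        z = rng.gauss(0, 1)                      # relevant and exogenous
        u = rng.gauss(0, 1)                      # structural error
        x = 0.8 * z + 0.6 * u + rng.gauss(0, 1)  # Cov(X, u) != 0
        xs.append(x)
        zs.append(z)
        ys.append(beta0 + beta1 * x + u)
    return ys, xs, zs

rng = random.Random(0)
ys, xs, zs = simulate(50_000, rng)

beta_iv = cov(zs, ys) / cov(zs, xs)    # s_ZY / s_ZX
beta_ols = cov(xs, ys) / cov(xs, xs)   # s_XY / s_X^2

print(f"IV  estimate: {beta_iv:.3f}")
print(f"OLS estimate: {beta_ols:.3f}")
```

In this assumed design the OLS probability limit is $2 + 0.6/2 = 2.3$, so OLS stays away from the true value however large the sample, while the IV estimate settles near 2.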
  • 29. Instrumental Variable Method Statistical properties of 2SLS: sampling distribution
    Similar to the expression for the OLS estimator in Equation (4.30), page 183 in S.W.:
    $$
    \begin{aligned}
    \hat{\beta}_{2SLS} &= \frac{\sum (Y_i - \bar{Y})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
    &= \frac{\sum [(\beta_0 + \beta_1 X_i + u_i) - (\beta_0 + \beta_1 \bar{X} + \bar{u})](Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
    &= \frac{\sum \beta_1 (X_i - \bar{X})(Z_i - \bar{Z}) + \sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
    &= \beta_1 + \frac{\sum (u_i - \bar{u})(Z_i - \bar{Z})}{\sum (X_i - \bar{X})(Z_i - \bar{Z})} \\
    &= \beta_1 + \frac{\frac{1}{n} \sum u_i (Z_i - \bar{Z})}{\frac{1}{n} \sum (X_i - \bar{X})(Z_i - \bar{Z})}
    \end{aligned}
    $$
  • 30. Instrumental Variable Method Statistical properties of 2SLS: sampling distribution
    In large samples $\bar{Z} \cong \mu_Z$. Let $q_i = (Z_i - \mu_Z) u_i$; then the numerator is
    $$
    \frac{1}{n} \sum u_i (Z_i - \bar{Z}) \cong \frac{1}{n} \sum q_i = \bar{q}
    $$
    Because $Cov(Z_i, u_i) = 0$ and $E(u_i) = 0$,
    $$
    Cov(Z_i - \mu_Z, u_i) = E[(Z_i - \mu_Z) u_i] = E(q_i) = 0
    $$
  • 31. Instrumental Variable Method Statistical properties of 2SLS: sampling distribution
    In addition, the variance of $q_i$ is $\sigma_q^2 = Var[(Z_i - \mu_Z) u_i]$, so
    $$
    Var(\bar{q}) = \sigma_{\bar{q}}^2 = \frac{\sigma_q^2}{n} = \frac{1}{n} Var[(Z_i - \mu_Z) u_i]
    $$
    By the central limit theorem (C.L.T.), in large samples the standardized mean (divided by its standard deviation, not its variance) is approximately standard normal:
    $$
    \frac{\bar{q}}{\sigma_{\bar{q}}} \xrightarrow{d} N(0, 1)
    $$
  • 32. Instrumental Variable Method Statistical properties of 2SLS: sampling distribution
    Because the sample covariance is consistent for the population covariance, $s_{ZX} \xrightarrow{p} Cov(Z_i, X_i)$, and we obtain
    $$
    \hat{\beta}_{2SLS} \cong \beta_1 + \frac{\bar{q}}{Cov(Z_i, X_i)}
    $$
    In addition, because $\bar{q}$ is approximately distributed $N(0, \sigma_{\bar{q}}^2)$ in large samples,
    $$
    \frac{\bar{q}}{Cov(Z_i, X_i)} \ \text{is approximately distributed}\ N\left(0, \frac{\sigma_{\bar{q}}^2}{[Cov(Z_i, X_i)]^2}\right)
    $$
  • 33. Instrumental Variable Method Statistical properties of 2SLS: sampling distribution
    Finally, in large samples $\hat{\beta}_{2SLS}$ is approximately distributed $N(\beta_1, \sigma_{\hat{\beta}_{2SLS}}^2)$, where
    $$
    \sigma_{\hat{\beta}_{2SLS}}^2 = \frac{\sigma_{\bar{q}}^2}{[Cov(Z_i, X_i)]^2} = \frac{1}{n} \frac{Var[(Z_i - \mu_Z) u_i]}{[Cov(Z_i, X_i)]^2} \quad (12.8)
    $$
  • 34. Instrumental Variable Method Statistical properties of 2SLS: Statistical Inference
    The variance of $\hat{\beta}_{2SLS}$ can be estimated by replacing the variance and covariance terms in Equation (12.8) with their sample counterparts (with $\bar{Z}$ in place of the unknown $\mu_Z$). The standard error of the IV estimator is the square root of this estimate:
    $$
    SE(\hat{\beta}_{2SLS}) = \sqrt{\frac{\frac{1}{n} \sum (Z_i - \bar{Z})^2 \hat{u}_i^2}{n \left(\frac{1}{n} \sum (Z_i - \bar{Z}) X_i\right)^2}}
    $$
    The formula is somewhat complicated. Fortunately, it is computed automatically by the TSLS regression commands in econometric software packages.
    Because $\hat{\beta}_{2SLS}$ is normally distributed in large samples, hypothesis tests about $\beta_1$ can be performed by computing the t-statistic, and a 95% large-sample confidence interval is given by
    $$
    \hat{\beta}_{2SLS} \pm 1.96 \, SE(\hat{\beta}_{2SLS})
    $$
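To make the standard-error formula concrete, here is a minimal sketch for the one-regressor, one-instrument case, using only the Python standard library. The function name `iv_fit` and the simulated data are hypothetical; the residuals are built from the observed $X_i$, and $\bar{Z}$ stands in for $\mu_Z$.

```python
import math
import random

def iv_fit(ys, xs, zs):
    """One regressor, one instrument: returns (beta1_hat, standard_error)."""
    n = len(ys)
    mean_y, mean_x, mean_z = sum(ys) / n, sum(xs) / n, sum(zs) / n
    s_zy = sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys)) / n
    s_zx = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs)) / n
    beta1 = s_zy / s_zx
    beta0 = mean_y - beta1 * mean_x
    # IV residuals are built from the observed X_i.
    resid = [y - beta0 - beta1 * x for y, x in zip(ys, xs)]
    # Sample analogue of Var[(Z_i - mu_Z) u_i]
    var_q = sum((z - mean_z) ** 2 * e ** 2 for z, e in zip(zs, resid)) / n
    se = math.sqrt(var_q / (n * s_zx ** 2))
    return beta1, se

# Hypothetical simulated data with true beta1 = 2
rng = random.Random(42)
zs, xs, ys = [], [], []
for _ in range(20_000):
    z, u = rng.gauss(0, 1), rng.gauss(0, 1)
    x = 0.8 * z + 0.6 * u + rng.gauss(0, 1)
    zs.append(z)
    xs.append(x)
    ys.append(1.0 + 2.0 * x + u)

beta1, se = iv_fit(ys, xs, zs)
ci = (beta1 - 1.96 * se, beta1 + 1.96 * se)
print(f"beta1 = {beta1:.3f}, SE = {se:.4f}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

In applied work one would normally rely on the packaged TSLS routine of an econometrics library rather than this hand-rolled version; the sketch only mirrors the formula term by term.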
  • 35. Instrumental Variable Method Application: Angrist and Krueger (1991)
    Angrist, Joshua D. and Alan B. Krueger. 1991. "Does Compulsory School Attendance Affect Schooling and Earnings?" The Quarterly Journal of Economics 106 (4): pp. 979–1014.
    They use quarter of birth as an instrument for education to estimate the returns to schooling.
  • 36. Instrumental Variable Method Application: Angrist and Krueger (1991)
    Why quarter of birth?
    In most of the U.S., students had to attend school until age 16 (at least during 1938-1967).
    The age at which a child starts school depends on the birthday, so under compulsory schooling laws the grade at which a student can legally drop out also depends on the birthday.
  • 37. Instrumental Variable Method Application: Angrist and Krueger (1991) Is schooling related to quarter of birth? (Assumption 1)
  • 38. Instrumental Variable Method Angrist and Krueger (1991): The First Stage
    Does quarter of birth affect education? Regress education outcomes on quarter-of-birth dummy variables:
    $$
    S_{ijc} = \alpha + \beta_1 Q_{1ic} + \beta_2 Q_{2ic} + \beta_3 Q_{3ic} + \epsilon_{ijc}
    $$
    where $i$ indexes individuals, $c$ indexes cohorts, $S$ is the education outcome and $Q_j$ is the dummy for birth quarter $j$.
    This is the first-stage regression.
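A regression on a full set of quarter dummies simply compares group means: the intercept is the mean outcome in the omitted quarter (Q4) and each coefficient is the gap between that quarter's mean and the Q4 mean. A minimal sketch, assuming no cohort controls and using a tiny made-up data set (the numbers are hypothetical, not taken from the paper):

```python
from collections import defaultdict

def first_stage_by_quarter(schooling, quarter):
    """OLS of schooling on dummies for quarters 1-3 (quarter 4 omitted).

    With a saturated set of dummies, OLS reproduces the group means:
    intercept = mean in Q4, coefficient j = mean in Qj minus mean in Q4.
    """
    groups = defaultdict(list)
    for s, q in zip(schooling, quarter):
        groups[q].append(s)
    means = {q: sum(v) / len(v) for q, v in groups.items()}
    alpha = means[4]
    betas = {q: means[q] - alpha for q in (1, 2, 3)}
    return alpha, betas

# Hypothetical toy data: years of schooling and quarter of birth
schooling = [12, 11, 12, 13, 12, 14, 13, 13]
quarter = [1, 1, 2, 2, 3, 3, 4, 4]

alpha, betas = first_stage_by_quarter(schooling, quarter)
print(alpha, betas)
```

A negative coefficient on an early quarter would indicate that people born in that quarter get less schooling on average than those born in the fourth quarter.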
  • 39. Instrumental Variable Method Angrist and Krueger (1991): The First Stage
    It shows that $Q_j$ does affect education outcomes such as total years of education and high school graduation.
  • 40. Instrumental Variable Method Angrist and Krueger (1991): exogeneity
    Is the effect due to compulsory schooling laws?
    Indirect evidence: look at post-secondary outcomes, which are not expected to be affected by compulsory schooling laws.
  • 41. Instrumental Variable Method Angrist and Krueger (1991): Reduced form. Are earnings related to quarter of birth?
  • 42. Instrumental Variable Method Angrist and Krueger (1991): OLS vs. IV. IV estimates
  • 43. Instrumental Variable Method Angrist and Krueger (1991): OLS vs. IV with covariates
  • 44. Section 3: Checking Instrument Validity
  • 45. Checking Instrument Validity Assumption #1 Instrument Relevance
    The instrumental strategy seems very robust.
    But how should we understand that the IV estimates in Angrist and Krueger (1991) are larger than the OLS estimates?
    Bound et al. (1995) show that when the instruments have limited explanatory power over the endogenous variable:
    1. IV is biased towards OLS in finite samples.
    2. This may happen even in very large samples.
  • 46. Checking Instrument Validity Assumption #1 Instrument Relevance
    Recall 2SLS. The simple regression equation is
    $$
    Y_i = \beta_0 + \beta_1 X_i + u_i
    $$
    Obtain the predicted values from the first stage:
    $$
    \hat{X}_i = \hat{\pi}_0 + \hat{\pi}_1 Z_i
    $$
    Run the second-stage regression:
    $$
    Y_i = \beta_0 + \beta_1 \hat{X}_i + u_i
    $$
    Following the OLS formula, in large samples we obtain
    $$
    \hat{\beta}_1 \xrightarrow{p} \beta_1 + \frac{Cov(\hat{X}, u)}{Var(\hat{X})}
    $$
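  • 47. Checking Instrument Validity Assumption #1 Instrument Relevance
    A 2SLS version of OVB:
    $$
    \begin{aligned}
    \hat{\beta}_{2SLS} \xrightarrow{p}\ & \beta + \frac{Cov(\hat{X}, u)}{Var(\hat{X})} \\
    =\ & \beta + \frac{Cov(\hat{\pi}_0 + \hat{\pi}_1 Z, u)}{Var(\hat{\pi}_0 + \hat{\pi}_1 Z)} \\
    =\ & \beta + \frac{\hat{\pi}_1 Cov(Z, u)}{\hat{\pi}_1^2 Var(Z)} \\
    =\ & \beta + \frac{Var(Z)}{Cov(Z, X)} \cdot \frac{Cov(Z, u)}{Var(Z)} \\
    =\ & \beta + \frac{Cov(Z, u)}{Cov(Z, X)}
    \end{aligned}
    $$
    The fourth line uses the first-stage slope, $\hat{\pi}_1 \xrightarrow{p} Cov(Z, X) / Var(Z)$.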
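The algebra above implies that running the two stages by hand gives exactly the ratio-of-covariances estimator when there is one instrument. A minimal sketch that checks this on simulated data (standard library only; the data-generating process and helper names are hypothetical):

```python
import random

def ols(y, x):
    """Simple OLS of y on a constant and x; returns (intercept, slope)."""
    n = len(y)
    mean_y, mean_x = sum(y) / n, sum(x) / n
    slope = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
             / sum((a - mean_x) ** 2 for a in x))
    return mean_y - slope * mean_x, slope

# Hypothetical data: X is endogenous, Z is a valid instrument, true beta1 = 2
rng = random.Random(1)
zs = [rng.gauss(0, 1) for _ in range(2_000)]
us = [rng.gauss(0, 1) for _ in range(2_000)]
xs = [0.5 * z + 0.7 * u + rng.gauss(0, 1) for z, u in zip(zs, us)]
ys = [1.0 + 2.0 * x + u for x, u in zip(xs, us)]

# Stage 1: regress X on Z and form fitted values
pi0, pi1 = ols(xs, zs)
x_hat = [pi0 + pi1 * z for z in zs]

# Stage 2: regress Y on the fitted values
_, beta_2sls = ols(ys, x_hat)

# Ratio-of-covariances form s_ZY / s_ZX
n = len(ys)
mean_z, mean_y, mean_x = sum(zs) / n, sum(ys) / n, sum(xs) / n
beta_ratio = (sum((z - mean_z) * (y - mean_y) for z, y in zip(zs, ys))
              / sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs)))

print(f"two-stage: {beta_2sls:.6f}   covariance ratio: {beta_ratio:.6f}")
```

The two numbers agree up to floating-point error. Note that the standard errors printed by a manual second-stage OLS would not be the correct TSLS standard errors, which is one reason to use a dedicated TSLS command in practice.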
  • 48. Checking Instrument Validity Weak Instruments
    Assumption 1: Instrument Relevance, $Cov(X_i, Z_i) \neq 0$.
    Intuition: the more of the variation in $X$ is explained by the instruments, the more information is available for use in IV regression.
    Conversely, instruments that explain little of the variation in $X$ are called weak instruments: the correlation between $X$ (the endogenous variable) and $Z$ (the IV) is very weak.
    Because
    $$
    \hat{\beta}_{2SLS} \xrightarrow{p} \beta + \frac{Cov(Z, u)}{Cov(Z, X)}
    $$
    as $Cov(Z, X)$ approaches 0, that is, as $Z$ becomes nearly irrelevant for $X$, even a small $Cov(Z, u)$ is magnified and the bias can become arbitrarily large.
  • 49. Checking Instrument Validity Weak Instruments: How to test for weak instruments?
    We should therefore always check whether an instrument is relevant enough.
    Compute the first-stage F-statistic, which provides a measure of the information content of the instruments.
    Stock and Yogo (2005) showed that
    $$
    E(\hat{\beta}_{2SLS}) - \beta \cong \frac{E(\hat{\beta}_{OLS}) - \beta}{E(F) - 1}
    $$
    where $E(F)$ is the expectation of the first-stage F-statistic. If $E(F) = 10$, the bias of 2SLS relative to the bias of OLS is approximately $\frac{1}{9}$, which is small enough to be acceptable.
    A rule of thumb: if the first-stage F-statistic exceeds 10, you don't need to worry too much.
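As a rough illustration of the rule of thumb, the sketch below computes the homoskedasticity-only first-stage F-statistic for a single instrument (where it equals the squared t-statistic) and the approximate relative bias $1/(F - 1)$. Plugging a sample F into a formula stated for $E(F)$ is a simplification; the simulated data and function names are hypothetical.

```python
import random

def first_stage_f(xs, zs):
    """Homoskedasticity-only F for H0: pi_1 = 0 in X = pi_0 + pi_1 Z + v.

    With a single instrument this equals the squared t-statistic on pi_1.
    """
    n = len(xs)
    mean_x, mean_z = sum(xs) / n, sum(zs) / n
    s_zz = sum((z - mean_z) ** 2 for z in zs)
    pi1 = sum((z - mean_z) * (x - mean_x) for z, x in zip(zs, xs)) / s_zz
    pi0 = mean_x - pi1 * mean_z
    ssr = sum((x - pi0 - pi1 * z) ** 2 for x, z in zip(xs, zs))
    sigma2 = ssr / (n - 2)
    return pi1 ** 2 * s_zz / sigma2

def relative_bias(f_stat):
    """Approximate (bias of 2SLS) / (bias of OLS) = 1 / (E(F) - 1)."""
    return 1.0 / (f_stat - 1.0)

# Hypothetical first stage with a reasonably strong instrument
rng = random.Random(7)
zs = [rng.gauss(0, 1) for _ in range(500)]
xs = [0.5 * z + rng.gauss(0, 1) for z in zs]

f_stat = first_stage_f(xs, zs)
print(f"first-stage F = {f_stat:.1f}, approx. relative bias = {relative_bias(f_stat):.4f}")
print(f"relative bias at F = 10: {relative_bias(10.0):.3f}")
```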
  • 50. Checking Instrument Validity Angrist and Krueger (1991): Why IV over OLS?
    In Angrist and Krueger (1991), despite large sample sizes, the F-statistics for a test of the joint statistical significance of the excluded exogenous variables in the first-stage regression do not exceed 2.
  • 51. Checking Instrument Validity Wrap up
    If the correlation between the instruments and the endogenous variable is small, then even enormous sample sizes do not guarantee that quantitatively important finite-sample biases will be eliminated from IV estimates.
    The first assumption of the IV method, relevance of the instrument, can be checked with the F-statistic in the first stage.
    Potential solutions:
    If you have many IVs, some strong and some weak, discard the weak ones.
    If you only have one weak IV, find another, stronger IV (easy to say, very hard to do).
    Use an estimator other than 2SLS, such as LIML.
  • 52. Checking Instrument Validity Assumption #2 Instrument Exogeneity
    If the instruments are not exogenous, then TSLS is inconsistent. After all, the idea of instrumental variables regression is that the instrument contains information about variation in $X_i$ that is unrelated to the error term $u_i$.
    Can we statistically test the assumption that the instruments are exogenous?
    Answer: in most cases, NO.
    Assessing whether the instruments are exogenous necessarily requires making an expert judgment based on personal knowledge and expert opinion of the application ("telling a good story").
    In some cases you can test it partially, using an overidentification test.
  • 53. Checking Instrument Validity Assumption #2 Instrument Exogeneity
    Terminology: the relationship between the number of instruments ($m$) and the number of endogenous regressors ($k$):
    exactly (just) identified: $m = k$
    overidentified: $m > k$
    underidentified: $m < k$
    When the coefficients are just identified, you can't do a formal statistical test of the hypothesis that the instruments are in fact exogenous.
    If, however, there are more instruments than endogenous regressors, then there is a statistical tool that can be helpful in this process: the so-called test of overidentifying restrictions.
  • 54. Checking Instrument Validity Overidentification test: Intuition
    Suppose there are two valid instruments, $Z_1$ and $Z_2$ (you are very lucky).
    Then you could compute two separate TSLS estimates.
    Intuitively, if these two TSLS estimates are very different from each other, then something must be wrong: one or the other (or both) of the instruments must be invalid.
    The overidentifying restrictions test makes this comparison in a statistically precise way.
  • 55. Checking Instrument Validity Overidentification test
    Our model is a multiple regression:
    $$
    Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + ... + \beta_k X_{k,i} + \beta_{k+1} W_{1,i} + ... + \beta_{k+r} W_{r,i} + u_i \quad (12.13)
    $$
    where
    $Y_i$ is the dependent variable;
    $X_1, X_2, ..., X_k$ are $k$ endogenous regressors;
    $W_1, W_2, ..., W_r$ are the additional exogenous variables;
    $Z_1, Z_2, ..., Z_m$ are the $m$ instrumental variables;
    $u_i$ is the regression error term.
  • 56. Checking Instrument Validity Overidentification test
    Given a set of $m$ instruments $Z_1, Z_2, ..., Z_m$, run the 2SLS regression
    $$
    Y_i = \beta_0 + \beta_1 \hat{X}_{1,i} + \beta_2 \hat{X}_{2,i} + ... + \beta_k \hat{X}_{k,i} + \beta_{k+1} W_{1,i} + ... + \beta_{k+r} W_{r,i} + u_i
    $$
    and obtain the TSLS residuals $\hat{u}_i^{TSLS}$, computed with the actual $X$ values rather than the first-stage fitted values.
  • 57. Checking Instrument Validity Overidentification test
    Regress the TSLS residuals on the instruments and the included exogenous variables:
    $$
    \hat{u}_i^{TSLS} = \delta_0 + \delta_1 Z_{1i} + ... + \delta_m Z_{mi} + \delta_{m+1} W_{1i} + ... + \delta_{m+r} W_{ri} + e_i
    $$
    Let $F$ denote the homoskedasticity-only F-statistic testing the hypothesis that the coefficients on the instruments are all zero, $\delta_1 = ... = \delta_m = 0$.
    Then the overidentifying restrictions test statistic is $J = mF$.
    Under the null hypothesis that all the instruments are exogenous,
    $$
    J \xrightarrow{d} \chi^2_{m-k}
    $$
    where $m - k$ is the "degree of over-identification", that is, the number of instruments minus the number of endogenous regressors.
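The steps of the J-test can be sketched for the simplest overidentified case: one endogenous regressor ($k = 1$), two instruments ($m = 2$) and no $W$ variables, so that $J$ is compared with a $\chi^2_1$ distribution. This is a minimal standard-library sketch under those assumptions; the simulated data, the `direct_effect` parameter (which makes $Z_2$ invalid when nonzero) and all function names are hypothetical.

```python
import math
import random

def ols2(y, a, b):
    """OLS of y on a constant and two regressors a, b (2x2 normal equations)."""
    n = len(y)
    my, ma, mb = sum(y) / n, sum(a) / n, sum(b) / n
    saa = sum((v - ma) ** 2 for v in a)
    sbb = sum((w - mb) ** 2 for w in b)
    sab = sum((v - ma) * (w - mb) for v, w in zip(a, b))
    say = sum((v - ma) * (t - my) for v, t in zip(a, y))
    sby = sum((w - mb) * (t - my) for w, t in zip(b, y))
    det = saa * sbb - sab ** 2
    ca = (say * sbb - sby * sab) / det
    cb = (sby * saa - say * sab) / det
    return my - ca * ma - cb * mb, ca, cb

def j_test(ys, xs, z1, z2):
    """J statistic and p-value for k = 1 endogenous regressor, m = 2 instruments."""
    n, m = len(ys), 2
    # First stage and fitted values
    p0, p1, p2 = ols2(xs, z1, z2)
    x_hat = [p0 + p1 * a + p2 * b for a, b in zip(z1, z2)]
    # Second stage: simple OLS of Y on X_hat
    mean_xh, mean_y = sum(x_hat) / n, sum(ys) / n
    beta1 = (sum((h - mean_xh) * (y - mean_y) for h, y in zip(x_hat, ys))
             / sum((h - mean_xh) ** 2 for h in x_hat))
    beta0 = mean_y - beta1 * mean_xh
    # TSLS residuals use the actual X, not the fitted values
    u_hat = [y - beta0 - beta1 * x for y, x in zip(ys, xs)]
    # Auxiliary regression of the residuals on the instruments
    d0, d1, d2 = ols2(u_hat, z1, z2)
    mean_u = sum(u_hat) / n
    ssr = sum((u - d0 - d1 * a - d2 * b) ** 2 for u, a, b in zip(u_hat, z1, z2))
    sst = sum((u - mean_u) ** 2 for u in u_hat)
    f_stat = ((sst - ssr) / m) / (ssr / (n - m - 1))
    j = max(m * f_stat, 0.0)
    p_value = math.erfc(math.sqrt(j / 2.0))  # chi-squared tail with m - k = 1 df
    return j, p_value

def simulate(n, rng, direct_effect):
    """Hypothetical DGP; Z2 is invalid when direct_effect != 0."""
    ys, xs, z1, z2 = [], [], [], []
    for _ in range(n):
        a, b, u = rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1)
        x = 0.5 * a + 0.5 * b + 0.7 * u + rng.gauss(0, 1)
        ys.append(1.0 + 2.0 * x + u + direct_effect * b)
        xs.append(x)
        z1.append(a)
        z2.append(b)
    return ys, xs, z1, z2

rng = random.Random(11)
j_ok, p_ok = j_test(*simulate(5_000, rng, direct_effect=0.0))
j_bad, p_bad = j_test(*simulate(5_000, rng, direct_effect=1.0))
print(f"both instruments valid: J = {j_ok:.2f}, p = {p_ok:.3f}")
print(f"Z2 enters Y directly:   J = {j_bad:.2f}, p = {p_bad:.3g}")
```

The closed-form tail probability via `math.erfc` only applies to one degree of freedom; for a general $m - k$ a statistics library would be needed.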
  • 58. Checking Instrument Validity Application: Demand for Cigarettes
    A serious public health issue: huge externalities.
    One policy tool is to tax cigarettes so heavily that current smokers cut back and potential new smokers are discouraged from taking up the habit.
    Precisely how big a tax hike is needed to make a dent in cigarette consumption?
    For example, what would the after-tax sales price of cigarettes need to be to achieve a 20% reduction in cigarette consumption?
    The answer to this question depends on the elasticity of demand for cigarettes.
  • 59. Checking Instrument Validity Application: Demand for Cigarettes
    Because of the interactions between supply and demand, the elasticity of demand for cigarettes cannot be estimated consistently by an OLS regression of log quantity on log price.
    Using annual data for the 48 contiguous U.S. states in 1995, we therefore use TSLS to estimate the elasticity of demand for cigarettes.
    The instrumental variable, $SalesTax_i$, is the portion of the tax on cigarettes arising from the general sales tax, measured in dollars per pack.
    Cigarette consumption, $Q_i^{cigarettes}$, is the number of packs of cigarettes sold per capita in the state, and the price, $P_i^{cigarettes}$, is the average real price per pack of cigarettes including all taxes.
  • 60. Checking Instrument Validity Application: Demand for Cigarettes
    We consider quantity and price changes that occur over 10-year periods.
    Dependent variable: $\Delta \ln(Q_i^{cigarettes}) = \ln(Q_{i,1995}^{cigarettes}) - \ln(Q_{i,1985}^{cigarettes})$
    Independent variable: $\Delta \ln(P_i^{cigarettes}) = \ln(P_{i,1995}^{cigarettes}) - \ln(P_{i,1985}^{cigarettes})$
    Control variable: $\Delta \ln(Inc_i) = \ln(Inc_{i,1995}) - \ln(Inc_{i,1985})$
  • 61. Checking Instrument Validity Application: Demand for Cigarettes
    Two instruments:
    1. the change in the sales tax over 10 years, $\Delta SalesTax_i = SalesTax_{i,1995} - SalesTax_{i,1985}$
    2. the change in the cigarette-specific tax over 10 years, $\Delta CigTax_i = CigTax_{i,1995} - CigTax_{i,1985}$
  • 62. Checking Instrument Validity Application: Demand for Cigarettes. The first stage
  • 63. Checking Instrument Validity Application: Demand for Cigarettes
  • 64. Checking Instrument Validity Application: Demand for Cigarettes
    The overidentifying J-test rejects the null hypothesis that both instruments are exogenous at the 5% significance level ($p\text{-value} = 0.026$).
  • 65. Checking Instrument Validity Application: Demand for Cigarettes
    The reason the J-statistic rejects the null hypothesis that both instruments are exogenous is that the two instruments produce rather different estimated coefficients.
    The J-statistic rejection means that the regression in column (3) is based on invalid instruments (the instrument exogeneity condition fails).
  • 66. Checking Instrument Validity Application: Demand for Cigarettes
    The J-statistic rejection says that at least one of the instruments is endogenous, so there are three logical possibilities:
    the sales tax is exogenous but the cigarette-specific tax is not, in which case the column (1) regression is reliable;
    the cigarette-specific tax is exogenous but the sales tax is not, so the column (2) regression is reliable;
    or neither tax is exogenous, so neither regression is reliable.
    The statistical evidence cannot tell us which possibility is correct, so we must use our judgement.
  • 67. Checking Instrument Validity Application: Demand for Cigarettes
    We think that the case for the exogeneity of the general sales tax is stronger than that for the cigarette-specific tax,
    because the political process can link changes in the cigarette-specific tax to changes in the cigarette market and smoking policy.
    If smoking decreases in a state because it falls out of fashion, there will be fewer smokers and a weakened lobby against cigarette-specific tax increases, which in turn could lead to higher cigarette-specific taxes.
  • 68. Checking Instrument Validity Application: Demand for Cigarettes
    So we discard the estimate that uses the cigarette-specific tax as an instrument and adopt the price elasticity estimated using the general sales tax as an instrument, which is more reliable.
    The estimate of -0.94 indicates that cigarette consumption is somewhat elastic: an increase in price of 1% leads to a decrease in consumption of 0.94%.
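With an elasticity in hand, the policy question raised at the start of this application (what price increase would cut consumption by 20%?) becomes simple arithmetic. A minimal sketch, assuming a constant-elasticity demand curve $\ln Q = a + \epsilon \ln P$ and treating -0.94 as the relevant long-run (10-year) elasticity; the function name is hypothetical.

```python
import math

def required_price_increase(elasticity, target_reduction):
    """Proportional price increase needed for a given proportional fall in quantity,
    assuming constant-elasticity demand: ln Q = a + elasticity * ln P."""
    dlnq = math.log(1.0 - target_reduction)  # required change in log quantity
    dlnp = dlnq / elasticity                 # implied change in log price
    return math.exp(dlnp) - 1.0

increase = required_price_increase(-0.94, 0.20)
print(f"{increase:.1%}")  # about 26.8%
```

The quick linear approximation, $20\% / 0.94 \approx 21.3\%$, understates the answer because a 20% change is too large for the "percent change equals log change" shortcut.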
  • 69. Section 4: Instrumental Variable for multiple regression
  • 70. Instrumental Variable for multiple regression IV for multiple regression (Key Concept 12.1)
    Our model is a multiple regression:
    $$
    Y_i = \beta_0 + \beta_1 X_{1,i} + \beta_2 X_{2,i} + ... + \beta_k X_{k,i} + \beta_{k+1} W_{1,i} + ... + \beta_{k+r} W_{r,i} + u_i \quad (12.13)
    $$
    where
    $Y_i$ is the dependent variable;
    $X_1, X_2, ..., X_k$ are $k$ endogenous regressors;
    $W_1, W_2, ..., W_r$ are the additional exogenous variables;
    $Z_1, Z_2, ..., Z_m$ are the $m$ instrumental variables;
    $u_i$ is the regression error term.
  • 71. Instrumental Variable for multiple regression Two Conditions for Valid Instruments
    A set of $m$ instruments $Z_1, Z_2, ..., Z_m$ must satisfy the following two conditions to be valid:
    1. Instrument Relevance: In general, let $\hat{X}^*_{1i}$ be the predicted value of $X_{1i}$ from the population regression of $X_{1i}$ on the instruments ($Z$) and the included exogenous regressors ($W$), and let "1" denote the constant regressor that takes on the value 1 for all observations. Then $(\hat{X}^*_{1i}, ..., \hat{X}^*_{ki}, W_{1i}, W_{2i}, ..., W_{ri}, 1)$ are not perfectly multicollinear. If there is only one $X$, then for this condition to hold, at least one $Z$ must have a non-zero coefficient in the population regression of $X$ on the $Z$ and the $W$.
    2. Instrument Exogeneity: The instruments are uncorrelated with the error term, $Cov(Z_{1i}, u_i) = 0, ..., Cov(Z_{mi}, u_i) = 0$.
  • 72. Instrumental Variable for multiple regression The IV Regression Assumptions (Key Concept 12.4)
    The variables and errors in the IV regression model in Key Concept 12.1 satisfy the following:
    1. $E(u_i | W_{1i}, ..., W_{ri}) = 0$;
    2. $(X_{1i}, ..., X_{ki}, W_{1i}, ..., W_{ri}, Z_{1i}, ..., Z_{mi}, Y_i)$ are i.i.d. draws from their joint distribution;
    3. Large outliers are unlikely: the $X$, $W$, $Z$, and $Y$ have nonzero finite fourth moments;
    4. The two conditions for a valid instrument hold.
    Under the IV regression assumptions, the TSLS estimator is consistent and normally distributed in large samples.
    Because the sampling distribution of the TSLS estimator is normal in large samples, the general procedures for statistical inference (hypothesis tests and confidence intervals) in regression models extend to TSLS regression.
  • 73. Section 5: Review the last lecture
  • 74. Review the last lecture Instrumental Variables: Constant effect
    The instrumental variable method is a useful tool for causal inference. It can eliminate bias from:
    omitted variables;
    measurement error;
    reverse causality.
    Two assumptions:
    Relevance (weak instruments): can be tested with the first-stage regression and its F-statistic.
    Exogeneity: cannot be tested formally, but can be argued using professional knowledge.
    Estimation and inference:
    When the IV satisfies these two assumptions, the TSLS estimator of the coefficient of interest, $\hat{\beta}_{TSLS}$, is biased but consistent.
    The sampling distribution of the TSLS estimator is also normal in large samples.
  • 75. Section 6: IV with Heterogeneous Causal Effects
  • 76. IV with Heterogeneous Causal Effects Example: Angrist (1990)
    Topic: How does veteran status affect earnings?
    Method: Instrumental variables.
    Use the draft lottery outcome as an instrument for veteran status.
  • 77. IV with Heterogeneous Causal Effects Example: Angrist (1990) Background
    In the 1960s and 70s, young men in the US were at risk of being drafted for military service in Vietnam.
    Fairness concerns led to the institution of a draft lottery in 1970 that was used to determine priority for conscription.
    In each year from 1970 to 1972, random sequence numbers were assigned to each birth date in cohorts of 19-year-olds.
    Men with lottery numbers below a cutoff were eligible for the draft.
    Men with lottery numbers above the cutoff were not.
  • 78. IV with Heterogeneous Causal Effects Example: Angrist (1990) Instrumental Variables
    The instrument ($Z_i$) is defined as follows:
    $Z_i = 1$ if the lottery implied individual $i$ would be draft eligible,
    $Z_i = 0$ if the lottery implied individual $i$ would NOT be draft eligible.
    The econometrician observes treatment status ($D_i$) as follows:
    $D_i = 1$ if individual $i$ served in the Vietnam war (veteran),
    $D_i = 0$ if individual $i$ did not serve in the Vietnam war (not veteran).
  • 79. IV with Heterogeneous Causal Effects Example: Angrist (1990): IV's Relevance and Exogeneity
    While the lottery didn't completely determine veteran status, it certainly mattered: relevance.
    The lottery outcome was random, and it seems reasonable to suppose that its only effect was on veteran status: exogeneity.
  • 80. IV with Heterogeneous Causal Effects Example: Angrist (1990): heterogeneous effects
    We can classify individuals according to assignment ($Z$) and treatment ($X$, here veteran status $D_i$) into four groups.
  • 81. IV with Heterogeneous Causal Effects Local Average Treatment Effect (LATE)
    So the IV estimate only captures the effect of $X$ on $Y$ for one subpopulation: the compliers.
    Imbens and Angrist (1994) called this the Local Average Treatment Effect (LATE), that is, the treatment effect on those who change their behavior because of the instrument.
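A small simulation can illustrate why IV picks up the compliers' effect. With a binary instrument, the IV estimator equals the Wald ratio $(E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])$. The sketch below assumes a population of compliers, always-takers and never-takers (no defiers); the shares and treatment effects are hypothetical numbers chosen for illustration, not estimates from Angrist (1990).

```python
import random

def mean(values):
    return sum(values) / len(values)

def wald_estimate(ys, ds, zs):
    """(E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0]); equals IV with a binary Z."""
    y1 = [y for y, z in zip(ys, zs) if z == 1]
    y0 = [y for y, z in zip(ys, zs) if z == 0]
    d1 = [d for d, z in zip(ds, zs) if z == 1]
    d0 = [d for d, z in zip(ds, zs) if z == 0]
    return (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))

# Hypothetical population: type -> (share, effect of treatment on Y)
TYPES = {"complier": (0.3, -2.0), "always": (0.2, -5.0), "never": (0.5, 0.0)}

rng = random.Random(3)
ys, ds, zs = [], [], []
for _ in range(200_000):
    r = rng.random()
    kind = "complier" if r < 0.3 else ("always" if r < 0.5 else "never")
    z = 1 if rng.random() < 0.5 else 0                 # randomly assigned instrument
    if kind == "complier":
        d = z                                          # treatment follows the instrument
    else:
        d = 1 if kind == "always" else 0               # treatment ignores the instrument
    ys.append(TYPES[kind][1] * d + rng.gauss(0, 1))
    ds.append(d)
    zs.append(z)

late = wald_estimate(ys, ds, zs)
ate = sum(share * effect for share, effect in TYPES.values())
print(f"IV (Wald) estimate: {late:.2f}   complier effect: -2.00   population ATE: {ate:.2f}")
```

In this assumed population the estimate lands near the complier effect of -2, not the population average of -1.6, because always-takers and never-takers do not change treatment status when the instrument changes.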
  • 82. IV with Heterogeneous Causal Effects IV with Heterogeneous Causal Effects: Generalization
    If the population is heterogeneous, then the $i$th individual has his or her own causal effect, $\beta_{1i}$, and the population regression equation can be written
    $$
    Y_i = \beta_{0i} + \beta_{1i} X_i + u_i \quad (13.9)
    $$
    $\beta_{1i}$ is a random variable that, just like $u_i$, reflects unobserved variation across individuals.
    The average causal effect is the population mean value of the causal effect, $E(\beta_{1i})$, which is the expected causal effect of a randomly selected member of the population.
  • 83. IV with Heterogeneous Causal Effects OLS with Heterogeneous Causal Effects
    If there is heterogeneity in the causal effect and $X_i$ is randomly assigned, then the differences estimator is a consistent estimator of the average causal effect:
    $$
    \hat{\beta}_{OLS} = \frac{s_{XY}}{s_X^2} \xrightarrow{p} \frac{Cov(Y_i, X_i)}{Var(X_i)} = \frac{Cov(\beta_{0i} + \beta_{1i} X_i + u_i, X_i)}{Var(X_i)} = \frac{Cov(\beta_{1i} X_i, X_i)}{Var(X_i)} = E(\beta_{1i})
    $$
    The last two equalities hold because random assignment makes $X_i$ independent of $\beta_{0i}$, $\beta_{1i}$ and $u_i$.
    Thus, if $X_i$ is randomly assigned, $\hat{\beta}_1$ is a consistent estimator of the average causal effect $E(\beta_{1i})$.
  • 84. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects
    Specifically, suppose that $X_i$ is related to $Z_i$ by the linear model
    $$
    X_i = \pi_{0i} + \pi_{1i} Z_i + v_i
    $$
    where the coefficients $\pi_{0i}$ and $\pi_{1i}$ vary from one individual to the next.
    This is the first-stage equation of TSLS, modified to allow a heterogeneous effect of $Z$ on $X$.
  • 85. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects
    Then the TSLS estimator becomes
    $$
    \hat{\beta}_{2SLS} = \frac{s_{ZY}}{s_{ZX}} \xrightarrow{p} \frac{\sigma_{ZY}}{\sigma_{ZX}} = \frac{E(\beta_{1i} \pi_{1i})}{E(\pi_{1i})}
    $$
    Exercise: prove it yourself (see Appendix 13.2).
    The TSLS estimator converges in probability to the ratio of the expected value of the product of $\beta_{1i}$ and $\pi_{1i}$ to the expected value of $\pi_{1i}$.
  • 86. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects
    This is a weighted average of the individual causal effects $\beta_{1i}$.
    The weights are $\frac{\pi_{1i}}{E(\pi_{1i})}$, which measure the relative degree to which the instrument influences whether the $i$th individual receives treatment.
    In other words, the TSLS estimator is a consistent estimator of a weighted average of the individual causal effects, where the individuals who receive the most weight are those for whom the instrument is most influential.
  • 87. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects
    Three special cases:
    The treatment effect is the same for all individuals: $\beta_{1i} = \beta_1$.
    The instrument affects each individual equally: $\pi_{1i} = \pi_1$.
    The heterogeneity in the treatment effect and the heterogeneity in the effect of the instrument are uncorrelated: $Cov(\beta_{1i}, \pi_{1i}) = 0$.
  • 88. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects
    LATE equals ATE: in all three cases we have
    $$
    \frac{E(\beta_{1i} \pi_{1i})}{E(\pi_{1i})} = E(\beta_{1i})
    $$
    (which is simply $\beta_1$ when the treatment effect is constant).
    Aside from these three special cases, in general the local average treatment effect differs from the average treatment effect.
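The weighted-average interpretation can be checked with a few lines of arithmetic. The sketch below evaluates $E(\beta_{1i}\pi_{1i}) / E(\pi_{1i})$ for a made-up population of three equally likely types; the shares, effects and function names are hypothetical.

```python
def tsls_limit(types):
    """types: list of (share, beta1, pi1). Returns E(beta1 * pi1) / E(pi1)."""
    numerator = sum(share * beta * pi for share, beta, pi in types)
    denominator = sum(share * pi for share, beta, pi in types)
    return numerator / denominator

def average_effect(types):
    """Population average causal effect E(beta1)."""
    return sum(share * beta for share, beta, pi in types)

share = 1.0 / 3.0

# Effects and first-stage responses move together: LATE differs from ATE
correlated = [(share, 1.0, 0.1), (share, 2.0, 0.5), (share, 3.0, 0.9)]
# Same effects, but the instrument moves everyone equally: LATE equals ATE
equal_first_stage = [(share, 1.0, 0.5), (share, 2.0, 0.5), (share, 3.0, 0.5)]

print(f"ATE:                        {average_effect(correlated):.3f}")
print(f"TSLS limit (correlated):    {tsls_limit(correlated):.3f}")
print(f"TSLS limit (equal pi_1i):   {tsls_limit(equal_first_stage):.3f}")
```

In the first population the type with the largest effect also responds most to the instrument, so it gets the most weight and the TSLS limit ($3.8 / 1.5 \approx 2.53$) exceeds the average effect of 2.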
  • 89. IV with Heterogeneous Causal Effects IV Regression with Heterogeneous Causal Effects: Implications
    Different instruments can identify different parameters because they estimate the impact on different populations.
    The difference arises because each researcher is implicitly estimating a different weighted average of the individual causal effects in the population.
    Recall: the J-test of overidentifying restrictions can reject if the two instruments estimate different local average treatment effects, even if both instruments are valid.
    In general, neither estimator is a consistent estimator of the average causal effect.
  • 90. IV with Heterogeneous Causal Effects In Summary
    The IV paradigm provides a powerful and flexible framework for causal inference.
    It is an alternative to random assignment with a strong claim on internal validity.
    The LATE framework highlights questions of external validity:
    Can one instrument identify the average effect induced by another source of variation?
    Can we go from average effects on compliers to average effects on the entire treated population or an unconditional effect?
    The answer to these questions is usually: NO, at least without additional assumptions.
  • 91. Section 7: Some Practical Guides by Angrist and Pischke (2012)
  • 92. Some Practical Guides by Angrist and Pischke (2012) Practical Guides
    1. Check IV relevance
    Always report the first stage and think about whether it makes sense (signs and magnitudes).
    Always report the F-statistic on the excluded instruments. The bigger, the better. Don't forget the rule of thumb ($F > 10$).
    2. Check the exclusion restriction
    The exclusion restriction cannot be tested directly, but it can be falsified.
    Run and examine the reduced form (regression of the dependent variable on the instruments) and look at the coefficients, t-statistics and F-statistics for the excluded instruments.
    Because the reduced form is proportional to the causal effect of interest and is estimated without bias by OLS, we should see the causal relation in the reduced form. If you can't see the causal relation in the reduced form, it's probably not there.
  • 93. Practical Guides. (3) Provide a substantive explanation for the observed difference between 2SLS and OLS. How big is the difference? What does this tell you? Is the coefficient bigger when the theory of endogeneity suggests it should be smaller? If so, why: measurement error or heterogeneous effects? (4) If you have multiple instruments, report over-identification tests. Pick your best single instrument and report just-identified estimates using this one only, because just-identified IV is relatively unlikely to be subject to weak-instrument bias. Worry if it is substantially different from what you get using multiple instruments. Check over-identified 2SLS estimates with LIML. LIML is less precise than 2SLS but also less biased. If the results come out similar, be happy. If not, worry, and try to find stronger instruments.
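The comparison recommended above (OLS, just-identified IV, over-identified 2SLS, and LIML) can be sketched using the k-class family, in which k = 0 gives OLS, k = 1 gives 2SLS, and k equal to the LIML eigenvalue gives LIML. This is an illustrative NumPy sketch on simulated data; the helper names `resid`, `k_class` and `liml_kappa` and all parameter values are assumptions made for the example.

```python
import numpy as np

def resid(A, B):
    """Residuals from regressing each column of A on B."""
    return A - B @ np.linalg.lstsq(B, A, rcond=None)[0]

def k_class(y, x, W, Z, k):
    """k-class estimate of the coefficient on x (k=0: OLS, k=1: 2SLS)."""
    X = np.column_stack([x, W])
    MX = resid(X, np.column_stack([W, Z]))  # part of X not explained by W and Z
    A = X.T @ X - k * (MX.T @ MX)
    b = X.T @ y - k * (MX.T @ y)
    return np.linalg.solve(A, b)[0]

def liml_kappa(y, x, W, Z):
    """Smallest eigenvalue that defines the LIML estimator."""
    Y = np.column_stack([y, x])
    R_W = resid(Y, W)
    R_Z = resid(Y, np.column_stack([W, Z]))
    eig = np.linalg.eigvals(np.linalg.solve(R_Z.T @ R_Z, R_W.T @ R_W))
    return eig.real.min()

# Simulated data with three instruments (all numbers are hypothetical).
rng = np.random.default_rng(2)
n = 20_000
Z = rng.normal(size=(n, 3))
u = rng.normal(size=n)                               # unobserved confounder
x = Z @ np.array([0.4, 0.3, 0.2]) + 0.8 * u + rng.normal(size=n)
y = 2.0 * x + u + rng.normal(size=n)                 # true causal effect is 2.0
W = np.ones((n, 1))

b_ols = k_class(y, x, W, Z, 0.0)
b_2sls = k_class(y, x, W, Z, 1.0)
kappa = liml_kappa(y, x, W, Z)
b_liml = k_class(y, x, W, Z, kappa)

# Just-identified check: keep only the first instrument.
kappa_ji = liml_kappa(y, x, W, Z[:, :1])
b_ji = k_class(y, x, W, Z[:, :1], 1.0)

print(f"OLS {b_ols:.3f} | 2SLS {b_2sls:.3f} | LIML {b_liml:.3f} | just-identified IV {b_ji:.3f}")
```

In this simulation the instruments are strong, so 2SLS, LIML and the just-identified estimate should agree closely while OLS is biased upward; with weak instruments the three IV estimates would be expected to drift apart.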
  • 94. How to Evaluate an IV Paper in a Simple Way? (1) Relevance: the first-stage regression. Does the author report the first-stage regression? Does the instrument perform well in the first stage? Testable; rule of thumb: first-stage F > 10. (2) Exclusion restriction: Is the instrument exogenous enough? (Random assignment is best.) Would you expect a direct effect of Z on Y? Not directly testable, except when the equation is overidentified. (3) What LATE is being estimated? Whose behavior is affected by the instrument? Is this the LATE you would want? Is it a quantity of theoretical interest? Would other LATEs possibly yield different estimates?
  • 95. Section 8: A Good Example: Long Live Keju
  • 96. Chen, Kung and Ma (2017). Title: Long Live Keju! The Persistent Effects of China's Imperial Examination System. Topic: long-term persistence of human capital: the effect of Keju. Dependent variable: education level in 2010. Independent variable: the density of jinshi in the Ming-Qing dynasties. Data: jinshi records for 272 prefectures.
  • 97. Chen, Kung and Ma (2018)
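  • 98. Chen, Kung and Ma (2018): The effect of Keju on human capital at present. Run the regression $\ln Y_i = \beta \ln(\mathit{Keju}_i) + \gamma_1 X^c_i + \gamma_2 X^h_i + \alpha_p + u_i$, where $Y_i$ is the average years of schooling in prefecture $i$ in 2010; $\mathit{Keju}_i$ is the number of jinshi from prefecture $i$ during the Ming-Qing period; $X^c_i$ are contemporary controls, including economic prosperity (nighttime lights) and geographic factors (distance to the coast; terrain, which protects a prefecture from natural disasters); $X^h_i$ are historical controls: historical economic prosperity, basic educational infrastructure, and social and political influence.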
  • 99. Chen, Kung and Ma (2018): OLS
  • 100. Chen, Kung and Ma (2018): Potential Bias. OVB: omitted variables that are simultaneously associated with both historical jinshi density and years of schooling today. For instance, prefectures that produced more jinshi may be associated with unobserved (natural or genetic) endowments.
  • 101. Chen, Kung and Ma (2018): Instrumental Variable. IV: distance to the printing ingredients (pine and bamboo) as the instrumental variable for Keju. A logic chain: more success in Keju ⟺ more reference books ⟺ reference books printed in printing centers ⟺ printing centers located near the printing ingredients.
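To connect this instrument to the regression on the OLS slide, the estimating system can be sketched as below. This is a sketch under stated assumptions, not the paper's exact specification: $BP_i$ is a label introduced here for prefecture $i$'s distance to pine and bamboo, the coefficients $\pi$, $\rho$, $\delta$, $\theta$, the terms $\mu_p$, $\lambda_p$ (analogues of $\alpha_p$) and the errors $v_i$, $e_i$ are notation chosen for illustration, and the controls are simply carried over from the OLS equation.

```latex
\begin{align*}
\text{First stage:}\quad & \ln(\mathit{Keju}_i) = \pi\, BP_i + \delta_1 X^c_i + \delta_2 X^h_i + \mu_p + v_i \\
\text{Reduced form:}\quad & \ln Y_i = \rho\, BP_i + \theta_1 X^c_i + \theta_2 X^h_i + \lambda_p + e_i \\
\text{Just-identified IV:}\quad & \hat{\beta}_{IV} = \hat{\rho} \,/\, \hat{\pi}
\end{align*}
```

With a single instrument, the 2SLS estimate of $\beta$ equals the ratio of the reduced-form coefficient to the first-stage coefficient, so the first-stage and reduced-form tables together already pin down the 2SLS result.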
  • 102. Chen, Kung and Ma (2018): First Stage
  • 103. Chen, Kung and Ma (2018): Reduced Form and 2SLS
  • 104. Section 9: Where Do Valid Instruments Come From?
  • 105. Where do we find an IV? Generally speaking, a good instrument is something you come across by luck rather than something you can set out to find. Two main approaches: (1) economic theory and logic; (2) an exogenous source of variation in X (natural experiments).
  • 106. Where do we find an IV? Example 1: Does putting criminals in jail reduce crime? Run a regression of crime rates (dependent variable) on incarceration rates (independent variable) using annual data at a suitable level of jurisdiction (states), with covariates (economic conditions). Simultaneous causality bias: when crime rates go up there are more prisoners, and more prisoners reduce crime. IV: it must affect the incarceration rate but be unrelated to any of the unobserved factors that determine the crime rate. Levitt (1996) suggested that lawsuits aimed at reducing prison overcrowding could serve as an instrumental variable. Result: the estimated effect was three times larger than the effect estimated using OLS.
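Why OLS can understate the effect of incarceration under simultaneous causality can be seen in a stylized two-equation simulation. Everything here is hypothetical: the structural parameters `b`, `g`, `d`, the variable names, and the data are invented for illustration and are not Levitt's estimates or data.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
b, g, d = 0.3, 0.5, -1.0   # hypothetical structural parameters

z = rng.normal(size=n)     # instrument: shifts incarceration only (e.g. litigation)
u = rng.normal(size=n)     # unobserved crime shocks
v = rng.normal(size=n)     # unobserved incarceration shocks

# Solve the simultaneous system
#   crime  = -b * prison + u          (prisons reduce crime; true effect is -b)
#   prison =  g * crime + d * z + v   (more crime leads to more prisoners)
crime = (-b * (d * z + v) + u) / (1 + b * g)
prison = g * crime + d * z + v

def slope(y, x):
    """Bivariate OLS slope of y on x."""
    c = np.cov(x, y)
    return c[0, 1] / c[0, 0]

ols_hat = slope(crime, prison)                                  # contaminated by reverse causality
iv_hat = np.cov(z, crime)[0, 1] / np.cov(z, prison)[0, 1]       # reduced form / first stage

print(f"true effect = {-b:.2f}, OLS = {ols_hat:.3f}, IV = {iv_hat:.3f}")
```

Because crime shocks raise incarceration, the OLS slope mixes the negative causal effect with a positive feedback and ends up much closer to zero than the IV estimate, which is the same direction of bias described in the example.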
  • 107. Where do we find an IV? Class Size and Test Scores. Example 2: Does cutting class sizes increase test scores? Omitted variable bias: factors such as parental interest in learning, learning opportunities outside the classroom, and the quality of teachers and school facilities. IV: it must be correlated with class size (relevance) but uncorrelated with the omitted determinants of test performance. Hoxby (2000) suggested biology: because of random fluctuations in the timing of births, the size of the incoming kindergarten class varies from one year to the next. But potential enrollment also fluctuates because parents with young children choose to move into an improving school district and out of one in trouble. She used the deviation of potential enrollment from its long-term trend as her instrument. Result: the effect of class size on test scores is small.
  • 108. Where do we find an IV? (1) Institutional background. Angrist (1990), draft lottery: men were randomly designated for the Vietnam-era draft based on their birthdays, which is used to estimate the wage impact of shorter work experience. Acemoglu, Johnson, and Robinson (2001): the death rate from some diseases in some areas, used to estimate the impact of institutions on economic growth. Feng et al. (2012), "The Returns to Education in China: Evidence from the 1986 Compulsory Education Law". Li and Zhang (2007), Liu (2012): the "One Child" policy.
  • 109. Where do we find an IV? (2) Natural conditions (geography, weather, disasters): rainfall, hurricanes, earthquakes, tsunamis… The number of rivers: Hoxby (2000). Ying Bai and Ruixue Jia (2014): "keju" and "the number of small rivers".
  • 110. Where do we find an IV? (3) Economic theory and economic logic. When studying the relationship between alcohol consumption and income, the alcohol price in a local market may serve as an instrument for alcohol consumption. Angrist & Evans (1998): having same-sex or different-sex children, used to estimate the impact of an additional birth on women's labor supply.