Instrumental Variables and
Control Functions
Day 3, Lecture 1
By Caroline Krafft
Training on Applied Micro-Econometrics and
Public Policy Evaluation
July 25-27, 2016
Economic Research Forum
Readings
• Primary source:
• Angrist, J.D. and J.-S. Pischke (2009). Chapter 4, “Instrumental
variables in action: Sometimes you get what you need,” in Mostly
Harmless Econometrics. Princeton, NJ: Princeton University Press.
• Additional material from:
• Wooldridge, J. M. (2015). Control Function Methods in Applied
Econometrics. Journal of Human Resources, 50(2), 421–445.
• Terza, J. V., Basu, A., & Rathouz, P. J. (2008). Two-Stage Residual
Inclusion Estimation: Addressing Endogeneity in Health
Econometric Modeling. Journal of Health Economics, 27(3), 531–543.
Type II solutions
• Have been examining Type I (conditional exogeneity of
placement) quasi-experimental solutions
• Propensity score matching
• Difference-in-difference
• Panel data (fixed and random)
• Now moving to Type II solutions
• Rules or instruments (instrumental variables) determine placement
• Instrumental variables or regression discontinuity design
Instrumental variables
• An instrument is something correlated with the causal variable
of interest but uncorrelated with any other determinants of the
dependent variable
• An instrumental variable can be used to address omitted variables
bias, which arises when control variables are missing or unknown
• Instrumental variables can be used in different techniques:
• Two-stage least squares
• Control functions
• IVs also have an important role in:
• Correcting for random measurement error in continuous variables
• Not categorical or binary variables
• Solving simultaneous equations models
Case study: schooling and wages
• Interested in the link between schooling (s) and wages (Y).
Let’s assume that the causal relationship is a different function f
for each person (i): Ysi=fi(s)
• Tells us what individual i earns for any s
• A common simplifying assumption is that the functional form is
a linear, constant-effects (causal) model:
• Ysi=α+ρs+ηi
• ηi is other factors that determine potential earnings
• Let’s say that one observable factor is ability, Ai
• Selection on observables would mean:
• ηi=Ai’γ+υi
• For the moment, let us assume that ability is the only reason ηi and si
are correlated, so that E[siυi]=0
• So if ability is observed, then Ysi=α+ρsi+Ai’γ+υi
What if ability is not observed?
• Want to know ρ in Ysi=α+ρsi+Ai’γ+υi
• Do not observe Ai, which is likely to be correlated with si,
so simply estimating Ysi=α+ρsi+ηi by OLS would suffer from
omitted variables bias
• Potential solution: use an instrument (zi) that is correlated
with the causal variable of interest (si) but uncorrelated
with any other determinants of the dependent variable
• Cov(zi, ηi)=0
• Referred to as the exclusion restriction: zi could have been
excluded from our model of interest
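A short derivation, in the slide's notation, of why the exclusion restriction pins down ρ in the constant-effects model. This is a sketch: it additionally assumes Cov(si, zi)≠0 (a non-zero first stage), and the V(zi) notation for the variance of the instrument is not on the slide.

```latex
\begin{align*}
\mathrm{Cov}(Y_i, z_i) &= \mathrm{Cov}(\alpha + \rho s_i + \eta_i,\; z_i) \\
  &= \rho\,\mathrm{Cov}(s_i, z_i) + \mathrm{Cov}(\eta_i, z_i) \\
  &= \rho\,\mathrm{Cov}(s_i, z_i) \qquad \text{since } \mathrm{Cov}(z_i, \eta_i) = 0, \\[4pt]
\rho &= \frac{\mathrm{Cov}(Y_i, z_i)}{\mathrm{Cov}(s_i, z_i)}
      = \frac{\mathrm{Cov}(Y_i, z_i)/V(z_i)}{\mathrm{Cov}(s_i, z_i)/V(z_i)} .
\end{align*}
```

The last ratio is the coefficient from regressing Yi on zi divided by the coefficient from regressing si on zi.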
Two stage least squares
• Want to know ρ in ysi=α’Xi+ρsi+Ai’γ+υi (structural equation)
• Both yi and si are endogenous variables, Xi exogenous
• Have some zi (exogenous variable) meeting exclusion
restriction
• Can estimate regression of the first stage (si on zi) and
regression for reduced form (yi on zi)
• First stage, controlling for covariates:
$s_i = X_i'\pi_{10} + \pi_{11} z_i + \xi_{1i}$
• Substituting predicted value of si into structural equation
generates two-stage least squares estimates:
$\hat{s}_i = X_i'\hat{\pi}_{10} + \hat{\pi}_{11} z_i$
$y_i = \alpha'X_i + \rho\hat{s}_i + [\eta_i + \rho(s_i - \hat{s}_i)]$
Two Stage Least Squares: A summary
• Two stage least squares uses an instrument, the concept
of a first stage and a second stage (two steps on previous
slide) to get a consistent estimate of ρ
• Running the two stages manually (as two separate OLS regressions) gives incorrect standard errors
• Typically implemented in a single command in software
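The two steps can be sketched on simulated data. Everything in this example is an assumption made for illustration (the data-generating process, the true ρ of 0.10, the variable names); it only shows that the two-step point estimate recovers ρ while OLS does not, and it deliberately reports no standard errors.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Simulated data; every number here is an illustrative assumption.
ability = rng.normal(size=n)                    # unobserved A_i
z = rng.integers(0, 2, size=n).astype(float)    # binary instrument z_i
x = rng.normal(size=n)                          # exogenous covariate X_i
s = 10 + 1.0 * z + 0.5 * x + ability + rng.normal(size=n)   # schooling s_i
rho_true = 0.10
y = 1.0 + rho_true * s + 0.2 * x + 0.5 * ability + rng.normal(size=n)

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Naive OLS of y on s: ability is omitted, so the estimate of rho is biased.
rho_ols = ols(np.column_stack([const, x, s]), y)[2]

# First stage: regress s on the covariates and the instrument, keep fitted values.
first_stage = np.column_stack([const, x, z])
s_hat = first_stage @ ols(first_stage, s)

# Second stage: replace s with s_hat in the structural equation.
rho_2sls = ols(np.column_stack([const, x, s_hat]), y)[2]

print(f"true rho = {rho_true:.3f}, OLS = {rho_ols:.3f}, 2SLS = {rho_2sls:.3f}")
```

The same covariates appear in both stages, which matters for consistency of the second stage.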
Finding an instrument
• Things you’ll need to check in implementing this two-stage
approach
• First stage is statistically significant (no weak instruments)
• Test this and present your test statistics (F-test)
• Exclusion restriction (zi only affects yi through si)
• To find a good instrument need to understand processes
determining the variable of interest (si)
• Institutional knowledge is particularly helpful
• There may be institutional constraints that can serve as good
instruments
• Need a very strong case for why instrument only affects
outcome through variable of interest
• If there are any other Xs that are affected by instrument, would need to
control for them to prevent a violation of the exclusion restriction
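A minimal sketch of the first-stage F-test on the excluded instruments, assuming homoskedastic errors. The function name `first_stage_f` and the simulated data are hypothetical; statistical packages report a comparable statistic alongside their 2SLS output.

```python
import numpy as np

def first_stage_f(s, X, Z_excl):
    """Homoskedastic F statistic for H0: all excluded instruments have zero coefficients.

    s      : endogenous regressor, shape (n,)
    X      : included exogenous covariates (with a constant), shape (n, k)
    Z_excl : excluded instrument(s), shape (n,) or (n, q)
    """
    def ssr(M):
        resid = s - M @ np.linalg.lstsq(M, s, rcond=None)[0]
        return resid @ resid

    X_full = np.column_stack([X, Z_excl])
    q = X_full.shape[1] - X.shape[1]        # number of excluded instruments
    dof = len(s) - X_full.shape[1]          # residual degrees of freedom
    return ((ssr(X) - ssr(X_full)) / q) / (ssr(X_full) / dof)

# Illustrative data: one relevant instrument and one irrelevant instrument.
rng = np.random.default_rng(0)
n = 2_000
X = np.column_stack([np.ones(n), rng.normal(size=n)])
z = rng.normal(size=n)
noise = rng.normal(size=n)
s_strong = X @ np.array([1.0, 0.3]) + 0.5 * z + noise
s_irrelevant = X @ np.array([1.0, 0.3]) + 0.0 * z + noise

f_strong = first_stage_f(s_strong, X, z)
f_irrelevant = first_stage_f(s_irrelevant, X, z)
print(f"F (relevant instrument) = {f_strong:.1f}, F (irrelevant) = {f_irrelevant:.2f}")
```

The F statistic only speaks to instrument strength; the exclusion restriction still has to be argued from institutional knowledge.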
Example: Quarter of Birth Instrument
• Example (Angrist and Krueger 1991): Children enter
school in the calendar year in which they turn 6
• School start age is a function of date of birth
• Students have to stay in school until age 16
• So they are in different grades when they reach the dropout age
• Compulsory schooling laws and age of entry create a natural
experiment where children are compelled to attend school for a
varying number of years based on when they are born
• Can use quarter of birth as an instrument for schooling
• Conceptually excludable: quarter of birth shouldn’t affect
ability (or motivation, or family connections, or anything
else that affects wages)
Average Education by Quarter of Birth (First Stage)
Men born earlier in the
year tend to have less
schooling (quit “earlier”)
Earnings and quarter of birth
Earnings are lower for those who are born
in earlier quarters—those with less
schooling
Example: OLS and 2SLS results
Multiple instruments
• Can use multiple instruments (z1i, z2i, z3i, etc.) in the first
stage
• If have multiple endogenous variables in second stage,
will need multiple instruments
• Models are just identified when the # instruments=#
endogenous variables
• Models are over identified when the # instruments>#
endogenous variables
• This allows for additional testing of the assumptions underlying
instruments
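One common form of this additional testing is the Sargan overidentification statistic, which the slide does not name; it is used here as an assumed example. The sketch regresses the 2SLS residuals on all instruments and returns n·R², which under valid instruments and homoskedasticity is approximately chi-squared with (# instruments − # endogenous variables) degrees of freedom. The function name and the simulated data are hypothetical.

```python
import numpy as np

def sargan_stat(y, s, instruments):
    """n * R^2 from regressing 2SLS residuals on the constant and all instruments."""
    n = len(y)
    const = np.ones(n)
    Z = np.column_stack([const, instruments])
    s_hat = Z @ np.linalg.lstsq(Z, s, rcond=None)[0]                 # first stage
    beta = np.linalg.lstsq(np.column_stack([const, s_hat]), y, rcond=None)[0]
    u = y - np.column_stack([const, s]) @ beta   # 2SLS residuals use the actual s
    u_fit = Z @ np.linalg.lstsq(Z, u, rcond=None)[0]
    r2 = (u_fit @ u_fit) / (u @ u)               # residuals have mean zero here
    return n * r2

rng = np.random.default_rng(0)
n = 5_000
ability = rng.normal(size=n)
Zs = rng.normal(size=(n, 3))                     # three instruments, one endogenous variable
s = Zs.sum(axis=1) + ability + rng.normal(size=n)
eta = 0.5 * ability + rng.normal(size=n)

y_valid = 0.1 * s + eta                          # all instruments excluded from y
y_invalid = 0.1 * s + eta + 0.5 * Zs[:, 2]       # third instrument affects y directly

stat_valid = sargan_stat(y_valid, s, Zs)
stat_invalid = sargan_stat(y_invalid, s, Zs)
print(f"Sargan stat, valid instruments: {stat_valid:.2f}; one invalid: {stat_invalid:.1f}")
```

With three instruments and one endogenous variable there are 2 degrees of freedom, so the 5% critical value is about 5.99. The test cannot detect a violation shared by all instruments.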
IV with Heterogeneous Potential
Outcomes
• Our basic two stage least squares (2SLS) model
assumed that the causal effect of interest is constant.
• For a dummy variable (example: college (1) or no college (0)) this
means y1i-y0i=ρ for all i
• Homogeneous treatment effect
• For a multivalued variable (example: years of schooling), this
means ysi-ys-1,i=ρ for all i and all s
• Linearity and homogeneous treatment effect
Validity and Heterogeneity
• Treatment effects are likely to be heterogeneous
• A distribution of effects across individuals
• Example: Individuals who choose to take up a training program may
be those who particularly benefit from it. Expanding the training
program to the general population might have different (weaker)
effects
• Internal validity occurs when the analysis discovers causal effects
for the population being studied
• Will hold for a good IV study or RCT
• Regardless of heterogeneity
• External validity occurs when a study's results can predict effects in
different contexts
• Can be better assessed when allowing for heterogeneous treatment effects
Heterogeneity with a dummy treatment
variable
• Interested in the effect of some program on the outcome
yi, where participation is captured by a dummy, Di
• Denote as yi(d, z) the potential outcome of i with Di=d and zi=z
• Denote as D1i i’s treatment status when zi=1 and denote
as D0i i’s treatment status when zi=0
• Only one is observed
• Observed treatment status is therefore:
• $D_i = D_{0i} + (D_{1i} - D_{0i}) z_i = \pi_0 + \pi_{1i} z_i + \xi_i$
• Here π0=E[D0i] and π1i=D1i-D0i, so the average causal effect of zi on Di is E[π1i]
Monotonicity assumption
• Have model for treatment of: $D_i = \pi_0 + \pi_{1i} z_i + \xi_i$
• For this model to be useful, it has to be the case that
monotonicity holds, meaning: $\pi_{1i} \geq 0$ for all i, or $\pi_{1i} \leq 0$ for all i
• The instrument has to either:
• increase participation or have no effect, for every individual
• or decrease participation or have no effect, for every individual
• There cannot be some people who are more likely and
some less likely to participate because of the instrument
Independence Assumption
• Have to assume that the instrument is as good as
randomly assigned (independent of potential outcomes
and treatment assignments):
$[y_i(D_{1i},1),\, y_i(D_{0i},0),\, D_{1i},\, D_{0i}] \perp z_i$
• This means then that the first stage is causal
Exclusion restriction
• For the exclusion restriction to hold in the heterogeneous
treatment effects and dummy treatment framework, it must be
the case that yi(d,z) is a function only of d
• yi(d,0)=yi(d,1) for d=0,1
• Exclusion restriction fails if outcome of interest is affected by
instrument in some other way than by treatment (program) of
interest
• Need to have a unique channel for causal effects of instrument
• The instrument could still be randomly assigned
• Random assignment could lead to other changes in behavior
• Example: Those more likely to be drafted into the military stayed in
school longer. Draft numbers were assigned by random lottery, but the behavior of
remaining in college confounds the estimated impact of military service on
wages
The LATE Theorem
• Given:
• A1-Independence: $[y_i(D_{1i},1),\, y_i(D_{0i},0),\, D_{1i},\, D_{0i}] \perp z_i$
• A2-Exclusion: yi(d,0)=yi(d,1) for d=0,1
• A3-First stage (no weak instruments): E[D1i -D0i]≠0
• A4-Monotonicity: $D_{1i} - D_{0i} \geq 0 \;\forall i$, or $D_{1i} - D_{0i} \leq 0 \;\forall i$
• In any study with IVs, you need to consider and discuss all of these
conditions in your work
• Then you can estimate the local average treatment effect
(LATE) (for the instrument increasing treatment case):
$$\frac{E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0]}{E[D_i \mid z_i = 1] - E[D_i \mid z_i = 0]} = E[y_{1i} - y_{0i} \mid D_{1i} > D_{0i}] = E[\rho_i \mid \pi_{1i} > 0]$$
Dividing the sample for LATE
• There could be four groups (assume IV increases treatment)
• Defiers (ruled out by monotonicity): D0i=1, D1i=0
• Compliers: D1i=1, D0i=0
• Affected by the instrument
• Always takers: D1i=1, D0i=1
• Never takers D1i=0, D0i=0
• With LATE, identify the effect of the treatment based on the
population of compliers
• Not informative about effects on never takers or always takers
• ATT based on always-takers and compliers
• ATU based on never-takers and compliers
• Compliers will be different (and therefore LATE different) for
different instruments
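A small simulation can illustrate that the ratio of the reduced form to the first stage recovers the average effect for compliers only. The group shares, the effect sizes (0, 2 and 5) and all variable names are assumptions made for this sketch; defiers are excluded by construction, so monotonicity holds.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Latent types: 0 = never-taker, 1 = complier, 2 = always-taker (no defiers).
types = rng.choice(3, size=n, p=[0.3, 0.5, 0.2])
z = rng.integers(0, 2, size=n)            # randomly assigned binary instrument

D0 = (types == 2).astype(int)             # treatment status if z = 0
D1 = (types >= 1).astype(int)             # treatment status if z = 1
D = np.where(z == 1, D1, D0)              # observed treatment

# Heterogeneous effects by type: never-takers 0, compliers 2, always-takers 5.
effect = np.select([types == 0, types == 1, types == 2], [0.0, 2.0, 5.0])
y0 = rng.normal(size=n) + 1.0 * (types == 2)   # always-takers also differ at baseline
y = y0 + effect * D

# Wald / IV estimator: reduced form divided by first stage.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = D[z == 1].mean() - D[z == 0].mean()
wald = reduced_form / first_stage

# Naive comparison of treated and untreated mixes selection with the effect.
naive = y[D == 1].mean() - y[D == 0].mean()

print(f"complier share ~ {first_stage:.3f}, Wald = {wald:.3f}, naive = {naive:.3f}")
```

The Wald ratio sits near the complier effect of 2 and says nothing about the effect of 5 assumed for always-takers, which is why a different instrument (with different compliers) can give a different LATE.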
IVs in RCTs
• Often end up using IVs and LATE in RCTs when:
• RCT is a randomly assigned offer of treatment
• One-sided non-compliance: only some of those offered treatment take it up
• All controls remain untreated
• Comparing those who take up treatment with those who did not would be
misleading (selection bias, typically positive)
• Can use offer of treatment as an IV for treatment received
• Then the LATE is the effect of treatment on compliers, which here equals the
effect of treatment on the treated (TOT)
• Distinct from intent-to-treat (ITT) estimates which show the
causal effect of offered treatment on those assigned to
treatment
• Whether or not they took it up
• ITT/compliance rate=TOT
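The identity ITT/compliance rate=TOT can be checked on a simulated experiment with one-sided non-compliance. The take-up rule, effect distribution and variable names below are assumptions for illustration; take-up is made to depend on individual gains so that takers are a selected group.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

offer = rng.integers(0, 2, size=n)                 # randomly assigned offer of treatment
gain = rng.normal(1.0, 1.0, size=n)                # individual treatment effect
# People with larger gains are more likely to take up the offer (selection on gains).
takes_if_offered = (gain + rng.normal(size=n)) > 0.5
D = offer * takes_if_offered                       # controls are never treated

y0 = rng.normal(size=n)                            # untreated outcome
y = y0 + gain * D                                  # observed outcome

itt = y[offer == 1].mean() - y[offer == 0].mean()  # effect of being offered treatment
compliance = D[offer == 1].mean()                  # take-up rate among the offered
tot_iv = itt / compliance                          # IV estimate: ITT / compliance rate
tot_true = gain[D == 1].mean()                     # average gain among the treated

print(f"ITT = {itt:.3f}, compliance = {compliance:.3f}, "
      f"ITT/compliance = {tot_iv:.3f}, true TOT = {tot_true:.3f}")
```

Because take-up is below one, the ITT is smaller in magnitude than the TOT; dividing by the compliance rate rescales it to the people who were actually treated.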
JTPA (Job Training Partnership Act) experiment:
Program effects on earnings of disadvantaged
Complicating LATE
• Can add covariates
• Independence assumption becomes a conditional independence
assumption: $[y_{1i},\, y_{0i},\, D_{1i},\, D_{0i}] \perp z_i \mid X_i$
• As good as randomly assigned conditional on covariates
• May be necessary for instrument to be valid
• Can improve precision
• With linear modeling, 2SLS results are a (very) close approximation
of causal relationship of interest
• Can use multiple instruments
• Keeping in mind different instruments generate different compliers
Extending to an Average Causal
Response Model
• Consider now the case where treatment is not just a dummy
• Example: years of schooling: Ysi=fi(s)
• There are s different unit causal effects: ysi-ys-1,i
• Linear causal model assumes these are all the same
• Assuming independence, exclusion, first stage, and monotonicity, 2SLS
generates a weighted average of unit causal effects
• Based on compliers over the range of si (those driven by z from a treatment
intensity less than s to at least s)
Common 2SLS mistakes: Manual 2SLS
• Software packages have 2SLS built in, so it is best to use
the built-in functions as they help avoid some errors
• If you do “manually” compute two stages, need to make
sure to:
• Adjust the standard errors for the two-stage nature of the estimates
• The second-stage OLS residual variance wrongly includes ρ(si - ŝi): the coefficient
times the difference between observed and predicted si
• Use the same covariates (X) in the first and second stages
• Failing to do so can create inconsistency in the second stage
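The standard error adjustment can be sketched as follows, assuming homoskedastic errors and a single instrument with no other covariates. The data and names are illustrative assumptions; the only difference between the two sets of standard errors is whether the residuals are computed with the fitted or the actual endogenous regressor.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000

# Illustrative data with true rho = 1.0 and an omitted "ability" term.
ability = rng.normal(size=n)
z = rng.integers(0, 2, size=n).astype(float)
s = z + ability + rng.normal(size=n)
y = 1.0 * s + 0.5 * ability + rng.normal(size=n)

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)
Z = np.column_stack([const, z])
s_hat = Z @ ols(Z, s)                              # first-stage fitted values

X_hat = np.column_stack([const, s_hat])            # second-stage regressors
X_act = np.column_stack([const, s])                # same, with the actual s
beta = ols(X_hat, y)                               # 2SLS point estimates

k = X_hat.shape[1]
bread = np.linalg.inv(X_hat.T @ X_hat)

# Manual second stage: residuals use s_hat, so they contain rho * (s - s_hat).
resid_manual = y - X_hat @ beta
# Correct 2SLS residuals: same coefficients, but applied to the actual s.
resid_2sls = y - X_act @ beta

se_manual = np.sqrt(resid_manual @ resid_manual / (n - k) * np.diag(bread))
se_2sls = np.sqrt(resid_2sls @ resid_2sls / (n - k) * np.diag(bread))

print(f"rho_hat = {beta[1]:.3f}, manual SE = {se_manual[1]:.4f}, 2SLS SE = {se_2sls[1]:.4f}")
```

In this design the manual standard error is too large; in other designs it can be too small, so it should not be read as conservative.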
2SLS with small samples
• 2SLS is consistent (as the sample becomes large) but is
biased in small samples
• 2SLS estimates may be systematically wrong
• 2SLS is most biased when instruments are weak and when
there are many instruments
• Biased towards OLS
• Essentially because the first stage is estimated, and the estimate is noisier the
weaker the instruments
• Most concerned about this with small samples, weak
instruments, many instruments
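A small Monte Carlo sketch of the bias towards OLS. The design is an assumption chosen to make the problem visible: 100 observations, 20 weak instruments, a true ρ of zero, and structural and first-stage errors with correlation 0.8.

```python
import numpy as np

rng = np.random.default_rng(3)
n, K, reps = 100, 20, 500        # small sample, many weak instruments
rho_true = 0.0
pi = np.full(K, 0.1)             # weak first-stage coefficients

ols_est, tsls_est = [], []
for _ in range(reps):
    Z = rng.normal(size=(n, K))
    v = rng.normal(size=n)                       # first-stage error
    u = 0.8 * v + 0.6 * rng.normal(size=n)       # structural error, corr(u, v) = 0.8
    s = Z @ pi + v
    y = rho_true * s + u

    # All variables have mean zero by construction, so no constant is included.
    s_hat = Z @ np.linalg.lstsq(Z, s, rcond=None)[0]
    ols_est.append((s @ y) / (s @ s))
    tsls_est.append((s_hat @ y) / (s_hat @ s_hat))

mean_ols = float(np.mean(ols_est))
mean_tsls = float(np.mean(tsls_est))
print(f"true rho = {rho_true}, mean OLS = {mean_ols:.3f}, mean 2SLS = {mean_tsls:.3f}")
```

The average 2SLS estimate lands between the truth and the OLS estimate: the fitted values from an overfitted first stage pick up part of the endogenous variation in s.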
Making the case for your instrument
• 1. Always report the first stage
• Argue for why it makes sense (signs, magnitude)
• 2. Report your F-statistic on the instrument
• Bigger is better, need above 10 as a rule of thumb
• 3. If you have multiple instruments, use the best one for just-
identified estimates and present those
• 4. Use limited information maximum likelihood (LIML) for over-identified
models
• Less precise but less biased
• Compare results
• 5. Check model in the reduced-form regression of y on
instruments
• Unbiased since OLS
• Want to see causal relation of some size in reduced form
Common 2SLS mistakes: Forbidden
Regression
• The 2SLS models we’ve been talking about have been using
linear functional forms
• That is, using OLS in the first stage even when the endogenous regressor is a
nonlinear (e.g., dummy) variable
• Should we use a nonlinear first stage instead?
• NO! Forbidden regression
• The forbidden regression plugs predicted values of the endogenous regressor
from a nonlinear first stage (e.g., probit) into the second stage
• Only OLS creates first-stage residuals that are uncorrelated with
predicted values and covariates
• Using nonlinear fitted values as instruments means identifying off the
first stage nonlinearities
• Nonlinear second stage can also be problematic
Dealing with limited dependent
variables
• Limited dependent variables (LDVs, binary, categorical,
etc.) typically assume a latent linear index
• Require functional form assumptions
• Often 2SLS is still the best way to go (or at least should be shown
as one model)
• May be able to make the case for nonlinear form like
bivariate probit. Example: Have a third child? Depends on
zi (sex of first two kids)
• First stage: $D_i = 1[X_i'\gamma_0^* + \gamma_1^* z_i > \upsilon_i]$
• Second stage (Employment status): $Y_i = 1[X_i'\beta_0^* + \beta_1^* D_i > \varepsilon_i]$
• Problem would be correlation between error terms
• Estimate bivariate probit with maximum likelihood
• Generate average causal effects
Control Function Approaches
Control functions
• Although historically the term has been used in a variety of
ways, most modern applications of the term control function are
instrumental variable approaches (Wooldridge 2015)
• There is an endogenous explanatory variable
• Control function approaches use the exogenous variation from
an excluded instrument to generate variation in the residuals
from the reduced form (first stage)
• The residuals are the control functions—included in the second stage
with the endogenous variables
• Advantage is primarily in dealing with nonlinear models (Terza,
Basu & Rathouz 2008)
• Also allows for tests of the nature of selection
Control function equations
• Assume we are interested in the effect of endogenous
variable y2 on outcome y1 and have a vector z of
exogenous variables, including some instrument z2
• Structural equation: $y_1 = z_1\delta_1 + \gamma_1 y_2 + u_1$, with $E[z_j' u_1] = 0$
• Reduced form (first stage): $y_2 = z_1\pi_{21} + z_2\pi_{22} + \nu_2$, with $E[z_j'\nu_2] = 0$
• Essential problem is same as for 2SLS. Solution is
different, and based on the linear relationship between the
structural and reduced form error: $u_1 = \rho_1\nu_2 + e_1$, with $E[\nu_2 e_1] = 0$
• e1 uncorrelated with y2
• Can then estimate: $y_1 = z_1\delta_1 + \gamma_1 y_2 + \rho_1\nu_2 + e_1$
• Essentially “control for” endogeneity of y2
Implementing two-step control
function
• Regress yi2 on zi
• Predict OLS residuals $\hat{\nu}_{i2}$
• Run OLS regression of yi1 on zi1, yi2, $\hat{\nu}_{i2}$
• Generates $\hat{\delta}_1, \hat{\gamma}_1, \hat{\rho}_1$
• Essentially keeps residuals and actual endogenous variable,
whereas 2SLS uses predicted value of endogenous variable
• Bootstrap standard errors
• Can undertake heteroscedasticity robust Hausman testing
of endogeneity, $H_0\!: \rho_1 = 0$
• Still relies on (strong) instrument for identification
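The two-step procedure can be sketched on simulated data. The coefficient values and variable names are assumptions for illustration, and the bootstrap for standard errors is left out. In this fully linear case the control function coefficient on y2 coincides numerically with the 2SLS coefficient, which the sketch checks.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 50_000

# Illustrative data following the control function equations.
z1 = rng.normal(size=n)                  # included exogenous variable
z2 = rng.normal(size=n)                  # excluded instrument
v2 = rng.normal(size=n)                  # reduced-form error
e1 = rng.normal(size=n)
u1 = 0.7 * v2 + e1                       # structural error, rho_1 = 0.7
y2 = 0.5 * z1 + 1.0 * z2 + v2            # endogenous regressor
y1 = 1.0 + 0.3 * z1 + 2.0 * y2 + u1      # outcome, gamma_1 = 2.0

def ols(X, y):
    return np.linalg.lstsq(X, y, rcond=None)[0]

const = np.ones(n)

# Step 1: regress y2 on all exogenous variables and keep the residuals.
Z = np.column_stack([const, z1, z2])
v2_hat = y2 - Z @ ols(Z, y2)

# Step 2: OLS of y1 on z1, the actual y2, and the first-stage residuals.
cf = ols(np.column_stack([const, z1, y2, v2_hat]), y1)
gamma1_cf, rho1_cf = cf[2], cf[3]

# For comparison: 2SLS uses the fitted value of y2 instead.
y2_hat = y2 - v2_hat
gamma1_2sls = ols(np.column_stack([const, z1, y2_hat]), y1)[2]

print(f"gamma_1: CF = {gamma1_cf:.4f}, 2SLS = {gamma1_2sls:.4f}; rho_1 (CF) = {rho1_cf:.4f}")
```

A coefficient on the residual that is far from zero is evidence that y2 is endogenous; the usual second-step standard errors ignore that the residual is estimated, which is why the slide recommends bootstrapping both steps together.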

More Related Content

DOCX
Dummy variable
PPT
Eco Basic 1 8
PPTX
Probit and logit model
PPTX
Dummy variable model
PPTX
Multinomial Logistic Regression Analysis
PDF
Potential Solutions to the Fundamental Problem of Causal Inference: An Overview
PPTX
The probit model
PPTX
regression assumption by Ammara Aftab
Dummy variable
Eco Basic 1 8
Probit and logit model
Dummy variable model
Multinomial Logistic Regression Analysis
Potential Solutions to the Fundamental Problem of Causal Inference: An Overview
The probit model
regression assumption by Ammara Aftab

What's hot (20)

PPTX
Supply response models
PDF
Multicollinearity1
PPTX
Introduction to time series.pptx
PPTX
Regression analysis.
PDF
Normality tests
PPT
Autocorrelation- Concept, Causes and Consequences
PPT
Simple Linier Regression
PPTX
Autocorrelation
PPTX
Heteroscedasticity
PPT
Ch12 slides
PDF
Introduction to Generalized Linear Models
PPT
Cointegration and error correction model
PPT
Binomial Distribution
PDF
Propensity Score Matching Methods
PPTX
Ramsey-Cass-Koopmans model.pptx
PPTX
Cob-Web model 11-3-15.pptx
PPT
Econometrics ch2
PPT
Ch07
PPTX
Chapter 5.pptx
PPT
Ch04
Supply response models
Multicollinearity1
Introduction to time series.pptx
Regression analysis.
Normality tests
Autocorrelation- Concept, Causes and Consequences
Simple Linier Regression
Autocorrelation
Heteroscedasticity
Ch12 slides
Introduction to Generalized Linear Models
Cointegration and error correction model
Binomial Distribution
Propensity Score Matching Methods
Ramsey-Cass-Koopmans model.pptx
Cob-Web model 11-3-15.pptx
Econometrics ch2
Ch07
Chapter 5.pptx
Ch04
Ad

Viewers also liked (20)

PDF
Difference-in-Difference Methods
PDF
Panel Data Models
PDF
Regression Discontinuity Method
PDF
Causal Inference and Program Evaluation
PDF
Overview and Objectives of the Workshop
PDF
Overview of Mixed Models
PDF
Multilevel Binary Logistic Regression
PPTX
How do we do research in economics
PDF
Lessons for Democratic Transition in the Arab World
PDF
How to make democratic accountability work better for development?
PDF
Democracy, Elections, and Development
PDF
Forms of Democracy and Development
PDF
Similarities and differencies in values in the MENA-region
PDF
The Formation of the Youth's Gender Role Attitudes
PDF
Empirical Applications of Collective Household Labour Supply Models in Iraq
PDF
The use of opinion polls data in the Arab Human Development Report 2016
PDF
Diaspora Networks as a Bridge between Civilizations
PDF
Veil Preference in the Middle East and North Africa
PDF
Gender Equality Support in the Arab World Revisited
PDF
Correspondence Studies on Gender, Ethnicity and Religiosity Discrimination in...
Difference-in-Difference Methods
Panel Data Models
Regression Discontinuity Method
Causal Inference and Program Evaluation
Overview and Objectives of the Workshop
Overview of Mixed Models
Multilevel Binary Logistic Regression
How do we do research in economics
Lessons for Democratic Transition in the Arab World
How to make democratic accountability work better for development?
Democracy, Elections, and Development
Forms of Democracy and Development
Similarities and differencies in values in the MENA-region
The Formation of the Youth's Gender Role Attitudes
Empirical Applications of Collective Household Labour Supply Models in Iraq
The use of opinion polls data in the Arab Human Development Report 2016
Diaspora Networks as a Bridge between Civilizations
Veil Preference in the Middle East and North Africa
Gender Equality Support in the Arab World Revisited
Correspondence Studies on Gender, Ethnicity and Religiosity Discrimination in...
Ad

Similar to Instrumental Variables and Control Functions (20)

PDF
Classification and pattern recognition.pdf
PDF
Kaushik_Singh_Chakravarty_Rewards_Detection_Dishonesty_Experiment_India.pdf
PPT
SEM_for_Dummiesjfdlapppsln0jjjajaklmdbv.ppt
PPT
2013-02-22_sudano_perzynski_sem_1.ppt
PPT
Applied Structural Equation Modeling for Dummies by Dummies PPT by Sudano and...
PPTX
Ee eee
PPTX
This is a discussion in Practical Research 2 in hypothesis testing.
PDF
Lekcija 1 - Uvod.pdf
PPT
SAMPLE_AND_OTHER.ppt
PDF
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
PPTX
Statistics
PPTX
Bradford mvsu fall 2012 lecture 3 methods
PPTX
Ipac 2014
PPT
A well-defined research question is the cornerstone of any successful investi...
PDF
Alexis Diamond - quasi experiments
PPT
Information Retrieval 08
PDF
Lecture 1.pdf
PPTX
Ethics_Seminar_Uday.pptx
PDF
Differences-in-Differences
PPT
What is research
Classification and pattern recognition.pdf
Kaushik_Singh_Chakravarty_Rewards_Detection_Dishonesty_Experiment_India.pdf
SEM_for_Dummiesjfdlapppsln0jjjajaklmdbv.ppt
2013-02-22_sudano_perzynski_sem_1.ppt
Applied Structural Equation Modeling for Dummies by Dummies PPT by Sudano and...
Ee eee
This is a discussion in Practical Research 2 in hypothesis testing.
Lekcija 1 - Uvod.pdf
SAMPLE_AND_OTHER.ppt
Causal Inference Opening Workshop - Bracketing Bounds for Differences-in-Diff...
Statistics
Bradford mvsu fall 2012 lecture 3 methods
Ipac 2014
A well-defined research question is the cornerstone of any successful investi...
Alexis Diamond - quasi experiments
Information Retrieval 08
Lecture 1.pdf
Ethics_Seminar_Uday.pptx
Differences-in-Differences
What is research

More from Economic Research Forum (20)

PPTX
Session 4 farhad mehran, single most data gaps
PDF
Session 3 mahdi ben jelloul, microsimulation for policy evaluation
PDF
Session 3 m.a. marouani, structual change, skills demand and job quality
PPTX
Session 3 ishac diwn, bridging mirco and macro appraoches
PDF
Session 3 asif islam, jobs flagship report
PPTX
Session 2 yemen hlel, insights from tunisia
PPT
Session 2 samia satti, insights from sudan
PPTX
Session 2 mona amer, insights from egypt
PPTX
Session 2 ali souag, insights from algeria
PPTX
Session 2 abdel rahmen el lahga, insights from tunisia
PPTX
Session 1 ragui assaad, moving beyond the unemployment rate
PPTX
Session 1 luca fedi, towards a research agenda
PDF
من البيانات الى السياسات : مبادرة إتاحة البيانات المنسقة
PPTX
The Future of Jobs is Facing the Biggest Policy Induced Price Distortion in H...
PPTX
Job- Creating Growth in the Emerging Global Economy
PPTX
The Role of Knowledge in the Process of Innovation in the New Global Economy:...
PPTX
Rediscovering Industrial Policy for the 21st Century: Where to Start?
PPTX
How the Rise of the Intangibles Economy is Disrupting Work in Africa
PPTX
On Ideas and Economic Policy: A Survey of MENA Economists
PPTX
Future Research Directions for ERF
Session 4 farhad mehran, single most data gaps
Session 3 mahdi ben jelloul, microsimulation for policy evaluation
Session 3 m.a. marouani, structual change, skills demand and job quality
Session 3 ishac diwn, bridging mirco and macro appraoches
Session 3 asif islam, jobs flagship report
Session 2 yemen hlel, insights from tunisia
Session 2 samia satti, insights from sudan
Session 2 mona amer, insights from egypt
Session 2 ali souag, insights from algeria
Session 2 abdel rahmen el lahga, insights from tunisia
Session 1 ragui assaad, moving beyond the unemployment rate
Session 1 luca fedi, towards a research agenda
من البيانات الى السياسات : مبادرة إتاحة البيانات المنسقة
The Future of Jobs is Facing the Biggest Policy Induced Price Distortion in H...
Job- Creating Growth in the Emerging Global Economy
The Role of Knowledge in the Process of Innovation in the New Global Economy:...
Rediscovering Industrial Policy for the 21st Century: Where to Start?
How the Rise of the Intangibles Economy is Disrupting Work in Africa
On Ideas and Economic Policy: A Survey of MENA Economists
Future Research Directions for ERF

Recently uploaded (20)

PDF
Environmental Management Basics 2025 for BDOs WBCS by Samanjit Sen Gupta.pdf
PDF
It Helpdesk Solutions - ArcLight Group
PDF
Item # 4 -- 328 Albany St. compt. review
PPT
generalgeologygroundwaterchapt11-181117073208.ppt
PDF
PPT - Primary Rules of Interpretation (1).pdf
PDF
26.1.2025 venugopal K Awarded with commendation certificate.pdf
PDF
PPT Items # 6&7 - 900 Cambridge Oval Right-of-Way
PDF
The Role of FPOs in Advancing Rural Agriculture in India
PDF
Items # 6&7 - 900 Cambridge Oval Right-of-Way
DOCX
Alexistogel: Solusi Tepat untuk Anda yang Cari Bandar Toto Macau Resmi
PPTX
Nur Shakila Assesmentlwemkf;m;mwee f.pptx
PPTX
The DFARS - Part 250 - Extraordinary Contractual Actions
PDF
The Detrimental Impacts of Hydraulic Fracturing for Oil and Gas_ A Researched...
PDF
Item # 5 - 5307 Broadway St final review
PDF
Population Estimates 2025 Regional Snapshot 08.11.25
PPTX
Omnibus rules on leave administration.pptx
PPTX
26.1.2025 venugopal K Awarded with commendation certificate.pptx
PPTX
DFARS Part 249 - Termination Of Contracts
PPTX
Weekly Report 17-10-2024_cybersecutity.pptx
DOC
LU毕业证学历认证,赫尔大学毕业证硕士的学历和学位
Environmental Management Basics 2025 for BDOs WBCS by Samanjit Sen Gupta.pdf
It Helpdesk Solutions - ArcLight Group
Item # 4 -- 328 Albany St. compt. review
generalgeologygroundwaterchapt11-181117073208.ppt
PPT - Primary Rules of Interpretation (1).pdf
26.1.2025 venugopal K Awarded with commendation certificate.pdf
PPT Items # 6&7 - 900 Cambridge Oval Right-of-Way
The Role of FPOs in Advancing Rural Agriculture in India
Items # 6&7 - 900 Cambridge Oval Right-of-Way
Alexistogel: Solusi Tepat untuk Anda yang Cari Bandar Toto Macau Resmi
Nur Shakila Assesmentlwemkf;m;mwee f.pptx
The DFARS - Part 250 - Extraordinary Contractual Actions
The Detrimental Impacts of Hydraulic Fracturing for Oil and Gas_ A Researched...
Item # 5 - 5307 Broadway St final review
Population Estimates 2025 Regional Snapshot 08.11.25
Omnibus rules on leave administration.pptx
26.1.2025 venugopal K Awarded with commendation certificate.pptx
DFARS Part 249 - Termination Of Contracts
Weekly Report 17-10-2024_cybersecutity.pptx
LU毕业证学历认证,赫尔大学毕业证硕士的学历和学位

Instrumental Variables and Control Functions

  • 1. Instrumental Variables and Control Functions Day 3, Lecture 1 By Caroline Krafft Training on Applied Micro-Econometrics and Public Policy Evaluation July 25-27, 2016 Economic Research Forum
  • 2. Readings • Primary source: • Angrist, J.D. and J.-S. Pischke (2009). Chapter 4 “Instrumental variables in action: Sometimes you get what you need” Mostly Harmless Econometrics Princeton, NJ: Princeton University Press. • Additional material from: • Wooldridge, J. M. (2015). Control Function Methods in Applied Econometrics. Journal of Human Resources, 50(2), 421–445. • Terza, J. V., Basu, A., & Rathouz, P. J. (2008). Two-Stage Residual Inclusion Estimation: Addressing Endogeneity in Health Econometric Modeling. Journal of Health Economics, 27(3), 531– 543. 2
  • 3. Type II solutions • Have been examining Type I (conditional exogeneity of placement) quasi-experimental solutions • Propensity score matching • Difference-in-difference • Panel data (fixed and random) • Now moving to Type II solutions • Rules or instruments (instrumental variables) determine placement • Instrumental variables or regression discontinuity design 3
  • 4. Instrumental variables • An instrument is something correlated with the causal variable of interest but uncorrelated with any other determinants of the dependent variable • This instrumental variable can be used to solve missing or unknown control variables (omitted variables bias) problems • Instrumental variables can be used in different techniques: • Two-stage least squares • Control functions • IVs also have an important role in: • Correcting for random measurement error in continuous variables • Not categorical or binary variables • Solving simultaneous equations models 4
  • 5. Case study: schooling and wages • Interested in the link between schooling (s) and wages (Y). Let’s assume that the causal relationship is a different function f for each person (i): Ysi=fi(s) • Tells us what individual i earns for any s • A common simplifying assumption is that the functional form is a linear, constant-effects (causal) model: • Ysi=α+ρs+ηi • ηi is other factors that determine potential earnings • Let’s say that one observable factor is ability, Ai • Selection on observables would mean: • ηi=Ai’γ+υi • For the moment, let us assume that only ability correlates ηi and si so that E[siυi]=0 • So if ability is observed, then Ysi=α+ρsi+Ai’γ+υi 5
  • 6. What if ability is not observed? • Want to know ρ in Ysi=α+ρsi+Ai’γ+υi • Do not observe Ai, which is likely to be correlated with si, creating omitted variables bias if simply tried to estimate Ysi=α+ρsi+ηi • Potential solution: use an instrument (zi) that is correlated with the causal variable of interest (si) but uncorrelated with any other determinants of the dependent variable • Cov(zi, ηi)=0 • Referred to as the exclusion restriction: zi could have been excluded from our model of interest 6
  • 7. Two stage least squares • Want to know ρ in ysi=α’Xi+ρsi+Ai’γ+υi (structural equation) • Both yi and si are endogenous variables, Xi exogenous • Have some zi (exogenous variable) meeting exclusion restriction • Can estimate regression of the first stage (si on zi) and regression for reduced form (yi on zi) • First stage, controlling for covariates: • Substituting predicted value of si into structural equation generates two-stage least squares estimates: 7 si = Xi ' p10 +p11zi +x1i ˆsi = Xi ' ˆp10 + ˆp11zi yi =a'Xi +rˆsi +[hi +r(si - ˆsi )]
  • 8. Two Stage Least Squares: A summary • Two stage least squares uses an instrument, the concept of a first stage and a second stage (two steps on previous slide) to get a consistent estimate of ρ • Doing this actually in two stages leads to incorrect standard errors • Typically implemented in a single command in software 8
  • 9. Finding an instrument • Things you’ll need to check in implementing this two-stage approach • First stage is statistically significant (no weak instruments) • Test this and present your test statistics (F-test) • Exclusion restriction (zi only affects yi through si) • To find a good instrument need to understand processes determining the variable of interest (si) • Institutional knowledge is particularly helpful • There may be institutional constraints that can serve as good instruments • Need a very strong case for why instrument only affects outcome through variable of interest • If there are any other Xs that are affected by instrument, would need to control for them to prevent a violation of the exclusion restriction 9
  • 10. Example: Quarter of Birth Instrument • Example (Angrist and Krueger 1991): Children enter school in the calendar year in which they turn 6 • School start age is a function of date of birth • Have to stay until 16 • Different grades when reach drop out age • Compulsory schooling laws and age of entry create a natural experiment where children are compelled to attend school for a varying number of years based on when they are born • Can use quarter of birth as an instrument for schooling • Conceptually excludable: quarter of birth shouldn’t affect ability (or motivation, or family connections, or anything else that affects wages) 10
  • 11. Average Education by Quarter of Birth (First Stage) 11 Men born earlier in the year tend to have less schooling (quit “earlier”)
  • 12. Earnings and quarter of birth 12 Earnings are lower for those who are born in earlier quarters—those with less schooling
  • 13. Example: OLS and 2SLS results 13
  • 14. Multiple instruments • Can use multiple instruments (z1i, z2i, z3i, etc.) in the first stage • If have multiple endogenous variables in second stage, will need multiple instruments • Models are just identified when the # instruments=# endogenous variables • Models are over identified when the # instruments># endogenous variables • This allows for additional testing of the assumptions underlying instruments 14
  • 15. IV with Heterogeneous Potential Outcomes • Our basic two stage least squares (2SLS) model assumed that the causal effect of interest is constant. • For a dummy variable (example: college (1) or no college (0)) this means y1i-y0i=ρ for all i • Homogenous treatment effect • For a multivalued variable (example: years of schooling), this means ysi-ys-1,i=ρ for all i and all s • Linearity and homogenous treatment effect 15
  • 16. Validity and Heterogeneity • Treatment effects are likely to be heterogeneous • A distribution of effects across individuals • Example: Individuals who choose to take up a training program may be those who particularly benefit from it. Expanding the training program to the general population might have different (weaker) effects • Internal validity occurs when the analysis discovers causal effects for the population being studied • Will hold for a good IV study or RCT • Regardless of heterogeneity • External validity occurs when a study can predict effects in different contexts • Can be better assessed when allowing for heterogeneous treatment effects
  • 17. Heterogeneity with a dummy treatment variable • Interested in the effect of some program on the outcome $y_i$, where we capture participation as a dummy, $D_i$ • Denote as $y_i(d, z)$ the potential outcome of $i$ with $D_i = d$ and $z_i = z$ • Denote as $D_{1i}$ $i$’s treatment status when $z_i = 1$ and denote as $D_{0i}$ $i$’s treatment status when $z_i = 0$ • Only one is observed • Observed treatment status is therefore: $D_i = D_{0i} + (D_{1i} - D_{0i}) z_i = \pi_0 + \pi_{1i} z_i + \xi_i$, with $\pi_0 = E[D_{0i}]$, $\pi_{1i} = D_{1i} - D_{0i}$, and $\xi_i = D_{0i} - E[D_{0i}]$ • Average causal effect of $z_i$ on $D_i$ is $E[\pi_{1i}]$
  • 18. Monotonicity assumption • Have model for treatment of: $D_i = \pi_0 + \pi_{1i} z_i + \xi_i$ • For this model to be useful, it has to be the case that monotonicity holds, meaning: $\pi_{1i} \geq 0$ for all $i$, or $\pi_{1i} \leq 0$ for all $i$ • The instrument has to either: • increase participation or have no effect for everyone • or decrease participation or have no effect for everyone • There cannot be some people who are more likely and some less likely to participate because of the instrument
  • 19. Independence Assumption • Have to assume that the instrument is as good as randomly assigned (independent of potential outcomes and treatment assignments): $[y_i(D_{1i}, 1), y_i(D_{0i}, 0), D_{1i}, D_{0i}] \perp z_i$ • This means then that the first stage is causal
  • 20. Exclusion restriction • For the exclusion restriction to hold in the heterogeneous treatment effects and dummy treatment framework, it must be the case that $y_i(d, z)$ is a function only of $d$ • $y_i(d, 0) = y_i(d, 1)$ for $d = 0, 1$ • Exclusion restriction fails if outcome of interest is affected by instrument in some other way than by treatment (program) of interest • Need to have a unique channel for causal effects of instrument • Treatment could still be randomly assigned • Random assignment could lead to other changes in behavior • Example: Those more likely to be drafted into the military stayed in school longer. Draft numbers were by random lottery, but the behavior of remaining in college confounds the estimated impact of military service on wages
  • 21. The LATE Theorem • Given: • A1-Independence: $[y_i(D_{1i}, 1), y_i(D_{0i}, 0), D_{1i}, D_{0i}] \perp z_i$ • A2-Exclusion: $y_i(d, 0) = y_i(d, 1)$ for $d = 0, 1$ • A3-First stage (no weak instruments): $E[D_{1i} - D_{0i}] \neq 0$ • A4-Monotonicity: $D_{1i} - D_{0i} \geq 0 \ \forall i$ or $D_{1i} - D_{0i} \leq 0 \ \forall i$ • In any study with IVs, you need to consider and discuss all of these conditions in your work • Then you can estimate the local average treatment effect (LATE) (for the instrument increasing treatment case): $\frac{E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0]}{E[D_i \mid z_i = 1] - E[D_i \mid z_i = 0]} = E[y_{1i} - y_{0i} \mid D_{1i} > D_{0i}] = E[\rho_i \mid \pi_{1i} > 0]$, where $\rho_i = y_{1i} - y_{0i}$
  • 22. Dividing the sample for LATE • There could be four groups (assume IV increases treatment) • Defiers (ruled out by monotonicity): $D_{0i} = 1$, $D_{1i} = 0$ • Compliers: $D_{1i} = 1$, $D_{0i} = 0$ • Affected by the instrument • Always-takers: $D_{1i} = 1$, $D_{0i} = 1$ • Never-takers: $D_{1i} = 0$, $D_{0i} = 0$ • With LATE, identify the effect of the treatment based on the population of compliers • Not informative about effects on never-takers or always-takers • ATT based on always-takers and compliers • ATU based on never-takers and compliers • Compliers will be different (and therefore LATE different) for different instruments
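To make the role of compliers concrete, here is a small constructed population, with no defiers and the instrument balanced within each group, in which the Wald ratio recovers the complier effect and not the population average effect. All group sizes, potential outcomes, and names are hypothetical numbers chosen for illustration.

```python
import numpy as np

# Hypothetical groups: (D0, D1, y0, y1, people per instrument arm)
groups = {
    "complier":     (0, 1, 10.0, 12.0, 50),   # treatment effect = 2
    "always_taker": (1, 1, 10.0, 15.0, 30),   # treatment effect = 5
    "never_taker":  (0, 0,  8.0,  9.0, 20),   # treatment effect = 1
}

rows = []
for d0, d1, y0, y1, count in groups.values():
    for z in (0, 1):
        d = d1 if z == 1 else d0       # observed treatment given the instrument
        y = y1 if d == 1 else y0       # observed outcome given treatment
        rows += [(z, d, y)] * count

z, d, y = np.array(rows).T

reduced_form = y[z == 1].mean() - y[z == 0].mean()   # effect of z on y
first_stage = d[z == 1].mean() - d[z == 0].mean()    # effect of z on D (complier share)
wald = reduced_form / first_stage

print("Reduced form:", reduced_form)
print("First stage: ", first_stage)
print("Wald / LATE: ", wald)
```

The population average effect in this example would be 0.5 × 2 + 0.3 × 5 + 0.2 × 1 = 2.7, whereas the Wald ratio returns the complier effect of 2, since always-takers and never-takers do not change treatment status with the instrument.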
  • 23. IVs in RCTs • Often end up using IVs and LATE in RCTs when: • RCT is a randomly assigned offer of treatment • One-sided non-compliance: only some take up the offer • All controls remain untreated • Comparing those who take up treatment with those who did not would be misleading (selection bias, typically positive) • Can use offer of treatment as an IV for treatment received • Then the LATE is the effect of treatment on compliers, which here is the effect of treatment on the treated (TOT) • Distinct from intent-to-treat (ITT) estimates which show the causal effect of offered treatment on those assigned to treatment • Whether or not they took it up • TOT = ITT / compliance rate
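A worked version of the last relationship, using purely hypothetical numbers (an ITT of 600 in earnings and a 60% take-up rate among those offered treatment). Because no controls are treated, $E[D_i \mid z_i = 0] = 0$ and the first stage is just the take-up rate:

```latex
\text{TOT}
= \frac{E[y_i \mid z_i = 1] - E[y_i \mid z_i = 0]}{E[D_i \mid z_i = 1] - E[D_i \mid z_i = 0]}
= \frac{\text{ITT}}{P(D_i = 1 \mid z_i = 1) - 0}
= \frac{600}{0.60}
= 1000
```

The ITT is smaller than the TOT because it averages the effect over everyone offered treatment, including those who did not take it up.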
  • 24. JTPA (Job Training Partnership Act) experiment: Program effects on earnings of disadvantaged
  • 25. Complicating LATE • Can add covariates • Independence assumption becomes a conditional independence assumption: $[y_{1i}, y_{0i}, D_{1i}, D_{0i}] \perp z_i \mid X_i$ • As good as randomly assigned conditional on covariates • May be necessary for instrument to be valid • Can improve precision • With linear modeling, 2SLS results are a (very) close approximation of causal relationship of interest • Can use multiple instruments • Keeping in mind different instruments generate different compliers
  • 26. Extending to an Average Causal Response Model • Consider now the case where treatment is not just a dummy • Example: years of schooling: $Y_{si} = f_i(s)$ • There are s different unit causal effects: $y_{si} - y_{s-1,i}$ • Linear causal model assumes these are all the same • Assuming independence, exclusion, first stage, and monotonicity, 2SLS generates a weighted average of unit causal effects • Based on compliers over range of $s_i$ (driven by the $z$ from a treatment intensity less than $s$ to at least $s$)
  • 27. Common 2SLS mistakes: Manual 2SLS • Software packages have 2SLS built in, so best to use the built-in functions as they help avoid some errors • If you do “manually” compute two stages, need to make sure to: • Adjust the standard errors for the two-stage nature of the estimates • The second-stage OLS residual variance is wrong: it includes the difference between the observed and predicted endogenous variable times its coefficient • Use the same covariates (X) in the first and second stages • Failing to do so can create inconsistency in the second stage
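The standard-error problem with manual 2SLS can be illustrated with a short sketch. It assumes simulated data, homoskedastic errors, and hypothetical variable names. The point estimates from the manual second stage are fine, but the residuals must be recomputed with the actual endogenous variable, not its first-stage prediction.

```python
import numpy as np

def ols(W, v):
    """OLS coefficients from regressing v on W."""
    return np.linalg.lstsq(W, v, rcond=None)[0]

# Simulated (hypothetical) data with an endogenous regressor s
rng = np.random.default_rng(1)
n = 20000
x = rng.normal(size=n)                 # exogenous covariate
z = rng.normal(size=n)                 # instrument
ability = rng.normal(size=n)           # unobserved confounder
s = 0.5 * z + 0.3 * x + ability + rng.normal(size=n)
y = 1.0 + 0.1 * s + 0.2 * x + ability + rng.normal(size=n)

const = np.ones(n)
Z = np.column_stack([const, x, z])     # first stage: same covariates plus instrument
X = np.column_stack([const, x, s])     # structural regressors

# Manual two stages, using the same covariates (const, x) in both
s_hat = Z @ ols(Z, s)
X_hat = np.column_stack([const, x, s_hat])
beta = ols(X_hat, y)

k = X.shape[1]
bread = np.linalg.inv(X_hat.T @ X_hat)
naive_resid = y - X_hat @ beta         # what a manual second stage reports (wrong)
correct_resid = y - X @ beta           # residuals using the actual s (right)
se_naive = np.sqrt(naive_resid @ naive_resid / (n - k) * np.diag(bread))
se_correct = np.sqrt(correct_resid @ correct_resid / (n - k) * np.diag(bread))

print("2SLS coefficient on s:", beta[2])
print("Naive SE:    ", se_naive[2])
print("Corrected SE:", se_correct[2])
```

In this particular simulation the naive standard error comes out too large, but in general the error can go in either direction, which is one reason to rely on built-in 2SLS routines.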
  • 28. 2SLS with small samples • 2SLS is consistent (as the sample becomes large) but is biased in small samples • 2SLS estimates may be systematically wrong • 2SLS is most biased when instruments are weak and when there are many instruments • Biased towards OLS • Essentially because the first stage is estimated, and is noisier the weaker the instruments • Most concerned about this with small samples, weak instruments, many instruments
  • 29. Making the case for your instrument • 1. Always report the first stage • Argue for why it makes sense (signs, magnitude) • 2. Report your F-statistic on the instrument • Bigger is better, need above 10 as a rule of thumb • 3. If you have multiple instruments, use the best one for just-identified estimates and present those • 4. Use limited information maximum likelihood (LIML) for over-identified models • Less precise but less biased • Compare results • 5. Check model in the reduced-form regression of y on instruments • Unbiased since OLS • Want to see causal relation of some size in reduced form
  • 30. Common 2SLS mistakes: Forbidden Regression • The 2SLS models we’ve been talking about have been using linear functional forms • Using OLS on a nonlinear variable • Should we use a nonlinear first stage instead? • NO! Forbidden regression • The forbidden regression uses a nonlinear first stage (predicted values of endogenous regressor) in the second stage • Only OLS creates first-stage residuals that are uncorrelated with predicted values and covariates • Using nonlinear fitted values as instruments means identifying off the first stage nonlinearities • Nonlinear second stage can also be problematic 30
  • 31. Dealing with limited dependent variables • Limited dependent variables (LDVs, binary, categorical, etc.) typically assume a latent linear index • Require functional form assumptions • Often 2SLS is still the best way to go (or at least should be shown as one model) • May be able to make the case for nonlinear form like bivariate probit. Example: Have a third child? Depends on $z_i$ (sex of first two kids) • First stage: $D_i = 1[X_i'\gamma_0^* + \gamma_1^* z_i > u_i]$ • Second stage (Employment status): $Y_i = 1[X_i'\beta_0^* + \beta_1^* D_i > \varepsilon_i]$ • Problem would be correlation between error terms • Estimate bivariate probit with maximum likelihood • Generate average causal effects
  • 33. Control functions • Although historically the term has been used in a variety of ways, most modern applications of the term control function are instrumental variable approaches (Wooldridge 2015) • There is an endogenous explanatory variable • Control function approaches use the exogenous variation from an excluded instrument to generate variation in the residuals from the reduced form (first stage) • The residuals are the control functions—included in the second stage with the endogenous variables • Advantage is primarily in dealing with nonlinear models (Terza, Basu & Rathouz 2008) • Also allows for tests of the nature of selection 33
  • 34. Control function equations • Assume we are interested in the effect of endogenous variable $y_2$ on outcome $y_1$ and have a vector $z$ of exogenous variables, including some instrument $z_2$: $y_1 = z_1 \delta_1 + \gamma_1 y_2 + u_1$ with $E[z_j' u_1] = 0$, and $y_2 = z_1 \pi_{21} + z_2 \pi_{22} + \nu_2$ with $E[z_j' \nu_2] = 0$ • Essential problem is same as for 2SLS. Solution is different, and based on the linear relationship between the structural and reduced form error: $u_1 = \rho_1 \nu_2 + e_1$ with $E[\nu_2 e_1] = 0$ • $e_1$ uncorrelated with $y_2$ • Can then estimate: $y_1 = z_1 \delta_1 + \gamma_1 y_2 + \rho_1 \nu_2 + e_1$ • Essentially “control for” endogeneity of $y_2$
  • 35. Implementing two-step control function • Regress $y_{i2}$ on $z_i$ • Predict OLS residuals $\hat{\nu}_{i2}$ • Run OLS regression of $y_{i1}$ on $z_{i1}$, $y_{i2}$, $\hat{\nu}_{i2}$ • Generates $\hat{\delta}_1, \hat{\gamma}_1, \hat{\rho}_1$ • Essentially keeps residuals and actual endogenous variable, whereas 2SLS uses predicted value of endogenous variable • Bootstrap standard errors • Can undertake heteroscedasticity robust Hausman testing of endogeneity, $H_0: \rho_1 = 0$ • Still relies on (strong) instrument for identification
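The two steps above can be sketched on simulated data as follows. The data-generating process, coefficient values, and variable names are assumptions made for illustration. In this fully linear setting the control function estimate of the coefficient on the endogenous variable coincides with the 2SLS estimate, which the last lines check numerically.

```python
import numpy as np

def ols(W, v):
    """OLS coefficients from regressing v on W."""
    return np.linalg.lstsq(W, v, rcond=None)[0]

# Simulated (hypothetical) data following the control function equations
rng = np.random.default_rng(2)
n = 20000
z1 = np.column_stack([np.ones(n), rng.normal(size=n)])  # included exogenous variables
z2 = rng.normal(size=n)                                 # excluded instrument
nu2 = rng.normal(size=n)                                # reduced-form error
u1 = 0.8 * nu2 + rng.normal(size=n)                     # structural error, rho_1 = 0.8
y2 = z1 @ np.array([0.0, 0.3]) + 0.5 * z2 + nu2         # endogenous variable
y1 = z1 @ np.array([1.0, 0.2]) + 0.1 * y2 + u1          # outcome, gamma_1 = 0.1

# Step 1: regress y2 on all exogenous variables and keep the residuals
Z = np.column_stack([z1, z2])
nu2_hat = y2 - Z @ ols(Z, y2)

# Step 2: regress y1 on z1, the actual y2, and the first-stage residuals
coef_cf = ols(np.column_stack([z1, y2, nu2_hat]), y1)
delta1_hat, gamma1_hat, rho1_hat = coef_cf[:2], coef_cf[2], coef_cf[3]

# 2SLS for comparison: replace y2 by its first-stage prediction
y2_hat = Z @ ols(Z, y2)
gamma1_2sls = ols(np.column_stack([z1, y2_hat]), y1)[2]

print("Control function gamma_1:", gamma1_hat)
print("2SLS gamma_1:            ", gamma1_2sls)
print("rho_1 (endogeneity term):", rho1_hat)
```

The sketch omits standard errors on purpose: the usual second-step OLS standard errors ignore that the residuals are themselves estimated, so both steps would need to be bootstrapped together.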