Chapter 15
Estimation of Dynamic
Causal Effects
2
Estimation of Dynamic Causal
Effects (SW Chapter 15)
A dynamic causal effect is the effect on Y of a change in X over
time.
For example:
• The effect of an increase in cigarette taxes on cigarette consumption this year, next year, and in 5 years;
• The effect of a change in the Fed Funds rate on inflation this month, in 6 months, and in 1 year;
• The effect of a freeze in Florida on the price of orange juice concentrate in 1 month, 2 months, 3 months, …
3
The Orange Juice Data
(SW Section 15.1)
Data
• Monthly, Jan. 1950 to Dec. 2000 (T = 612)
• Price = price of frozen OJ (a sub-component of the producer price index; US Bureau of Labor Statistics)
• %ChgP = percentage change in price at an annual rate, so $\%ChgP_t = 1200\,\Delta\ln(Price_t)$
• FDD = number of freezing degree-days during the month, recorded in Orlando, FL
  ◦ Example: If November has 2 days with lows below 32°F, one at 30°F and one at 25°F, then $FDD_{Nov} = 2 + 7 = 9$
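The two data definitions above can be sketched in a few lines of Python. This is only an illustration: the function names, the mild 45°F days, and the three price values are made up here and are not part of the SW data set.

```python
import math

def freezing_degree_days(daily_lows_f, freezing_point=32.0):
    """Sum, over days, of the degrees by which the daily low falls below freezing."""
    return sum(max(0.0, freezing_point - low) for low in daily_lows_f)

def annualized_pct_change(prices):
    """1200 * (change in log price): one value per month after the first."""
    return [1200.0 * (math.log(p1) - math.log(p0))
            for p0, p1 in zip(prices, prices[1:])]

# Slide example: two freezing days with lows of 30 and 25; other days are mild.
november_lows = [45.0] * 28 + [30.0, 25.0]
fdd_nov = freezing_degree_days(november_lows)
print(fdd_nov)  # 9.0, i.e. (32 - 30) + (32 - 25)

# Hypothetical price index values for three consecutive months.
pct = annualized_pct_change([100.0, 101.0, 100.5])
print([round(p, 2) for p in pct])
```

The factor 1200 combines the conversion to percent (100) with the conversion from a monthly to an annual rate (12).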
4
5
Initial OJ regression
$\widehat{\%ChgP}_t = -0.40 + 0.47\,FDD_t$
(standard errors: 0.22 for the intercept, 0.13 for the slope)
• Statistically significant positive relation
• More freezing degree-days ⇒ price increase
• Standard errors are heteroskedasticity- and autocorrelation-consistent (HAC) SEs; more on this later
• But what is the effect of FDD over time?
6
Dynamic Causal Effects
(SW Section 15.2)
Example: What is the effect of fertilizer on tomato yield?
An ideal randomized controlled experiment
• Fertilize some plots, not others (random assignment)
• Measure yield over time, over repeated harvests, to estimate the causal effect of fertilizer on:
  ◦ Yield in year 1 of the experiment
  ◦ Yield in year 2, etc.
• The result (in a large experiment) is the causal effect of fertilizer on yield k years later.
7
Dynamic causal effects, ctd.
In time series applications, we can’t conduct this ideal randomized controlled experiment:
• We only have one US OJ market…
• We can’t randomly assign FDD to different replicates of the US OJ market (?)
• We can’t measure the average (across “subjects”) outcome at different times, because there is only one “subject”
• So we can’t estimate the causal effect at different times using the differences estimator
8
Dynamic causal effects, ctd.
An alternative thought experiment:
• Randomly give the same subject different treatments ($FDD_t$) at different times
• Measure the outcome variable ($\%ChgP_t$)
• The “population” of subjects consists of the same subject (the OJ market) but at different dates
• If the “different subjects” are drawn from the same distribution, that is, if $Y_t$, $X_t$ are stationary, then the dynamic causal effect can be deduced by OLS regression of $Y_t$ on $X_t$ and lagged values of $X_t$.
• This estimator (regression of $Y_t$ on $X_t$ and lags of $X_t$) is called the distributed lag estimator.
9
Dynamic causal effects and the
distributed lag model
The distributed lag model is:
$$Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + \dots + \beta_{r+1} X_{t-r} + u_t$$
• $\beta_1$ = impact effect of a change in X = effect of a change in $X_t$ on $Y_t$, holding past values of X constant
• $\beta_2$ = 1-period dynamic multiplier = effect of a change in $X_{t-1}$ on $Y_t$, holding constant $X_t, X_{t-2}, X_{t-3}, \dots$
• $\beta_3$ = 2-period dynamic multiplier (etc.) = effect of a change in $X_{t-2}$ on $Y_t$, holding constant $X_t, X_{t-1}, X_{t-3}, \dots$
• Cumulative dynamic multipliers
  ◦ Ex: the 2-period cumulative dynamic multiplier $= \beta_1 + \beta_2 + \beta_3$
10
Exogeneity in time series regression
Exogeneity (past and present)
X is exogenous if $E(u_t \mid X_t, X_{t-1}, X_{t-2}, \dots) = 0$.
Strict Exogeneity (past, present, and future)
X is strictly exogenous if $E(u_t \mid \dots, X_{t+1}, X_t, X_{t-1}, \dots) = 0$.
• Strict exogeneity implies exogeneity
• For now we suppose that X is exogenous; we’ll return (briefly) to the case of strict exogeneity later.
• If X is exogenous, then we can use OLS to estimate the dynamic causal effect on Y of a change in X…
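Why strict exogeneity implies exogeneity can be seen from the law of iterated expectations. The short derivation below is a sketch that is not on the original slides; the shorthand sets $\mathcal{P}_t$ (past and present X) and $\mathcal{F}$ (all X) are introduced here only for illustration.

```latex
\begin{aligned}
\mathcal{P}_t &= \{X_t, X_{t-1}, X_{t-2}, \dots\}, \qquad
\mathcal{F} = \{\dots, X_{t+1}, X_t, X_{t-1}, \dots\}, \qquad
\mathcal{P}_t \subset \mathcal{F} \\
E(u_t \mid \mathcal{P}_t)
  &= E\big[\, E(u_t \mid \mathcal{F}) \,\big|\, \mathcal{P}_t \big]
  && \text{(law of iterated expectations)} \\
  &= E[\, 0 \mid \mathcal{P}_t \,] = 0
  && \text{(strict exogeneity)}
\end{aligned}
```

The converse does not hold: conditioning on the past and present of X says nothing about whether $u_t$ is related to future values of X.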
11
Estimation of Dynamic Causal Effects
with Exogenous Regressors
(SW Section 15.3)
Distributed Lag Model:
$$Y_t = \beta_0 + \beta_1 X_t + \dots + \beta_{r+1} X_{t-r} + u_t$$
The Distributed Lag Model Assumptions
1. $E(u_t \mid X_t, X_{t-1}, X_{t-2}, \dots) = 0$ (X is exogenous)
2. (a) Y and X have stationary distributions;
   (b) $(Y_t, X_t)$ and $(Y_{t-j}, X_{t-j})$ become independent as j gets large
3. Y and X have eight nonzero finite moments
4. There is no perfect multicollinearity.
12
The distributed lag model, ctd.
• Assumptions 1 and 4 are familiar
• Assumption 3 is familiar, except for eight (not four) finite moments; this has to do with HAC estimators
• Assumption 2 is different: before, it was that $(X_i, Y_i)$ are i.i.d.; this now becomes more complicated.
2. (a) Y and X have stationary distributions;
  ◦ If so, the coefficients don’t change within the sample (internal validity);
  ◦ and the results can be extrapolated outside the sample (external validity).
  ◦ This is the time series counterpart of the “identically distributed” part of i.i.d.
13
The distributed lag model, ctd.
2. (b) $(Y_t, X_t)$ and $(Y_{t-j}, X_{t-j})$ become independent as j gets large
• Intuitively, this says that we have separate experiments for time periods that are widely separated.
• In cross-sectional data, we assumed that Y and X were i.i.d., a consequence of simple random sampling; this led to the CLT.
• A version of the CLT holds for time series variables that become independent as their temporal separation increases; assumption 2(b) is the time series counterpart of the “independently distributed” part of i.i.d.
14
Under the Distributed Lag Model Assumptions:
• OLS yields consistent* estimators of $\beta_1, \beta_2, \dots, \beta_{r+1}$ (the dynamic multipliers) (*consistent but possibly biased!)
• The sampling distribution of $\hat\beta_1$, etc., is approximately normal in large samples
• BUT the formula for the variance of this sampling distribution is not the usual one from cross-sectional (i.i.d.) data, because $u_t$ is not i.i.d.: $u_t$ can be serially correlated!
• This means that the usual OLS standard errors (usual STATA printout) are wrong!
• We need to use, instead, SEs that are robust to autocorrelation as well as to heteroskedasticity…
15
Heteroskedasticity- and Autocorrelation-Consistent (HAC) Standard Errors
(SW Section 15.4)
• When $u_t$ is serially correlated, the variance of the sampling distribution of the OLS estimator differs from the usual cross-sectional formula.
• Consequently, we need to use a different formula for the standard errors.
• This is easy to do using STATA and most (but not all) other statistical software.
• We encountered this before in panel data, where we solved the problem using cluster(state).
  ◦ The “cluster” approach required n > 1, so clustered standard errors are only for panel data
  ◦ In TS data, n = 1, so we need a different method…
16
HAC standard errors, ctd.
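$$Y_t = \beta_0 + \beta_1 X_t + u_t$$
The OLS estimator: From SW, App. 4.3,
$$\hat\beta_1 - \beta_1
  = \frac{\frac{1}{T}\sum_{t=1}^{T}(X_t - \bar X)\,u_t}{\frac{1}{T}\sum_{t=1}^{T}(X_t - \bar X)^2}
  \cong \frac{\frac{1}{T}\sum_{t=1}^{T} v_t}{\sigma_X^2}
  \quad \text{(in large samples)}$$
where $v_t = (X_t - \bar X)\,u_t$.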
17
HAC standard errors, ctd.
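Thus, in large samples,
$$\operatorname{var}(\hat\beta_1)
  = \operatorname{var}\!\left(\frac{1}{T}\sum_{t=1}^{T} v_t\right) \Big/ (\sigma_X^2)^2
  = \frac{1}{T^2}\sum_{t=1}^{T}\sum_{s=1}^{T}\operatorname{cov}(v_t, v_s) \Big/ (\sigma_X^2)^2
  \quad \text{(still SW App. 4.3)}$$
In i.i.d. cross-sectional data, $\operatorname{cov}(v_t, v_s) = 0$ for $t \ne s$, so
$$\operatorname{var}(\hat\beta_1)
  = \frac{1}{T^2}\sum_{t=1}^{T}\operatorname{var}(v_t) \Big/ (\sigma_X^2)^2
  = \frac{\sigma_v^2}{T\,(\sigma_X^2)^2}$$
This is our usual cross-sectional result (SW App. 4.3).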
18
HAC standard errors, ctd.
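But in time series data, $\operatorname{cov}(v_t, v_s) \ne 0$ in general.
Consider T = 2:
$$\begin{aligned}
\operatorname{var}\!\left(\frac{1}{T}\sum_{t=1}^{T} v_t\right)
  &= \operatorname{var}\big[\tfrac12 (v_1 + v_2)\big] \\
  &= \tfrac14\big[\operatorname{var}(v_1) + \operatorname{var}(v_2) + 2\operatorname{cov}(v_1, v_2)\big] \\
  &= \tfrac12\sigma_v^2 + \tfrac12\rho_1\sigma_v^2 \qquad (\rho_1 = \operatorname{corr}(v_1, v_2)) \\
  &= \tfrac12\sigma_v^2 \times f_2, \quad \text{where } f_2 = (1 + \rho_1)
\end{aligned}$$
• In i.i.d. data, $\rho_1 = 0$ so $f_2 = 1$, yielding the usual formula
• In time series data, if $\rho_1 \ne 0$ then $\operatorname{var}(\hat\beta_1)$ is not given by the usual formula.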
19
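Expression for var($\hat\beta_1$), general T
$$\operatorname{var}\!\left(\frac{1}{T}\sum_{t=1}^{T} v_t\right) = \frac{\sigma_v^2}{T} \times f_T$$
so
$$\operatorname{var}(\hat\beta_1) = \left[\frac{1}{T}\,\frac{\sigma_v^2}{(\sigma_X^2)^2}\right] \times f_T$$
where
$$f_T = 1 + 2\sum_{j=1}^{T-1}\left(\frac{T-j}{T}\right)\rho_j \quad \text{(SW, eq. (15.13))}$$
• Conventional OLS SEs are wrong when $u_t$ is serially correlated (STATA printout is wrong).
• The conventional OLS variance formula is off by the factor $f_T$ (so the SEs are off by $\sqrt{f_T}$)
• We need to use a different SE formula!!!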
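To get a feel for the size of $f_T$, the formula in SW eq. (15.13) can be evaluated directly. The sketch below is illustrative only: the function name is made up, and the geometric autocorrelations $\rho_j = \phi^j$ with $\phi = 0.5$ are an assumed example, not an estimate from the OJ data.

```python
def variance_inflation_factor(rho, T):
    """f_T = 1 + 2 * sum_{j=1}^{T-1} ((T - j) / T) * rho_j.

    rho is a function returning the j-th autocorrelation of v_t.
    """
    return 1.0 + 2.0 * sum((T - j) / T * rho(j) for j in range(1, T))

# No serial correlation: f_T = 1, so the usual formula applies.
f_iid = variance_inflation_factor(lambda j: 0.0, T=612)

# T = 2 with rho_1 = 0.5 reproduces the special case f_2 = 1 + rho_1.
f_two = variance_inflation_factor(lambda j: 0.5, T=2)

# Assumed geometric autocorrelations rho_j = phi**j (illustration only).
phi = 0.5
f_ar1 = variance_inflation_factor(lambda j: phi ** j, T=612)

print(f_iid, f_two)            # 1.0 1.5
print(f_ar1, f_ar1 ** 0.5)     # variance factor and the implied SE factor
```

With positively autocorrelated $v_t$ the factor exceeds 1, so the conventional formula understates the variance of $\hat\beta_1$.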
20
HAC Standard Errors
• Conventional OLS SEs (heteroskedasticity-robust or not) are wrong when $u_t$ is autocorrelated
• So, we need a new formula that produces SEs that are robust to autocorrelation as well as heteroskedasticity
We need Heteroskedasticity- and Autocorrelation-Consistent (HAC) standard errors
• If we knew the factor $f_T$, we could just make the adjustment.
  ◦ In panel data, the factor $f_T$ is (implicitly) estimated by using “cluster”, but “cluster” requires n large.
  ◦ In time series data, we need a different formula: we must estimate $f_T$ explicitly
21
HAC SEs, ctd.
$$\operatorname{var}(\hat\beta_1) = \left[\frac{1}{T}\,\frac{\sigma_v^2}{(\sigma_X^2)^2}\right] \times f_T,
  \quad \text{where } f_T = 1 + 2\sum_{j=1}^{T-1}\left(\frac{T-j}{T}\right)\rho_j$$
The most commonly used estimator of $f_T$ is:
$$\hat f_T = 1 + 2\sum_{j=1}^{m-1}\left(\frac{m-j}{m}\right)\tilde\rho_j \quad \text{(Newey-West)}$$
• $\tilde\rho_j$ is an estimator of $\rho_j$
• This is the “Newey-West” HAC SE estimator
• m is called the truncation parameter
• Why not just set m = T?
• Then how should you choose m?
  ◦ Use the Goldilocks method
  ◦ Or, use the rule of thumb, $m = 0.75\,T^{1/3}$
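The Newey-West formula above can be sketched for the one-regressor case in plain Python. This is a minimal illustration under stated assumptions: the function name is hypothetical, the data are simulated with AR(1) regressor and errors (coefficient 0.7 chosen arbitrarily), and $\tilde\rho_j$ is taken to be the sample autocorrelation of $\hat v_t = (X_t - \bar X)\hat u_t$. Statistical packages may use slightly different weights or degrees-of-freedom corrections, so their output need not match this sketch exactly.

```python
import random

def newey_west_se(x, y, m):
    """HAC standard error of the OLS slope in y = b0 + b1*x + u, using
    f_hat = 1 + 2 * sum_{j=1}^{m-1} ((m - j) / m) * rho_tilde_j."""
    T = len(x)
    xbar, ybar = sum(x) / T, sum(y) / T
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    b0 = ybar - b1 * xbar
    # v_t = (X_t - Xbar) * u_hat_t, built from the OLS residuals
    v = [(xi - xbar) * (yi - b0 - b1 * xi) for xi, yi in zip(x, y)]
    var_v = sum(vt * vt for vt in v) / T
    var_x = sxx / T

    def rho_tilde(j):
        # j-th sample autocorrelation of v_t
        return sum(v[t] * v[t - j] for t in range(j, T)) / (T * var_v)

    f_hat = 1.0 + 2.0 * sum((m - j) / m * rho_tilde(j) for j in range(1, m))
    var_b1 = (var_v / (T * var_x ** 2)) * f_hat
    return b1, var_b1 ** 0.5, f_hat

# Simulated data (an assumption for illustration): AR(1) regressor and AR(1) error.
random.seed(0)
T = 612
x, u = [0.0], [0.0]
for _ in range(T - 1):
    x.append(0.7 * x[-1] + random.gauss(0, 1))
    u.append(0.7 * u[-1] + random.gauss(0, 1))
y = [1.0 + 0.5 * xi + ui for xi, ui in zip(x, u)]

m = round(0.75 * T ** (1 / 3))             # rule of thumb: about 6.4, rounded to 6 here
b1, se_hac, f_hat = newey_west_se(x, y, m)
_, se_m1, f_one = newey_west_se(x, y, 1)   # m = 1 drops every autocovariance term
print(m, round(b1, 3), round(se_m1, 4), round(se_hac, 4), round(f_hat, 2))
```

Setting m = 1 makes $\hat f_T = 1$, which gives back a heteroskedasticity-robust SE with no autocorrelation adjustment; the HAC SE equals that SE multiplied by $\sqrt{\hat f_T}$.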
22
Example: OJ and HAC estimators in
STATA
. gen l0fdd = fdd;      /* generate lag #0 */
. gen l1fdd = L1.fdd;   /* generate lag #1 */
. gen l2fdd = L2.fdd;   /* generate lag #2 */
. gen l3fdd = L3.fdd;
. gen l4fdd = L4.fdd;
. gen l5fdd = L5.fdd;
. gen l6fdd = L6.fdd;
. reg dlpoj fdd if tin(1950m1,2000m12), r;   /* NOT HAC SEs */

Linear regression                                      Number of obs =     612
                                                       F(  1,   610) =   12.12
                                                       Prob > F      =  0.0005
                                                       R-squared     =  0.0937
                                                       Root MSE      =  4.8261

------------------------------------------------------------------------------
             |               Robust
       dlpoj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdd |   .4662182   .1339293     3.48   0.001     .2031998    .7292367
       _cons |  -.4022562   .1893712    -2.12   0.034    -.7741549   -.0303575
------------------------------------------------------------------------------
23
Example: OJ and HAC estimators in
STATA, ctd
Rerun this regression, but with Newey-West SEs:
. newey dlpoj fdd if tin(1950m1,2000m12), lag(7);

Regression with Newey-West standard errors          Number of obs  =       612
maximum lag: 7                                      F(  1,   610)  =     12.23
                                                    Prob > F       =    0.0005

------------------------------------------------------------------------------
             |             Newey-West
       dlpoj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdd |   .4662182   .1333142     3.50   0.001     .2044077    .7280288
       _cons |  -.4022562   .2159802    -1.86   0.063    -.8264112    .0218987
------------------------------------------------------------------------------

Uses autocorrelations up to m = 7 to compute the SEs
rule-of-thumb: $0.75 \times 612^{1/3} = 6.4 \approx 7$, rounded up a little.
OK, in this case the difference in SEs is small, but not always so!
24
Example: OJ and HAC estimators in
STATA, ctd.
. global lfdd6 "fdd l1fdd l2fdd l3fdd l4fdd l5fdd l6fdd";
. newey dlpoj $lfdd6 if tin(1950m1,2000m12), lag(7);

Regression with Newey-West standard errors          Number of obs  =       612
maximum lag: 7                                      F(  7,   604)  =      3.56
                                                    Prob > F       =    0.0009

------------------------------------------------------------------------------
             |             Newey-West
       dlpoj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
         fdd |   .4693121   .1359686     3.45   0.001     .2022834    .7363407
       l1fdd |   .1430512   .0837047     1.71   0.088    -.0213364    .3074388
       l2fdd |   .0564234   .0561724     1.00   0.316    -.0538936    .1667404
       l3fdd |   .0722595   .0468776     1.54   0.124    -.0198033    .1643223
       l4fdd |   .0343244   .0295141     1.16   0.245    -.0236383    .0922871
       l5fdd |   .0468222   .0308791     1.52   0.130    -.0138212    .1074657
       l6fdd |   .0481115   .0446404     1.08   0.282    -.0395577    .1357807
       _cons |  -.6505183   .2336986    -2.78   0.006    -1.109479   -.1915578
------------------------------------------------------------------------------

• global lfdd6 defines a string that lists fdd and its six lags
• What are the estimated dynamic multipliers (dynamic effects)?
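One way to answer the last question is to read the dynamic multipliers off the coefficient column and add them up to get the cumulative multipliers. The sketch below copies the seven point estimates from the Newey-West output above; it does not compute standard errors for the cumulative multipliers.

```python
# Coefficients on fdd and its lags 1-6, copied from the regression output above.
coefs = [0.4693121, 0.1430512, 0.0564234, 0.0722595,
         0.0343244, 0.0468222, 0.0481115]

# The k-period cumulative multiplier is the sum of the first k + 1 coefficients.
cumulative = []
total = 0.0
for b in coefs:
    total += b
    cumulative.append(total)

for lag, (b, c) in enumerate(zip(coefs, cumulative)):
    print(f"lag {lag}: dynamic multiplier = {b:.3f}, cumulative multiplier = {c:.3f}")
```

In this six-lag specification the impact effect is about 0.47 and the six-month cumulative multiplier is about 0.87, both in units of the annualized percentage price change per freezing degree-day.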
25
FAQ: Do I need to use HAC SEs when I
estimate an AR or an ADL model?
A: NO.
• The problem to which HAC SEs are the solution arises only when $u_t$ is serially correlated: if $u_t$ is serially uncorrelated, then OLS SEs are fine
• In AR and ADL models, the errors are serially uncorrelated if you have included enough lags of Y
  ◦ If you include enough lags of Y, then the error term can’t be predicted using past Y, or equivalently by past u, so u is serially uncorrelated
26
Estimation of Dynamic Causal Effects
with Strictly Exogenous Regressors
(SW Section 15.5)
• X is strictly exogenous if $E(u_t \mid \dots, X_{t+1}, X_t, X_{t-1}, \dots) = 0$
• If X is strictly exogenous, there are more efficient ways to estimate dynamic causal effects than by a distributed lag regression:
  ◦ Generalized Least Squares (GLS) estimation
  ◦ Autoregressive Distributed Lag (ADL) estimation
• But the condition of strict exogeneity is very strong, so this condition is rarely plausible in practice, not even in the weather/OJ example (why?).
• So we won’t cover GLS or ADL estimation of dynamic causal effects; for details, see SW
27
Analysis of the OJ Price Data
(SW Section 15.6)
What is the dynamic causal effect (what are the dynamic multipliers) of a unit increase in FDD on OJ prices?
$$\%ChgP_t = \beta_0 + \beta_1 FDD_t + \dots + \beta_{r+1} FDD_{t-r} + u_t$$
• What r to use?
  How about 18? (Goldilocks method)
• What m (Newey-West truncation parameter) to use?
  $m = 0.75 \times 612^{1/3} = 6.4 \approx 7$
28
Digression: Computation of cumulative
multipliers and their standard errors
The cumulative multipliers can be computed by estimating the
distributed lag model, then adding up the coefficients. However,
you should also compute standard errors for the cumulative
multipliers and while this can be done directly from the
distributed lag model it requires some modifications.
One easy way to compute cumulative multipliers and standard
errors of cumulative multipliers is to realize that cumulative
multipliers are linear combinations of regression coefficients –
so the methods of Section 7.3 can be applied to compute their
standard errors.
29
Computing cumulative multipliers,
ctd.
The trick of Section 7.3 is to rewrite the regression so that the
coefficients in the rewritten regression are the coefficients of
interest – here, the cumulative multipliers.
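Example: Rewrite the distributed lag model with 1 lag:
$$\begin{aligned}
Y_t &= \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + u_t \\
    &= \beta_0 + \beta_1 X_t - \beta_1 X_{t-1} + \beta_1 X_{t-1} + \beta_2 X_{t-1} + u_t \\
    &= \beta_0 + \beta_1 (X_t - X_{t-1}) + (\beta_1 + \beta_2) X_{t-1} + u_t
\end{aligned}$$
or
$$Y_t = \beta_0 + \beta_1 \Delta X_t + (\beta_1 + \beta_2) X_{t-1} + u_t$$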
30
Computing cumulative multipliers,
ctd.
So, let $W_{1t} = \Delta X_t$ and $W_{2t} = X_{t-1}$ and estimate the regression
$$Y_t = \beta_0 + \delta_1 W_{1t} + \delta_2 W_{2t} + u_t$$
Then
$\delta_1 = \beta_1$ = impact effect
$\delta_2 = \beta_1 + \beta_2$ = the first cumulative multiplier
and the (HAC) standard errors on $\delta_1$ and $\delta_2$ are the standard errors for the two cumulative multipliers.
31
Computing cumulative multipliers,
ctd.
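In general, the distributed lag model can be rewritten as
$$Y_t = \delta_0 + \delta_1 \Delta X_t + \delta_2 \Delta X_{t-1} + \dots + \delta_r \Delta X_{t-r+1} + \delta_{r+1} X_{t-r} + u_t$$
where
$\delta_1 = \beta_1$
$\delta_2 = \beta_1 + \beta_2$
$\delta_3 = \beta_1 + \beta_2 + \beta_3$
…
$\delta_{r+1} = \beta_1 + \beta_2 + \dots + \beta_{r+1}$
Cumulative multipliers and their HAC SEs can be computed directly using this transformed regression.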
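The equivalence between the two regressions can be checked numerically. The sketch below is an illustration on simulated data with r = 2 and made-up coefficients (0.5, 0.3, 0.1); the small `ols` helper is hypothetical and solves the normal equations by Gauss-Jordan elimination. It compares point estimates only and does not compute HAC standard errors.

```python
import random

def ols(columns, y):
    """OLS coefficients (intercept first) from the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y]
    a = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
         + [sum(r[i] * yi for r, yi in zip(rows, y))] for i in range(k)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(a[r][i]))   # partial pivoting
        a[i], a[p] = a[p], a[i]
        for r in range(k):
            if r != i:
                factor = a[r][i] / a[i][i]
                a[r] = [ar - factor * ai for ar, ai in zip(a[r], a[i])]
    return [a[i][k] / a[i][i] for i in range(k)]

# Simulated distributed lag model with r = 2 (assumed coefficients).
random.seed(1)
T = 400
x = [random.gauss(0, 1) for _ in range(T + 2)]
y = [1.0 + 0.5 * x[t] + 0.3 * x[t - 1] + 0.1 * x[t - 2] + random.gauss(0, 1)
     for t in range(2, T + 2)]

x0 = [x[t] for t in range(2, T + 2)]        # X_t
x1 = [x[t - 1] for t in range(2, T + 2)]    # X_{t-1}
x2 = [x[t - 2] for t in range(2, T + 2)]    # X_{t-2}
dx0 = [c - d for c, d in zip(x0, x1)]       # Delta X_t
dx1 = [c - d for c, d in zip(x1, x2)]       # Delta X_{t-1}

beta = ols([x0, x1, x2], y)       # beta_0, beta_1, beta_2, beta_3
delta = ols([dx0, dx1, x2], y)    # delta_0, delta_1, delta_2, delta_3
cumulative = [beta[1], beta[1] + beta[2], beta[1] + beta[2] + beta[3]]

print([round(c, 4) for c in cumulative])
print([round(d, 4) for d in delta[1:]])     # same values up to rounding error
```

The practical payoff of the transformed regression is that the software's HAC standard errors on the $\delta$ coefficients are directly the standard errors of the cumulative multipliers.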
32
33
34
35
Are the OJ dynamic effects stable?
Recall from Section 14.7 that we can test for stability of time series regression coefficients using the QLR statistic. So, we can compute QLR for regression (1) in Table 15.1:
• Do you need HAC SEs? Why or why not?
• How specifically would you compute the Chow statistic?
• How would you compute the QLR statistic?
• What are the d.f. q of the Chow and QLR statistics?
• Result: QLR = 21.19.
  ◦ Is this significant? (see Table 14.6)
  ◦ At what significance level?
• How to interpret the result substantively? Estimate the dynamic multipliers on subsamples and see how they have changed over time…
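The mechanics of the QLR statistic (the largest Chow F statistic over candidate break dates in the central part of the sample) can be sketched as follows. This is a deliberately simplified illustration and not the computation behind the 21.19 reported above: it uses a single regressor, simulated data with an assumed break in the slope, 15% trimming, and the homoskedasticity-only F statistic, whereas the OJ regression would call for a HAC version. All function names are hypothetical.

```python
import random

def ssr(x, y):
    """Sum of squared residuals from OLS of y on a constant and x."""
    n = len(x)
    xbar, ybar = sum(x) / n, sum(y) / n
    b1 = (sum((a - xbar) * (b - ybar) for a, b in zip(x, y))
          / sum((a - xbar) ** 2 for a in x))
    b0 = ybar - b1 * xbar
    return sum((b - b0 - b1 * a) ** 2 for a, b in zip(x, y))

def chow_f(x, y, tau):
    """Homoskedasticity-only F statistic for a break in both coefficients at index tau."""
    T, q, k = len(x), 2, 4    # q restrictions; k coefficients in the unrestricted model
    ssr_restricted = ssr(x, y)
    # Fully interacted model = separate regressions before and after the break
    ssr_unrestricted = ssr(x[:tau], y[:tau]) + ssr(x[tau:], y[tau:])
    return ((ssr_restricted - ssr_unrestricted) / q) / (ssr_unrestricted / (T - k))

def qlr(x, y, trim=0.15):
    """Largest Chow F statistic, and its break index, over the central part of the sample."""
    T = len(x)
    candidates = range(int(trim * T), int((1 - trim) * T) + 1)
    return max((chow_f(x, y, tau), tau) for tau in candidates)

# Simulated data (an assumption): the slope shifts from 0.5 to 1.5 at t = 300.
random.seed(2)
T, true_break = 600, 300
x = [random.gauss(0, 1) for _ in range(T)]
y = [(0.5 if t < true_break else 1.5) * x[t] + random.gauss(0, 1) for t in range(T)]

stat, tau_hat = qlr(x, y)
print(round(stat, 2), tau_hat)
```

Because the statistic is a maximum over many F statistics, it must be compared with QLR critical values (SW Table 14.6), not with the usual F critical values.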
36
37
OJ: Do the breaks matter
substantively?
38
When Can You Estimate Dynamic
Causal Effects? That is, When is
Exogeneity Plausible? (SW Section 15.7)
If X is exogenous (and assumptions #2-4 hold), then a distributed lag model provides consistent estimators of dynamic causal effects.
As in multiple regression with cross-sectional data, you must think critically about whether X is exogenous in any application:
• is X exogenous, i.e. $E(u_t \mid X_t, X_{t-1}, \dots) = 0$?
• is X strictly exogenous, i.e. $E(u_t \mid \dots, X_{t+1}, X_t, X_{t-1}, \dots) = 0$?
39
Is exogeneity (or strict exogeneity)
plausible? Examples:
1. Y = OJ prices, X = FDD in Orlando
2. Y = Australian exports, X = US GDP (effect of US income on
demand for Australian exports)
3. Y = EU exports, X = US GDP (effect of US income on
demand for EU exports)
4. Y = US rate of inflation, X = percentage change in world oil
prices (as set by OPEC) (effect of OPEC oil price increase on
inflation)
5. Y = GDP growth, X = Federal Funds rate (the effect of monetary policy on output growth)
6. Y = change in the rate of inflation, X = unemployment rate (the effect of the unemployment rate on inflation; the Phillips curve)
40
Exogeneity, ctd.
• You must evaluate exogeneity and strict exogeneity on a case-by-case basis
• Exogeneity is often not plausible in time series data because of simultaneous causality
• Strict exogeneity is rarely plausible in time series data because of feedback.
41
Estimation of Dynamic Causal
Effects: Summary (SW Section 15.8)
• Dynamic causal effects are measurable in theory using a randomized controlled experiment with repeated measurements over time.
• When X is exogenous, you can estimate dynamic causal effects using a distributed lag regression
• If u is serially correlated, conventional OLS SEs are incorrect; you must use HAC SEs
• To decide whether X is exogenous, think hard!
Estimation of Dynamic Causal Effects -Introduction to Economics

  • 1. Chapter 15 Chapter 15 Estimation of Dynamic Causal Effects
  • 2. 2 Estimation of Dynamic Causal Effects (SW Chapter 15) A dynamic causal effect is the effect on Y of a change in X over time. For example:  The effect of an increase in cigarette taxes on cigarette consumption this year, next year, in 5 years;  The effect of a change in the Fed Funds rate on inflation, this month, in 6 months, and 1 year;  The effect of a freeze in Florida on the price of orange juice concentrate in 1 month, 2 months, 3 months…
  • 3. 3 The Orange Juice Data (SW Section 15.1) Data  Monthly, Jan. 1950 – Dec. 2000 (T = 612)  Price = price of frozen OJ (a sub-component of the producer price index; US Bureau of Labor Statistics)  %ChgP = percentage change in price at an annual rate, so %ChgPt = 1200ln(Pricet)  FDD = number of freezing degree-days during the month, recorded in Orlando FL  Example: If November has 2 days with lows < 32o , one at 30o and at 25o , then FDDNov = 2 + 7 = 9
  • 4. 4
  • 5. 5 Initial OJ regression  % t ChgP = -.40 + .47FDDt (.22) (.13)  Statistically significant positive relation  More freezing degree days  price increase  Standard errors are heteroskedasticity and autocorrelation- consistent (HAC) SE’s – more on this later  But what is the effect of FDD over time?
  • 6. 6 Dynamic Causal Effects (SW Section 15.2) Example: What is the effect of fertilizer on tomato yield? An ideal randomized controlled experiment  Fertilize some plots, not others (random assignment)  Measure yield over time – over repeated harvests – to estimate causal effect of fertilizer on:  Yield in year 1 of expt  Yield in year 2, etc.  The result (in a large expt) is the causal effect of fertilizer on yield k years later.
  • 7. 7 Dynamic causal effects, ctd. In time series applications, we can’t conduct this ideal randomized controlled experiment:  We only have one US OJ market ….  We can’t randomly assign FDD to different replicates of the US OJ market (?)  We can’t measure the average (across “subjects”) outcome at different times – only one “subject”  So we can’t estimate the causal effect at different times using the differences estimator
  • 8. 8 Dynamic causal effects, ctd. An alternative thought experiment:  Randomly give the same subject different treatments (FDDt) at different times  Measure the outcome variable (%ChgPt)  The “population” of subjects consists of the same subject (OJ market) but at different dates  If the “different subjects” are drawn from the same distribution – that is, if Yt, Xt are stationary – then the dynamic causal effect can be deduced by OLS regression of Yt on lagged values of Xt.  This estimator (regression of Yt on Xt and lags of Xt) s called the distributed lag estimator.
  • 9. 9 Dynamic causal effects and the distributed lag model The distributed lag model is: Yt = 0 + 1Xt + … + rXt–r + ut  1 = impact effect of change in X = effect of change in Xt on Yt, holding past Xt constant  2 = 1-period dynamic multiplier = effect of change in Xt–1 on Yt, holding constant Xt, Xt–2, Xt–3,…  3 = 2-period dynamic multiplier (etc.)= effect of change in Xt– 2 on Yt, holding constant Xt, Xt–1, Xt–3,…  Cumulative dynamic multipliers  Ex: the 2-period cumulative dynamic multiplier = 1 + 2 + 3
  • 10. 10 Exogeneity in time series regression Exogeneity (past and present) X is exogenous if E(ut|Xt,Xt–1,Xt–2,…) = 0. Strict Exogeneity (past, present, and future) X is strictly exogenous if E(ut|…,Xt+1,Xt,Xt–1, …) = 0  Strict exogeneity implies exogeneity  For now we suppose that X is exogenous – we’ll return (briefly) to the case of strict exogeneity later.  If X is exogenous, then we can use OLS to estimate the dynamic causal effect on Y of a change in X….
  • 11. 11 Estimation of Dynamic Causal Effects with Exogenous Regressors (SW Section 15.3) Distributed Lag Model: Yt = 0 + 1Xt + … + r+1Xt–r + ut The Distributed Lag Model Assumptions 1. E(ut|Xt,Xt–1,Xt–2,…) = 0 (X is exogenous) 2. (a) Y and X have stationary distributions; (b) (Yt,Xt) and (Yt–j,Xt–j) become independent as j gets large 3. Y and X have eight nonzero finite moments 4. There is no perfect multicollinearity.
  • 12. 12 The distributed lag model, ctd.  Assumptions 1 and 4 are familiar  Assumption 3 is familiar, except for 8 (not four) finite moments – this has to do with HAC estimators  Assumption 2 is different – before it was (Xi, Yi) are i.i.d. – this now becomes more complicated. 2. (a) Y and X have stationary distributions;  If so, the coefficients don’t change within the sample (internal validity);  and the results can be extrapolated outside the sample (external validity).  This is the time series counterpart of the “identically distributed” part of i.i.d.
  • 13. 13 The distributed lag model, ctd. 2. (b) (Yt,Xt) and (Yt–j, Xt–j) become independent as j gets large  Intuitively, this says that we have separate experiments for time periods that are widely separated.  In cross-sectional data, we assumed that Y and X were i.i.d., a consequence of simple random sampling – this led to the CLT.  A version of the CLT holds for time series variables that become independent as their temporal separation increases – assumption 2(b) is the time series counterpart of the “independently distributed” part of i.i.d.
  • 14. 14 Under the Distributed Lag Model Assumptions:  OLS yields consistent* estimators of 1, 2,…,r (of the dynamic multipliers) (*consistent but possibly biased!)  The sampling distribution of 1 ˆ  , etc., is normal  BUT the formula for the variance of this sampling distribution is not the usual one from cross-sectional (i.i.d.) data, because ut is not i.i.d. – ut can be serially correlated!  This means that the usual OLS standard errors (usual STATA printout) are wrong!  We need to use, instead, SEs that are robust to autocorrelation as well as to heteroskedasticity…
  • 15. 15 Heteroskedasticity and Autocorrelation- Consistent (HAC) Standard Errors (SW Section 15.4)  When ut is serially correlated, the variance of the sampling distribution of the OLS estimator is different.  Consequently, we need to use a different formula for the standard errors.  This is easy to do using STATA and most (but not all) other statistical software.  We encountered this before in panel data – we solved the problem using cluster(state).  The “cluster” approach required n > 1– so clustered standard errors are only for panel data  in TS data, n = 1 so we need a different method…
  • 16. 16 HAC standard errors, ctd. Yt = 0 + 1Xt + ut The OLS estimator: From SW, App. 4.3, 1 ˆ  – 1 = 1 2 1 1 ( ) 1 ( ) T t t t T t t X X u T X X T        1 2 1 T t t X v T    (in large samples) where vt = (Xt – X )ut.
  • 17. 17 HAC standard errors, ctd. Thus, in large samples, var( 1 ˆ  ) = 1 1 var T t t v T         / 2 2 ( ) X  = 2 1 1 1 cov( , ) T T t s t s v v T    / 2 2 ( ) X  (still SW App. 4.3) In i.i.d. cross sectional data, cov(vt, vs) = 0 for t  s, so var( 1 ˆ  ) = 2 1 1 var( ) T t t v T   )/ 2 2 ( ) X  = 2 2 2 ( ) v x T   This is our usual cross-sectional result (SW App. 4.3).
  • 18. 18 HAC standard errors, ctd. But in time series data, cov(vt, vs) 0 in general. Consider T = 2: 1 1 var T t t v T         = var[½(v1+v2)] = ¼[var(v1) + var(v2) + 2cov(v1,v2)] = ½ 2 v  + ½1 2 v  (1 = corr(v1,v2)) = ½ 2 v   f2, where f2 = (1+1)  In i.i.d. data, 1 = 0 so f2 = 1, yielding the usual formula  In time series data, if 1  0 then var( 1 ˆ  ) is not given by the usual formula.
  • 19. 19 Expression for var(), general T 1 1 var T t t v T         = 2 v T   fT so var( 1 ˆ  ) = 2 2 2 1 ( ) v X T          fT where fT = 1 1 1 2 T j j T j T             (SW, eq. (15.13))  Conventional OLS SE’s are wrong when ut is serially correlated (STATA printout is wrong).  The OLS SEs are off by the factor fT  We need to use a different SE formula!!!
  • 20. 20 HAC Standard Errors  Conventional OLS SEs (heteroskedasticity-robust or not) are wrong when ut is autocorrelated  So, we need a new formula that produces SEs that are robust to autocorrelation as well as heteroskedasticity We need Heteroskedasticity- and Autocorrelation- Consistent (HAC) standard errors  If we knew the factor fT, we could just make the adjustment.  In panel data, the factor fT is (implicitly) estimated by using “cluster” – but “cluster” requires n large.  In time series data, we need a different formula – we must estimate fT explicitly
  • 21. 21 HAC SEs, ctd. var( 1 ˆ  ) = 2 2 2 1 ( ) v X T          fT , where fT = 1 1 1 2 T j j T j T             The most commonly used estimator of fT is: ˆ T f = 1 1 1 2 m j j m j m              (Newey-West)  j   is an estimator of j  This is the “Newey-West” HAC SE estimator  m is called the truncation parameter  Why not just set m = T?  Then how should you choose m?  Use the Goldilocks method  Or, use the rule of thumb, m = 0.75T1/3
  • 22. 22 Example: OJ and HAC estimators in STATA . gen l0fdd = fdd; generate lag #0 . gen l1fdd = L1.fdd; generate lag #1 . gen l2fdd = L2.fdd; generate lag #2 . gen l3fdd = L3.fdd; . . gen l4fdd = L4.fdd; . . gen l5fdd = L5.fdd; . . gen l6fdd = L6.fdd; . reg dlpoj fdd if tin(1950m1,2000m12), r; NOT HAC SEs Linear regression Number of obs = 612 F( 1, 610) = 12.12 Prob > F = 0.0005 R-squared = 0.0937 Root MSE = 4.8261 ------------------------------------------------------------------------------ | Robust dlpoj | Coef. Std. Err. t P>|t| [95% Conf. Interval] -------------+---------------------------------------------------------------- fdd | .4662182 .1339293 3.48 0.001 .2031998 .7292367 _cons | -.4022562 .1893712 -2.12 0.034 -.7741549 -.0303575 ------------------------------------------------------------------------------
  • 23. 23 Example: OJ and HAC estimators in STATA, ctd.
    Rerun this regression, but with Newey-West SEs:
    . newey dlpoj fdd if tin(1950m1,2000m12), lag(7);

    Regression with Newey-West standard errors      Number of obs  =        612
    maximum lag: 7                                  F(  1,   610)  =      12.23
                                                    Prob > F       =     0.0005
    ------------------------------------------------------------------------------
                 |             Newey-West
           dlpoj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             fdd |   .4662182   .1333142     3.50   0.001     .2044077    .7280288
           _cons |  -.4022562   .2159802    -1.86   0.063    -.8264112    .0218987
    ------------------------------------------------------------------------------
    Uses autocorrelations up to m = 7 to compute the SEs.
    Rule of thumb: $0.75 \times 612^{1/3} = 6.4 \approx 7$, rounded up a little.
    OK, in this case the difference in SEs is small, but not always so!
  • 24. 24 Example: OJ and HAC estimators in STATA, ctd.
    . global lfdd6 "fdd l1fdd l2fdd l3fdd l4fdd l5fdd l6fdd";
    . newey dlpoj $lfdd6 if tin(1950m1,2000m12), lag(7);

    Regression with Newey-West standard errors      Number of obs  =        612
    maximum lag : 7                                 F(  7,   604)  =       3.56
                                                    Prob > F       =     0.0009
    ------------------------------------------------------------------------------
                 |             Newey-West
           dlpoj |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
    -------------+----------------------------------------------------------------
             fdd |   .4693121   .1359686     3.45   0.001     .2022834    .7363407
           l1fdd |   .1430512   .0837047     1.71   0.088    -.0213364    .3074388
           l2fdd |   .0564234   .0561724     1.00   0.316    -.0538936    .1667404
           l3fdd |   .0722595   .0468776     1.54   0.124    -.0198033    .1643223
           l4fdd |   .0343244   .0295141     1.16   0.245    -.0236383    .0922871
           l5fdd |   .0468222   .0308791     1.52   0.130    -.0138212    .1074657
           l6fdd |   .0481115   .0446404     1.08   0.282    -.0395577    .1357807
           _cons |  -.6505183   .2336986    -2.78   0.006    -1.109479   -.1915578
    ------------------------------------------------------------------------------
    - global lfdd6 defines a string that lists fdd and all of its additional lags.
    - What are the estimated dynamic multipliers (dynamic effects)?
  • 25. 25 FAQ: Do I need to use HAC SEs when I estimate an AR or an ADL model?
    A: NO.
    - The problem to which HAC SEs are the solution only arises when $u_t$ is serially correlated: if $u_t$ is serially uncorrelated, then OLS SEs are fine.
    - In AR and ADL models, the errors are serially uncorrelated if you have included enough lags of Y.
    - If you include enough lags of Y, then the error term can't be predicted using past Y, or equivalently by past u, so u is serially uncorrelated.
  • 26. 26 Estimation of Dynamic Causal Effects with Strictly Exogenous Regressors (SW Section 15.5)
    - X is strictly exogenous if $E(u_t \mid \ldots, X_{t+1}, X_t, X_{t-1}, \ldots) = 0$.
    - If X is strictly exogenous, there are more efficient ways to estimate dynamic causal effects than by a distributed lag regression:
      - Generalized Least Squares (GLS) estimation
      - Autoregressive Distributed Lag (ADL) estimation
    - But the condition of strict exogeneity is very strong, so this condition is rarely plausible in practice, not even in the weather/OJ example (why?).
    - So we won't cover GLS or ADL estimation of dynamic causal effects; for details, see SW.
  • 27. 27 Analysis of the OJ Price Data (SW Section 15.6)
    What is the dynamic causal effect (what are the dynamic multipliers) of a unit increase in FDD on OJ prices?
    $\%ChgP_t = \beta_0 + \beta_1 FDD_t + \ldots + \beta_{r+1} FDD_{t-r} + u_t$
    - What r to use? How about 18? (Goldilocks method)
    - What m (Newey-West truncation parameter) to use? $m = 0.75 \times 612^{1/3} = 6.4 \approx 7$
  • 28. 28 Digression: Computation of cumulative multipliers and their standard errors The cumulative multipliers can be computed by estimating the distributed lag model, then adding up the coefficients. However, you should also compute standard errors for the cumulative multipliers. This can be done directly from the distributed lag model, but it requires some modifications. One easy way to compute cumulative multipliers and their standard errors is to realize that cumulative multipliers are linear combinations of regression coefficients, so the methods of Section 7.3 can be applied to compute their standard errors.
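As a small illustration of "adding up the coefficients", the Python sketch below takes the seven estimated coefficients on fdd and its lags from the Newey-West output on slide 24 and forms their running sums. It gives point estimates only. It does not produce standard errors for the cumulative multipliers, which is the problem the rest of this digression addresses.

```python
import numpy as np

# Coefficients on fdd, l1fdd, ..., l6fdd copied from the regression output on slide 24.
beta = np.array([0.4693121, 0.1430512, 0.0564234, 0.0722595,
                 0.0343244, 0.0468222, 0.0481115])

# The h-period cumulative multiplier is the sum of the dynamic multipliers for lags 0..h.
cumulative = np.cumsum(beta)

for h, (b, c) in enumerate(zip(beta, cumulative)):
    print(f"lag {h}: dynamic multiplier {b:.4f}, cumulative multiplier {c:.4f}")
```

The last running sum, about 0.87, is the estimated cumulative effect on the price level after six months of one additional freezing degree day in that six-lag specification.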
  • 29. 29 Computing cumulative multipliers, ctd.
    The trick of Section 7.3 is to rewrite the regression so that the coefficients in the rewritten regression are the coefficients of interest: here, the cumulative multipliers.
    Example: Rewrite the distributed lag model with 1 lag:
    $Y_t = \beta_0 + \beta_1 X_t + \beta_2 X_{t-1} + u_t$
    $\quad = \beta_0 + \beta_1 X_t - \beta_1 X_{t-1} + \beta_1 X_{t-1} + \beta_2 X_{t-1} + u_t$
    $\quad = \beta_0 + \beta_1 (X_t - X_{t-1}) + (\beta_1 + \beta_2) X_{t-1} + u_t$
    or
    $Y_t = \beta_0 + \beta_1 \Delta X_t + (\beta_1 + \beta_2) X_{t-1} + u_t$
  • 30. 30 Computing cumulative multipliers, ctd.
    So, let $W_{1t} = \Delta X_t$ and $W_{2t} = X_{t-1}$ and estimate the regression
    $Y_t = \beta_0 + \delta_1 W_{1t} + \delta_2 W_{2t} + u_t$
    Then
    $\delta_1 = \beta_1$ = impact effect
    $\delta_2 = \beta_1 + \beta_2$ = the first cumulative multiplier
    and the (HAC) standard errors on $\delta_1$ and $\delta_2$ are the standard errors for the two cumulative multipliers.
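  • 31. 31 Computing cumulative multipliers, ctd.
    In general, the distributed lag model with coefficients $\beta_1, \ldots, \beta_q$ on $X_t, \ldots, X_{t-q+1}$ can be rewritten as
    $Y_t = \beta_0 + \delta_1 \Delta X_t + \delta_2 \Delta X_{t-1} + \ldots + \delta_{q-1} \Delta X_{t-q+2} + \delta_q X_{t-q+1} + u_t$
    where
    $\delta_1 = \beta_1$
    $\delta_2 = \beta_1 + \beta_2$
    $\delta_3 = \beta_1 + \beta_2 + \beta_3$
    …
    $\delta_q = \beta_1 + \beta_2 + \ldots + \beta_q$
    Cumulative multipliers and their HAC SEs can be computed directly using this transformed regression.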
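The equivalence between the two regressions can be checked numerically. The following Python sketch uses simulated data with hypothetical true coefficients (not the OJ data) and q = 3 coefficients on $X_t, X_{t-1}, X_{t-2}$. It fits the levels regression and the transformed regression by OLS and compares the coefficients. It does not compute HAC standard errors; in practice those would come from the regression software applied to the transformed regression.

```python
import numpy as np

rng = np.random.default_rng(1)
T, q = 400, 3
x = rng.standard_normal(T)
beta0 = 0.5
beta = np.array([1.0, 0.6, 0.3])   # hypothetical true dynamic multipliers

# Column k holds X_{t-k} for the usable sample t = q-1, ..., T-1.
lags = np.column_stack([x[q - 1 - k: T - k] for k in range(q)])
y = beta0 + lags @ beta + 0.1 * rng.standard_normal(T - q + 1)

def ols(Z, y):
    """OLS coefficients with an intercept placed first."""
    Z1 = np.column_stack([np.ones(len(y)), Z])
    return np.linalg.lstsq(Z1, y, rcond=None)[0]

# Levels regression: intercept followed by the q dynamic multipliers.
b_levels = ols(lags, y)

# Transformed regressors: first differences of X at lags 0..q-2, then the level X_{t-q+1}.
diffs = lags[:, :-1] - lags[:, 1:]
W = np.column_stack([diffs, lags[:, -1]])

# Transformed regression: intercept followed by the q cumulative multipliers.
d_transformed = ols(W, y)

# The transformed coefficients equal the running sums of the levels coefficients.
match = np.allclose(d_transformed[1:], np.cumsum(b_levels[1:]))
```

Because the transformed regressors are an invertible linear recombination of the original ones, the two regressions have identical fitted values, and `match` holds up to floating-point error.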
  • 35. 35 Are the OJ dynamic effects stable?
    Recall from Section 14.7 that we can test for stability of time series regression coefficients using the QLR statistic. So, we can compute QLR for regression (1) in Table 15.1:
    - Do you need HAC SEs? Why or why not?
    - How specifically would you compute the Chow statistic?
    - How would you compute the QLR statistic?
    - What are the d.f. q of the Chow and QLR statistics?
    - Result: QLR = 21.19.
      - Is this significant? (see Table 14.6)
      - At what significance level?
    - How to interpret the result substantively? Estimate the dynamic multipliers on subsamples and see how they have changed over time…
  • 37. 37 OJ: Do the breaks matter substantively?
  • 38. 38 When Can You Estimate Dynamic Causal Effects? That is, When is Exogeneity Plausible? (SW Section 15.7)
    If X is exogenous (and assumptions #2-4 hold), then a distributed lag model provides consistent estimators of dynamic causal effects.
    As in multiple regression with cross-sectional data, you must think critically about whether X is exogenous in any application:
    - is X exogenous, i.e. $E(u_t \mid X_t, X_{t-1}, \ldots) = 0$?
    - is X strictly exogenous, i.e. $E(u_t \mid \ldots, X_{t+1}, X_t, X_{t-1}, \ldots) = 0$?
  • 39. 39 Is exogeneity (or strict exogeneity) plausible? Examples:
    1. Y = OJ prices, X = FDD in Orlando
    2. Y = Australian exports, X = US GDP (effect of US income on demand for Australian exports)
    3. Y = EU exports, X = US GDP (effect of US income on demand for EU exports)
    4. Y = US rate of inflation, X = percentage change in world oil prices (as set by OPEC) (effect of OPEC oil price increase on inflation)
    5. Y = GDP growth, X = Federal Funds rate (the effect of monetary policy on output growth)
    6. Y = change in the rate of inflation, X = unemployment rate (the Phillips curve)
  • 40. 40 Exogeneity, ctd.
    - You must evaluate exogeneity and strict exogeneity on a case-by-case basis.
    - Exogeneity is often not plausible in time series data because of simultaneous causality.
    - Strict exogeneity is rarely plausible in time series data because of feedback.
  • 41. 41 Estimation of Dynamic Causal Effects: Summary (SW Section 15.8)
    - Dynamic causal effects are measurable in theory using a randomized controlled experiment with repeated measurements over time.
    - When X is exogenous, you can estimate dynamic causal effects using a distributed lag regression.
    - If u is serially correlated, conventional OLS SEs are incorrect; you must use HAC SEs.
    - To decide whether X is exogenous, think hard!