INTRODUCTION TO IMPACT EVALUATION AND
RANDOMIZED CONTROL TRIALS
PRESENTATION 2: WHY RANDOMIZE?
PARTICIPATION AND REGULATORY COMPLIANCE
PROJECT LAUNCH WORKSHOP
8-9 JULY 2014, HANOI
Héctor Salazar Salame, Executive Director J-PAL SEA
Presentations Overview
1. What is evaluation? Why Evaluate?
2. Why randomize?
3. How to randomize
4. Evaluation from Start to Finish
Lecture Overview
I. Background
II. What is a randomized experiment?
III. Why randomize?
IV. Common criticisms and responses
I - Background
Impact: What is it?
[Figure: primary outcome plotted over time; the impact is the gap between the outcome trajectory with the intervention and the trajectory that would have occurred without it.]
How to measure impact?
Impact is defined as a comparison between:
1. the outcome some time after the program has
been introduced
2. the outcome at that same point in time had the
program not been introduced (the
“counterfactual”)
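In standard potential-outcomes notation (added here for clarity; it does not appear on the slides), let Y1 be a participant's outcome with the program and Y0 the outcome without it. For participants (T = 1), the impact is

    \[
    \text{Impact} \;=\; E[\,Y_1 \mid T = 1\,] \;-\; E[\,Y_0 \mid T = 1\,]
    \]

The second term is the counterfactual: it is never observed, which is exactly the problem the next slides address.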
Counterfactual
• The counterfactual represents the state of
the world that program participants would
have experienced in the absence of the
program (i.e. had they not participated in
the program)
• Problem: Counterfactual cannot be
observed
• Solution: We need to “mimic” or construct
the counterfactual
Constructing the counterfactual
• Usually done by selecting a group of individuals
that did not participate in the program
• This group is usually referred to as the control
group or comparison group
• How this group is selected is a key decision in the
design of any impact evaluation
Selecting the comparison group
• Idea: Select a group that is exactly like the group of
participants in all ways except one: their exposure to
the program being evaluated
• Goal: To be able to attribute differences in outcomes
between the group of participants and the comparison
group to the program (and not to other factors)
Impact evaluation methods
1. Randomized Experiments
• Also known as:
– Random Assignment Studies
– Randomized Field Trials
– Social Experiments
– Randomized Controlled Trials (RCTs)
– Randomized Controlled Experiments
13
Impact evaluation methods
2. Non- or Quasi-Experimental Methods
a. Pre-Post
b. Simple Difference
c. Differences-in-Differences
d. Multivariate Regression
e. Statistical Matching
f. Interrupted Time Series
g. Instrumental Variables
h. Regression Discontinuity
II – What is a randomized
experiment?
The basics
Start with the simple case:
• Take a sample of program applicants
• Randomly assign them to either:
 Treatment Group – offered the treatment
 Control Group – not offered the treatment
(during the evaluation period)
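A minimal sketch of this assignment step in Python (the applicant list and 50/50 split are hypothetical; real studies often stratify before randomizing):

    import numpy as np

    rng = np.random.default_rng(seed=42)  # fixed seed makes the draw reproducible and auditable
    applicants = [f"applicant_{i}" for i in range(100)]  # hypothetical sample of program applicants

    shuffled = rng.permutation(applicants)  # random order
    treatment = list(shuffled[:50])  # offered the treatment
    control = list(shuffled[50:])    # not offered the treatment during the evaluation period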
Key advantage of experiments
Because members of the groups (treatment
and control) do not differ systematically at
the outset of the experiment,
any difference that subsequently arises
between them can be attributed to the
program rather than to other factors.
Evaluation of “Women as Policymakers”:
Treatment vs. Control villages at baseline

Variable                             Treatment Group   Control Group   Difference
Female Literacy Rate                 0.35              0.34             0.01 (0.01)
Number of Public Health Facilities   0.06              0.08            -0.02 (0.02)
Tap Water                            0.05              0.03             0.02 (0.02)
Number of Primary Schools            0.95              0.91             0.04 (0.08)
Number of High Schools               0.09              0.10            -0.01 (0.02)

Standard errors in parentheses. Statistics displayed for West Bengal.
*/**/***: Statistically significant at the 10% / 5% / 1% level
Source: Chattopadhyay and Duflo (2004)
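A balance table like this one comes from comparing baseline means across the two arms. A minimal sketch, using simulated village-level data whose means mirror the female-literacy row (the sample sizes and spread are assumptions; the real check would use the actual baseline data):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    treat = rng.normal(loc=0.35, scale=0.10, size=80)  # simulated literacy rates, treatment villages
    ctrl = rng.normal(loc=0.34, scale=0.10, size=80)   # simulated literacy rates, control villages

    diff = treat.mean() - ctrl.mean()
    t_stat, p_value = stats.ttest_ind(treat, ctrl)
    # Under successful randomization, baseline differences should be small
    # and statistically insignificant (large p-value), as in the table above.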
Some variations on the basics
• Assigning to multiple treatment groups
• Assigning of units other than individuals
or households
 Health Centers
 Schools
 Local Governments
 Villages
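When the unit of assignment is a school or village rather than an individual, the lottery runs over clusters, and everyone in a cluster shares its assignment. A minimal sketch (the 40-school list and even split are hypothetical):

    import numpy as np

    rng = np.random.default_rng(7)
    schools = [f"school_{i}" for i in range(40)]  # hypothetical clusters
    treated_schools = set(rng.choice(schools, size=20, replace=False))
    # Every child in a treated school is in the treatment group;
    # every child in the remaining schools is in the control group.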
Key steps in conducting an experiment
1. Design the study carefully
2. Randomly assign people to treatment or
control
3. Collect baseline data
4. Verify that assignment looks random
5. Monitor process so that integrity of
experiment is not compromised
Key steps in conducting an experiment
(cont.)
6. Collect follow-up data for both the
treatment and control groups
7. Estimate program impacts by comparing
mean outcomes of treatment group vs.
mean outcomes of control group.
8. Assess whether program impacts are
statistically significant and practically
significant.
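Steps 7 and 8 reduce to a difference in mean outcomes plus a significance test. A minimal sketch with hypothetical follow-up scores (the numbers are illustrative, not from any study):

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    treat_scores = rng.normal(loc=55, scale=15, size=200)  # hypothetical follow-up outcomes
    ctrl_scores = rng.normal(loc=50, scale=15, size=200)

    impact = treat_scores.mean() - ctrl_scores.mean()             # step 7: difference in means
    t_stat, p_value = stats.ttest_ind(treat_scores, ctrl_scores)  # step 8: statistical significance
    # Practical significance is a separate judgment: is `impact` large
    # enough to matter for policy? No statistical test answers that.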
III – Why randomize?
Why randomize? – Conceptual Argument
If properly designed and conducted,
randomized experiments provide the most
credible method to estimate the impact of a
program
Why “most credible”?
Because members of the groups (treatment
and control) do not differ systematically at
the outset of the experiment,
any difference that subsequently arises
between them can be attributed to the
program rather than to other factors.
Example: Balsakhi Program
Case 2: Remedial Education in India
Balsakhi Program: Background
• Implemented by Pratham, an NGO in India
• Program provided tutors (Balsakhi) to help
at-risk children with school work
• In Vadodara, the balsakhi program was run
in government primary schools in 2002-2003
• Teachers decided which children would get
the balsakhi
Balsakhi: Outcomes
• Children were tested at the beginning of the
school year (Pretest) and at the end of the year
(Post-test)
• QUESTION: How can we estimate the impact
of the balsakhi program on test scores?
Methods to estimate impacts
• Let’s look at different ways of estimating
the impacts using the data from the schools
that got a balsakhi
1. Pre – Post (Before vs. After)
2. Simple difference
3. Difference-in-Differences
4. Other non-experimental methods
5. Randomized Experiment
1 - Pre-post (Before vs. After)
• Look at the average change in test scores
over the school year for the balsakhi children
1 - Pre-post (Before vs. After)
• QUESTION: Under what conditions can this
difference (26.42) be interpreted as the impact
of the balsakhi program?
Average post-test score for children with a balsakhi    51.22
Average pretest score for children with a balsakhi      24.80
Difference                                               26.42
What would have happened without balsakhi?
Method 1: Before vs. After
[Figure: test scores (0-75 scale) in 2002 and 2003 for balsakhi children, with the pre-post gain labeled "Impact = 26.42 points?"]
2 - Simple difference
• Compare test scores of children who got
balsakhi with test scores of children who
did not get balsakhi
2 - Simple difference
• QUESTION: Under what conditions can this
difference (-5.05) be interpreted as the impact
of the balsakhi program?
Average score for children with a balsakhi       51.22
Average score for children without a balsakhi    56.27
Difference                                        -5.05
What would have happened without balsakhi?
Method 2: Simple Comparison
[Figure: 2003 test scores (0-75 scale) of balsakhi vs. non-balsakhi children, with the gap labeled "Impact = -5.05 points?"]
3 – Difference-in-Differences
• Compare gains in test scores of children
who got balsakhi with gains in test scores
of children who did not get balsakhi
3 – Difference-in-Differences

                                               Pretest   Post-test   Difference
Average score for children with a balsakhi     24.80     51.22       26.42
Average score for children without a balsakhi  36.67     56.27       19.60
Difference                                                            6.82

• QUESTION: Under what conditions can 6.82 be
interpreted as the impact of the balsakhi program?
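All three estimates above can be recomputed from the four summary means; a quick check in Python:

    # Summary means from the balsakhi slides
    pre_treat, post_treat = 24.80, 51.22  # children with a balsakhi
    pre_ctrl, post_ctrl = 36.67, 56.27    # children without a balsakhi

    pre_post = post_treat - pre_treat                                 # 26.42
    simple_diff = post_treat - post_ctrl                              # -5.05
    diff_in_diff = (post_treat - pre_treat) - (post_ctrl - pre_ctrl)  # 6.82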
4 – Other Methods
• There are more sophisticated non-experimental
methods to estimate program impacts:
– Regression
– Matching
– Instrumental Variables
– Regression Discontinuity
• These methods rely on being able to “mimic” the
counterfactual under certain assumptions
• Problem: Assumptions are not testable
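To illustrate the regression approach, and why its untestable assumptions matter, here is a minimal sketch on simulated data in which teachers assign tutors to weaker students (the variable names, effect sizes, and selection rule are assumptions for illustration, not the study's actual specification):

    import numpy as np
    import statsmodels.api as sm

    rng = np.random.default_rng(2)
    n = 500
    pretest = rng.normal(30, 10, n)          # simulated baseline score
    balsakhi = (pretest < 30).astype(float)  # weaker students get a tutor (selection!)
    posttest = pretest + 20 + 5 * balsakhi + rng.normal(0, 5, n)  # true effect = 5

    X = sm.add_constant(np.column_stack([balsakhi, pretest]))
    fit = sm.OLS(posttest, X).fit()
    print(fit.params)
    # The coefficient on balsakhi recovers ~5 only because pretest, the sole
    # confounder here, is observed and controlled for. With unobserved
    # confounders the estimate would be biased, and no test can rule that out.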
5 – Randomized Experiment
• Suppose we evaluated the balsakhi program
using a randomized experiment
• QUESTION #1: What would this entail?
How would we do it?
• QUESTION #2: What would be the
advantage of using this method to evaluate
the impact of the balsakhi program?
Impact of Balsakhi - Summary

Method                          Impact Estimate
(1) Pre-post                     26.42*
(2) Simple Difference            -5.05*
(3) Difference-in-Differences     6.82*
(4) Regression                    1.92
(5) Randomized Experiment         5.87*

Bottom Line: Which method we use matters!
*: Statistically significant at the 5% level
Example #2 – South Africa microfinance

Method                          Impact Estimate
(1) Pre-post                     2384*
(2) Simple Difference            1838*
(3) Difference-in-Differences    1068*
(4) Regression                   1412
(5) Randomized Experiment         292*

*: Statistically significant at the 5% level
Example #3 - Pratham’s Read India program

Method                          Impact
(1) Pre-Post                     0.60*
(2) Simple Difference           -0.90*
(3) Difference-in-Differences    0.31*
(4) Regression                   0.06
(5) Randomized Experiment        0.88*

*: Statistically significant at the 5% level
IV – Conclusions
Conclusions - Why Randomize?
• There are many ways to estimate a
program’s impact
• This course argues in favor of one:
randomized experiments
– Conceptual argument: If properly designed and
conducted, randomized experiments provide the
most credible method to estimate the impact of
a program
– Empirical argument: Different methods can
generate different impact estimates
Conclusions – A few parting questions
• When is an RCT not possible?
• If it is possible, when would an RCT be
unnecessary?
• What are common critiques you have heard
of RCTs?
Up Next: HOW TO RANDOMIZE