Matching Methods
Matching: Overview
• The ideal comparison group is selected so that it matches the treatment group, using either a comprehensive baseline survey or time-invariant characteristics
• Matches are selected on the basis of similarities in observed characteristics
• This assumes no selection bias based on unobserved characteristics
• Take the ITN (insecticide-treated net) example from yesterday: households who were more concerned about malaria also took other preventative actions
• All such differences must be in the data for the match to produce a valid estimate of project impacts
Propensity-Score Matching (PSM)
Propensity score matching: match treated and untreated observations on the estimated probability of being treated (the propensity score). This is the most commonly used matching method.
• Match on the basis of the propensity score P(X) = Pr(D = 1 | X)
• D indicates participation in the project
• Instead of attempting to find, for each participant, a match with exactly the same value of X, we match on the probability of participation
PSM: Key Assumptions
• Key assumption: participation is independent of outcomes conditional on Xi
• This is false if there are unobserved characteristics that affect both participation and outcomes
• Enables matching not just at the mean: it balances the distribution of observed characteristics across treatment and control groups
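In standard notation, the key assumption and the overlap requirement can be written as follows. This is a sketch: the potential-outcome symbols Y_1 (outcome with the project) and Y_0 (outcome without it) are an assumption introduced here, since the slides name only D, X and P(X).

```latex
% Conditional independence (selection on observables):
% given X, participation D carries no information about potential outcomes
(Y_0, Y_1) \perp D \mid X

% If this holds given X, it also holds given the propensity score alone,
% which is why matching on P(X) can replace exact matching on X
(Y_0, Y_1) \perp D \mid P(X), \qquad P(X) = \Pr(D = 1 \mid X)

% Common support (overlap): every type of unit could be in either group
0 < P(X) < 1
```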
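[Figure: densities of propensity scores for participants and non-participants. Horizontal axis: propensity score, from 0 to 1; vertical axis: density. The region of common support is the range where the two densities overlap; scores near 1 indicate a high probability of participating given X.]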
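The region of common support can be computed directly from the two sets of estimated scores. The sketch below uses the simple min/max rule (keep scores that fall inside the range covered by both groups), which is one common convention and an assumption here, not something the slides prescribe. The function name and the score values are hypothetical.

```python
def common_support(scores_treated, scores_control):
    """Return (low, high) bounds of the overlap, using the min/max rule."""
    low = max(min(scores_treated), min(scores_control))
    high = min(max(scores_treated), max(scores_control))
    if low > high:
        raise ValueError("the two groups have no overlapping scores")
    return low, high


# Hypothetical propensity scores, for illustration only
treated = [0.35, 0.50, 0.62, 0.80, 0.93]
control = [0.05, 0.12, 0.30, 0.45, 0.70]

low, high = common_support(treated, control)

# Units outside the overlap are dropped before matching
kept_treated = [p for p in treated if low <= p <= high]
kept_control = [p for p in control if low <= p <= high]

print(low, high)
print(kept_treated, kept_control)
```

Note that trimming changes the population the estimate refers to: in this example the two participants with the highest scores have no comparable non-participants and are dropped.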
Steps in Score Matching
1. Obtain representative and comparable data for both treatment and comparison groups
2. Use a logit (or other discrete choice model) to estimate program participation as a function of observable characteristics
3. Use the predicted values from the logit to generate a propensity score p(xi) for all treatment and comparison group members
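Steps 2 and 3 can be sketched in a few lines. The example below fits a logit by plain gradient ascent using only the Python standard library; in practice a statistics package would be used instead. The data, the two covariates, and the function names are all hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logit(X, d, lr=0.5, n_iter=5000):
    """Fit Pr(d=1|x) = sigmoid(b0 + b.x) by gradient ascent on the mean log-likelihood."""
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)  # beta[0] is the intercept
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, di in zip(X, d):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = di - p
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity_scores(X, beta):
    """Predicted participation probability p(xi) for every unit."""
    return [sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi))) for xi in X]


# Hypothetical data: two observed characteristics per household (scaled to 0-1)
X = [[0.2, 0.1], [0.4, 0.3], [0.5, 0.9], [0.9, 0.4],
     [0.3, 0.6], [0.8, 0.8], [0.1, 0.5], [0.7, 0.2]]
d = [0, 0, 1, 1, 0, 1, 1, 0]  # 1 = participant, 0 = non-participant

beta = fit_logit(X, d)
scores = propensity_scores(X, beta)
for di, p in zip(d, scores):
    print(di, round(p, 3))
```

The scores are computed for participants and non-participants alike, because both groups are needed for the matching step.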
Calculating Impact using PSM
4. Match pairs:
  • Restrict the sample to the region of common support (as in the figure)
  • Determine a tolerance limit: how different can control individuals or villages be and still count as a match?
  • Options include nearest-neighbor matching, nonlinear matching, and multiple matches
5. Once matches are made, calculate impact by comparing the mean outcomes of participants and their matched pairs
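Steps 4 and 5 can be illustrated with one-to-one nearest-neighbor matching on the propensity score. In this sketch the tolerance limit is implemented as a caliper (a maximum allowed score difference) and controls may be reused. Both choices are assumptions made for illustration, as are the function names, scores and outcomes.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Pair each participant with the closest non-participant by score.

    treated, controls: lists of (propensity_score, outcome) tuples.
    Participants with no control within the caliper stay unmatched.
    Returns a list of (treated_outcome, control_outcome) pairs.
    """
    pairs = []
    for p_t, y_t in treated:
        p_c, y_c = min(controls, key=lambda c: abs(c[0] - p_t))
        if abs(p_c - p_t) <= caliper:
            pairs.append((y_t, y_c))
    return pairs


def mean_difference(pairs):
    """Average outcome gap between participants and their matches."""
    return sum(y_t - y_c for y_t, y_c in pairs) / len(pairs)


# Hypothetical (score, outcome) data; the outcome could be days of illness
treated = [(0.30, 2.0), (0.55, 1.0), (0.70, 3.0), (0.95, 1.0)]
controls = [(0.10, 4.0), (0.28, 3.0), (0.52, 3.0), (0.74, 4.0)]

pairs = match_nearest(treated, controls, caliper=0.05)
impact = mean_difference(pairs)

print(len(pairs), "matched pairs")
print("estimated impact on participants:", round(impact, 3))
```

The participant with score 0.95 has no control within the caliper and is dropped, so the estimate applies only to participants for whom a comparable non-participant exists.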
PSM vs Randomization
• Randomization does not require the untestable assumption of independence conditional on observables
• PSM requires large samples and good data:
  1. Ideally, the same data source is used for participants and non-participants
  2. Participants and non-participants have access to similar institutions and markets
  3. The data include X variables capable of identifying program participation and outcomes
Lessons on Matching Methods
• Typically used when randomization, RD (regression discontinuity), and other quasi-experimental options are not possible
• Case 1: no baseline. Can do ex-post matching
• Dangers of ex-post matching:
  • Matching on variables that change due to participation (i.e., endogenous variables)
  • What are some variables that won't change?
• Matching helps control only for OBSERVABLE differences, not unobservable differences
More Lessons on Matching Methods
• Matching works much better in combination with other techniques, such as:
  • Exploiting baseline data for matching and using a difference-in-differences strategy
  • If an assignment rule exists for the project, matching on this rule
• Need good-quality data
• Common support can be a problem if the two groups are very different
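Combining matching with difference-in-differences means comparing changes, not levels, within matched pairs. A minimal sketch, assuming each matched pair has a baseline and a follow-up outcome for both units; the numbers and variable names are hypothetical.

```python
# Each matched pair: ((treated_baseline, treated_followup),
#                     (control_baseline, control_followup))
matched_pairs = [
    ((5.0, 3.0), (4.0, 4.0)),
    ((6.0, 5.0), (6.0, 6.0)),
    ((3.0, 2.0), (2.0, 3.0)),
]

# Follow-up comparison only: ignores baseline gaps that remain after matching
followup_gap = sum(t[1] - c[1] for t, c in matched_pairs) / len(matched_pairs)

# Difference-in-differences: change for participants minus change for matches
did = sum((t[1] - t[0]) - (c[1] - c[0]) for t, c in matched_pairs) / len(matched_pairs)

print("follow-up difference:", round(followup_gap, 3))
print("matched difference-in-differences:", round(did, 3))
```

In this example the participants started out with higher values than their matches, so the follow-up comparison alone understates the change. Differencing removes such fixed gaps, including those caused by time-invariant unobserved characteristics, but it still relies on the two groups following similar trends.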
Case Study: Piped Water in India
• Jalan and Ravallion (2003): impact of piped water on children's health in rural India
• Research questions of interest include:
  1. Is a child less vulnerable to diarrhoeal disease if he/she lives in a household (HH) with access to piped water?
  2. Do children in poor, or poorly educated, HH have smaller health gains from piped water?
  3. Does income matter independently of parental education?
Piped Water: the IE Design
• Classic problem for infrastructure programs: randomization is generally not an option (although randomization in timing may be possible in other contexts)
• The challenge: observable and unobservable differences between households with piped water and those without
• What are the differences for such households in Nigeria?
• Jalan and Ravallion use cross-sectional data
• 1993-1994 nationally representative survey of 33,000 rural HH from 1,765 villages
PSM in Practice
• To estimate the propensity score, the authors used:
  • Village-level characteristics, including: village size, amount of irrigated land, schools, infrastructure (bus stop, railway station)
  • Household variables, including: ethnicity / caste / religion, asset ownership (bicycle, radio, thresher), educational background of HH members
• Are there variables which cannot be included?
  Only a cross-section is available, so variables that could be influenced by the project must be excluded
Piped Water: Behavioral Considerations
• The IE is designed not only to estimate the impact of piped water but also to look at how benefits vary across groups
• There is therefore a behavioral component: poor households may be less able to benefit from piped water because they do not store water properly
• With this in mind, are there any key variables missing?
Potential Unobserved Factors
• The behavioral factors (the importance placed on sanitation, and behavioral inputs) are also likely correlated with whether a HH has piped water
• However, there are no behavioral variables in the data: water storage, soap usage, latrines
• These are unobserved factors NOT included in the propensity score
Piped Water: Impacts
• Disease prevalence among those with piped water would be 21% higher without it
• Gains from piped water are exploited more by wealthier households and households with more educated mothers
• There is even a counterintuitive result for low-income, illiterate HH: piped water is associated with higher diarrhea prevalence
Design: Randomization
  When to use: Whenever feasible; when there is variation at the individual or community level
  Advantages: Gold standard; most powerful
  Disadvantages: Not always feasible; not always ethical

Design: Randomized Encouragement Design
  When to use: When an intervention is universally implemented
  Advantages: Provides exogenous variation for a subset of beneficiaries
  Disadvantages: Only looks at a sub-group of the sample; power of the encouragement design is only known ex post

Design: Regression Discontinuity
  When to use: If an intervention has a clear, sharp assignment rule
  Advantages: Project beneficiaries often must qualify through established criteria
  Disadvantages: Only looks at a sub-group of the sample; the assignment rule is often not implemented strictly in practice

Design: Difference-in-Differences
  When to use: If two groups are growing at similar rates; baseline and follow-up data are available
  Advantages: Eliminates fixed differences not related to treatment
  Disadvantages: Can be biased if trends change; ideally requires 2 pre-intervention periods of data

Design: Matching
  When to use: When other methods are not possible
  Advantages: Overcomes observed differences between treatment and comparison groups
  Disadvantages: Assumes no unobserved differences (often implausible)
