PS.Observational.SAS_Y.Duan

Application of Propensity Score
Matching in Observational
Studies Using SAS
Yinghui (Delian) Duan, M.Sc., Ph.D. candidate
Department of Community Medicine and Health Care,
University of Connecticut Health Center
Connecticut Institute for Clinical and Translational Science (CICATS)
Email: yduan@uchc.edu

Randomized Controlled Trials (RCTs)
› Treatment assignment is randomized
› Pre-treatment characteristics are balanced on average,
so there are no confounding effects
› Difference in post-treatment outcomes can
be attributed to treatment effects
› “Gold standard” to estimate the effects of
treatment, interventions, and exposures

Observational Studies
› Non-experimental
› Treatment assignment is not determined by
design
› Usually the “treated” and “untreated” differ
systematically in characteristics that can
affect the outcome of interest (i.e., confounders)
› Difficult to draw causal conclusions because of
confounders

Propensity Score
Method
A useful tool to control confounding
effects in observational studies

Propensity Score (PS)
› Defined by Rosenbaum & Rubin in 1983:
the probability of treatment assignment
conditional on observed baseline covariates
$PS_i = \Pr(\text{Treatment}_i = 1 \mid X_i)$
› A useful tool to remove confounding effects
and enhance causal inference in
observational studies

Estimating PS
› PS is most often estimated by a logistic
regression model
› Can also be estimated using other methods,
e.g., bagging or boosting, recursive
partitioning or tree-based methods, random
forests, and neural networks.
› No significant advantage over the logistic
regression model has been reported
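
To make the estimation step concrete, here is a minimal sketch in Python rather than SAS (the code shown on the original slides is not recoverable from this transcript). It fits a logistic regression by plain gradient ascent and returns each subject's predicted probability of treatment. The toy data, the two covariates (centered age and an acute MI flag), the function names, and the optimizer settings are all illustrative assumptions, not the example dataset.

```python
import math

def predict_ps(beta, row):
    """Predicted probability of treatment for one subject."""
    eta = beta[0] + sum(b * x for b, x in zip(beta[1:], row))
    return 1.0 / (1.0 + math.exp(-eta))

def fit_logistic(X, z, lr=0.5, n_iter=5000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    X: list of covariate rows (no intercept column); z: 0/1 treatment flags.
    Returns coefficients [intercept, b1, ..., bp].
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * (p + 1)
    for _ in range(n_iter):
        grad = [0.0] * (p + 1)
        for row, zi in zip(X, z):
            resid = zi - predict_ps(beta, row)
            grad[0] += resid
            for j, xj in enumerate(row):
                grad[j + 1] += resid * xj
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta

# Hypothetical toy data: [centered age in decades, acute MI (0/1)]
X = [[-0.5, 0], [-0.3, 0], [-0.3, 0], [0.2, 0], [0.2, 0],
     [0.4, 1], [0.4, 1], [0.6, 0], [-0.4, 1], [0.1, 0]]
z = [1, 1, 0, 0, 1, 0, 1, 0, 1, 0]   # 1 = treated, 0 = untreated

beta = fit_logistic(X, z)
ps = [predict_ps(beta, row) for row in X]   # one propensity score per subject
```

In practice a statistical package's logistic regression routine would replace `fit_logistic`; the point is only that the PS is the fitted probability from a model of treatment on baseline covariates.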

Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Study sample: patients who received PCI
› Treatments: usual care alone vs. usual care + a
blood thinner
› Baseline confounders: age, gender, height,
coronary stent placement, acute myocardial
infarction within 7 days, and diabetes
› Outcome: 6-month mortality (0 or 1)

Example dataset: 6-month Mortality after
Percutaneous Coronary Intervention (PCI)
› Table: Sample description
                      Usual care alone     Usual care + blood thinner
                      (N = 2,830)          (N = 2,332)
                      n        %           n        %                   p-value
Age (mean ± SD)       64.6 ± 4.2           62.0 ± 3.8                   <0.001
Female                938      33.1        760      32.6                0.673
Height (mean ± SD)    172.4 ± 10.2         171.6 ± 9.5                  0.002
Stent                 1,794    63.4        1,611    69.1                <0.001
Diabetes              659      23.3        438      18.8                <0.001
Acute MI              193      6.8         356      15.3                <0.001

Snapshot of output dataset “new_ps”

Remove Confounding
Effects using PS

Two Important Assumptions
› The assignment of treatment is independent
of potential outcomes conditional on the
observed baseline covariates
› Every subject has a nonzero probability of
receiving either treatment
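
Stated formally, the two assumptions can be written as below. The notation is an assumption introduced here for illustration: $Z_i$ is the treatment indicator, $X_i$ the observed baseline covariates, and $Y_i(1)$, $Y_i(0)$ the potential outcomes under treatment and no treatment.

```latex
% No unmeasured confounding (conditional independence)
(Y_i(1),\, Y_i(0)) \perp Z_i \mid X_i

% Positivity (overlap)
0 < \Pr(Z_i = 1 \mid X_i) < 1
```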

Four Methods
› PS matching – most widely used
› Stratification using PS

[Figure: treated (T) and untreated (U) subjects ordered by PS (axis from 0.3 to 0.9) and grouped into five strata (1 to 5), with regions labeled “Trimming”]

Four Methods
› PS matching – most widely used
› Stratification using PS
› Weighting adjustment, e.g., Inverse
probability of treatment weighting (IPTW)
using PS
› Covariate adjustment using PS – not
recommended
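
As a small illustration of the weighting approach, the sketch below computes IPTW weights in Python, assuming the usual definition: a treated subject is weighted by 1/PS and an untreated subject by 1/(1 − PS). The function name and the toy values are hypothetical.

```python
def iptw_weights(treated, ps):
    """Inverse probability of treatment weights.

    treated: list of 0/1 flags; ps: list of propensity scores in (0, 1).
    """
    weights = []
    for z, e in zip(treated, ps):
        if not 0.0 < e < 1.0:
            raise ValueError("propensity score must lie strictly between 0 and 1")
        weights.append(1.0 / e if z == 1 else 1.0 / (1.0 - e))
    return weights

treated = [1, 1, 0, 0]
ps = [0.8, 0.25, 0.5, 0.2]
weights = iptw_weights(treated, ps)
```

A treated subject with a low PS (here 0.25) receives a large weight, which is why extreme scores can make weighted estimates unstable.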

Propensity Score Matching
To form matched sets of
treated and untreated
subjects who share a similar
value of PS

Common Support
[Figure: frequency distributions of the propensity score (x-axis, 0 to 1) for untreated and treated subjects, with the region of common support marked]

Four Methods – Common Support
› PS matching –
› Stratification – only when used together with
trimming
› IPTW – does not explicitly examine common
support
› Covariate adjustment – does not explicitly
examine common support
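
One simple way to examine common support is the minimum/maximum comparison: keep only subjects whose PS lies between the larger of the two group minima and the smaller of the two group maxima. The Python sketch below assumes that rule; the function names and scores are hypothetical, and other definitions of the common-support region exist.

```python
def common_support(ps_treated, ps_untreated):
    """Return (lower, upper) bounds of the overlap of the two PS ranges."""
    lower = max(min(ps_treated), min(ps_untreated))
    upper = min(max(ps_treated), max(ps_untreated))
    if lower > upper:
        raise ValueError("the two groups have no region of common support")
    return lower, upper

def trim(ps_values, lower, upper):
    """Keep only scores inside the common-support region."""
    return [p for p in ps_values if lower <= p <= upper]

ps_treated = [0.35, 0.50, 0.62, 0.80, 0.90]
ps_untreated = [0.10, 0.30, 0.35, 0.55, 0.70]

lower, upper = common_support(ps_treated, ps_untreated)
kept_treated = trim(ps_treated, lower, upper)
kept_untreated = trim(ps_untreated, lower, upper)
```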

PS Matching
› Some decisions to be made:
› 1:1 or N:1 matching
› N:1 can improve efficiency, reduce
variance, but increase bias
› With or without replacement
› With-replacement may yield less bias,
but higher variance
› Which algorithm?

PSM Algorithms: Nearest-Neighbor
[Figure: treated (T) and untreated (U) subjects positioned along the PS axis (0.3 to 0.7), illustrating nearest-neighbor matching]
› Each treated subject will get a match,
even if it isn’t a very good one
› This becomes a problem when a
treated subject has no controls
with a similar PS
› If multiple untreated subjects have
the same PS value as the treated
subject, one is selected at random
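
The nearest-neighbor rule described above can be sketched as follows. This is a minimal Python illustration of 1:1 matching without replacement, processing treated subjects in the order given and breaking ties at random; the IDs, scores, and function name are hypothetical.

```python
import random

def nearest_neighbor_match(treated, untreated, seed=0):
    """1:1 nearest-neighbor matching without replacement.

    treated, untreated: dicts mapping subject id -> propensity score.
    Returns a dict mapping treated id -> matched untreated id.
    """
    rng = random.Random(seed)
    available = dict(untreated)
    matches = {}
    for t_id, t_ps in treated.items():
        if not available:
            break  # no untreated subjects left to match
        best = min(abs(t_ps - u_ps) for u_ps in available.values())
        ties = [u_id for u_id, u_ps in available.items()
                if abs(t_ps - u_ps) == best]
        u_id = rng.choice(ties)      # random selection among equally close controls
        matches[t_id] = u_id
        del available[u_id]          # without replacement
    return matches

treated = {"T1": 0.35, "T2": 0.45, "T3": 0.55}
untreated = {"U1": 0.30, "U2": 0.36, "U3": 0.55, "U4": 0.55, "U5": 0.66}
matches = nearest_neighbor_match(treated, untreated)
```

Note that T2 is matched to a control 0.10 away because nothing closer remains, which is exactly the weakness that a caliper addresses.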

PSM Algorithms: Match within Caliper
› Caliper: limit matches to be within
some range of PS values
› 0.2 of the standard deviation of
the logit of the PS (Austin, 2011)
› 0.25 or 0.5 of the PS standard
deviation
[Figure: treated (T) and untreated (U) subjects positioned along the PS axis (0.3 to 0.7), illustrating matching within a caliper]

PSM Algorithms: Greedy vs. Optimal
› Greedy: 112 is matched to 212 and 113 to 213
› Overall absolute distance = 0.01 + 0.03 = 0.04
Treated          Untreated
ID     PS        ID     PS
…      …         …      …
112    0.43      210    0.40
                 211    0.41
113    0.45      212    0.44
                 213    0.48
…      …         …      …

› Optimal: 112 is matched to 211 and 113 to 212
› Overall absolute distance = 0.02 + 0.01 = 0.03
Treated          Untreated
ID     PS        ID     PS
…      …         …      …
112    0.43      210    0.40
                 211    0.41
113    0.45      212    0.44
                 213    0.48
…      …         …      …

› The choice often makes little difference
› Both generate the same results when matching
with replacement
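
The two totals on this slide can be reproduced with a short Python sketch using the four listed untreated subjects. Greedy matching takes the closest available control for each treated subject in turn; the “optimal” version here is a brute-force search over all assignments, which is an assumption made for brevity (real implementations use assignment or network-flow algorithms).

```python
from itertools import permutations

treated = {112: 0.43, 113: 0.45}
untreated = {210: 0.40, 211: 0.41, 212: 0.44, 213: 0.48}

def total_distance(pairs):
    """Sum of absolute PS differences over (treated id, untreated id) pairs."""
    return sum(abs(treated[t] - untreated[u]) for t, u in pairs)

def greedy_match():
    """Match each treated subject, in order, to its closest remaining control."""
    available = set(untreated)
    pairs = []
    for t in treated:
        u = min(available, key=lambda uid: abs(treated[t] - untreated[uid]))
        pairs.append((t, u))
        available.remove(u)
    return pairs

def optimal_match():
    """Choose the assignment that minimizes the overall distance (brute force)."""
    t_ids = list(treated)
    best = min(permutations(untreated, len(t_ids)),
               key=lambda us: total_distance(zip(t_ids, us)))
    return list(zip(t_ids, best))

greedy_pairs = greedy_match()
optimal_pairs = optimal_match()
greedy_total = round(total_distance(greedy_pairs), 2)
optimal_total = round(total_distance(optimal_pairs), 2)
```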

PSM Example
A macro performing N:1
match on propensity score

› N:1 match
› Matching iterations are from 8-digit to
1-digit
› E.g., in the 3rd iteration, 6-digit matching,
PS = 0.12345698 is matched with PS =
0.12345605
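
The digit-by-digit idea can be sketched in Python as below. It assumes that “k-digit matching” means two scores agree after truncation to k decimal places, which is consistent with the example above (rounding would not match those two values at 6 digits); how the macro itself compares digits is not shown in this transcript, and the function names are hypothetical.

```python
from decimal import Decimal, ROUND_DOWN

def truncate(ps, digits):
    """Truncate (not round) a propensity score to the given number of decimals."""
    quantum = Decimal(f"1e-{digits}")
    return Decimal(str(ps)).quantize(quantum, rounding=ROUND_DOWN)

def first_matching_precision(ps_a, ps_b, max_digits=8):
    """Return the highest precision (8 down to 1 digits) at which two scores agree.

    Returns None if the scores differ even at 1 digit.
    """
    for digits in range(max_digits, 0, -1):
        if truncate(ps_a, digits) == truncate(ps_b, digits):
            return digits
    return None

precision = first_matching_precision(0.12345698, 0.12345605)
no_match = first_matching_precision(0.5, 0.9)
```

Lowering the precision step by step means the closest pairs are formed first and looser matches are accepted only for subjects still unmatched.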

› All macro variables are required except and
SiteN
› Lib has to be specified even if it’s “work”
(otherwise error will occur)
› If SiteN is specified, then subjects will be
matched within each site

› These statements can be modified or
removed to change matching precision

Run Matching for the Example
Dataset

Examine balance after PS Matching
› P-values can be misleading, especially in
large samples and with many confounders
› Standardized mean difference < 10% (commonly taken to indicate adequate balance)
                      Usual care alone     Usual care + blood thinner             Standardized
                      (N = 1,819)          (N = 1,819)                            mean
                      n        %           n        %                   p-value   difference
Age (mean ± SD)       62.7 ± 3.6           62.8 ± 3.5                   0.818     0.76
Female                599      32.9        615      33.8                0.574     1.87
Height (mean ± SD)    171.9 ± 10.2         171.8 ± 9.5                  0.7511    1.02
Stent                 1,203    66.1        1,214    66.7                0.699     1.28
Diabetes              373      20.5        371      20.4                0.935     0.27
Acute MI              174      9.6         182      10.0                0.655     1.48

Standardized Mean Difference
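› For continuous variables:
$d = 100 \times \dfrac{\bar{x}_{T} - \bar{x}_{U}}{\sqrt{(s_{T}^{2} + s_{U}^{2})/2}}$
› For categorical (binary) variables:
$d = 100 \times \dfrac{\hat{p}_{T} - \hat{p}_{U}}{\sqrt{[\hat{p}_{T}(1-\hat{p}_{T}) + \hat{p}_{U}(1-\hat{p}_{U})]/2}}$
where T and U denote the treated and untreated groups, $\bar{x}$ and $s$ are the group mean and standard deviation, and $\hat{p}$ is the group proportion
› The ± sign does not matter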
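
As a worked check, the Python sketch below computes the standardized mean difference, assuming the usual definition: the difference in means (or proportions) divided by the square root of the average of the two group variances, times 100. Applied to the Female row of the matched-sample table (615 of 1,819 vs. 599 of 1,819), it gives about 1.87, in line with the reported value. Function names are hypothetical.

```python
import math

def smd_continuous(mean_t, sd_t, mean_u, sd_u):
    """Standardized mean difference (in %) for a continuous variable."""
    pooled_sd = math.sqrt((sd_t ** 2 + sd_u ** 2) / 2.0)
    return 100.0 * abs(mean_t - mean_u) / pooled_sd

def smd_binary(p_t, p_u):
    """Standardized mean difference (in %) for a binary variable."""
    pooled_sd = math.sqrt((p_t * (1 - p_t) + p_u * (1 - p_u)) / 2.0)
    return 100.0 * abs(p_t - p_u) / pooled_sd

# Counts taken from the matched-sample table (N = 1,819 per group)
female = smd_binary(615 / 1819, 599 / 1819)
stent = smd_binary(1214 / 1819, 1203 / 1819)
```

The continuous version cannot reproduce the table exactly from the rounded means and SDs shown, so only the binary rows are checked here.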

Another Example
Matching using specified
caliper = 0.2 of SD of logit
of PS

Calculate Logit of PS
Calculate SD of Logit of PS
0.2*SD = 0.158
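
The caliper calculation can be sketched in Python as follows: take the logit of each PS, compute the standard deviation of the logits, and multiply by 0.2. The toy scores and function names are hypothetical, and the sample standard deviation is an assumption; the 0.158 on the slide comes from the example dataset and is not reproduced here.

```python
import math
import statistics

def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))

def caliper_width(ps_values, multiplier=0.2):
    """Caliper = multiplier x sample SD of the logit of the PS."""
    logits = [logit(p) for p in ps_values]
    return multiplier * statistics.stdev(logits)

def within_caliper(ps_a, ps_b, width):
    """True if two subjects are close enough on the logit scale to be matched."""
    return abs(logit(ps_a) - logit(ps_b)) <= width

ps_values = [0.2, 0.35, 0.5, 0.65, 0.8]
width = caliper_width(ps_values)
close_pair = within_caliper(0.50, 0.55, width)
far_pair = within_caliper(0.50, 0.60, width)
```

A matching routine would then accept a candidate control only if `within_caliper` is true, leaving the treated subject unmatched otherwise.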

Estimating Treatment
Effect in Matched Sample

Estimating Treatment Effects
› Run the same outcome analyses you would
have done on the original data
› Doubly robust: regression adjustment
for confounders can reduce residual
confounding and increase precision
› If matching was done with replacement,
weights are needed to reflect the fact that
some controls are used more than once
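
For the with-replacement case, one simple weighting scheme is to weight each control by the number of times it was used as a match. The Python sketch below applies that idea to a risk difference for a binary outcome; the pairs, outcomes, and function name are hypothetical, each treated subject is assumed to appear in exactly one pair, and no variance estimate is attempted.

```python
from collections import Counter

def matched_risk_difference(pairs, outcome):
    """Risk difference (treated minus control) in a matched sample.

    pairs: list of (treated id, control id); a control may appear more than once.
    outcome: dict mapping subject id -> 0/1 outcome.
    Returns (risk difference, control weights).
    """
    treated_risk = sum(outcome[t] for t, _ in pairs) / len(pairs)
    control_weights = Counter(c for _, c in pairs)   # times each control was used
    weighted_events = sum(w * outcome[c] for c, w in control_weights.items())
    control_risk = weighted_events / sum(control_weights.values())
    return treated_risk - control_risk, control_weights

pairs = [("T1", "C1"), ("T2", "C1"), ("T3", "C2"), ("T4", "C3")]
outcome = {"T1": 0, "T2": 1, "T3": 0, "T4": 0, "C1": 1, "C2": 0, "C3": 1}

risk_diff, control_weights = matched_risk_difference(pairs, outcome)
```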

› PS model:
› Non-parsimonious model to estimate PS
› Include covariates that are associated
with outcome, or with both outcome and
treatment; do NOT include covariates
that are strongly correlated with
treatment, but not directly associated
with outcome
› Can include interaction terms and
higher-order terms to improve PS
estimation and matching

› Sample size
› At least 1,000 – 1,500 (Shadish 2013)
› Missing data
› List-wise deletion

Thanks!
Questions?
Comments?
Further questions: yduan@uchc.edu

References:
Overview/tutorial of Propensity Score method:
1. Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the
propensity score in observational studies for causal
effects. Biometrika, 70(1), 41-55.
2. Austin, P. C. (2011). An Introduction to Propensity Score Methods for
Reducing the Effects of Confounding in Observational Studies.
Multivariate Behavioral Research, 46(3), 399–424.
http://doi.org/10.1080/00273171.2011.568786
3. Stuart, E. A. (2010). Matching methods for causal inference: A review
and a look forward. Statistical Science : A Review Journal of the
Institute of Mathematical Statistics, 25(1), 1–21.
http://doi.org/10.1214/09-STS313
4. Caliendo, M., & Kopeinig, S. (2008). Some practical guidance for the
implementation of propensity score matching. Journal of Economic
Surveys, 22(1), 31-72.

References (cont.):
Others:
1. Brookhart, M. A., Schneeweiss, S., Rothman, K. J., Glynn, R. J., Avorn, J., &
Stürmer, T. (2006). Variable selection for propensity score
models. American Journal of Epidemiology, 163(12), 1149-1156.
2. Shadish, W. R. (2013). Propensity score analysis: promise, reality and
irrational exuberance. Journal of Experimental Criminology, 9(2), 129-
144.
3. Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as
nonparametric preprocessing for reducing model dependence in
parametric causal inference. Political analysis, 15(3), 199-236.

References (cont.):
Materials from Other Presentations:
1. Stuart, E. (2011). “The why, when, and how of propensity score methods
for estimating causal effects” at Society for Prevention Research, May
31, 2011. Slides:
http://www.preventionresearch.org/wp-content/uploads/2011/07/SPR-Propensity-pc-workshop-slides.pdf
2. VanEseltine, M. (2013). “Introduction to propensity score analysis” at
CFDR Summer Methods Seminar, June 26, 2013. Slides:
https://www.bgsu.edu/content/dam/BGSU/college-of-arts-and-sciences/center-for-family-and-demographic-research/documents/Workshops/2013%20-workshop-propensity-score-analysis.pdf

References (cont.):
Macros for propensity score matching:
1. Parsons, L. (2004, May). Performing a 1: N case-control match on
propensity score. In Proceedings of the 29th Annual SAS Users Group
international conference (pp. 165-29).
2. Fraeman, K. H. (2010). An introduction to implementing propensity score
matching with SAS®. Bethesda, MD: United BioSource Corporation.
3. Feng, W. W., Jun, Y., & Xu, R. (2006). A method/macro based on
propensity score and Mahalanobis distance to reduce bias in treatment
comparison in observational study. In SAS PharmaSUG 2006
Conference.
4. Coca-Perraillon, M. (2007, April). Local and global optimal propensity
score matching. In SAS Global Forum (Vol. 185, pp. 1-9).
