Evaluating a Propensity Score Adjustment for Combining Probability and Non-Probability Samples in a National Survey
FedCASIC
March 5, 2015
Kurt R. Peters, PhD
Heather Driscoll, MS
Pedro Saavedra, PhD
2
Outline
▪ 2012 Canadian Nature Survey
– Research questions
– Survey design
▪ Weighting Methodology
▪ Results (Comparison of weighted estimates)
▪ Conclusions
3
Research Questions
▪ National population survey of Canadian adults
2012 CANADIAN NATURE SURVEY
– Connection to and awareness of nature
– Nature-based activities, participation, and expenditures
– Human/wildlife conflict
4
Survey Design
▪ Complex sample design with hybrid probability and non-probability samples
▪ Multi-mode administration (Paper + Web)
▪ For probability sample (nationally):
– 76,363 addresses sampled from ABS frame
– 15,207 completes
– 20% response rate (lower bound)
▪ For non-probability samples (nationally):
– 8,897 completes
5
Survey Design
[Figure: diagram marked with the sample-type codes P, W, and C]
SAMPLE TYPES
P = Probability (ABS)
W = Non-Probability (Web Panel)
C = Non-Probability (Community)

PROVINCE   ABS RESPONSES   WEB PANEL RESPONSES
AB         1,511           818
ON         1,011           4,584
QC         1,029           2,986
TOTAL      3,551           8,388
6
Survey Design
▪ Address-Based Sample of Canadian Adults
– Drawn from Canada Post address file
– Stratification:
• Province/Territory (all except Nunavut)
• Urban/Rural address (Canada Post frame variable)
– Mode of Administration:
• Paper, with Web option
– Within-household selection by the Last Birthday Method
– Targeted 1,000 completes in each province and territory
7
Survey Design
▪ Web Panel Sample
– Canadian adults recruited via social media and websites
– Recruited to match key demographic distributions (ethnicity, age, education, income)
– In each province/territory, fielded until the target number of completes was reached
8
Weighting Methodology
▪ Focus of current research is evaluation of weighting to combine the probability (ABS) and non-probability (Web panel) datasets for analysis
[Figure: Age (Unweighted). Percent by age group (18-25, 26-35, 36-45, 46-55, 56-65, 66-75, 76-100) for Population, ABS, and Panel]
[Figure: Sex (Unweighted). Percent male and female for Population, ABS, and Panel]
9
Weighting Methodology
▪ An ABS analytic weight was developed for ABS respondents
– Standard probability-based selection weight adjusted for non-response and post-stratified to Census totals:
• Province x Age x Sex
• Province x Urban/Rural
• Aboriginal/Non-Aboriginal
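The three sets of Census control totals listed above can be matched one margin at a time; the editor's notes for this deck describe the step as raking. A minimal sketch of that idea follows. The record layout, the variable names, the control totals, and the fixed number of passes are all assumptions made for illustration, not the authors' production code.

```python
def rake(records, base_weights, margins, n_passes=25):
    """Iteratively scale weights so weighted totals match each margin's controls.

    records      : list of dicts, one per respondent (hypothetical layout)
    base_weights : starting weights, e.g. non-response-adjusted selection weights
    margins      : {variable: {category: control total}} (illustrative totals)
    """
    weights = list(base_weights)
    for _ in range(n_passes):
        for variable, controls in margins.items():
            # Current weighted total in each category of this variable
            totals = {}
            for rec, w in zip(records, weights):
                totals[rec[variable]] = totals.get(rec[variable], 0.0) + w
            # Scale every case so the category totals hit the control totals
            weights = [
                w * controls[rec[variable]] / totals[rec[variable]]
                for rec, w in zip(records, weights)
            ]
    return weights


# Toy example: four respondents, two margins (all values are made up).
respondents = [
    {"sex": "M", "urban": "Urban"},
    {"sex": "F", "urban": "Urban"},
    {"sex": "M", "urban": "Rural"},
    {"sex": "F", "urban": "Rural"},
]
controls = {
    "sex": {"M": 48.0, "F": 52.0},
    "urban": {"Urban": 80.0, "Rural": 20.0},
}
raked = rake(respondents, [1.0, 1.0, 1.0, 1.0], controls)
```

In practice a convergence check on the change in weights would usually replace the fixed number of passes.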
10
Weighting Methodology
▪ The following approach was explored for combining the ABS and Panel respondents into a single weighted dataset:
1. Estimate probability of observation in Panel (vs. Population)
2. Score all (Panel and ABS) cases to assign a probability of observation under Panel design
3. Assign probability of observation under ABS design to Panel cases
4. Combine ABS and Panel observation probabilities to compute combined weight
11
Weighting Methodology
▪ Estimate probability of observation in Panel (vs. Population) using weighted logistic regression
– Outcome = Observation in Panel (vs. Population)
• P(Observation) = P(Selection) * P(Response)
– Weights:
• For ABS cases, weight = ABS analytic weight (non-response-adjusted and post-stratified to population)
• For Panel cases, weight = 1
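The weighting scheme above stacks Panel cases (outcome 1, weight 1) on top of ABS cases (outcome 0, weighted up to stand in for the population) and fits a logistic regression to the stacked file. The sketch below illustrates that setup with a single binary predictor and a hand-written Newton-Raphson fit. The toy data, the weight of 100 per ABS case, and the function name are hypothetical; the study's model used many more predictors.

```python
import math


def fit_weighted_logit(x, y, w, n_iter=30):
    """Newton-Raphson fit of logit P(y=1) = b0 + b1*x with case weights w."""
    b0, b1 = 0.0, 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi, wi in zip(x, y, w):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            resid = wi * (yi - p)        # weighted score contribution
            info = wi * p * (1.0 - p)    # weighted information contribution
            g0 += resid
            g1 += resid * xi
            h00 += info
            h01 += info * xi
            h11 += info * xi * xi
        # Solve the 2x2 Newton system by hand
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return b0, b1


# Made-up stacked file: x = 1 for a younger respondent, 0 for an older one.
abs_x = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # ABS cases stand in for the population
panel_x = [1] * 8 + [0] * 3              # Panel cases

x = abs_x + panel_x
y = [0] * len(abs_x) + [1] * len(panel_x)        # 1 = observed in Panel
w = [100.0] * len(abs_x) + [1.0] * len(panel_x)  # ABS analytic weight vs. 1

b0, b1 = fit_weighted_logit(x, y, w)
odds_ratio = math.exp(b1)
p_panel_young = 1.0 / (1.0 + math.exp(-(b0 + b1)))  # score for a younger case
```

Scoring any case, ABS or Panel, then amounts to plugging its predictors into the fitted equation, as in the last line.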
12
Weighting Methodology
▪ Estimate probability of observation in Panel (vs. Population) using weighted logistic regression
– Predictors:

Effect                     Comparison              Odds Ratio
Province                   AB vs. QC               0.7
                           ON vs. QC               1.1
Age                        18-25 vs. 76-100        7.9
                           26-35 vs. 76-100        8.7
                           36-45 vs. 76-100        7.7
                           46-55 vs. 76-100        6.4
                           56-65 vs. 76-100        7.3
                           66-75 vs. 76-100        4.3
Sex                        Female vs. Male         1.2
Urbanicity                 Urban vs. Rural         1.2
Nature-related Profession  No vs. Yes              0.9
Aboriginal                 No vs. Yes              1.1
Immigrant                  No vs. Yes              1.3
Education (Highest)        Elementary vs. Other    0.4
                           Some HS vs. Other       1.4
                           HS vs. Other            2.4
                           2-yr College vs. Other  1.8
                           Bachelor’s vs. Other    1.2
                           Master’s vs. Other      1.1
                           Doctorate vs. Other     1.2
HH Income                                          0.9

ns = not significant (six odds ratios are flagged ns on the slide; the transcript does not show which)
– Base Model: $R^2 = .11$
– Full Model: $R^2 = .19$
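Each odds ratio listed above is the exponential of a fitted logistic-regression coefficient. As a worked reading of one row (a sketch of the standard relationship, not an additional result from the study):

```latex
\mathrm{OR} = e^{\beta}
\quad\Longrightarrow\quad
\beta_{18\text{-}25\ \mathrm{vs.}\ 76\text{-}100} = \ln(7.9) \approx 2.07
```

When the probability of observation is very small, odds and probability are nearly equal, so an odds ratio of 7.9 corresponds to roughly 7.9 times the probability of being observed in the Panel relative to the reference group, holding the other predictors fixed.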
13
Weighting Methodology
▪ Score all (Panel and ABS) cases to assign a probability of observation in Panel
• Mean estimated probability of observation under Panel design:
[Figure: six panels showing mean estimated probability (axis from 0.0000 to 0.0008) by Sex (Female, Male), Age, Urban/Rural, Income, Education, and Nature-Related Profession (No, Yes); one panel is marked ns]
14
Weighting Methodology
▪ Assign probability of observation under ABS design to Panel cases
– Probability of observation under ABS design computed as inverse of post-stratified ABS analytic weight
– Within post-stratification classes, same ABS probability was assigned to Panel respondents
• This assumes that ABS and Panel cases within these classes have the same probability of observation under ABS design
– Result is that all cases in combined sample have a (true or estimated) probability of observation under both the ABS and Panel designs

Sample Source   P(Observation) under ABS design                              P(Observation) under Panel design
ABS             Inverse of post-stratified, NR-adjusted ABS sampling weight  Estimated Panel probability
Panel           Matched by post-stratification class                         Estimated Panel probability
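The class-matching step above can be sketched as a lookup keyed on the post-stratification class. The slide does not say how a single ABS probability per class was derived, so this example assumes the inverse of the mean ABS analytic weight within the class; the field names `ps_class` and `abs_weight` and all values are hypothetical.

```python
from collections import defaultdict


def abs_probability_by_class(abs_cases):
    """Return {class: probability of observation under the ABS design}.

    Assumption for this sketch: one probability per class, taken as the
    inverse of the mean ABS analytic weight in that class.
    """
    weight_sum = defaultdict(float)
    count = defaultdict(int)
    for case in abs_cases:
        weight_sum[case["ps_class"]] += case["abs_weight"]
        count[case["ps_class"]] += 1
    return {c: count[c] / weight_sum[c] for c in weight_sum}


def assign_to_panel(panel_cases, class_probability):
    """Copy the class-level ABS probability onto each Panel case."""
    return [
        {**case, "p_abs": class_probability[case["ps_class"]]}
        for case in panel_cases
    ]


# Made-up cases for illustration only.
abs_cases = [
    {"ps_class": "AB|18-25|F", "abs_weight": 400.0},
    {"ps_class": "AB|18-25|F", "abs_weight": 600.0},
    {"ps_class": "ON|66-75|M", "abs_weight": 250.0},
]
panel_cases = [
    {"id": 1, "ps_class": "AB|18-25|F"},
    {"id": 2, "ps_class": "ON|66-75|M"},
]

lookup = abs_probability_by_class(abs_cases)
panel_scored = assign_to_panel(panel_cases, lookup)
```

A Panel case whose class has no ABS respondents would raise a `KeyError` here; how such cases were handled is not stated in the deck.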
15
Weighting Methodology
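▪ Combine ABS and Panel probabilities to compute combined weight
– $p_{\mathrm{ABS} \cup \mathrm{Panel}} = p_{\mathrm{ABS}} + p_{\mathrm{Panel}} - p_{\mathrm{ABS}} \cdot p_{\mathrm{Panel}}$
– $w_{\mathrm{combined}} = 1 / p_{\mathrm{ABS} \cup \mathrm{Panel}}$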
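The two formulas above translate directly into a few lines of code. This is a minimal sketch with a hypothetical function name and made-up input probabilities. Subtracting the product term is the inclusion-exclusion rule under the assumption that observation under the two designs is independent.

```python
def combined_weight(p_abs, p_panel):
    """Inverse of the probability of being observed under either design."""
    # Probability of observation in ABS or Panel, assuming independence
    p_union = p_abs + p_panel - p_abs * p_panel
    return 1.0 / p_union


# Made-up probabilities for one respondent.
w_example = combined_weight(p_abs=0.002, p_panel=0.0005)
```

Because the union probability is at least as large as either input, the combined weight is never larger than the weight a case would receive under a single design.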
16
Results
▪ Demographics
[Figure: Age. Percent by age group (18-25 through 76-100) for Population, ABS (Unweighted), Panel (Unweighted), and Combined (Weighted)]
[Figure: Sex. Percent male and female for Population, ABS (Unweighted), Panel (Unweighted), and Combined (Weighted)]
17
Results
▪ Demographics
[Figure: Percent (0% to 80%) with Education > HS and with HH Income > $50,000, for ABS (ABS Weight), Combined (Combined Weight), Panel (1/p(Panel)), and Panel (Unweighted)]
18
Results
▪ Key Survey Outcomes
– MAD of Panel from ABS population estimates is 10% lower after weighting, and ~40% lower with combined weighted sample
[Figure: Percent (0% to 100%) for each outcome listed below, for ABS (ABS Weight), Combined (Combined Weight), Panel (1/p(Panel)), and Panel (Unweighted)]
• Nature-related profession
• Chose where to live in part to have access to nature
• Chose to spend more time outdoors in the last year to experience nature
• Aware of the concept of species at risk
• Aware of the concept of biodiversity
• Aware of the concept of ecosystem services
• Participated in some form of nature-based recreation
• Participated in fishing
• Spent >$40 in donations and membership dues to nature organizations
• Experienced a threat from wild animals
• Experienced damage to personal property caused by wild animals
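The MAD comparison above can be reproduced in outline as follows. The slide does not expand the acronym, so this sketch assumes it means the mean absolute deviation, across outcomes, between a set of estimates and the ABS benchmark. The outcome keys and percentages are made up and are not survey results.

```python
def mean_absolute_deviation(estimates, benchmarks):
    """Average absolute gap, in percentage points, across shared outcomes."""
    gaps = [abs(estimates[k] - benchmarks[k]) for k in benchmarks]
    return sum(gaps) / len(gaps)


# Illustrative percentages only (hypothetical outcomes a, b, c).
abs_benchmark = {"outcome_a": 20.0, "outcome_b": 70.0, "outcome_c": 85.0}
panel_unweighted = {"outcome_a": 26.0, "outcome_b": 64.0, "outcome_c": 79.0}
combined_weighted = {"outcome_a": 23.0, "outcome_b": 67.0, "outcome_c": 82.0}

mad_unweighted = mean_absolute_deviation(panel_unweighted, abs_benchmark)
mad_combined = mean_absolute_deviation(combined_weighted, abs_benchmark)

# Relative reduction in MAD, the kind of figure quoted on the slide
relative_reduction = 1.0 - mad_combined / mad_unweighted
```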
19
Conclusions
▪ Unweighted panel data differed from benchmarks
– Demographics: More female, younger, lower income, less educated, more urban
– Outcomes:
• Accurate (±2 points):
– Nature-related profession
– Aware of the concept of species at risk
– Experienced a threat from wild animals
– Experienced damage to personal property caused by wild animals
• Overestimates (>2 points over):
– Chose where to live in part to have access to nature
– Participated in fishing
• Underestimates (>2 points under):
– Chose to spend more time outdoors in the last year to experience nature
– Aware of the concept of biodiversity
– Aware of the concept of ecosystem services
– Participated in some form of nature-based recreation
– Spent >$40 in donations and membership dues to nature organizations
20
Conclusions
▪ Propensity score model was used to estimate probability of being observed in the panel compared to general population
– Model explained only some of the variance ($R^2 = .19$), leaving room for improvement
– Nevertheless, estimated probability of observation
• Brought panel demographics in line with population
• Reduced bias in panel estimates for key survey outcomes
• Made possible the combination of probability (ABS) and non-probability (Panel) data into a
single, weighted dataset
21
Conclusions
▪ Next steps…
– Building a more comprehensive model of P(Observation) under panel design
– Can statistical matching (“data fusion”) exploit differences between ABS and Panel
respondents to increase efficiency of data collection?
• For example, using ABS to estimate prevalence and Panel to collect detailed per-person data
(such as expenditures, travel days, etc.)
• May lower administration cost and respondent burden
– Does reduction in bias via panel weight come at the price of increased variance? How
accurate are estimates of sampling error from modeled probabilities of selection?
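One common first look at the variance question above is Kish's approximate design effect due to unequal weighting. It is not part of the analysis presented in this deck and is swapped in here purely as an illustration, with hypothetical function names and made-up weights.

```python
def kish_design_effect(weights):
    """Kish's approximation: n * sum(w^2) / (sum(w))^2."""
    n = len(weights)
    total = sum(weights)
    total_sq = sum(w * w for w in weights)
    return n * total_sq / (total * total)


def effective_sample_size(weights):
    """Nominal sample size deflated by the weighting design effect."""
    return len(weights) / kish_design_effect(weights)


# Made-up weights: equal weights give a design effect of 1.
deff_equal = kish_design_effect([2.0, 2.0, 2.0, 2.0])
deff_unequal = kish_design_effect([1.0, 1.0, 1.0, 3.0])
n_eff_unequal = effective_sample_size([1.0, 1.0, 1.0, 3.0])
```

This approximation reflects only the variability of the weights. It does not capture the extra uncertainty from estimating the Panel probabilities with a model, which is the second half of the question above.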
Thank You!
icfi.com/SurveyResearch
Contact: James Dayton James.Dayton@icfi.com



Editor's Notes

  • #5: Response rates (reported in Table 1) are calculated using the standard established by the American Association for Public Opinion Research (AAPOR) for mail surveys. Response rates can be calculated for the address-based sample only; the Web and opt-in samples did not have an explicit sample draw to serve as a denominator for response rates. Of the addresses sampled, 15,207 resulted in a Completed Interview, and 61,156 resulted in an Eligible Non-Interview. Determining eligibility is difficult when conducting a mail survey. AAPOR methodology allows the use of one of several methods to account for records of unknown eligibility. For the 2012 Canadian Nature Survey, a conservative assumption was made that addresses for which eligibility could not be determined (N=4,847), for example because of an undeliverable address, would be counted as Eligible Non-Interviews. ICF determined this assumption to be sound, as the only eligibility criterion for addresses was that they reach a household in Canada. Thus, the response rates in Table 1 are simply equal to the number of completed interviews divided by the total sample draw. Given the assumption about treatment of addresses with unknown eligibility, figures in Table 1 represent a lower-bound estimate of the response rate.
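
The lower-bound response rate described in this note is a single division. A minimal sketch of the arithmetic, using only the counts quoted in the note (the variable names are illustrative):

```python
completed_interviews = 15_207
# Together with the completes this sums to the full draw, so the
# unknown-eligibility addresses are already counted in this figure.
eligible_non_interviews = 61_156

total_sample_draw = completed_interviews + eligible_non_interviews
response_rate = completed_interviews / total_sample_draw

print(f"{total_sample_draw:,} addresses sampled; response rate = {response_rate:.1%}")
```

The total matches the 76,363 sampled addresses reported on the Survey Design slide, and the rate rounds to the 20% quoted there.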
  • #9: Sex: ABS is very close to the population; Panel overrepresents women. Age: ABS is skewed older; Panel is skewed younger.
  • #12: Something like: what is the probability of observing someone in the panel as opposed to the general population, given these characteristics (age, sex, etc.)? P(Obs|Panel) = P(Joined Panel) * P(Drawn in Survey) * P(Responded). P(Obs|ABS) = P(In Frame ~ 100%) * P(Selected) * P(Responded).
  • #13: Note: Predictors were all mean-imputed for this model. Re: Age: all age groups are overrepresented in Panel vs. ABS when compared with the oldest age group. This means that the panel skews younger, but note that the ORs become less extreme with age, meaning the age bias diminishes when comparing older to oldest.
  • #14: Mean p_WEB by sex, age, urban (overall), to show that p_WEB is higher for different groups, i.e., probability of being observed via Panel is associated with these characteristics. These scores are relative to ABS; e.g., someone with HS education is much more likely to come from WEB (in terms of P(Observation)) compared to ABS.
  • #17: Basically just shows that raking was effective at matching sex and age from population control totals
  • #18: More interesting because income and education were not included in raking. Would be nice to include education, income, urban here as well, after grabbing relevant data from Census.
  • #19: General trend is that the combined weight moves the combined data away from the unweighted Panel estimates and closer in line with the ABS estimates. But awareness of biodiversity doesn’t fit this pattern, along with nature-related profession and damage to personal property. Framing: ABS estimates correct for non-response but not completely; still have to treat this as the benchmark for survey outcomes. Benefits of Web survey = different motivations for doing the survey (e.g., maybe counters response bias on the ABS side due to interest in topic?); maybe less social desirability on Web (at least compared to interviewer-administered surveys).