Evaluating Linked Survey and Administrative Data for Policy Research Joint Statistical Meetings July 31, 2007 Michael Davern, Ph.D. Assistant Professor, Research Director SHADAC, Health Policy & Management University of Minnesota Supported by a grant from The Robert Wood Johnson Foundation
Start with conclusions There is great potential for linked survey and administrative data files Survey microdata are in the public domain Strengths and (especially) limitations are well known Administrative data are the standard on programmatic issues, but… Because these data are not in the public domain, it is imperative that limitations be thoroughly investigated Currently this can only be done by those entrusted with the data. Documentation and research on linked files (e.g., metadata and supporting file documentation) should be put into the public domain even if the microdata are not
Great potential for linked data Potential for linked data: Improving accuracy of survey data collection of enrollment data (Medicaid, SSI, etc.) Improve survey sample frames (Census MAF) Using linked data to create small area estimates Improve administrative data race/ethnicity information Great benefit to using information in imputation models and editing Improve policy simulations by allowing researchers to better engage errors and appropriately model them
Survey data have well-known limitations Survey data concerns Sample frame coverage error Sampling error and variance estimation Non-response error (item and unit) Measurement error Data processing, imputation/editing Need for better documentation Timeliness and data access How do administrative data files and linked data files compare?
How do administrative data files and linked files compare? Sample frame and frame coverage The administrative data cover the entire enrolled/filing population For linked data it is important to understand how survey sample frame maps onto the administrative data universe Differing conceptions of “institutionalized” group quarter population Sampling error Not a problem for administrative data because it is a complete list of the enrolled population Linked data files carry with them the sample design of the survey data
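The point that linked files carry the survey's sample design can be illustrated with a minimal sketch. The records, the `weight` field, and the `reports_medicaid` flag below are hypothetical and are not taken from the CPS or MSIS; the sketch only shows that an estimate from a linked file should use the survey weights rather than a raw count of linked records. Variance estimation, which would also need the design's strata and clusters, is left out.

```python
def weighted_share(records, flag, weight="weight"):
    """Weighted share of records where `flag` is true, using survey weights."""
    total = sum(r[weight] for r in records)
    if total == 0:
        raise ValueError("sum of weights is zero")
    return sum(r[weight] for r in records if r[flag]) / total


# Hypothetical linked records: each keeps the survey weight it had before linking.
linked = [
    {"weight": 1500.0, "reports_medicaid": True},
    {"weight": 500.0, "reports_medicaid": False},
    {"weight": 2000.0, "reports_medicaid": True},
]

unweighted = sum(r["reports_medicaid"] for r in linked) / len(linked)
weighted = weighted_share(linked, "reports_medicaid")
print(f"unweighted={unweighted:.3f} weighted={weighted:.3f}")
```

In this toy input the unweighted and weighted shares differ, which is the reason the survey design cannot be dropped once the files are linked.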
Item non-response and missing data Non-response error (or missing data) Item non-response can be a major issue in administrative data and when linking Important data for research can be missing (e.g., age, address, program codes, or race/ethnicity) Some of these data can be missing systematically E.g., VA study Linking data can also be missing systematically Can be a large source of sample loss when matching survey and administrative data An example of linking the Medicaid data to the Current Population Survey
Validated Records in MSIS
Building a common ‘linkable universe’ Institutional Group quarters, dead,  People born, non-Medicaid eligible population, Medicaid eligible but unenrolled Medicaid enrollees in linkable CPS and MSIS universe CPS Sample Frame MSIS Medicaid Enrollees
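A rough sketch of the "linkable universe" idea follows, under stated assumptions: the field names (`link_key`, `institutional`, `deceased`) and the exclusion rules are hypothetical stand-ins, not the actual CPS-MSIS procedure. It drops administrative records that fall outside the survey frame, drops survey records with no linking key, matches the rest, and reports a count at each stage so that sample loss is visible.

```python
def build_linkable_universe(admin_records, survey_records):
    """Count records at each stage of a simple key-based link."""
    # Administrative records the survey frame could have covered (assumed rules).
    admin_in_scope = {
        r["link_key"]
        for r in admin_records
        if r["link_key"] is not None
        and not r["institutional"]
        and not r["deceased"]
    }
    # Survey records that carry a usable linking key.
    survey_with_key = [r for r in survey_records if r["link_key"] is not None]
    # Survey records found among in-scope administrative records.
    linked = [r for r in survey_with_key if r["link_key"] in admin_in_scope]
    return {
        "admin_total": len(admin_records),
        "admin_in_scope": len(admin_in_scope),
        "survey_total": len(survey_records),
        "survey_with_key": len(survey_with_key),
        "linked": len(linked),
    }


# Hypothetical toy inputs.
admin = [
    {"link_key": "k1", "institutional": False, "deceased": False},
    {"link_key": "k2", "institutional": True, "deceased": False},
    {"link_key": "k3", "institutional": False, "deceased": True},
    {"link_key": "k4", "institutional": False, "deceased": False},
]
survey = [
    {"link_key": "k1"},   # enrolled and linkable
    {"link_key": "k5"},   # has a key but is not in the administrative file
    {"link_key": None},   # no usable key, lost before matching
]

counts = build_linkable_universe(admin, survey)
print(counts)
```

Keeping the stage-by-stage counts, rather than only the final linked file, is one way to document where records drop out and whether the loss looks systematic.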
Measurement error Administrative data are the standard for knowing about programmatic details However they tend to carry a lot of supplementary data that can be more unreliable Administrative data can be collected through many modes, during more than one wave of interviewing, using several instruments Interviewer assisted, self-administered, completely filled out by interviewer and signed by enrollee, etc. Interviewers have a wide variety of training/skills (e.g., tax accountants and hospital staff)
Measurement error (continued) Research is needed into possible mode effects, longitudinal panel conditioning, interviewer effects, and instrumentation effects To do this the administrative data should try to keep track of paradata Important to remember that “respondents” in administrative data files may have different incentives for filling out administrative data versus survey data How do these motivations lead to measurement error/differences? Also if data are not accepted unless filled out, are elements ever “made up” by data entry folks if unknown? Research on administrative data measurement error is essential
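One way to picture "keeping track of paradata" is a small record attached to each administrative entry. This is only an illustrative sketch: the `Paradata` class and its fields (`mode`, `completed_by`, `instrument_version`, `collected_on`) are hypothetical and do not describe any real administrative system. With such fields stored, records can be tabulated by collection mode as a first step toward studying mode or interviewer effects.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Paradata:
    """Hypothetical process information stored alongside an administrative record."""
    mode: str                # e.g. "self-administered", "interviewer-assisted"
    completed_by: str        # e.g. "enrollee", "caseworker", "hospital staff"
    instrument_version: str  # which form or system version was used
    collected_on: date


# Hypothetical entries.
entries = [
    Paradata("self-administered", "enrollee", "form-A", date(2004, 3, 1)),
    Paradata("interviewer-assisted", "caseworker", "form-A", date(2004, 3, 9)),
    Paradata("interviewer-assisted", "hospital staff", "form-B", date(2004, 6, 2)),
]

by_mode = Counter(e.mode for e in entries)
print(by_mode)
```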
Data editing and imputation and documentation  Data editing and imputation and documentation There is little documentation in the public domain regarding collection, editing and/or imputation procedures of administrative and enrollment data relative to survey data Data editing and imputation activity happens but researchers who use the administrative data files can be caught off guard by it
Timeliness and Data Access Timeliness of linked data files Linking takes time Most recent linked data for our CPS-Medicaid link is 2004 Data access Due to sensitive nature of both, linking has to be done in a very restricted environment Access to linked data files is, by necessity, very limited Public use linked data files are unlikely US Census Bureau Research Data Centers and/or synthetic data hold promise in this area
How do administrative data compare to survey data for research purposes? Survey microdata, documentation and research are all in the public domain Survey data are very strong because of the many known (and well documented) problems Similar research into problems with administrative data needs to be done Especially since they are not created for the purpose of research Even though the microdata themselves cannot be made public, research and documentation used to produce the file need to be made public if the data are to be useful for research
Issues to deal with Essential data stewardship, agreements, confidentiality, privacy concerns need to be worked out Sample loss in linked data files needs to be further studied Need to treat administrative data like survey data and examine measurement error, produce public domain documentation (especially since public use files seem unlikely) Develop standards such as the Data Documentation Initiative for administrative data systems used for research If we do this, the linked administrative and survey data will become an even more useful source of data for policy research
SHADAC contact information www.shadac.org State Health Access Data Assistance Center University of Minnesota 2221 University Avenue, Suite 345  Minneapolis Minnesota 55414  (612) 624-4802
