Data Quality: Total Survey
Error (TSE)
Dr Olga Maslovskaya
NCRM National Centre for Research Methods
University of Southampton
Survey Data
• Vast amounts of survey data are collected for many
purposes, including governmental information, public
opinion and election surveys, advertising and market
research as well as scientific research
• Survey data underlie many public policy and business
decisions
• Good quality data reduces the risk of poor policies and
decisions and is of crucial importance
Total Survey Quality (TSQ)
Total Survey Quality (TSQ) has two dimensions:
• Statistical dimension
• Non-statistical dimension
TSQ: Quality Dimensions – Statistical
• Accuracy of an estimate is measured by the difference between the
estimate and the true parameter value
• Accuracy is the most important concept of TSQ
X = T + e
where X is the observed item, T is the true value and e is the error
The error e has two components:
• Variance (random error)
• Bias (systematic error)
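The decomposition X = T + e can be illustrated with a small simulation. This is only a sketch under stated assumptions: the true value, the size of the bias and the noise level are hypothetical numbers, and the error is assumed to be a constant bias plus normally distributed random noise.

```python
import random
import statistics

def simulate_observations(true_value, bias, noise_sd, n, seed=0):
    """Return n observed values X = T + e, where e = bias + random noise (assumed model)."""
    rng = random.Random(seed)
    return [true_value + bias + rng.gauss(0.0, noise_sd) for _ in range(n)]

def error_summary(observed, true_value):
    """Split the observed errors into an estimated bias and an estimated variance."""
    errors = [x - true_value for x in observed]
    return {
        "bias": statistics.fmean(errors),          # systematic part: mean of the errors
        "variance": statistics.pvariance(errors),  # random part: spread around that mean
    }

# Hypothetical values chosen only for illustration.
observed = simulate_observations(true_value=50.0, bias=2.0, noise_sd=5.0, n=20000)
summary = error_summary(observed, true_value=50.0)
print(summary)
```

In a real survey the true value T is unknown, so a summary like this is only possible in simulations or when a gold-standard source is available.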
Total Survey Error (TSE) (1)
•The TSE concept was developed by Robert Groves
(1989) in his book Survey Errors and Survey
Costs
•Survey estimates are derived from complex
survey data
•Published estimates may differ from their true
parameter values due to survey errors
•Total Survey Error is the difference between a
population mean, total, or other population
parameter and the estimate of the parameter
based on the sample survey (Biemer and Lyberg,
2003)
Total Survey Error (TSE) (2)
•Survey error is any error arising from the
survey process that contributed to the deviation
of an estimate from its true parameter value
(Biemer, 2016)
•Survey error diminishes the accuracy of
inferences derived from the survey
•TSE is the accumulation of all errors that may
arise in the design, collection, processing, and
analysis of survey data (Biemer, 2016)
TSE framework (1)
• Set of principles, methods and processes that minimise TSE
within the budget allocated for accuracy, timing and other
constraints
• Non-statistical dimensions of TSQ can be viewed as constraints:
timeliness and comparability constrain the design; accessibility,
relevance and completeness constrain the budget (Biemer
2017)
TSE framework (2)
TSE framework provides principles that guide stages of
survey process:
• Survey design
• Survey implementation
• Data collection
• Data processing
• Data analysis
• Modelling and estimation
Each stage of survey process provides opportunities for
errors which add up to TSE
TSE
TSE = sampling errors + non-sampling errors
Survey errors:
•Sampling errors – can be computed for
probability samples and are due to
selecting a sample instead of the entire
population
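As a rough illustration of how sampling error can be computed for a probability sample, the sketch below estimates the standard error of a sample mean. It assumes simple random sampling without replacement, which is an assumption made here for simplicity; complex designs need design-based variance estimators. The function name and data are hypothetical.

```python
import math
import statistics

def srs_standard_error(sample, population_size=None):
    """Standard error of the sample mean under simple random sampling (assumed design).

    If population_size is given, the finite population correction is applied.
    """
    n = len(sample)
    if n < 2:
        raise ValueError("need at least two observations")
    sample_variance = statistics.variance(sample)  # uses the n - 1 denominator
    se = math.sqrt(sample_variance / n)
    if population_size is not None:
        se *= math.sqrt(1 - n / population_size)   # finite population correction
    return se

# Hypothetical sample of eight responses.
sample = [2, 4, 4, 4, 5, 5, 7, 9]
se_infinite = srs_standard_error(sample)
se_finite = srs_standard_error(sample, population_size=80)
print(se_infinite, se_finite)
```

The corrected value is smaller because sampling a noticeable share of a finite population leaves less of it unobserved.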
TSE
•Non-sampling errors are errors due to mistakes or
system deficiencies, and also arise from incomplete
responses to the survey or its questions, etc.
•They include measurement error, which cannot be
formally estimated but can be reduced through
interviewing procedures, question wording, etc.
•In many cases non-sampling error can be much
more damaging than sampling error to
estimates from surveys
Sources of Sampling Error
• Sampling scheme
  • Stratification
  • Clustering
  • Selection probabilities
• Sample size
  • Overall sample size
  • Effective sample size
• Estimator choice
  • Simple
  • Use of auxiliary information
  • Model-based
  • Model-assisted
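The difference between overall and effective sample size can be sketched with Kish's approximation, which is not part of the slides and is introduced here only as one common way to quantify the effect of unequal weights. The weights below are hypothetical.

```python
def kish_effective_sample_size(weights):
    """Kish's approximation: (sum of weights)^2 / (sum of squared weights)."""
    total = sum(weights)
    total_of_squares = sum(w * w for w in weights)
    if total_of_squares == 0:
        raise ValueError("weights must not all be zero")
    return total * total / total_of_squares

def design_effect_from_weights(weights):
    """Ratio of the overall sample size to the effective sample size."""
    return len(weights) / kish_effective_sample_size(weights)

# Hypothetical weights: equal weights lose nothing, unequal weights do.
equal = [1, 1, 1, 1]
unequal = [1, 1, 3, 3]
n_eff_equal = kish_effective_sample_size(equal)
n_eff_unequal = kish_effective_sample_size(unequal)
deff_unequal = design_effect_from_weights(unequal)
print(n_eff_equal, n_eff_unequal, deff_unequal)
```

This covers only the loss from unequal selection probabilities; clustering and stratification change the effective sample size as well and are not modelled here.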
Components of Non-sampling Error
1. Specification error
2. Frame error
3. Nonresponse error
4. Measurement error
5. Processing error
6. Modelling/Estimation error
Biemer (2017)
Specification Error
•Refers to a question on the questionnaire
•Occurs when the concept implied by the survey
question and the concept that should be
measured in the survey differ (Biemer and
Lyberg, 2003)
Frame Error
•Arises from construction of the sampling
frame for the survey
•The sampling frame might have missing
elements (units), duplicates or erroneous
inclusions (nonpopulation units)
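The three kinds of frame problem can be sketched as set comparisons. This assumes a reference list of the target population is available to compare against, which is rarely the case in practice; the identifiers and the function name are hypothetical.

```python
from collections import Counter

def audit_frame(frame_ids, target_population_ids):
    """Compare a sampling frame with a reference list of the target population."""
    counts = Counter(frame_ids)
    frame = set(counts)
    target = set(target_population_ids)
    return {
        "missing": sorted(target - frame),                               # undercoverage
        "duplicates": sorted(i for i, c in counts.items() if c > 1),     # listed more than once
        "erroneous_inclusions": sorted(frame - target),                  # nonpopulation units
    }

# Hypothetical household identifiers.
report = audit_frame(
    frame_ids=["h1", "h2", "h2", "h9"],
    target_population_ids=["h1", "h2", "h3"],
)
print(report)
```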
Nonresponse Error
• Unit nonresponse occurs when a sample unit
(individual, household or organisation) does not
respond to any part of the questionnaire
• Item nonresponse occurs when the questionnaire is
only partially completed and some items are not
answered
• Incomplete response occurs when the response to
an open-ended question is incomplete or very short and
inadequate
• Panel attrition occurs when a sample unit is lost over
the period of a longitudinal study
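Unit and item nonresponse can be summarised with simple rates. The sketch below assumes responses are stored as a dictionary per sample unit with None for unanswered items; this data layout and the function names are illustrative assumptions, not a standard.

```python
def has_any_answer(record):
    """True if at least one item in the record was answered."""
    return any(value is not None for value in record.values())

def unit_response_rate(sampled_ids, responses):
    """Share of sampled units that answered at least one item."""
    if not sampled_ids:
        raise ValueError("sampled_ids must not be empty")
    responding = [uid for uid in sampled_ids if has_any_answer(responses.get(uid, {}))]
    return len(responding) / len(sampled_ids)

def item_nonresponse_rates(responses, items):
    """For each item, share of responding units that left it unanswered."""
    respondents = [rec for rec in responses.values() if has_any_answer(rec)]
    if not respondents:
        raise ValueError("no responding units")
    return {
        item: sum(1 for rec in respondents if rec.get(item) is None) / len(respondents)
        for item in items
    }

# Hypothetical data: unit "c" returned an empty questionnaire, unit "d" never replied.
sampled = ["a", "b", "c", "d"]
responses = {
    "a": {"q1": 1, "q2": 3},
    "b": {"q1": 2, "q2": None},
    "c": {"q1": None, "q2": None},
}
unit_rate = unit_response_rate(sampled, responses)
item_rates = item_nonresponse_rates(responses, ["q1", "q2"])
print(unit_rate, item_rates)
```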
Measurement error
•Measurement errors pose a serious limitation to
the validity and usefulness of the data collected
•Most damaging source of error
•Without reliable measurements, analysis of
data hardly makes any sense
Sources of measurement error
•Respondents
  •May deliberately or unintentionally provide
  incorrect information
  •Response style behaviours (agreeing with
  everything, answering "do not know" to every question or
  choosing extreme response options) and
  social desirability bias
  •Satisficing (making less effort than needed to provide
  optimal responses)
•Interviewers (enumerators)
  •May falsify data
  •May inappropriately influence responses
  •May have negative impact on
  responses to sensitive questions
  •May record responses incorrectly
  •May fail to comply with the survey
  protocol
•Questionnaire design
  •Bad design
  •Ambiguous questions
  •Confusing instructions
  •Unclear terms
•Mode of administration
  •Online mode
  •Non-optimised questionnaire for
  smartphones
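Some respondent response styles can be screened for with simple per-respondent indicators. The sketch below assumes a battery of items on a 1 to 5 agreement scale where higher values mean stronger agreement; the scale, the indicators and the function name are illustrative assumptions, and high values are only a prompt for closer inspection, not proof of a response style.

```python
def response_style_indicators(answers, scale_min=1, scale_max=5):
    """Simple indicators for one respondent's answers to a battery of rating items."""
    if not answers:
        raise ValueError("answers must not be empty")
    n = len(answers)
    midpoint = (scale_min + scale_max) / 2
    return {
        # share of answers on the "agree" side of the scale (acquiescence)
        "agree_share": sum(1 for a in answers if a > midpoint) / n,
        # share of answers at either end of the scale (extreme responding)
        "extreme_share": sum(1 for a in answers if a in (scale_min, scale_max)) / n,
        # identical answer to every item (straightlining)
        "straightlining": len(set(answers)) == 1,
    }

# Hypothetical respondents.
suspect = response_style_indicators([5, 5, 5, 5, 5, 5])
typical = response_style_indicators([2, 4, 3, 5, 1, 3])
print(suspect, typical)
```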
Processing Error
Contributes to measurement error
• Occurs during data processing stage
• Errors in data editing
• Errors in data entry
• Errors in coding
• Errors in outlier editing
• Errors in assignment of survey weights
• Errors in imputation for nonresponse
Modelling and Estimation Error
•Occurs during data analysis stage
(modelling)
•Errors in weight adjustments
•Errors in imputation
•Errors in modelling process
Types of Errors
• Systematic error (bias): errors that tend to agree and do not
cancel out. They result in biased estimates, distorting the mean
value of variables, and may strengthen the relations between
variables, leading to false conclusions. Examples: response styles
or other stable behaviours.
• Random error (variance): errors that tend to disagree, such as
unintended mistakes made by respondents. They vary from case to
case and are expected to cancel out. They affect the variance of
estimates and may weaken the relations between variables.
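The contrast between the two error types can be sketched in a simulation. Under the assumptions made here (normally distributed values, a constant bias of 0.5, independent random noise, all hypothetical), the random error leaves the mean almost unchanged but weakens the correlation, while the constant bias shifts the mean and does not average out. A constant shift does not change a correlation; the strengthening of relations mentioned above would need errors shared across variables, which this sketch does not model.

```python
import random
import statistics

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5

rng = random.Random(42)
n = 20000
true_x = [rng.gauss(0, 1) for _ in range(n)]
y = [x + rng.gauss(0, 1) for x in true_x]        # outcome related to the true value
noisy_x = [x + rng.gauss(0, 1) for x in true_x]  # random measurement error
biased_x = [x + 0.5 for x in true_x]             # systematic measurement error

r_true = pearson(true_x, y)
r_noisy = pearson(noisy_x, y)
mean_shift_random = statistics.fmean(noisy_x) - statistics.fmean(true_x)
mean_shift_systematic = statistics.fmean(biased_x) - statistics.fmean(true_x)
print(r_true, r_noisy, mean_shift_random, mean_shift_systematic)
```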
Mean Squared Error (MSE)
• Total survey error (TSE) is a term that is used to refer to
all sources of bias (systematic error) and variance
(random error) that may affect accuracy of survey data.
• Mean Squared Error (MSE) – metric for measuring TSE
• MSE is the sum of the total bias squared plus the variance
components for all the various sources of error in the
survey design.
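Written out, the verbal definition above corresponds to the standard decomposition below. The symbols are chosen here for illustration: \hat{\theta} is the survey estimate, \theta the true parameter, and B_k and Var_k the bias and variance contributed by error source k. Summing the variance components assumes the error sources are uncorrelated, which is a simplification.

```latex
\mathrm{MSE}(\hat{\theta})
  = E\big[(\hat{\theta} - \theta)^2\big]
  = B^2(\hat{\theta}) + \mathrm{Var}(\hat{\theta}),
  \qquad B(\hat{\theta}) = E(\hat{\theta}) - \theta

B(\hat{\theta}) = \sum_{k} B_k,
  \qquad \mathrm{Var}(\hat{\theta}) = \sum_{k} \mathrm{Var}_k
```

Because the bias components are summed before squaring, biases of opposite sign from different sources can partly offset each other, whereas variance components can only accumulate.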
MSE
•MSE cannot be calculated directly but is useful
conceptually for considering how large the different
components of error can be and how much
they add to the total survey error
•MSE is a great guide for optimal survey
designs
MSE
• Survey design goal is to minimise the MSE
• When two designs are similar on other quality dimensions,
the optimal design is the one achieving the smallest MSE
• Working to reduce the measurement error on one set of
questions could increase the error for a different set of
questions in the same survey
• Also, reducing one error could increase another error in
the survey
Survey designers face the following
questions:
• Where should additional resources be directed to generate
the greatest improvement to data quality: extensive
interviewer training for nonresponse reduction, greater
nonresponse follow-up intensity, or larger
incentives to sample members to encourage participation?
• Should a more expensive data collection mode be used,
even if the sample size must be reduced significantly to
stay within budget?
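The second question can be explored with a back-of-the-envelope MSE comparison. The sketch assumes the MSE of a mean is bias squared plus unit variance divided by sample size, and every number below (biases, unit variance, affordable sample sizes) is hypothetical; in practice the bias of each mode is unknown and has to be estimated or assumed.

```python
def design_mse(bias, unit_variance, sample_size):
    """Approximate MSE of a sample mean: squared bias plus sampling variance."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return bias ** 2 + unit_variance / sample_size

# Hypothetical designs with the same budget: the cheaper mode affords more
# interviews but is assumed to carry more measurement bias.
designs = {
    "cheap_mode": {"bias": 1.5, "unit_variance": 100.0, "sample_size": 2000},
    "expensive_mode": {"bias": 0.5, "unit_variance": 100.0, "sample_size": 500},
}
mse_by_design = {name: design_mse(**spec) for name, spec in designs.items()}
best_design = min(mse_by_design, key=mse_by_design.get)
print(mse_by_design, best_design)
```

With these assumed numbers the smaller, more expensive design has the lower MSE, because the bias term dominates once the sample is reasonably large; different assumptions can reverse the ranking.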
TSE in Practice
•Idea is to minimise all these error sources
•Minimising all of these errors would require an
unlimited budget (impossible)
•Cost-benefit trade-offs are needed to decide
which errors to minimise
TSE in Practice (1)
•Realistic scenario is to work on continuous
improvement of various survey processes so that
biases and unwanted variations are gradually
reduced
•Redesign of surveys if needed
•Non-response bias reduction through real-time
responsive and adaptive survey designs
•Quality monitoring strategies, e.g., paradata
•Application of data quality indicators in data
analysis
TSE in Practice (2)
Decisions are needed:
•To ignore some errors
•To measure and to control/adjust for some
(data analysis stage: complex designs,
measurement errors, missing data, sampling
errors)
Conclusions
•Data accuracy is of crucial importance
•Single score or measure of data quality (Total
Survey Quality) is not available
•TSE framework was developed and adopted
•Cost-benefit trade-offs are needed to decide which
TSE errors to minimise, depending on survey aims
•TSE helps keep data quality standards high
and in line with survey aims under financial
constraints
References
• Biemer (2010) Total survey error: Design, implementation, and evaluation. Public
Opinion Quarterly, 74(5): 817-848.
• Biemer (2016) Total Survey Error Paradigm: Theory and Practice. In The Sage
handbook of survey methodology by Wolf, Joye, Smith and Fu. London: SAGE
publications.
• Biemer (2017) Total survey error: A Framework for censuses and surveys.
Presentation at the University of Southampton.
• Biemer and Lyberg (2003) Introduction to survey quality. New York: John Wiley &
Sons.
• Groves and Heeringa (2006) Responsive design for household surveys: Tools for
actively controlling survey errors and costs. Journal of the Royal Statistical Society
Series A, 169 (3): 439-457.
• Lyberg and Weisberg (2016) The SAGE handbook of survey methodology.
London: SAGE publications.
• Lynn (2004) Editorial: Measuring and communicating survey quality. Journal of the
Royal Statistical Society Series A, 167 (4): 575-578.
• Schouten et al. (2013) Optimizing quality of response through adaptive survey
designs. Survey Methodology, 39 (1): 29-39.
• Weisberg (2005) The total survey error approach. Chicago: University of Chicago
Press.

Editor's Notes

  • #3: So data quality is crucial
  • #4: TSQ: survey quality is more than its accuracy or statistical dimension. It also includes, among other factors, producing results that fit the needs of the survey users and providing results that users will have confidence in. Usability of results is of crucial importance. Quality dimensions have been defined by Eurostat, Statistics Canada and Statistics Sweden. Statistics Canada: relevance, accuracy, timeliness, accessibility, interpretability, coherence. Statistics Sweden: content, accuracy, timeliness, comparability/coherence, availability/clarity.
  • #5: Bias: the mean of the errors is not equal to 0, so the errors do not cancel out. Variance: the mean of the errors is equal to 0, so the errors do cancel out. Accuracy is part of the larger concept of Total Survey Quality (TSQ). A definition of quality broader than accuracy is needed, as users are not just interested in the accuracy of the estimates provided. Accuracy is the cornerstone of quality, since without it survey data are of little use. If the data are erroneous, it does not help much if relevance, timeliness, accessibility, comparability, coherence and completeness are sufficient.
  • #6: Simple random sampling is often neither possible nor cost-effective. Stratifying the sample can reduce the sampling error; clustering the sample can reduce costs but would increase the sampling error.
  • #7: Idea is to minimize the errors
  • #10: Biemer and Lyberg, in their book Introduction to Survey Quality, introduced the division between sampling and non-sampling errors. The framework has its roots in cautioning against sole attention to sampling error. It contains statistical and non-statistical notions.
  • #13: There are different components of non-sampling error. Errors can be systematic or random, and correlated or uncorrelated. Uncorrelated: e.g., an interviewer mistakenly records a “yes” answer as a “no”. Correlated: arise when interviewers conduct multiple interviews and when cluster sampling is used; correlated errors increase the variance of estimates because the effective sample size is smaller than the intended one, and thereby make it more difficult to achieve statistically significant results. Measurement errors pose a serious limitation to the validity and usefulness of the information collected via a survey. Having excellent samples representative of the target population, high response rates, complete data, etc. does us little good if our measurement instruments evoke responses that are fraught with error. Measurement error is distinct from other survey errors: it is the error that occurs when the recorded or observed value differs from the true value of the variable. Reliability and validity are important in measurement error. Reliability is “agreement between two efforts to measure the same thing, using maximally similar methods”. Questions to ask: How was the survey administered (e.g. in person, by telephone, online, multiple modes, etc.)? (sensitive questions) Were the questions well constructed, clear, and not leading or otherwise biasing? (satisficing) What steps, if any, were taken to ensure that respondents were providing truthful answers to the questions, and were any respondents removed from the final dataset (e.g., identifying speeders, satisficers, multiple completions)? (in-survey behaviour)
  • #17: Having excellent samples representative of the target population, high response rates, complete data, etc. does us little good if our measurement instruments evoke responses that are fraught with error. Response errors or response styles are measurement errors found in the answers respondents give to surveys. Response styles: acquiescence response style (tendency to agree with items regardless of content); disacquiescence response style (tendency to disagree with items regardless of content); mid-point response style (tendency to use the middle response category of a rating scale regardless of content); extreme response style (tendency to select the most extreme response option regardless of content); straightlining (tendency to rush through the survey clicking on the same response every time regardless of content); tendency to select “do not know” options regardless of content. Reliability and validity are important concepts in measurement error. Reliability is “agreement between two efforts to measure the same thing, using maximally similar methods” (in Alwin, 2016). The score for reliability is called the coefficient of precision. Validity is agreement or consistency between two efforts to measure the same thing using maximally different measurements (in Alwin, 2016). Satisficing behaviour increases measurement error (respondents give answers that sound plausible so as to get through the task quickly). Improving survey question wording might minimise the likelihood of satisficing.
  • #18: Interviewers can cause errors in a number of ways.
  • #24: MSE is hypothetical
  • #29: Clients need to weigh these trade-offs when deciding how they want to spend limited resources to minimize the potential survey errors.
  • #31: Cost-benefit trade-offs are needed to decide which errors to minimize. Quality frameworks were developed and adopted, and provided statistics producers with a clear description of how certain dimensions of quality can be measured and why it might be important to do so. The survey community needs to find ways of ensuring that as broad a range as possible of relevant indicators and information is made available routinely (Lynn 2004). The chances of users misusing the data or misinterpreting published statistics will be reduced if they understand better the strengths and limitations of the data. The publication of data quality measures itself represents an improvement in the quality of a survey.