Sample Size Considerations
CARMA Internet Research Module
Jeff Stanton

Key Considerations
  • Sample size versus response rate – planning for the number of usable data points you will actually obtain
  • Attrition – repeated measures, panel designs, and diary studies all lose participants over time
  • Statistical power – ability to draw inferences from the sample obtained
  • Margin of error – to the extent that the resulting statistics must be projectable to the larger population

May 15-17, 2008 | Internet Data Collection Methods (Day 2-3)
Response Rate Reminder
[Figure: response rates for academic surveys by year, 1975 to 1995; vertical axis from 40% to 70%]

Hope for the best / Plan for the worst
  • Try to achieve an 80% response rate
  • Hope to achieve a 50% response rate
  • Plan ahead for a 30% response rate
  • At a 30% response rate, you need to invite 1,000 people to obtain a sample of 300

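The arithmetic behind the last bullet can be sketched in a few lines of Python. The function name `invitations_needed` and the convention of passing the response rate as a whole-number percentage are assumptions made for this illustration, not part of the original module.

```python
def invitations_needed(target_responses: int, response_rate_pct: int) -> int:
    """Smallest number of invitations expected to yield the target responses.

    response_rate_pct is a whole-number percentage (a hypothetical convention
    chosen here so the ceiling division stays in exact integer arithmetic).
    """
    if not 0 < response_rate_pct <= 100:
        raise ValueError("response rate must be between 1 and 100 percent")
    # Ceiling of target / (pct / 100), computed without floating point.
    return (target_responses * 100 + response_rate_pct - 1) // response_rate_pct


if __name__ == "__main__":
    # The three scenarios from the slide, for a target of 300 usable responses.
    for pct in (80, 50, 30):
        print(f"{pct}% response rate: invite {invitations_needed(300, pct)} people")
```

Planning on the 30% scenario means inviting 1,000 people; if the 50% or 80% scenario materializes instead, the same invitation list simply yields a larger sample.
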
Bad Data
Unproctored, anonymous self-report instruments generally have a higher percentage of:
  • Unusual outliers
  • Missing data
  • Carelessly entered data
  • Intentionally sabotaged data
Another aspect of dealing with nonresponse is to anticipate, prepare for, and deal with item-level data losses.

Attrition

The Best Articles on Statistical Power
  • Cohen, J. (1992). "A power primer." Psychological Bulletin 112(1): 155-159.
  • Cohen, J. (1992). "Statistical power analysis." Current Directions in Psychological Science: 98-101.
  • Kraemer, H. and S. Thiemann (1987). How many subjects?: Statistical power analysis in research. Sage Publications, Inc.

Sample Size “Guestimates”
(With apologies to Jacob Cohen)

Estimating Effect Size
(Also with apologies to Jacob Cohen)
  • Mean differences, calibrated in standard deviations: Large = .8+, Medium = .5, Small = .2
  • Multiple regression, size of R-squared: Large = .35+, Medium = .15, Small = .02
  • Chi-square, calibrated in the difference between null and alternate population proportions: Large = .50, Medium = .30, Small = .10

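As a rough illustration of how an effect size turns into a sample size, the sketch below applies the common normal-approximation formula for a two-sided comparison of two independent group means. The function name `n_per_group` and the defaults of alpha = .05 and power = .80 are assumptions chosen for the example; exact t-based tables and dedicated power software give slightly larger values.

```python
import math
from statistics import NormalDist


def n_per_group(d: float, alpha: float = 0.05, power: float = 0.80) -> int:
    """Approximate n per group for a two-sided, two-sample comparison of means.

    Uses the normal approximation n = 2 * ((z_alpha/2 + z_power) / d) ** 2,
    where d is the mean difference in standard deviation units.
    """
    if d <= 0:
        raise ValueError("effect size d must be positive")
    z = NormalDist()  # standard normal distribution
    z_alpha = z.inv_cdf(1 - alpha / 2)  # critical value for the two-sided test
    z_power = z.inv_cdf(power)          # quantile matching the desired power
    return math.ceil(2 * ((z_alpha + z_power) / d) ** 2)


if __name__ == "__main__":
    for label, d in (("large", 0.8), ("medium", 0.5), ("small", 0.2)):
        print(f"{label} effect (d = {d}): about {n_per_group(d)} per group")
```

Under these assumptions the approximation gives about 25, 63, and 393 participants per group for large, medium, and small mean differences, which shows how quickly the required sample grows as the expected effect shrinks.
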
Margin of Error
  • Generally represents only sampling error; other sources of error will often make the margin much larger
  • Assumes a large population, with no more than 5% drawn into the sample
  • Margin of error is half the width of a confidence interval
  • Straightforward calculation for a CI around a mean or a mean difference: generally about 1.96 standard errors (for a 95% confidence level)
  • CI around a proportion/percentage is more complex: SE = sqrt(p(1 - p) / n), where p is the sample proportion and n is the sample size
  • Use 1.96 times this SE; works fine for even splits; can be a little funky for extreme proportions

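A minimal sketch of the proportion case, assuming the usual standard error for a proportion, sqrt(p(1 - p) / n), and a 95% confidence level. The function names `margin_of_error` and `n_for_margin` are made up for this example, and the calculation ignores any finite-population correction, in line with the large-population assumption above.

```python
import math
from statistics import NormalDist

# Two-sided 95% critical value, approximately 1.96.
Z_95 = NormalDist().inv_cdf(0.975)


def margin_of_error(p: float, n: int, z: float = Z_95) -> float:
    """Half-width of the normal-approximation CI around a proportion."""
    return z * math.sqrt(p * (1 - p) / n)


def n_for_margin(margin: float, p: float = 0.5, z: float = Z_95) -> int:
    """Sample size needed for a given margin; p = 0.5 is the worst case."""
    return math.ceil(z ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    print(f"n = 300, even split: +/- {margin_of_error(0.5, 300):.3f}")
    print(f"n needed for +/- 5 points: {n_for_margin(0.05)}")
    print(f"n needed for +/- 3 points: {n_for_margin(0.03)}")
```

With an even split, 300 respondents give a margin of roughly 5.7 percentage points, and tightening the margin from 5 points to 3 points raises the required sample from 385 to 1,068.
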
Margin of Error Calculators
  • http://www.raosoft.com/samplesize.html : Trades off sample size and margin of error
  • http://www.surveysystem.com/sscalc.htm : Explains terminology
  • http://faculty.vassar.edu/lowry/polls/calcs.html : Various tools for assessing poll data
  • http://glass.ed.asu.edu/stats/analysis/rci.html : Confidence intervals for correlations
  • http://www.stat.tamu.edu/~jhardin/applets/signed/case11.html : Java-based applet

An Overall Sampling Plan
  • Estimate the expected effect size for the most important tests you plan to conduct
  • For inferential testing, use power estimation tools to plan sample size
  • For projectability, use margin of error tools to plan sample size
  • Take into account item-level data loss due to bad data
  • Take attrition into account for longitudinal designs
  • Take overall response rate into account for all types of designs
  • Determine overall initial sample size based on all of the factors listed above

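One way to combine the factors in this plan is to work backward from the number of usable cases the analysis needs and inflate it for each expected loss in turn. The sketch below does that; the function name `initial_sample_size` and every rate in the example (10% bad data, 20% attrition, 30% response rate, 128 usable cases) are hypothetical values for illustration only.

```python
import math


def initial_sample_size(
    analysis_n: int,
    bad_data_rate: float,
    attrition_rate: float,
    response_rate: float,
) -> int:
    """Number of people to invite so that analysis_n usable cases remain.

    All rates are proportions between 0 and 1 (hypothetical planning inputs).
    """
    if not 0 <= bad_data_rate < 1 or not 0 <= attrition_rate < 1:
        raise ValueError("loss rates must be in [0, 1)")
    if not 0 < response_rate <= 1:
        raise ValueError("response rate must be in (0, 1]")
    # Inflate for cases discarded as careless, sabotaged, or too incomplete.
    completers = analysis_n / (1 - bad_data_rate)
    # Inflate for participants lost over time (0 for cross-sectional designs).
    enrolled = completers / (1 - attrition_rate)
    # Inflate for invitees who never respond.
    invited = enrolled / response_rate
    return math.ceil(invited)


if __name__ == "__main__":
    # Hypothetical longitudinal study needing 128 usable cases for analysis.
    print(initial_sample_size(128, bad_data_rate=0.10,
                              attrition_rate=0.20, response_rate=0.30))
```

In this hypothetical case a study that needs 128 usable cases would have to invite 593 people, more than four times the analysis sample, which is why each loss factor is worth estimating before data collection begins.
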
