Sample Size Calculation
Dr Santam Chakraborty
Assistant Professor, Radiation Oncology
Tata Memorial Hospital
Relax !!
No Formulae
No Mathematical Jargon
No complicated concepts
Sample
Subset of a defined population
Defined selection procedure
Has sampling points / units / observations
Allows inference without a “census”
Types of samples
Complete Sample
Representative Sample
Random Sample
Non Random Sample
Random Sample
Derived from a defined population
Each individual has the same chance of being included in the sample
Sampling can be done with minimum knowledge about the population
Allows externally valid conclusions
Sampling Frame
1. Source material from which the sample is drawn
2. Lists all who can be sampled from a population
3. Example: Census
4. Must be representative of the population
5. No elements from outside the population of interest are present in the frame
Q: Can telephone directory be used as a sampling frame to
represent adult population of Mumbai?
Q: Can a sample drawn randomly from this be called a random
sample?
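As a minimal sketch of what "each individual has the same chance of being included" means in practice, the snippet below draws a simple random sample from a sampling frame. The frame contents, the helper name `draw_simple_random_sample`, and the sample size are hypothetical and chosen only for illustration.

```python
import random

def draw_simple_random_sample(frame, sample_size, seed=None):
    """Draw a simple random sample (without replacement) from a sampling frame.

    Every entry in the frame has the same chance of selection. People who are
    missing from the frame have no chance at all, whatever the draw.
    """
    rng = random.Random(seed)
    return rng.sample(frame, sample_size)

# Hypothetical frame: ID numbers of 10,000 adults listed in some register.
frame = [f"ID{i:05d}" for i in range(10_000)]
sample = draw_simple_random_sample(frame, sample_size=200, seed=42)
print(len(sample), "units drawn,", len(set(sample)), "distinct")
```

The draw is random only with respect to the frame: if the frame is a telephone directory, adults without a listed number can never be selected.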
Why Does it Matter?
1. Avoids resource wastage
2. Ensures aims are clear
3. Reduces harm
4. Avoids falsely negative results that discourage much needed future research
5. Needed for publication and grants
Avoids an unethical underpowered study
Why are underpowered studies unethical?
1. Often yield optimistic differences
2. Confidence intervals around these differences are wider
3. A small reduction in CI width (w.r.t. no trial) is not justified when the risks to patients are considered
4. Combined meta-analyses are more susceptible to variability in study design and execution
5. Impairs informed consent - do we inform the patient of the limited benefit from an underpowered study? - a form of deception
6. Serendipitous results are rare - publication bias makes them seem more common
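The claim that underpowered studies "often yield optimistic differences" can be illustrated with a small simulation. This is a sketch under stated assumptions, not a model of any real trial: the true standardized difference of 0.2, the arm sizes, and the function name `significant_estimates` are all hypothetical, and the observed difference is drawn directly from its sampling distribution rather than from patient-level data.

```python
import random
from statistics import NormalDist, mean

def significant_estimates(true_diff, n_per_group, n_trials=5000, alpha=0.05, seed=1):
    """Simulate trials and return the observed differences that reached significance.

    Shortcut (assumption): outcomes have SD 1 in both arms, so the observed
    difference in means is drawn directly from Normal(true_diff, sqrt(2/n)).
    """
    rng = random.Random(seed)
    standard_error = (2 / n_per_group) ** 0.5
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    kept = []
    for _ in range(n_trials):
        observed = rng.gauss(true_diff, standard_error)
        if abs(observed) / standard_error > z_crit:
            kept.append(observed)
    return kept

TRUE_DIFF = 0.2  # hypothetical true standardized difference
small = significant_estimates(TRUE_DIFF, n_per_group=20)
large = significant_estimates(TRUE_DIFF, n_per_group=400)

for label, kept in (("n=20 per arm", small), ("n=400 per arm", large)):
    share = len(kept) / 5000
    typical = mean(abs(x) for x in kept)
    print(f"{label}: {share:.0%} significant, mean |difference| among them = {typical:.2f}")
```

With 20 patients per arm, a result can only reach significance if the observed difference is roughly three times the true one, so the significant small trials all overstate the effect.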
J. P. A. Ioannidis, Why most
published research findings are
false. PLoS Med. 2, e124 (2005).
Ellis, P.D. (2010), “Effect Size FAQs,”: https://effectsizefaq.com/
Basic Theory
[Figure: μ0 (null hypothesis), μ1 (alternate hypothesis) and the difference d between them]
Type I error: probability of rejecting the null hypothesis when it is really true
Type II error: probability of accepting the null hypothesis as true when it is really false (that is, rejecting the alternate hypothesis when it is true)
Power of the test: probability of rejecting the null hypothesis when it is really false (1 − Type II error)
Increasing Power
[Figure, repeated over three slides: μ0, μ1 and the difference d]
Non Directional Hypothesis
[Figure: μ0, μ1 and the difference d]
Type II error: probability of accepting the null hypothesis as true when it is really false
Power of the test
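The relationship between the Type I error threshold, the difference d, the sample size and the power can be sketched numerically. The function below is an illustration under assumptions that are not in the slides: a two-group comparison of means with a known common standard deviation (a z-test), equal group sizes, and d expressed in standard deviation units. The name `power_two_sample_z` is hypothetical.

```python
from statistics import NormalDist

def power_two_sample_z(effect_size, n_per_group, alpha=0.05, two_sided=True):
    """Approximate power of a two-group z-test for a difference in means.

    effect_size is the true difference divided by the common SD.
    """
    std_normal = NormalDist()
    # Expected value of the z statistic when the true difference is effect_size.
    shift = effect_size * (n_per_group / 2) ** 0.5
    if two_sided:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper_tail = 1 - std_normal.cdf(z_crit - shift)
        lower_tail = std_normal.cdf(-z_crit - shift)
        return upper_tail + lower_tail
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - shift)

print("baseline (d=0.5, n=64):", round(power_two_sample_z(0.5, 64), 3))
print("larger sample (n=128):", round(power_two_sample_z(0.5, 128), 3))
print("larger difference (d=0.8):", round(power_two_sample_z(0.8, 64), 3))
print("looser alpha (0.10):", round(power_two_sample_z(0.5, 64, alpha=0.10), 3))
print("directional hypothesis:", round(power_two_sample_z(0.5, 64, two_sided=False), 3))
```

Each of the last four lines changes one input relative to the baseline, and each change raises the power; a non-directional (two-sided) hypothesis splits the Type I error across both tails and so has lower power than a directional one at the same threshold.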
How to calculate : Software
1. G*Power
2. PASS
3. SPSS
4. R
Basic Principles
1. Define a research hypothesis
2. Define the primary and the secondary endpoints
3. Define the measurement:
a. What to measure
b. In whom to measure
c. Where to measure
d. When to measure
e. Why to measure - most important
Sample Size Calculation: Example Scenarios
1. Cataract surgery in mobile eye surgical unit: Safe and viable alternative
2. Topical sodium cromoglycate in management of chronic non-infectious conjunctivitis: A Double blind controlled clinical trial
Sample size for comparing proportions
1. Endpoint : Cumulative infection rate at 72 hours. Measure : percent or ratio
2. Single sample design
3. “Hopefully” random
4. We approach in two ways:
a. Compare against a “known” rate
b. Estimate the precision of the estimate we generate
Sample size calculation
Sample size calculation
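For the first approach, comparing a single-sample rate against a "known" rate, one common normal-approximation formula can be sketched as follows. The rates used here (a known infection rate of 5% and a rate of 10% that the study should be able to detect) are hypothetical placeholders, as is the function name; the slides do not state the values that were used.

```python
import math
from statistics import NormalDist

def n_single_proportion(p_known, p_expected, alpha=0.05, power=0.80):
    """Sample size to compare one observed proportion against a known rate.

    Normal approximation, two-sided test:
    n = ((z_a * sqrt(p0 * q0) + z_b * sqrt(p1 * q1)) / (p1 - p0)) ** 2
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    spread_null = math.sqrt(p_known * (1 - p_known))
    spread_alt = math.sqrt(p_expected * (1 - p_expected))
    n = ((z_alpha * spread_null + z_beta * spread_alt) / (p_expected - p_known)) ** 2
    return math.ceil(n)

# Hypothetical inputs: known rate 5%, rate to be detected 10%.
print(n_single_proportion(p_known=0.05, p_expected=0.10))
```

Dedicated software such as G*Power or PASS may use exact binomial methods and return a somewhat different number for the same inputs.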
Sample Size for Confidence Interval Estimates
● Most commonly used for single sample situations
● Confidence intervals basically indicate the range of plausible values of the
population estimate that is desired.
● Essentially implies that if the same experiment is repeated many times, x% of the confidence intervals so constructed will contain the true population value
● Easier to do as historical precedent need not be present.
Sample Size for Confidence Interval Estimates
Endpoint is the precision of estimate of the mean here.
Let us assume that you would be satisfied with a rate of 5% and do not want the
estimate to go beyond 8% (± 5%).
You want the confidence level to be 95%
https://select-statistics.co.uk/calculators/sample-size-calculator-population-proportion/
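The precision-based calculation can be sketched with the usual normal-approximation formula for a proportion, n = z² p (1 − p) / E², where E is the margin of error. The function name is hypothetical, no finite-population correction is applied, and both margins mentioned on the slide are shown: 3 percentage points (5% up to 8%) and 5 percentage points.

```python
import math
from statistics import NormalDist

def n_for_proportion_precision(expected_rate, margin, confidence=0.95):
    """Sample size so the confidence interval half-width is at most `margin`."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n = z ** 2 * expected_rate * (1 - expected_rate) / margin ** 2
    return math.ceil(n)

# Expected rate of 5% at a 95% confidence level, for two candidate margins.
for margin in (0.03, 0.05):
    n = n_for_proportion_precision(expected_rate=0.05, margin=margin)
    print(f"margin ±{margin:.0%}: n = {n}")
```

Because the margin is squared in the denominator, tightening it from ±5% to ±3% nearly triples the required sample.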
Sample Size Calculation: Example Scenarios
1. Cataract surgery in mobile eye surgical unit: Safe and viable alternative
2. Topical sodium cromoglycate in management of chronic non-infectious conjunctivitis: A Double blind controlled clinical trial
Primary Endpoint
1. What are we measuring : Patient's subjective report of improvement in
symptoms
2. Whom are we measuring it in : Patients with B/L chronic non infective
conjunctivitis
3. Where are we measuring it : In a hospital where the study is being
conducted*
4. When are we measuring it : At 4 weeks
5. Why are we measuring it : Is the drug better than a placebo for this condition.
Sample Size : Mean Score
Endpoint is an estimate of the mean score in the questionnaire at 4 weeks
We want to know if the mean score of the patients in the control group is different
from the score in the test group
Assume a random sample
Sample size calculation
Sample size calculation
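For the comparison of mean scores between the control and test groups, a normal-approximation sketch is n per group = 2 (z_α + z_β)² (SD / difference)². The inputs below (a 5-point difference on the questionnaire and a standard deviation of 10 points) are hypothetical, since the slides do not give the values used, and the function name is an assumption as well.

```python
import math
from statistics import NormalDist

def n_per_group_two_means(difference, sd, alpha=0.05, power=0.80):
    """Patients per group to detect `difference` between two means.

    Assumes equal group sizes, a common SD, and a two-sided test.
    """
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    n = 2 * (z_alpha + z_beta) ** 2 * (sd / difference) ** 2
    return math.ceil(n)

# Hypothetical inputs: 5-point difference in mean score, SD of 10 points.
print(n_per_group_two_means(difference=5, sd=10))
```

Software that bases the calculation on the t distribution typically returns a slightly larger number than this normal approximation.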
Time to Event Endpoint
Endpoint is an estimate of median time taken for the symptom score to normalize
Here, comparing the median times with a t-test approach will fail
What we need is a sample size estimation for a time to event outcome
Hazard rates and ratio
Survival curves are usually assumed to follow an exponential distribution.
The probability of surviving for a specific time period t is given as $P = e^{-ht}$
Here h = the instantaneous hazard rate
$h = \ln(2) / \text{Median Survival Time}$
$h = -\ln(S(T)) / T$, where T is time and S(T) is the proportion surviving up to time T
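Under the exponential assumption, the hazard rate can be recovered either from the median survival time or from the proportion surviving at a given time, and the two routes agree. The sketch below shows this; the function names and the 6-month median are hypothetical.

```python
import math

def hazard_from_median(median_time):
    """Hazard rate when half the patients have had the event by median_time."""
    return math.log(2) / median_time

def hazard_from_survival(surviving_fraction, time):
    """Hazard rate from the proportion still event-free at `time`."""
    return -math.log(surviving_fraction) / time

def survival_probability(hazard, time):
    """Probability of remaining event-free up to `time`: exp(-h * t)."""
    return math.exp(-hazard * time)

# Hypothetical example: median survival of 6 months.
h = hazard_from_median(6)
print(round(h, 4), "per month")
print(round(survival_probability(h, 6), 2), "surviving at the median")
print(round(hazard_from_survival(0.5, 6), 4), "per month, from S(6) = 0.5")
```

The median is simply the time at which S(T) = 0.5, which is why the median formula contains ln(2).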
Sample Size : Time to Symptomatic Change
Assume that 40% of the patients receiving placebo in the control group improve at 4
weeks.
We consider that a clinically meaningful difference exists if the proportion of patients
improving differs by 20 percentage points
● 20% or less improve with drug at 4 weeks - significantly worse
● 60% or more improve with drug at 4 weeks - significantly better
We assume that the rate of improvement over the 4 weeks is constant implying
uniform hazard rate.
Sample Size : Time to Symptomatic Change
% improving in 4 weeks in placebo arm : 40%
% not improving in 4 weeks in placebo arm : 60%
Hazard rate of improving in placebo arm : -ln(0.6)/4 = 0.128 per week
% improving in 4 weeks with drug : 60%
% not improving in 4 weeks with drug : 40%
Hazard rate of improving with drug : -ln(0.4)/4 = 0.229 per week
Hazard ratio (drug vs placebo) = 0.229 / 0.128 = 1.79
Sample size calculation
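The hazard rates can be checked by applying h = −ln(S(T))/T to the proportions not yet improved at 4 weeks (60% with placebo, 40% with drug), which gives a hazard ratio of about 1.79. From the hazard ratio, one widely used approximation for the number of events needed is Schoenfeld's formula for 1:1 allocation, events = 4 (z_α + z_β)² / (ln HR)². The slides do not say which method was used, so treat this as an illustrative sketch; the function names and the rough conversion from events to patients are assumptions.

```python
import math
from statistics import NormalDist

def hazard_rate(event_free_fraction, time):
    """Constant hazard implied by the proportion still event-free at `time`."""
    return -math.log(event_free_fraction) / time

def required_events(hazard_ratio, alpha=0.05, power=0.80):
    """Schoenfeld approximation: total events for a two-sided test, 1:1 arms."""
    std_normal = NormalDist()
    z_alpha = std_normal.inv_cdf(1 - alpha / 2)
    z_beta = std_normal.inv_cdf(power)
    return math.ceil(4 * (z_alpha + z_beta) ** 2 / math.log(hazard_ratio) ** 2)

h_placebo = hazard_rate(0.60, 4)  # 60% not yet improved at 4 weeks
h_drug = hazard_rate(0.40, 4)     # 40% not yet improved at 4 weeks
hr = h_drug / h_placebo
events = required_events(hr)

# Rough conversion (assumption): divide by the average chance of the event
# occurring within follow-up, here the mean of 40% and 60%.
patients = math.ceil(events / ((0.40 + 0.60) / 2))
print(f"h_placebo={h_placebo:.3f}, h_drug={h_drug:.3f}, HR={hr:.2f}")
print(f"events needed={events}, approximate patients={patients}")
```

Here the event is improvement, so a hazard ratio above 1 means patients on the drug improve faster than those on placebo.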
Summary
1. Sample size calculation is an integral part of valid and ethical scientific research
2. Lots of tools available
3. Important to define the hypothesis and end point clearly for proper sample
size
Thank You


Editor's Notes

  • #5: An unbiased (representative) sample is a set of objects chosen from a complete sample using a selection process that does not depend on the properties of the objects. There are several types of non-random sample, but the most well known and most abused is the convenience sample.
  • #18: Raising the acceptable Type I error threshold increases the power, and lowering it reduces the power.
  • #35: An exponential distribution is produced when events occur CONTINUOUSLY and INDEPENDENTLY at a constant average rate.