SAMPLE SIZE (2)
Dr Htin Zaw Soe
MBBS, DFT, MMedSc (P & TM), PhD, DipMedEd
Associate Professor, Department of Biostatistics
University of Public Health, Yangon

Sample size
- It is not necessarily true that the bigger the sample size, the better the study becomes.
- To get a better study, it is necessary to increase the accuracy of data collection and to have a representative sample.
- The desired sample size is determined by the expected variation in the data (i.e. the more varied the data, the larger the sample size needed to get the same level of accuracy).
- For exploratory studies, start with a small sample size (e.g. n = 30).
- For cross-sectional and analytical studies, the sample size is calculated.
- The eventual sample size is usually a compromise between what is desirable and what is feasible.
- The feasible n is determined by time, manpower, transport and money.
Rules:
- many variables → smaller n
- few variables → larger n
- more varied data → larger n
- at least 5 to 10 study units per cell in cross-tabulations

Sample size determination
- By formula
- By table of minimum sample size

Sample size calculation formulae
- Divided into two categories:
(A) For studies trying to measure a variable with a certain precision
(B) For studies seeking to demonstrate a significant difference between two groups

(A) For studies trying to measure a variable with a certain precision
Abbreviations used are:
n = sample size
s = standard deviation
e = required size of the standard error
(the margin of error is ± 2 times the standard error (e) if a 95% confidence level is required)
r = rate
p = percentage
d = multiplier of the standard error for the chosen confidence level
[For a 90% confidence level, d = 1 (normal value 1.645)]
[For a 95% confidence level, d = 2 (normal value 1.96)]
[For a 99% confidence level, d = 3 (normal value 2.58)]
e = (width of interval) / (2d)
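
The relation e = (width of interval) / (2d) can be sketched in Python as follows. The function names are illustrative assumptions, not part of the lecture, and `statistics.NormalDist` is used only to show where the normal values quoted in brackets come from.

```python
from statistics import NormalDist


def standard_error_from_interval(lower, upper, d=2.0):
    """Required standard error: e = (width of interval) / (2 * d)."""
    return (upper - lower) / (2.0 * d)


def exact_multiplier(confidence):
    """Two-sided normal multiplier for a confidence level given as a fraction."""
    return NormalDist().inv_cdf(0.5 + confidence / 2.0)


# A desired interval of 2950 to 3050 with the working value d = 2
print(standard_error_from_interval(2950, 3050))

# Normal values behind the working values of d
for level in (0.90, 0.95, 0.99):
    print(level, round(exact_multiplier(level), 3))
```

Using the working value d = 2 rather than 1.96 makes e slightly smaller, so the resulting sample size errs on the large side.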

(1) Single mean
\[ n = \frac{s^2}{e^2} \]
(2) Single rate
\[ n = \frac{r}{e^2} \]
(3) Single proportion
\[ n = \frac{p(1-p)}{e^2} \]
(4) Difference between two means (n in each group)
\[ n = \frac{s_1^2 + s_2^2}{e^2} \]
(5) Difference between two rates (n in each group)
\[ n = \frac{r_1 + r_2}{e^2} \]
(6) Difference between two proportions (n in each group)
\[ n = \frac{p_1(1-p_1) + p_2(1-p_2)}{e^2} \]
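
As a rough sketch, the six precision formulas translate directly into Python. The function names are assumptions made for illustration. Proportions and rates are expected as fractions, and e must be on the same scale as the quantity being estimated.

```python
def n_single_mean(s, e):
    """(1) n = s^2 / e^2"""
    return s ** 2 / e ** 2


def n_single_rate(r, e):
    """(2) n = r / e^2"""
    return r / e ** 2


def n_single_proportion(p, e):
    """(3) n = p(1 - p) / e^2"""
    return p * (1 - p) / e ** 2


def n_two_means(s1, s2, e):
    """(4) n in each group = (s1^2 + s2^2) / e^2"""
    return (s1 ** 2 + s2 ** 2) / e ** 2


def n_two_rates(r1, r2, e):
    """(5) n in each group = (r1 + r2) / e^2"""
    return (r1 + r2) / e ** 2


def n_two_proportions(p1, p2, e):
    """(6) n in each group = [p1(1 - p1) + p2(1 - p2)] / e^2"""
    return (p1 * (1 - p1) + p2 * (1 - p2)) / e ** 2


# Illustrative inputs: standard deviation 500, standard error 25, then 12.5
print(n_single_mean(500, 25))
print(n_single_mean(500, 12.5))
```

Every formula divides a variability term by e squared, so halving the required standard error quadruples the sample size. In practice the result would be rounded up to the next whole study unit.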

Single mean
In a study the mean weight of newborn babies will be determined. The mean weight is expected to be 3000 grams. Weights are approximately normally distributed and 95% of the birth weights are probably between 2000 and 4000 grams; this range of 2000 grams spans about 4 standard deviations, so the standard deviation would be 500 grams. The desired 95% confidence interval is 2950 to 3050 grams, so the standard error would be 25 grams. The required sample size would be:
\[ n = \frac{s^2}{e^2} = \frac{500^2}{25^2} = \frac{250000}{625} = 400 \text{ newborn babies} \]
(Note: e = width of interval / 2d = 100 / (2 × 2) = 25)
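
The steps of this example, from the expected range of birth weights to the sample size, can be retraced with a short script. The variable names are illustrative assumptions; the numbers are those of the example.

```python
import math

lower_95, upper_95 = 2000.0, 4000.0   # range expected to hold about 95% of birth weights (grams)
ci_low, ci_high = 2950.0, 3050.0      # desired 95% confidence interval (grams)
d = 2.0                               # working multiplier for 95%

s = (upper_95 - lower_95) / (2 * d)   # the 95% range spans about 2 * d standard deviations
e = (ci_high - ci_low) / (2 * d)      # required standard error
n = math.ceil(s ** 2 / e ** 2)        # round up to whole babies

print(s, e, n)
```

Because s and e are both obtained by dividing a width by 2d, the multiplier cancels in this particular example: n equals (2000 / 100) squared, whichever value of d is used.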

Single rate
The maternal mortality rate in a country is expected to be 70 per 10,000 live births. A survey is planned to determine the maternal mortality rate with a 95% confidence interval of 60 to 80 per 10,000 live births. The standard error would therefore be 5/10,000. The required sample size would be:
\[ n = \frac{r}{e^2} = \frac{70/10000}{(5/10000)^2} = 28000 \text{ live births} \]
(Note: e = width of interval / 2d = [20 / (2 × 2)] / 10,000 = 5/10,000)
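
Rates expressed per 10,000 involve very small numbers, and floating-point arithmetic may leave a tiny residue that pushes a rounded-up result one unit too high. A minimal sketch with exact fractions, assuming Python's standard `fractions` module and illustrative variable names, sidesteps that:

```python
from fractions import Fraction
import math

per = 10_000                             # rates are quoted per 10,000 live births
r = Fraction(70, per)                    # expected maternal mortality rate
e = Fraction(80 - 60, 2 * 2) / per       # e = width / (2d) with d = 2, on the same scale as r

n = r / e ** 2                           # single-rate formula, n = r / e^2
print(n, math.ceil(n))
```

The key point is that r and e must be on the same scale: both are divided by 10,000 before the formula is applied.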

Single proportion
The proportion of nurses leaving the health services within three years of graduation is estimated to be 30%. A study that aims to find the causes of this also aims to determine the percentage leaving the service with a 95% confidence interval of 25% to 35%. The standard error would therefore be 2.5%. The required sample size would be:
\[ n = \frac{p(100-p)}{e^2} = \frac{30 \times 70}{2.5^2} = 336 \text{ nurses} \]
(Note: e = width of interval / 2d = 10 / (2 × 2) = 2.5. Here p and e are both in percent, so 100 - p takes the place of 1 - p.)
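
The formula gives the same answer whether p and e are written as percentages or as fractions, provided both use the same scale. The sketch below illustrates this; the `scale` parameter and the function name are assumptions introduced for the illustration.

```python
def n_single_proportion(p, e, scale=1.0):
    """n = p * (scale - p) / e^2, with p and e both on the given scale (1 or 100)."""
    return p * (scale - p) / e ** 2


n_percent = n_single_proportion(30.0, 2.5, scale=100.0)   # percent scale, as in the example
n_fraction = n_single_proportion(0.30, 0.025)             # fraction scale

print(n_percent, round(n_fraction, 6))
```

Mixing scales, for example p = 30 with e = 0.025, is a common slip and gives a meaningless result.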
  
Difference between two means (sample size in each group)
The difference between the mean birth weights in districts A and B will be determined. In district A the mean is expected to be 3000 grams with a standard deviation of 500 grams. In district B the mean is expected to be 3200 grams with a standard deviation of 500 grams. The difference in mean birth weight between districts A and B is therefore expected to be 200 grams. The desired 95% confidence interval of this difference is 100 to 300 grams, giving a standard error of the difference of 50 grams. The required sample size would be:
\[ n = \frac{s_1^2 + s_2^2}{e^2} = \frac{500^2 + 500^2}{50^2} = 200 \text{ newborns in each district} \]
(Note: e = width of interval / 2d = 200 / (2 × 2) = 50)

Difference between two rates (sample size in each group)
The difference in maternal mortality rates between urban and rural areas will be determined. In the rural areas the maternal mortality rate is expected to be 100 per 10,000 live births and in the urban areas 50 per 10,000 live births. The difference is therefore 50 per 10,000 live births. The desired 95% confidence interval of this difference is 30 to 70 per 10,000 live births, giving a standard error of the difference of 10/10,000. The required sample size would be:
\[ n = \frac{r_1 + r_2}{e^2} = \frac{100/10000 + 50/10000}{(10/10000)^2} = 15000 \text{ live births in each area} \]
(Note: e = width of interval / 2d = [40 / (2 × 2)] / 10,000 = 10/10,000)

Difference between two proportions (sample size in each group)
The difference in the proportion of nurses leaving the service will be determined between two regions. In one region 30% of the nurses are estimated to leave the service within three years of graduation, in the other region 15%, giving a difference of 15%. The desired 95% confidence interval for this difference is 5% to 25%, giving a standard error of 5%. The sample size in each group would be:
\[ n = \frac{p_1(100-p_1) + p_2(100-p_2)}{e^2} = \frac{30 \times 70 + 15 \times 85}{5^2} = 135 \text{ nurses in each region} \]
(Note: e = width of interval / 2d = 20 / (2 × 2) = 5)
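
The three two-group examples (means, rates and proportions) can be recomputed in one place. This is only a checking sketch: the helper `e_from_interval` and the variable names are assumptions, and exact fractions are used so that the small rates do not suffer floating-point rounding.

```python
from fractions import Fraction


def e_from_interval(low, high, d=2):
    """Standard error of the difference: (width of interval) / (2 * d)."""
    return Fraction(high - low, 2 * d)


# Two means: standard deviations of 500 grams, interval 100 to 300 grams
e_means = e_from_interval(100, 300)
n_means = (500 ** 2 + 500 ** 2) / e_means ** 2

# Two rates: 100 and 50 per 10,000, interval 30 to 70 per 10,000
r1, r2 = Fraction(100, 10_000), Fraction(50, 10_000)
e_rates = e_from_interval(30, 70) / 10_000
n_rates = (r1 + r2) / e_rates ** 2

# Two proportions in percent: 30% and 15%, interval 5% to 25%
e_props = e_from_interval(5, 25)
n_props = Fraction(30 * 70 + 15 * 85) / e_props ** 2

print(n_means, n_rates, n_props)
```

Each result is the sample size per group, so the total number of study units is twice the figure printed.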

(B) For studies seeking to demonstrate a significant difference between two groups
Abbreviations used are:
n = sample size
s = standard deviation
e = required size of the standard error
m = mean
r = rate
p = percentage
u = one-sided percentage point of the normal distribution corresponding to 100% - the power. The power is the probability of finding a significant result (e.g. if the power is 75%, u = 0.67).
v = percentage point of the normal distribution corresponding to the (two-sided) significance level (e.g. if the significance level is 5%, as usual, v = 1.96).
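
Instead of reading u and v from a printed table, they can be obtained from the standard normal quantile function. A minimal sketch with Python's `statistics.NormalDist` follows; the function names are assumptions made for the illustration.

```python
from statistics import NormalDist


def u_for_power(power):
    """One-sided percentage point corresponding to 100% - the power."""
    return NormalDist().inv_cdf(power)


def v_for_significance(alpha):
    """Percentage point for a two-sided significance level alpha."""
    return NormalDist().inv_cdf(1 - alpha / 2)


# Power 75% and significance level 5%, as in the definitions above
print(round(u_for_power(0.75), 2), round(v_for_significance(0.05), 2))
```

Both arguments are fractions, so a power of 75% is passed as 0.75 and a 5% significance level as 0.05.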

(1) Comparison of two means (n in each group)
\[ n = \frac{(u+v)^2 \, (s_1^2 + s_2^2)}{(m_1 - m_2)^2} \]
(2) Comparison of two rates (n in each group)
\[ n = \frac{(u+v)^2 \, (r_1 + r_2)}{(r_1 - r_2)^2} \]
(3) Comparison of two proportions (n in each group)
\[ n = \frac{(u+v)^2 \, \{p_1(1-p_1) + p_2(1-p_2)\}}{(p_1 - p_2)^2} \]
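
The three comparison formulas can be sketched in Python as below. The function names and the default power of 80% and significance level of 5% are assumptions chosen for illustration. The inputs reuse the birth-weight figures of part (A) purely as a hypothetical case; the lecture gives no worked example for part (B).

```python
import math
from statistics import NormalDist


def u_plus_v_squared(power=0.80, alpha=0.05):
    """(u + v)^2 for a given power and two-sided significance level."""
    z = NormalDist()
    return (z.inv_cdf(power) + z.inv_cdf(1 - alpha / 2)) ** 2


def n_compare_means(s1, s2, m1, m2, power=0.80, alpha=0.05):
    """(1) n in each group for comparing two means."""
    return u_plus_v_squared(power, alpha) * (s1 ** 2 + s2 ** 2) / (m1 - m2) ** 2


def n_compare_rates(r1, r2, power=0.80, alpha=0.05):
    """(2) n in each group for comparing two rates."""
    return u_plus_v_squared(power, alpha) * (r1 + r2) / (r1 - r2) ** 2


def n_compare_proportions(p1, p2, power=0.80, alpha=0.05):
    """(3) n in each group for comparing two proportions (as fractions)."""
    return u_plus_v_squared(power, alpha) * (p1 * (1 - p1) + p2 * (1 - p2)) / (p1 - p2) ** 2


# Hypothetical case: means of 3000 and 3200 grams, both standard deviations 500 grams
n = n_compare_means(500, 500, 3000, 3200)
print(math.ceil(n))
```

The smaller the difference to be detected, the larger n becomes, because the difference enters the denominator squared.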

Other formulae (Ref No. 2)
(1) For cross-sectional study
(1.1) For measuring one variable: single proportion
\[ n = p \, q \left( \frac{z_\alpha}{d} \right)^2 \]
(the same as n = p(1-p) / e^2, with e = d / z_α)
n = sample size
p = the approximate value of the proportion or percentage of interest to be determined (if it is not known, use 0.5 for p as a conservative estimate)
q = 1 - p
z_α = percentage point of the normal distribution corresponding to the two-sided significance level (can be found from the Standard Normal Table or z table)
d = precision: how close to the proportion of interest the estimate is desired to be (here d is the half-width of the interval, not the multiplier d used in part (A))
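
A short sketch of the single-proportion formula with z_α and d follows. The function name is an assumption, and the inputs (p = 0.5, d = 0.05, 95% confidence) are hypothetical values chosen to show the conservative case in which p is unknown.

```python
import math
from statistics import NormalDist


def n_cross_sectional(p, d, confidence=0.95):
    """n = p * q * (z_alpha / d)^2, with p and d given as fractions."""
    z_alpha = NormalDist().inv_cdf(0.5 + confidence / 2)
    return p * (1 - p) * (z_alpha / d) ** 2


# Unknown proportion, so p = 0.5; desired precision of 5 percentage points
print(math.ceil(n_cross_sectional(0.5, 0.05)))

# Any other p gives a smaller n, which is why 0.5 is the conservative choice
print(math.ceil(n_cross_sectional(0.3, 0.05)))
```

The product p(1 - p) is largest at p = 0.5, so that value can never underestimate the required sample size.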

(1.2) For difference between two proportions
\[ n = \frac{z_\alpha^2 \, (p_1 q_1 + p_2 q_2)}{d^2} \]
(the same as n = [p_1(1-p_1) + p_2(1-p_2)] / e^2, with e = d / z_α)
p_1 = the proportion or percentage of interest to be determined for group 1
p_2 = the proportion or percentage of interest to be determined for group 2
q_1 = 1 - p_1
q_2 = 1 - p_2
d = precision
z_α = percentage point of the normal distribution corresponding to the two-sided significance level
n = sample size in each group

(2) For analytical studies
(2.1) For significant difference between two groups: comparison of two proportions
\[ n = \frac{(z_\alpha + z_\beta)^2 \, (p_1 q_1 + p_2 q_2)}{(p_1 - p_2)^2} \]
(the same as n = (u + v)^2 {p_1(1-p_1) + p_2(1-p_2)} / (p_1 - p_2)^2, with v = z_α and u = z_β)
p_1 = the prevalence, proportion or percentage of interest of group 1
p_2 = the prevalence, proportion or percentage of interest of group 2
q_1 = 1 - p_1
q_2 = 1 - p_2
z_α = percentage point of the normal distribution corresponding to the two-sided significance level
z_β (also written z_{1-β}) = one-sided percentage point of the normal distribution corresponding to 100% - the power (can be found from the Standard Normal Table or z table)

(2.2) For case control study
\[ n = \frac{2 \, (z_\alpha + z_\beta)^2 \, p \, q}{(p_0 - p_1)^2} \]
p_1 = the estimated proportion of individuals among the cases who were exposed:
\[ p_1 = \frac{p_0 \times OR}{1 + p_0 \, (OR - 1)} \]
p_0 = proportion of individuals among the controls who are expected to have been exposed
OR = odds ratio to be tested for statistical significance, specified by the investigator
\[ p = \frac{p_0 + p_1}{2} \]
q = 1 - p
z_α = percentage point of the normal distribution corresponding to the two-sided significance level
z_β (also written z_{1-β}) = one-sided percentage point of the normal distribution corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
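
The case control calculation has two steps: derive p_1 from p_0 and the odds ratio, then apply the sample size formula. The sketch below follows those steps. The function names, the default power of 80% and significance level of 5%, and the inputs (p_0 = 0.20, OR = 2) are hypothetical. The result is read here as the number needed in each group, which is an assumption, since the slide does not state it.

```python
import math
from statistics import NormalDist


def exposed_proportion_in_cases(p0, odds_ratio):
    """p1 = p0 * OR / [1 + p0 * (OR - 1)]"""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))


def n_case_control(p0, odds_ratio, power=0.80, alpha=0.05):
    """n = 2 * (z_alpha + z_beta)^2 * p * q / (p0 - p1)^2"""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p1 = exposed_proportion_in_cases(p0, odds_ratio)
    p = (p0 + p1) / 2          # average exposure proportion
    q = 1 - p
    return 2 * (z_alpha + z_beta) ** 2 * p * q / (p0 - p1) ** 2


# Hypothetical inputs: 20% of controls exposed, odds ratio of 2 to be detected
print(round(exposed_proportion_in_cases(0.20, 2.0), 4))
print(math.ceil(n_case_control(0.20, 2.0)))
```

An odds ratio of 1 gives p_1 = p_0, so the denominator becomes zero; the formula only makes sense for an odds ratio different from 1.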

(2.3) For cohort study
\[ n = \frac{1}{1-f} \left[ \frac{2 \, (z_\alpha + z_\beta)^2 \, p \, q}{(p_0 - p_1)^2} \right] \]
f = proportion of study subjects who are expected to leave the study (drop-out)
p_0 = proportion of participants in the unexposed group who are expected to exhibit the outcome of interest
p_1 = proportion of participants in the exposed group who are expected to exhibit the outcome of interest
\[ p = \frac{p_0 + p_1}{2} \]
q = 1 - p
z_α = percentage point of the normal distribution corresponding to the two-sided significance level
z_β (also written z_{1-β}) = one-sided percentage point of the normal distribution corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
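
The cohort formula is the two-group calculation inflated by 1 / (1 - f) to allow for drop-out. A minimal sketch follows; the function name, the default power of 80% and significance level of 5%, and the inputs (p_0 = 0.10, p_1 = 0.20, f = 0.10) are hypothetical.

```python
import math
from statistics import NormalDist


def n_with_dropout(p0, p1, f, power=0.80, alpha=0.05):
    """n = [1 / (1 - f)] * 2 * (z_alpha + z_beta)^2 * p * q / (p0 - p1)^2"""
    if not 0 <= f < 1:
        raise ValueError("f must be a fraction in [0, 1)")
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)
    z_beta = z.inv_cdf(power)
    p = (p0 + p1) / 2
    q = 1 - p
    base = 2 * (z_alpha + z_beta) ** 2 * p * q / (p0 - p1) ** 2
    return base / (1 - f)      # inflate for expected drop-out


# Hypothetical inputs: outcome in 10% of unexposed and 20% of exposed, 10% drop-out
print(math.ceil(n_with_dropout(0.10, 0.20, 0.0)))    # without drop-out
print(math.ceil(n_with_dropout(0.10, 0.20, 0.10)))   # with 10% drop-out
```

Dividing by 1 - f, rather than multiplying by 1 + f, ensures that the expected number of subjects remaining after drop-out still meets the calculated size.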

(3) For randomized clinical trial
\[ n = \frac{1}{1-f} \left[ \frac{2 \, (z_\alpha + z_\beta)^2 \, p \, q}{(p_0 - p_1)^2} \right] \]
f = proportion of study subjects who are expected to leave the study (drop-out)
p_0 = proportion of participants in the control treatment group who are expected to exhibit the outcome of interest
p_1 = proportion of participants in the treatment group who are expected to exhibit the outcome of interest
\[ p = \frac{p_0 + p_1}{2} \]
q = 1 - p
z_α = percentage point of the normal distribution corresponding to the two-sided significance level
z_β (also written z_{1-β}) = one-sided percentage point of the normal distribution corresponding to 100% - the power (can be found from the Standard Normal Table or z table)

Sample size determination by table of minimum sample size
[See the manual by Lwanga SK and Lemeshow S (1991)]

References:
(1) Varkevisser C, Pathmanathan I, Brownlee A (2000). Health Systems Research Training Series: Volume 2 - Designing and conducting health systems research projects; Part I - Proposal Development and Fieldwork.
(2) Department of Medical Research (Lower Myanmar) (2010). Lecture Guide on Research Methodology. 7th edition. Union of Myanmar: Department of Medical Research (Lower Myanmar), Ministry of Health: 187.
(3) Lwanga SK and Lemeshow S (1991). Sample size determination in health studies: A practical manual. WHO, Geneva. pp 80.

THE END


Sample size by formula

  • 1. SAMPLE SIZE (2)SAMPLE SIZE (2) Dr Htin Zaw SoeDr Htin Zaw Soe MBBS, DFT, MMedSc (P & TM), PhD,MBBS, DFT, MMedSc (P & TM), PhD, DipMedEdDipMedEd Associate Professor, Department ofAssociate Professor, Department of BiostatisticsBiostatistics University of Public Health, YangonUniversity of Public Health, Yangon
  • 2. Sample sizeSample size It isIt is not necessarily truenot necessarily true that the bigger the sample size, the better thethat the bigger the sample size, the better the study becomesstudy becomes To get a better study, it is necessary to increaseTo get a better study, it is necessary to increase accuracyaccuracy of dataof data collection and to have acollection and to have a representativerepresentative samplesample A desired sample size – determined by expected variation in data (ie.A desired sample size – determined by expected variation in data (ie. the more varied the data, the larger the sample size to get same levelthe more varied the data, the larger the sample size to get same level of accuracy)of accuracy) For exploratory studies, start with a small sample size (eg. n= 30)For exploratory studies, start with a small sample size (eg. n= 30) For cross-sectional and analytical studies, sample size - calculated.For cross-sectional and analytical studies, sample size - calculated. The eventual sample size is usually a compromise between what isThe eventual sample size is usually a compromise between what is desirabledesirable and what isand what is feasible.feasible.
  • 3.  Feasible ‘n’ - determined by time/manpower/transport/moneyFeasible ‘n’ - determined by time/manpower/transport/money Rules: many variables → smaller nRules: many variables → smaller n : few variables → larger n: few variables → larger n : more varied the data → larger n: more varied the data → larger n : at least 5 – 10 study units per cell in cross- tabulations: at least 5 – 10 study units per cell in cross- tabulations Sample size determinationSample size determination - By formulaBy formula - By table of minimum sample sizeBy table of minimum sample size Sample size calculation formulaeSample size calculation formulae - Divided into two categoriesDivided into two categories (A) For studies trying to measure a variable with a certain(A) For studies trying to measure a variable with a certain precisionprecision (B) For studies seeking to demonstrate a(B) For studies seeking to demonstrate a significant differencesignificant difference between two groupsbetween two groups
  • 4. (A) For studies trying to measure a variable with a certain(A) For studies trying to measure a variable with a certain precisionprecision  Abbreviations used are:Abbreviations used are: n = sample sizen = sample size s = standard deviations = standard deviation e = required size of standard errore = required size of standard error ( margin of error is used for ± 2 times the size of standard error (e) if( margin of error is used for ± 2 times the size of standard error (e) if a precision of 95% is required)a precision of 95% is required) r = rater = rate p = percentagep = percentage d = confidence leveld = confidence level [For 90% confidence level, d = 1 (1.645) ][For 90% confidence level, d = 1 (1.645) ] [For 95% confidence level , d = 2 (1.96) ][For 95% confidence level , d = 2 (1.96) ] [For 99% confidence level , d = 3 (2.58) ][For 99% confidence level , d = 3 (2.58) ] e = width of interval / 2de = width of interval / 2d
  • 5. (1) Single mean(1) Single mean n = sn = s22 / e/ e22 (2) Single rate(2) Single rate n = r / en = r / e22 (3) Single proportion(3) Single proportion n = p (1-p) / en = p (1-p) / e22 (4) Difference between two means ( n in each group)(4) Difference between two means ( n in each group) n = sn = s11 22 + s+ s22 22 / e/ e22 (5) Difference between two rates ( n in each group)(5) Difference between two rates ( n in each group) n = rn = r11 + r+ r22 / e/ e22 (6) Difference between two proportions ( n in each group)(6) Difference between two proportions ( n in each group) n = pn = p11(1 –p(1 –p11) + p) + p22(1-p(1-p22) / e) / e22
  • 6. Single meanSingle mean In a study the mean weight of newborn babies will be determined. TheIn a study the mean weight of newborn babies will be determined. The mean weight is expected to be 3000 grams. Weights are approximatelymean weight is expected to be 3000 grams. Weights are approximately normally distributed and 95% of the birth weights are probablynormally distributed and 95% of the birth weights are probably between 2000 and 4000 gram; therefore the standard deviation wouldbetween 2000 and 4000 gram; therefore the standard deviation would bebe 500500 gram. The desired 95% confidence interval isgram. The desired 95% confidence interval is 2950 to 30502950 to 3050 gram, so the standard error would be 25 gram. The required samplegram, so the standard error would be 25 gram. The required sample size would be:size would be: n=n=ss22 ==50050022 ==250000250000=400 new born babies=400 new born babies ee22 252522 625625 (Note:(Note: e= width of interval /2de= width of interval /2d = 100/2× 2 = 25)= 100/2× 2 = 25)
  • 7. Single rateSingle rate    The maternal mortality rate in a country is expected to be 70 per 10,000 The maternal mortality rate in a country is expected to be 70 per 10,000  live births.  A survey is planned to determine the maternal mortality live births.  A survey is planned to determine the maternal mortality  rate with a 95% confidence interval of 60 to 80 per 10,000 live births.  rate with a 95% confidence interval of 60 to 80 per 10,000 live births.   The standard error would therefore be 5/10,000.  The required sample The standard error would therefore be 5/10,000.  The required sample  size would be:size would be:    n=n=r r == 70/10000  70/10000  =28,000 live births=28,000 live births         ee22    (5/10000)(5/10000)22 (Note: (Note: e= width of interval /2de= width of interval /2d = [(20/2× 2) /10,000] = 5/10,000) = [(20/2× 2) /10,000] = 5/10,000)
  • 8. Single proportionSingle proportion    The proportion of nurses leaving the health services within three years The proportion of nurses leaving the health services within three years  of graduation is estimated to be 30%.  A study which aims to find causes of graduation is estimated to be 30%.  A study which aims to find causes  for this, also aims to determine the percentage leaving the service with for this, also aims to determine the percentage leaving the service with  a confidence interval of 25% to 35%.  The standard error would a confidence interval of 25% to 35%.  The standard error would  therefore be 2.5%.  The required sample size would be:therefore be 2.5%.  The required sample size would be: n=n=p (100 – p) p (100 – p) ==30 x 7030 x 70=336 nurses=336 nurses ee2      2                  2.5           2.522 (Note: (Note: e= width of interval /2d  e= width of interval /2d  = 10/2× 2 = 2.5)= 10/2× 2 = 2.5)   
  • 9. Difference between two means (sample size in each group): The difference of the mean birth weights in districts A and B will be determined. In district A the mean is expected to be 3000 grams with a standard deviation of 500 grams. In district B the mean is expected to be 3200 grams with a standard deviation of 500 grams. The difference in mean birth weight between districts A and B is therefore expected to be 200 grams. The desired 95% confidence interval of this difference is 100 to 300 grams, giving a standard error of the difference of 50 grams. The required sample size would be: $n = (s_1^2 + s_2^2) / e^2 = (500^2 + 500^2) / 50^2 = 200$ newborns in each district. (Note: e = width of interval / 2d = 200 / (2 × 2) = 50)
  • 10. Difference between two rates (sample size in each group): The difference in maternal mortality rates between urban and rural areas will be determined. In the rural areas the maternal mortality rate is expected to be 100 per 10,000 and in the urban areas 50 per 10,000 live births. The difference is therefore 50 per 10,000 live births. The desired 95% confidence interval of this difference is 30 to 70 per 10,000 live births, giving a standard error of the difference of 10/10,000. The required sample size would be: $n = (r_1 + r_2) / e^2 = (100/10000 + 50/10000) / (10/10000)^2 = 15{,}000$ live births in each area. (Note: e = width of interval / 2d = [40 / (2 × 2)] / 10,000 = 10/10,000)
  • 11. Difference between two proportions (sample size in each group): The difference in the proportion of nurses leaving the service is determined between two regions. In one region 30% of the nurses are estimated to leave the service within three years of graduation, in the other region 15%, giving a difference of 15%. The desired 95% confidence interval for this difference is 5% to 25%, giving a standard error of 5%. The sample size in each group would be: $n = [p_1(100 - p_1) + p_2(100 - p_2)] / e^2 = (30 \times 70 + 15 \times 85) / 5^2 = 3375 / 25 = 135$ nurses in each region. (Note: e = width of interval / 2d = 20 / (2 × 2) = 5)
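The three "difference between two groups" formulas share one shape: a per-unit variance term for each group (s² for a mean, r for a rate, p(100 - p) for a percentage) summed and divided by e². The sketch below checks the three worked examples with one helper; the name `n_per_group` and its parameters are illustrative and do not come from the lecture.

```python
import math
from fractions import Fraction


def n_per_group(var_1, var_2, ci_low, ci_high, d=2):
    """n in each group = (variance term 1 + variance term 2) / e^2."""
    e = Fraction(ci_high - ci_low) / (2 * d)  # standard error of the difference
    return math.ceil((var_1 + var_2) / e ** 2)


per = 10_000

# Two means: SD 500 g in each district, CI of the difference 100 to 300 g
print(n_per_group(500 ** 2, 500 ** 2, 100, 300))          # 200

# Two rates: 100 and 50 per 10,000, CI of the difference 30 to 70 per 10,000
print(n_per_group(Fraction(100, per), Fraction(50, per),
                  Fraction(30, per), Fraction(70, per)))  # 15000

# Two percentages: 30% and 15%, CI of the difference 5% to 25%
print(n_per_group(30 * 70, 15 * 85, 5, 25))               # 135
```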
  • 12. (B) For studies seeking to demonstrate a significant difference between two groups. Abbreviations used are: n = sample size; s = standard deviation; e = required size of standard error; m = mean; r = rate; p = percentage; u = one-sided percentage point of the normal distribution, corresponding to 100% - the power (the power is the probability of finding a significant result; eg. if the power is 75%, u = 0.67); v = percentage point of the normal distribution, corresponding to the (two-sided) significance level (eg. if the significance level is 5% (as usual), v = 1.96)
  • 13. (1) Comparison of two means (n in each group): $n = (u + v)^2 (s_1^2 + s_2^2) / (m_1 - m_2)^2$. (2) Comparison of two rates (n in each group): $n = (u + v)^2 (r_1 + r_2) / (r_1 - r_2)^2$. (3) Comparison of two proportions (n in each group): $n = (u + v)^2 \{p_1(1 - p_1) + p_2(1 - p_2)\} / (p_1 - p_2)^2$ (here $p_1$ and $p_2$ are proportions between 0 and 1; with percentages, use 100 - p in place of 1 - p)
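The values of u and v need not be read from a table; they are quantiles of the standard normal distribution. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`. The function names are illustrative, and the worked numbers reuse the district A and B birth weights from the earlier slide purely as an example input; the lecture itself does not work this case.

```python
import math
from statistics import NormalDist


def u_for_power(power):
    """One-sided normal percentage point for the chosen power."""
    return NormalDist().inv_cdf(power)


def v_for_significance(alpha):
    """Normal percentage point for a two-sided significance level."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def n_compare_means(m1, m2, s1, s2, power=0.75, alpha=0.05):
    """n in each group = (u + v)^2 (s1^2 + s2^2) / (m1 - m2)^2."""
    u = u_for_power(power)
    v = v_for_significance(alpha)
    return (u + v) ** 2 * (s1 ** 2 + s2 ** 2) / (m1 - m2) ** 2


print(round(u_for_power(0.75), 2))         # 0.67
print(round(v_for_significance(0.05), 2))  # 1.96

# Illustrative input: means 3000 and 3200 g, SD 500 g in both groups
print(math.ceil(n_compare_means(3000, 3200, 500, 500)))  # 87
```

Raising the power increases u and therefore n; the same two helper functions can supply u and v for the rate and proportion formulas.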
  • 14. Other formulae (Ref No. 2). (1) For cross-sectional study. (1.1) For measuring one variable: single proportion. $n = p\,q\,(z_\alpha / d)^2$ (the same as $n = p(1 - p) / e^2$, with $e = d / z_\alpha$). n = sample size; p = the approximate value of the proportion or percentage of interest to be determined (if it is not known, use 0.5 for p as a conservative estimate); q = 1 - p; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level (can be found from the Standard Normal Table or z table); d = precision: how close to the proportion of interest the estimate is desired to be
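The claim that this formula is "the same as" n = p(1 - p)/e² can be checked numerically. The sketch below is illustrative only (the names `z_two_sided` and `n_single_proportion` are assumptions, not from Ref No. 2); it works with proportions between 0 and 1 and takes $z_\alpha$ from `statistics.NormalDist` instead of a printed z table.

```python
import math
from statistics import NormalDist


def z_two_sided(alpha=0.05):
    """Normal percentage point for a two-sided significance level."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def n_single_proportion(p, d, z):
    """n = p q (z / d)^2, with p and d given as proportions."""
    q = 1 - p
    return p * q * (z / d) ** 2


# Conservative case: p unknown, so p = 0.5, precision of 5 percentage points
print(math.ceil(n_single_proportion(0.5, 0.05, z_two_sided())))  # 385

# Nurse example from the earlier slide: p = 30%, half-width 5%, z rounded to 2
print(round(n_single_proportion(0.30, 0.05, 2)))                 # 336
```

The second call reproduces the 336 nurses obtained earlier with e = 2.5%, since e = d / z = 5% / 2.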
  • 15. (1.2) For difference between two proportions: $n = z_\alpha^2 (p_1 q_1 + p_2 q_2) / d^2$ (the same as $n = [p_1(1 - p_1) + p_2(1 - p_2)] / e^2$). $p_1$ = the proportion or percentage of interest to be determined for group 1; $p_2$ = the proportion or percentage of interest to be determined for group 2; $q_1 = 1 - p_1$; $q_2 = 1 - p_2$; d = precision; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level; n = sample size in each group
  • 16. (2) For analytical studies. (2.1) For significant difference between two groups: comparison of two proportions. $n = (z_\alpha + z_\beta)^2 (p_1 q_1 + p_2 q_2) / (p_1 - p_2)^2$ (the same as $n = (u + v)^2 \{p_1(1 - p_1) + p_2(1 - p_2)\} / (p_1 - p_2)^2$). $p_1$ = the prevalence, proportion or percentage of interest of group 1; $p_2$ = the prevalence, proportion or percentage of interest of group 2; $q_1 = 1 - p_1$; $q_2 = 1 - p_2$; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level; $z_\beta$ (also written $z_{1-\beta}$) = one-sided percentage point of the normal distribution, corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
  • 17. (2.2) For case control study: $n = 2 (z_\alpha + z_\beta)^2 \, p\,q / (p_0 - p_1)^2$. $p_1 = p_0 \times OR / [1 + p_0 (OR - 1)]$ = the estimated proportion of individuals among the cases who were exposed; $p_0$ = proportion of individuals among the controls whom we expect to have been exposed; OR = odds ratio that is to be tested as being statistically significant, as specified by the investigator; $p = (p_0 + p_1) / 2$; $q = 1 - p$; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level; $z_\beta$ (also written $z_{1-\beta}$) = one-sided percentage point of the normal distribution, corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
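The case control formula has two steps: derive p1 from p0 and the odds ratio, then substitute into the sample size expression. A minimal sketch follows; the function names and the input values (20% exposure among controls, OR = 2, 80% power) are hypothetical and chosen only to illustrate the steps.

```python
import math
from statistics import NormalDist


def exposed_among_cases(p0, odds_ratio):
    """p1 = p0 * OR / [1 + p0 (OR - 1)]."""
    return p0 * odds_ratio / (1 + p0 * (odds_ratio - 1))


def n_case_control(p0, odds_ratio, alpha=0.05, power=0.80):
    """n = 2 (z_alpha + z_beta)^2 p q / (p0 - p1)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # one-sided, for the power
    p1 = exposed_among_cases(p0, odds_ratio)
    p = (p0 + p1) / 2                              # average exposure proportion
    q = 1 - p
    return 2 * (z_alpha + z_beta) ** 2 * p * q / (p0 - p1) ** 2


# Hypothetical inputs: 20% of controls exposed, odds ratio of 2 to be detected
print(round(exposed_among_cases(0.20, 2.0), 3))  # 0.333
print(math.ceil(n_case_control(0.20, 2.0)))      # 173
```

Note the parentheses in `(p0 + p1) / 2`: averaging the two proportions is what the formula intends, and omitting them changes the result.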
  • 18. (2.3) For cohort study: $n = \frac{1}{1 - f} \left[ 2 (z_\alpha + z_\beta)^2 \, p\,q / (p_0 - p_1)^2 \right]$. f = proportion of study subjects who are expected to leave the study (drop-out); $p_0$ = proportion of participants in the unexposed group who are expected to exhibit the outcome of interest; $p_1$ = proportion of participants in the exposed group who are expected to exhibit the outcome of interest; $p = (p_0 + p_1) / 2$; $q = 1 - p$; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level; $z_\beta$ (also written $z_{1-\beta}$) = one-sided percentage point of the normal distribution, corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
  • 19. (3) For randomized clinical trial: $n = \frac{1}{1 - f} \left[ 2 (z_\alpha + z_\beta)^2 \, p\,q / (p_0 - p_1)^2 \right]$. f = proportion of study subjects who are expected to leave the study (drop-out); $p_0$ = proportion of participants in the control treatment group who are expected to exhibit the outcome of interest; $p_1$ = proportion of participants in the treatment group who are expected to exhibit the outcome of interest; $p = (p_0 + p_1) / 2$; $q = 1 - p$; $z_\alpha$ = percentage point of the normal distribution, corresponding to the two-sided significance level; $z_\beta$ (also written $z_{1-\beta}$) = one-sided percentage point of the normal distribution, corresponding to 100% - the power (can be found from the Standard Normal Table or z table)
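The cohort and clinical trial formulas are identical apart from how p0 and p1 are interpreted; both inflate the basic sample size by 1/(1 - f) to allow for drop-out. The sketch below shows that adjustment. The function name and the input values (outcome in 20% versus 10% of participants, 10% drop-out, 80% power) are hypothetical.

```python
import math
from statistics import NormalDist


def n_with_dropout(p0, p1, f, alpha=0.05, power=0.80):
    """n = [1 / (1 - f)] * 2 (z_alpha + z_beta)^2 p q / (p0 - p1)^2."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_beta = NormalDist().inv_cdf(power)           # one-sided, for the power
    p = (p0 + p1) / 2
    q = 1 - p
    base = 2 * (z_alpha + z_beta) ** 2 * p * q / (p0 - p1) ** 2
    return base / (1 - f)                          # inflate for expected drop-out


# Hypothetical inputs: outcome expected in 20% vs 10% of participants
print(math.ceil(n_with_dropout(0.20, 0.10, f=0.0)))   # 201
print(math.ceil(n_with_dropout(0.20, 0.10, f=0.10)))  # 223
```

With no drop-out (f = 0) the expression reduces to the same form as the case control formula on the earlier slide.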
  • 20. Sample size determination by table of minimum sample size [See the manual by Lwanga SK and Lemeshow S (1991)]
  • 21. References: (1) C. Varkevisser, I. Pathmanathan, & A. Brownlee (2000). Health Systems Research Training Series: Volume 2 - Designing and conducting health systems research projects; Part I - Proposal Development and Fieldwork. (2) Department of Medical Research (Lower Myanmar). (2010) Lecture Guide on Research Methodology. 7th edition. Union of Myanmar. Department of Medical Research (Lower Myanmar), Ministry of Health: 187. (3) Lwanga SK and Lemeshow S (1991). Sample size determination in health studies: A practical manual. WHO. Geneva. pp 80.