ESTIMATION
Dr Htin Zaw Soe
MBBS, DFT, MMedSc (P & TM), PhD, DipMedEd
Associate Professor
Department of Biostatistics
University of Public Health
• Statistical inference: the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population
• Two general areas of statistical inference:
(1) Estimation
(2) Hypothesis testing
• Process of estimation: a statistic (computed from the sample) is used to estimate a parameter (of the population)
• Two estimates are computed:
1. Point estimate (a single numerical value)
2. Interval estimate (two numerical values defining an interval, with a specified degree of confidence)
• The rule for computing an estimate is called an estimator, e.g. $\bar{x} = \sum x_i / n$
• Sampled population
• Target population
They may or may not be the same
Random sample (representative of the population)
Nonrandom sample
• I. Confidence interval for a population mean
• The sample mean ($\bar{x}$) is a point estimate of $\mu$. It is generally not exactly equal to $\mu$, so an interval is needed.
• Sampling distribution and CLT
- (a) the distribution of sample means ($\bar{x}$) is normal (approximately so, by the CLT, when n is large)
- (b) $\mu_{\bar{x}} = \mu$
- (c) $\sigma^2_{\bar{x}} = \sigma^2 / n$
- About 95% of sample means ($\bar{x}$) lie within 2 SD of the sampling distribution ($2\sigma_{\bar{x}}$) of $\mu$ (see fig)
- $\mu \pm 2\sigma_{\bar{x}}$ will contain about 95% of all possible values of the sample mean ($\bar{x}$)
• Construct the interval using the sample mean ($\bar{x}$), which is known, i.e. $\bar{x} \pm 2\sigma_{\bar{x}}$ (instead of $\mu \pm 2\sigma_{\bar{x}}$)
Repeated samples give many intervals $\bar{x} \pm 2\sigma_{\bar{x}}$, all of the same width, scattered about the unknown $\mu$.
About 95% of these intervals have centres falling within $\pm 2\sigma_{\bar{x}}$ of $\mu$.
Each interval whose centre falls within $2\sigma_{\bar{x}}$ of $\mu$ will contain $\mu$.
(See Fig)
Example 1: In a study investigating an enzyme level in man, n = 10, sample mean $\bar{x} = 22$, $\sigma^2 = 45$.
We wish to estimate $\mu$.
Answer 1: The 95% confidence interval for $\mu$ is:
$\bar{x} \pm 2\sigma_{\bar{x}} = 22 \pm 2\sqrt{45/10}$
$= 22 \pm 2(2.1213)$
$= (17.76, 26.24)$
[ $\bar{x}$ is the point estimate of $\mu$ ]
[ 2 is a value from the standard normal distribution within which about 95% of $\bar{x}$ lie; this value of z is the reliability coefficient ]
[ $\sigma_{\bar{x}}$ is the SD of the sampling distribution (the standard error) ]
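The arithmetic of Example 1 can be checked with a short Python sketch. The function name `mean_ci_known_variance` is made up for illustration; the inputs are the values given in the example, with the reliability coefficient of 2 used on the slide.

```python
import math

def mean_ci_known_variance(xbar, variance, n, reliability):
    """Return (lower, upper) for xbar ± reliability * sqrt(variance / n)."""
    se = math.sqrt(variance / n)      # standard error of the mean
    margin = reliability * se         # reliability coefficient times SE
    return xbar - margin, xbar + margin

# Example 1: n = 10, sample mean 22, known variance 45, reliability coefficient 2
lower, upper = mean_ci_known_variance(xbar=22, variance=45, n=10, reliability=2)
print(round(lower, 2), round(upper, 2))   # 17.76 26.24
```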
The interval estimate is expressed in general as follows:
Estimator ± (reliability coefficient) × (standard error)
• e.g. $\bar{x} \pm 2\sigma_{\bar{x}}$
• When sampling is from a normal distribution with known $\sigma^2$, the interval estimate for $\mu$ is expressed as follows:
$\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$
[ $z_{(1-\alpha/2)}$ is the value of z such that: ]
an area of $(1-\alpha/2)$ lies to the left of z under the curve, and
an area of $(\alpha/2)$ lies to the right of z under the curve.
• Interpreting the confidence interval
Probabilistic interpretation:
In repeated sampling from a normally distributed population with a known standard deviation, $100(1-\alpha)$ percent of all intervals of the form $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$ will, in the long run, include the population mean $\mu$.
Practical interpretation:
When sampling is from a normally distributed population with a known standard deviation, we are $100(1-\alpha)$ percent confident that the single computed interval, $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$, contains the population mean $\mu$.
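The probabilistic interpretation can be illustrated with a small simulation. This is a sketch under stated assumptions: the "true" mean of 22 and variance of 45 are borrowed from Example 1 purely as illustrative population values, the function name `coverage` is hypothetical, and the random seed is fixed only to make the run repeatable.

```python
import math
import random

def coverage(mu, sigma, n, z, repetitions, seed=0):
    """Fraction of intervals xbar ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)                     # standard error, sigma known
    hits = 0
    for _ in range(repetitions):
        # draw one sample of size n from a normal population and average it
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if xbar - z * se <= mu <= xbar + z * se:  # does this interval cover mu?
            hits += 1
    return hits / repetitions

# Assumed population values (illustrative only): mu = 22, sigma^2 = 45, n = 10
rate = coverage(mu=22.0, sigma=math.sqrt(45), n=10, z=1.96, repetitions=5000)
print(rate)   # expected to be close to 0.95
```

In the long run the proportion of intervals that include $\mu$ settles near the confidence coefficient, which is what the probabilistic interpretation states.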
Confidence coefficient:   .90     .95     .99
Reliability factor (z):   1.645   1.96    2.58
[ In Example 1 a reliability coefficient of 2 is used, but the more exact value for a confidence coefficient of .95 is 1.96 ]
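The reliability factors in the table are standard normal quantiles $z_{(1-\alpha/2)}$. A minimal sketch using Python's standard library (`statistics.NormalDist`) reproduces them; the helper name `reliability_factor` is an assumption made for illustration.

```python
from statistics import NormalDist

def reliability_factor(confidence):
    """z value with area (1 - alpha/2) to its left, where alpha = 1 - confidence."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for c in (0.90, 0.95, 0.99):
    print(c, round(reliability_factor(c), 3))
# prints 1.645, 1.96 and 2.576 (the last is commonly rounded to 2.58)
```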
• Precision:
The quantity obtained by multiplying the reliability factor by the standard error of the mean is called the precision of the estimate. This quantity is also called the margin of error.
Example 2: The variance of a muscular strength score is 144. The mean muscular strength score in a sample (n = 15) is 84.3.
Find the 99% confidence interval for the population mean.
Answer 2: $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$
$= 84.3 \pm 2.58\,(\sqrt{144}/\sqrt{15})$
$= 84.3 \pm 8.0$
$= (76.3, 92.3)$
We are 99% confident that the population mean $\mu$ lies between 76.3 and 92.3.
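As a check on Example 2, the sketch below computes the precision (margin of error) first and then the interval limits. Variable names are illustrative; the numbers are those given in the example.

```python
import math

# Example 2: known variance 144, n = 15, sample mean 84.3, 99% reliability factor 2.58
xbar, variance, n, z = 84.3, 144, 15, 2.58

se = math.sqrt(variance / n)   # standard error of the mean, sigma / sqrt(n)
margin = z * se                # precision, i.e. the margin of error

print(round(margin, 1))                                  # 8.0
print(round(xbar - margin, 1), round(xbar + margin, 1))  # 76.3 92.3
```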
• Alternative estimates of central tendency:
The median is used instead of the mean when there are outliers in a data set.
The median can serve as a point estimate and also in an interval estimate, which uses a different formula.
Trimmed mean:
It is one of the robust estimators of central tendency for a data set with outliers
Steps to compute the trimmed mean:
- Order the measurements
- Discard the smallest $100\alpha$% and the largest $100\alpha$% of the measurements ($\alpha$ is usually 0.1 to 0.2)
- Compute the arithmetic mean of the remaining measurements
Note: The median may be regarded as a 50% trimmed mean
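The three steps for the trimmed mean can be sketched as follows. The function name `trimmed_mean` and the data set are hypothetical, chosen only to show the effect of one outlier; rounding the number of discarded values down is an assumption, since the slides do not say how to handle a non-integer count.

```python
import math

def trimmed_mean(values, alpha):
    """Mean after discarding the smallest and largest 100*alpha % of values."""
    if not values or not 0 <= alpha < 0.5:
        raise ValueError("need data and 0 <= alpha < 0.5")
    ordered = sorted(values)                    # step 1: order the measurements
    k = math.floor(alpha * len(ordered))        # number discarded at each end
    kept = ordered[k:len(ordered) - k]          # step 2: discard both tails
    return sum(kept) / len(kept)                # step 3: arithmetic mean

scores = [2, 4, 5, 5, 6, 6, 7, 7, 8, 50]        # hypothetical data, 50 is an outlier
print(sum(scores) / len(scores))                # 10.0 (ordinary mean)
print(trimmed_mean(scores, 0.1))                # 6.0  (10% trimmed mean)
```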
II. t distribution
$z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$
$\sigma$ is usually unknown, so the sample standard deviation s is used instead:
$t = (\bar{x} - \mu) / (s / \sqrt{n})$ follows the t distribution
The t distribution is used when the sample size is small
It is called Student's t distribution, or simply the t distribution
• Properties of the t distribution:
1. It has a mean of 0
2. It is symmetrical about the mean
3. It has a variance greater than 1. The variance approaches 1 as n becomes large.
The variance is df/(df - 2) for df > 2
Equivalently, the variance is (n - 1)/(n - 3) for n > 3
4. The variable t ranges from $-\infty$ to $+\infty$
5. It is a family of distributions: there is a different distribution for each value of n - 1, the divisor used in computing $s^2$ (See fig)
6. Compared with the normal distribution, the t distribution is less peaked in the centre and has higher tails
7. The t distribution approaches the normal distribution as n - 1 approaches $\infty$
• Table E (the table of the t distribution) gives the reliability coefficient for a chosen confidence coefficient and df
• Confidence intervals using t:
Estimator ± (reliability coefficient) × (standard error)
• When sampling is from a normal distribution whose $\sigma$ is unknown, the $100(1-\alpha)$% CI for the population mean $\mu$ is given by $\bar{x} \pm t_{(1-\alpha/2)}\, s/\sqrt{n}$
[The reliability coefficient is obtained from the table of the t distribution (Table E)]
• Example 3: In a study to estimate the population mean of muscular strength:
Sample mean ($\bar{x}$) = 250.8
Sample SD (s) = 130.9
Sample size (n) = 19 subjects
Find the 95% CI for the population mean.
$\bar{x} \pm t_{(1-\alpha/2)}\, s/\sqrt{n}$
$= 250.8 \pm 2.1009\,(130.9/\sqrt{19})$
$= 250.8 \pm 63.1$
$= (187.7, 313.9)$
Interpretation: We are 95% confident that the true population mean $\mu$ lies between 187.7 and 313.9, because in repeated sampling 95% of intervals constructed in like manner will include $\mu$.
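Example 3 can be verified with the sketch below. The reliability coefficient 2.1009 is the tabulated t value for 18 df at the 95% level and is passed in by hand, as one would read it from Table E; the function name `t_interval` is illustrative.

```python
import math

def t_interval(xbar, s, n, t_value):
    """Return (lower, upper) for xbar ± t_value * s / sqrt(n)."""
    se = s / math.sqrt(n)        # estimated standard error of the mean
    margin = t_value * se
    return xbar - margin, xbar + margin

# Example 3: xbar = 250.8, s = 130.9, n = 19, t(0.975, 18 df) = 2.1009 from the t table
lower, upper = t_interval(xbar=250.8, s=130.9, n=19, t_value=2.1009)
print(round(lower, 1), round(upper, 1))   # 187.7 313.9
```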
• Deciding between z and t:
(See flowchart)
Consider:
- Distribution of the population (normal or not)
- Sample size (large or not)
- Population variance (known or not)
Then choose z or t
III. Confidence interval for the difference between two population means
When the population variances are known:
$(\bar{x}_1 - \bar{x}_2) \pm z_{(1-\alpha/2)} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
If the interval includes zero, the two population means may be equal; if it does not include zero, we conclude that they are not equal
• Example 4: In a study to determine the difference between the serum uric acid levels of two groups of patients (with Down's syndrome and without Down's syndrome). The two populations of values are normally distributed.
• Group 1: Mean uric acid level ($\bar{x}_1$) = 4.5 mg/100 ml
Variance ($\sigma_1^2$) = 1
Sample size ($n_1$) = 12 subjects
• Group 2: Mean uric acid level ($\bar{x}_2$) = 3.4 mg/100 ml
Variance ($\sigma_2^2$) = 1.5
Sample size ($n_2$) = 15 subjects
Find the 95% CI for the difference between the two population means ($\mu_1 - \mu_2$).
$(\bar{x}_1 - \bar{x}_2) \pm z_{(1-\alpha/2)} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
$= (4.5 - 3.4) \pm 1.96 \sqrt{1/12 + 1.5/15}$
$= 1.1 \pm 1.96\,(0.4282)$
$= 1.1 \pm 0.84$
$= (0.26, 1.94)$
• Interpretation: We are 95% confident that the difference between the two mean serum uric acid levels lies between 0.26 and 1.94 mg/100 ml. Since the interval does not include zero, we conclude that the two population means are not equal.
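The calculation in Example 4 can be reproduced with a short sketch. The function name `diff_means_z_interval` is an assumption for illustration; the inputs are the group means, known variances and sample sizes given in the example.

```python
import math

def diff_means_z_interval(xbar1, var1, n1, xbar2, var2, n2, z):
    """CI for mu1 - mu2 when both population variances are known."""
    se = math.sqrt(var1 / n1 + var2 / n2)   # standard error of xbar1 - xbar2
    diff = xbar1 - xbar2                    # point estimate of mu1 - mu2
    margin = z * se
    return diff - margin, diff + margin

# Example 4: Down's syndrome group vs comparison group, 95% CI (z = 1.96)
lower, upper = diff_means_z_interval(4.5, 1.0, 12, 3.4, 1.5, 15, z=1.96)
print(round(lower, 2), round(upper, 2))     # 0.26 1.94

# The interval excludes zero, so the two population means are judged unequal
excludes_zero = lower > 0 or upper < 0
print(excludes_zero)                        # True
```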
Sampling from nonnormal populations:
- Apply the CLT if the sample sizes are large
- Use $s^2$ if $\sigma^2$ is unknown
• The t distribution and the difference between means
(A) When the population variances are equal
(B) When the population variances are not equal
(A) When the population variances are equal
1. Find the pooled estimate of the common variance:
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
2. Find the standard error of the estimate:
$s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2/n_1 + s_p^2/n_2}$
3. Find the $100(1-\alpha)$% confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t_{(1-\alpha/2)} \sqrt{s_p^2/n_1 + s_p^2/n_2}$
[Note: the number of df is $n_1 + n_2 - 2$]
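The three steps for the equal-variance case can be sketched as below. The slides give no worked example here, so the sample values are hypothetical and serve only to exercise the formulas; the t value 2.0595 is the tabulated $t_{0.975}$ for 25 df, and the function name `pooled_t_interval` is made up.

```python
import math

def pooled_t_interval(xbar1, s1, n1, xbar2, s2, n2, t_value):
    """CI for mu1 - mu2 assuming equal population variances."""
    df = n1 + n2 - 2
    # step 1: pooled estimate of the common variance
    sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    # step 2: standard error of xbar1 - xbar2
    se = math.sqrt(sp2 / n1 + sp2 / n2)
    # step 3: interval using the t value for df = n1 + n2 - 2
    diff = xbar1 - xbar2
    return diff - t_value * se, diff + t_value * se, sp2, df

# Hypothetical samples (not from the slides): n1 = 12, n2 = 15, so df = 25
lower, upper, sp2, df = pooled_t_interval(4.5, 1.0, 12, 3.4, 1.2, 15, t_value=2.0595)
print(df, round(sp2, 4))                   # 25 1.2464
print(round(lower, 2), round(upper, 2))
```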
• (B) When the population variances are not equal
1. Find the reliability factor $t'_{(1-\alpha/2)}$:
$t'_{(1-\alpha/2)} = (w_1 t_1 + w_2 t_2) / (w_1 + w_2)$
[ $w_1 = s_1^2/n_1$, $w_2 = s_2^2/n_2$; $t_1$ and $t_2$ are the values of $t_{(1-\alpha/2)}$ for $n_1 - 1$ and $n_2 - 1$ df respectively ]
2. Find the $100(1-\alpha)$% confidence interval:
$(\bar{x}_1 - \bar{x}_2) \pm t'_{(1-\alpha/2)} \sqrt{s_1^2/n_1 + s_2^2/n_2}$
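A sketch of the unequal-variance procedure follows. It assumes, as in the usual textbook treatment, that $t_1$ and $t_2$ are the tabulated $t_{(1-\alpha/2)}$ values for $n_1 - 1$ and $n_2 - 1$ df. The sample values are hypothetical, and the t values 2.2010 (11 df) and 2.1448 (14 df) are entered by hand from a t table.

```python
import math

def t_prime(s1, n1, t1, s2, n2, t2):
    """Weighted reliability factor t' = (w1*t1 + w2*t2) / (w1 + w2)."""
    w1 = s1**2 / n1
    w2 = s2**2 / n2
    return (w1 * t1 + w2 * t2) / (w1 + w2)

def unequal_variance_interval(xbar1, s1, n1, t1, xbar2, s2, n2, t2):
    """CI for mu1 - mu2 when population variances are not assumed equal."""
    tp = t_prime(s1, n1, t1, s2, n2, t2)
    se = math.sqrt(s1**2 / n1 + s2**2 / n2)
    diff = xbar1 - xbar2
    return diff - tp * se, diff + tp * se

# Hypothetical samples (not from the slides), 95% level
tp = t_prime(s1=1.0, n1=12, t1=2.2010, s2=2.0, n2=15, t2=2.1448)
lower, upper = unequal_variance_interval(4.5, 1.0, 12, 2.2010, 3.4, 2.0, 15, 2.1448)
print(round(tp, 4))
print(round(lower, 2), round(upper, 2))
```

Because $t'$ is a weighted average, it always lies between $t_1$ and $t_2$.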
See the example in the text.
See the flowchart to choose z, t, or t'.
• IV. Confidence interval for a population proportion
- What proportion of patients who receive a particular type of treatment recover?
- What proportion of some population has a certain disease?
- What proportion of a population is immune to a certain disease?
• Proceed in the same manner as for the $100(1-\alpha)$% CI for a population mean:
Estimator ± (reliability coefficient) × (standard error)
The $100(1-\alpha)$% CI for the population proportion p, where $\hat{p}$ is the sample proportion, is:
$\hat{p} \pm z_{(1-\alpha/2)} \sqrt{\hat{p}(1-\hat{p})/n}$
[Note: When np and n(1-p) are both greater than 5, the sampling distribution of $\hat{p}$ is considered quite close to the normal distribution, and the reliability coefficient is a value of z from the standard normal distribution]
• Example 5: In a study to find the population proportion of internet users who search for health information:
Sample proportion ($\hat{p}$) = 0.18
Sample size (n) = 1220 users
Find the 95% CI for the population proportion.
$\hat{p} \pm z_{(1-\alpha/2)} \sqrt{\hat{p}(1-\hat{p})/n}$
$= 0.18 \pm 1.96 \sqrt{0.18\,(1 - 0.18)/1220}$
$= 0.18 \pm 1.96\,(0.0110)$
$= 0.18 \pm 0.022$
$= (0.158, 0.202)$
Interpretation: We are 95% confident that the population proportion of internet users who search for health information lies between 0.158 and 0.202.
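Example 5 can be checked with the sketch below, which also applies the rule of thumb that np and n(1-p) should both exceed 5 (evaluated here with the sample proportion, an assumption since the true p is unknown). The function name `proportion_interval` is illustrative.

```python
import math

def proportion_interval(p_hat, n, z):
    """Return (lower, upper) for p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    se = math.sqrt(p_hat * (1 - p_hat) / n)   # standard error of the proportion
    margin = z * se
    return p_hat - margin, p_hat + margin

# Example 5: sample proportion 0.18 among 1220 internet users, 95% CI
p_hat, n = 0.18, 1220
normal_ok = n * p_hat > 5 and n * (1 - p_hat) > 5   # normal approximation check
lower, upper = proportion_interval(p_hat, n, z=1.96)
print(normal_ok)                               # True
print(round(lower, 3), round(upper, 3))        # 0.158 0.202
```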
• V. Confidence interval for the difference between two population proportions
Estimator ± (reliability coefficient) × (standard error)
$(\hat{p}_1 - \hat{p}_2) \pm z_{(1-\alpha/2)} \sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$
See the example in the text.
If the interval includes zero, the two population proportions may be equal; if it does not include zero, we conclude that they are not equal
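The formula for the difference between two proportions can be sketched as follows. The slides defer the worked example to the text, so the sample proportions and sizes below are hypothetical; the function name `diff_proportions_interval` is made up for illustration.

```python
import math

def diff_proportions_interval(p1, n1, p2, n2, z):
    """CI for the difference between two population proportions."""
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2                     # point estimate of the difference
    margin = z * se
    return diff - margin, diff + margin

# Hypothetical samples (not from the slides): 30% of 200 vs 20% of 250, 95% CI
lower, upper = diff_proportions_interval(0.30, 200, 0.20, 250, z=1.96)
print(round(lower, 3), round(upper, 3))

# If zero lies inside the interval, the two proportions may be equal
includes_zero = lower <= 0 <= upper
print(includes_zero)
```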
• VI. Determination of sample size for estimating means
A sample larger than necessary → waste of resources
A sample that is too small → results of no practical use
It is essential to obtain a sufficient (optimum) sample size
Objectives: The objectives in interval estimation are to get narrow intervals with high reliability
Recall: Estimator ± (reliability coefficient) × (standard error)
Half-width of the interval (margin of error), d = (reliability coefficient) × (standard error)
$d = z\,(SE)$
$d = z\,(\sigma/\sqrt{n})$
$d^2 = z^2 \sigma^2 / n$
$n = z^2 \sigma^2 / d^2$
• The sample size formula when sampling is without replacement from a small finite population of size N is as follows:
$n = \dfrac{N z^2 \sigma^2}{d^2 (N - 1) + z^2 \sigma^2}$
This formula is derived by using the finite population correction $\sqrt{(N - n)/(N - 1)}$ (See text)
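A sketch of the finite-population formula is given below. The inputs ($\sigma$ = 20, d = 5, z = 1.96, N = 1000) are hypothetical values chosen for illustration, the function name is made up, and the result is rounded up to a whole subject, which is an assumption about how a fractional n is handled.

```python
import math

def sample_size_mean_finite(sigma, d, z, population_size):
    """n = N z^2 sigma^2 / (d^2 (N - 1) + z^2 sigma^2), rounded up."""
    N = population_size
    numerator = N * z**2 * sigma**2
    denominator = d**2 * (N - 1) + z**2 * sigma**2
    return math.ceil(numerator / denominator)

# Hypothetical inputs: sigma = 20, margin of error d = 5, 95% level, N = 1000
print(sample_size_mean_finite(sigma=20, d=5, z=1.96, population_size=1000))    # 58

# For a very large population the result approaches z^2 sigma^2 / d^2
print(sample_size_mean_finite(sigma=20, d=5, z=1.96, population_size=10**9))   # 62
```

The correction matters only when the population is small relative to the sample; for large N it reduces to the simpler formula.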
Estimating $\sigma^2$ for use in $n = z^2 \sigma^2 / d^2$:
1. Use a pilot or preliminary sample to estimate $\sigma^2$
2. Use estimates of $\sigma^2$ from previous or similar studies
3. Use $\sigma \approx R/6$, where R is the range, if the population is approximately normally distributed and the largest and smallest values are known
• Example 6: In a study to determine the average daily intake of protein in teenage girls, what is the required sample size?
- Protein intake is measured in grams (g), so the estimate is based on the mean
- The investigator wants an interval 10 g wide (i.e. within about 5 g of the population mean in either direction, so the margin of error d is 5 g)
- Population SD = 20 g
- Confidence coefficient = 0.95 (so reliability factor = 1.96)
- Ignoring the finite population correction as the population is large, the required n is:
$n = z^2 \sigma^2 / d^2 = (1.96)^2 (20)^2 / (5)^2 = 61.47$
Rounding up, the required sample size is 62 teenage girls
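The calculation in Example 6 can be sketched as below. A sample size is normally rounded up to the next whole subject so that the margin of error is not exceeded, which gives 62 rather than 61 for n = 61.47. The function name is illustrative.

```python
import math

def sample_size_for_mean(sigma, d, z):
    """Unrounded n = z^2 sigma^2 / d^2 for margin of error d."""
    return z**2 * sigma**2 / d**2

# Example 6: sigma = 20 g, margin of error d = 5 g, reliability factor 1.96
n_raw = sample_size_for_mean(sigma=20, d=5, z=1.96)
print(round(n_raw, 2))     # 61.47
print(math.ceil(n_raw))    # 62
```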
• VII. Determination of sample size for estimating proportions
• Assuming the distribution of $\hat{p}$ is approximately normal, the finite population correction is not needed:
- when sampling is with replacement,
- when sampling is from an infinite population, or
- when the sampled population is large enough
So we use $n = z^2 pq / d^2$, where q = 1 - p
If the finite population correction is used, use $n = \dfrac{N z^2 pq}{d^2 (N - 1) + z^2 pq}$
Estimating p:
1. Use a pilot sample
2. Use an upper bound for p (e.g. true p is not greater than 0.3)
3. Use 0.5 for p (this gives the largest possible n)
• Example 7: In a study to determine the proportion of medically indigent families in an area, what is the required sample size?
It is believed that p cannot be greater than 0.35. A 95% CI is desired with d = 0.05.
$n = z^2 pq / d^2$
$= (1.96)^2 (0.35)(0.65) / (0.05)^2$
$= 349.6$
Rounding up, the required sample size is 350 families
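Example 7 can be checked with the sketch below, which also shows the conservative choice p = 0.5 for comparison. The function name is illustrative, and rounding up to a whole family is assumed.

```python
import math

def sample_size_for_proportion(p, d, z):
    """Unrounded n = z^2 p q / d^2, with q = 1 - p."""
    q = 1 - p
    return z**2 * p * q / d**2

# Example 7: p believed to be at most 0.35, d = 0.05, 95% level (z = 1.96)
n_raw = sample_size_for_proportion(p=0.35, d=0.05, z=1.96)
print(math.ceil(n_raw))                                            # 350

# With no prior information, p = 0.5 maximises pq and hence n
print(math.ceil(sample_size_for_proportion(p=0.5, d=0.05, z=1.96)))  # 385
```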
• VIII. Confidence interval for the variance of a normally distributed population (see text)
• IX. Confidence interval for the ratio of the variances of two normally distributed populations (see text)
THE END

More Related Content

PPT
Statistik 1 7 estimasi & ci
PPT
Confidence intervals (probabilty and statistics
PPTX
Statistics Applied to Biomedical Sciences
PPTX
Estimating a Population Standard Deviation or Variance
PPT
Econometrics ch6
PDF
Statistical Confidence Level
DOCX
Normal distribution
PPTX
L10 confidence intervals
Statistik 1 7 estimasi & ci
Confidence intervals (probabilty and statistics
Statistics Applied to Biomedical Sciences
Estimating a Population Standard Deviation or Variance
Econometrics ch6
Statistical Confidence Level
Normal distribution
L10 confidence intervals

What's hot (20)

PDF
Statistics lecture 8 (chapter 7)
PPTX
Estimation by c.i
PPTX
Confidence interval & probability statements
PPT
PPT
Using Microsoft excel for six sigma
PPTX
Normal distribution
PPTX
Statistics Formulae for School Students
PPT
Econometrics ch4
PPTX
Confidence interval
PPT
Confidence Intervals And The T Distribution
PDF
Sample slides from "Getting Started with R" course
PPTX
Estimating a Population Standard Deviation or Variance
PPTX
6.5 central limit
PPTX
Estimating a Population Mean
PPTX
Normal Distribution
PPTX
law of large number and central limit theorem
PPT
PPTX
JM Statr session 13, Jan 11
PPT
Chapter 09
Statistics lecture 8 (chapter 7)
Estimation by c.i
Confidence interval & probability statements
Using Microsoft excel for six sigma
Normal distribution
Statistics Formulae for School Students
Econometrics ch4
Confidence interval
Confidence Intervals And The T Distribution
Sample slides from "Getting Started with R" course
Estimating a Population Standard Deviation or Variance
6.5 central limit
Estimating a Population Mean
Normal Distribution
law of large number and central limit theorem
JM Statr session 13, Jan 11
Chapter 09
Ad

Similar to L estimation (20)

PPT
Inferential statistics-estimation
PDF
Standard deviation
PPT
Introduction to analytical chemistry.ppt
PPTX
Estimating a Population Standard Deviation or Variance
PPTX
M1-4 Estimasi Titik dan Intervaltttt.pptx
PPT
Presentation1group b
PPTX
Estimation and confidence interval
DOCX
ECO202 formulae
PPT
Mean, median, and mode ug
PDF
Stat3 central tendency & dispersion
PPTX
Normal distribution.pptx aaaaaaaaaaaaaaa
PPT
Tbs910 sampling hypothesis regression
PPT
Stat3 central tendency & dispersion
PPT
Statistics in Research
PPT
Statistics in Research
PDF
Application of Statistical and mathematical equations in Chemistry Part 2
PPT
Stat3 central tendency & dispersion
PPTX
Lec. 10: Making Assumptions of Missing data
PPTX
Meaning of Variability and its measures .pptx
DOCX
Descriptive Statistics Formula Sheet Sample Populatio.docx
Inferential statistics-estimation
Standard deviation
Introduction to analytical chemistry.ppt
Estimating a Population Standard Deviation or Variance
M1-4 Estimasi Titik dan Intervaltttt.pptx
Presentation1group b
Estimation and confidence interval
ECO202 formulae
Mean, median, and mode ug
Stat3 central tendency & dispersion
Normal distribution.pptx aaaaaaaaaaaaaaa
Tbs910 sampling hypothesis regression
Stat3 central tendency & dispersion
Statistics in Research
Statistics in Research
Application of Statistical and mathematical equations in Chemistry Part 2
Stat3 central tendency & dispersion
Lec. 10: Making Assumptions of Missing data
Meaning of Variability and its measures .pptx
Descriptive Statistics Formula Sheet Sample Populatio.docx
Ad

More from Mmedsc Hahm (20)

PPSX
Solid waste-management-2858710
PPTX
Situation analysis
PPT
Quantification of medicines need
PPTX
Quality in hospital
PPT
Patient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt wai
PPTX
Organising
PPT
Nscbl slide
PPTX
Introduction to hahm 2017
PPT
Hss lecture 2016 jan
PPTX
Hospital management17
PPTX
Hopital stat
PPT
Health planning approaches hahm 17
PPTX
Ephs and nhp
PPTX
Directing and leading 2017
PPT
Concepts of em
PPT
Access to medicines p pt 17 10-2015
PPTX
The dynamics of disease transmission
PPTX
Study designs dr.wah
PPTX
Standardization dr.wah
DOCX
Solid waste-management-2858710
Situation analysis
Quantification of medicines need
Quality in hospital
Patient satisfaction & quality in health care (16.3.2016) dr.nyunt nyunt wai
Organising
Nscbl slide
Introduction to hahm 2017
Hss lecture 2016 jan
Hospital management17
Hopital stat
Health planning approaches hahm 17
Ephs and nhp
Directing and leading 2017
Concepts of em
Access to medicines p pt 17 10-2015
The dynamics of disease transmission
Study designs dr.wah
Standardization dr.wah

Recently uploaded (20)

PPTX
COMMUNICATION SKILSS IN NURSING PRACTICE
PPTX
Xray and usg Powerpoint presentation By Shanu
PPTX
Bronchial_Asthma_in_acute_exacerbation_.pptx
PPTX
1. Drug Distribution System.pptt b pharmacy
PPTX
Vaginal Bleeding and Uterine Fibroids p
PPTX
SPIROMETRY and pulmonary function test basic
PPTX
Galactosemia pathophysiology, clinical features, investigation and treatment ...
PPTX
different types of Gait in orthopaedic injuries
PPTX
Nursing Care Aspects for High Risk newborn.pptx
DOCX
ch 9 botes for OB aka Pregnant women eww
PPTX
HEMODYNAMICS - I DERANGEMENTS OF BODY FLUIDS.pptx
PPT
Parental-Carer-mental-illness-and-Potential-impact-on-Dependant-Children.ppt
PPTX
PEDIATRIC OSCE, MBBS, by Dr. Sangit Chhantyal(IOM)..pptx
PDF
Essentials of Hysteroscopy at World Laparoscopy Hospital
PDF
Assessment of Complications in Patients Maltreated with Fixed Self Cure Acryl...
PDF
NUTRITION THROUGHOUT THE LIFE CYCLE CHILDHOOD -AGEING
PPTX
PE and Health 7 Quarter 3 Lesson 1 Day 3,4 and 5.pptx
PPTX
Trichuris trichiura infection
PPTX
unit1-introduction of nursing education..
PDF
Structure Composition and Mechanical Properties of Australian O.pdf
COMMUNICATION SKILSS IN NURSING PRACTICE
Xray and usg Powerpoint presentation By Shanu
Bronchial_Asthma_in_acute_exacerbation_.pptx
1. Drug Distribution System.pptt b pharmacy
Vaginal Bleeding and Uterine Fibroids p
SPIROMETRY and pulmonary function test basic
Galactosemia pathophysiology, clinical features, investigation and treatment ...
different types of Gait in orthopaedic injuries
Nursing Care Aspects for High Risk newborn.pptx
ch 9 botes for OB aka Pregnant women eww
HEMODYNAMICS - I DERANGEMENTS OF BODY FLUIDS.pptx
Parental-Carer-mental-illness-and-Potential-impact-on-Dependant-Children.ppt
PEDIATRIC OSCE, MBBS, by Dr. Sangit Chhantyal(IOM)..pptx
Essentials of Hysteroscopy at World Laparoscopy Hospital
Assessment of Complications in Patients Maltreated with Fixed Self Cure Acryl...
NUTRITION THROUGHOUT THE LIFE CYCLE CHILDHOOD -AGEING
PE and Health 7 Quarter 3 Lesson 1 Day 3,4 and 5.pptx
Trichuris trichiura infection
unit1-introduction of nursing education..
Structure Composition and Mechanical Properties of Australian O.pdf

L estimation

  • 1. ESTIMATIONESTIMATION Dr Htin Zaw SoeDr Htin Zaw Soe MBBS, DFT, MMedSc (P & TM), PhD, DipMedEdMBBS, DFT, MMedSc (P & TM), PhD, DipMedEd Associate ProfessorAssociate Professor Department of BiostatisticsDepartment of Biostatistics University of Public HealthUniversity of Public Health
  • 2.  Statistical InferenceStatistical Inference: The procedure by which we reach a: The procedure by which we reach a conclusion about a population on the basis of the informationconclusion about a population on the basis of the information contained in a sample drawn from that populationcontained in a sample drawn from that population  Two general areas ofTwo general areas of Statistical InferenceStatistical Inference (1) Estimation(1) Estimation (2) Hypothesis Testing(2) Hypothesis Testing
  • 3.  Process of estimation- statistic (sample ---Process of estimation- statistic (sample ---→→ parameter (pop)parameter (pop)  Compute 2 estimatesCompute 2 estimates 1. Point estimate (Single numerical value)1. Point estimate (Single numerical value) 2. Interval estimate (Two numerical values – an2. Interval estimate (Two numerical values – an interval with a specified degree of confidence)interval with a specified degree of confidence)  The rule how to compute estimate is an estimatorThe rule how to compute estimate is an estimator (( x =x = ∑ x∑ xii / n/ n))  Sampled populationSampled population  Target populationTarget population They may or may not be the sameThey may or may not be the same Random sample (representative to population)Random sample (representative to population) Nonrandom sampleNonrandom sample
  • 4.  I. Confidence interval for a population meanI. Confidence interval for a population mean  Sample mean (x) is point estimate ofSample mean (x) is point estimate of μμ. Not equal to. Not equal to μμ. So an. So an interval is needed.interval is needed.  Sampling distribution and CLTSampling distribution and CLT -- (a)(a) distribution of sample means (x) is normaldistribution of sample means (x) is normal - (b)(b) μμxx == μμ - (c)(c) σσ22 x=x= σσ22 /n/n - 95% of sample means (x) within 2SD of95% of sample means (x) within 2SD of μμ (see fig)(see fig) - μμ± 2± 2 σσxx will contain 95% of all possible values of sample meanswill contain 95% of all possible values of sample means (x)(x) -
  • 5.  Construct interval using sample means (x) (which is known)Construct interval using sample means (x) (which is known) ie. x ± 2ie. x ± 2 σσxx (instead of(instead of μμ± 2± 2 σσxx)) Several no. of x ± 2Several no. of x ± 2 σσxx with same width of interval about unknownwith same width of interval about unknown μμ -- obtained.-- obtained. 95% of these intervals have centres falling within95% of these intervals have centres falling within ± 2± 2 σσxx aboutabout μμ.. Each of interval whose centre fall withinEach of interval whose centre fall within 22 σσxx ofof μμ will containwill contain μμ.. (See Fig)(See Fig)
  • 6. Example 1: In a study investigating an enzyme level in manExample 1: In a study investigating an enzyme level in man n = 10, a sample mean, x = 22 ,n = 10, a sample mean, x = 22 , σσ22 = 45= 45 We wish to estimateWe wish to estimate μμ.. Answer 1: 95% Confidence Interval forAnswer 1: 95% Confidence Interval for μμ is:is: x ± 2x ± 2 σσxx ==22 ± 222 ± 2 √√ 45/1045/10 ==22 ± 2 (2.1213)22 ± 2 (2.1213) 17.76, 26.2417.76, 26.24 [[ xx isis point estimate ofpoint estimate of μμ ]] [[ 22 is a value from standard normal dist.;is a value from standard normal dist.; c95%c95% of x lie; this valueof x lie; this value ofof zz isis reliability coefficientreliability coefficient ]] [[ σσxx is SD of sampling distribution]is SD of sampling distribution] Interval estimateInterval estimate is expressed in general as follows:is expressed in general as follows: Estimator ± (reliability coefficient) (standard error)Estimator ± (reliability coefficient) (standard error)
  • 7. $\bar{x} \pm 2\sigma_{\bar{x}}$
    When sampling is from a normal distribution with known $\sigma^2$, the interval estimate for $\mu$ is expressed as
    $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$   [$z_{(1-\alpha/2)}$ = a value of z]
    An area of $(1 - \alpha/2)$ lies to the left of this z under the curve.
    An area of $\alpha/2$ lies to the right of this z under the curve.
  • 8. Interpreting the confidence interval
    Probabilistic interpretation: In repeated sampling from a normally distributed population with a known standard deviation, $100(1-\alpha)$ percent of all intervals of the form $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$ will in the long run include the population mean, $\mu$.
    Practical interpretation: When sampling is from a normally distributed population with a known standard deviation, we are $100(1-\alpha)$ percent confident that the single computed interval, $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}}$, contains the population mean, $\mu$.
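The probabilistic interpretation can be illustrated by simulation. This is a sketch under assumed values ($\mu = 22$, $\sigma^2 = 45$, n = 10, chosen only for illustration); the function name `coverage` and the seed are hypothetical.

```python
import random
from math import sqrt

def coverage(mu, sigma, n, z, trials, seed=0):
    """Fraction of simulated intervals x-bar ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    standard_error = sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - z * standard_error <= mu <= sample_mean + z * standard_error:
            hits += 1
    return hits / trials

# Assumed population values, for illustration only
rate = coverage(mu=22, sigma=sqrt(45), n=10, z=1.96, trials=10000)
print(rate)  # expected to be close to 0.95
```

Each simulated interval either contains $\mu$ or does not; the 95% refers to the long-run proportion of intervals that do.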
  • 9. Confidence coefficients:  .90    .95    .99
    Reliability factors:      1.645  1.96   2.58
    [In Example 1 a reliability coefficient of 2 is used, but the more exact value is 1.96 for a confidence coefficient of .95]
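The reliability factors in this table are standard normal quantiles $z_{(1-\alpha/2)}$. A minimal sketch using Python's standard library (the helper name `reliability_coefficient` is illustrative):

```python
from statistics import NormalDist

def reliability_coefficient(confidence):
    """Return z_(1 - alpha/2) for a given confidence coefficient."""
    alpha = 1 - confidence
    return NormalDist().inv_cdf(1 - alpha / 2)

for confidence in (0.90, 0.95, 0.99):
    print(confidence, round(reliability_coefficient(confidence), 3))
```

The value for .99 comes out near 2.576, which the table rounds to 2.58.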
  • 10. Precision: The quantity obtained by multiplying the reliability factor by the SE of the mean is called the precision of the estimate. This quantity is also called the margin of error.
    Example 2: The variance of a muscular strength score is 144. The mean muscular strength score in a sample (n = 15) is 84.3. Find the 99% confidence interval for the population mean.
    Answer 2: $\bar{x} \pm z_{(1-\alpha/2)}\,\sigma_{\bar{x}} = 84.3 \pm 2.58\sqrt{144/15} = 84.3 \pm 8.0$, i.e. (76.3, 92.3)
    We are 99% confident that the population mean, $\mu$, lies between 76.3 and 92.3.
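The margin of error in Example 2 can be reproduced as follows. This is a sketch; `margin_of_error` is an illustrative name.

```python
from math import sqrt

def margin_of_error(z, variance, n):
    """Precision (margin of error): reliability factor times SE of the mean."""
    return z * sqrt(variance / n)

# Values from Example 2: sigma^2 = 144, n = 15, z = 2.58 for 99% confidence
sample_mean = 84.3
d = margin_of_error(z=2.58, variance=144, n=15)
print(round(d, 1))                                         # 8.0
print(round(sample_mean - d, 1), round(sample_mean + d, 1))  # 76.3 92.3
```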
  • 11. Alternative estimates of central tendency:
    The median is used instead of the mean when there are outliers in a data set. The median serves as a point estimate, and an interval estimate for it uses a different formula.
    Trimmed mean: one of the robust estimators of central tendency for a data set with outliers.
    Steps to compute the trimmed mean:
    - Order the measurements
    - Discard the smallest $100\alpha$% and the largest $100\alpha$% ($\alpha$ = 0.1 to 0.2)
    - Compute the arithmetic mean of the remaining measurements
    Note: The median may be regarded as a 50% trimmed mean.
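The three steps for the trimmed mean can be sketched as below. The data are hypothetical, and truncating the number of discarded values with `int` is an assumption for cases where $100\alpha$% of n is not a whole number.

```python
def trimmed_mean(values, alpha):
    """Mean after discarding the smallest and largest 100*alpha % of values."""
    if not 0 <= alpha < 0.5:
        raise ValueError("alpha must be in [0, 0.5)")
    ordered = sorted(values)                    # step 1: order the measurements
    k = int(alpha * len(ordered))               # number discarded from each end
    kept = ordered[k:len(ordered) - k]          # step 2: discard both tails
    return sum(kept) / len(kept)                # step 3: arithmetic mean

# Hypothetical scores with one outlier (40)
scores = [2, 4, 5, 5, 6, 6, 7, 7, 8, 40]
print(sum(scores) / len(scores))     # 9.0 (ordinary mean, pulled up by the outlier)
print(trimmed_mean(scores, 0.1))     # 6.0
```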
  • 12. II. t distribution
    $z = (\bar{x} - \mu) / (\sigma / \sqrt{n})$
    $\sigma$ is usually unknown, so $s$ is used instead:
    $t = (\bar{x} - \mu) / (s / \sqrt{n})$ follows the t distribution.
    The t distribution is used when the sample size is small.
    It is called Student's t distribution or simply the t distribution.
  • 13. Properties of the t distribution:
    1. It has a mean of 0.
    2. It is symmetrical about the mean.
    3. It has a variance greater than 1; the variance approaches 1 as n becomes large. The variance is df/(df-2) for df > 2 or, equivalently, (n-1)/(n-3) for n > 3.
    4. The variable t ranges from $-\infty$ to $+\infty$.
    5. It is a family of distributions: there is a different distribution for each value of n-1 (the divisor used in computing $s^2$). (See fig.)
    6. Compared with the normal distribution, the t distribution is less peaked in the centre and has higher tails.
    7. The t distribution approaches the normal distribution as n-1 approaches $\infty$.
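Property 3 can be illustrated with a few values of df. A minimal sketch; `t_variance` is an illustrative name.

```python
def t_variance(df):
    """Variance of the t distribution, df / (df - 2), defined for df > 2."""
    if df <= 2:
        raise ValueError("variance is defined only for df > 2")
    return df / (df - 2)

for df in (3, 10, 30, 100):
    print(df, round(t_variance(df), 3))  # 3.0, 1.25, 1.071, 1.02
```

The variance is always above 1 and shrinks towards 1 as df grows, consistent with property 7.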
  • 14. Table E (the table of the t distribution) is entered with the confidence coefficient and the df.
    Confidence intervals using t:
    Estimator ± (reliability coefficient) × (standard error)
    When sampling is from a normal distribution whose $\sigma$ is unknown, the $100(1-\alpha)$% CI for the population mean, $\mu$, is given by
    $\bar{x} \pm t_{(1-\alpha/2)}\, s/\sqrt{n}$
    [The reliability coefficient is obtained from the table of the t distribution (Table E), with n-1 df]
  • 15. Example 3: In a study to estimate the population mean of muscular strength:
    Sample mean ($\bar{x}$) = 250.8
    Sample SD (s) = 130.9
    Sample size (n) = 19 subjects
    Find the 95% CI for the population mean.
    $\bar{x} \pm t_{(1-\alpha/2)}\, s/\sqrt{n} = 250.8 \pm 2.1009\,(130.9/\sqrt{19}) = 250.8 \pm 63.1$, i.e. (187.7, 313.9), with 18 df
    Interpretation: We are 95% confident that the true population mean, $\mu$, lies between 187.7 and 313.9, because in repeated sampling 95% of intervals constructed in like manner will include $\mu$.
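Example 3 can be checked with a short sketch. Python's standard library has no t quantile function, so the reliability coefficient (2.1009 for 18 df) is passed in as a table value; `t_interval` is an illustrative name.

```python
from math import sqrt

def t_interval(mean, s, n, t_value):
    """Return (lower, upper) for mean ± t_value * s / sqrt(n).

    t_value must be looked up by the caller for n - 1 degrees of freedom.
    """
    margin = t_value * s / sqrt(n)
    return mean - margin, mean + margin

# Values from Example 3; 2.1009 is the table value for 95% confidence, 18 df
lower, upper = t_interval(mean=250.8, s=130.9, n=19, t_value=2.1009)
print(round(lower, 1), round(upper, 1))  # 187.7 313.9
```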
  • 16. Deciding between z and t (see flowchart). Consider:
    - Distribution of the population (normal or not)
    - Sample size (large or not)
    - Population variance (known or not)
    Then choose z or t.
    III. Confidence interval for the difference between two population means
    When the population variances are known:
    $(\bar{x}_1 - \bar{x}_2) \pm z_{(1-\alpha/2)} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
    If the interval includes zero, the two population means may be equal; if it does not include zero, we conclude that they are not equal.
  • 17. Example 4: In a study to determine the difference between the serum uric acid levels of two groups of patients (with Down's syndrome and without Down's syndrome). (Two populations of values, normally distributed.)
    Group 1: Mean uric acid level ($\bar{x}_1$) = 4.5 mg/100 ml; variance ($\sigma_1^2$) = 1; sample size ($n_1$) = 12 subjects
    Group 2: Mean uric acid level ($\bar{x}_2$) = 3.4 mg/100 ml; variance ($\sigma_2^2$) = 1.5; sample size ($n_2$) = 15 subjects
    Find the 95% CI for the difference between the two population means ($\mu_1 - \mu_2$).
    $(\bar{x}_1 - \bar{x}_2) \pm z_{(1-\alpha/2)} \sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
    $= (4.5 - 3.4) \pm 1.96\sqrt{1/12 + 1.5/15}$
    $= 1.1 \pm 1.96\,(0.4282)$
    $= 1.1 \pm 0.84$
  • 18. Interpretation: We are 95% confident that the difference between the two population mean serum uric acid levels lies between 0.26 and 1.94 mg/100 ml. Since the interval does not include zero, we conclude that the two population means are not equal.
    Sampling from nonnormal populations:
    - Apply the CLT if the sample sizes are large
    - Use $s^2$ if $\sigma^2$ is unknown
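The interval for the uric acid example (group means 4.5 and 3.4 mg/100 ml, variances 1 and 1.5, sample sizes 12 and 15) can be reproduced with this sketch; `two_mean_z_interval` is an illustrative name.

```python
from math import sqrt

def two_mean_z_interval(mean1, mean2, var1, var2, n1, n2, z):
    """CI for mu1 - mu2 when both population variances are known."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    difference = mean1 - mean2
    margin = z * standard_error
    return difference - margin, difference + margin

lower, upper = two_mean_z_interval(4.5, 3.4, 1.0, 1.5, 12, 15, z=1.96)
print(round(lower, 2), round(upper, 2))  # 0.26 1.94
print(lower <= 0 <= upper)               # False: the interval excludes zero
```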
  • 19. The t distribution and the difference between means
    (A) When the population variances are equal
    (B) When the population variances are not equal
    (A) When the population variances are equal:
    1. Find the pooled estimate of the common variance by
       $s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
    2. Find the SE of the estimate by
       $s_{\bar{x}_1 - \bar{x}_2} = \sqrt{s_p^2/n_1 + s_p^2/n_2}$
    3. Find the $100(1-\alpha)$% confidence interval by
       $(\bar{x}_1 - \bar{x}_2) \pm t_{(1-\alpha/2)} \sqrt{s_p^2/n_1 + s_p^2/n_2}$
    [Note: the number of df is $n_1 + n_2 - 2$]
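Steps 1 to 3 for the equal-variance case can be sketched as below. All input numbers are hypothetical, chosen only to exercise the formulas; the t value (2.0860 for 20 df at 95% confidence) is a table value supplied by the caller.

```python
from math import sqrt

def pooled_variance(s1_sq, s2_sq, n1, n2):
    """Step 1: pooled estimate of the common variance."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def pooled_t_interval(mean1, mean2, s1_sq, s2_sq, n1, n2, t_value):
    """Steps 2 and 3: SE of the difference, then the confidence interval.

    t_value is looked up by the caller for n1 + n2 - 2 degrees of freedom.
    """
    sp_sq = pooled_variance(s1_sq, s2_sq, n1, n2)
    standard_error = sqrt(sp_sq / n1 + sp_sq / n2)
    difference = mean1 - mean2
    return difference - t_value * standard_error, difference + t_value * standard_error

# Hypothetical data: two samples with similar variances
sp_sq = pooled_variance(s1_sq=4.0, s2_sq=6.0, n1=10, n2=12)
lower, upper = pooled_t_interval(20.0, 17.0, 4.0, 6.0, 10, 12, t_value=2.0860)
print(sp_sq, lower, upper)
```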
  • 20. (B) When the population variances are not equal:
    1. Find the reliability factor, $t'_{(1-\alpha/2)}$, by
       $t'_{(1-\alpha/2)} = \dfrac{w_1 t_1 + w_2 t_2}{w_1 + w_2}$
       [$w_1 = s_1^2/n_1$, $w_2 = s_2^2/n_2$; $t_1$ and $t_2$ are the values of $t_{(1-\alpha/2)}$ for $n_1 - 1$ and $n_2 - 1$ df respectively]
    2. Find the $100(1-\alpha)$% confidence interval by
       $(\bar{x}_1 - \bar{x}_2) \pm t'_{(1-\alpha/2)} \sqrt{s_1^2/n_1 + s_2^2/n_2}$
    See the example in the text.
    See the flowchart to choose z, t, or t'.
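The weighted reliability factor t' can be sketched as follows. The sample variances and sizes are hypothetical; the two t values (2.2622 for 9 df and 2.2010 for 11 df at 95% confidence) are table values supplied by the caller.

```python
def t_prime(s1_sq, s2_sq, n1, n2, t1, t2):
    """Weighted reliability factor t' = (w1*t1 + w2*t2) / (w1 + w2).

    t1 and t2 are table values for n1 - 1 and n2 - 1 degrees of freedom.
    """
    w1 = s1_sq / n1
    w2 = s2_sq / n2
    return (w1 * t1 + w2 * t2) / (w1 + w2)

# Hypothetical data with clearly unequal sample variances
tp = t_prime(s1_sq=4.0, s2_sq=24.0, n1=10, n2=12, t1=2.2622, t2=2.2010)
print(round(tp, 4))
```

Because t' is a weighted average, it always lies between $t_1$ and $t_2$, closer to the value belonging to the sample with the larger $s^2/n$.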
  • 21. IV. Confidence interval for a population proportion
    - What proportion of patients who receive a particular type of treatment recover?
    - What proportion of some population has a certain disease?
    - What proportion of a population is immune to a certain disease?
    Proceed in the same manner as for the $100(1-\alpha)$% CI for a population mean:
    Estimator ± (reliability coefficient) × (standard error)
    Find the $100(1-\alpha)$% CI for the population proportion, $p$, from the sample proportion, $\hat{p}$, by
    $\hat{p} \pm z_{(1-\alpha/2)} \sqrt{\hat{p}(1-\hat{p})/n}$
    [Note: When np and n(1-p) are both greater than 5, the sampling distribution of $\hat{p}$ is considered quite close to the normal distribution, and the reliability coefficient is a value of z from the standard normal distribution]
  • 22. Example 5: In a study to find the population proportion of internet users who search for health information:
    Sample proportion ($\hat{p}$) = 0.18
    Sample size = 1220 users
    Find the 95% CI for the population proportion.
    $\hat{p} \pm z_{(1-\alpha/2)} \sqrt{\hat{p}(1-\hat{p})/n} = 0.18 \pm 1.96\sqrt{0.18(1 - 0.18)/1220} = 0.18 \pm 1.96\,(0.0110) = 0.18 \pm 0.022$, i.e. (0.158, 0.202)
    Interpretation: We are 95% confident that the population proportion of internet users who search for health information lies between 0.158 and 0.202.
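Example 5 can be verified with a short sketch that also checks the np > 5 and n(1-p) > 5 condition using the sample proportion; `proportion_interval` is an illustrative name.

```python
from math import sqrt

def proportion_interval(p_hat, n, z):
    """CI for a population proportion: p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    if n * p_hat <= 5 or n * (1 - p_hat) <= 5:
        raise ValueError("normal approximation not appropriate for these values")
    margin = z * sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin

# Values from Example 5: p_hat = 0.18, n = 1220, 95% confidence
lower, upper = proportion_interval(p_hat=0.18, n=1220, z=1.96)
print(round(lower, 3), round(upper, 3))  # 0.158 0.202
```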
  • 23. V. Confidence interval for the difference between two population proportions
    Estimator ± (reliability coefficient) × (standard error)
    $(\hat{p}_1 - \hat{p}_2) \pm z_{(1-\alpha/2)} \sqrt{\hat{p}_1(1 - \hat{p}_1)/n_1 + \hat{p}_2(1 - \hat{p}_2)/n_2}$
    See the example in the text.
    If the interval includes zero, the two population proportions may be equal; if it does not include zero, we conclude that they are not equal.
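The formula for two proportions can be sketched in the same way. The sample proportions and sizes below are hypothetical, since the slides defer the worked example to the text.

```python
from math import sqrt

def two_proportion_interval(p1, n1, p2, n2, z):
    """CI for p1 - p2 using the two sample proportions."""
    standard_error = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    difference = p1 - p2
    margin = z * standard_error
    return difference - margin, difference + margin

# Hypothetical samples: 40% of 200 versus 30% of 250
lower, upper = two_proportion_interval(p1=0.40, n1=200, p2=0.30, n2=250, z=1.96)
print(round(lower, 3), round(upper, 3))  # 0.011 0.189
```

In this hypothetical case the interval excludes zero, so the two population proportions would be judged not equal.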
  • 24. VI. Determination of sample size for estimating means
    A sample that is larger than necessary → waste of resources
    A sample that is too small → results of no practical use
    It is essential to get a sufficient (optimum) sample size.
    Objectives: The objectives in interval estimation are to get narrow intervals with high reliability.
    Recall: Estimator ± (reliability coefficient) × (standard error)
    Half-width of the interval (margin of error), d = (reliability coefficient) × (standard error)
    $d = z\,(\mathrm{SE})$
    $d = z\,(\sigma/\sqrt{n})$
    $d^2 = z^2 \sigma^2 / n$
    $n = z^2 \sigma^2 / d^2$
  • 25. The sample size formula when sampling is without replacement from a small finite population is as follows:
    $n = \dfrac{N z^2 \sigma^2}{d^2 (N - 1) + z^2 \sigma^2}$
    This formula is derived by using the finite population correction $\sqrt{(N - n)/(N - 1)}$ (see text).
    Estimating $\sigma^2$ for use in $n = z^2 \sigma^2 / d^2$:
    1. Use a pilot or preliminary sample → $\sigma^2$
    2. Use previous or similar studies → $\sigma^2$
    3. Use $\sigma \approx R/6$, where R is the range, if the population is approximately normally distributed (largest and smallest values known) → $\sigma$
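The effect of the finite population correction can be seen by comparing the two sample size formulas. This is a sketch under hypothetical values (z = 1.96, σ = 20, d = 5, N = 1000); fractional results are rounded up so that the margin of error is not exceeded.

```python
from math import ceil

def sample_size_mean(z, sigma, d):
    """n = z^2 * sigma^2 / d^2 (large or infinite population)."""
    return z**2 * sigma**2 / d**2

def sample_size_mean_finite(z, sigma, d, N):
    """n = N z^2 sigma^2 / (d^2 (N - 1) + z^2 sigma^2) for a finite population."""
    return N * z**2 * sigma**2 / (d**2 * (N - 1) + z**2 * sigma**2)

# Hypothetical inputs
n_infinite = ceil(sample_size_mean(z=1.96, sigma=20, d=5))
n_finite = ceil(sample_size_mean_finite(z=1.96, sigma=20, d=5, N=1000))
print(n_infinite, n_finite)  # 62 58
```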
  • 26. Example 6: In a study to determine the average daily intake of protein in teenage girls, what is the required sample size?
    - Protein intake is measured in grams (g) → the estimate is based on a mean
    - The investigator wants an interval about 10 g wide (i.e. within about 5 g of the population mean in either direction, so the margin of error d is 5 g)
    - Population SD = 20 g
    - Confidence coefficient = 0.95 (so the reliability factor = 1.96)
    - Ignoring the finite population correction because the population is large, the required n is:
    $n = z^2 \sigma^2 / d^2 = (1.96)^2 (20)^2 / 5^2 = 61.47$
    A fractional result is rounded up, so the required sample size is 62 teenage girls.
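The arithmetic for the protein intake example (z = 1.96, σ = 20 g, d = 5 g) can be checked with a short sketch. Rounding a fractional n up with `ceil`, rather than down, keeps the margin of error within the desired 5 g.

```python
from math import ceil

def sample_size_mean(z, sigma, d):
    """n = z^2 * sigma^2 / d^2, before rounding."""
    return z**2 * sigma**2 / d**2

n_raw = sample_size_mean(z=1.96, sigma=20, d=5)
print(round(n_raw, 2))  # 61.47
print(ceil(n_raw))      # 62
```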
  • 27. VII. Determination of sample size for estimating proportions
    Assume the distribution of $\hat{p}$ is approximately normal. The finite population correction is not needed when:
    - sampling is with replacement,
    - sampling is from an infinite population, or
    - the sampled population is large enough.
    In these cases use $n = z^2 p q / d^2$, where $q = 1 - p$.
    If the finite population correction is used, use
    $n = \dfrac{N z^2 p q}{d^2 (N - 1) + z^2 p q}$
    Estimating p:
    1. Use a pilot sample
    2. Use an upper bound for p (e.g. true p not greater than 0.3)
    3. Use 0.5 for p
  • 28. Example 7: In a study to determine the proportion of medically indigent families in an area, what is the required sample size?
    It is believed that p cannot be greater than 0.35. A 95% CI is desired with d = 0.05.
    $n = z^2 p q / d^2 = (1.96)^2 (0.35)(0.65) / (0.05)^2 = 349.6$, rounded up to 350
    So the required sample size is 350 families.
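Example 7 can be reproduced as below, together with the most conservative choice p = 0.5 for comparison. This is a sketch; `sample_size_proportion` is an illustrative name.

```python
from math import ceil

def sample_size_proportion(z, p, d):
    """n = z^2 * p * (1 - p) / d^2, before rounding."""
    return z**2 * p * (1 - p) / d**2

# Example 7: p believed to be at most 0.35, d = 0.05, 95% confidence
n_example = ceil(sample_size_proportion(z=1.96, p=0.35, d=0.05))
# With no prior information about p, p = 0.5 gives the largest n
n_worst_case = ceil(sample_size_proportion(z=1.96, p=0.5, d=0.05))
print(n_example, n_worst_case)  # 350 385
```

Using p = 0.5 maximises pq, so it yields the largest sample size and is the safe choice when no upper bound on p is available.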
  • 29. VIII. Confidence interval for the variance of a normally distributed population (see text)
    IX. Confidence interval for the ratio of the variances of two normally distributed populations (see text)