Stat-3203: Sampling Technique-II
(Chapter-2: Cluster and Multi-stage Sampling)
Md. Menhazul Abedin
Lecturer
Statistics Discipline
Khulna University, Khulna-9208
Email: menhaz70@gmail.com
Objectives and Outline
Single stage cluster sampling
Cluster sampling with equal and unequal
cluster size
Properties
Advantages and disadvantages
Multi-stage cluster sampling (two stage)
Acknowledgement
• Daroga Singh & F. S. Chaudhary
• M. Nurul Islam
• Ravindra Singh & Naurang Singh Mangat
Background…
• SRS
• Stratified
• Systematic
Cluster
• A cluster is an aggregate or group consisting
of several (non-homogeneous) population
elements
Intuition…
• Study variable: income, awareness, health status,
etc.
• Ghatbhogh, Rupsa,
Naihati
• PSU: primary sampling
unit
• Single-stage sampling: draw a sample of PSUs and
collect information from all individuals in each
sampled PSU
Intuition…
• Upazila → Union
• Two-stage sampling:
PSU → SSU
Intuition…
• Study variable: income, awareness, health status, etc.
• Multistage sampling:
Division → District → Upazila → Union → Village → Household
Why cluster sampling?
• Feasibility: No sampling frame needed
• Economy: Reduction of cost
• Flexibility of cluster formation: Manipulation
of cluster size possible (like political division,
administrative division, commercial capital)
Disadvantages...
• Loss of precision:
• Problems in analysis:
• Can you think of any other disadvantages?
Please insert them here...
Cluster sampling and Others
• Cluster sampling and SRS
• Cluster sampling and Stratified
• Cluster sampling and Systematic
Applications
Cluster sampling
[Figure: a population divided into five clusters (Cluster-1 to Cluster-5), from which a sample is constructed]
Definition…
• Cluster sampling is a method of sampling,
which consists of first selecting, at random
groups, called clusters of elements from the
population, and then choosing all of the
elements within each cluster to make up the
sample. (M. Nurul Islam)
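The definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the cluster names (`union_A`, ...) and the values are hypothetical, clusters are drawn by simple random sampling without replacement, and `single_stage_cluster_sample` is a helper name introduced here, not something from the slides.

```python
import random

def single_stage_cluster_sample(clusters, n, seed=None):
    """Select n whole clusters at random and keep every element in them."""
    rng = random.Random(seed)
    # Step 1: draw n cluster labels by simple random sampling without replacement.
    chosen = rng.sample(sorted(clusters), n)
    # Step 2: take ALL elements of each selected cluster (no sub-sampling).
    return {name: list(clusters[name]) for name in chosen}

# Hypothetical population: four clusters with their element values.
unions = {
    "union_A": [120, 95, 140],
    "union_B": [80, 110],
    "union_C": [150, 130, 90, 100],
    "union_D": [70, 85, 60],
}

sample = single_stage_cluster_sample(unions, n=2, seed=1)
print(sample)
```

Note that only a list of clusters is needed to draw the sample; a list of all individual elements is not.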
Stratified sampling
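[Figure: a population of N units divided into four strata (Strata-1 to Strata-4) of sizes N1, N2, N3, N4, with N1 + N2 + N3 + N4 = N; samples of sizes n1, n2, n3, n4 are drawn from the strata, with n1 + n2 + n3 + n4 = n]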
Single-stage cluster sampling (equal)
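Element         Cluster 1     Cluster 2     Cluster 3     ...   Cluster i     ...   Cluster N
1               $y_{11}$      $y_{21}$      $y_{31}$      ...   $y_{i1}$      ...   $y_{N1}$
2               $y_{12}$      $y_{22}$      $y_{32}$      ...   $y_{i2}$      ...   $y_{N2}$
...             ...           ...           ...           ...   ...           ...   ...
j               $y_{1j}$      $y_{2j}$      $y_{3j}$      ...   $y_{ij}$      ...   $y_{Nj}$
...             ...           ...           ...           ...   ...           ...   ...
M               $y_{1M}$      $y_{2M}$      $y_{3M}$      ...   $y_{iM}$      ...   $y_{NM}$
Cluster total   $y_1$         $y_2$         $y_3$         ...   $y_i$         ...   $y_N$
Cluster mean    $\bar{y}_1$   $\bar{y}_2$   $\bar{y}_3$   ...   $\bar{y}_i$   ...   $\bar{y}_N$
Layout of the NM population elements in clusters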
Single-stage cluster sampling (equal)
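Element         Cluster 1     Cluster 2     Cluster 3     ...   Cluster i     ...   Cluster n
1               $y_{11}$      $y_{21}$      $y_{31}$      ...   $y_{i1}$      ...   $y_{n1}$
2               $y_{12}$      $y_{22}$      $y_{32}$      ...   $y_{i2}$      ...   $y_{n2}$
...             ...           ...           ...           ...   ...           ...   ...
j               $y_{1j}$      $y_{2j}$      $y_{3j}$      ...   $y_{ij}$      ...   $y_{nj}$
...             ...           ...           ...           ...   ...           ...   ...
M               $y_{1M}$      $y_{2M}$      $y_{3M}$      ...   $y_{iM}$      ...   $y_{nM}$
Cluster total   $y_1$         $y_2$         $y_3$         ...   $y_i$         ...   $y_n$
Cluster mean    $\bar{y}_1$   $\bar{y}_2$   $\bar{y}_3$   ...   $\bar{y}_i$   ...   $\bar{y}_n$
Layout of the nM sample elements in clusters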
Single-stage cluster sampling (equal)
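• Individual cluster mean
• $\bar{y}_i = \frac{1}{M}\left(y_{i1} + y_{i2} + \cdots + y_{iM}\right) = \frac{y_i}{M} = \frac{1}{M}\sum_{j=1}^{M} y_{ij}$
• Mean of the n cluster means (sample mean)
• $\bar{y}_n = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i$
• Sample mean
$$\bar{y} = \frac{y}{nM} = \frac{1}{nM}\sum_{i=1}^{n}\sum_{j=1}^{M} y_{ij} = \frac{1}{nM}\sum_{i=1}^{n} y_i = \frac{1}{nM}\sum_{i=1}^{n} M\bar{y}_i = \frac{1}{n}\sum_{i=1}^{n} \bar{y}_i = \bar{y}_n$$
Sample mean = mean of the n cluster means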
Single-stage cluster sampling (equal)
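• Mean of the N cluster means: $\bar{Y}_N = \frac{1}{N}\sum_{i=1}^{N} \bar{y}_i$
• Population mean
$$\bar{Y} = \frac{Y}{NM} = \frac{1}{NM}\sum_{i=1}^{N}\sum_{j=1}^{M} y_{ij} = \frac{1}{NM}\sum_{i=1}^{N} y_i = \frac{1}{NM}\sum_{i=1}^{N} M\bar{y}_i = \frac{1}{N}\sum_{i=1}^{N} \bar{y}_i = \bar{Y}_N$$
Population mean = mean of the N cluster means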
Single-stage cluster sampling (equal)
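• Variance calculation:
$$V(\bar{y}_n) = \frac{N-n}{N}\,\frac{1}{n}\,\frac{1}{M^2}\,\frac{\sum_{i=1}^{N}\left(y_i - \frac{1}{N}\sum_{i=1}^{N} y_i\right)^2}{N-1}$$
$$V(\bar{y}_n) = \frac{N-n}{N}\,\frac{1}{n}\,\frac{\sum_{i=1}^{N}\left(\bar{y}_i - \bar{Y}\right)^2}{N-1} = \frac{1-f}{n}\,S_b^2$$
where $f = n/N$ and $S_b^2 = \frac{\sum_{i=1}^{N}(\bar{y}_i - \bar{Y})^2}{N-1}$.
• Replace $S_b^2$ by $s_b^2 = \frac{\sum_{i=1}^{n}\left(\bar{y}_i - \bar{y}_n\right)^2}{n-1}$
• Estimator of $V(\bar{y}_n)$ is $v(\bar{y}_n) = \frac{1-f}{n}\,s_b^2$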
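As a small numerical illustration of the formulas on this slide, the sketch below computes $\bar{y}_n$, $s_b^2$ and $v(\bar{y}_n)$ for a sample of equal-sized clusters. The data and the function name `cluster_mean_and_variance` are hypothetical and chosen only for the example.

```python
from statistics import mean, variance

def cluster_mean_and_variance(sample_clusters, N):
    """Estimate the population mean and the variance of that estimate.

    sample_clusters: list of n lists, each holding the M values of one sampled cluster.
    N: total number of clusters in the population.
    """
    n = len(sample_clusters)
    cluster_means = [mean(c) for c in sample_clusters]  # the cluster means ybar_i
    y_bar_n = mean(cluster_means)                       # mean of the n cluster means
    s_b2 = variance(cluster_means)                      # s_b^2, divisor n - 1
    f = n / N                                           # sampling fraction
    v = (1 - f) / n * s_b2                              # v(ybar_n) = (1 - f) / n * s_b^2
    return y_bar_n, v

# Hypothetical sample: n = 3 clusters out of N = 10, each with M = 4 elements.
sample = [[2, 4, 6, 8], [1, 3, 5, 7], [4, 6, 8, 10]]
y_bar, v_hat = cluster_mean_and_variance(sample, N=10)
print(y_bar, v_hat)
```

Only the n cluster means enter the variance estimate; the within-cluster spread plays no role because every element of a selected cluster is observed.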
Single-stage cluster sampling (equal)
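• Theorem 8.1: The sample mean $\bar{y}_n$ is an unbiased estimator of the population mean $\bar{Y}$, and its variance is given below.
(This needs the intra-cluster correlation $\rho$, discussed on the next slide.)
• $V(\bar{y}_n) = \frac{(1-f)(NM-1)}{nM^2(N-1)}\,S^2\,[1 + (M-1)\rho]$
or $V(\bar{y}_n) \approx \frac{1-f}{nM}\,S^2\,[1 + (M-1)\rho]$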
Intra-cluster correlation
• The similarity of observations within a cluster
can be quantified by the intra-cluster
correlation coefficient (ICC), sometimes also
called the intraclass correlation coefficient.
• It is very similar to the well-known
Pearson correlation coefficient, except that we
do not look at observations of two variables on
the same object; instead we look at two values
of the same variable taken on two different
objects in the same cluster.
• The calculation is like that of autocorrelation (discussed earlier)
Intra-cluster correlation
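• Mean square between elements in the population
$$S^2 = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M}\left(y_{ij} - \bar{Y}\right)^2}{NM-1}$$
• Intra-cluster correlation
$$\rho = \frac{E\left[(y_{ij} - \bar{Y})(y_{ik} - \bar{Y})\right]}{E\left[(y_{ij} - \bar{Y})^2\right]} = \frac{2\sum_{i=1}^{N}\sum_{j<k}^{M}(y_{ij} - \bar{Y})(y_{ik} - \bar{Y})}{(M-1)(NM-1)\,S^2}$$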
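The two definitions above translate directly into code. The following is a minimal sketch for a small population of equal-sized clusters; the data and the function name `s2_and_rho` are hypothetical and used only for illustration.

```python
from itertools import combinations

def s2_and_rho(pop):
    """Return (S^2, rho) for a population given as N clusters of M values each."""
    N, M = len(pop), len(pop[0])
    grand_mean = sum(sum(c) for c in pop) / (N * M)   # population mean per element
    # S^2: mean square between elements, divisor NM - 1.
    ss = sum((y - grand_mean) ** 2 for c in pop for y in c)
    S2 = ss / (N * M - 1)
    # Sum of cross-products over all pairs j < k within the same cluster.
    cross = sum((a - grand_mean) * (b - grand_mean)
                for c in pop for a, b in combinations(c, 2))
    rho = 2 * cross / ((M - 1) * (N * M - 1) * S2)
    return S2, rho

# Hypothetical population: N = 3 clusters, M = 3 elements each.
population = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
S2, rho = s2_and_rho(population)
print(S2, rho)
```

Here the clusters are internally similar (values within a cluster are close to each other compared with the overall spread), so $\rho$ comes out strongly positive.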
Variance in terms of 𝜌
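• $V(\bar{y}_n) = \frac{N-n}{N}\,\frac{1}{n}\,\frac{\sum_{i=1}^{N}\left(\bar{y}_i - \bar{Y}\right)^2}{N-1}$
• Expand the squared term and relate it to $\rho$
• $V(\bar{y}_n) = \frac{(1-f)(NM-1)}{nM^2(N-1)}\,S^2\,[1 + (M-1)\rho]$
• If N is large, $NM - 1 \approx NM$ and $N - 1 \approx N$
• $V(\bar{y}_n) \approx \frac{1-f}{nM}\,S^2\,[1 + (M-1)\rho]$
• For simplicity, write $V(\bar{y}_n) = \frac{1-f}{nM}\,S^2\,[1 + (M-1)\rho]$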
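The step "expand the squared term" can be written out as follows. This is a sketch that uses only the definitions of $S^2$ and $\rho$ given earlier, together with $\bar{y}_i - \bar{Y} = \frac{1}{M}\sum_{j}(y_{ij} - \bar{Y})$.

```latex
\begin{align*}
\sum_{i=1}^{N} (\bar{y}_i - \bar{Y})^2
  &= \frac{1}{M^2} \sum_{i=1}^{N} \Bigl[ \sum_{j=1}^{M} (y_{ij} - \bar{Y}) \Bigr]^2 \\
  &= \frac{1}{M^2} \Bigl[ \sum_{i=1}^{N} \sum_{j=1}^{M} (y_{ij} - \bar{Y})^2
     + 2 \sum_{i=1}^{N} \sum_{j<k} (y_{ij} - \bar{Y})(y_{ik} - \bar{Y}) \Bigr] \\
  &= \frac{1}{M^2} \Bigl[ (NM-1) S^2 + (M-1)(NM-1) \rho S^2 \Bigr] \\
  &= \frac{(NM-1) S^2}{M^2} \bigl[ 1 + (M-1)\rho \bigr]
\end{align*}
```

Dividing by $N-1$ and multiplying by $(1-f)/n$ gives $V(\bar{y}_n) = \frac{(1-f)(NM-1)}{nM^2(N-1)}\,S^2\,[1 + (M-1)\rho]$.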
Design effect
• Variance under cluster sampling
• $V(\bar{y}_n) = \frac{1-f}{nM}\,S^2\,[1 + (M-1)\rho]$
• Variance under simple random sampling of nM elements
• $V(\bar{y}_{nM}) = \frac{NM - nM}{NM}\,\frac{S^2}{nM} = \frac{1-f}{nM}\,S^2$
• Dividing: $\frac{V(\bar{y}_n)}{V(\bar{y}_{nM})} = 1 + (M-1)\rho = Deff$
• What is the interpretation of the design effect?
– It is simple; can you find it? Try your best.
Relationship between 𝜌, Deff and M
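• $Deff = 1 + (M-1)\rho$
– See its properties when
– $\rho = 1$ [Deff = M; all the M values in a cluster are
equal]
– $M = 1$ [Deff = 1; cluster sampling = SRS]
– $\rho = 0$ [Deff = 1; clustering has no effect]
– $Deff = 0$ or $\rho = +1$: find the range of the intra-cluster
correlation ($-\frac{1}{M-1} \le \rho \le 1$)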
Efficiency of cluster sampling
• $\frac{V(\bar{y}_{nM})}{V(\bar{y}_n)} = \frac{1}{1 + (M-1)\rho} = \frac{1}{Deff}$
• Observe its characteristics when
– $\rho > 0$: cluster sampling is less efficient than SRS
– $\rho < 0$: cluster sampling is more efficient than SRS
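To see how $\rho$ and $M$ drive the design effect and the relative efficiency, the sketch below evaluates both for a few values. The function names and the chosen values of $M$ and $\rho$ are illustrative assumptions, not taken from the slides.

```python
def design_effect(M, rho):
    """Deff = 1 + (M - 1) * rho for clusters of equal size M."""
    return 1 + (M - 1) * rho

def relative_efficiency(M, rho):
    """Efficiency of cluster sampling relative to SRS of the same size: 1 / Deff."""
    return 1 / design_effect(M, rho)

M = 10  # hypothetical cluster size
for rho in (-0.05, 0.0, 0.1, 1.0):
    print(f"rho={rho:+.2f}  Deff={design_effect(M, rho):.2f}  "
          f"efficiency={relative_efficiency(M, rho):.2f}")
```

A Deff above 1 means that the cluster sample has a larger variance than a simple random sample with the same number of elements; a Deff below 1 (negative $\rho$) means it has a smaller one.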
Single-stage cluster sampling (Equal)
• Find the optimum n and M subject to a cost
constraint.
– Ignore this for now
Example
• Example: 8.2
• Example: 8.3
Single stage cluster sampling with
Unequal cluster size
Single-stage cluster sampling (Unequal)
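Element         Cluster 1      Cluster 2      Cluster 3      ...   Cluster i      ...   Cluster N
1               $y_{11}$       $y_{21}$       $y_{31}$       ...   $y_{i1}$       ...   $y_{N1}$
2               $y_{12}$       $y_{22}$       $y_{32}$       ...   $y_{i2}$       ...   $y_{N2}$
...             ...            ...            ...            ...   ...            ...   ...
j               $y_{1j}$       $y_{2j}$       $y_{3j}$       ...   $y_{ij}$       ...   $y_{Nj}$
...             ...            ...            ...            ...   ...            ...   ...
$M_i$           $y_{1M_1}$     $y_{2M_2}$     $y_{3M_3}$     ...   $y_{iM_i}$     ...   $y_{NM_N}$
Cluster total   $y_1$          $y_2$          $y_3$          ...   $y_i$          ...   $y_N$
Cluster mean    $\bar{y}_1$    $\bar{y}_2$    $\bar{y}_3$    ...   $\bar{y}_i$    ...   $\bar{y}_N$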
Single-stage cluster sampling (Unequal)
• Total number of elements: $M_0 = \sum_{i=1}^{N} M_i$
• Total of the elements in the $i$-th cluster: $y_i = \sum_{j=1}^{M_i} y_{ij}$
• Average number of elements per cluster: $\bar{M} = \frac{\sum_{i=1}^{N} M_i}{N} = \frac{M_0}{N}$
Single-stage cluster sampling (Unequal)
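• Population mean (1)
$$\bar{Y} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i\bar{y}_i}{\sum_{i=1}^{N} M_i} = \frac{\sum_{i=1}^{N} M_i\bar{y}_i}{M_0}$$
• Population mean (2)
$$\bar{Y}_N = \frac{\sum_{i=1}^{N} \bar{y}_i}{N}$$
• Are they the same?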
Single-stage cluster sampling (Unequal)
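• Sample mean (1)
$$\bar{y}_n = \frac{\sum_{i=1}^{n} \bar{y}_i}{n}$$
Biased for $\bar{Y}$ but unbiased for $\bar{Y}_N$
• Sample mean (2)
• $\bar{y}_n = \frac{N}{nM_0}\sum_{i=1}^{n} M_i\bar{y}_i = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{M_i\bar{y}_i}{\bar{M}}\right)$
This is unbiased for $\bar{Y}$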
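The bias statements can be checked on a toy population by listing every possible simple random sample of clusters and averaging each estimator over all of them. The population below is hypothetical, and `estimator_1` / `estimator_2` are names introduced here for sample means (1) and (2).

```python
from itertools import combinations

# Hypothetical population: N = 4 clusters of unequal sizes M_i.
population = [[2, 4], [1, 3, 5, 7], [6], [8, 10, 12]]
N = len(population)
M0 = sum(len(c) for c in population)                     # total number of elements
M_bar = M0 / N                                           # average cluster size
Y_bar = sum(sum(c) for c in population) / M0             # population mean (1)
Y_bar_N = sum(sum(c) / len(c) for c in population) / N   # population mean (2)

def estimator_1(sample):
    """Sample mean (1): simple average of the sampled cluster means."""
    return sum(sum(c) / len(c) for c in sample) / len(sample)

def estimator_2(sample):
    """Sample mean (2): average of M_i * ybar_i / M_bar (M_i * ybar_i is the cluster total)."""
    return sum(sum(c) / M_bar for c in sample) / len(sample)

n = 2
samples = list(combinations(population, n))   # all equally likely samples of n clusters
E1 = sum(estimator_1(s) for s in samples) / len(samples)
E2 = sum(estimator_2(s) for s in samples) / len(samples)
print(Y_bar, Y_bar_N, E1, E2)
```

In this example the expectation of sample mean (1) matches $\bar{Y}_N$ and not $\bar{Y}$, while the expectation of sample mean (2) matches $\bar{Y}$, as stated on the slide.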
Single-stage cluster sampling (Unequal)
• Do an example
Single-stage cluster sampling (Unequal)
• Further study
– Cluster sampling with PPS sampling (not needed
right now)
Multi-stage cluster sampling (Two-stage)
Background...
• A unit may contain too many elements to
obtain a measurement on each
• A unit may contain elements that are nearly
alike.
Multi-stage cluster sampling (Two-stage)
Background...
• $\frac{V(\bar{y}_{nM})}{V(\bar{y}_n)} = \frac{1}{1 + (M-1)\rho}$, or $\frac{V(\text{SRS})}{V(\text{Cluster})} = \frac{1}{1 + (M-1)\rho}$
– What happens when M increases?
• Cluster sampling becomes less efficient (for $\rho > 0$)
• From a large cluster, draw a small sample
Multi-stage cluster sampling (Two-stage)
• Sub-sampling (two-stage sampling)
• A two-stage cluster sample is one obtained
by first selecting a sample of clusters and then
selecting a sample of elements from
each sampled cluster.
• Village → Household (subsample)
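The two selection steps can be sketched as follows. The village names, household IDs and the helper `two_stage_sample` are hypothetical, and simple random sampling without replacement is assumed at both stages.

```python
import random

def two_stage_sample(clusters, n, m, seed=None):
    """Stage 1: select n clusters. Stage 2: select up to m elements in each selected cluster."""
    rng = random.Random(seed)
    chosen = rng.sample(sorted(clusters), n)             # first-stage units (e.g. villages)
    return {
        name: rng.sample(clusters[name], min(m, len(clusters[name])))  # second-stage units
        for name in chosen
    }

# Hypothetical frame: villages (first-stage units) and their household IDs (second-stage units).
villages = {
    "village_1": ["h11", "h12", "h13", "h14"],
    "village_2": ["h21", "h22", "h23"],
    "village_3": ["h31", "h32", "h33", "h34", "h35"],
}

picked = two_stage_sample(villages, n=2, m=2, seed=7)
print(picked)
```

Unlike single-stage cluster sampling, only a sub-sample of households is observed in each selected village.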
Multi-stage cluster sampling (Two-stage)
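Cluster   $M_i$   Population elements                                          Total                                   Cluster mean
1         $M_1$   $y_{11}, y_{12}, \ldots, y_{1j}, \ldots, y_{1M_1}$           $Y_1 = \sum_{j=1}^{M_1} y_{1j}$         $\bar{Y}_1 = Y_1 / M_1$
2         $M_2$   $y_{21}, y_{22}, \ldots, y_{2j}, \ldots, y_{2M_2}$           $Y_2 = \sum_{j=1}^{M_2} y_{2j}$         $\bar{Y}_2 = Y_2 / M_2$
…         …       …                                                            …                                       …
i         $M_i$   $y_{i1}, y_{i2}, \ldots, y_{ij}, \ldots, y_{iM_i}$           $Y_i = \sum_{j=1}^{M_i} y_{ij}$         $\bar{Y}_i = Y_i / M_i$
…         …       …                                                            …                                       …
N         $M_N$   $y_{N1}, y_{N2}, \ldots, y_{Nj}, \ldots, y_{NM_N}$           $Y_N = \sum_{j=1}^{M_N} y_{Nj}$         $\bar{Y}_N = Y_N / M_N$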
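Multi-stage cluster sampling (Two-stage)
• $Y = \sum_{i=1}^{N} Y_i = \sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}$
• $M_0 = \sum_{i=1}^{N} M_i$
• $\bar{Y}_i = \frac{\sum_{j=1}^{M_i} y_{ij}}{M_i} = \frac{Y_i}{M_i}$
• Population mean
$$\bar{Y} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}}{\sum_{i=1}^{N} M_i} = \frac{Y}{M_0} = \frac{\sum_{i=1}^{N} Y_i}{M_0} = \frac{\sum_{i=1}^{N} M_i\bar{Y}_i}{M_0}$$
• Population pooled mean (mean per cluster)
$$\frac{\sum_{i=1}^{N} Y_i}{N} = \frac{\sum_{i=1}^{N}\sum_{j=1}^{M_i} y_{ij}}{N} = \frac{\sum_{i=1}^{N} M_i\bar{Y}_i}{N}$$
The individual cluster mean $\bar{Y}_i$ (shown in red on the slide) and the
pooled mean (shown in blue on the slide) are different.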
Multi-stage cluster sampling (Two-stage)
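Unit   $M_i$   $m_i$   Sample observations                                          Total                                   Cluster mean
1      $M_1$   $m_1$   $y_{11}, y_{12}, \ldots, y_{1j}, \ldots, y_{1m_1}$           $y_1 = \sum_{j=1}^{m_1} y_{1j}$         $\bar{y}_1 = y_1 / m_1$
2      $M_2$   $m_2$   $y_{21}, y_{22}, \ldots, y_{2j}, \ldots, y_{2m_2}$           $y_2 = \sum_{j=1}^{m_2} y_{2j}$         $\bar{y}_2 = y_2 / m_2$
…      …       …       …                                                            …                                       …
i      $M_i$   $m_i$   $y_{i1}, y_{i2}, \ldots, y_{ij}, \ldots, y_{im_i}$           $y_i = \sum_{j=1}^{m_i} y_{ij}$         $\bar{y}_i = y_i / m_i$
…      …       …       …                                                            …                                       …
n      $M_n$   $m_n$   $y_{n1}, y_{n2}, \ldots, y_{nj}, \ldots, y_{nm_n}$           $y_n = \sum_{j=1}^{m_n} y_{nj}$         $\bar{y}_n = y_n / m_n$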
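Multi-stage cluster sampling (Two-stage)
• $y = \sum_{i=1}^{n} y_i = \sum_{i=1}^{n}\sum_{j=1}^{m_i} y_{ij}$
• $m_0 = \sum_{i=1}^{n} m_i$, $\bar{m} = \frac{m_0}{n}$
• Average value per second-stage unit
• $\bar{y}_i = \frac{\sum_{j=1}^{m_i} y_{ij}}{m_i} = \frac{y_i}{m_i}$, $\bar{\bar{y}} = \frac{y}{m_0}$
• Average value per first-stage unit
$$\bar{y}_n = \frac{y}{n} = \frac{\sum_{i=1}^{n}\sum_{j=1}^{m_i} y_{ij}}{n}$$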
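Multi-stage cluster sampling (Two-stage)
• A number of estimators can be defined (as a researcher, you can
define more with good properties)
• $\bar{y}_{ts(1)} = \frac{\sum_{i=1}^{n} \bar{y}_i}{n}$: the ordinary mean of the
first-stage unit means.
• $\bar{y}_{ts(2)} = \frac{\sum_{i=1}^{n} M_i\bar{y}_i}{n\bar{M}} = \frac{N}{M_0}\,\frac{\sum_{i=1}^{n} M_i\bar{y}_i}{n}$: based on $M_0$.
• Replacing $M_0$ by $\hat{M}_0 = N\,\frac{\sum_{i=1}^{n} M_i}{n}$ gives
$\bar{y}_{ts} = \frac{\sum_{i=1}^{n} M_i\bar{y}_i}{\sum_{i=1}^{n} M_i}$, known as the ratio estimator.
• $\hat{Y}_R = M_0\,\frac{\sum_{i=1}^{n} M_i\bar{y}_i}{\sum_{i=1}^{n} M_i}$: estimator of the total.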
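The estimators on this slide can be compared side by side on a small two-stage sample. The sketch below uses hypothetical data; the function name `two_stage_estimators` and the values of `N` and `M0` are assumptions made only for the example.

```python
def two_stage_estimators(sample, N, M0):
    """Compute the three mean estimators from a two-stage sample.

    sample: list of (M_i, observed_values) pairs for the n selected clusters.
    N: number of clusters in the population.  M0: total number of elements.
    """
    n = len(sample)
    sizes = [M_i for M_i, _ in sample]
    means = [sum(ys) / len(ys) for _, ys in sample]          # cluster sample means ybar_i
    weighted = sum(M_i * yb for M_i, yb in zip(sizes, means))  # sum of M_i * ybar_i
    ts1 = sum(means) / n                  # ordinary mean of first-stage unit means
    ts2 = (N / M0) * weighted / n         # estimator based on the known M0
    ratio = weighted / sum(sizes)         # ratio estimator (M0 replaced by its estimate)
    return ts1, ts2, ratio

# Hypothetical sample: n = 3 clusters from N = 12, with M0 = 300 elements in total.
sample = [(10, [2, 4, 6]), (20, [5, 7]), (30, [1, 3, 5, 11])]
ts1, ts2, ratio = two_stage_estimators(sample, N=12, M0=300)
print(ts1, ts2, ratio)
```

The three values differ here because the sampled clusters are, on average, smaller than the population average size $M_0/N$, which affects $\bar{y}_{ts(2)}$ but not the ratio estimator.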
Why such Scribble functions?
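• Estimated total of the $i$-th cluster $= M_i\bar{y}_i$
• Estimator of the total of Y over the n selected clusters:
$\sum_{i=1}^{n} M_i\bar{y}_i$
• Average value of Y per cluster: $\frac{\sum_{i=1}^{n} M_i\bar{y}_i}{n}$
• Estimator of the total of Y over all N clusters:
$\frac{N}{n}\sum_{i=1}^{n} M_i\bar{y}_i$
• Total = total frequency × mean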
Why such Scribble functions?
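• $\frac{N}{n}\sum_{i=1}^{n} M_i\bar{y}_i = \hat{Y} = M_0 \times \text{mean} = M_0\,\frac{N}{M_0}\,\frac{\sum_{i=1}^{n} M_i\bar{y}_i}{n}$
• $\text{mean} = \frac{\hat{Y}}{M_0}$ [estimator of $\bar{Y}$]
• Thus $\bar{y}_{ts(2)} = \frac{N}{M_0}\,\frac{\sum_{i=1}^{n} M_i\bar{y}_i}{n}$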
Unbiasedness...
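• Theorem 9.1: The estimator $\bar{y}_{ts(2)}$ is unbiased
and its variance is given by
$$V(\bar{y}_{ts(2)}) = (1 - f_1)\,\frac{1}{\bar{M}^2}\,\frac{S_b^2}{n} + \frac{1}{nN\bar{M}^2}\sum_{i=1}^{N} M_i^2\,(1 - f_{2i})\,\frac{S_i^2}{m_i}$$
where $f_1 = \frac{n}{N}$ and $f_{2i} = \frac{m_i}{M_i}$.
The prerequisite (conditional expectation) is given on the next slide.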
Conditional Expectation
• $E(X) = E[E(X \mid Y)]$
• $E(u) = E_1[E_2(u \mid b^*)] = \sum_j P(b^* = B_j^*)\,E(u \mid B_j^*)$
• $E_1$ is the unconditional expectation; in our context, the expectation over the first-stage
selection.
• $E_2$ is the conditional expectation; in our context, the expectation over the
second-stage selections from a given set of first-stage
units.
• $V(X) = V[E(X \mid Y)] + E[V(X \mid Y)]$
• $V(\bar{y}_{ts(2)}) = V_1[E_2(\bar{y}_{ts(2)} \mid n)] + E_1[V_2(\bar{y}_{ts(2)} \mid n)]$
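The unbiasedness of $\bar{y}_{ts(2)}$ and the variance formula of Theorem 9.1 can be checked on a toy population by enumerating every possible two-stage sample. This sketch rests on assumptions that are not spelled out on the slides: simple random sampling without replacement at both stages, $S_b^2$ taken as the variance among cluster totals (divisor $N-1$), and $S_i^2$ as the within-cluster variance (divisor $M_i - 1$). The population and the sub-sample sizes `m` are hypothetical.

```python
from itertools import combinations, product
from statistics import mean, variance

population = [[1, 2, 4], [3, 5], [6, 7, 9, 10]]   # hypothetical: N = 3 clusters
m = [2, 1, 2]                                     # second-stage sample sizes m_i
n = 2                                             # first-stage sample size
N = len(population)
M = [len(c) for c in population]
M0 = sum(M)
M_bar = M0 / N
Y_bar = sum(sum(c) for c in population) / M0      # population mean per element

def y_ts2(first_stage, second_stage):
    """(N / M0) * (1 / n) * sum of M_i * ybar_i over the selected clusters."""
    total = sum(M[i] * mean(ys) for i, ys in zip(first_stage, second_stage))
    return (N / M0) * total / n

# Enumerate all two-stage samples; every first-stage sample is equally likely,
# and given it, every combination of sub-samples is equally likely.
expectation, second_moment = 0.0, 0.0
first_stage_samples = list(combinations(range(N), n))
for fs in first_stage_samples:
    subsamples = list(product(*(combinations(population[i], m[i]) for i in fs)))
    p = 1 / (len(first_stage_samples) * len(subsamples))
    for ss in subsamples:
        est = y_ts2(fs, ss)
        expectation += p * est
        second_moment += p * est ** 2
exact_variance = second_moment - expectation ** 2

# Variance from the theorem, under the assumed definitions of S_b^2 and S_i^2.
f1 = n / N
S_b2 = variance([sum(c) for c in population])
within = sum(M[i] ** 2 * (1 - m[i] / M[i]) * variance(population[i]) / m[i]
             for i in range(N))
formula_variance = (1 - f1) * S_b2 / (n * M_bar ** 2) + within / (n * N * M_bar ** 2)

print(expectation, Y_bar)
print(exact_variance, formula_variance)
```

The first term of the formula corresponds to $V_1[E_2(\cdot)]$ (variation between first-stage samples) and the second to $E_1[V_2(\cdot)]$ (variation from sub-sampling within the selected clusters).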
Advantages
• More flexible than one-stage sampling
• Useful for quality control purposes
• Suitable for large surveys
• Lower cost and more convenient than stratified
sampling of the same size
• Study an example
