Chapter 6_Confidence intervals for range of values, calculated from sample data

CHAPTER 6
Confidence Intervals
Week 3-1

Week 3-2
Learning Objectives
• Explain the difference between a point estimate and an
interval estimate.
• Construct and interpret confidence intervals:
 with Z-distribution for the population mean or proportion.
 with t-distribution for the population mean.
• Determine appropriate sample size to achieve specified
levels of accuracy and confidence.

Week 3-3
Content of this chapter
 Confidence Intervals for the Population Mean, μ
 when Population Standard Deviation σ is Known
 when Population Standard Deviation σ is Unknown
 Confidence Intervals for the Population
Proportion, P
 Determining the Required Sample Size, n

Week 3-4
Estimation Process
Population (mean, μ, is unknown)
→ Random Sample
→ Sample mean X̄ = 50 (POINT Estimate)
→ "I am 95% confident that μ is between 40 and 60." (INTERVAL Estimate)

Week 3-5
Point Estimates vs.
Interval Estimates
 A Point Estimate is a single value used to approximate a population parameter; for example, the sample mean is a point estimate of the population mean.
 A Confidence Interval provides additional information about variability.
[Figure: the confidence interval runs from the Lower Confidence Limit to the Upper Confidence Limit, with the Point Estimate at its center; the distance between the two limits is the width of the confidence interval.]

Week 3-6
Point Estimates
We can estimate a Population Parameter with a Sample Statistic (a Point Estimate):
Population Parameter    Sample Statistic (Point Estimate)
Mean, μ                 X̄
Proportion, p           ps
Point Estimate
• Imagine you want to know the average height of students in your class.
• You measure the height of a few students and calculate the average from your
sample.
• This single average number is a point estimate for the average height of all
students in the class.
So, a point estimate is just one single number, like saying, "I think the average height is 170 cm."

Week 3-7
Unbiased Point Estimates
Population Parameter    Sample Statistic    Formula
Mean, μ                 x̄                   $\bar{x} = \frac{\sum x_i}{n}$
Variance, σ²            s²                  $s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$
Proportion, p           ps                  $p_s = \frac{x\ \text{successes}}{n\ \text{trials}}$
An Unbiased Point Estimate is a statistical estimate of a population parameter
that, on average, equals the true value of that parameter over many samples.
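The three formulas in the table can be tried on a small sample. This is a minimal sketch: the `heights` list and the 170 cm threshold for a "success" are made-up illustrative values, not data from the slides.

```python
from statistics import mean, variance

# Hypothetical sample of student heights in cm (illustrative data only).
heights = [168.0, 172.5, 165.0, 174.0, 170.5]

x_bar = mean(heights)          # point estimate of the population mean
s_squared = variance(heights)  # sample variance, uses the n - 1 divisor

# Sample proportion: x successes out of n trials.
# Here a "success" is (arbitrarily) a height above 170 cm.
successes = sum(1 for h in heights if h > 170.0)
p_s = successes / len(heights)

print(f"x_bar = {x_bar:.2f}, s^2 = {s_squared:.3f}, p_s = {p_s:.2f}")
```

Note that `statistics.variance` divides by n − 1, which matches the unbiased formula for s²; `statistics.pvariance` would divide by n instead.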

Week 3-8
Confidence Interval Estimate
 How much uncertainty is associated with a point
estimate of a population parameter?
 An interval estimate provides more information
about a population characteristic than does a point
estimate
 Such interval estimates are called confidence
intervals

Week 3-9
 An interval gives a range of values:
 within which the population parameter is likely to
lie.
 Takes into consideration the variation in sample statistics from sample to sample
 Based on observations from one sample
 Stated in terms of level of confidence

 Instead of giving a single number, you give a range.
For example, you might say, "I think the average
height is between 165 cm and 175 cm."
 This range is called an interval estimate or
confidence interval.
 So, an interval estimate says, "I'm pretty sure the true
average is somewhere between 165 cm and 175 cm."
Week 3-10

Week 3-11
Confidence Level
 The degree of certainty that an interval will contain the actual
population parameter (μ , p )
 Expressed as a percentage that is always less than 100% (for example, 90%, 95%, or 99%)
 Confidence Level: (1 - α)×100%
 Here, α represents the level of significance, or the probability
that the interval does not contain the true parameter.
 For example, if α=0.05, then the confidence level is
(1−0.05)×100%=95%

Week 3-12
Confidence Level, (1 - α)100%
 Suppose confidence level = 95%
 Also written (1 − α) = 0.95
 A relative frequency interpretation:
 In the long run, 95% of all the confidence intervals
will contain the unknown true parameter.
 A specific interval either will contain or will not
contain the true parameter
 No probability involved in a specific interval once the
interval has been calculated.
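The relative frequency interpretation can be checked with a small simulation. This is a sketch under assumed values: the population mean 50, standard deviation 10, sample size 25, number of trials, and seed are all arbitrary choices, and the population is taken to be normal with σ known.

```python
import random
from statistics import NormalDist

def coverage(mu=50.0, sigma=10.0, n=25, level=0.95, trials=2000, seed=1):
    """Fraction of simulated intervals (sigma known) that contain the true mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # about 1.96 for 95%
    margin = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        x_bar = sum(sample) / n
        # Each individual interval either contains mu or it does not.
        if x_bar - margin <= mu <= x_bar + margin:
            hits += 1
    return hits / trials

print(f"share of intervals containing mu: {coverage():.3f}")
```

The printed share should land close to 0.95, while any single interval in the loop simply does or does not contain μ.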

Week 3-13
General Formula
 The general formula for a confidence interval is:
Point Estimate ± (Critical Value) × (Standard Error) = Point Estimate ± Margin of Error
• The point estimate is the sample statistic used to estimate the population parameter: the sample mean, X̄, or the sample proportion, Ps.
• The standard error measures the variability of the sample statistic: $\frac{\sigma}{\sqrt{n}}$ for the sample mean, $\sqrt{\frac{P_s(1-P_s)}{n}}$ for the sample proportion.
• The margin of error tells us how much we expect the point estimate to vary from the true population parameter.

Week 3-14
Intervals and Level of Confidence
[Figure: Sampling Distribution of the Mean, centered at $\mu_{\bar{x}} = \mu$, with area 1 − α in the center and α/2 in each tail; below it, intervals built around the sample means x̄1, x̄2, ...]
Intervals extend from $\bar{X} - Z\frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z\frac{\sigma}{\sqrt{n}}$.
(1 − α) × 100% of intervals constructed contain μ; α × 100% do not.
The center of the curve (between the tails) represents the confidence level (e.g., 95% or 99%).
The area in the tails (α) represents the probability that the interval does not contain the population mean.

Week 3-15
Confidence Intervals
• Population Mean: σ Known or σ Unknown
• Population Proportion

1st Condition: One Population Mean, σ is known
Week 3-16

Week 3-17
Confidence Interval for μ
(σ is Known)
 Assumptions
 Population standard deviation σ is known
 Population is normally distributed
 Confidence interval estimate: $\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
Note: $Z_{\alpha/2}$ is the normal distribution critical value for a probability of α/2 in each tail

Week 3-18
Finding the Critical Value, Z
 Consider a 95% confidence interval: 1 − α = 0.95, so α = 0.05 and α/2 = 0.025.
[Figure: standard normal curve with area 0.025 in each tail. Z units: −1.96, 0, 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit.]
 Since this is a two-tailed interval (meaning we’re accounting for variation on both sides of the mean), we divide α by 2.
 To capture the middle 95% of the distribution, we look up the Z-value that corresponds to the outer 2.5% (0.025) in each tail: Z = ±1.96.
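The table lookup described above can also be done numerically. A minimal sketch using the standard library's `statistics.NormalDist`; the helper name `z_critical` is an illustrative choice, not something from the slides.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Z value that leaves alpha/2 in each tail of the standard normal."""
    alpha = 1 - confidence_level
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

Printed tables usually round these values (for example 2.576 becomes 2.58), so small differences in later digits are expected.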

Week 3-19
Common Levels of Confidence
 Commonly used confidence levels are 90%, 95%, and 99%
Confidence Level    Confidence Coefficient (1 − α)    Z value
80%                 0.80                              1.28
90%                 0.90                              1.645
95%                 0.95                              1.96
98%                 0.98                              2.33
99%                 0.99                              2.58
99.8%               0.998                             3.09
99.9%               0.999                             3.29

Week 3-20
Example
 A sample of 11 circuits from a large normal
population has a mean resistance of 2.20 ohms.
We know from past testing that the population
standard deviation is 0.35 ohms.
 Determine a 95% confidence interval for the true
mean resistance of the population.

Week 3-21
Interpretation
Solution:
$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$
 We are 95% confident that the true mean resistance is between 1.9932 and 2.4068 ohms
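The circuit example can be reproduced in a few lines. This is a sketch, assuming a normal population with known σ as the slide states; the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def z_interval(x_bar, sigma, n, confidence_level=0.95):
    """Confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sigma / sqrt(n)  # critical value times standard error
    return x_bar - margin, x_bar + margin

# Circuit example: n = 11, sample mean 2.20 ohms, sigma = 0.35 ohms.
lower, upper = z_interval(x_bar=2.20, sigma=0.35, n=11)
print(f"({lower:.4f}, {upper:.4f})")
```

Passing a different `confidence_level` (say 0.99) widens the interval, since a larger critical value multiplies the same standard error.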






Week 3-22
2nd Condition: One Population Mean, σ is unknown, n is small
Week 3-23

Week 3-24
Confidence Interval for μ (σ Unknown)
 If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S, for it
 This introduces extra uncertainty, since S differs from sample to sample
 So we use the t distribution instead of the normal distribution

Week 3-25
Confidence Interval for μ (σ Unknown), continued
 Assumptions
 Population standard deviation is unknown
 Population is normally distributed
 Use Student’s t Distribution
 Confidence Interval Estimate: $\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$
(where $t_{\alpha/2,\,n-1}$ is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail)

Week 3-26
Student’s t-Distribution
 The t distribution is a family of distributions
 The t value depends on degrees of freedom
(d.f.)
 Number of observations that are free to vary after sample
mean has been calculated
d.f. = n - 1

Week 3-27
Student’s t Table
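Let n = 3 and α = 0.10 (two-tailed), so df = n − 1 = 2 and α/2 = 0.05 in each tail.
From the t table, $t_{0.05,\,2} = 2.920$.
[Figure: t distribution centered at 0, with area α/2 = 0.05 in the upper tail beyond t = 2.920.]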

Week 3-28
t distribution values
With comparison to the Z value
Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z (d.f. = ∞)
0.80                1.372          1.325          1.310          1.28
0.90                1.812          1.725          1.697          1.645
0.95                2.228          2.086          2.042          1.96
0.99                3.169          2.845          2.750          2.58
Note: t → Z as n increases
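The tabulated t values can be approximated without a printed table. The Python standard library has no t quantile function, so the sketch below swaps in a plain numerical approach (Simpson's rule for the CDF, then bisection); the function names, step count, and search range of 0 to 100 are assumptions made for illustration, and a statistics package would normally be used instead.

```python
from math import exp, lgamma, pi, sqrt

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = lgamma((df + 1) / 2) - lgamma(df / 2)
    return exp(log_c) / sqrt(df * pi) * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x) for x >= 0: 0.5 plus Simpson's rule on [0, x]."""
    if x == 0:
        return 0.5
    h = x / steps
    total = t_pdf(0.0, df) + t_pdf(x, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 0.5 + total * h / 3

def t_critical(confidence_level, df):
    """Critical value with alpha/2 in the upper tail, found by bisection."""
    target = 1 - (1 - confidence_level) / 2
    lo, hi = 0.0, 100.0  # assumed search range, wide enough for the table above
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

for df in (10, 20, 30):
    print(f"95% confidence, {df} d.f.: t = {t_critical(0.95, df):.3f}")
```

As the degrees of freedom grow, the printed values move toward the Z value of 1.96, which is the pattern the table shows.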

Week 3-29
Example
A random sample of n = 25 has X = 50 and
S = 8. Form a 95% confidence interval for μ

Week 3-30
Example (continued)
 d.f. = n − 1 = 24, so $t_{\alpha/2,\,n-1} = t_{0.025,\,24} = 2.0639$
 The confidence interval is
$\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}} = 50 \pm (2.0639)\frac{8}{\sqrt{25}} = (46.698,\ 53.302)$
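The arithmetic in this example can be checked as follows. This sketch takes the critical value 2.0639 directly from the slide's t table rather than computing it; `t_interval` is an illustrative name.

```python
from math import sqrt

def t_interval(x_bar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown (t-based)."""
    margin = t_crit * s / sqrt(n)  # S replaces the unknown sigma
    return x_bar - margin, x_bar + margin

# n = 25, sample mean 50, S = 8, t(0.025, 24 d.f.) = 2.0639 from the table.
lower, upper = t_interval(x_bar=50, s=8, n=25, t_crit=2.0639)
print(f"({lower:.3f}, {upper:.3f})")
```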

3rd Condition: One Population Mean, σ is unknown, n is large
Week 3-31

Week 3-32
Student’s t Distribution
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
[Figure: density curves centered at 0 for the Standard Normal (t with df = ∞), t (df = 13), and t (df = 5).]
Note: t → Z as n increases


Week 3-33
Confidence Interval for μ (σ unknown, n is large)
Assumptions:
- Population standard deviation is unknown
- Due to the Central Limit Theorem, if n is large enough, the sampling distribution of the mean is approximately normal regardless of the shape of the population distribution.
Confidence interval estimate: $\bar{X} \pm Z_{\alpha/2}\,\frac{S}{\sqrt{n}}$
Note: $Z_{\alpha/2}$ is the normal distribution critical value for a probability of α/2 in each tail

Week 3-34
Example
The quality control manager at a battery factory needs to estimate the mean life of a large shipment of batteries. A random sample of 81 batteries indicated a sample mean life of 400 hours with a standard deviation of 150 hours. Construct a 95% confidence interval estimate of the population mean life of batteries in this shipment.

Week 3-35
Example (continued)
 $\bar{X}$ = 400, S = 150, n = 81
 1 − α = 0.95, α = 0.05, so $Z_{\alpha/2} = Z_{0.025} = 1.96$
 The 95% confidence interval is:
$\bar{X} \pm Z_{\alpha/2}\,\frac{S}{\sqrt{n}} = 400 \pm (1.96)\frac{150}{\sqrt{81}} = 400 \pm 32.667 = (367.333,\ 432.667)$
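A short numeric check of the battery example, written step by step. The sketch uses the rounded table value Z = 1.96 so that the result matches the slide to three decimals; the variable names are illustrative.

```python
from math import sqrt

# Values from the battery example; Z = 1.96 is the rounded table value for 95%.
x_bar, s, n, z = 400.0, 150.0, 81, 1.96

standard_error = s / sqrt(n)   # S stands in for the unknown sigma (large n)
margin = z * standard_error    # critical value times standard error
lower, upper = x_bar - margin, x_bar + margin

print(f"margin = {margin:.3f}, interval = ({lower:.3f}, {upper:.3f})")
```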

Week 3-36
4th Condition: One Population Proportion
Week 3-37

Week 3-38
Confidence Intervals for the
Population Proportion, P
 An interval estimate for the population proportion (P) can be calculated by using the sample proportion (Ps)
 P is the proportion of the entire population that has a particular characteristic.

Week 3-39
Confidence Intervals for the
Population Proportion, P
 Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation (SD of proportion)
$\sigma_{P} = \sqrt{\frac{P(1-P)}{n}}$
 We will estimate this with sample data:
$\sqrt{\frac{P_s(1-P_s)}{n}}$

Week 3-40
Confidence Interval
 Upper and lower confidence limits for the population proportion are calculated with the formula
$P_s \pm Z_{\alpha/2}\sqrt{\frac{P_s(1-P_s)}{n}}$
 where
 $Z_{\alpha/2}$ is the standard normal value for the level of confidence desired
 Ps is the sample proportion
 n is the sample size

Week 3-41
Example
 A random sample of 100 people shows
that 25 are left-handed.
 Form a 95% confidence interval for the
true proportion of left-handers.

Week 3-42
Example (continued)
$P_s \pm Z_{\alpha/2}\sqrt{\frac{P_s(1-P_s)}{n}} = 25/100 \pm 1.96\sqrt{\frac{(0.25)(0.75)}{100}} = 0.25 \pm 1.96\,(0.0433) = (0.1651,\ 0.3349)$

Week 3-43
Interpretation
 We are 95% confident that the true percentage of
left-handers in the population is between
16.51% and 33.49%.
 Although the interval from 0.1651 to 0.3349
may or may not contain the true proportion,
95% of intervals formed from samples of size
100 in this manner will contain the true
proportion.
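The left-handers example can be reproduced with a small helper. A sketch, assuming the large-sample normal approximation the slides rely on; `proportion_interval` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def proportion_interval(successes, n, confidence_level=0.95):
    """Large-sample confidence interval for a population proportion."""
    p_s = successes / n                               # sample proportion
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    margin = z * sqrt(p_s * (1 - p_s) / n)            # Z times estimated SD
    return p_s - margin, p_s + margin

# 25 left-handers in a random sample of 100 people.
lower, upper = proportion_interval(successes=25, n=100)
print(f"({lower:.4f}, {upper:.4f})")
```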

Week 3-44
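Confidence Interval: summary of the four conditions
1. Population Mean, σ known: $\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$
2. Population Mean, σ unknown, n is small: $\bar{X} \pm t_{\alpha/2,\,n-1}\,\frac{S}{\sqrt{n}}$
3. Population Mean, σ unknown, n is large: $\bar{X} \pm Z_{\alpha/2}\,\frac{S}{\sqrt{n}}$
4. Population Proportion: $P_s \pm Z_{\alpha/2}\sqrt{\frac{P_s(1-P_s)}{n}}$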

Determining Sample Size
Week 3-45

Week 3-46
Determining Sample Size
For the Mean: $n = \frac{Z^2\sigma^2}{e^2}$
For the Proportion: $n = \frac{Z^2\,p(1-p)}{e^2}$

Week 3-47
Sampling Error
 The required sample size can be found to reach a desired Margin of Error (e) with a specified level of confidence (1 − α)
 The margin of error is also called Sampling Error
 the amount of imprecision in the estimate of the population parameter
 the amount added and subtracted to the point estimate to form the confidence interval (= critical value × standard error)

Week 3-48
Determining Sample Size For the Mean
Confidence interval: $\bar{X} \pm Z\frac{\sigma}{\sqrt{n}}$
Sampling Error (Margin of Error): $e = Z\frac{\sigma}{\sqrt{n}}$

Week 3-49
Determining Sample Size For the Mean (continued)
$e = Z\frac{\sigma}{\sqrt{n}}$
Now solve for n to get
$n = \frac{Z^2\sigma^2}{e^2}$

Week 3-50
 To determine the required sample size for the mean, you must know:
 The desired level of confidence (1 − α), which determines the critical Z value
 The acceptable sampling error (margin of error), e
 The standard deviation, σ

Week 3-51
Required Sample Size Example
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$n = \frac{Z^2\sigma^2}{e^2} = \frac{(1.645)^2(45)^2}{5^2} = 219.19$
So the required sample size is n = 220
(Always round up)
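The sample size formula for the mean, including the round-up step, can be sketched like this. The function name is illustrative, and Z = 1.645 is the 90% table value used on the slide.

```python
from math import ceil

def sample_size_for_mean(z, sigma, e):
    """Smallest n with margin of error at most e: n = Z^2 sigma^2 / e^2, rounded up."""
    return ceil((z * sigma / e) ** 2)

# sigma = 45, margin of error 5, 90% confidence (Z = 1.645).
n = sample_size_for_mean(z=1.645, sigma=45, e=5)
print(n)
```

Rounding up rather than to the nearest integer keeps the achieved margin of error at or below the target e.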

Week 3-52
If σ is unknown
 If unknown, σ can be estimated when
using the required sample size formula
 Use a value for σ that is expected to be at least
as large as the true σ
 Select a pilot sample and estimate σ with the
sample standard deviation, S

Week 3-53
Determining Sample Size For the Proportion
Confidence interval: $P_s \pm Z\sqrt{\frac{P_s(1-P_s)}{n}}$
Sampling Error (Margin of Error): $e = Z\sqrt{\frac{p(1-p)}{n}}$

Week 3-54
Determining Sample Size For the Proportion (continued)
$e = Z\sqrt{\frac{p(1-p)}{n}}$
Now solve for n to get
$n = \frac{Z^2\,p(1-p)}{e^2}$

Week 3-55
 To determine the required sample size for the proportion, you must know:
 The desired level of confidence (1 − α), which determines the critical Z value
 The acceptable sampling error (margin of error), e
 The true proportion of “successes”, p
 p can be estimated with a pilot sample, if necessary (or conservatively use p = 0.50)

Week 3-56
How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?
(Assume a pilot sample yields Ps = 0.12)
$n = \frac{Z^2\,p(1-p)}{e^2}$

Week 3-57
Solution:
For 95% confidence, use Z = 1.96
e = 0.03
Ps = 0.12, so use this to estimate p
$n = \frac{Z^2\,p(1-p)}{e^2} = \frac{(1.96)^2(0.12)(1-0.12)}{(0.03)^2} = 450.75$
So use n = 451
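The same calculation in code, together with the conservative p = 0.50 option mentioned earlier. A sketch with an illustrative function name; Z = 1.96 is the rounded 95% table value.

```python
from math import ceil

def sample_size_for_proportion(z, p, e):
    """n = Z^2 p (1 - p) / e^2, rounded up to the next whole observation."""
    return ceil(z ** 2 * p * (1 - p) / e ** 2)

# Pilot estimate p = 0.12, margin of error 0.03, 95% confidence.
n_pilot = sample_size_for_proportion(z=1.96, p=0.12, e=0.03)

# Conservative choice when no pilot estimate is available: p = 0.50.
n_conservative = sample_size_for_proportion(z=1.96, p=0.50, e=0.03)

print(n_pilot, n_conservative)
```

Because p(1 − p) is largest at p = 0.50, the conservative choice always gives the larger required sample.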

Week 3-58
Summary
 Introduced the concept of confidence intervals
 Discussed point estimates
 Developed confidence interval estimates for one
population mean and one population proportion
 Determined the required sample size for a specified level of accuracy and confidence
