Estimation
Getabalew E (MPH, Ph.D)
1
Dr. Getabalew
2/3/2023
Learning Objectives
• At the end of the class, the learners
will be able to:
• Define estimation
• Explain the types of estimation
• Apply the concepts of estimation
Estimation
• The process of drawing conclusions about an entire
population based on the data in a sample is known as
statistical inference.
• Estimation is the process of determining a likely value
for a variable in the survey population, based on
information collected from the sample.
• Estimation is the use of sample statistics to estimate
population parameters.
Example
• A sample survey revealed:
– Proportion of smokers among a certain group of
population aged 15 to 24.
– Mean of SBP among sampled population
– Prevalence of HIV-positive among people involved
in the study
The next question is: what can we infer about the
characteristics of the population from which the
sample was drawn?
Point and Interval Estimates
A point estimate is a single value used as an estimate of a population
parameter
An interval estimate is a range or interval of numbers believed to include
the unknown population parameter with a certain degree of assurance
The point estimate always lies within the interval estimate
[Figure: the point estimate sits between the lower confidence limit and the upper confidence limit; the span between the two limits is the interval estimate]
Estimation Process
[Figure: the population mean, µ, is unknown. A random sample gives a sample mean of X̄ = 50 (the point estimate). The interval estimate reads: "I am 95% confident that µ is between 40 and 60."]
1. Point Estimate
• A single numerical value used to estimate the
corresponding population parameter.
Sample Statistics are Estimators of Population Parameters
Sample statistic                       Population parameter
Sample mean, x̄                         µ
Sample variance, S²                    σ²
Sample proportion, p̂                   P or π
Sample odds ratio, ÔR                  OR
Sample relative risk, R̂R               RR
Sample correlation coefficient, r      ρ
a) Unbiasedness: An estimator is said to be
unbiased if its expected value is equal to the
population parameter it estimates.
For example, because E(X̄) = µ, the sample mean is an
unbiased estimator of the population mean
Unbiasedness is an average or long-run property.
The mean of any single sample will probably not
equal the population mean, but the average of the
means of all possible repeated independent samples from a
population will equal the population mean.
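The long-run property can be checked exactly on a toy case. The sketch below is only an illustration: the four-value population and the sample size of 2 are invented, and sampling is assumed to be with replacement so that the draws are independent. It lists every possible sample, then compares the average of all sample means with the population mean.

```python
from fractions import Fraction
from itertools import product

# Hypothetical toy population (values invented for illustration only).
population = [2, 4, 6, 8]
n = 2  # sample size

# Population mean, kept as an exact fraction to avoid rounding noise.
mu = Fraction(sum(population), len(population))

# Every possible ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))
sample_means = [Fraction(sum(s), n) for s in samples]

# Average of the sample means over all possible samples, i.e. E(X-bar).
expected_sample_mean = sum(sample_means, Fraction(0)) / len(sample_means)

# Count how many individual samples hit the population mean exactly.
exact_hits = sum(1 for m in sample_means if m == mu)

print("population mean:", mu)
print("mean of all sample means:", expected_sample_mean)
print("samples whose mean equals mu:", exact_hits, "of", len(samples))
```

Most individual samples miss the population mean, yet the average over all samples matches it, which is what unbiasedness claims.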
b) Minimum variance (efficiency):
An estimator with the minimum standard error
is a good estimator
For a symmetrical distribution such as the normal, the mean has the minimum
standard error, and
if the distribution is skewed, the median often has the smaller
standard error
c) Consistency: An estimator is said to be consistent if its
probability of being close to the parameter it estimates
increases as the sample size increases
[Figure: Consistency, comparing the sampling distribution of the estimator at n = 10 and n = 100]
2. Interval Estimation
Confidence Intervals
Give a plausible range of values of the estimate likely
to include the “true” (population) value with a given
confidence level.
An interval estimate provides more information about
a population characteristic than does a point estimate
Such interval estimates are called confidence
intervals.
General Formula:
The general formula for all CIs is:
point estimate ± (measure of how confident we
want to be) × (standard error)
• Point estimate: the value of the statistic in the sample
(e.g., mean, odds ratio, etc.)
• Measure of confidence (critical value): taken from a Z table
• Standard error: the standard error of the statistic
Lower limit = Point Estimate - (Critical Value) x (Standard Error)
Upper limit = Point Estimate + (Critical Value) x (Standard Error)
A CI in general:
Confidence in which the interval will contain the unknown
population parameter
– Based on observation from a sample
– Gives information about closeness to unknown
population parameters
– Stated in terms of level of confidence
• Never 100% sure
Also written (1 - α) = .95
A wide interval suggests imprecision of estimation.
A narrow CI reflects a large sample size, low
variability, or both.
Definition: 95% CI
When sampling is from a normally distributed population
with known standard deviation, we are 100(1-α)% [e.g.,
95%] confident that the single computed interval contains
the unknown population parameter.
1. CI for a Single Population Mean
A. Known variance (large sample size, normally
distributed)
Assumptions
Population standard deviation (σ) is known
Population is normally distributed
If the population is not normal, use a large
sample
• There are 3 elements to a CI:
1. Point estimate
2. SE of the point estimate
3. Confidence level;
A 100(1-α)% CI for µ is:
$$\bar{x} \pm z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
α is to be chosen by the researcher; the most common values of α are
0.05, 0.01 and 0.1.
3. Commonly used CLs are 90%, 95%, and
99%
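Each confidence level corresponds to a two-sided critical value from the standard normal distribution. As a minimal sketch, assuming Python's standard-library `statistics.NormalDist` is an acceptable stand-in for a printed Z table, the values can be looked up as follows; the helper name `z_critical` is invented for this illustration.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Two-sided critical value z_{alpha/2} for a given confidence level."""
    alpha = 1 - confidence_level
    # Leave alpha/2 in the upper tail of the standard normal curve.
    return NormalDist().inv_cdf(1 - alpha / 2)

for cl in (0.90, 0.95, 0.99):
    print(f"{cl:.0%} confidence level: z = {z_critical(cl):.3f}")
```

The three results round to the familiar table entries 1.645, 1.960 and 2.576.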
Example:
1. Waiting times (in hours) at a particular hospital
are believed to be approximately normally
distributed with a variance of 2.25 hr² (σ = 1.5 hr).
a. A sample of 20 outpatients revealed a mean waiting
time of 1.52 hours. Construct the 95% CI for the
estimate of the population mean.
b. Suppose that the mean of 1.52 hours had resulted
from a sample of 32 patients. Find the 95% CI.
c. What effect does larger sample size have on the CI?
a.
$$1.52 \pm 1.96\sqrt{\frac{2.25}{20}} \approx 1.52 \pm 1.96(0.33) \approx 1.52 \pm 0.65 = (0.87,\ 2.17)$$
• We are 95% confident that the true mean waiting time is between 0.87
and 2.17 hrs.
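• An incorrect interpretation is that there is a 95% probability that this
particular interval contains the true population mean. The 95% describes the
procedure: about 95% of intervals built this way from repeated samples would contain it.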
b.
$$1.52 \pm 1.96\sqrt{\frac{2.25}{32}} \approx 1.52 \pm 1.96(0.27) \approx 1.52 \pm 0.53 = (0.99,\ 2.05)$$
c. A larger sample size makes the CI narrower (more precise).
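Parts a to c can be reproduced in a few lines. This is a sketch under the example's own assumptions (normal population, known variance of 2.25); the function name `z_interval_for_mean` is invented here. Because the code carries the standard error at full precision instead of rounding it to two decimals, its limits can differ from the hand calculation in the second decimal place.

```python
import math
from statistics import NormalDist

def z_interval_for_mean(sample_mean, variance, n, confidence_level=0.95):
    """CI for a population mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    standard_error = math.sqrt(variance / n)
    margin = z * standard_error
    return sample_mean - margin, sample_mean + margin

ci_20 = z_interval_for_mean(1.52, 2.25, 20)  # part a: n = 20
ci_32 = z_interval_for_mean(1.52, 2.25, 32)  # part b: n = 32

print(f"n = 20: ({ci_20[0]:.2f}, {ci_20[1]:.2f})")
print(f"n = 32: ({ci_32[0]:.2f}, {ci_32[1]:.2f})")

# Part c: the interval built from the larger sample is narrower.
width_20 = ci_20[1] - ci_20[0]
width_32 = ci_32[1] - ci_32[0]
print("narrower with n = 32:", width_32 < width_20)
```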
Student’s t Distribution
• Bell Shaped
• Symmetric about zero (the mean)
• Flatter than the Normal (0,1). This means
– The variability of a t is greater than that of a Z
that is normal(0,1)
– Thus, there is more area under the tails and less
at center
– Because variability is greater, resulting
confidence intervals will be wider.
• Note: t approaches z as n increases
Student’s t Table
Example
• Standard error =
• t-value at 90% CL at 19 df =1.729
2. CI for the difference between
population means (normally distributed)
A. Known variances (2 independent samples)
• When σ1 and σ2 are known and both populations are
normal or both sample sizes are at least 30, the test
statistic is a z-value…
Examples
• We are interested in the similarity of the two groups.
1) Is mean blood pressure the same for males and
females?
2) Is body mass index (BMI) similar for breast cancer
cases versus non-cancer patients?
3) Is length of stay (LOS) for patients in hospital “A” the
same as that for similar patients in hospital “B”?
Example
• Researchers are interested in the difference between
serum uric acid levels in patients with and without
Down’s syndrome.
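• Patients without Down’s syndrome
– n = 12, sample mean = 4.5 mg/100 ml, σ² = 1.0
• Patients with Down’s syndrome
– n = 15, sample mean = 3.4 mg/100 ml, σ² = 1.5
• Calculate the 95% CI.
$$(4.5 - 3.4) \pm 1.96\sqrt{\frac{1.0}{12} + \frac{1.5}{15}} \approx 1.1 \pm 1.96(0.428) \approx 1.1 \pm 0.84 = (0.26,\ 1.94)$$
• We are 95% confident that the true difference
between the two population means is between 0.26
and 1.94 mg/100 ml.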
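The interval quoted above can be checked numerically. The following is a minimal sketch, assuming both population variances are known and the samples are independent; the function name `z_interval_for_mean_difference` is invented for this illustration.

```python
import math
from statistics import NormalDist

def z_interval_for_mean_difference(mean1, var1, n1, mean2, var2, n2,
                                   confidence_level=0.95):
    """CI for mu1 - mu2 with known population variances."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Standard error of the difference between two independent means.
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    difference = mean1 - mean2
    return difference - z * standard_error, difference + z * standard_error

# Serum uric acid example: n = 12, mean 4.5, variance 1.0
# versus n = 15, mean 3.4, variance 1.5.
lower, upper = z_interval_for_mean_difference(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"95% CI for the difference: ({lower:.2f}, {upper:.2f})")
```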
3. CIs for single population proportion, p
• Is based on three elements of CI.
– Point estimate
– SE of point estimate
– Confidence level
Example 1
A random sample of 100 people shows that 25
are left-handed. Form a 95% CI for the true
proportion of left-handers.
Interpretation: we are 95% confident that the true percentage of
left-handers in the population is between 16.51% and 33.49%.
Example 2
• It was found that 28.1% of 153 cervical-cancer cases
had never had a Pap smear prior to the time of case’s
diagnosis. Calculate a 95% CI for the percentage of
cervical-cancer cases who never had a Pap test.
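Both proportion examples follow the same recipe: point estimate ± critical value × standard error, with the standard error taken as the square root of p̂(1 - p̂)/n. The sketch below assumes the normal approximation to the binomial is adequate for these sample sizes; the function name `proportion_interval` is invented here.

```python
import math
from statistics import NormalDist

def proportion_interval(p_hat, n, confidence_level=0.95):
    """Normal-approximation CI for a single population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - z * standard_error, p_hat + z * standard_error

# Example 1: 25 left-handers in a sample of 100.
left_handed = proportion_interval(25 / 100, 100)
# Example 2: 28.1% of 153 cervical-cancer cases never had a Pap smear.
never_pap = proportion_interval(0.281, 153)

print(f"Example 1: ({left_handed[0]:.2%}, {left_handed[1]:.2%})")
print(f"Example 2: ({never_pap[0]:.1%}, {never_pap[1]:.1%})")
```

Example 1 reproduces the interval stated in the interpretation; Example 2 comes out at roughly 21% to 35%.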
4. Two Population Proportions
• We are often interested in comparing proportions
from 2 populations:
• Is the incidence of disease A the same in two
populations?
• Patients are treated with either drug D, or with
placebo. Is the proportion “improved” the same in
both groups?
Confidence Interval for
Two Population Proportions
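• SE of the difference =
$$\sqrt{\frac{\hat{p}_1(1-\hat{p}_1)}{n_1} + \frac{\hat{p}_2(1-\hat{p}_2)}{n_2}}$$
• The confidence interval for p1 – p2 is:
$$(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \times \text{SE}$$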
Example
• In a clinical trial for a new drug to treat hypertension,
N1 = 50 patients were randomly assigned to receive
the new drug, and N2 = 50 patients to receive a
placebo. 34 of the patients receiving the drug showed
improvement, while 15 of those receiving placebo
showed improvement.
• Compute a 95% CI estimate for the difference
between proportions improved.
• p1 = 34/50 = 0.68, p2 = 15/50 = 0.30
• The point estimate for the difference is:
= [0.68−0.30]=0.38
• SE of the difference = $\sqrt{\frac{0.68(0.32)}{50} + \frac{0.30(0.70)}{50}} = 0.0925$
• 95% CI
– Lower = ( point estimate ) - (Zα/2) (SE)
= 0.38 – (1.96)(0.0925) = 0.20
– Upper = ( point estimate ) + (Zα/2) (SE)
= 0.38 + (1.96)(0.0925) = 0.56
• 95% CI = (0.20, 0.56)
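The same calculation can be scripted. This is a sketch that assumes the unpooled standard error, which is the form consistent with the 0.0925 used above; the function name `interval_for_proportion_difference` is invented for this illustration.

```python
import math
from statistics import NormalDist

def interval_for_proportion_difference(x1, n1, x2, n2, confidence_level=0.95):
    """Normal-approximation CI for p1 - p2 from two independent samples."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    # Unpooled standard error of the difference between two proportions.
    standard_error = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    difference = p1 - p2
    limits = (difference - z * standard_error, difference + z * standard_error)
    return difference, standard_error, limits

# Hypertension trial: 34 of 50 improved on the drug, 15 of 50 on placebo.
difference, standard_error, (lower, upper) = interval_for_proportion_difference(
    34, 50, 15, 50
)
print(f"difference = {difference:.2f}, SE = {standard_error:.4f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f})")
```

Since the whole interval lies above zero, the data favour a higher improvement rate on the drug.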
Hypothesis Testing
• Hypothesis testing is one method of statistical inference
• A hypothesis is a claim (assumption) about a population parameter
• Hypotheses are formulated, experiments are performed,
and results are evaluated for their consistency (or
inconsistency) with a hypothesis.
• The purpose of hypothesis testing (HT) is to aid the clinician, researcher or
administrator in reaching a decision (conclusion)
concerning a population by examining a sample from that
population.
Types of Hypothesis
1. The Null Hypothesis, H0
Is a statement claiming that there is no difference
between the hypothesized value and the population
value.
(The effect of interest is zero = no difference)
States the assumption (hypothesis) to be tested
H0 is a statement of agreement (or no difference) and is
always about a population parameter, not about a
sample statistic
Always contains “=” , “ ≤” or “≥ ” sign
May or may not be rejected
Begin with the assumption that the Ho is true
– Similar to the notion of innocent until proven
guilty
2. The Alternative Hypothesis, HA
Is a statement of what we will believe is true if our
sample data causes us to reject Ho.
Is generally the hypothesis that is believed (or needs
to be supported) by the researcher
Is a statement that disagrees (opposes) with Ho
(The effect of interest is not zero)
Never contains “=” , “ ≤” or “≥ ” sign
May or may not be accepted
Steps in Hypothesis Testing
1. Formulate the appropriate statistical hypotheses clearly
• Specify HO and HA
H0: µ = µ0      H0: µ ≤ µ0      H0: µ ≥ µ0
H1: µ ≠ µ0      H1: µ > µ0      H1: µ < µ0
two-tailed      one-tailed      one-tailed
• Can we conclude that the proportion of patients with leukemia
who survive more than six years is not 60%?
Ho: ? HA: ?
• Can we conclude that a certain population mean is greater than
50?
Ho: ? HA: ?
2. State the assumptions necessary for computing probabilities
• A distribution is approximately normal (Gaussian)
• Variance is known or unknown
3. Select a sample and collect data
• Categorical, continuous
4. Decide on the appropriate test statistic for the hypothesis.
E.g., One population
OR
5. Specify the desired level of significance for the
statistical test (α = 0.05, 0.01, etc.)
6. Determine the critical value.
– A value the test statistic must attain to be
declared significant.
[Figure: standard normal critical values at α = 0.05: ±1.96 for a two-tailed test, 1.645 or -1.645 for a one-tailed test]
7. Obtain sample evidence and compute the test
statistic
8. Reach a decision and draw the conclusion
• If Ho is rejected, we conclude that HA is true
(or accepted).
• If Ho is not rejected, we conclude that Ho may
be true.
Rejection and Non-Rejection Regions
• The values of the test statistic assume the points on the
horizontal axis of the normal distribution and are
divided into two groups:
• Rejection region, and
• Non-rejection region.
Example: Two-sided test at α = 5%
[Figure: standard normal curve with a rejection region of area α/2 = 0.025 in each tail, beyond -1.96 and 1.96, and a non-rejection region of area 0.95 between them]
Statistical Decision
Reject Ho if the value of the test statistic that we
compute from our sample is one of the values in the
rejection region
Don’t reject Ho if the computed value of the test
statistic is one of the values in the non-rejection
region.
Level of Significance, α
Is the probability of rejecting a true Ho
For example, a significance level of 0.05 indicates a
5% risk of concluding that a difference exists when
there is no actual difference.
Alpha levels are controlled by the researcher and
are related to confidence level.
The alpha level is obtained by subtracting the
confidence level from 100% (e.g., 100% - 95% = 5%)
Another way to state conclusion
• Reject Ho if P-value < α
• Do not reject Ho if P-value ≥ α
The P-value is the probability of obtaining a test statistic
as extreme as or more extreme than the actual test
statistic obtained if the Ho is true
It indicates the strength of the evidence against the null
hypothesis
The larger the absolute value of the test statistic, the smaller the P-value.
Equivalently, the smaller the P-value, the stronger the evidence
against the Ho.
1. Hypothesis Testing of a Single Mean
(Normally Distributed)
1.1 Known Variance
Example: Two-Tailed Test
1. A simple random sample of 10 people from a certain
population has a mean age of 27. Can we conclude
that the mean age of the population is not 30? The
variance is known to be 20. Let α = .05.
• Answer, "Yes we can, if we can reject the Ho that it is
30."
A. Data
n = 10, sample mean = 27, σ² = 20, α = 0.05
B. Assumptions
Simple random sample
Normally distributed population
variance is known
C. Hypotheses
Ho: µ = 30
HA: µ ≠ 30
D. Test statistic
As the population variance is known, we use Z
as the test statistic.
$$z = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$$
E. Decision Rule
• Reject Ho if the Z value falls in the rejection region.
• Don’t reject Ho if the Z value falls in the non-rejection region.
• Because of the structure of Ho it is a two tail test. Therefore,
reject Ho if Z ≤ -1.96 or Z ≥ 1.96.
F. Calculation of test statistic
$$z = \frac{27 - 30}{\sqrt{20/10}} = \frac{-3}{1.4142} = -2.12$$
G. Statistical decision
We reject the Ho because Z = -2.12 is in the rejection
region. The value is significant at α = 5%.
H. Conclusion
We conclude that µ is not 30. P-value = 0.0340
A Z value of -2.12 corresponds to a tail area of 0.0170. Since there
are two parts to the rejection region in a two-tailed test, the P-value is
twice this, which is 0.0340.
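Steps F to H can be reproduced with the standard library. The sketch below assumes a normal population with known variance, as the example states; the function name `one_sample_z_test` is invented here. The P-value is computed from the unrounded Z, so it can differ from the table-based 0.0340 in the fourth decimal.

```python
import math
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, variance, n):
    """Two-tailed z test for a single mean with known population variance."""
    z = (sample_mean - mu0) / math.sqrt(variance / n)
    # Area in both tails beyond |z| under the standard normal curve.
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_tailed

z, p_value = one_sample_z_test(sample_mean=27, mu0=30, variance=20, n=10)
alpha = 0.05

print(f"z = {z:.2f}, two-tailed P-value = {p_value:.4f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```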
Example: One -Tailed Test
• A simple random sample of 10 people from a certain
population has a mean age of 27. Can we conclude that
the mean age of the population is less than 30? The
variance is known to be 20. Let α = 0.05.
• Data
n = 10, sample mean = 27, σ² = 20, α = 0.05
• Hypotheses
Ho: µ ?, HA: µ ?
• Test statistic
$$z = \frac{27 - 30}{\sqrt{20/10}} = -2.12$$
• Rejection Region
• With α = 0.05 and the inequality, we have the entire rejection region
at the left. The critical value will be Z = -1.645. Reject Ho if Z < -1.645.
[Figure: lower tail test]
• Statistical decision
– We reject the Ho because -2.12 < -1.645.
• Conclusion
– We conclude that µ < 30.
– p = .0170 this time because it is only a one tail test and not a two
tail test.
1.2 Unknown Variance
• In most practical applications the standard deviation of
the underlying population is not known
• In this case, σ can be estimated by the sample standard
deviation s.
• If the underlying population is normally distributed,
then the test statistic is:
$$t = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$$
which follows Student's t distribution with n - 1 degrees of freedom.
Example: Two-Tailed Test
• A simple random sample of 14 people from a certain population
gives a sample mean body mass index (BMI) of 30.5 and sd of
10.64. Can we conclude that the BMI is not 35 at α = 5%?
• Ho: µ = 35, HA: µ ≠ 35
• Test statistic
$$t = \frac{30.5 - 35}{10.64/\sqrt{14}} = \frac{-4.5}{2.84} = -1.58$$
• If the assumptions are correct and Ho is true, the test statistic
follows Student's t distribution with 13 degrees of freedom.
• Decision rule
– We have a two tailed test. With α = 0.05 it means that each tail is
0.025. The critical t values with 13 df are -2.1604 and 2.1604.
– We reject Ho if t ≤ -2.1604 or t ≥ 2.1604.
• Do not reject Ho because t = -1.58 is not in the rejection
region. Based on the data of the sample, it is possible
that µ = 35. P-value = 0.1375
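The decision can be verified with a short script. Python's standard library has no Student's t distribution, so this sketch does not recompute the P-value; it only computes the test statistic and compares it with the tabulated critical value 2.1604 quoted above. The function and constant names are invented for this illustration.

```python
import math

def one_sample_t_statistic(sample_mean, mu0, sample_sd, n):
    """t statistic for a single mean when the population variance is unknown."""
    return (sample_mean - mu0) / (sample_sd / math.sqrt(n))

# Two-tailed critical value for alpha = 0.05 and 13 df, as quoted in the text.
T_CRITICAL_13_DF = 2.1604

t = one_sample_t_statistic(sample_mean=30.5, mu0=35, sample_sd=10.64, n=14)
reject = abs(t) >= T_CRITICAL_13_DF

print(f"t = {t:.2f} with {14 - 1} degrees of freedom")
print("reject Ho" if reject else "do not reject Ho")
```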
Two Population Means, Independent
Samples
2.1 Known Variances
(Independent Samples)
• When two independent samples are drawn
from normally distributed populations with
known variances, the test statistic for testing
the Ho of equal population means is:
$$z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$$
Example:
• Researchers wish to know whether there is a difference in mean serum
uric acid (SUA) levels between normal individuals and
individuals with Down’s syndrome. The mean SUA
levels of 12 individuals with Down’s syndrome and 15
normal individuals are 4.5 and 3.4 mg/100 ml,
respectively, with variances σ1² = 1 and σ2² = 1.5.
Is there a difference between the means of both groups
at α = 5%?
• Hypotheses:
Ho: µ1- µ2 = 0 or Ho: µ1 = µ2
HA: µ1 - µ2 ≠ 0 or HA: µ1 ≠ µ2
• With α = 0.05, the critical values of Z are -1.96 and
+1.96. We reject Ho if Z < -1.96 or Z > +1.96.
• Test statistic
$$z = \frac{(4.5 - 3.4) - 0}{\sqrt{\frac{1}{12} + \frac{1.5}{15}}} = \frac{1.1}{0.428} = 2.57$$
• Reject Ho because 2.57 > 1.96.
• From these data, it can be concluded that the
population means are not equal. A 95% CI would
give the same conclusion. P-value = 0.01.
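The remark that a 95% CI gives the same conclusion can be checked directly: the test rejects Ho at α = 0.05 exactly when the 95% CI for µ1 - µ2 excludes zero. The sketch below assumes known variances and independent samples; the function name `two_sample_z_test` is invented here.

```python
import math
from statistics import NormalDist

def two_sample_z_test(mean1, var1, n1, mean2, var2, n2, confidence_level=0.95):
    """Two-tailed z test for mu1 = mu2, plus the matching CI for mu1 - mu2."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    difference = mean1 - mean2
    z = difference / standard_error
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))
    z_crit = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    limits = (difference - z_crit * standard_error,
              difference + z_crit * standard_error)
    return z, p_two_tailed, limits

z, p_value, (lower, upper) = two_sample_z_test(4.5, 1.0, 12, 3.4, 1.5, 15)
ci_excludes_zero = not (lower <= 0 <= upper)

print(f"z = {z:.2f}, P-value = {p_value:.3f}")
print(f"95% CI = ({lower:.2f}, {upper:.2f}), excludes zero: {ci_excludes_zero}")
```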
2.2 Unknown Variances
i. Equal variances (Independent samples)
• With equal population variances, we can
obtain a pooled value from the sample
variances.
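• The test statistic for µ1 - µ2 is:
$$t = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$$
• Where the critical value tα/2 has (n1 + n2 – 2) df, and the pooled variance is
$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$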
Example:
• We wish to know if we may conclude, at the 95%
confidence level, that smokers, in general, have
greater lung damage than do non-smokers.
• Calculation of Pooled
Variance
• Hypotheses:
Ho: µ1 ≤ µ2, HA: µ1 > µ2
• With α = 0.05 and df = 23, the critical value of t is 1.7139. We
reject Ho if t > 1.7139.
• Test statistic
• Reject Ho because 2.6563 > 1.7139. On
the basis of the data, we conclude that µ1 >
µ2.
3. Hypothesis Testing about a Single
Population Proportion
(Normal Approximation to Binomial Distribution)
• Involves categorical values
• Two possible outcomes
– “Success” (possesses a certain
characteristic)
– “Failure” (does not possess that
characteristic)
• Fraction or proportion of population in the
“success” category is denoted by p
Example
• In the general population of 0 to 4-year-olds, the annual
incidence of asthma is 1.4%. If 10 cases of asthma are observed
over a single year in a sample of 500 children whose mothers
smoke, can we conclude that this is different from the
underlying probability of p0 = 0.014? α = 5%
H0 : p = 0.014
HA: p ≠ 0.014
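• The test statistic is given by:
$$z = \frac{\hat{p} - p_0}{\sqrt{\frac{p_0(1 - p_0)}{n}}} = \frac{0.02 - 0.014}{\sqrt{\frac{0.014(0.986)}{500}}} = 1.14$$
• The critical value of Zα/2 at α=5% is ±1.96.
• Don’t reject Ho since Z = 1.14 lies in the non-rejection
region between ±1.96.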
• P-value = 0.2548
• We do not have sufficient evidence to conclude that
the probability of developing asthma for children
whose mothers smoke in the home is different from
the probability in the general population
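The asthma example can be checked as follows. This is a sketch that assumes the normal approximation to the binomial and uses the hypothesised p0 in the standard error, as is usual for this test; the function name `one_proportion_z_test` is invented here. Computed from the unrounded Z, the two-tailed P-value comes out near 0.25.

```python
import math
from statistics import NormalDist

def one_proportion_z_test(x, n, p0):
    """Two-tailed z test of Ho: p = p0 using the normal approximation."""
    p_hat = x / n
    # Under Ho the standard error is based on p0, not on p_hat.
    standard_error = math.sqrt(p0 * (1 - p0) / n)
    z = (p_hat - p0) / standard_error
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_tailed

z, p_value = one_proportion_z_test(x=10, n=500, p0=0.014)
alpha = 0.05

print(f"z = {z:.2f}, two-tailed P-value = {p_value:.2f}")
print("reject Ho" if p_value < alpha else "do not reject Ho")
```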
4. Hypothesis Tests about the Difference
Between
Two Population Proportions
$$z = \frac{\hat{p}_1 - \hat{p}_2}{\sqrt{\hat{p}(1 - \hat{p})\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}}, \qquad \hat{p} = \frac{X_1 + X_2}{n_1 + n_2}$$
Where X1 = the observed number of events in the first sample
and X2 = the observed number of events in the second sample
Example
• A study was conducted to investigate the
possible cause of gastroenteritis outbreak
following a lunch served in a high school
cafeteria. Among the 225 students who ate the
sandwiches, 109 became ill, while among the
38 students who did not eat the sandwiches, 4
became ill. Is there a significant difference
between the two groups at α = 5%?
• We wish to test
Ho: p1 = p2 against the alternative
HA: p1 ≠ p2
• Assume that the sample sizes are large
enough, and the normal approximation to
the binomial distribution is valid.
• If the Ho is true, then p1 = p2 = p
The area under the standard normal curve to the
right of 4.36 is less than 0.0001. Therefore, p <
0.0002. We reject H0 at the 0.05 level.
The proportion of students who became ill
differs in the two groups; those who ate the
prepared sandwiches were more likely to
develop gastroenteritis.
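The gastroenteritis comparison can be reproduced with a pooled two-proportion z test. The sketch assumes the pooled standard error that follows from taking p1 = p2 = p under Ho; the function name `two_proportion_z_test` is invented here. The computed Z lands close to the 4.36 quoted above, with small differences due to rounding.

```python
import math
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Two-tailed z test of Ho: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    # Under Ho both groups share one proportion, estimated from all the data.
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / standard_error
    p_two_tailed = 2 * NormalDist().cdf(-abs(z))
    return z, p_two_tailed

# 109 of 225 sandwich eaters became ill versus 4 of 38 non-eaters.
z, p_value = two_proportion_z_test(109, 225, 4, 38)

print(f"z = {z:.2f}")
print("P-value below 0.0002:", p_value < 0.0002)
```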
Types of Errors in Hypothesis
Tests
• Whenever we reject or fail to reject the Ho, we
risk committing an error.
• Two types of errors can be committed.
– Type I Error
– Type II Error
Type I Error
• The error committed when a true Ho is rejected
• Considered a serious type of error
• The probability of a type I error is the probability of
rejecting the Ho when it is true
• The probability of type I error is α
• Called level of significance of the test
• Set by researcher in advance
Type II Error
• The error committed when a false Ho is not rejected
• The probability of Type II Error is β
Power
• The probability of rejecting the Ho when it is false.
Power = 1 – β = 1- probability of type II error
• We would like to maintain low probability of a
Type I error (α) and low probability of a Type II
error (β) [high power = 1 - β].
Action (Conclusion)    Reality: Ho True                   Reality: Ho False
Do not reject Ho       Correct action                     Type II error (β)
                       (Prob. = 1-α)                      (Prob. = β = 1-Power)
Reject Ho              Type I error (α)                   Correct action
                       (Prob. = α = Sign. level)          (Prob. = Power = 1-β)
Thank you
Estimation and hypothesis test lecture.pdf

  • 1. Estimation Getabalew E (MPH, Ph.D) 1 Dr. Getabalew 2/3/2023
  • 2. Learning Objectives • At the end of the class, the learners will be able to: • Define estimation • Explain the types of estimation • Apply the concepts of estimation 2/3/2023 Dr. Getabalew 2
  • 3. • The process of drawing conclusions about an entire population based on the data in a sample is known as statistical inference. • Estimation is the process of determining a likely value for a variable in the survey population, based on information collected from the sample. • Estimation is the use of sample statistics to estimate population parameters. Estimation 3 Dr. Getabalew 2/3/2023
  • 4. Example • A sample survey revealed: – Proportion of smokers among a certain group of population aged 15 to 24. – Mean of SBP among sampled population – Prevalence of HIV-positive among people involved in the study The next question is what can we predict about the characteristics of the population from which the sample was drawn 4 Dr. Getabalew 2/3/2023
  • 5. Point and Interval Estimates A point estimate is a single value used as an estimate of a population parameter Interval estimate is a range or interval of numbers believed to include unknown population parameter with a certain degree of assurance Point estimate is always within the interval estimate Point Estimate Lower Confidence Limit Upper Confidence Limit Interval estimate 2/3/2023 Dr. Getabalew 5
  • 6. Estimation Process Mean, , is Population unknown Random X = 50 S a m p l e Interval estimate I am 95% confident that is between 40 & 60. Point estimate Mean 2/3/2023 Dr. Getabalew 6
  • 7. 1. Point Estimate • A single numerical value used to estimate the corresponding population parameter. Sample Statistics are Estimators of Population Parameters Sample mean, Sample variance, S2 Sample proportion, Sample Odds Ratio, OŔ Sample Relative Risk, RŔ Sample correlation coefficient, r µ 2 P or π OR RR ρ 7 Dr. Getabalew 2/3/2023
  • 8. a) Unbiasedness: An estimator is said to be unbiased if its expected value is equal to the population parameter it estimates. For example: when E(X ) ,the sample mean is an unbiased estimator of the population mean Unbiasedness is an average or long-run property. The mean of any single sample will probably not equal to the population mean, but the average of the means of repeated independent samples from a population will equal to the population mean. 2/3/2023 Dr. Getabalew 8
  • 9. b) Minimum variance: (Efficiency) An estimate which has a minimum standard error is a good estimator For symmetrical distribution the mean has a mini mum standard error and If the distribution is skewed the median has a mi nimum standard error 2/3/2023 Dr. Getabalew 9
  • 10. C) Consistency: An estimator is said to be consistent if its probability of being close to the parameter it estimates increases as the sample size increases n = 100 n = 10 Consistency 2/3/2023 Dr. Getabalew 10
  • 11. 2. Interval Estimation Confidence Intervals Give a plausible range of values of the estimate likely to include the “true” (population) value with a given confidence level. An interval estimate provides more information about a population characteristic than does a point estimate Such interval estimates are called confidence intervals. 11 Dr. Getabalew 2/3/2023
  • 12. General Formula: The general formula for all CIs is: point estimate (measure of how confident we want to be) (standard error) The value of the statistic in my sample (eg., mean, odds ratio, etc.) From a Z table Standard error of the statistic. Lower limit = Point Estimate - (Critical Value) x (Standard Error) Upper limit = Point Estimate + (Critical Value) x (Standard Error) 12 Dr. Getabalew 2/3/2023
  • 13. A CI in general: Confidence in which the interval will contain the unknown population parameter – Based on observation from a sample – Gives information about closeness to unknown population parameters – Stated in terms of level of confidence • Never 100% sure Also written (1 - α) = .95 A wide interval suggests imprecision of estimation. Narrow CI widths reflects large sample size or low variability or both. 13 Dr. Getabalew 2/3/2023
  • 14. Definition: 95% CI When sampling is from a normally distributed population with known standard deviation, we are 100 (1-α) [e.g., 95%] confident that the single computed interval contains the unknown population parameter. 14 Dr. Getabalew 2/3/2023
  • 17. 1. CI for a Single Population Mean A. Known variance (large sample size, normally distributed) Assumptions Population standard deviation ( ) is known Population is normally distributed If population is not normal, use large sample 17 Dr. Getabalew 2/3/2023
  • 18. • There are 3 elements to a CI: 1. Point estimate 2. SE of the point estimate 3. Confidence level; A 100(1- )% C.I. for is: is to be chosen by the researcher, most common values of are 0.05, 0.01 and 0.1. 18 Dr. Getabalew 2/3/2023
  • 19. 3. Commonly used CLs are 90%, 95%, and 99%
  • 20. Example: 1. Waiting times (in hours) at a particular hospital are believed to be approximately normally distributed with a variance of 2.25 hr². a. A sample of 20 outpatients revealed a mean waiting time of 1.52 hours. Construct the 95% CI for the estimate of the population mean. b. Suppose that the mean of 1.52 hours had resulted from a sample of 32 patients. Find the 95% CI. c. What effect does a larger sample size have on the CI?
  • 21. a. $1.52 \pm 1.96\sqrt{2.25/20} = 1.52 \pm 1.96(0.33) = 1.52 \pm 0.65 = (0.87, 2.17)$ • We are 95% confident that the true mean waiting time is between 0.87 and 2.17 hrs. • An incorrect interpretation is that there is 95% probability that this interval contains the true population mean. b. $1.52 \pm 1.96\sqrt{2.25/32} = 1.52 \pm 1.96(0.27) = 1.52 \pm 0.53 = (0.99, 2.05)$ c. A larger sample size makes the CI narrower (more precise).
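The waiting-time example can be checked with a short sketch. The helper name `mean_ci_known_variance` is hypothetical. Because the code does not round the standard error at intermediate steps, its limits differ slightly from the hand calculation on the slide (roughly 0.86 to 2.18 for n = 20 and 1.00 to 2.04 for n = 32).

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_known_variance(sample_mean, variance, n, confidence=0.95):
    """CI for a population mean when the population variance is known."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    se = sqrt(variance / n)                             # standard error of the mean
    return sample_mean - z * se, sample_mean + z * se

# Values from the example: mean 1.52 hours, variance 2.25, n = 20 and n = 32.
ci_n20 = mean_ci_known_variance(1.52, 2.25, 20)
ci_n32 = mean_ci_known_variance(1.52, 2.25, 32)
print([round(v, 2) for v in ci_n20])
print([round(v, 2) for v in ci_n32])
```

The second interval is narrower than the first, which is the answer to part c.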
  • 23. Student’s t Distribution • Bell shaped • Symmetric about zero (the mean) • Flatter than the Normal(0,1). This means: – The variability of a t is greater than that of a Z that is Normal(0,1) – Thus, there is more area under the tails and less at the center – Because variability is greater, the resulting confidence intervals will be wider.
  • 24. • Note: t approaches z as n increases
  • 25. Student’s t Table
  • 26. Example • Standard error = • t-value at 90% CL at 19 df = 1.729
  • 28. 2. CI for the difference between population means (normally distributed). A. Known variances (2 independent samples) • When σ1 and σ2 are known and both populations are normal or both sample sizes are at least 30, the test statistic is a z-value…
  • 29. Examples • We are interested in the similarity of the two groups. 1) Is mean blood pressure the same for males and females? 2) Is body mass index (BMI) similar for breast cancer cases versus non-cancer patients? 3) Is length of stay (LOS) for patients in hospital “A” the same as that for similar patients in hospital “B”?
  • 30. Example • Researchers are interested in the difference between serum uric acid levels in patients with and without Down’s syndrome. • Patients without Down’s syndrome – n = 12, sample mean = 4.5 mg/100 ml, σ² = 1.0 • Patients with Down’s syndrome – n = 15, sample mean = 3.4 mg/100 ml, σ² = 1.5 • Calculate the 95% CI: $(4.5 - 3.4) \pm 1.96\sqrt{1.0/12 + 1.5/15} = 1.1 \pm 0.84 = (0.26, 1.94)$ • We are 95% confident that the true difference between the two population means is between 0.26 and 1.94 mg/100 ml.
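A minimal sketch of the same calculation, assuming the known-variance formula for two independent samples. The function name `diff_means_ci` is an assumption introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(mean1, var1, n1, mean2, var2, n2, confidence=0.95):
    """CI for mu1 - mu2 with known population variances (independent samples)."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(var1 / n1 + var2 / n2)  # standard error of the difference in means
    diff = mean1 - mean2
    return diff - z * se, diff + z * se

# Serum uric acid example: means 4.5 and 3.4, variances 1.0 and 1.5, n = 12 and 15.
lower, upper = diff_means_ci(4.5, 1.0, 12, 3.4, 1.5, 15)
print(round(lower, 2), round(upper, 2))
```

The interval excludes zero, which agrees with the later hypothesis test on the same data rejecting equal means at α = 0.05.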
  • 35. 3. CIs for a single population proportion, p • Based on the three elements of a CI: – Point estimate – SE of the point estimate – Confidence level
  • 37. Example 1: A random sample of 100 people shows that 25 are left-handed. Form a 95% CI for the true proportion of left-handers. $\hat{p} = 25/100 = 0.25$; $0.25 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 0.0849 = (0.1651, 0.3349)$. Interpretation: we are 95% confident that the true percentage of left-handers in the population is between 16.51% and 33.49%.
  • 38. Example 2 • It was found that 28.1% of 153 cervical-cancer cases had never had a Pap smear prior to the time of the case’s diagnosis. Calculate a 95% CI for the percentage of cervical-cancer cases who never had a Pap test.
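Both proportion examples can be worked with one small helper. This is a sketch assuming the usual normal-approximation interval for a proportion, and `proportion_ci` is a hypothetical name. The result for Example 2 is computed here and is not taken from the slides.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(p_hat, n, confidence=0.95):
    """Normal-approximation CI for a single population proportion."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p_hat * (1 - p_hat) / n)  # standard error of the sample proportion
    return p_hat - z * se, p_hat + z * se

# Example 1: 25 left-handers out of 100 people.
left_handed_ci = proportion_ci(25 / 100, 100)
# Example 2: 28.1% of 153 cervical-cancer cases never had a Pap smear.
pap_ci = proportion_ci(0.281, 153)
print([round(v, 4) for v in left_handed_ci])
print([round(v, 3) for v in pap_ci])
```

For Example 2 this gives an interval of roughly 21% to 35%.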
  • 39. 4. Two Population Proportions • We are often interested in comparing proportions from 2 populations: • Is the incidence of disease A the same in two populations? • Patients are treated with either drug D or placebo. Is the proportion “improved” the same in both groups?
  • 40. Confidence Interval for Two Population Proportions • SE of the difference = $\sqrt{\hat{p}_1(1-\hat{p}_1)/n_1 + \hat{p}_2(1-\hat{p}_2)/n_2}$ • The confidence interval for p1 – p2 is: $(\hat{p}_1 - \hat{p}_2) \pm z_{\alpha/2} \times SE$
  • 41. Example • In a clinical trial for a new drug to treat hypertension, N1 = 50 patients were randomly assigned to receive the new drug, and N2 = 50 patients to receive a placebo. 34 of the patients receiving the drug showed improvement, while 15 of those receiving placebo showed improvement. • Compute a 95% CI estimate for the difference between proportions improved.
  • 42. • p1 = 34/50 = 0.68, p2 = 15/50 = 0.30 • The point estimate for the difference is: p1 − p2 = 0.68 − 0.30 = 0.38 • SE of the difference = $\sqrt{0.68(0.32)/50 + 0.30(0.70)/50} = 0.0925$ • 95% CI – Lower = (point estimate) - (Zα/2)(SE) = 0.38 – (1.96)(0.0925) = 0.20 – Upper = (point estimate) + (Zα/2)(SE) = 0.38 + (1.96)(0.0925) = 0.56 • 95% CI = (0.20, 0.56)
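The drug-versus-placebo interval can be reproduced with the sketch below. It assumes the unpooled standard error used in the worked example, and the function name `diff_proportions_ci` is an assumption introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(x1, n1, x2, n2, confidence=0.95):
    """CI for p1 - p2 using the unpooled standard error."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    se = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se, se

# 34 of 50 improved on the drug, 15 of 50 improved on placebo.
lower, upper, se = diff_proportions_ci(34, 50, 15, 50)
print(round(se, 4), round(lower, 2), round(upper, 2))
```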
  • 44. Hypothesis testing (HT): • One way of making statistical inferences • A hypothesis is a claim (assumption) about a population parameter • Hypotheses are formulated, experiments are performed, and results are evaluated for their consistency (or inconsistency) with a hypothesis. • The purpose of HT is to aid the clinician, researcher or administrator in reaching a decision (conclusion) concerning a population by examining a sample from that population.
  • 45. Types of Hypothesis. 1. The Null Hypothesis, H0: • Is a statement claiming that there is no difference between the hypothesized value and the population value (the effect of interest is zero = no difference) • States the assumption (hypothesis) to be tested • Is a statement of agreement (or no difference), and is always about a population parameter, not about a sample statistic
  • 46. • Always contains an “=”, “≤” or “≥” sign • May or may not be rejected • Testing begins with the assumption that Ho is true – Similar to the notion of innocent until proven guilty
  • 47. 2. The Alternative Hypothesis, HA: • Is a statement of what we will believe is true if our sample data cause us to reject Ho • Is generally the hypothesis that is believed (or needs to be supported) by the researcher • Is a statement that disagrees with (opposes) Ho (the effect of interest is not zero) • Never contains an “=”, “≤” or “≥” sign • May or may not be accepted
  • 48. Steps in Hypothesis Testing. 1. Formulate the appropriate statistical hypotheses clearly • Specify Ho and HA: – H0: µ = µ0 vs H1: µ ≠ µ0 (two-tailed) – H0: µ ≤ µ0 vs H1: µ > µ0 (one-tailed) – H0: µ ≥ µ0 vs H1: µ < µ0 (one-tailed) • Can we conclude that the proportion of patients with leukemia who survive more than six years is not 60%? Ho: ? HA: ? • Can we conclude that a certain population mean is greater than 50? Ho: ? HA: ?
  • 49. 2. State the assumptions necessary for computing probabilities • A distribution is approximately normal (Gaussian) • Variance is known or unknown 3. Select a sample and collect data • Categorical, continuous 4. Decide on the appropriate test statistic for the hypothesis. E.g., One population OR
  • 50. 5. Specify the desired level of significance for the statistical test (α = 0.05, 0.01, etc.) 6. Determine the critical value. – A value the test statistic must attain to be declared significant. For example, with α = 0.05 the standard normal critical values are ±1.96 for a two-tailed test, and 1.645 or -1.645 for a one-tailed test.
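As a quick check of these critical values, the sketch below computes them from the inverse of the standard normal distribution instead of reading them from a Z table. The variable names are illustrative.

```python
from statistics import NormalDist

alpha = 0.05
standard_normal = NormalDist()  # mean 0, standard deviation 1

# Two-tailed test: alpha is split equally between the two tails.
two_tailed_critical = standard_normal.inv_cdf(1 - alpha / 2)
# One-tailed test: all of alpha sits in a single tail.
one_tailed_critical = standard_normal.inv_cdf(1 - alpha)

print(round(two_tailed_critical, 2), round(one_tailed_critical, 3))
```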
  • 51. 7. Obtain sample evidence and compute the test statistic 8. Reach a decision and draw the conclusion • If Ho is rejected, we conclude that HA is true (or accepted). • If Ho is not rejected, we conclude that Ho may be true.
  • 52. Rejection and Non-Rejection Regions • The possible values of the test statistic lie along the horizontal axis of the normal distribution and are divided into two groups: • Rejection region, and • Non-rejection region.
  • 53. Example: Two-sided test at α = 5%. [Figure: standard normal curve with rejection regions of area α/2 = 0.025 below -1.96 and above 1.96, and a non-rejection region of area 0.95 between them]
  • 54. Statistical Decision • Reject Ho if the value of the test statistic that we compute from our sample is one of the values in the rejection region. • Don’t reject Ho if the computed value of the test statistic is one of the values in the non-rejection region.
  • 55. Level of Significance, α • Is the probability of rejecting a true Ho • For example, a significance level of 0.05 indicates a 5% risk of concluding that a difference exists when there is no actual difference. • Alpha levels are controlled by the researcher and are related to the confidence level. • The alpha level is obtained by subtracting the confidence level from 100%.
  • 57. Another way to state the conclusion • Reject Ho if P-value < α • Do not reject Ho if P-value ≥ α • The P-value is the probability of obtaining a test statistic as extreme as or more extreme than the one actually obtained, if Ho is true. • It measures the strength of the evidence against Ho. • The larger the absolute value of the test statistic, the smaller the P-value; equivalently, the smaller the P-value, the stronger the evidence against Ho.
  • 58. 1. Hypothesis Testing of a Single Mean (Normally Distributed)
  • 59. 1.1 Known Variance
  • 60. Example: Two-Tailed Test. 1. A simple random sample of 10 people from a certain population has a mean age of 27. Can we conclude that the mean age of the population is not 30? The variance is known to be 20. Let α = 0.05. • Answer: “Yes we can, if we can reject the Ho that it is 30.” A. Data: n = 10, sample mean = 27, σ² = 20, α = 0.05 B. Assumptions: simple random sample; normally distributed population; variance is known
  • 61. C. Hypotheses: Ho: µ = 30, HA: µ ≠ 30 D. Test statistic: As the population variance is known, we use Z as the test statistic: $Z = (\bar{x} - \mu_0)/(\sigma/\sqrt{n})$
  • 62. E. Decision Rule • Reject Ho if the Z value falls in the rejection region. • Don’t reject Ho if the Z value falls in the non-rejection region. • Because of the structure of Ho, it is a two-tailed test. Therefore, reject Ho if Z ≤ -1.96 or Z ≥ 1.96.
  • 63. F. Calculation of test statistic: $Z = (27 - 30)/\sqrt{20/10} = -2.12$ G. Statistical decision: We reject the Ho because Z = -2.12 is in the rejection region. The value is significant at α = 5%. H. Conclusion: We conclude that µ is not 30. P-value = 0.0340. A Z value of -2.12 corresponds to a tail area of 0.0170. Since there are two parts to the rejection region in a two-tailed test, the P-value is twice this, which is 0.0340.
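The steps of this two-tailed example can be sketched as follows. The function name `one_sample_z_test` is hypothetical, and the P-value is computed from the unrounded Z, so it comes out at about 0.0339 rather than the table-based 0.0340.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_z_test(sample_mean, mu0, variance, n):
    """Return the Z statistic and two-sided P-value for Ho: mu = mu0 (known variance)."""
    z = (sample_mean - mu0) / sqrt(variance / n)
    # Two-sided P-value: twice the tail area beyond |z|.
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# Age example: n = 10, sample mean = 27, hypothesized mean = 30, variance = 20.
z, p_value = one_sample_z_test(27, 30, 20, 10)
reject_h0 = abs(z) >= 1.96  # decision rule at alpha = 0.05
print(round(z, 2), round(p_value, 4), reject_h0)
```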
  • 64. Example: One-Tailed Test • A simple random sample of 10 people from a certain population has a mean age of 27. Can we conclude that the mean age of the population is less than 30? The variance is known to be 20. Let α = 0.05. • Data: n = 10, sample mean = 27, σ² = 20, α = 0.05 • Hypotheses: Ho: µ ?, HA: µ ?
  • 65. • Test statistic: $Z = (27 - 30)/\sqrt{20/10} = -2.12$ • Rejection region • With α = 0.05 and the “<” inequality in HA, the entire rejection region is in the left tail. The critical value is Z = -1.645. Reject Ho if Z < -1.645 (lower-tail test).
  • 66. • Statistical decision – We reject the Ho because -2.12 < -1.645. • Conclusion – We conclude that µ < 30. – P-value = 0.0170 this time, because it is a one-tailed test and not a two-tailed test.
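For the lower-tail version of the same example, only the decision rule and the P-value change. A minimal sketch, with illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

# Same data as before: n = 10, sample mean = 27, hypothesized mean = 30, variance = 20.
z = (27 - 30) / sqrt(20 / 10)

# Lower-tail test: the P-value is the area to the left of z only.
p_one_tailed = NormalDist().cdf(z)
critical_value = NormalDist().inv_cdf(0.05)  # about -1.645
reject_h0 = z < critical_value

print(round(z, 2), round(p_one_tailed, 4), reject_h0)
```

The one-tailed P-value is half of the two-tailed one because only one tail of the distribution counts as evidence against Ho.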
  • 67. 1.2 Unknown Variance • In most practical applications the standard deviation of the underlying population is not known. • In this case, σ can be estimated by the sample standard deviation s. • If the underlying population is normally distributed, then the test statistic is: $t = (\bar{x} - \mu_0)/(s/\sqrt{n})$, with n - 1 degrees of freedom.
  • 68. Example: Two-Tailed Test • A simple random sample of 14 people from a certain population gives a sample mean body mass index (BMI) of 30.5 and an sd of 10.64. Can we conclude that the mean BMI is not 35 at α = 5%? • Ho: µ = 35, HA: µ ≠ 35 • Test statistic: $t = (30.5 - 35)/(10.64/\sqrt{14}) = -1.58$ • If the assumptions are correct and Ho is true, the test statistic follows Student's t distribution with 13 degrees of freedom.
  • 69. • Decision rule – We have a two-tailed test. With α = 0.05, each tail has area 0.025. The critical t values with 13 df are -2.1604 and 2.1604. – We reject Ho if t ≤ -2.1604 or t ≥ 2.1604. • Do not reject Ho because -1.58 is not in the rejection region. Based on the sample data, it is possible that µ = 35. P-value = 0.1375
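The BMI example can be sketched as below. Python's standard library has no Student's t distribution, so this sketch takes the critical value 2.1604 (13 df, α = 0.05, two-tailed) directly from the slide and only computes the test statistic. The function name `one_sample_t_statistic` is an assumption.

```python
from math import sqrt

def one_sample_t_statistic(sample_mean, mu0, sample_sd, n):
    """t statistic for Ho: mu = mu0 when the population variance is unknown."""
    return (sample_mean - mu0) / (sample_sd / sqrt(n))

# BMI example: n = 14, sample mean = 30.5, sd = 10.64, hypothesized mean = 35.
t_stat = one_sample_t_statistic(30.5, 35, 10.64, 14)
degrees_of_freedom = 14 - 1

# Critical value quoted in the example for 13 df, two-tailed, alpha = 0.05.
T_CRITICAL = 2.1604
reject_h0 = abs(t_stat) >= T_CRITICAL

print(round(t_stat, 2), degrees_of_freedom, reject_h0)
```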
  • 70. 2. Hypothesis Testing of Two Population Means, Independent Samples
  • 71. 2.1 Known Variances (Independent Samples) • When two independent samples are drawn from normally distributed populations with known variances, the test statistic for testing the Ho of equal population means is: $Z = ((\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2))/\sqrt{\sigma_1^2/n_1 + \sigma_2^2/n_2}$
  • 72. Example: • Researchers wish to know whether there is a difference in mean serum uric acid (SUA) levels between normal individuals and individuals with Down’s syndrome. The mean SUA levels of 12 individuals with Down’s syndrome and 15 normal individuals are 4.5 and 3.4 mg/100 ml, respectively, with variances σ1² = 1 and σ2² = 1.5, respectively. Is there a difference between the means of the two groups at α = 5%? • Hypotheses: Ho: µ1 - µ2 = 0 or Ho: µ1 = µ2; HA: µ1 - µ2 ≠ 0 or HA: µ1 ≠ µ2
  • 73. • With α = 0.05, the critical values of Z are -1.96 and +1.96. We reject Ho if Z < -1.96 or Z > +1.96. • Test statistic: $Z = (4.5 - 3.4)/\sqrt{1/12 + 1.5/15} = 2.57$ • Reject Ho because 2.57 > 1.96. • From these data, it can be concluded that the population means are not equal. A 95% CI would give the same conclusion. P-value = 0.01.
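A minimal sketch of this two-sample Z test, assuming known variances and independent samples. The name `two_sample_z_test` is introduced only for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_sample_z_test(mean1, var1, n1, mean2, var2, n2):
    """Z statistic and two-sided P-value for Ho: mu1 = mu2 (known variances)."""
    se = sqrt(var1 / n1 + var2 / n2)
    z = (mean1 - mean2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# SUA example: means 4.5 and 3.4, variances 1 and 1.5, sample sizes 12 and 15.
z, p_value = two_sample_z_test(4.5, 1.0, 12, 3.4, 1.5, 15)
reject_h0 = abs(z) > 1.96
print(round(z, 2), round(p_value, 2), reject_h0)
```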
  • 74. 2.2 Unknown Variances. i. Equal variances (Independent samples) • With equal population variances, we can obtain a pooled value from the sample variances. • The test statistic for µ1 - µ2 is: $t = ((\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2))/(s_p\sqrt{1/n_1 + 1/n_2})$ • where t has (n1 + n2 – 2) df, and the pooled variance is $s_p^2 = ((n_1 - 1)s_1^2 + (n_2 - 1)s_2^2)/(n_1 + n_2 - 2)$
  • 75. Example: • We wish to know if we may conclude, at the 95% confidence level, that smokers, in general, have greater lung damage than do non-smokers. • Calculation of Pooled Variance
  • 76. • Hypotheses: Ho: µ1 ≤ µ2 (µ1 - µ2 ≤ 0), HA: µ1 > µ2 • With α = 0.05 and df = 23, the critical value of t is 1.7139. We reject Ho if t > 1.7139. • Test statistic: t = 2.6563 • Reject Ho because 2.6563 > 1.7139. On the basis of the data, we conclude that µ1 > µ2.
  • 77. 3. Hypothesis Testing about a Single Population Proportion (Normal Approximation to Binomial Distribution) • Involves categorical values • Two possible outcomes – “Success” (possesses a certain characteristic) – “Failure” (does not possess that characteristic) • The fraction or proportion of the population in the “success” category is denoted by p
  • 78. Example • In the general population of 0 to 4-year-olds, the annual incidence of asthma is 1.4%. If 10 cases of asthma are observed over a single year in a sample of 500 children whose mothers smoke, can we conclude that this is different from the underlying probability of p0 = 0.014? α = 5%. H0: p = 0.014, HA: p ≠ 0.014
  • 79. • The test statistic is given by: $Z = (\hat{p} - p_0)/\sqrt{p_0(1 - p_0)/n} = (0.02 - 0.014)/\sqrt{0.014(0.986)/500} = 1.14$
  • 80. • The critical value of Zα/2 at α = 5% is ±1.96. • Don’t reject Ho since Z (= 1.14) lies in the non-rejection region between ±1.96. • P-value ≈ 0.25 • We do not have sufficient evidence to conclude that the probability of developing asthma for children whose mothers smoke in the home is different from the probability in the general population.
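The asthma example can be reproduced with the sketch below. It assumes the normal approximation to the binomial, with the standard error computed under Ho (using p0), and `one_sample_proportion_z_test` is a hypothetical name.

```python
from math import sqrt
from statistics import NormalDist

def one_sample_proportion_z_test(successes, n, p0):
    """Z statistic and two-sided P-value for Ho: p = p0 (normal approximation)."""
    p_hat = successes / n
    se = sqrt(p0 * (1 - p0) / n)  # standard error under the null hypothesis
    z = (p_hat - p0) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# 10 asthma cases among 500 children; general-population incidence p0 = 0.014.
z, p_value = one_sample_proportion_z_test(10, 500, 0.014)
reject_h0 = abs(z) > 1.96
print(round(z, 2), round(p_value, 2), reject_h0)
```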
  • 81. 4. Hypothesis Tests about the Difference Between Two Population Proportions
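  • 82. The test statistic is $Z = (\hat{p}_1 - \hat{p}_2)/\sqrt{\hat{p}(1 - \hat{p})(1/n_1 + 1/n_2)}$, with pooled proportion $\hat{p} = (X_1 + X_2)/(n_1 + n_2)$, where X1 = the observed number of events in the first sample and X2 = the observed number of events in the second sample.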
  • 84. Example • A study was conducted to investigate the possible cause of a gastroenteritis outbreak following a lunch served in a high school cafeteria. Among the 225 students who ate the sandwiches, 109 became ill, while among the 38 students who did not eat the sandwiches, 4 became ill. Is there a significant difference between the two groups at α = 5%? • We wish to test Ho: p1 = p2 against the alternative HA: p1 ≠ p2
  • 86. • Assume that the sample sizes are large enough, and the normal approximation to the binomial distribution is valid. • If the Ho is true, then p1 = p2 = p
  • 87. The area under the standard normal curve to the right of 4.36 is less than 0.0001. Therefore, for the two-sided test, p < 0.0002. We reject H0 at the 0.05 level. The proportion of students who became ill differs between the two groups; those who ate the prepared sandwiches were more likely to develop gastroenteritis.
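The gastroenteritis example can be sketched as follows, assuming the pooled-proportion Z test implied by "if the Ho is true, then p1 = p2 = p". The function name `two_proportion_z_test` is an assumption introduced for illustration.

```python
from math import sqrt
from statistics import NormalDist

def two_proportion_z_test(x1, n1, x2, n2):
    """Z statistic and two-sided P-value for Ho: p1 = p2 using the pooled proportion."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)  # common proportion estimated under Ho
    se = sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    p_value = 2 * NormalDist().cdf(-abs(z))
    return z, p_value

# 109 of 225 sandwich eaters became ill; 4 of 38 non-eaters became ill.
z, p_value = two_proportion_z_test(109, 225, 4, 38)
print(round(z, 2), p_value < 0.0002)
```

Without intermediate rounding, Z comes out at about 4.37, which agrees with the slide's 4.36 up to rounding.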
  • 88. Types of Errors in Hypothesis Tests • Whenever we reject or fail to reject the Ho, we may commit an error. • Two types of errors can be committed: – Type I Error – Type II Error
  • 89. Type I Error • The error committed when a true Ho is rejected • Considered a serious type of error • The probability of a type I error is the probability of rejecting the Ho when it is true • The probability of type I error is α • Called level of significance of the test • Set by researcher in advance
  • 90. Type II Error • The error committed when a false Ho is not rejected • The probability of a Type II error is β • Power is the probability of rejecting the Ho when it is false: Power = 1 – β = 1 - probability of type II error • We would like to maintain a low probability of a Type I error (α) and a low probability of a Type II error (β) [high power = 1 - β].
  • 91. Action (conclusion) versus reality: • Do not reject Ho: correct action if Ho is true (Prob. = 1-α); Type II error if Ho is false (Prob. = β = 1-Power) • Reject Ho: Type I error if Ho is true (Prob. = α = significance level); correct action if Ho is false (Prob. = Power = 1-β)
  • 92. Thank you