1. Cihan University
Biomedical Department
Third Stage
2022-2023
Bio Statistics Course
First Semester
Hazhar T. A. Blbas
Statistics Department - MSc at UCF in 2014
Statistics Department – PhD Students at SUE
Founder and CEO of STAT Office for Data Analysis and Training Statistical
FB: SOSDAT
2. Definitions
Statistics
is the science of conducting studies to collect,
organize, summarize, analyze, and draw
conclusions from data.
Biostatistics
is the application of statistics to a wide range of
topics in biology.
3. Applications of Biostatistics
•Public health, including epidemiology, health
services research, nutrition and environmental health
•Design and analysis of clinical trials in medicine,
genomics, population genetics and statistical
genetics.
•Ecology, ecological forecasting
•Biological sequence analysis
4. Null and Alternative Hypotheses
Convert the research question to null and
alternative hypotheses
• The null hypothesis (H0) is a claim of “no
difference in the population”
• The alternative hypothesis (Ha) claims “H0 is
false”
5. • The first step in the procedure is to state the hypotheses in null
and alternative forms. The null hypothesis (H0, read “H naught”) is a
statement of no difference. The alternative hypothesis (Ha, read
“H sub a”) is a statement of difference.
• The null hypothesis is the statement that you want to test. In
general, the null hypothesis is that things are the same as each
other, or the same as a theoretical expectation. For example, if
you measure the size of the feet of male and female chickens,
the null hypothesis could be that the average foot size in male
chickens is the same as the average foot size in female
chickens.
• The alternative hypothesis is that things are different from
each other, or different from a theoretical expectation. For
example, one alternative hypothesis would be that male
chickens have a different average foot size than female
chickens.
6. Two types of Error
Type I Error
In a hypothesis test, a type I error occurs when the
null hypothesis is rejected when it is in fact true
(we reject H0 while H0 is true); that is, H0 is
wrongly rejected.
For example, in a clinical trial of a new drug, the
null hypothesis might be that the new drug is no
better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs
on average. A type I error would occur if we concluded
that the two drugs differ when in fact they do not.
7. Type II Error
In a hypothesis test, a type II error occurs when the
null hypothesis H0 is not rejected when it is in fact
false (we fail to reject H0 while H0 is false).
For example, in a clinical trial of a new drug, the
null hypothesis might be that the new drug is no
better, on average, than the current drug; i.e.
H0: there is no difference between the two drugs
on average. A type II error would occur if we concluded
that the two drugs do not differ when in fact they do.
8. Alpha (α) and Beta (β) Errors
               Truth
Test Result    H0 correct          H0 wrong
Accept H0      OK (1 − α)          Type II (β error)
Reject H0      Type I (α error)    OK (1 − β)
α ≡ probability of a Type I error
β ≡ probability of a Type II error
9. H0: Innocent
Jury Trial
                      Actual Situation
Verdict / Judgment    Innocent    Guilty
Innocent              Correct     Error
Guilty                Error       Correct
Hypothesis Test
               Actual Situation
Decision       H0 True                               H0 False
Accept H0      1 − α                                 Type II Error (β), False Negative
Reject H0      Type I Error (α), False Positive      Power (1 − β)
10. H0: Healthy Person (No Covid19)
Doctor Trial
               Actual Situation
Doctor         No Covid19    Covid19
No Covid19     Correct       Error
Covid19        Error         Correct
Hypothesis Test
               Actual Situation
Decision       H0 True                               H0 False
Accept H0      1 − α                                 Type II Error (β), False Negative
Reject H0      Type I Error (α), False Positive      Power (1 − β)
Reference: Hazhar Blbas
11. Explanation of the Type I and Type II error
a) H0: The person is healthy (No Covid19)
H1: The person is unhealthy (Covid19)
b) A Type I error is a false positive. It has been decided that the
person has coronavirus disease when she/he does not.
c) A Type II error is a false negative. It has been decided that the
person is healthy when they actually have coronavirus
disease.
d) A Type I error would require more testing, resulting in time
and money lost. A Type II error would mean that the person did
not receive the treatment they needed. A Type II error is much
worse.
12. H0: No Covid19
Doctor Trial
               Actual Situation
Doctor         No Covid19    Covid19
No Covid19     Correct       Error
Covid19        Error         Correct
Hypothesis Test
               Actual Situation
Decision       H0 True                               H0 False
Accept H0      1 − α                                 Type II Error (β), False Negative
Reject H0      Type I Error (α), False Positive      Power (1 − β)
Reference: Hazhar Blbas
13. Example
a) H0: The person is healthy
H1: The person has Alzheimer’s disease
b) A Type I error is a false positive. It has been decided that
the person has Alzheimer’s disease when he/she doesn’t.
c) A Type II error is a false negative. It has been decided that
the person is healthy when he/she actually has Alzheimer’s
disease.
d) A Type I error would require more testing, resulting in time
and money lost. A Type II error would mean that the person
did not receive the treatment they needed. A Type II error is
much worse.
e) The power of this test is the ability of the test to detect
patients with Alzheimer’s disease. In this case, with
P(Type II error) = 0.08, the power is
1 − P(Type II error) = 1 − 0.08 = 0.92.
14. 1. Type I error can only occur if H0 is true
2. Type II error can only occur if H0 is false
3. There is a tradeoff between type I and II errors.
If the probability of type I error (α) is increased,
then the probability of type II error (β) declines.
4. When the difference between the hypothesized
parameter and the actual true value is small, the
probability of type II error (the non-rejection
region) is larger.
5. Increasing the sample size, n, for a given level of
α, reduces β
Type I and Type II errors cannot happen at the same time
15. Significance Level
𝛼 ≡ probability of a Type I error
𝛼 = Pr(reject H0 | H0 true)
(the “|” is read as “given”)
𝛼 is also called the “size of the critical
region”. Commonly used values:
𝛼 = 0.05
𝛼 = 0.01
16. Power of Test
β ≡ probability of a Type II error
β = Pr(accept H0 | H0 false)
(the “|” is read as “given”)
The power of a statistical hypothesis test measures
the test's ability to reject the null hypothesis H0 when
it is actually false, that is, to make a correct decision.
1 − β = “Power” ≡ probability of avoiding a Type II error
1 − β = Pr(reject H0 | H0 false)
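The relationship between α, β, sample size and power can be checked numerically. The sketch below is illustrative only: it assumes a right-tailed z-test with a known σ, and the function name `power_one_sided_z` and the numbers (μ0 = 100, true mean 105, σ = 15) are hypothetical, not taken from the course.

```python
from statistics import NormalDist

def power_one_sided_z(mu0, mu1, sigma, n, alpha=0.05):
    """Power of a right-tailed z-test of H0: mu = mu0 when the true mean is mu1.

    Assumes a known population standard deviation sigma (illustrative setup).
    """
    z = NormalDist()                           # standard normal distribution
    z_crit = z.inv_cdf(1 - alpha)              # reject H0 when z > z_crit
    shift = (mu1 - mu0) / (sigma / n ** 0.5)   # true effect in standard-error units
    beta = z.cdf(z_crit - shift)               # Pr(do not reject H0 | H0 false)
    return 1 - beta

# Hypothetical numbers chosen only to show the pattern
for alpha, n in [(0.05, 25), (0.01, 25), (0.05, 100)]:
    p = power_one_sided_z(mu0=100, mu1=105, sigma=15, n=n, alpha=alpha)
    print(f"alpha={alpha}, n={n}: beta={1 - p:.3f}, power={p:.3f}")
```

With these made-up values, lowering α from 0.05 to 0.01 raises β, and raising n from 25 to 100 lowers β, which matches the tradeoff described on the earlier slide.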
17. Type I error vs Type II error
Basis for comparison | Type I error | Type II error
Definition | Type I error, in statistical hypothesis testing, is the error caused by rejecting a null hypothesis when it is true. | Type II error is the error that occurs when the null hypothesis is accepted when it is not true.
Also termed | Type I error is equivalent to a false positive. | Type II error is equivalent to a false negative.
Meaning | It is a false rejection of a true hypothesis. | It is the false acceptance of an incorrect hypothesis.
Symbol | Type I error is denoted by α. | Type II error is denoted by β.
Probability | The probability of type I error is equal to the level of significance. | The probability of type II error is equal to one minus the power of the test.
18. Type I error vs Type II error
Basis for comparison | Type I error | Type II error
Reduced | It can be reduced by decreasing the level of significance. | It can be reduced by increasing the level of significance.
Cause | It is caused by luck or chance. | It is caused by a smaller sample size or a less powerful test.
What is it? | Type I error is similar to a false hit. | Type II error is similar to a miss.
Hypothesis | Type I error is associated with rejecting the null hypothesis. | Type II error is associated with rejecting the alternative hypothesis.
When does it happen? | It happens when the acceptance levels are set too lenient. | It happens when the acceptance levels are set too stringent.
19. Types of t-test
• One-sample t-test compares one sample mean
with a hypothesized value
• Paired sample t-test (dependent sample)
compares the means of two dependent (paired) samples
• Independent sample t-test compares the means
of two independent samples
– Equal variance
– Unequal variance
22. Steps of One Sample T-Test
1. State the null hypothesis and
the alternative hypothesis
2. Choose a significance level
3. Determine the critical region
4. Compute the test statistic
5. Make a decision: reject the null hypothesis if the test
statistic t computed in step 4 falls in the rejection
region for the test; otherwise, do not reject the null
hypothesis.
23. One Sample t-test
Hypothesis Testing:
- An unknown population standard deviation requires a t-test
- Comparison of one sample mean to a specific value
- Assumptions: the dependent variable is scale,
randomization, normal distribution
We can calculate the one-sample t-test by hand and
with SPSS
$$t = \frac{M - \mu_0}{s_M} = \frac{M - \mu_0}{s / \sqrt{n}}$$
24. Five steps to find one-sample t-test for
a population mean (Slide 1 of 5)
• Step 1: State hypotheses
The null hypothesis is H0: μ = μ0 (the real mean equals
some proposed theoretical constant μ0);
The alternative hypothesis is one of the following:
Ha: μ ≠ μ0 (Two Tailed)
Ha: μ < μ0 (Left Tailed)
Ha: μ > μ0 (Right Tailed)
• Step 2 Decide on the significance level, α
25. Five steps to find one-sample t-test for
a population mean (Slide 2 of 5)
• Step 3 The critical values are
±t(α/2) (Two Tailed)
−t(α) (Left Tailed)
+t(α) (Right Tailed)
with df = n − 1.
26. Five steps to find one-sample t-test for
a population mean (Slide 3 of 5)
• Finding Critical Values
A portion of the t distribution table
For example, with alpha = .05 and a sample size of 6,
for two tails: df = n − 1 = 6 − 1 = 5. The table gives 2.571.
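When a printed t table is not at hand, the same critical values can be approximated numerically. The following is a rough standard-library sketch, not the method used in the course: it integrates the Student t density with Simpson's rule and searches for the cutoff by bisection. The helper names `t_pdf`, `t_cdf` and `t_critical` are made up for illustration, and the approach is only suitable for moderate df (`math.gamma` overflows for very large df).

```python
import math

def t_pdf(x, df):
    """Density of the Student t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), integrating the density on [0, |x|] with Simpson's rule."""
    a = abs(x)
    h = a / steps                       # steps must be even for Simpson's rule
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area   # the density is symmetric about 0

def t_critical(alpha, df, tails=2):
    """Positive critical value for a one- or two-tailed test, found by bisection."""
    target = 1 - alpha / tails
    lo, hi = 0.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

print(round(t_critical(0.05, 5), 3))           # two-tailed, df = 5
print(round(t_critical(0.05, 9, tails=1), 3))  # one-tailed, df = 9
```

For a left-tailed test the critical value is the negative of the one-tailed result.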
27. Five steps to find one-sample t-test for
a population mean (Slide 4 of 5)
• Finding Critical Values
The t-distribution for df = 3, 2-tailed α = 0.10
28. Five steps to find one-sample t-test for
a population mean (Slide 5 of 5)
• Step 4 Compute the value of the test
statistic
$$t = \frac{M - \mu_0}{s_M} = \frac{M - \mu_0}{s / \sqrt{n}}$$
• Step 5 If the value of the test statistic falls in
the rejection region (for a two-tailed test, if the absolute
value of the test statistic is greater than the critical value),
reject H0; otherwise do not reject H0.
29. Summary of hypothesis-testing
• The null hypothesis H0: μ = μ0
Type | Conditions | Test Statistic
z-test | μ0 is known; σ is known | $z = \frac{M - \mu_0}{\sigma_M} = \frac{M - \mu_0}{\sigma / \sqrt{n}}$
t-test | μ0 is hypothesized or predicted; σ is unknown | $t = \frac{M - \mu_0}{s_M} = \frac{M - \mu_0}{s / \sqrt{n}}$
30. Example 1:
Testing whether light bulbs have a life of 1000 hours at α = 0.05
800, 750, 940, 970, 790, 980, 820, 760, 1000, 860
Assumptions: dependent variable is scale, Randomization,
Normal Distribution
• Step 1 State hypotheses
– Null hypothesis is H0: μ = 1000.
– Alternative hypothesis is H1: μ ≠ 1000.
• Step 2 Set alpha. α = .05
• Step 3 Determine the critical value. Looking for alpha = .05,
two tails with df = 10 − 1 = 9. Critical value = 2.262.
31. • Step 4 Calculate the test statistic
What is the mean of our sample?
$$M = \bar{X} = \frac{\sum x_i}{n} = 867$$
What is the standard deviation for our sample of light
bulbs?
$$S = \sqrt{\frac{\sum (X - \bar{X})^2}{n - 1}} = 96.73$$
$$t = \frac{M - \mu_0}{S / \sqrt{n}} = \frac{867 - 1000}{30.59} = -4.35$$
32. • Step 5 State the decision rule: if the absolute value of the
test statistic is greater than the critical value, reject the null.
We reject the null hypothesis (test statistic = |−4.35| >
critical value = 2.262) that the bulbs were drawn from a
population in which the average life is 1000 hrs.
The difference between our sample mean (867) and the
hypothesized population mean (1000) is so large that it is
unlikely that our sample was drawn from a
population with an average life of 1000 hours.
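The hand calculation for the light bulb example can be reproduced in a few lines. This is a minimal sketch using only the Python standard library; the function name `one_sample_t` is hypothetical, and the critical value 2.262 is the table value from Step 3 rather than something the code computes.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return (mean, sd, standard error, t) for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)      # sample SD, n - 1 in the denominator
    se = sd / math.sqrt(n)             # standard error of the mean
    t = (mean - mu0) / se
    return mean, sd, se, t

bulbs = [800, 750, 940, 970, 790, 980, 820, 760, 1000, 860]
mean, sd, se, t = one_sample_t(bulbs, mu0=1000)

critical = 2.262                        # two-tailed, alpha = .05, df = 9 (from the table)
reject = abs(t) > critical

print(f"M = {mean}, SD = {sd:.2f}, SE = {se:.2f}, t = {t:.3f}")
print("Reject H0" if reject else "Do not reject H0")
```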
33. • SPSS Steps:-
Click Analyze, Compare Means, One-Sample T Test. Select light
bulb (name of variables) and put it in the Test Variables box.
Type 1000 in the Test Value box. Click OK. You get the output
on this slide
Because the p-value (Sig. (2-tailed)) is less than .05, we reject H0. So, it’s significant.
One-Sample Statistics
            N     Mean        Std. Deviation    Std. Error Mean
BULBLIFE    10    867.0000    96.7299           30.5887
One-Sample Test (Test Value = 1000)
            t         df    Sig. (2-tailed)    Mean Difference    95% CI of the Difference (Lower, Upper)
BULBLIFE    -4.348    9     .002               -133.0000          -202.1964, -63.8036
34. Example 2:
The mean emission of all engines of a new design
needs to be below 20 ppm if the design is to meet new
emission requirements. Ten engines are manufactured
for testing purposes, and the emission level of each is
determined.
15.6, 16.2, 22.5, 20.5, 16.4, 19.4, 16.6, 17.9, 12.7, 13.9
Do the data supply sufficient evidence to conclude that this type
of engine meets the new standard, with alpha = 0.05?
Assumptions: dependent variable is scale, Randomization,
Normal Distribution
35. Step 1 State hypotheses
H0: Emissions are equal to (or greater than) 20 ppm;
H1: Emissions are less than 20 ppm (One-Tailed Test)
Step 2 Set alpha. α = .05
Step 3 Determine the critical value. Looking for alpha = .05,
one tail with df = 10 − 1 = 9. Critical value = −1.833.
Step 4 Calculate the test statistic
M = 17.17 ; SD = 2.98 ; SM = 0.942 ; t statistic = −3.00
Step 5 Decision
We reject H0 because the test statistic falls in the left-tail
rejection region (t = −3.00 < critical value = −1.833). The data
provide evidence that the mean emission is less than 20 ppm,
so this type of engine meets the new standard.
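As a check on Step 4, the arithmetic can be written out with the rounded summary values reported above (M = 17.17, SD = 2.98, n = 10, μ0 = 20):

$$S_M = \frac{SD}{\sqrt{n}} = \frac{2.98}{\sqrt{10}} \approx 0.942, \qquad t = \frac{M - \mu_0}{S_M} = \frac{17.17 - 20}{0.942} \approx -3.00$$

Because the test is left-tailed, the relevant comparison is t ≈ −3.00 < −1.833, which places the statistic in the rejection region.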
36. Example 3:
An outbreak of Salmonella-related illness was attributed to
ice cream produced at a certain factory. Scientists measured
the level of Salmonella in 9 randomly sampled batches of
ice cream. The levels (in MPN/g) were:
0.593 0.142 0.329 0.691 0.231 0.793
0.519 0.392 0.418
Is there evidence that the mean level of Salmonella in the
ice cream is greater than 0.3 MPN/g, at alpha = 0.05?
Step 1 State hypotheses
H0: μ = 0.3
Ha: μ > 0.3
37. Step 2 Set alpha. α = .05
Step 3 Determine the critical value. Looking for alpha =
.05, one tail with df = 9 − 1 = 8. Critical value = 1.860.
Step 4 Calculate the test statistic
M = 0.456 ; SD = 0.213 ; SM = 0.071 ; t statistic = 2.197
Step 5 Decision
Since the t statistic (2.197) is greater than the critical value
(1.860), we reject H0. There is moderately strong evidence that the
mean Salmonella level in the ice cream is above 0.3 MPN/g.
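The three worked examples use three different rejection regions (two-tailed, left-tailed, right-tailed). The sketch below makes the decision rule explicit for each case. The helper `reject_h0` is a hypothetical name, and the t statistics and critical values are the ones reported in the examples, entered by hand.

```python
def reject_h0(t_stat, critical, tail):
    """Apply the decision rule for a t-test.

    critical is the positive table value; tail is "two", "left" or "right".
    """
    c = abs(critical)
    if tail == "two":
        return abs(t_stat) > c      # reject in either tail
    if tail == "left":
        return t_stat < -c          # reject only for large negative t
    if tail == "right":
        return t_stat > c           # reject only for large positive t
    raise ValueError("tail must be 'two', 'left' or 'right'")

# Values copied from the worked examples
print(reject_h0(-4.35, 2.262, "two"))    # light bulbs
print(reject_h0(-3.00, 1.833, "left"))   # engine emissions
print(reject_h0(2.197, 1.860, "right"))  # Salmonella
# Direction matters in a one-tailed test: a large positive t does not
# support a left-tailed alternative, even though |t| exceeds the critical value.
print(reject_h0(3.00, 1.833, "left"))
```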
38. Example:
Homework 1:
We want to test whether a new headache medicine
provides a relief time equal to or different from the
standard of 100 minutes.
90 93 93 99 98 100 103 104 99 102
Homework 2:
You are conducting an experiment to see if a
given therapy works to reduce test anxiety. A
standard measure of test anxiety is known to
produce µ = 20. In the sample of 81 that you draw,
the mean is M = 18 with s = 9.
39. Homework 3:
Chapter 7 - Page 218 Fundamental Biostatistics-
Rosner
Cardiology A topic of recent clinical interest is
the possibility of using drugs to reduce infarct
size in patients who have had a myocardial
infarction within the past 24 hours. Suppose we
know that in untreated patients the mean
infarct size is 25 (ck-g-EQ/m2). Furthermore,
in 8 patients treated with a drug the mean
infarct size is 16 with a standard deviation of 10.
Is the drug effective in reducing infarct size?
40. Homework 4:
Chapter 7 - Page 218 Fundamental Biostatistics-
Rosner
• Cardiovascular Disease, Pediatrics Suppose
the mean cholesterol level of 10 children
whose fathers died from heart disease is 200
mg/dL and the sample standard deviation is
50 mg/dL.
Test the hypothesis that the mean cholesterol
level is higher in this group than in the
general population (mean 175 mg/dL).
41. t-tests with Two Samples
A- Independent sample t-test compares the
means of two independent samples
Equal variance
Unequal variance
B- Paired sample t-test (Dependent Sample t-test)
compares the means of two dependent (paired) samples
43. Steps of Independent Samples t-test
1. State the null hypothesis 𝐻0: 𝑀1 = 𝑀2
and the alternative hypothesis 𝐻1: 𝑀1 ≠ 𝑀2
2. Choose a significance level
3. Determine the critical value:
Critical value (α, df = n1 + n2 − 2)
44. 4. Compute the test statistic. It is used when we have two
independent samples, e.g., treatment and control groups.
• Formula is:
$$t = \frac{\bar{X}_1 - \bar{X}_2}{SE_{diff}}, \qquad SE_{diff} = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}}$$
• Terms in the numerator are the sample means.
• Term in the denominator is the standard error of the
difference between means.
5.Make a decision, reject the null hypothesis if the test
statistic t computed in step 4 falls in the rejection region
for the test; otherwise, do not reject the null hypothesis.
45. Example 1:
Suppose we study the effect of
caffeine on a motor test where the task is to keep
the mouse centered on a moving dot. Everyone
gets a drink; half get caffeine, half get placebo;
nobody knows who got what. Use alpha = 0.05.
Explain: So let’s say we do the following study. We bring in our
volunteers and give each of them a psychomotor test where they
use a mouse to keep a dot centered on a computer screen target
that keeps moving away (pursuit task). One hour before the test,
both groups get an oral dose of a drug. For every other person
(1/2 of the people), the drug is caffeine. For the other half, it’s a
placebo. Nobody in the study knows who got what. All take the
test. The results are in the slide.
46. Independent Sample Data (Data are time off task)
Experimental (Caffeine) Control (No Caffeine)
12 21
14 18
10 14
8 20
16 11
5 19
3 8
9 12
11 13
15
N1=9, M1=9.778, SD1=4.1164 N2=10, M2=15.1, SD2=4.2805
47. 1. State Hypotheses.
Null Hypothesis: H0: μ1 = μ2.
Alternative Hypothesis: H1: μ1 ≠ μ2.
2. Set alpha. Alpha = .05
3. Determine the critical value.
α is 0.05, 2 tails, and d.f. = n1 + n2 − 2 = 9 + 10 − 2 = 17.
So, the two-tailed critical value (α = 0.05, df = 17) is 2.11.
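Steps 4 and 5 for the caffeine example are not written out in these slides. A minimal sketch of the calculation, using only the Python standard library, is given here. The function name `independent_t` and its `pooled` switch are illustrative assumptions: the default follows the standard-error formula from the steps slide, and the pooled option is the usual equal-variance alternative.

```python
import math
import statistics

def independent_t(x1, x2, pooled=False):
    """Return (t, SE_diff) for two independent samples."""
    n1, n2 = len(x1), len(x2)
    v1, v2 = statistics.variance(x1), statistics.variance(x2)  # sample variances (n - 1)
    if pooled:
        # Equal-variance version: pool the two variances first
        sp2 = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2)
        se_diff = math.sqrt(sp2 * (1 / n1 + 1 / n2))
    else:
        # SE_diff = sqrt(SD1^2 / n1 + SD2^2 / n2)
        se_diff = math.sqrt(v1 / n1 + v2 / n2)
    t = (statistics.mean(x1) - statistics.mean(x2)) / se_diff
    return t, se_diff

caffeine = [12, 14, 10, 8, 16, 5, 3, 9, 11]
placebo = [21, 18, 14, 20, 11, 19, 8, 12, 13, 15]

t, se = independent_t(caffeine, placebo)
t_pooled, se_pooled = independent_t(caffeine, placebo, pooled=True)

critical = 2.11                  # two-tailed, alpha = .05, df = 17 (from the table)
print(f"unpooled: t = {t:.3f}, SE = {se:.3f}")
print(f"pooled:   t = {t_pooled:.3f}, SE = {se_pooled:.3f}")
print("Reject H0" if abs(t) > critical else "Do not reject H0")
```

Both versions give |t| greater than 2.11, so H0 is rejected either way. The two t values line up with the "equal variances not assumed" and "equal variances assumed" rows of the SPSS output for this example.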
49. Using SPSS
• Open SPSS
• Open file “SPSS Examples” for Lab 5
• Go to:
–“Analyze” then “Compare Means”
–Choose “Independent samples t-test”
–Put Group in “grouping variable” and
Indep.Sample in “test variable” box.
–Define grouping variable numbers.
• E.g., we labeled the experimental
(caffeine) group as “1” in our data set and
the control (No caffeine) group as “2”
50. Group Statistics
                Group    N     Mean     Std. Deviation    Std. Error Mean
Indep.Sample    1        9     9.78     4.116             1.372
                2        10    15.10    4.280             1.354
Independent Samples Test (Indep.Sample)
Levene's Test for Equality of Variances: F = .135, Sig. = .718
t-test for Equality of Means:
                               t         df        Sig. (2-tailed)    Mean Difference    Std. Error Difference    95% CI of the Difference (Lower, Upper)
Equal variances assumed        -2.755    17        .014               -5.322             1.932                    -9.398, -1.247
Equal variances not assumed    -2.761    16.911    .013               -5.322             1.927                    -9.390, -1.254
51. Example 2:
A medical researcher wishes to see
whether the pulse rates of smokers are higher than the pulse
rates of nonsmokers. Samples of 100 smokers and 100
nonsmokers are selected. The results are shown here. Can the
researcher conclude, at α = 0.05, that smokers have higher pulse
rates than nonsmokers?
        Smokers      Nonsmokers
Mean    X̄1 = 90      X̄2 = 88
SD      SD1 = 5      SD2 = 6
n       n1 = 100     n2 = 100
52. 1. State Hypotheses.
Null Hypothesis: H0: μ(Smokers) = μ(Nonsmokers)
Alternative Hypothesis: H1: μ(Smokers) > μ(Nonsmokers)
2. Set alpha. Alpha = .05
3. Determine the critical value.
α is 0.05, one tail, and d.f. = n1 + n2 − 2 = 100 + 100 − 2 = 198.
So, the one-tailed critical value (α = 0.05, df = 198) is approximately
1.6449 (the standard normal value, which the t distribution approaches
when df is this large).
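Steps 4 and 5 are not shown for this example. As a sketch, plugging the summary statistics into the standard-error formula from the steps of the independent samples t-test gives:

$$SE_{diff} = \sqrt{\frac{SD_1^2}{n_1} + \frac{SD_2^2}{n_2}} = \sqrt{\frac{5^2}{100} + \frac{6^2}{100}} = \sqrt{0.61} \approx 0.781, \qquad t = \frac{\bar{X}_1 - \bar{X}_2}{SE_{diff}} = \frac{90 - 88}{0.781} \approx 2.56$$

Since 2.56 is greater than 1.6449, this calculation leads to rejecting H0: at α = 0.05 the data support the claim that smokers have higher pulse rates than nonsmokers.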
54. Homework 1:
In the following table are total cholesterol
measurements (mg/dl) for 29 patients with primary
hypertension (PH) and 20 normotensive (NT) patients.
Can we conclude that PH patients have, on average,
higher total cholesterol levels than NT patients?
Assume that the population variances are not
known and they are unequal.
57. Paired Sample t-test
In our previous discussion involving the difference
between two population means, it was assumed that
the samples were independent.
In this section, a different version of the t test is
explained. Samples are considered to be dependent
samples when the subjects are paired or matched in
some way.
For example, suppose a medical researcher wants to
see whether a drug will affect the reaction time of its
users.
58. Assumptions for the Paired Sample t-test
1. The sample sizes must be equal [n1=n2].
2. The observations must be paired (e.g.
before 𝑥𝑖 and after 𝑦𝑖 studies, different
measuring devices (𝑥𝑖,𝑦𝑖), etc.) which means
that the samples are no longer independent.
59. Steps of Dependent Samples t-test
1. State the null hypothesis H0: μD = 0
and the alternative hypothesis H1: μD ≠ 0
2. Choose a significance level
3. Determine the critical region
60. 4. Compute the t statistic for two dependent samples, e.g., before
and after treatment.
$$t = \frac{\bar{D} - \mu_D}{S_{\bar{D}}}, \qquad \text{where } S_{\bar{D}} = \frac{S}{\sqrt{n}}$$
5. Make a decision: reject the null hypothesis if the test
statistic t computed in step 4 falls in the rejection region
for the test; otherwise, do not reject the null hypothesis.
D: Difference between each pair of observations
D̄: Mean of the differences between each pair of observations
S: The standard deviation of these differences
S_D̄: The standard error of the mean difference
n: The number of pairs
61. Example 3: Does the Diet Work?
A developer of a new diet is interested in showing that
it is effective. He randomly chooses 15
subjects to go on the diet for 1 month.
He weighs each subject before and after
the 1-month period to see whether
there is evidence of a weight loss at the
end of the month.
62. Patient before after difference
1 210 204 6
2 207 205 2
3 183 182 1
4 195 196 -1
5 187 177 10
6 201 193 8
7 158 152 6
8 180 182 -2
9 173 165 8
10 198 186 12
11 225 218 7
12 243 237 6
13 168 174 -6
14 177 178 -1
15 196 199 -3
Mean 193.40 189.87 3.53
SD 21.48 20.53 5.33
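1. Set alpha = .05
2. Null hypothesis:
H0: μ1 = μ2.
Alternative is:
H1: μ1 > μ2.
3. Calculate the test
statistic:
$$S_{\bar{D}} = \frac{SD}{\sqrt{n_{pairs}}} = \frac{5.33}{\sqrt{15}} = 1.376$$
$$t = \frac{\bar{D}}{SD / \sqrt{n_{pairs}}} = \frac{3.53}{1.376} = 2.565$$
(2.567 when unrounded values are used.)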
63. 4. Determine the critical value of t.
Alpha = .05, tails = 1
df = n(pairs) − 1 = 15 − 1 = 14.
Critical value is 1.761
5. There is evidence that the mean weight loss is
positive, that is, that the diet is effective in
producing weight loss, because the test
statistic = 2.567 is greater than the critical
value = 1.761 (we reject H0).
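The paired calculation for the diet example can be reproduced from the raw before and after weights. This is a minimal standard-library sketch; the function name `paired_t` is hypothetical, and the critical value 1.761 is the table value quoted above, not computed by the code.

```python
import math
import statistics

def paired_t(before, after, mu_d=0.0):
    """Return (mean difference, SD of differences, t) for paired samples."""
    diffs = [b - a for b, a in zip(before, after)]   # D = before - after
    n = len(diffs)
    d_bar = statistics.mean(diffs)
    s_d = statistics.stdev(diffs)                    # sample SD of the differences
    se = s_d / math.sqrt(n)                          # standard error of the mean difference
    return d_bar, s_d, (d_bar - mu_d) / se

before = [210, 207, 183, 195, 187, 201, 158, 180, 173, 198, 225, 243, 168, 177, 196]
after = [204, 205, 182, 196, 177, 193, 152, 182, 165, 186, 218, 237, 174, 178, 199]

d_bar, s_d, t = paired_t(before, after)
critical = 1.761                 # one-tailed, alpha = .05, df = 14 (from the table)
print(f"mean difference = {d_bar:.2f}, SD = {s_d:.2f}, t = {t:.3f}")
print("Reject H0" if t > critical else "Do not reject H0")
```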
64. Using SPSS
First check normality for the difference scores (d)
• Analyze → Descriptive Statistics → Explore
• Put the difference variable (d) into the Dependent List and tick Both
• Click on Plots and then tick Normality plots with tests
• The p-value of 0.544 from the Shapiro-Wilk test of normality is
greater than 0.05, which implies that it is acceptable to assume
that the distribution of the differences is normal (or bell-shaped)
65. Paired Samples t-Test (Cont.)
• To perform the paired samples t-test (a one-sample t-test on the differences):
H0 : μd = 0 (the mean of the differences is zero;
i.e., the diet is ineffective).
Ha : μd > 0 (the mean of the differences is positive;
i.e., the diet is effective).
• Analyze → Compare Means → Paired-Samples T Test
• Put both variables Before and After into the Paired Variables box
• Click on OK
There is evidence that the mean weight loss is positive, that is, that the diet
is effective in producing weight loss, t(14) = 2.567, one-tailed p = 0.01
(one-tailed because Ha is μd > 0)
66. Homework 2:
A study was designed to see if a drug was effective
for losing weight. Nine women were in the study
and were treated for two separate weeks: one
week with the drug, and another week with a placebo.
After each week, the amount of weight loss was
recorded.
Has the drug done a better
job than the placebo in
terms of weight loss? Let
α = 0.05.
Drug Placebo
1.1 0
1.3 -0.3
1 0.6
1.7 0.3
1.4 -0.7
0.1 -0.2
0.5 0.6
1.6 0.9
-0.5 -2
67. Homework 3:
The systolic blood pressures
of 𝑛 = 12 women between
the ages of 20 and 35 were
measured before and after
administration of a newly
developed drug. Data are
shown in Table