Introduction to Hypothesis Testing

Chapter 8
Introduction to Hypothesis Testing
PowerPoint Lecture Slides
Essentials of Statistics for the
Behavioral Sciences
Eighth Edition
by Frederick J. Gravetter and Larry B. Wallnau

Chapter 8 Learning Outcomes
• Understand the logic of hypothesis testing
• State hypotheses and locate critical region(s)
• Conduct a z-test and make a decision
• Define and differentiate Type I and Type II errors
• Understand effect size and compute Cohen’s d
• Make directional hypotheses and conduct a one-tailed test

Tools You Will Need
• z-Scores (Chapter 5)
• Distribution of sample means (Chapter 7)
– Expected value
– Standard error
– Probability and sample means

8.1 Hypothesis Testing Logic
• Hypothesis testing is one of the most
commonly used inferential procedures
• Definition: a statistical method that uses
sample data to evaluate the validity of a
hypothesis about a population parameter

Logic of Hypothesis Test
• State hypothesis about a population
• Predict the expected characteristics of the
sample based on the hypothesis
• Obtain a random sample from the population
• Compare the obtained sample data with the
prediction made from the hypothesis
– If consistent, hypothesis is reasonable
– If discrepant, hypothesis is rejected

Figure 8.1
Basic Experimental Design

Figure 8.2 Unknown Population in
Basic Experimental Design

Four Steps in Hypothesis Testing
Step 1: State the hypotheses
Step 2: Set the criteria for a decision
Step 3: Collect data; compute sample statistics
Step 4: Make a decision

Step 1: State Hypotheses
• Null hypothesis (H0) states that, in the general
population, there is no change, no difference,
or no relationship
• Alternative hypothesis (H1) states that there is
a change, a difference, or a relationship
in the general population

Step 2: Set the Decision Criterion
• Distribution of sample outcomes is divided
– Those likely if H0 is true
– Those “very unlikely” if H0 is true
• Alpha level, or significance level, is a probability
value used to define “very unlikely” outcomes
• Critical region(s) consist of the extreme sample
outcomes that are “very unlikely”
• Boundaries of critical region(s) are determined by
the probability set by the alpha level
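
A minimal Python sketch of how the alpha level sets the critical-region boundaries, assuming a two-tailed test on a standard normal distribution. The function name `critical_z` is illustrative and does not come from the slides.

```python
from statistics import NormalDist


def critical_z(alpha: float = 0.05) -> float:
    """Return the positive boundary of a two-tailed critical region."""
    # Alpha is split equally between the two tails of the standard normal curve.
    return NormalDist().inv_cdf(1 - alpha / 2)


for alpha in (0.05, 0.01, 0.001):
    boundary = critical_z(alpha)
    print(f"alpha = {alpha}: critical region is z < -{boundary:.2f} or z > +{boundary:.2f}")
```

A smaller alpha pushes the boundaries farther into the tails, so the critical region shrinks.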

Figure 8.3 Note “Unlikely” Parts of
Distribution of Sample Means

Figure 8.4
Critical region(s) for α = .05

Learning Check
• A sports coach is investigating the impact of a
new training method. In words, what would
the null hypothesis say?
• A. The new training program produces different
results from the existing one
• B. The new training program produces results
about like the existing one
• C. The new training program produces better
results than the existing one
• D. There is no way to predict the results of the
new training program

Learning Check - Answer
• A sports coach is investigating the impact of a
new training method. In words, what would
the null hypothesis say?
• A. The new training program produces different
results from the existing one
• B. The new training program produces results
about like the existing one
• C. The new training program produces better
results than the existing one
• D. There is no way to predict the results of the
new training program
• Answer: B. The null hypothesis states that there
is no change and no difference

Learning Check
• Decide if each of the following statements
is True or False.
• If the alpha level is decreased, the size
of the critical region decreases (T/F)
• The critical region defines unlikely
values if the null hypothesis is true (T/F)

Learning Check - Answers
• True: alpha is the proportion of the area
in the critical region(s)
• True: this is the definition of “unlikely”

Step 3: Collect Data (and…)
• Data are always collected after the hypotheses are stated
• Data are always collected after the decision
criteria are established
• This sequence ensures objectivity

Step 3: (continued)…
Compute Sample Statistics
• Compute a sample statistic (z-score) to show
the exact position of the sample
• In words, z is the difference between the
observed sample mean and the hypothesized
population mean divided by the standard
error of the mean
$$z = \frac{M - \mu}{\sigma_M}$$

Step 4: Make a decision
• If sample statistic (z) is located in the critical
region, the null hypothesis is rejected
• If the sample statistic (z) is not located in the
critical region, the researcher fails to reject the
null hypothesis
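
Steps 2 through 4 can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the textbook's procedure verbatim: the test is two-tailed, σ is known, and the numbers (H0: µ = 80, σ = 10, n = 25, M = 84) are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def z_test(sample_mean, mu0, sigma, n, alpha=0.05):
    """Two-tailed z-test. Returns (z, reject_h0)."""
    standard_error = sigma / sqrt(n)            # standard error of the mean
    z = (sample_mean - mu0) / standard_error    # exact position of the sample mean
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return z, abs(z) > z_crit                   # in the critical region?


# Hypothetical data: H0: mu = 80, sigma = 10, n = 25, observed M = 84
z, reject = z_test(84, 80, 10, 25)
print(f"z = {z:.2f}; {'reject H0' if reject else 'fail to reject H0'}")
# prints: z = 2.00; reject H0
```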

Jury Trial:
Hypothesis Testing Analogy
• Trial begins with the null hypothesis “not guilty”
(the defendant’s plea of innocence)
• Police and prosecutor gather evidence (data)
relevant to the validity of the innocence plea
• With sufficient evidence against it, the jury rejects the
null hypothesis (the innocence claim) and concludes “guilty”
• With insufficient evidence against it, the jury fails to
convict, i.e., fails to reject the “not guilty” claim
(but does not conclude that the defendant is innocent)

Learning Check
• Decide if each of the following statements
is True or False.
• When the z-score is quite
extreme, it shows the null
hypothesis is true (T/F)
• A decision to retain the null
hypothesis means you proved that
the treatment has no effect (T/F)

Learning Check - Answers
• False: an extreme z-score is in the critical
region, which is very unlikely if H0 is true
• False: failing to reject H0 does not prove it
true; there is just not enough evidence
to reject it

8.2 Uncertainty and Errors
in Hypothesis Testing
• Hypothesis testing is an inferential process
– Uses limited information from a sample to make a
statistical decision, and then from it a general
conclusion
– Sample data used to make the statistical decision
allows us to make an inference and draw a
conclusion about a population
• Errors are possible

Type I Errors
• Researcher rejects a null hypothesis that is
actually true
• Researcher concludes that a treatment has an
effect when it has none
• Alpha level is the probability that a test will
lead to a Type I error

Type II Errors
• Researcher fails to reject a null hypothesis
that is really false
• Researcher has failed to detect a real
treatment effect
• Type II error probability is not easily identified

Table 8.1
                          Actual Situation
Researcher’s Decision     No Effect = H0 True     Effect Exists = H0 False
Reject H0                 Type I error (α)        Decision correct
Fail to reject H0         Decision correct        Type II error (β)

Figure 8.5 Location of
Critical Region Boundaries

Learning Check
• Decide if each of the following statements
is True or False.
• A Type I error is like convicting an
innocent person in a jury trial (T/F)
• A Type II error is like convicting a
guilty person in a jury trial (T/F)

Learning Check - Answers
• True: innocence is the “null hypothesis”
for a jury trial; conviction is like
rejecting that hypothesis
• False: convicting a guilty person is not an
error, but acquitting a guilty
person would be like a Type II error

8.3 Hypothesis Testing Summary
• Step 1: State hypotheses and select alpha level
• Step 2: Locate the critical region
• Step 3: Collect data; compute the test statistic
• Step 4: Make a probability-based decision
about H0: Reject H0 if the test statistic is
unlikely when H0 is true—called a “significant”
or “statistically significant” result

In the Literature
• A result is significant or statistically significant
if it is very unlikely to occur when the null
hypothesis is true; conclusion: reject H0
• In APA format
– Report that you found a significant effect
– Report value of test statistic
– Report the p-value of your test statistic
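
A small sketch of turning a z-score into a reportable result. It assumes a two-tailed test and the common APA conventions of two decimals for the test statistic and no leading zero on p. The function names are illustrative.

```python
from statistics import NormalDist


def two_tailed_p(z: float) -> float:
    """Probability of a z-score at least this extreme, in either tail, if H0 is true."""
    return 2 * (1 - NormalDist().cdf(abs(z)))


def apa_report(z: float) -> str:
    """Format the test statistic and p-value as a short report string."""
    p = two_tailed_p(z)
    if p < 0.001:
        p_text = "p < .001"
    else:
        # Drop the leading zero: 0.012 becomes .012
        p_text = f"p = {p:.3f}".replace("0.", ".", 1)
    return f"z = {z:.2f}, {p_text}"


print(apa_report(2.5))   # prints: z = 2.50, p = .012
```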

Figure 8.6
Critical Region for Standard Test

8.3 Assumptions for
Hypothesis Tests with z-Scores
• Random sampling
• Independent observations
• Value of σ is not changed by the treatment
• Normally distributed sampling distribution

Factors that Influence the
Outcome of a Hypothesis Test
• Size of difference between sample mean and
original population mean
– Larger discrepancies → larger z-scores
• Variability of the scores
– More variability → larger standard error
• Number of scores in the sample
– Larger n → smaller standard error
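
The effect of variability and sample size on z can be checked numerically. This sketch holds a hypothetical 2-point mean difference fixed and varies σ and n; all values are made up for illustration.

```python
from math import sqrt


def z_for(mean_difference: float, sigma: float, n: int) -> float:
    """z-score for a given mean difference, population sigma, and sample size."""
    standard_error = sigma / sqrt(n)
    return mean_difference / standard_error


# Hypothetical fixed mean difference of 2 points
for sigma in (4, 8):
    for n in (16, 64):
        print(f"sigma = {sigma}, n = {n:2d}: "
              f"standard error = {sigma / sqrt(n):.2f}, z = {z_for(2, sigma, n):.2f}")
```

With the mean difference held constant, the smallest σ combined with the largest n gives the smallest standard error and therefore the largest z.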

Learning Check
• A researcher uses a hypothesis test to evaluate
H0: µ = 80. Which combination of factors is most
likely to result in rejecting the null hypothesis?
• A. σ = 5 and n = 25
• B. σ = 5 and n = 50
• C. σ = 10 and n = 25
• D. σ = 10 and n = 50

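Learning Check - Answer
• A researcher uses a hypothesis test to evaluate
H0: µ = 80. Which combination of factors is most
likely to result in rejecting the null hypothesis?
• A. σ = 5 and n = 25
• B. σ = 5 and n = 50
• C. σ = 10 and n = 25
• D. σ = 10 and n = 50
• Answer: B. The smaller σ and the larger n give the
smallest standard error and so the largest z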

Learning Check
• Decide if each of the following statements
is True or False.
• An effect that exists is more likely
to be detected if n is large (T/F)
• An effect that exists is less likely to
be detected if σ is large (T/F)

Learning Check - Answers
• True: a larger sample produces a
smaller standard error and a larger z
• True: a larger standard deviation
increases the standard error and
produces a smaller z

8.4 Directional Hypothesis Tests
• The standard hypothesis testing procedure is
called a two-tailed (non-directional) test
because the critical region involves both tails
to determine if the treatment increases or
decreases the target behavior
• However, sometimes the researcher has a
specific prediction about the direction of the
treatment

8.4 Directional Hypothesis Tests
(Continued)
• When a specific direction of the treatment
effect can be predicted, it can be incorporated
into the hypotheses
• In a directional (one-tailed) hypothesis test,
the researcher specifies either an increase or
a decrease in the population mean as a
consequence of the treatment

Figure 8.7 Example 8.3
Critical Region (Directional)

One-tailed and Two-tailed Tests
Compared
• One-tailed test allows rejecting H0 with
relatively small difference provided the
difference is in the predicted direction
• Two-tailed test requires relatively large
difference regardless of the direction of the
difference
• In general two-tailed tests should be used
unless there is a strong justification for a
directional prediction
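
The difference between the two tests comes down to where the critical boundary sits. A minimal sketch, assuming α = .05, a predicted increase, and a hypothetical obtained z of 1.80:

```python
from statistics import NormalDist


def critical_values(alpha: float = 0.05):
    """Return (one_tailed, two_tailed) positive critical z-values."""
    std_normal = NormalDist()
    one_tailed = std_normal.inv_cdf(1 - alpha)       # all of alpha in one tail
    two_tailed = std_normal.inv_cdf(1 - alpha / 2)   # alpha split across both tails
    return one_tailed, two_tailed


one_tailed, two_tailed = critical_values()
z = 1.80  # hypothetical obtained z, in the predicted (upper) direction

print(f"One-tailed boundary: {one_tailed:.3f} -> reject H0: {z > one_tailed}")
print(f"Two-tailed boundary: {two_tailed:.3f} -> reject H0: {abs(z) > two_tailed}")
```

The same z = 1.80 is significant with the one-tailed test but not with the two-tailed test, which is why a directional test needs a strong justification.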

Learning Check
• A researcher is predicting that a treatment will
decrease scores. If this treatment is evaluated
using a directional hypothesis test, then the
critical region for the test _____
• A. would be entirely in the right-hand tail of
the distribution
• B. would be entirely in the left-hand tail of
the distribution
• C. would be divided equally between the two tails
of the distribution
• D. cannot answer without knowing the value of
the alpha level

Learning Check - Answer
• A researcher is predicting that a treatment will
decrease scores. If this treatment is evaluated
using a directional hypothesis test, then the
critical region for the test _____
• A. would be entirely in the right-hand tail of
the distribution
• B. would be entirely in the left-hand tail of
the distribution
• C. would be divided equally between the two tails
of the distribution
• D. cannot answer without knowing the value of
the alpha level
• Answer: B. A predicted decrease places the entire
critical region in the left-hand tail

8.5 Hypothesis Testing Concerns:
Measuring Effect Size
• Although commonly used, some researchers
are concerned about hypothesis testing
– Focus of test is data, not hypothesis
– Significant effects are not always substantial
• Effect size measures the absolute magnitude
of a treatment effect, independent of sample
size
• Cohen’s d measures effect size simply and
directly in a standardized way


$$\text{Cohen's } d = \frac{\text{mean difference}}{\text{standard deviation}} = \frac{\mu_{\text{treatment}} - \mu_{\text{no treatment}}}{\sigma}$$


Cohen’s d: Measure of Effect Size
Magnitude of d     Evaluation of Effect Size
d = 0.2            Small effect
d = 0.5            Medium effect
d = 0.8            Large effect
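
Cohen’s d is a one-line calculation. The sketch below uses hypothetical means and standard deviations to show that the same 15-point difference can be a small or a large effect depending on σ. Sample size does not appear in the formula.

```python
def cohens_d(mean_treatment: float, mean_no_treatment: float, sigma: float) -> float:
    """Standardized mean difference: (treated mean - untreated mean) / sigma."""
    return (mean_treatment - mean_no_treatment) / sigma


# Hypothetical 15-point difference measured on two different scales
d_wide_scale = cohens_d(515, 500, sigma=100)   # 15 / 100
d_narrow_scale = cohens_d(115, 100, sigma=15)  # 15 / 15

print(f"sigma = 100: d = {d_wide_scale:.2f}")    # prints: sigma = 100: d = 0.15
print(f"sigma = 15:  d = {d_narrow_scale:.2f}")  # prints: sigma = 15:  d = 1.00
```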

Figure 8.8 When is a 15-point
Difference a “Large” Effect?

Learning Check
• Decide if each of the following statements
is True or False.
• Increasing the sample size will also
increase the effect size (T/F)
• Larger differences between the
sample and population mean
increase effect size (T/F)

Learning Check - Answers
• False: sample size does not affect
Cohen’s d
• True: the mean difference is in the
numerator of Cohen’s d

8.6 Statistical Power
• The power of a test is the probability that the
test will correctly reject a false null hypothesis
– It will detect a treatment effect if one exists
– Power = 1 – β [where β = probability of a Type II
error]
• Power usually estimated before starting study
– Requires several assumptions about factors that
influence power

Figure 8.9
Measuring Statistical Power

Influences on Power
• Increased Power
– As effect size increases, power also increases
– Larger sample sizes produce greater power
– Using a one-tailed (directional) test increases power
(relative to a two-tailed test)
• Decreased Power
– Reducing the alpha level (making the test more
stringent) reduces power
– Using two-tailed (non-directional) test decreases
power (relative to a one-tailed test)
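
Power for a z-test can be computed directly once the effect size, σ, n, and alpha are assumed. The sketch below is an illustration with hypothetical values (a 4-point effect, σ = 10, n = 25). The one-tailed branch assumes the effect is positive and in the predicted direction.

```python
from math import sqrt
from statistics import NormalDist


def power(effect, sigma, n, alpha=0.05, two_tailed=True):
    """Probability of rejecting H0 when the true mean is mu0 + effect (effect > 0)."""
    std_normal = NormalDist()
    shift = effect / (sigma / sqrt(n))   # expected z-score if the effect is real
    if two_tailed:
        z_crit = std_normal.inv_cdf(1 - alpha / 2)
        upper = 1 - std_normal.cdf(z_crit - shift)   # lands in the upper critical region
        lower = std_normal.cdf(-z_crit - shift)      # lands in the lower critical region
        return upper + lower
    z_crit = std_normal.inv_cdf(1 - alpha)
    return 1 - std_normal.cdf(z_crit - shift)


baseline = power(effect=4, sigma=10, n=25)
print(f"Baseline (two-tailed, alpha = .05): {baseline:.3f}")
print(f"Larger effect (8 points):           {power(8, 10, 25):.3f}")
print(f"Larger sample (n = 100):            {power(4, 10, 100):.3f}")
print(f"One-tailed test:                    {power(4, 10, 25, two_tailed=False):.3f}")
print(f"Stricter alpha (.01):               {power(4, 10, 25, alpha=0.01):.3f}")
```

Each line changes one factor from the baseline, so the direction of each influence listed above can be read off by comparison.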

Figure 8.10
Sample Size Affects Power

Learning Check
• The power of a statistical test is the
probability of _____
• A. rejecting a true null hypothesis
• B. supporting a true null hypothesis
• C. rejecting a false null hypothesis
• D. supporting a false null hypothesis

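Learning Check - Answer
• The power of a statistical test is the
probability of _____
• A. rejecting a true null hypothesis
• B. supporting a true null hypothesis
• C. rejecting a false null hypothesis
• D. supporting a false null hypothesis
• Answer: C. Power is the probability that the test
will correctly reject a false null hypothesis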

Learning Check
• Decide if each of the following statements
is True or False.
• Cohen’s d is used because alone, a
hypothesis test does not measure
the size of the treatment effect (T/F)
• Lowering the alpha level from .05
to .01 will increase the power of a
statistical test (T/F)

Learning Check - Answers
• True: differences might be significant
but not of substantial size
• False: it is less likely that H0 will be
rejected with a small alpha

Any Questions? Concepts? Equations?
