hypothesis testing overview

WHAT IS HYPOTHESIS TESTING?
• It is concerned with how well the sample data support a null hypothesis and when the
null hypothesis can be rejected.
• It involves stating a null hypothesis (H0) and an alternative hypothesis (H1), and then
using inferential statistics on a new set of data to decide between these hypotheses.
(Unlike hypothesis testing, estimation starts with no clear hypothesis about the
population parameter.)
• Its purpose is to make a probabilistic decision, based on the research data, about the
truth of the null and alternative hypotheses.
• The researcher usually hopes to “nullify” the null hypothesis
(to find relationships or patterns that justify rejecting the null hypothesis).

The null hypothesis states:
“There is no effect present.”
The alternative hypothesis states:
“There is an effect present.”
The key question answered in hypothesis testing:
“Is the value of my sample statistic unlikely enough (assuming that the null
hypothesis is true) to reject the null hypothesis and tentatively accept the
alternative hypothesis?”

Example:
An experiment is performed to compare a new method of counseling (given to the
experimental group) to no counseling at all (the control group).
In this case:
the null hypothesis says: “there is no effect”
(the treatment group is not any better than the control group after the treatment)
the alternative hypothesis says: “there is an effect”
(the treatment and control groups do differ after the treatment)
If the two groups are very dissimilar after the treatment, the researcher might be able to
reject the null hypothesis and tentatively accept the alternative hypothesis.

What is a Null Hypothesis?
• It is represented by the symbol H0.
• It is a statement that some condition concerning a population parameter is true.
• It is the focal point in hypothesis testing because it is the hypothesis that is tested
directly (not the alternative hypothesis).
• In most studies, it predicts no difference or no relationship in the population.
• Because the null hypothesis is the one tested directly using probability theory, the
procedure is sometimes called “null hypothesis significance testing” or NHST.
• Hypothesis testing operates under the assumption that the null hypothesis is true.
If the results of the research study differ greatly from the assumption that the null
hypothesis is true, the researcher rejects the null hypothesis and tentatively accepts the
alternative hypothesis.
• It is like a “means to an end.”
The null hypothesis is used because that is what must be stated and tested directly in
statistics.

Harnett's (1982) explanation of the null hypothesis:
The term “null hypothesis” developed from early work in the theory of hypothesis testing,
in which this hypothesis corresponded to a theory about a population parameter that the
researcher thought did not represent the true value of the parameter (hence the word
“null,” which means invalid, void, or amounting to nothing). The alternative hypothesis
generally specified those values of the parameter that the researcher believed did hold
true. (p. 346)

What is an Alternative Hypothesis?
• It is represented by the symbol H1.
• The statement that the population parameter is some value other than the value stated
by the null hypothesis.
• It asserts the opposite of the null hypothesis and usually is a statement of a difference
between means or a relationship between variables.
• A statement that logically contradicts the null hypothesis; H0 and H1 cannot both be true
at the same time.
• If there is reason to reject the null hypothesis, the alternative hypothesis can be
tentatively accepted.
• It is almost always more consistent with the research hypothesis; therefore, the hope is
that the results support the alternative hypothesis, not the null hypothesis.

According to the logic of hypothesis testing, you should assume that an effect is
not present until you have good evidence to conclude otherwise.
The researcher states a null hypothesis but hopes ultimately to be able to reject it.
In other words, the null hypothesis is the hypothesis that the researcher hopes to be able
to nullify by conducting the hypothesis test.

A familiar analogy is the justice system’s assumption that an accused is
innocent until proven guilty.
Null hypothesis : accused = innocent (or accused = not guilty)
Alternative hypothesis : accused ≠ innocent (or accused ≠ not guilty)
The prosecution (researcher) presents evidence intended to show that the accused is not
innocent. If the evidence is strong enough, it gives reason to reject the assumption that
the accused is innocent (thereby rejecting the null hypothesis) and to accept the
alternative hypothesis.

Three points about hypothesis testing:
1. The alternative hypothesis can never include an equal sign (=).
2. The alternative hypothesis is based on one of these three signs:
≠ (not equal to), < (less than), or > (greater than).
3. The null hypothesis is based on one of these three signs:
= (equal to), ≤ (less than or equal to), or ≥ (greater than or equal to).
(The equality sign is always a part of the null hypothesis.)

Example:
Let’s assume that we are interested in knowing which teaching method works better:
the discussion teaching method or the lecture teaching method.
Here are the null and alternative hypotheses:
Null hypothesis: H0: µD = µL
Alternative hypothesis: H1: µD ≠ µL
where
µD is the symbol for the discussion group population mean, and
µL is the symbol for the lecture group population mean.

The null hypothesis says:
“the average performance of students in discussion classes is equal to the
average performance of students in lecture classes.”
This null hypothesis is called a point or exact hypothesis because it contains an equal
sign (=).
The alternative hypothesis says:
“the discussion and lecture population means are not equal.”
As can be seen, the alternative hypothesis states the opposite of the null hypothesis.

Directional Alternative Hypotheses
A nondirectional alternative hypothesis H1 has the (≠) sign.
A directional alternative hypothesis H1 has either the (>) or the (<) sign.

From the earlier example (H0: µD = µL, H1: µD ≠ µL), we can instead state the following:
Null hypothesis: H0: µD ≤ µL
Alternative hypothesis: H1: µD > µL
or
Null hypothesis: H0: µD ≥ µL
Alternative hypothesis: H1: µD < µL
A major drawback of a directional alternative hypothesis appears when a large difference
in the opposite direction is found. The researcher must then conclude that no relationship
exists in the population. That is the rule of hypothesis testing when using a directional
alternative hypothesis.
However, this conclusion of no relationship would operate against the discovery function of
science. Because of this, most practicing researchers state directional research hypotheses
(i.e., they make a directional prediction), but they test nondirectional alternative
hypotheses so that they can leave open this discovery function of science.
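
The drawback described above can be seen numerically. The following is a minimal sketch, assuming the test statistic is a z score with a standard normal sampling distribution; the helper name `p_values` and the value z = -3 are hypothetical and chosen only for illustration.

```python
from statistics import NormalDist

def p_values(z):
    """Return (upper-tail, lower-tail, two-tailed) p values for a z statistic."""
    std_normal = NormalDist()              # mean 0, standard deviation 1
    upper = 1.0 - std_normal.cdf(z)        # directional H1: parameter is greater
    lower = std_normal.cdf(z)              # directional H1: parameter is less
    two_tailed = 2.0 * min(upper, lower)   # nondirectional H1: not equal
    return upper, lower, two_tailed

# Hypothetical result: the researcher predicted mu_D > mu_L, but the data
# came out strongly in the opposite direction (z = -3).
upper, lower, two_tailed = p_values(-3.0)
print(f"directional    (H1: mu_D > mu_L)  p = {upper:.4f}")
print(f"nondirectional (H1: mu_D != mu_L) p = {two_tailed:.4f}")
```

With the directional test the p value is close to 1, so the null hypothesis cannot be rejected even though the groups clearly differ; the nondirectional test still flags the difference.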

Examining the Probability Value and Making a Decision
The null hypothesis is tested directly in the hypothesis-testing procedure. When a
researcher states a null hypothesis, the researcher is able to use the principles of
inferential statistics to construct a probability model about what would happen if the null
hypothesis were true.
This probability model is nothing but the sampling distribution that would result for the
sample statistic (mean, percentage, correlation) over repeated sampling if the null
hypothesis were true.
Probability value (p value): the probability of the observed result of your research study
(or a more extreme result) if the null hypothesis were true.
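
As an illustration of this definition, here is a small sketch that computes an exact p value for a coin-tossing study, where the null hypothesis is that the coin is fair. The function name `binomial_p_value` and the counts (9 heads in 10 tosses) are assumptions made for the example, not part of the original material.

```python
from math import comb

def binomial_p_value(n_tosses, n_heads):
    """One-sided p value: P(at least n_heads heads in n_tosses fair tosses)."""
    total_outcomes = 2 ** n_tosses
    # Count every outcome that is as extreme as, or more extreme than, the one observed.
    extreme_outcomes = sum(comb(n_tosses, k) for k in range(n_heads, n_tosses + 1))
    return extreme_outcomes / total_outcomes

# Hypothetical observation: 9 heads in 10 tosses.
p = binomial_p_value(10, 9)
print(f"p = {p:.4f}")
```

Here the sampling distribution under the null hypothesis is the binomial distribution with success probability 1/2, and the p value is the share of that distribution at or beyond the observed count.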

$\left(\frac{1}{2}\right)^{n}$, where $n$ is the number of tosses

Empirical Rule for Standard Deviations
For any set of numbers, at least 8/9 (roughly 90%) of the values lie within plus or minus
three standard deviations of the mean (Chebyshev's inequality).
Rule for Bell-shaped Data
If the histogram for a set of data is shaped like a bell, then about 2/3, or 68%, of the values
lie within the interval µ ± σ. About 95% of the values lie within the interval µ ± 2σ. And
virtually all of the values lie within the interval µ ± 3σ.
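
The percentages in the bell-shaped rule can be checked against the normal distribution. This is a sketch only: it assumes the data are exactly normal, and the helper names `share_within` and `chebyshev_lower_bound` are introduced here for illustration.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

def share_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return std_normal.cdf(k) - std_normal.cdf(-k)

def chebyshev_lower_bound(k):
    """Minimum share within k standard deviations for ANY distribution (k > 1)."""
    return 1.0 - 1.0 / k ** 2

for k in (1, 2, 3):
    print(f"within {k} s.d.: normal = {share_within(k):.4f}")
print(f"within 3 s.d.: any data >= {chebyshev_lower_bound(3):.4f}")
```

The normal shares come out near 68%, 95%, and 99.7%, while the distribution-free bound for three standard deviations is 8/9.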

Significance level or alpha (α)
The cutoff the researcher uses to decide when to reject the null hypothesis (usually 0.05).
If p ≤ α, H0 is rejected and H1 is tentatively accepted.
If p > α, H0 is not rejected.
When H0 is rejected and H1 is tentatively accepted, the finding is said to be statistically
significant.
A finding is statistically significant when it is believed that the observed result was not due
only to chance or sampling error (based on the evidence of the data).

Two rules in hypothesis testing:
Rule 1: If the probability value is less than or equal to the significance level (α = 0.05 for most
research) the null hypothesis is rejected and the alternative hypothesis is tentatively
accepted.
We also conclude that the observed relationship is statistically significant (i.e., the observed
difference between the groups is not just due to chance fluctuations).
Rule 2: If the probability value is greater than the significance level, the null hypothesis
cannot be rejected.
We can only claim to fail to reject the null hypothesis and conclude that the relationship is
not statistically significant (i.e., any observed difference between the groups is probably
nothing but a reflection of chance fluctuations).
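
The two rules amount to a single comparison. A minimal sketch follows; the function name `decide` and its return strings are hypothetical, and the default of 0.05 is only the conventional choice mentioned above.

```python
def decide(p_value, alpha=0.05):
    """Apply Rule 1 and Rule 2 to a p value and a significance level."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("a p value must lie between 0 and 1")
    if p_value <= alpha:
        # Rule 1: reject H0 and tentatively accept H1.
        return "reject H0 (statistically significant)"
    # Rule 2: H0 cannot be rejected.
    return "fail to reject H0 (not statistically significant)"

print(decide(0.003))
print(decide(0.20))
```

Note that the second branch says "fail to reject" rather than "accept": a large p value does not show that the null hypothesis is true.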

Type I and II Errors
Two types of error are possible when applying a hypothesis test; each leads to a wrong
conclusion.
Type I error : the null hypothesis is true, but is rejected; known as a false positive since it
is falsely concluded that there is a relationship in the population.
Examples:
A false positive occurs when a medical test says that you have a disease but you really
don’t.
A Type I error occurs when an innocent person is found guilty.
Type II error : the null hypothesis is false, but is not rejected; known as a false negative
since it is falsely concluded that there is no relationship in the population, i.e., the result is
declared not statistically significant in error.
Examples:
A false negative occurs when a medical test says that you do not have a disease but you
really do.
A Type II error occurs when a guilty person is found to be not guilty.
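
Both error rates can be estimated by simulation. The sketch below assumes a two-tailed z test of H0: µ = 0 with a known standard deviation; the function name `rejection_rate`, the sample size, the true mean of 0.5, and the random seed are all hypothetical choices made for the illustration.

```python
import random
from statistics import NormalDist

def rejection_rate(true_mean, n=25, sigma=1.0, alpha=0.05, trials=2000, seed=1):
    """Share of simulated studies in which H0: mu = 0 is rejected."""
    rng = random.Random(seed)                      # fixed seed for repeatability
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # two-tailed critical value
    std_error = sigma / n ** 0.5
    rejections = 0
    for _ in range(trials):
        sample = [rng.gauss(true_mean, sigma) for _ in range(n)]
        z = (sum(sample) / n) / std_error          # distance from the H0 value of 0
        if abs(z) >= z_crit:
            rejections += 1
    return rejections / trials

# H0 is true: every rejection is a Type I error (false positive).
type_i_rate = rejection_rate(true_mean=0.0)
# H0 is false: every non-rejection is a Type II error (false negative).
type_ii_rate = 1 - rejection_rate(true_mean=0.5)
print(f"Type I error rate  ~ {type_i_rate:.3f}")
print(f"Type II error rate ~ {type_ii_rate:.3f}")
```

The Type I rate should land near the chosen significance level, while the Type II rate depends on the true effect and the sample size.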

Example:
A researcher wants to determine who has the higher starting salary: recent male college
graduates or recent female college graduates. The researcher constructs these two
statistical hypotheses:
Null hypothesis: H0: µMales = µFemales
Alternative hypothesis: H1: µMales ≠ µFemales
If the sample means are 43000 for males and 27000 for females, p would be small because
such a large difference would be unlikely if H0 were true; H0 is rejected because the
results call it into question.
If the sample means are 33000 for males and 31000 for females, the difference could simply
be due to chance (i.e., sampling error). In this case, p could be larger than in the previous
case because this time the difference is likely under the assumption that H0 is true. If p is
large, H0 cannot be rejected, which implies the result is not statistically significant (i.e.,
the observed difference between the two means may simply be a random or chance
fluctuation).

Controlling the Risk of Errors
• Use large sample sizes
- Larger samples provide a test that is more sensitive, i.e., has more power.
- They lower the chance of making a hypothesis-testing error.
- They increase the chance of drawing the correct conclusion.
- If statistical significance is achieved, still check whether the finding has practical
significance.
Power : the probability of rejecting the null hypothesis when it is false
Practical significance : the conclusion made when a relationship is strong enough to be of
practical importance
• Examine the effect size indicator
- An effect size indicator is a statistical measure of the strength of a relationship.
- It tells how big an effect is present.
- Some effect size indicators:
Cohen’s standardized effect size
eta squared
omega squared
Cramer’s V
correlation coefficient squared.
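
The link between sample size and power can be made concrete. This is a sketch under stated assumptions: a two-tailed one-sample z test with a known standard deviation, and an effect size expressed in standard deviation units (in the style of Cohen's standardized effect size). The function name `z_test_power` and the values tried are hypothetical.

```python
from statistics import NormalDist

def z_test_power(effect_size, n, alpha=0.05):
    """Power of a two-tailed one-sample z test.

    effect_size is the true shift of the mean in standard deviation units.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    shift = effect_size * n ** 0.5   # mean of the z statistic when H0 is false
    # Probability that the z statistic falls in either rejection region.
    return std_normal.cdf(shift - z_crit) + std_normal.cdf(-shift - z_crit)

for n in (25, 100, 400):
    print(f"n = {n:3d}: power = {z_test_power(0.2, n):.3f}")
```

For a fixed small effect, power rises steadily as the sample grows; with no effect at all, the rejection probability equals the significance level.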

Example:
Two techniques are compared for teaching spelling, and the means of the two groups in
the study turned out to be 86% and 85% correct on the spelling test after the
intervention.
The difference between these two means is quite small and is probably not practically
significant; however, this difference might end up being statistically significant if we have
a very large number of people in each of the two treatment groups.
Likewise, a small correlation might be statistically significant but not practically significant
if there is a very large number of people in the research study that you conduct or that
you read about and evaluate.
This does not mean that larger samples are bad. The rule—the bigger the sample size, the
better—still applies. It simply means that you must always make sure that a finding is
practically significant in addition to being statistically significant.
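
The spelling example can be sketched numerically. The means of 86 and 85 come from the example; the standard deviation of 10 points, the group sizes, the normal approximation, and the helper name `two_group_z` are assumptions added here for illustration.

```python
from statistics import NormalDist

def two_group_z(mean_a, mean_b, sd, n_per_group):
    """z statistic, two-tailed p value, and standardized effect size
    for two equal-sized groups that share the same standard deviation."""
    std_error = sd * (2 / n_per_group) ** 0.5
    z = (mean_a - mean_b) / std_error
    p = 2 * (1 - NormalDist().cdf(abs(z)))
    d = (mean_a - mean_b) / sd       # does not depend on the sample size
    return z, p, d

for n in (50, 5000):
    z, p, d = two_group_z(86.0, 85.0, sd=10.0, n_per_group=n)
    print(f"n per group = {n:4d}: z = {z:.2f}, p = {p:.6f}, effect size d = {d:.2f}")
```

The effect size stays the same small value in both runs, while the p value drops below 0.05 only because the groups are very large.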

Example:
A neurologist is testing the effect of a drug on response time by injecting 100 rats with a
unit dose of the drug, subjecting each to neurological stimulus, and recording its response
time. The neurologist knows that the mean response time for rats not injected with the
drug is 1.2 seconds. The mean of the 100 injected rats' response time is 1.05 seconds with
a sample standard deviation of 0.5 seconds. Do you think the drug has an effect on
response time?
Source: https://www.youtube.com/watch?v=-FtlH4svqx4

Given: $n = 100$
not injected: $\mu = 1.2$
injected: $\bar{x} = 1.05$, $s = 0.5$
$H_0$: drug has no effect, $\mu = 1.2$
$H_1$: drug has an effect, $\mu \neq 1.2$

Solution:
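Find the s.d. of the sampling distribution:
$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}} \approx \frac{s}{\sqrt{n}} = \frac{0.5}{\sqrt{100}} = \frac{0.5}{10} = 0.05$
Find how many standard deviations the sample mean is from the population mean:
$z = \frac{\mu - \bar{x}}{\sigma_{\bar{x}}} = \frac{1.2 - 1.05}{0.05} = \frac{0.15}{0.05} = 3$
From the standard normal distribution, we get the normal probability for the given z-score:
the area beyond 3 standard deviations is about 0.0013 in one tail (about 0.0027 in both
tails combined).
Therefore, p = 0.0013 (one-tailed). If we use the standard α = 0.05, then p ≤ α with either
the one-tailed or the two-tailed value, i.e., we reject H0.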
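
The arithmetic of this example can be reproduced in a few lines. The numbers are the ones given in the problem; the variable names are illustrative, and the normal approximation with s standing in for σ follows the solution above.

```python
from statistics import NormalDist

n = 100             # number of injected rats
mu_0 = 1.2          # mean response time without the drug (value under H0)
sample_mean = 1.05  # mean response time of the injected rats
sample_sd = 0.5     # sample standard deviation

std_error = sample_sd / n ** 0.5          # s.d. of the sampling distribution
z = (sample_mean - mu_0) / std_error      # negative: the sample mean is below mu_0
one_tailed_p = NormalDist().cdf(-abs(z))  # area in one tail beyond |z|
two_tailed_p = 2 * one_tailed_p           # area in both tails

print(f"standard error = {std_error:.3f}")
print(f"z = {z:.2f}")
print(f"one-tailed p = {one_tailed_p:.4f}, two-tailed p = {two_tailed_p:.4f}")
```

Both p values fall well below 0.05, so the decision to reject H0 does not depend on which tail convention is used.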

 
   

HYPOTHESIS TESTING IN PRACTICE
t Test for Independent Samples
- Used to determine whether the difference between the means of two groups is
statistically significant.
- Uses the t distribution (the sampling distribution used to determine the probability
value)
One-Way Analysis of Variance (one-way ANOVA)
- Used to compare two or more group means, appropriate whenever you have one
quantitative dependent variable and one categorical independent variable.
(Two-way ANOVA for two categorical independent variables, three-way ANOVA for
three categorical independent variables, and so forth.)
- Uses the F distribution (the F distribution is skewed to the right, i.e., the tail is
pulled or stretched out to the right).
- Probabilities from the F distribution can be computed by a statistical computer program.
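
The test statistics behind the two procedures above can be computed by hand. The sketch below assumes equal variances for the t test; the scores are hypothetical and the function names `one_way_anova_f` and `pooled_t` are introduced only for this example. Turning the statistics into p values requires the t or F distribution, which is left to a statistical package.

```python
from statistics import mean

def one_way_anova_f(groups):
    """F statistic: between-group mean square over within-group mean square."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    k = len(groups)
    n_total = len(all_values)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    return (ss_between / (k - 1)) / (ss_within / (n_total - k))

def pooled_t(a, b):
    """t statistic for two independent samples, assuming equal variances."""
    df = len(a) + len(b) - 2
    ss = sum((x - mean(a)) ** 2 for x in a) + sum((x - mean(b)) ** 2 for x in b)
    pooled_var = ss / df
    std_error = (pooled_var * (1 / len(a) + 1 / len(b))) ** 0.5
    return (mean(a) - mean(b)) / std_error

# Hypothetical test scores for two teaching methods.
discussion = [78.0, 85.0, 90.0, 72.0, 88.0]
lecture = [70.0, 75.0, 80.0, 68.0, 77.0]

t = pooled_t(discussion, lecture)
f = one_way_anova_f([discussion, lecture])
print(f"t = {t:.3f}, t squared = {t * t:.3f}, F = {f:.3f}")
```

With exactly two groups, F equals the square of the pooled t statistic, which is why a one-way ANOVA on two groups reaches the same decision as the independent-samples t test.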

t Test for Correlation Coefficients
- Used to determine whether a correlation coefficient is statistically significant
- A correlation coefficient shows the relationship between a quantitative dependent variable
and a quantitative independent variable.
t Test for Regression Coefficients
- Used to determine whether a regression coefficient is statistically significant
Chi-Square Test for Contingency Tables
- Used to determine whether a relationship observed in a contingency table is
statistically significant
Post Hoc Tests in Analysis of Variance
- A follow up test to ANOVA used to determine which means are significantly different
- Popular post-hoc tests:
Newman-Keuls test
Tukey test
Bonferroni test
Other Significance Tests
- Analysis of covariance (ANCOVA)
- Partial correlation coefficients
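
As one worked instance from the list above, the chi-square test for a contingency table can be sketched for the 2x2 case. The counts are hypothetical, the function name `chi_square_2x2` is illustrative, and no continuity correction is applied. The p value uses the fact that a chi-square variable with 1 degree of freedom is the square of a standard normal variable.

```python
from math import erfc, sqrt

def chi_square_2x2(table):
    """Pearson chi-square statistic and p value for a 2x2 contingency table."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if rows and columns were unrelated (H0 true).
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    # With 1 degree of freedom, P(chi-square > x) = P(|Z| > sqrt(x)).
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts: rows are two groups, columns are outcome yes / no.
chi2, p = chi_square_2x2([[30, 20], [15, 35]])
print(f"chi-square = {chi2:.3f}, p = {p:.4f}")
```

Larger tables need the chi-square distribution with more degrees of freedom, which a statistical package would supply.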

REFERENCES:
How to do Normal Distributions Calculations
https://statistics.laerd.com/statistical-guides/normal-distribution-calculations.php
t Test for Independent Samples
https://www.youtube.com/watch?v=jyoO4i8yUag
Probability density functions and binomial distribution
https://www.youtube.com/watch?v=Fvi9A_tEmXQ&index=8&list=PL1328115D3D8A2566
Analysis of covariance (ANCOVA)
https://www.youtube.com/watch?v=rpe4kPGteCQ
https://www.youtube.com/watch?v=8i0h98chSHU
Post Hoc Tests in Analysis of Variance
https://www.youtube.com/watch?v=rZuYwJupGus
https://www.youtube.com/watch?v=8NJxtwnSDZ8
http://www.statisticshowto.com/newman-keuls/
Chi-Square Test for Contingency Tables
https://www.youtube.com/watch?v=hpWdDmgsIRE
t Test for Regression Coefficients
https://www.youtube.com/watch?v=j5oPzAJvnVI
t Test for Correlation Coefficients
https://www.youtube.com/watch?v=Uf5nW7D8quk
One-Way Analysis of Variance (one-way ANOVA)
https://www.youtube.com/watch?v=51QZa7b0Ozk
