Hypothesis testing in statistics

NOT EVERYTHING YOU’RE TOLD IS ABSOLUTELY CERTAIN
• The trouble is, how do you know when what you’re being told isn’t right? Hypothesis tests give
you a way of using samples to test whether or not statistical claims are likely to be true. They give
you a way of weighing the evidence and testing whether extreme results can be explained by
mere coincidence.

NEW SCENARIO
Statsville’s new miracle drug
• Statsville’s leading drug company has produced a new remedy for curing snoring. Frustrated snorers are
flocking to their doctors in hopes of finding nightly relief.
• The drug company claims that their miracle drug cures 90% of people within two weeks, which is great news
for the people with snoring difficulties. The trouble is, not everyone’s convinced.

• The doctor at the Statsville Surgery has been prescribing SnoreCull to her patients, but she’s disappointed by
the results. She decides to conduct her own trial of the drug.
• She takes a random sample of 15 snorers and puts them on a course of SnoreCull for two weeks. After two
weeks, she calls them back in to see whether their snoring has stopped.
• Here are the results: 11 of the 15 patients were cured, and 4 were not.

• If the drug cures 90% of people, how many people in the sample of 15 snorers would you expect to have
been cured? What sort of distribution do you think this follows?

• 90% of 15 is 13.5, so you’d expect about 14 people to be cured. Only 11 people in the doctor’s sample were
cured, which is much lower than the result you’d expect.
• There are a specific number of trials and the doctor is interested in the number of successes, so the
number of successes follows a binomial distribution. If X is the number of successes then X ~ B(15, 0.9).
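
To make the distribution concrete, here is a minimal Python sketch that tabulates X ~ B(15, 0.9) using only the standard library. The function name `binomial_pmf` and the variable names are our own choices for illustration, not part of the original slides.

```python
from math import comb

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 15, 0.9      # sample size and the drug company's claimed cure rate
expected = n * p    # mean of a binomial distribution is n * p
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(f"Expected number cured: {expected}")
for k in range(9, n + 1):
    print(f"P(X = {k:2d}) = {pmf[k]:.4f}")
```

Under the drug company's claim, most of the probability sits on 13, 14 and 15 cures, which is why a result of 11 looks suspicious.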

SO WHAT’S THE PROBLEM?
• Here’s the probability distribution for how many people the drug company says should have been cured by the
snoring remedy.
• So why the discrepancy?

• The drug company might not be deliberately telling lies, but their claims might be misleading.
• It’s possible that the drug company inadvertently conducted flawed or biased tests on SnoreCull, and that
this led them to make inaccurate predictions about the population. If the success rate of SnoreCull is
actually lower than 90%, this would explain why only 11 people in the sample were cured.

• The drug company’s claims might actually be accurate.
• Rather than the drug company being at fault, it’s always possible that the patients in the doctor’s sample may
not have been representative of the snoring population as a whole. It’s always possible that the snoring remedy
does cure 90% of snorers, but the doctor just happens to have a higher proportion of people in her sample
whom it doesn’t cure. In other words, her sample might be biased in some way, or it could just come down to
there being a small number of patients in the sample.
Resolving the conflict from 50,000 feet
• So how do we resolve the conflict between the doctor and the drug company? Let’s take a very high-level view
of what we need to do. We can resolve the conflict by putting the claims of the drug company on trial.
• In other words, we’ll accept the word of the drug company by default, but if there’s strong evidence against it,
we’ll side with the doctor instead.

• In general, this process is called hypothesis testing, as you take a hypothesis or claim and then test it against the
evidence. Let’s look at the general process for this.

THE SIX STEPS FOR HYPOTHESIS TESTING
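• Here are the broad steps that are involved in hypothesis testing. We’ll go through each one in detail in the
following pages.
• Step 1: Decide on the hypothesis you’re going to test.
• Step 2: Choose your test statistic.
• Step 3: Determine the critical region.
• Step 4: Find the p-value.
• Step 5: See whether the sample result is in the critical region.
• Step 6: Make your decision.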

THE DRUG COMPANY’S CLAIM
• According to the drug company, SnoreCull cures 90% of patients within 2 weeks. We need to accept this
position unless there is sufficiently strong evidence to the contrary.
• The claim that we’re testing is called the null hypothesis. It’s represented by H0, and it’s the claim that we’ll
accept unless there is strong evidence against it.

SO WHAT’S THE NULL HYPOTHESIS FOR SNORECULL?
• The null hypothesis for SnoreCull is the claim of the drug company: that it cures 90% of patients. This is the
claim that we’re going to go along with, unless we find strong evidence against it.
• We need to test whether at least 90% of patients are cured by the drug, so this means that the null hypothesis is
that p = 90%.
So what’s the alternative?
• We’ve looked at what the claim is we’re going to test, the null hypothesis, but what if it’s not true? What’s the
alternative?

• The counterclaim to the null hypothesis is called the alternate hypothesis. It’s represented by H1, and it’s the
claim that we’ll accept if there’s strong enough evidence to reject H0.
• The doctor believes that SnoreCull cures less than 90% of people, so this means that the alternate hypothesis is
that p < 90%.

• When hypothesis testing, you assume the null hypothesis is true. If there’s sufficient evidence against it,
you reject it and accept the alternate hypothesis.
Step 2: Choose your test statistic
• Now that you’ve determined exactly what it is you’re going to test, you need some means of testing it. You can
do this with a test statistic. The test statistic is the statistic that you use to test your hypothesis. It’s the statistic
that’s most relevant to the test.
What’s the test statistic for SnoreCull?
• In our hypothesis test, we want to test whether SnoreCull cures 90% of people or more. To test this, we can
look at the probability distribution according to the drug company, and see whether the number of successes
in the sample is significant.
• If we use X to represent the number of people cured in the sample, this means that we can use X as our test
statistic. There are 15 people in the sample, and the probability of success according to the drug company is
0.9. As X follows a binomial distribution, this means that the test statistic is actually:
X ~ B(15, 0.9)

• We choose the test statistic according to H0, the null hypothesis.
• We need to test whether there is sufficient evidence against the null hypothesis, and we do this by first
assuming that H0 is true. We then look for evidence that contradicts H0. For the SnoreCull hypothesis test, we
assume that the probability of success is 0.9 unless there is strong evidence against this being true.
• To do this, we look at how likely it is for us to get the results we did, assuming the probability of success is 0.9.
In other words, we take the results of the sample and examine the probability of getting that result. We do this
by finding a critical region.

STEP 3: DETERMINE THE CRITICAL REGION
• The critical region of a hypothesis test is the set of values that present the most extreme evidence against the
null hypothesis.
• Let’s see how this works by taking another look at the doctor’s sample. If 90% or more people had been
cured, this would have been in line with the claims made by the drug company. As the number of people
cured decreases, it becomes more unlikely that the claims of the drug company are true.
• Here’s the probability distribution:

AT WHAT POINT CAN WE REJECT THE DRUG COMPANY CLAIMS?
• What we need is some way of indicating at what point we can reasonably reject the null hypothesis, and we can
do this by specifying a critical region. If the number of snorers cured falls within the critical region, then we’ll
say there is sufficient evidence to reject the null hypothesis. If the number of snorers cured falls outside the
critical region, then we’ll accept that there isn’t sufficient evidence to reject the null hypothesis, and we’ll accept
the claims of the drug company. We’ll call the cut off point for the critical region c, the critical value.
• So how do we choose the critical region?

TO FIND THE CRITICAL REGION, FIRST DECIDE ON THE SIGNIFICANCE LEVEL
• Before we can find the critical region of the hypothesis test, we first need to decide on the significance level.
The significance level of a test is a measure of how unlikely you want the results of the sample to be before you
reject the null hypothesis H0. Just like the confidence level for a confidence interval, the significance level is
given as a percentage.
• As an example, suppose we want to test the claims of the drug company at a 5% level of significance. This
means that we choose the critical region so that the probability of fewer than c snorers being cured is less than
0.05. It’s the lowest 5% of the probability distribution.
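
As a rough illustration of how the critical value c could be found for the SnoreCull test, the sketch below searches for the largest c with P(X < c) < 0.05 under X ~ B(15, 0.9). The helper names `binomial_cdf` and `lower_critical_value` are hypothetical, and the strict "less than" comparison follows the wording of the bullet above.

```python
from math import comb

def binomial_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ B(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def lower_critical_value(n: int, p: float, alpha: float) -> int:
    """Largest c such that P(X < c) < alpha; the critical region is X < c."""
    c = 0
    # P(X < c + 1) equals P(X <= c), so keep raising c while that stays below alpha.
    while c <= n and binomial_cdf(c, n, p) < alpha:
        c += 1
    return c

c = lower_critical_value(15, 0.9, 0.05)
print(f"critical value c = {c}")
print(f"P(X < {c}) = {binomial_cdf(c - 1, 15, 0.9):.4f}")
```

Because the binomial distribution is discrete, the probability of the critical region usually comes out below 5% rather than exactly equal to it.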

• The significance level is normally represented by the Greek letter 𝛼. The lower 𝛼 is, the more unlikely the
results in your sample need to be before we reject H0.
So what significance level should we use?
• Let’s use a significance level of 5% in our hypothesis test. This means that if the number of snorers cured in
the sample falls in the lowest 5% of the probability distribution, then we will reject the claims of the drug company.
If the number of snorers cured lies in the top 95% of the probability distribution, then we’ll decide there isn’t
enough evidence to reject the null hypothesis, and accept the claims of the drug company. If we use X to
represent the number of snorers cured, then we define the critical region as being values such that
P(X < c) < 0.05

• When you’re constructing a critical region for your test, another thing you need to be aware of is whether
you’re conducting a one-tailed or two-tailed test. Let’s look at the difference between the two, and what impact
this has on the critical region.
• A one-tailed test is where the critical region falls at one end of the possible set of values in your test. You
choose the level of the test—represented by 𝛼—and then make sure that the critical region reflects this as a
corresponding probability.
• The tail can be at either end of the set of possible values, and the end you use depends on your alternate
hypothesis H1.
• If your alternate hypothesis includes a < sign, then use the lower tail, where the critical region is at the lower
end of the data.
• If your alternate hypothesis includes a > sign, then use the upper tail, where the critical region is at the upper
end of the data.
• We’re using a one-tailed test for the SnoreCull hypothesis test with the critical region in the lower tail,
as our alternate hypothesis is that p < 0.9.

TWO-TAILED TESTS
• A two-tailed test is where the critical region is split over both ends of the set of values. You choose the level of
the test 𝛼, and then make sure that the overall critical region reflects this as a corresponding probability by
splitting it into two. Both ends contain 𝛼/2, so that the total is 𝛼.
• You can tell if you need to use a two-tailed test by looking at the alternate hypothesis H1. If H1 contains a ≠
sign, then you need to use a two-tailed test as you are looking for some change in the parameter, rather than an
increase or decrease.
• We would have used a two-tailed test for SnoreCull if our alternate hypothesis had been p ≠ 0.9. We
would have had to check whether significantly more or significantly fewer than 90% of patients had been cured.
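
Here is a hedged sketch of how a two-tailed critical region could be built for X ~ B(15, 0.9) at 𝛼 = 0.05, putting less than 𝛼/2 in each tail. The function `two_tailed_region` and its return convention (a bound of `None` when a tail cannot be used) are assumptions made for this example only.

```python
from math import comb

def pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ B(n, p)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

def two_tailed_region(n: int, p: float, alpha: float):
    """Return (lower, upper): reject H0 when X <= lower or X >= upper.

    Each tail holds probability strictly less than alpha / 2.
    A bound is None when even the single most extreme value is too likely.
    """
    half = alpha / 2
    lower, cum = None, 0.0
    for k in range(n + 1):            # accumulate from the bottom up
        cum += pmf(k, n, p)
        if cum >= half:
            break
        lower = k
    upper, cum = None, 0.0
    for k in range(n, -1, -1):        # accumulate from the top down
        cum += pmf(k, n, p)
        if cum >= half:
            break
        upper = k
    return lower, upper

lower, upper = two_tailed_region(15, 0.9, 0.05)
print(f"lower tail: X <= {lower}, upper tail: X >= {upper}")
```

With these particular numbers the upper bound comes back as `None`: even all 15 patients being cured has probability of about 0.21 under p = 0.9, so no result in the upper tail is rare enough to fall in the critical region.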

STEP 4: FIND THE P-VALUE
• Now that we’ve looked at critical regions, we can move on to step 4, finding the p-value.
• A p-value is the probability of getting a value up to and including the one in your sample in the direction of
your critical region. It’s a way of taking your sample and working out whether the result falls within the critical
region for your hypothesis test. In other words, we use the p-value to say whether or not we can reject the null
hypothesis.
How do we find the p-value?
• How we find the p-value depends on our critical region and our test statistic. For the SnoreCull test, 11 people
were cured, and our critical region is the lower tail of the distribution. This means that our p-value is P(X ≤ 11),
where X is the number of people cured in the sample.
• As the significance level of our test is 5%, this means that if P(X ≤ 11) is less than 0.05, then the value 11 falls
within the critical region, and we can reject the null hypothesis.

• We know from step 2 that X ~ B(15, 0.9). What’s P(X ≤ 11)?
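
One way to work this out by hand is through the complement, since only four terms are needed. The figures below are rounded to four decimal places, so the last digit may differ slightly from other roundings.

```latex
\begin{aligned}
P(X \le 11) &= 1 - P(X \ge 12) \\
            &= 1 - \sum_{k=12}^{15} \binom{15}{k} (0.9)^{k} (0.1)^{15-k} \\
            &\approx 1 - (0.1285 + 0.2669 + 0.3432 + 0.2059) \\
            &\approx 1 - 0.9444 \\
            &\approx 0.0556
\end{aligned}
```

The unrounded value is about 0.05556, which is 0.0555 when truncated to four decimal places.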

• A p-value is the probability of getting the results in the sample, or something more extreme, in the
direction of the critical region.
• In our hypothesis test for SnoreCull, the critical region is the lower tail of the probability distribution. In order
to see whether 11 people being cured of snoring is in the critical region, we calculated P(X ≤ 11), as this is the
probability of getting a result at least as extreme as the results of our sample in the direction of the lower tail.
• Had our critical region been the upper tail of the probability distribution instead, we would have needed to
find P(X ≥ 11). We would have counted more extreme results as being greater than 11, as these would have
been closer to the critical region.
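
The effect of the tail direction can be sketched in a few lines of Python. The function `p_value` and its `tail` argument are hypothetical names chosen for this illustration; it simply sums binomial probabilities in the direction of the critical region.

```python
from math import comb

def p_value(x: int, n: int, p: float, tail: str) -> float:
    """Probability of a result at least as extreme as x under X ~ B(n, p).

    tail="lower" gives P(X <= x); tail="upper" gives P(X >= x).
    """
    if tail == "lower":
        ks = range(0, x + 1)
    elif tail == "upper":
        ks = range(x, n + 1)
    else:
        raise ValueError("tail must be 'lower' or 'upper'")
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in ks)

lower_p = p_value(11, 15, 0.9, "lower")   # critical region in the lower tail
upper_p = p_value(11, 15, 0.9, "upper")   # critical region in the upper tail
print(f"P(X <= 11) = {lower_p:.4f}")
print(f"P(X >= 11) = {upper_p:.4f}")
```

The same sample result gives very different p-values depending on the tail, which is why the direction has to follow from the alternate hypothesis H1.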

STEP 5: IS THE SAMPLE RESULT IN THE CRITICAL REGION?
• Now that we’ve found the p-value, we can use it to see whether the result from our sample falls within the
critical region. If it does, then we’ll have sufficient evidence to reject the claims of the drug company.
• Our critical region is the lower tail of the probability distribution, and we’re using a significance level of 5%.
This means that we can reject the null hypothesis if our p-value is less than 0.05. As our p-value is 0.0555, this
means that the number of people cured by SnoreCull in the sample doesn’t fall within the critical region.
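
Comparing the p-value with 0.05 is the same thing as asking whether the observed count lies in the critical region. The sketch below, with constant and function names of our own choosing, lists every outcome whose lower-tail p-value is below 5% and then checks the doctor's result against that list.

```python
from math import comb

N, P0, ALPHA = 15, 0.9, 0.05   # sample size, claimed cure rate, significance level

def lower_tail_p(x: int) -> float:
    """P(X <= x) for X ~ B(N, P0), the p-value for a lower-tail test."""
    return sum(comb(N, k) * P0**k * (1 - P0)**(N - k) for k in range(x + 1))

# Every outcome whose p-value is below ALPHA belongs to the critical region.
critical_region = [x for x in range(N + 1) if lower_tail_p(x) < ALPHA]

observed = 11
reject_h0 = observed in critical_region
print(f"critical region: X <= {max(critical_region)}")
print(f"p-value for X = {observed}: {lower_tail_p(observed):.4f}")
print(f"reject H0: {reject_h0}")
```

The doctor's result of 11 sits just outside this region: 10 or fewer cures would have been enough to reject H0 at the 5% level.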

STEP 6: MAKE YOUR DECISION
• We’ve now reached the final step of the hypothesis test. We can decide whether to accept the null hypothesis,
or reject it in favor of the alternative.
• The p-value of the hypothesis test falls just outside the critical region of the test. This means that there isn’t
sufficient evidence to reject the null hypothesis. In other words:
We accept the claims of the drug company.
