Hypothesis testing - college hypothesis testing

Hypothesis Testing
Same idea, different angle

In brief
Interval Estimation
• You collect sample(s)
• Make a guess about the range of
population parameters – lowest
and highest values
• μ ∈ [μL, μH]. What are μL and μH?
• μ1 – μ2 ∈ [a, b]. What are a and b?
• Talk about range of μ
Hypothesis Testing
• You collect sample(s)
• You want to make (more) precise
statements about the parameter
values
• Is μ > 50?
• Is μ1 > μ2?
• Talk about probabilities of certain μ

Statistical Hypothesis - I
• An assertion or conjecture
• About the distribution of one or more random variables
• μ = 7, μ 9,
• If a hypothesis, when true, completely specifies the distribution,
it is a simple hypothesis. (μ = 7)
• If not, composite hypothesis (μ 9)

Statistical Hypothesis - II
• Importance of alternative hypothesis
• X ~ N(μ, 1). Sample.
H0: μ = 50, HA: μ ≠ 50
• X ~ N(μ, 1). Sample.
H0: , HA: μ < 50

Null and Alternative Hypotheses
H0 | HA
Assumption or status quo, nothing new | Rejection of an assumption
Assumed to be true or given | Rejection of an assumption or given
Negation of the research question | Research question needs to be proven
Always contains an equality | Does not contain an equality

Null and Alternative Statements
• All statistical statements are made in relation to the null hypothesis.
• As researchers, we either reject the null hypothesis or fail to reject
the null hypothesis. We do not accept the null hypothesis.
• This is because the null is assumed to be true from the start.
• If we reject the null hypothesis, we conclude the data supports
the alternative hypothesis.
• However, if we fail to reject the null, that does not prove the null
is “true”.
• We only set up an assumption to either reject or fail to reject.

Example
• During the 2010-11 English Premier League season, Manchester United
home matches had an average attendance of 74,961. A club
marketing analyst would like to see if attendance decreased during
the most recent season. Establish a null and alternative hypothesis for
this analysis.
• What is our assumption?
• We can only assume that the attendance remained the same.
• Marketing analyst: interested in knowing if the attendance decreased.
• Which hypothesis format should we choose?
• I would choose: H0: μ = 74,961 and Ha: μ < 74,961

Thinking about hypotheses
• When formulating a statistical hypothesis:
• Ask: am I testing an assumption, or the status quo, that already
exists? Or am I testing a claim or assertion beyond what I already
know or can know?
• The null and the alternative are ALWAYS in opposition to each
other; cannot both be true.

Significance levels
• Consider a population with distribution F_θ, where the parameter θ is unknown.
• We want to test a hypothesis about θ.
• Suppose F_θ is a normal distribution with mean θ and variance = 1.
• H0: θ = 1 (simple hypothesis) OR a one-sided H0 such as θ ≤ 1 (composite)
• To test this hypothesis, we observe a sample and, based on it, we
have to decide whether or not to reject H0.
• We define a region, the “critical region”, with the proviso that the
hypothesis is to be rejected if the observed sample falls in the critical region.

Significance levels and errors
• In our example, variance = 1, so sd = 1.
• SE of the mean = 1/√n
• 95% CI => sample average ± 1.96/√n
• Reject the null (mean = 1) when the sample average differs from 1 by more
than 1.96 divided by the square root of the sample size.
• Type I error: rejecting the null when it is true.
• Type II error: failing to reject the null when it is false.
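
The rejection rule above can be sketched in a few lines of Python. This is only an illustration, assuming a known standard deviation of 1 and a null mean of 1 as in the example; the function name reject_null and both samples are hypothetical.

```python
import math

def reject_null(sample, null_mean=1.0, sd=1.0, z_crit=1.96):
    """Two-sided z-test at the 5% level: reject H0 when the sample mean
    is more than z_crit standard errors away from null_mean."""
    n = len(sample)
    sample_mean = sum(sample) / n
    standard_error = sd / math.sqrt(n)
    return abs(sample_mean - null_mean) > z_crit * standard_error

# Hypothetical data, for illustration only.
close_to_null = [0.8, 1.3, 1.1, 0.9, 1.2, 0.7, 1.0, 1.4, 0.6]  # averages about 1.0
far_from_null = [2.1, 1.8, 2.4, 1.9, 2.2, 2.0, 1.7, 2.3, 2.5]  # averages about 2.1

print(reject_null(close_to_null))
print(reject_null(far_from_null))
```

With nine observations the cutoff is 1.96/3, roughly 0.65, so only a sample whose average is further than that from 1 leads to rejection.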

Significance levels
• We are not determining whether H0 is “true”, but only whether its validity is
consistent with the resultant data.
• Thus, H0 is rejected if the resultant data are unlikely when H0 is
true.
• Specify a value α, and then require the test to have the property that
whenever H0 is true, its probability of being rejected is never
greater than α.
• The value α is the level of significance of the test.
• It is usually set in advance, common values: 0.1; 0.05; 0.01
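
One way to see what the level of significance means is to simulate many samples for which H0 really is true and count how often the test rejects anyway. The sketch below assumes a two-sided z-test with known sd = 1 and the 1.96 cutoff (α = 0.05); the function name, sample size, trial count and seed are arbitrary illustrative choices.

```python
import math
import random

def type_i_error_rate(n=25, trials=20000, null_mean=1.0, z_crit=1.96, seed=0):
    """Draw samples for which H0 is true and return the fraction of
    samples on which the two-sided z-test (known sd = 1) rejects H0."""
    rng = random.Random(seed)
    cutoff = z_crit / math.sqrt(n)
    rejections = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(null_mean, 1.0) for _ in range(n)) / n
        if abs(sample_mean - null_mean) > cutoff:
            rejections += 1
    return rejections / trials

rate = type_i_error_rate()
print(f"Estimated Type I error rate: {rate:.3f}")
```

The estimated rate should land close to 0.05, which is the sense in which the probability of rejecting a true H0 is held at α.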

Basic Method
• Suppose H0 states that an unknown parameter θ lies in some specified region.
• One way to develop a test of H0 at level of significance α is to start by
determining a point estimate of θ, say d(X).
• The hypothesis is rejected if d(X) is “far away” from that region.
• To determine how “far away” it needs to be for us to reject H0, we
need to determine the probability distribution of d(X) when H0 is true.
• This will give us the critical region that makes the test have the
required significance level α.

Die example
• 600 rolls of the die
• H0 : die is fair
• Ha : die is NOT fair
• In plain English: is the variation in outcomes due to chance, or is
the variation beyond what random chance would allow?
• How much should our data vary for us to conclude that our die is
not fair, i.e. for us to reject the null?
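
One common way to answer this question is a chi-square goodness-of-fit test. The slides do not name a specific test, so treat this as an assumed choice. The sketch compares observed face counts with the 100 per face expected from a fair die over 600 rolls, using 11.070 as the 5% critical value for 5 degrees of freedom. Both sets of counts are hypothetical.

```python
def chi_square_statistic(observed):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(observed) / len(observed)
    return sum((count - expected) ** 2 / expected for count in observed)

# Upper 5% point of the chi-square distribution with 6 - 1 = 5 degrees of freedom.
CRITICAL_VALUE_5PCT = 11.070

def die_looks_unfair(observed):
    """Reject H0 (die is fair) when the statistic exceeds the critical value."""
    return chi_square_statistic(observed) > CRITICAL_VALUE_5PCT

# Hypothetical tallies for faces 1..6; each list sums to 600 rolls.
ordinary_counts = [95, 108, 92, 110, 97, 98]
lopsided_counts = [70, 95, 100, 105, 100, 130]

for counts in (ordinary_counts, lopsided_counts):
    stat = chi_square_statistic(counts)
    print(f"chi-square = {stat:.2f}, reject H0: {die_looks_unfair(counts)}")
```

Small deviations from 100 per face give a small statistic and are attributed to chance; only deviations large enough to push the statistic past the critical value lead us to reject the null.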

Errors Possible
To test H0, set α, and then require that the probability of a Type I error
never be greater than α.

To be more precise, this is what we mean by α and β:
α = P(reject H0 | H0 is true)
β = P(fail to reject H0 | H0 is false)

Critical Value
• If we want the test to have significance level α, then we must
determine the critical value c that makes the probability of a Type I error equal to α.
• We can determine whether or not to reject the null hypothesis
by computing, first, the value of the test statistic, √n |x̄ − μ0| / σ
(x̄: sample mean, μ0: value of the mean under H0, σ: known sd),
• And second, the probability that a unit normal would (in
absolute value) exceed that quantity.
• This probability is called the p-value of the test; it gives the critical
significance level.
• Relationship between α and p: reject the null if p-value < α
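
The two steps above can be sketched with Python's standard library. This is a minimal illustration, assuming a two-sided z-test with known standard deviation; the function name and the input numbers are hypothetical.

```python
import math
from statistics import NormalDist

def two_sided_p_value(sample_mean, n, null_mean, sd):
    """Probability that a unit normal exceeds, in absolute value,
    the observed test statistic sqrt(n) * |sample_mean - null_mean| / sd."""
    z = math.sqrt(n) * abs(sample_mean - null_mean) / sd
    return 2.0 * (1.0 - NormalDist().cdf(z))

# Hypothetical numbers: 25 observations averaging 1.5, null mean 1, sd 1.
p = two_sided_p_value(sample_mean=1.5, n=25, null_mean=1.0, sd=1.0)
alpha = 0.05
print(f"p-value = {p:.4f}; reject H0: {p < alpha}")
```

Here the test statistic is 2.5, which gives a p-value of about 0.012, so H0 is rejected at α = 0.05 but not at α = 0.01.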

Type I and Type II errors again
• α: probability of committing a Type I error.
• β: probability of committing a Type II error.
• As α decreases (confidence level increases), the probability of a Type I error
decreases.
• As α decreases, the probability of a Type II error increases.
• Delicate balance!
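
The trade-off can be made concrete by computing the Type II error probability for several significance levels. The sketch below assumes a two-sided z-test with known sd, and a hypothetical true mean of 1.5 when H0 says the mean is 1. The function name and all the numbers are illustrative assumptions.

```python
import math
from statistics import NormalDist

def type_ii_error_prob(alpha, n, null_mean, true_mean, sd=1.0):
    """Probability of failing to reject H0 when the mean is really true_mean."""
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1.0 - alpha / 2.0)
    shift = math.sqrt(n) * (true_mean - null_mean) / sd
    # H0 is retained when the z statistic lands inside [-z_crit, z_crit].
    return std_normal.cdf(z_crit - shift) - std_normal.cdf(-z_crit - shift)

for alpha in (0.10, 0.05, 0.01):
    beta = type_ii_error_prob(alpha, n=25, null_mean=1.0, true_mean=1.5)
    print(f"alpha = {alpha:.2f} -> beta = {beta:.3f}")
```

With the sample size held fixed, each step down in the significance level pushes the Type II error probability up.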

Central Idea
Type I Error: reject H0 when it is correct.
Type II Error: fail to reject H0 when it is false.
Which is in your control or smaller?

Type I and Type II errors
What is the null hypothesis here?

t-Test
• When mean and std. dev. are both unknown
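
The slide gives no formula, so the following is only a sketch of the usual one-sample t statistic, in which the unknown standard deviation is replaced by the sample standard deviation. The function name and the data are hypothetical.

```python
import math
import statistics

def t_statistic(sample, null_mean):
    """One-sample t statistic and its degrees of freedom."""
    n = len(sample)
    sample_mean = statistics.mean(sample)
    sample_sd = statistics.stdev(sample)  # sample sd, divides by n - 1
    t = (sample_mean - null_mean) / (sample_sd / math.sqrt(n))
    return t, n - 1

# Hypothetical measurements, tested against a null mean of 5.0.
data = [5.1, 4.9, 5.3, 5.0, 5.2]
t, df = t_statistic(data, null_mean=5.0)
print(f"t = {t:.3f} with {df} degrees of freedom")
```

The statistic is then compared with a critical value from the t distribution with n - 1 degrees of freedom rather than from the unit normal.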

F TEST
• 5 oz. cans of tuna are filled by two machines. The quality assurance manager wishes to compare
the variability of the two machines.
• Machine 1: n=25; mean: 5.0492
• Machine 2: n=22; mean: 4.9808
• Variance 1: 0.1130
• Variance 2: 0.0137 oz.
• Question: is this difference due to sampling error or is it statistically significant?
• Use F test to compare variances.
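
As a rough sketch, the F statistic for this example is the ratio of the two sample variances, with the larger one in the numerator. The function name is hypothetical, and looking up the critical value is left out.

```python
def f_statistic(var_a, n_a, var_b, n_b):
    """Ratio of sample variances (larger over smaller) together with the
    numerator and denominator degrees of freedom."""
    if var_a >= var_b:
        return var_a / var_b, n_a - 1, n_b - 1
    return var_b / var_a, n_b - 1, n_a - 1

# Values from the slide: machine 1 (n = 25) and machine 2 (n = 22).
f_value, df_num, df_den = f_statistic(0.1130, 25, 0.0137, 22)
print(f"F = {f_value:.3f} with ({df_num}, {df_den}) degrees of freedom")
```

The ratio comes out at about 8.25 with (24, 21) degrees of freedom. It would then be compared with a critical value from an F table at the chosen significance level.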

REGRESSION- ECOTRIX
• In this course we have learnt to measure effects. Students understand more statistics concepts in evening classes
than in morning classes. But does “evening” open up students’ brains? Does “evening” or “moonlight” improve
students’ comprehension skills?
• Causal Effects – Next leap in data comprehension!
• Regress Wage Education:
– Does education affect wage?
– Wage_i = α + β * Education_i + ε_i
• Regress Wage Education Gender:
– For a given gender, does education affect wage?
– For a given education, does gender affect wage?
– Wage_i = α + β1 * Education_i + β2 * Gender_i + ε_i
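
For the first model, with wage regressed on education only, the least-squares estimates of α and β can be sketched by hand. The function name and the five data points are hypothetical and only illustrate the arithmetic.

```python
def fit_simple_regression(x, y):
    """Least-squares intercept and slope for y = alpha + beta * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    s_xx = sum((xi - mean_x) ** 2 for xi in x)
    beta = s_xy / s_xx
    alpha = mean_y - beta * mean_x
    return alpha, beta

# Hypothetical data: years of education and hourly wage.
education = [10, 12, 14, 16, 18]
wage = [21, 23, 29, 31, 36]

alpha, beta = fit_simple_regression(education, wage)
print(f"Wage = {alpha:.2f} + {beta:.2f} * Education")
```

Here the fitted slope of 1.9 only says that wage and education move together in this made-up sample. Whether education causes higher wages is the separate question the slide raises.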
