3. Chi-Square Test
• can be used to determine whether two categorical classifications
are dependent or independent.
• can also be used to compare theoretical (expected) populations
with actual (observed) data when the data are grouped into categories.
• is applicable to a large number of problems
4. Chi-Square Test
• (i) test the goodness of fit
• (ii) test the significance of association between two attributes,
• (iii) test the homogeneity or the significance of population variance
5. Chi-Square Test
• The null hypothesis states that there is no relationship
between the two variables
• The research hypothesis states that there is a relationship
between the two variables.
6. Chi-Square Test
When to ACCEPT or REJECT the null hypothesis?
1. If the calculated χ² value (which is never negative) is greater than
the critical value, we have sufficient evidence to reject the null
hypothesis and accept the alternative hypothesis.
2. If the calculated χ² value is lower than or equal to the critical
value, we accept (more precisely, fail to reject) the null hypothesis.
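The decision rule can be sketched in a few lines of Python. This is only an illustration: the function name `chi_square_decision` is hypothetical, and the critical value must still be looked up in a χ² table for the chosen significance level and degrees of freedom.

```python
def chi_square_decision(calculated: float, critical: float) -> str:
    """Apply the slide's rule: reject H0 only if calculated > critical."""
    if calculated < 0:
        # A chi-square statistic is a sum of squares, so it cannot be negative.
        raise ValueError("a chi-square statistic cannot be negative")
    return "reject H0" if calculated > critical else "accept H0"


# Values quoted later in these slides: calculated 13.999, table value 16.92.
print(chi_square_decision(13.999, 16.92))
```

A calculated value exactly equal to the critical value falls under "accept H0", matching rule 2.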
7. Chi-Square Test for Comparing
Variance
The chi-square value is often used to judge the significance of a population variance, i.e.,
we can use the test to judge whether a random sample has been drawn from a normal
population with mean (µ) and a specified variance (σp²).
8. Chi-Square Test for Comparing
Variance
The test is based on the χ²-distribution. We encounter such a distribution when we
deal with collections of values that involve adding up squares. Sample variances
require us to add a collection of squared quantities and thus have
distributions that are related to the χ²-distribution.
9. Chi-Square Test for Comparing
Variance
If we take each one of a collection of sample variances, divide it by the known
population variance, and multiply the quotient by (n – 1), where n is the
number of items in the sample, we obtain a χ²-distribution.
Thus, (σs²/σp²)(n – 1) has the same distribution as the χ²-distribution
with (n – 1) degrees of freedom, where σs² is the sample variance and σp² is
the population variance.
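Written out, and assuming the sample variance is computed with the (n – 1) divisor (as in the worked example later in these slides), the statistic reduces to the sum of squared deviations divided by the specified population variance:

```latex
\sigma_s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n - 1},
\qquad
\chi^2 = \frac{\sigma_s^2}{\sigma_p^2}\,(n - 1)
       = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{\sigma_p^2}
```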
10. Chi-Square Test for Comparing
Variance
The χ²-distribution is not symmetrical and all
its values are positive. To use this
distribution, one must know the
degrees of freedom, since different
degrees of freedom give different curves.
The smaller the number of degrees of
freedom, the more skewed the distribution.
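The effect of the degrees of freedom on skewness can be checked numerically. The sketch below relies on the standard closed-form skewness of a χ²-distribution, √(8/k) for k degrees of freedom. That formula is not given in the slides, and the function name `chi_square_skewness` is hypothetical.

```python
import math


def chi_square_skewness(df: int) -> float:
    """Skewness of a chi-square distribution with df degrees of freedom."""
    if df < 1:
        raise ValueError("degrees of freedom must be at least 1")
    return math.sqrt(8 / df)


# Skewness shrinks as the degrees of freedom grow.
for df in (1, 5, 9, 30):
    print(f"d.f. = {df:2d}  skewness = {chi_square_skewness(df):.3f}")
```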
12. Example 1
The weights of 10 students are as follows:
Student No.    1   2   3   4   5   6   7   8   9   10
Weight (kg)    38  40  45  53  47  43  55  48  52  49
Can we say that the variance of the distribution of weights of all students from which the
above sample of 10 students was drawn is equal to 20 kg²? Test this at the 5% and 1% levels of
significance.
13. Solution
First of all, we should work out the variance of the sample data.
Student No.   Xi (weight in kg)   (Xi – X̄)   (Xi – X̄)²
1             38                  -9         81
2             40                  -7         49
3             45                  -2         4
4             53                  +6         36
5             47                  0          0
6             43                  -4         16
7             55                  +8         64
8             48                  +1         1
9             52                  +5         25
10            49                  +2         4
n = 10, ∑Xi = 470, X̄ = 470/10 = 47, ∑(Xi – X̄)² = 280
15. Solution
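Let the null hypothesis be H0: the population variance is σp² = 20 kg², i.e., the sample
variance does not differ significantly from it. To test this hypothesis, we work out the
χ² value as under:
σs² = ∑(Xi – X̄)²/(n – 1) = 280/9 = 31.11
χ² = (σs²/σp²)(n – 1) = (31.11/20)(10 – 1) = 13.999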
17. Solution
The degrees of freedom in the given case are (n – 1) = (10 – 1) = 9. At the 5% level of
significance the table value of χ² is 16.92 and at the 1% level of significance it is 21.67
for 9 d.f.; both of these values are greater than the calculated value of χ², which is
13.999. Hence, we accept the null hypothesis and conclude that the variance of the
given distribution can be taken as 20 kg² at the 5% as well as the 1% level of significance. In
other words, the sample can be said to have been taken from a population with
variance 20 kg².
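The arithmetic of this example can be reproduced with a short Python sketch. The variable names are illustrative, and the critical values are simply the table values quoted above rather than values computed by the code.

```python
weights = [38, 40, 45, 53, 47, 43, 55, 48, 52, 49]  # kg, from Example 1
sigma_p_sq = 20  # hypothesised population variance

n = len(weights)
mean = sum(weights) / n
sum_sq_dev = sum((x - mean) ** 2 for x in weights)
sample_var = sum_sq_dev / (n - 1)

# (sample_var / sigma_p_sq) * (n - 1) simplifies to sum_sq_dev / sigma_p_sq.
chi2 = sum_sq_dev / sigma_p_sq

table_values = {"5%": 16.92, "1%": 21.67}  # for 9 d.f., as quoted in the slides
for level, table_value in table_values.items():
    verdict = "reject H0" if chi2 > table_value else "accept H0"
    print(f"{level} level: chi-square = {chi2:.3f}, table = {table_value}: {verdict}")
```

Without intermediate rounding the statistic is 14; the 13.999 in the solution comes from rounding the sample variance to 31.11 first.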
19. Example 2
A sample of 10 is drawn randomly from a certain population. The sum of the
squared deviations from the mean of the given sample is 50. Test the
hypothesis that the variance of the population is 5 at the 5% level of significance.
21. Solution
Let the null hypothesis be H0: the population variance is σp² = 5.
To test this hypothesis, we work out the χ² value as under:
χ² = (σs²/σp²)(n – 1) = ∑(Xi – X̄)²/σp² = 50/5 = 10
Degrees of freedom = (10 – 1) = 9
The table value of χ² at the 5% level for 9 d.f. is 16.92. The calculated value of χ² is
less than this table value, so we accept the null hypothesis and conclude that
the variance of the population is 5, as given in the question.
22. Chi-Square as a Non-Parametric
Test
The following conditions should be satisfied before the χ² test can be applied:
(i) Observations recorded and used are collected on a random basis.
(ii) All the items in the sample must be independent.
(iii) No group should contain very few items, say fewer than 10. Where
the frequencies are less than 10, regrouping is done by combining the
frequencies of adjoining groups so that the new frequencies become
greater than 10. Some statisticians take this number as 5, but 10 is
regarded as better by most statisticians.
23. Chi-Square as a Non-Parametric
Test
The following conditions should be satisfied before the χ² test can be applied:
(iv) The overall number of items must also be reasonably large. It should normally
be at least 50, however small the number of groups may be.
(v) The constraints must be linear. Constraints which involve linear equations in
the cell frequencies of a contingency table (i.e., equations containing no squares
or higher powers of the frequencies) are known as linear constraints.
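The regrouping described in condition (iii) can be sketched as follows. This is one possible greedy strategy, not a prescribed procedure: the helper name `merge_small_groups`, the left-to-right pooling order, and the sample frequencies are all made up for illustration.

```python
def merge_small_groups(frequencies, min_count=10):
    """Pool adjoining groups, left to right, until each pooled group
    holds at least min_count items."""
    pooled = []
    running = 0
    for f in frequencies:
        running += f
        if running >= min_count:
            pooled.append(running)
            running = 0
    # A leftover tail that is still too small joins the last pooled group.
    if running > 0:
        if pooled:
            pooled[-1] += running
        else:
            pooled.append(running)
    return pooled


# Hypothetical frequencies: the first two groups and the last group are too small.
print(merge_small_groups([3, 8, 25, 30, 12, 4]))
```

Whatever grouping is chosen, it has to be applied identically to the observed and the expected frequencies, and the degrees of freedom are then based on the number of groups left after pooling.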
24. Example 1
A die is thrown 132 times with the following results:
Number turned up   1   2   3   4   5   6
Frequency          16  20  25  14  29  28
Is the die unbiased?
25. Solution
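Let us take the hypothesis that the die is unbiased. If that is so, the probability of
obtaining any one of the six numbers is 1/6, and as such the expected frequency of
any one number coming upward is 132 × 1/6 = 22. Now we can write the observed
frequencies along with the expected frequencies and work out the value of χ² as follows:
Number   Oi   Ei   (Oi – Ei)   (Oi – Ei)²   (Oi – Ei)²/Ei
1        16   22   -6          36           36/22
2        20   22   -2          4            4/22
3        25   22   +3          9            9/22
4        14   22   -8          64           64/22
5        29   22   +7          49           49/22
6        28   22   +6          36           36/22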
26. Solution
∑[(Oi – Ei)²/Ei] = 198/22 = 9. Hence, the calculated value of χ² = 9.
The degrees of freedom in the given problem are (n – 1) = (6 – 1) = 5, where n is
the number of categories.
The table value of χ² for 5 degrees of freedom at the 5% level of significance is
11.071.
Comparing the calculated and table values of χ², we find that the calculated value is
less than the table value, and as such the differences could have arisen due to
fluctuations of sampling. The result thus supports the hypothesis, and it can be
concluded that the die is unbiased.
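The same goodness-of-fit calculation can be sketched in Python. The variable names are illustrative, and the table value 11.071 is taken from the solution above rather than computed.

```python
observed = [16, 20, 25, 14, 29, 28]  # frequencies for faces 1..6
n_throws = sum(observed)

# Under the hypothesis of an unbiased die, every face is expected equally often.
expected = [n_throws / len(observed)] * len(observed)

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1
table_value = 11.071  # 5% level, 5 d.f., as quoted in the slides

print(f"chi-square = {chi2:.3f}, d.f. = {df}")
print("accept H0: die unbiased" if chi2 <= table_value else "reject H0: die biased")
```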
27. Activity
Find the value of χ² for the following information:
Class                              A   B   C   D   E
Observed frequency                 8   29  44  15  4
Theoretical (expected) frequency   7   24  38  24  7
28. Activity
The table given below shows the data obtained during an outbreak of smallpox:
                 Attacked   Not Attacked   Total
Vaccinated       31         469            500
Not Vaccinated   185        1315           1500
Total            216        1784           2000
Test the effectiveness of vaccination in preventing an attack of smallpox.
Test your result with the help of χ² at the 5% level of significance.