CHI-SQUARE TEST
RICLYN D. MOSCOSA
Introduction
Chi-square is one of the most commonly used non-parametric tests
Introduced by Karl Pearson as a test of significance in 1900
Denoted by the Greek symbol χ²
It is a useful measure for comparing experimentally observed results with the results expected theoretically under a hypothesis
The chi-squared distribution with k degrees of freedom is the distribution of a sum of the squares of k independent standard normal random variables
Its shape is determined by the degrees of freedom
It can be applied to categorical or qualitative data using a contingency table
Used to evaluate unpaired/unrelated samples and proportions
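The definition of the chi-squared distribution as a sum of squared standard normal values can be checked with a small simulation. This is only a sketch: the function name, the seed, and the number of draws are assumptions made for illustration. It relies on the known fact that a chi-squared distribution with k degrees of freedom has mean k.

```python
import random


def simulate_chi_square(k, n_draws=20000, seed=0):
    """Draw n_draws values, each the sum of k squared standard normal values."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    return [
        sum(rng.gauss(0.0, 1.0) ** 2 for _ in range(k))
        for _ in range(n_draws)
    ]


# Hypothetical choice of k = 3 degrees of freedom.
draws = simulate_chi_square(k=3)
sample_mean = sum(draws) / len(draws)

# The sample mean should land close to k, and no draw can be negative.
print("sample mean:", round(sample_mean, 2))
print("smallest draw:", round(min(draws), 4))
```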
Chi-Square
It is a mathematical expression that measures the discrepancy between the experimentally obtained result (O) and the theoretically expected result (E) under a given hypothesis
It uses data in the form of frequencies (i.e., the number of occurrences of an event)
Calculated by dividing the squared deviation between the observed and expected frequency in each category by the expected frequency, then summing over all categories
If there is no difference between the expected and observed frequencies, the value of chi-square is zero
If there is a difference, the value of the test statistic will be greater than zero
Differences may be due to sampling fluctuations
FORMULA: CHI SQUARE
χ² = Σ (O − E)² / E, summed over all categories (or cells)
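The calculation described above can be sketched as a short function. The function name and the example frequencies are hypothetical and chosen only to show the two cases mentioned: no difference (statistic of zero) and some difference (statistic above zero).

```python
def chi_square_statistic(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    if any(e <= 0 for e in expected):
        raise ValueError("expected frequencies must be positive")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical frequencies: identical observed and expected values.
no_difference = chi_square_statistic([10, 20, 30], [10, 20, 30])

# Hypothetical frequencies: two categories deviate by 2 each.
some_difference = chi_square_statistic([12, 18, 30], [10, 20, 30])

print(no_difference, some_difference)
```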
Contingency Table
A table in matrix format that displays the joint (multivariate) frequency distribution of the variables
It provides a basic picture of the interrelation between two variables
The size of the table depends on the number of classes of each variable
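As an illustration, a contingency table can be tallied from raw paired observations. This is a minimal sketch; the function name and the survey data are hypothetical and are not taken from the slides.

```python
from collections import Counter


def contingency_table(pairs):
    """Count how often each (row category, column category) pair occurs."""
    counts = Counter(pairs)
    row_labels = sorted({row for row, _ in pairs})
    col_labels = sorted({col for _, col in pairs})
    # Counter returns 0 for combinations that never occur.
    table = [[counts[(row, col)] for col in col_labels] for row in row_labels]
    return row_labels, col_labels, table


# Hypothetical observations of (group, answer).
survey = [
    ("female", "yes"), ("female", "no"), ("male", "yes"),
    ("male", "yes"), ("female", "yes"), ("male", "no"),
]
rows, cols, table = contingency_table(survey)
print(rows, cols, table)
```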
Degrees of Freedom
It is the number of independent pieces of information that are free to vary and that go into the estimate of a parameter
In a contingency table, the degrees of freedom are calculated as df = (R − 1)(C − 1)
where:
R = no. of rows in the table
C = no. of columns in the table
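The degrees-of-freedom rule for a contingency table is simple enough to express directly. In this sketch the function name is hypothetical and the tables are placeholders used only for their shape.

```python
def contingency_df(table):
    """Degrees of freedom (R - 1) * (C - 1) for a rectangular table."""
    n_rows = len(table)
    n_cols = len(table[0])
    if any(len(row) != n_cols for row in table):
        raise ValueError("all rows must have the same number of columns")
    return (n_rows - 1) * (n_cols - 1)


# Placeholder tables: only the shapes matter here.
df_2x2 = contingency_df([[0, 0], [0, 0]])
df_3x4 = contingency_df([[0] * 4 for _ in range(3)])
print(df_2x2, df_3x4)
```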
Chi-Square Distribution
The sampling distribution of the chi-square statistic is not a Normal distribution.
It is a right-skewed distribution that takes only non-negative values because χ² can never be negative
When the expected counts are all at least 5, the sampling distribution of the χ² statistic is close to a chi-square distribution with df equal to the number of categories minus 1.
Characteristics of Chi Square
It is based on frequencies and not on parameters like the mean and standard deviation
Used for testing the difference between the entire set of expected and observed frequencies
Used for testing hypotheses and is not useful for estimation
It is an important non-parametric test as no rigid assumptions are necessary regarding the type of population, no parameter values are needed, and relatively little mathematical detail is involved
Assumptions for the validity of Chi-Square Test
All observations should be independent. No individual item should be included twice.
The total number of observations should be large. The chi-square test should not be used if n < 50
For comparison purposes, the data must be in original units
If a theoretical frequency is less than 5, we pool it with either the preceding or the succeeding frequency, so that the resulting sum is at least 5
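One possible way to pool small theoretical frequencies is a left-to-right merge, sketched below. The greedy rule, the function name, and the example frequencies are assumptions for illustration; the slides do not prescribe a specific pooling procedure. Note that pooling reduces the number of categories, so the degrees of freedom shrink accordingly.

```python
def pool_small_expected(observed, expected, minimum=5):
    """Merge adjacent categories until each pooled expected frequency
    reaches the minimum. A small category is merged with the ones that
    follow it; a small remainder at the end joins the preceding group."""
    pooled_observed, pooled_expected = [], []
    running_o, running_e = 0, 0
    for o, e in zip(observed, expected):
        running_o += o
        running_e += e
        if running_e >= minimum:
            pooled_observed.append(running_o)
            pooled_expected.append(running_e)
            running_o, running_e = 0, 0
    if running_e > 0:  # leftover categories that never reached the minimum
        if pooled_expected:
            pooled_observed[-1] += running_o
            pooled_expected[-1] += running_e
        else:
            pooled_observed.append(running_o)
            pooled_expected.append(running_e)
    return pooled_observed, pooled_expected


# Hypothetical frequencies with three small expected values (3, 3 and 4).
pooled_o, pooled_e = pool_small_expected([12, 3, 2, 20, 1], [10, 3, 3, 18, 4])
print(pooled_o, pooled_e)
```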
Limitations
It does not give us much information about the strength of the relationship. It only conveys the existence or non-existence of a relationship between the variables
It is sensitive to sample size
It is also sensitive to small expected frequencies.
STEPS
1. Identify the problem
2. Make a contingency table and note the observed frequency (O) in each class of one event row-wise, i.e., horizontally, and then the numbers in each group of the other event column-wise, i.e., vertically
3. Set up Ho: according to the null hypothesis, no association exists between the attributes. This also requires setting up Ha
4. Calculate the expected frequencies (E)
5. Find the difference between the observed and expected frequency in each cell (O − E)
6. Calculate the chi-square value by applying the formula. The value ranges from zero to infinity.
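The expected-frequency and chi-square steps above can be sketched for a contingency table as follows. The slides do not spell out how E is obtained; this sketch assumes the standard rule under the null hypothesis of no association, E = (row total × column total) / grand total. The function names and the 2 × 2 table are hypothetical.

```python
def expected_frequencies(table):
    """Expected count per cell under no association:
    row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def chi_square_independence(table):
    """Return the chi-square statistic and its degrees of freedom."""
    expected = expected_frequencies(table)
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df


# Hypothetical observed frequencies for two attributes with two classes each.
observed_table = [[20, 30], [30, 20]]
expected_table = expected_frequencies(observed_table)
statistic, df = chi_square_independence(observed_table)
print(expected_table, statistic, df)
```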
Uses of Chi Square Test
1. Goodness of fit - Measures how much the observed (actual) frequencies differ from the expected (predicted) frequencies.
2. Test of Homogeneity - Used to determine whether frequency counts are distributed identically across different samples
3. Test of Independence - Used to determine whether two categorical variables are associated with each other.
TYPES
Chi-Square Test of Goodness-of-Fit
Number of Variables: One
Purpose of Test: Determines if sample data matches a population
Degrees of Freedom: k − 1
Chi-Square Test of Independence
Number of Variables: Two
Purpose of Test: Compares two sets of data to see if there is a relationship
Degrees of Freedom: (r − 1)(c − 1)
Chi Square Goodness of Fit Test
1. Ho and Ha are established and a significance level is selected for rejection of Ho.
2. A random sample of observations is drawn from a relevant statistical population.
3. A set of expected frequencies is derived under the assumption that Ho is true
4. The observed frequencies are compared with the expected frequencies
5. The calculated value of the chi-square goodness of fit test is compared with the table value. If the calculated value is greater than the table value, we reject the null hypothesis and conclude that there is a significant difference between the observed and the expected frequencies.
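The goodness-of-fit steps above can be sketched in a few lines. The function name and the die-roll counts are hypothetical. The critical values are the standard upper-tail table values of the chi-square distribution at a 0.05 significance level for 1 to 5 degrees of freedom; only those five are included here.

```python
# Standard chi-square table values at alpha = 0.05, keyed by degrees of freedom.
CRITICAL_VALUES_05 = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def goodness_of_fit(observed, hypothesized_proportions):
    """Compare observed counts with the counts expected under Ho."""
    n = sum(observed)
    expected = [n * p for p in hypothesized_proportions]
    statistic = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    df = len(observed) - 1
    critical = CRITICAL_VALUES_05[df]
    reject_null = statistic > critical
    return statistic, df, critical, reject_null


# Hypothetical data: 60 rolls of a die, Ho says each face has probability 1/6.
rolls = [8, 12, 9, 11, 10, 10]
statistic, df, critical, reject_null = goodness_of_fit(rolls, [1 / 6] * 6)
print(round(statistic, 3), df, critical, reject_null)
```

With these made-up counts the calculated value falls well below the table value, so Ho would not be rejected.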
Conditions for Performing a Chi-Square Test for Goodness of Fit
Random: The data come from a well-designed random sample or from a randomized experiment
10%: When sampling without replacement, check that n ≤ (1/10)N.
Large Counts: All expected counts are at least 5
Before we start using the chi-square goodness-of-fit test, we have two important cautions to offer.
• The chi-square test statistic compares observed and expected counts. Don’t try to perform calculations with the observed and expected proportions in each category.
• When checking the Large Counts condition, be sure to examine the expected counts, not the observed counts.
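The two numeric conditions can be checked mechanically, as in the sketch below. The Random condition depends on how the data were collected and cannot be verified from the numbers alone. The function name, the threshold of at least 5, and all example values are assumptions for illustration.

```python
def check_conditions(sample_size, population_size, expected_counts, minimum=5):
    """Check the 10% and Large Counts conditions on the EXPECTED counts."""
    return {
        "ten_percent": sample_size <= population_size / 10,
        "large_counts": all(e >= minimum for e in expected_counts),
    }


# Hypothetical sample of 50 from a population of 2000.
passing = check_conditions(50, 2000, [12.5, 7.5, 30.0])

# Hypothetical case with one expected count below 5 and too large a sample.
failing = check_conditions(50, 300, [3.0, 47.0])

print(passing, failing)
```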
Sample Problem:
Mars, Incorporated makes milk chocolate candies. Here’s what the company’s Consumer Affairs Department says about the color distribution of its M&M’S® Milk Chocolate Candies: On average, the new mix of colors of M&M’S® Milk Chocolate Candies will contain 13 percent of each of browns and reds, 14 percent yellows, 16 percent greens, 20 percent oranges and 24 percent blues.
The one-way table summarizes the data from a sample bag of M&M’S® Milk Chocolate Candies. In general, one-way tables display the distribution of a categorical variable for the individuals in a sample.
Stating the Hypothesis
H0: The company’s stated color distribution for M&M’S® Milk Chocolate Candies is correct.
Ha: The company’s stated color distribution for M&M’S® Milk Chocolate Candies is not correct.
We can also write the hypotheses in symbols as:
H0: p_blue = 0.24, p_orange = 0.20, p_green = 0.16, p_yellow = 0.14, p_red = 0.13, p_brown = 0.13
Ha: At least one of the p_i’s is incorrect
Finding the expected counts for the color categories:
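Each expected count is the sample size multiplied by the proportion claimed under H0. The sketch below uses the company's stated proportions from the problem; the bag size of 60 candies is an assumption made for illustration, because the one-way table from the slide is not reproduced in the text.

```python
# Proportions claimed by the company (from the problem statement).
CLAIMED_PROPORTIONS = {
    "blue": 0.24, "orange": 0.20, "green": 0.16,
    "yellow": 0.14, "red": 0.13, "brown": 0.13,
}

# ASSUMPTION: the sample size is not given in the text; 60 is illustrative.
ASSUMED_BAG_SIZE = 60


def expected_counts(sample_size, proportions):
    """Expected count per category = sample size * hypothesized proportion."""
    return {color: sample_size * p for color, p in proportions.items()}


expected = expected_counts(ASSUMED_BAG_SIZE, CLAIMED_PROPORTIONS)
for color, count in expected.items():
    print(f"{color}: {count:.1f}")
```

Under this assumed bag size every expected count is at least 5, so the Large Counts condition would hold.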
Computing the Chi-Square Statistic:
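The statistic adds up (O − E)² / E across the six colors. In the sketch below the claimed proportions come from the problem, but the observed counts are assumed values for illustration only, since the slide's one-way table is not reproduced in the text.

```python
COLORS = ["blue", "orange", "green", "yellow", "red", "brown"]

# Proportions claimed by the company (from the problem statement).
CLAIMED = [0.24, 0.20, 0.16, 0.14, 0.13, 0.13]

# ASSUMPTION: illustrative observed counts for a bag of 60 candies;
# these are not taken from the slides.
ASSUMED_OBSERVED = [9, 8, 12, 15, 10, 6]

n = sum(ASSUMED_OBSERVED)
expected = [n * p for p in CLAIMED]

# Contribution of each color to the statistic: (O - E)^2 / E.
contributions = [
    (o - e) ** 2 / e for o, e in zip(ASSUMED_OBSERVED, expected)
]
chi_square = sum(contributions)
df = len(COLORS) - 1

for color, part in zip(COLORS, contributions):
    print(f"{color}: {part:.3f}")
print(f"chi-square = {chi_square:.2f} with df = {df}")
```

Listing the per-color contributions shows which categories drive the statistic; with these assumed counts, yellow contributes the most.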
The Chi-Square Distributions and P-values
Since our P-value is between 0.05 and 0.10, it is greater than α = 0.05. Therefore, we fail to reject H0. We
don’t have sufficient evidence to conclude that the company’s claimed color distribution is incorrect.
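Instead of bracketing the P-value from a table, it can be computed directly as the upper-tail area of the chi-square distribution. The sketch below uses only Python's `math` module and the standard recurrence for the regularized upper incomplete gamma function, Q(a + 1, z) = Q(a, z) + z^a e^(−z) / Γ(a + 1), starting from Q(1/2, z) = erfc(√z) or Q(1, z) = e^(−z). The function name is hypothetical, and the statistic of 10.18 with 5 degrees of freedom is an assumed illustrative value, since the computed statistic from the slides is not reproduced in the text.

```python
import math


def chi_square_p_value(statistic, df):
    """Upper-tail probability P(X >= statistic) for a chi-square
    distribution with a positive integer number of degrees of freedom."""
    if df < 1 or statistic < 0:
        raise ValueError("df must be >= 1 and statistic must be >= 0")
    z = statistic / 2.0
    if df % 2 == 1:
        a = 0.5
        q = math.erfc(math.sqrt(z))  # exact tail for df = 1
    else:
        a = 1.0
        q = math.exp(-z)             # exact tail for df = 2
    # Each step raises the degrees of freedom by 2 until df is reached.
    for _ in range((df - 1) // 2):
        q += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return q


# ASSUMPTION: illustrative statistic and degrees of freedom.
p_value = chi_square_p_value(10.18, 5)
alpha = 0.05
print(f"P-value = {p_value:.4f}; reject H0: {p_value < alpha}")
```

For these assumed inputs the P-value falls between 0.05 and 0.10, which matches the pattern of the conclusion above: greater than α = 0.05, so H0 is not rejected.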