Cross Tabulation and Chi-Square
Cross Tabulation
• Tests whether a relationship exists in the data collected
• i.e. tests whether there is a contingency between two variables
• Tests whether there are any differences or similarities in the responses between 2 or more variables
• Usually only do cross tabs between 2 variables that make sense
• Most common x-tabs are by personal characteristics, such as:
- Gender
- Age
- Income level
- Marital status
- Place of residence
- Ethno-cultural background
- Type of household
- Educational background
Example
• Conducting research on the number of accident claims for a car insurance company
• Want to see if the number of claims varies by different types of respondents
• What would be some other meaningful x-tabs for insurance claims other than personal characteristics?
Meaningful x-tabs
• Type of car (sports, family, mini-van)
• Whether the driver has any previous driving convictions
• Whether the driver took driving lessons as a youth
• Quality of vision
• Colour of hair (but would this one be meaningful?)
Number of insurance claims by gender, NDJ insurance 2015
Number of claims    Males     Females   Total
0                   10 032    13 478    23 510
1                    2 156     1 430     3 586
2                      120        25       145
3                       13         4        17
Total               12 321    14 937    27 258
• Is there a difference in the number of claims made by males and females?
• Difficult to tell from absolute numbers
Number of insurance claims by gender, NDJ 2015
Number of claims    Males %   Females %
0                    81.4      90.2
1                    17.5       9.6
2                     1.0       0.2
3                     0.1       0.0
Total               100.0     100.0
• Conclusion? Yes, there is a difference, but is it statistically significant? Need to do a chi-square test to determine
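The percentage table can be reproduced from the raw counts. A minimal sketch, assuming the male count for 2 claims is 120 (the value that agrees with the row and column totals in the counts table); the function name `column_percentages` is only illustrative:

```python
# Counts from the claims-by-gender table (rows: 0, 1, 2, 3 claims).
# Assumption: the male count for 2 claims is entered as 120, the value
# that agrees with the row and column totals given in the table.
claims = [0, 1, 2, 3]
males = [10032, 2156, 120, 13]
females = [13478, 1430, 25, 4]


def column_percentages(counts):
    """Return each count as a percentage of its column total, to 1 decimal."""
    total = sum(counts)
    return [round(100 * c / total, 1) for c in counts]


male_pct = column_percentages(males)
female_pct = column_percentages(females)

for n, m, f in zip(claims, male_pct, female_pct):
    print(f"{n} claims: males {m:5.1f}%  females {f:5.1f}%")
```

Percentages are taken down each column (within gender), so the two groups can be compared even though their sizes differ.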
Chi-square test
• It enables you to find out if the values for the two variables are independent or associated
• If they are independent, there is no relationship, i.e. the number of claims does not vary significantly by gender
• If they are associated, there is a relationship, i.e. the number of claims does vary significantly by gender
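What "independent" means can be made concrete: if gender and number of claims were unrelated, each cell would be expected to hold (row total x column total) / grand total cases, and the chi-square statistic adds up how far the observed counts are from that. A rough sketch in plain Python, not the Minitab procedure itself; `chi_square_statistic` is an illustrative name, and the male count for 2 claims is assumed to be 120 so that the totals agree:

```python
def chi_square_statistic(table):
    """Return (statistic, degrees of freedom, expected counts) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    # Expected count in each cell if the two variables were independent.
    expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

    # Sum of (observed - expected)^2 / expected over all cells.
    statistic = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, dof, expected


# Rows: 0, 1, 2, 3 claims; columns: males, females.
# Assumption: 120 males with 2 claims, consistent with the table totals.
claims_by_gender = [[10032, 13478], [2156, 1430], [120, 25], [13, 4]]

statistic, dof, expected = chi_square_statistic(claims_by_gender)
print(f"chi-square = {statistic:.1f}, degrees of freedom = {dof}")
```

The larger the statistic relative to its degrees of freedom, the less plausible it is that the two variables are independent.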
Two requirements for the chi-square test
1. Try to get at least 50 cases in each sub-group of the variables being cross-tabulated
E.g. to examine the relationship between age and number of claims, you would need at least 50 cases in each age group, i.e. 18-24, 25-34, 35-44, 45-54, 55-64, 65+
If not, collapse the sub-groups: 18-34, 35-54, and 55+
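Checking the first requirement and collapsing age bands can be sketched as below. The case counts are hypothetical (made up for illustration, not from the lecture); only the age bands and the 50-case guideline come from the slide.

```python
# Hypothetical number of respondents per age band (illustrative only).
cases_per_band = {
    "18-24": 38, "25-34": 71, "35-44": 64,
    "45-54": 52, "55-64": 41, "65+": 27,
}

# Collapsed groupings suggested above.
collapse_map = {
    "18-24": "18-34", "25-34": "18-34",
    "35-44": "35-54", "45-54": "35-54",
    "55-64": "55+", "65+": "55+",
}

MIN_CASES = 50  # guideline from requirement 1


def small_groups(counts, minimum=MIN_CASES):
    """Return the sub-groups that have fewer than `minimum` cases."""
    return [band for band, n in counts.items() if n < minimum]


def collapse(counts, mapping):
    """Add up the cases of the bands that map to the same collapsed group."""
    merged = {}
    for band, n in counts.items():
        merged[mapping[band]] = merged.get(mapping[band], 0) + n
    return merged


print("Too small before collapsing:", small_groups(cases_per_band))
collapsed_bands = collapse(cases_per_band, collapse_map)
print("After collapsing:", collapsed_bands)
print("Too small after collapsing:", small_groups(collapsed_bands))
```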
Two requirements for the chi-square test (continued)
2. No more than 20% of cells should have fewer than 5 expected responses
Therefore try to collapse categories to reduce the number of cells whenever possible
Example
Number of holidays per year    18-24 years    25-34 years    35-44 years    45+ years
0                              12              4              7             10
1                               8             12              7              5
2                               8             29             10              4
3                               3              2             14              6
4                              16              2             11              4
5                               4             14              4              7
6                               6             13              2              2
7-10                            2              2              4              1
11 or more                      0              1              4              2
• Number of cells = 36, and 18 of them (50%, well over the 20% limit) have fewer than 5
• Thus collapse the number of categories...
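The 20% check on this table can be sketched as follows. The slide counts cells by their observed values, while the rule itself is stated for expected counts, so the sketch reports both; the function names are illustrative only.

```python
# Holidays per year (rows) by age group (columns: 18-24, 25-34, 35-44, 45+).
holidays = [
    [12, 4, 7, 10],   # 0
    [8, 12, 7, 5],    # 1
    [8, 29, 10, 4],   # 2
    [3, 2, 14, 6],    # 3
    [16, 2, 11, 4],   # 4
    [4, 14, 4, 7],    # 5
    [6, 13, 2, 2],    # 6
    [2, 2, 4, 1],     # 7-10
    [0, 1, 4, 2],     # 11 or more
]


def expected_counts(table):
    """Expected cell counts under independence: row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]


def share_below(cells, threshold=5):
    """Fraction of cells whose value is below the threshold."""
    flat = [value for row in cells for value in row]
    return sum(value < threshold for value in flat) / len(flat)


observed_share = share_below(holidays)
expected_share = share_below(expected_counts(holidays))
meets_rule = expected_share <= 0.20

print(f"Observed counts below 5: {observed_share:.0%} of cells")
print(f"Expected counts below 5: {expected_share:.0%} of cells")
print("Meets the 20% rule:", meets_rule)
```

Either way of counting puts the table well over the 20% limit, so the categories need collapsing.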
Thus...
Number of holidays per year    18-34 years    35+ years
0                              16             17
1-2                            57             26
3-4                            23             35
5+                             42             26
• Number of cells = 8, none with fewer than 5
• Now meets both requirements
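Collapsing is just adding up the cells that fall into each merged category. A minimal sketch, assuming the groupings 0, 1-2, 3-4, 5+ for holidays and 18-34, 35+ for age as on the slide; `collapse_table` and `min_expected` are illustrative helper names:

```python
# Original table: holidays per year (rows) by age group (18-24, 25-34, 35-44, 45+).
holidays = [
    [12, 4, 7, 10],   # 0
    [8, 12, 7, 5],    # 1
    [8, 29, 10, 4],   # 2
    [3, 2, 14, 6],    # 3
    [16, 2, 11, 4],   # 4
    [4, 14, 4, 7],    # 5
    [6, 13, 2, 2],    # 6
    [2, 2, 4, 1],     # 7-10
    [0, 1, 4, 2],     # 11 or more
]

row_groups = [[0], [1, 2], [3, 4], [5, 6, 7, 8]]  # 0, 1-2, 3-4, 5+
col_groups = [[0, 1], [2, 3]]                     # 18-34, 35+


def collapse_table(table, row_groups, col_groups):
    """Sum the original cells that fall into each collapsed row/column group."""
    return [
        [sum(table[r][c] for r in rows for c in cols) for cols in col_groups]
        for rows in row_groups
    ]


def min_expected(table):
    """Smallest expected count under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return min(r * c / grand_total for r in row_totals for c in col_totals)


collapsed = collapse_table(holidays, row_groups, col_groups)
for row in collapsed:
    print(row)
print(f"Smallest expected count: {min_expected(collapsed):.1f}")
```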
How to check whether 2 variables are associated...
• Run the cross tab for your 2 variables
• Check the chi-square value and the degrees of freedom
• If the p-value is less than .05, your two variables are said to be associated, i.e. there is a relationship between the two variables
• You can then be statistically confident in saying that the number of claims is related to gender
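Minitab and SPSS report the p-value directly, but the link between the chi-square value, the degrees of freedom and p can be sketched by hand. The code below is an illustration using only the Python standard library, not what those packages run internally; `chi2_sf` and `is_associated` are hypothetical helper names.

```python
import math


def chi2_sf(statistic, dof):
    """Upper-tail probability P(X >= statistic) for a chi-square variable."""
    if statistic <= 0:
        return 1.0
    half = statistic / 2
    if dof % 2 == 1:
        prob = math.erfc(math.sqrt(half))  # closed form for 1 degree of freedom
        a = 0.5
    else:
        prob = math.exp(-half)             # closed form for 2 degrees of freedom
        a = 1.0
    # Each extra pair of degrees of freedom adds one term:
    # Q(a + 1, x) = Q(a, x) + x**a * exp(-x) / Gamma(a + 1)
    while a < dof / 2:
        prob += half ** a * math.exp(-half) / math.gamma(a + 1)
        a += 1
    return prob


def is_associated(statistic, dof, alpha=0.05):
    """True when the p-value falls below alpha (0.05 by convention)."""
    return chi2_sf(statistic, dof) < alpha


# Textbook 5% critical values: each gives a p-value of about 0.05.
for critical, dof in [(3.841, 1), (5.991, 2), (7.815, 3)]:
    print(f"dof={dof}: p = {chi2_sf(critical, dof):.4f}")
```

A chi-square value above the critical value for its degrees of freedom gives p below .05, which is the cut-off the slide uses. With SciPy installed, `scipy.stats.chi2_contingency` returns the statistic, p-value, degrees of freedom and expected counts in one call.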
Running a cross tab in Minitab (similar in SPSS and others)
Example of coded data
Tip: make sure the independent variable (e.g. age or gender) is in the rows
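Outside Minitab, the same layout can be built from coded data in a few lines. A sketch only: the records are hypothetical (gender code, satisfaction code) pairs invented for illustration, and `cross_tab` and `row_percentages` are illustrative names.

```python
from collections import Counter

# Hypothetical coded records: (gender code, satisfaction code). Illustrative only.
records = [(1, 1), (1, 2), (2, 1), (2, 1), (1, 1), (2, 2), (2, 1), (1, 2)]


def cross_tab(records):
    """Count records with the first (independent) variable in the rows."""
    counts = Counter(records)
    row_codes = sorted({row for row, _ in records})
    col_codes = sorted({col for _, col in records})
    table = [[counts[(r, c)] for c in col_codes] for r in row_codes]
    return row_codes, col_codes, table


def row_percentages(table):
    """Express each cell as a percentage of its row total."""
    return [[round(100 * value / sum(row), 1) for value in row] for row in table]


row_codes, col_codes, table = cross_tab(records)
percentages = row_percentages(table)

for code, counts_row, pct_row in zip(row_codes, table, percentages):
    print(f"gender {code}: counts {counts_row}, row % {pct_row}")
```

With the independent variable in the rows, row percentages let you compare the groups directly.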
If the p-value is less than 0.05 there is a relationship; that is, satisfaction levels vary significantly by gender in this example
In this example we don’t meet the 20% rule: more than 20% of our cells have fewer than 5 (there are 8 cells in the example: 4 residence categories x 2 satisfaction scores). The output tells you below that 3 do, so we have to collapse the residence categories from 4 down to 2. The computer will re-code for you…
Re-coding example…
Re-code into a new blank column, not over the same data: you never want to lose your original data
Here I re-coded the data into a blank column (column 25)
And now the values are only 1 and 2 instead of 1-4
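The same re-coding idea in code, as a sketch: write the collapsed codes to a new column and leave the original untouched. The rows and the mapping of the four residence codes onto 1 and 2 are assumptions for illustration; the lecture does not say which original codes were merged.

```python
# Hypothetical coded rows (illustrative only).
rows = [
    {"residence": 1, "satisfaction": 2},
    {"residence": 3, "satisfaction": 1},
    {"residence": 4, "satisfaction": 1},
    {"residence": 2, "satisfaction": 2},
]

# Assumed mapping from the 4 residence codes to 2 collapsed codes.
recode_map = {1: 1, 2: 1, 3: 2, 4: 2}


def recode_into_new_column(rows, source, target, mapping):
    """Return copies of the rows with `target` added; `source` is never overwritten."""
    if any(target in row for row in rows):
        raise ValueError(f"column {target!r} already exists; pick a blank column")
    return [{**row, target: mapping[row[source]]} for row in rows]


recoded = recode_into_new_column(rows, "residence", "residence_2", recode_map)

for row in recoded:
    print(row)
```

Refusing to write into an existing column is one way to enforce the "never lose your original data" rule.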
Now we can run a cross tab of residence and satisfaction. The p-value is less than 0.05, therefore residence has a significant influence on satisfaction levels. Here our domestic tourists (coded as 1) are much more satisfied (69.2%) than overseas visitors (coded as 2), at 42.5%.
