Cross-Tabs Continued Andrew Martin PS 372 University of Kentucky
Statistical Independence Statistical independence is a property of two variables in which the probability that an observation is in a particular category of one variable and a particular category of the other variable equals the product of the marginal probabilities of being in those categories. Unlike the other statistical measures discussed in class, tests of statistical independence look for the absence of a relationship between two variables.
Statistical Independence Let us assume two nominal variables, X and Y. The values for these variables are as follows: X: a, b, c, ... Y: r, s, t, ...
Statistical Independence P(X=a) stands for the probability that a randomly selected case has property or value a on variable X. P(Y=r) stands for the probability that a randomly selected case has property or value r on variable Y. P(X=a, Y=r) stands for the joint probability that a randomly selected observation has both property a and property r simultaneously.
Statistical Independence If X and Y are statistically independent: P(X=a, Y=r) = P(X=a) × P(Y=r) for all a and r.
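As an illustration, here is a minimal Python sketch (the joint probability table is made up) that checks the independence condition cell by cell:

import numpy as np

# Rows are values of X (a, b); columns are values of Y (r, s).
joint = np.array([[0.12, 0.18],
                  [0.28, 0.42]])

p_x = joint.sum(axis=1)       # marginals P(X=a), P(X=b)
p_y = joint.sum(axis=0)       # marginals P(Y=r), P(Y=s)
product = np.outer(p_x, p_y)  # P(X=a) * P(Y=r) for every cell

# X and Y are independent only if every joint probability equals
# the product of its marginals.
print(np.allclose(joint, product))  # True for this table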
Statistical Independence [The slide shows a cross-tabulation of gender and turnout; the table image is not included in the transcript.]
If gender and turnout are independent: Expected frequency for the cell in column m and row v = (total obs in column m × total obs in row v) / N
Statistical Independence Expected frequency = (total obs in column m × total obs in row v) / N = (210 × 100) / 300 = 70. 70 is the expected frequency. Because the observed and expected frequencies are the same, the variables are independent.
Expected frequency = (150 × 150) / 300 = 75
Here, the relationship is not independent (it is dependent) because the expected frequency of 75 is less than the observed frequency of 100.
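To see the arithmetic for a whole table at once, here is a minimal Python sketch using a hypothetical gender-by-turnout table chosen to be consistent with the totals above (N = 300, a column total of 210, a row total of 100):

import numpy as np

# Hypothetical observed counts (rows: turnout, columns: gender).
observed = np.array([[ 70,  30],
                     [140,  60]])

row_totals = observed.sum(axis=1)  # 100, 200
col_totals = observed.sum(axis=0)  # 210, 90
n = observed.sum()                 # 300

# Expected frequency for each cell: (column total * row total) / N.
expected = np.outer(row_totals, col_totals) / n
print(expected[0, 0])                # 70.0, as on the slide
print((observed == expected).all())  # True: this table is exactly independent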
Testing for Independence How do we test for independence for an entire cross-tabulation table? The statistic used to test the statistical significance of a relationship in a cross-tabulation table is the chi-square test (χ²).
Chi-Square Statistic The chi-square statistic essentially compares an observed result—the table produced by the data—with a hypothetical table that would occur if, in the population, the variables were statistically independent.
How is the chi-square statistic calculated? The chi-square test is set up just like a hypothesis test. The observed chi-square value is compared to the critical value for a certain critical region. A statistic is calculated for each cell of the cross-tabulation and is similar to the independence statistic.
How is the chi-square statistic calculated? For each cell: (Observed frequency – Expected frequency)² / Expected frequency. Summing this quantity across all cells yields the observed chi-square.
Chi-Square Test The null hypothesis is statistical independence between X and Y. H0: X, Y independent. The alternative hypothesis is that X and Y are not independent. HA: X, Y dependent.
Chi-Square Test The chi-square is a family of distributions, each of which depends on degrees of freedom. The degrees of freedom equal the number of rows minus one times the number of columns minus one: (r – 1)(c – 1). Level of significance: the probability (α) of incorrectly rejecting a true null hypothesis.
Chi-Square Test Critical value: The chi-square test is always a one-tailed test. Choose the critical value of chi-square from a tabulation so that the critical region (the region of rejection) equals α. (JRM: Appendix C, pg. 577)
Chi-Square Test The observed χ² is the sum, across all cells, of the squared difference between the observed and expected frequencies divided by the expected frequency. If χ²obs ≥ χ²crit, reject the null hypothesis. Otherwise, do not reject.
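Here is a minimal sketch of the whole test in Python on a made-up 2x2 table; the hand calculation is compared against scipy.stats.chi2_contingency, which returns the statistic, p-value, degrees of freedom, and expected counts in one call:

import numpy as np
from scipy.stats import chi2_contingency

observed = np.array([[70, 45],   # made-up cross-tabulation
                     [30, 55]])

row_totals = observed.sum(axis=1)
col_totals = observed.sum(axis=0)
n = observed.sum()
expected = np.outer(row_totals, col_totals) / n

# Sum of (observed - expected)^2 / expected over all cells.
chi2_by_hand = ((observed - expected) ** 2 / expected).sum()

# correction=False turns off the Yates correction so the two agree.
chi2_stat, p_value, dof, exp = chi2_contingency(observed, correction=False)
print(chi2_by_hand, chi2_stat, dof)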
Chi-Square Test Let's assume we want to test the relationship at the .01 level. The observed χ² is 62.21. The degrees of freedom are (5 – 1)(2 – 1) = 4. The critical χ² is 13.28. Since 62.21 > 13.28, we can reject the null of an independent relationship. Y (attitudes toward gun control) is dependent on X (gender).
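The critical value can also be looked up in code rather than in the appendix table. A minimal sketch reproducing the numbers on this slide:

from scipy.stats import chi2

alpha = 0.01
df = (5 - 1) * (2 - 1)            # (rows - 1)(columns - 1) = 4
critical = chi2.ppf(1 - alpha, df)
print(critical)                   # ~13.28, as in the appendix table

chi2_obs = 62.21                  # observed value from the slide
print(chi2_obs >= critical)       # True: reject the null of independence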
Chi-Square Test The χ² statistic works for dependent variables that are ordinal or nominal measures, but another statistic is more appropriate for interval- and ratio-level data.
Analysis of Variance For quantitative (interval- and ratio-level) data, analysis of variance is appropriate. Analysis of variance is also known as ANOVA. The independent variable, however, is generally still nominal or ordinal.
ANOVA tells political scientists ... (1) if there are any differences among the means (2) which specific means differ and by how much (3) whether the observed differences in Y could have arisen by chance or whether they reflect real variation among the categories or groups in X
Two important concepts Effect size —The difference between one mean and the other. Difference of means test —The larger the difference of means, the more likely the difference is not due to chance and is instead due to a relationship between the independent and dependent variables.
Setting up an Example Suppose you want to test the effect of negative political ads on intention to vote in the next election. You set up a control group and a test group. Each group watches a newscast, but the test group sees negative TV ads during the commercial breaks while the control group sees no campaign ad. You give both groups a pre-test and a post-test to compare the effect of the ad exposure.
Difference of the means Effect = Mean (test group) – Mean (control group)‏
Difference of the means Although different statistics use different formulas, every means test shares two properties: (1) the numerator is the difference of the means, and (2) the denominator is the standard error of the difference of the means.
Difference of the means A means test compares the means of two different samples. The larger the N of both samples, the greater the confidence that the observed sample difference (D) correctly estimates the population difference (Δ).
Difference of the means In abstract terms: (Mean of test group – Mean of control group) / (Standard error of the difference of the means). In concrete terms: (Mean (Ads) – Mean (No ads)) / (Standard error of the difference between the Ads and No-ads means).
Hypothesis Test of Means A difference-of-means test evaluates the null hypothesis that there is no difference between the means. You can test the significance of a means difference either by running a hypothesis test or by calculating a confidence interval.
Small-Sample Test of DoM Let's suppose we are looking at two samples: both measure the level of democracy in a country, but one is a sample of developed countries and the other is a sample of developing countries. We want to test whether the level of economic development affects the level of democratic openness. Specifically, we want to test whether the population difference is 0.
Small-Sample Test of DoM With two small samples, we must use the t distribution and calculate degrees of freedom. When there are two samples, degrees of freedom equal N1 (first sample) + N2 (second sample) – 2.
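A minimal sketch with made-up democracy scores; scipy.stats.ttest_ind with equal_var=True runs the pooled two-sample t test on N1 + N2 – 2 degrees of freedom described here:

from scipy.stats import ttest_ind

# Hypothetical democracy scores (higher = more open).
developed  = [8.2, 7.9, 8.5, 7.4, 8.8]
developing = [6.1, 6.8, 5.9, 7.0, 6.4]

t_stat, p_value = ttest_ind(developed, developing, equal_var=True)
df = len(developed) + len(developing) - 2  # = 8
print(t_stat, df, p_value)  # a small p-value rejects a population difference of 0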
In the slide's worked example (table omitted from the transcript), the standard error for the difference of means is .144.
ANOVA ANOVA, or analysis of variance, allows us to expand on the previous methods. The procedure treats the observations in each category of the explanatory (independent) variable as independent samples from populations. This makes it possible to test hypotheses such as H0: μ1 = μ2 = μ3.
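A minimal sketch with made-up group data; scipy.stats.f_oneway tests exactly this null hypothesis of equal population means:

from scipy.stats import f_oneway

# Hypothetical samples from three categories of X.
group1 = [4.1, 3.8, 4.4, 4.0]
group2 = [5.2, 5.0, 4.8, 5.5]
group3 = [3.2, 3.5, 3.0, 3.4]

f_stat, p_value = f_oneway(group1, group2, group3)
print(f_stat, p_value)  # a small p-value rejects H0: mu1 = mu2 = mu3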
Variation Three types Total variation —A quantitative measure of the variation in a variable, determined by summing the squared deviation of each observation from the mean.
Variation Explained variation —That portion of the total variation in a dependent variable explained by the variance in the independent variable. Unexplained variation —That portion of the total variation in a dependent variable that is not accounted for by the variation in the independent variable(s).
ANOVA Total variance = Within variance + Between variance Within variance is the  unexplained  variance Between variance is the  explained  variance
ANOVA Explained variance refers to the fact that some of the observed differences seem to be due to “membership in” or “having a property of” one category of X. On average, the A's differ from the B's. Knowing this characteristic can help us predict the value of Y.
ANOVA Percent explained = (between/total) × 100. Ex: if the between variance is 0, percent explained = (0/total) × 100 = 0.
ANOVA ANOVA involves quantifying the types of variation and using the numbers to make inferences.  The standard measure of variation is the sum of squares, which is a total of squared deviations about the mean.
ANOVA TSS = BSS + WSS. TSS = total sum of squares; BSS = between-group (explained) variability; WSS = within-group (unexplained) variability. Percent explained = (BSS/TSS) × 100.
ANOVA The percent of variation explained is called eta-squared (η 2 ). It varies between 0 and 1 like any proportion.  0 means the independent variable explains nothing about the dependent variable. 1 means the independent variable explains all variation in the dependent variable.
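A minimal sketch (reusing the made-up groups from the ANOVA example above) that decomposes total variation into its between and within components and computes eta-squared as BSS/TSS:

import numpy as np

groups = [np.array([4.1, 3.8, 4.4, 4.0]),
          np.array([5.2, 5.0, 4.8, 5.5]),
          np.array([3.2, 3.5, 3.0, 3.4])]

all_obs = np.concatenate(groups)
grand_mean = all_obs.mean()

tss = ((all_obs - grand_mean) ** 2).sum()                         # total
bss = sum(len(g) * (g.mean() - grand_mean) ** 2 for g in groups)  # between
wss = sum(((g - g.mean()) ** 2).sum() for g in groups)            # within

print(np.isclose(tss, bss + wss))  # True: TSS = BSS + WSS
eta_squared = bss / tss
print(100 * eta_squared)           # percent of variation in Y explained by X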
ANOVA Often this statistic is explained as follows: if η² = .6, X explains 60 percent of the variation in Y and hence is an important explanatory factor.