UNIVERSITY OF LUCKNOW
Association and its different measures using SPSS
Presented By:
Ankur Dhangar
M.Sc. Biostatistics Sem -3
Roll No. 2210014145008
Association and its different measures
• Association refers to the relationship or dependency between two or more categorical
variables. It's about understanding whether the occurrence or distribution of values in one
categorical variable is related to or influences the values in another categorical variable.
• When exploring association between categorical variables:
1. Independence: If two categorical variables are independent, the occurrence of one
variable's categories does not affect the distribution of the other variable's categories.
For example, there might not be any association between gender and favorite ice
cream flavor.
2. Association or Dependency: When there's an association between categorical
variables, the occurrence or distribution of values in one variable is related to the
values in another variable. For instance, there might be an association between
smoking habits (yes/no) and the incidence of a particular health condition
(present/absent).
Measures of association
Association can be assessed with several statistical tests and measures, such as:
For nominal variables
1. Chi-square test
2. Fisher's exact test
3. Phi coefficient and Cramer's V
4. Lambda
5. Uncertainty coefficient
For ordinal variables
1. Gamma
2. Somers' d
3. Kendall's tau-b
4. Kendall's tau-c
For nominal by interval
1. Eta
Some others
1. Kappa
2. Risk
3. McNemar, etc.
The chi-square test for independence, also called Pearson's chi-square test or the chi-
square test of association, is used to discover if there is a relationship between two
categorical variables.
Assumptions:
When you choose to analyse your data using a chi-square test for independence, you
need to make sure that the data you want to analyse "passes" two assumptions:
Assumption #1: Your two variables should be measured at an ordinal or nominal level
(i.e., categorical data).
Assumption #2: Each of your two variables should consist of two or more categorical,
independent groups. Example variables that meet this criterion include
gender (2 groups: males and females) and ethnicity (e.g., 3 groups: Caucasian, African
American and Hispanic).
Example
Educators are always looking for novel ways in which to teach statistics to
undergraduates as part of a non-statistics degree course (e.g., psychology).
With current technology, it is possible to present how-to guides for statistical
programs online instead of in a book. However, different people learn in
different ways. An educator would like to know whether gender (male/female)
is associated with the preferred type of learning medium (online vs. books).
Therefore, we have two nominal variables: Gender (male/female) and
Preferred Learning Medium (online/books).
Setup in SPSS
In SPSS Statistics, we created two variables so that we could enter our data:
Gender and Preferred_Learning_Medium.
Procedure:
1. Click Analyze > Descriptive Statistics > Crosstabs... on the top menu.
2. You will be presented with the Crosstabs dialogue box.
3. Transfer one of the variables into the Row(s): box and the other variable into
the Column(s): box. In our example, we will transfer the Gender variable into
the Row(s): box and Preferred_Learning_Medium into the Column(s): box.
4. Click on the Statistics button. You will be presented with
the Crosstabs: Statistics dialogue box.
5. Select the Chi-square and Phi and Cramer's V options.
6. Click on the Continue button.
7. Click on the Cells button. You will be presented with the
Crosstabs: Cell Display dialogue box.
8. Select Observed from the –Counts– area, and Row, Column and Total from
the –Percentages– area.
9. Click on the Continue button.
Note: The next option is only really useful if you have more than two categories in one of
your variables, but we will show it here in case you have. If you don't, you can skip to
step 13.
10. Click on the Format button.
11. You will be presented with the Table Format dialogue box. This option allows you to
change the order of the values to either ascending or descending.
12. Once you have made your choice, click on the Continue button.
13. Click on the OK button to generate your output.
Output:
You will be presented with some tables in the Output Viewer under the title
"Crosstabs". The tables of note are presented below:
The Crosstabulation Table (Gender*Preferred Learning Medium
Crosstabulation)
This table allows us to understand that both males and females prefer to learn
using online materials versus books.
The Chi-Square Tests Table
When reading this table we are interested in the results of the "Pearson Chi-Square"
row. We can see here that χ²(1) = 0.487, p = .485. This tells us that
there is no statistically significant association between Gender and Preferred
Learning Medium; that is, the preference for online learning versus books does
not differ significantly between males and females.
The Symmetric Measures Table
Phi and Cramer's V are both measures of the strength of association. We can see
that the strength of association between the variables is very weak.
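The calculation behind the "Pearson Chi-Square" row can be sketched in a few lines of Python. This is only an illustration, not SPSS's own code: the function names are made up for this example, and the 2x2 counts are hypothetical because the crosstabulation for the Gender example is not reproduced in the text. The p-value helper is valid for 1 degree of freedom only (a 2x2 table).

```python
import math

def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence: row total * column total / N
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, df

def p_value_df1(statistic):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(statistic / 2))

# Hypothetical counts (NOT the data behind the SPSS output):
# rows = gender, columns = preferred learning medium (online, books).
hypothetical = [[30, 20],
                [25, 25]]
stat, df = chi_square_independence(hypothetical)
print(f"chi-square({df}) = {stat:.3f}, p = {p_value_df1(stat):.3f}")

# Converting the statistic reported in the text into a p-value.
print(f"p for chi-square(1) = 0.487: {p_value_df1(0.487):.3f}")
```

The last line converts the reported statistic, 0.487, into a p-value, which can be compared with the p = .485 shown in the Chi-Square Tests table.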
Fisher’s Exact Test
Fisher’s Exact Test is used to determine whether or not there is a significant
association between two categorical variables.
It is typically used as an alternative to the Chi-Square Test of
Independence when one or more of the expected cell counts in a 2×2 table is less than 5.
Example:
Suppose we want to know whether or not gender is associated with political
party preference at a particular college. To explore this, we randomly poll
25 students on campus. The number of students who are Democrats or
Republicans, based on gender, is shown in the table below:
          Democrat   Republican
Female       8           4
Male         4           9
To determine if there is a statistically significant association between gender
and political party preference, we can use the following steps to perform
Fisher’s Exact Test in SPSS:
Step 1: Enter the data.
First, enter the data as shown below:
Each row shows an individual’s ID, their political party preference, and their
gender.
Step 2: Perform Fisher’s Exact Test.
Click the Analyze tab, then Descriptive Statistics, then Crosstabs:
Drag the variable Gender into the box labelled Rows and the
variable Party into the box labelled Columns. Then click the button
labelled Statistics and make sure that the box next to Chi-square is checked.
Then click Continue.
Next, click the button labelled Exact and make sure the box next to Exact is
checked. Then click Continue.
Lastly, click OK to perform Fisher’s Exact Test.
Interpret the results
Once you click OK, the results of Fisher’s Exact Test will be displayed:
The first table displays the number of missing cases in the dataset. We can see
that there are 0 missing cases in this example.
The second table displays a crosstab of the total number of individuals by
gender and political party preference.
The third table shows the results of Fisher’s Exact Test. We can see the
following two p-values for the test:
•Two-sided p-value: .115
•One-sided p-value: .081
The null hypothesis for Fisher’s Exact Test is that the two variables are
independent. In this case, our null hypothesis is that gender and political party
preference are independent, which is a two-sided test so we would use the
two-sided p-value of 0.115.
Since this p-value is not less than 0.05, we do not reject the null hypothesis.
Thus, we don’t have sufficient evidence to say that there is a significant
association between gender and political party preference.
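The two p-values can be checked by hand from the hypergeometric distribution. The sketch below is an illustration only (the function name `fisher_exact_2x2` is made up here, and SPSS may organise the calculation differently); it uses the 2x2 table from the example. The one-sided value is computed as the upper tail for the top-left cell, which is the relevant tail for this table.

```python
from math import comb

def fisher_exact_2x2(table):
    """Return (two_sided_p, one_sided_p) for a 2x2 table [[a, b], [c, d]].

    one_sided_p: probability of a top-left count at least as large as observed.
    two_sided_p: total probability of all tables no more likely than the
    observed one, with the row and column totals held fixed.
    """
    (a, b), (c, d) = table
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    total = comb(n, col1)

    def prob(k):
        # Hypergeometric probability that the top-left cell equals k.
        return comb(row1, k) * comb(row2, col1 - k) / total

    lo, hi = max(0, col1 - row2), min(row1, col1)
    p_obs = prob(a)
    one_sided = sum(prob(k) for k in range(a, hi + 1))
    # Small tolerance so that ties with the observed probability are included.
    two_sided = sum(prob(k) for k in range(lo, hi + 1)
                    if prob(k) <= p_obs * (1 + 1e-9))
    return two_sided, one_sided

# Rows: Female, Male; columns: Democrat, Republican (table from the example).
two_sided, one_sided = fisher_exact_2x2([[8, 4], [4, 9]])
print(f"two-sided p = {two_sided:.3f}, one-sided p = {one_sided:.3f}")
```

For this table the result agrees with the SPSS output quoted in the text (two-sided .115, one-sided .081).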
Phi coefficient and Cramer's V
Phi Coefficient:
•Use Case: Measures association between two dichotomous variables in a
2x2 table.
•Range: Varies between -1 and 1.
•Interpretation:
• 1 indicates a perfect association.
• 0 indicates no association.
• -1 indicates a perfect negative association.
•Specificity: Applicable only to 2x2 contingency tables.
•Calculation: Derived when analyzing two dichotomous variables using
Crosstabs in SPSS with the "Phi and Cramer's V" option selected.
Cramer's V:
•Use Case: Measures association between two categorical variables in
contingency tables of any size; it is most often reported for tables larger than 2x2.
•Range: Varies between 0 and 1.
•Interpretation:
• 1 indicates a perfect association.
• 0 indicates no association.
•Applicability: Suitable for larger contingency tables beyond 2x2, providing
a measure of association strength.
•Calculation: Calculated by SPSS Crosstabs when the "Phi and Cramer's V" option is
selected. For a 2x2 table, Cramer's V equals the absolute value of Phi.
Both coefficients help assess the strength of association between categorical
variables, with Phi specific to 2x2 tables and Cramer's V extending to larger
contingency tables. They aid in understanding relationships within datasets or
research studies.
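Both coefficients follow directly from the cell counts. The sketch below is an illustration with made-up function names; it reuses the gender by party table from the Fisher's Exact Test example purely as sample data, and shows that Phi and Cramer's V coincide (up to sign) for a 2x2 table.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return sum((obs - row_totals[i] * col_totals[j] / n) ** 2
               / (row_totals[i] * col_totals[j] / n)
               for i, row in enumerate(table)
               for j, obs in enumerate(row))

def phi_2x2(table):
    """Signed phi coefficient for a 2x2 table [[a, b], [c, d]]."""
    (a, b), (c, d) = table
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def cramers_v(table):
    """Cramer's V = sqrt(chi-square / (N * (min(rows, columns) - 1)))."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))

# Sample data: the 2x2 table from the Fisher's Exact Test example.
party = [[8, 4], [4, 9]]
print(f"phi = {phi_2x2(party):.3f}, Cramer's V = {cramers_v(party):.3f}")
```

A table with all counts on the diagonal, such as a 3x3 table with 5 in each diagonal cell, gives V = 1, the "perfect association" end of the range.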
Lambda Coefficient: A measure of association that reflects the proportional
reduction in error when values of the independent variable are used to
predict values of the dependent variable. A value of 1 means that the
independent variable perfectly predicts the dependent variable. A value of 0
means that the independent variable is no help in predicting the dependent
variable.
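The "proportional reduction in error" idea can be made concrete with a short sketch. The function name is made up for this illustration, the column variable is assumed to be the dependent variable, and the sample table is the gender by party table from the Fisher's Exact Test example.

```python
def goodman_kruskal_lambda(table):
    """Lambda with the column variable dependent and the row variable independent."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # Errors made when always guessing the most common column category.
    errors_without = n - max(col_totals)
    # Errors made when guessing the most common column category within each row.
    errors_with = n - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

# Rows: Female, Male; columns: Democrat, Republican.
print(f"lambda = {goodman_kruskal_lambda([[8, 4], [4, 9]]):.3f}")
```

Without knowing gender, guessing "Republican" for everyone gives 12 errors out of 25; guessing the modal party within each gender gives 8 errors, so lambda is (12 - 8) / 12.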
Uncertainty coefficient: A measure of association that indicates the
proportional reduction in error when values of one variable are used to
predict values of the other variable. For example, a value of 0.83 indicates
that knowledge of one variable reduces error in predicting values of the
other variable by 83%. The program calculates both symmetric and
asymmetric versions of the uncertainty coefficient.
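One common way to define the asymmetric uncertainty coefficient is through entropy: the share of the dependent variable's uncertainty that is removed by knowing the other variable. The sketch below follows that definition; the function name is an assumption for this illustration, the column variable is treated as dependent, and the sample data are hypothetical apart from the reused gender by party table.

```python
import math

def uncertainty_coefficient(table):
    """U(column | row): mutual information divided by the column entropy."""
    n = sum(sum(row) for row in table)
    row_p = [sum(row) / n for row in table]
    col_p = [sum(col) / n for col in zip(*table)]
    # Entropy of the dependent (column) variable.
    h_col = -sum(p * math.log(p) for p in col_p if p > 0)
    # Mutual information between the row and column variables.
    mutual = 0.0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            if count > 0:
                p = count / n
                mutual += p * math.log(p / (row_p[i] * col_p[j]))
    return mutual / h_col

print(f"U = {uncertainty_coefficient([[8, 4], [4, 9]]):.3f}")
```

A perfectly associated table gives 1 and a table with identical rows gives 0, matching the interpretation of the coefficient as a proportional reduction in uncertainty.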
For Ordinal variables: For tables in which both rows and columns contain
ordered values
Goodman and Kruskal's gamma: Goodman and Kruskal's gamma is a nonparametric measure
of the strength and direction of association between two ordinal variables. It can also
be used when both ordinal variables have just two categories. For example, you could
use Goodman and Kruskal's gamma to understand whether there is an
association between exam performance (i.e., with two categories: "pass" or
"fail") and test anxiety level (i.e., with two categories: "high" or "low").
Assumptions:
1.Your two variables should be measured on an ordinal scale. Examples
of ordinal variables include Likert items (e.g., a 7-point scale from "strongly
agree" through to "strongly disagree").
2. There needs to be a monotonic relationship between the two variables.
Example
A researcher at the Department of Health wants to determine if there is an association
between the amount of physical activity people undertake and obesity levels. They recruited
250 people to take part in a study to find out. These participants were randomly sampled
from the population.
Participants were asked to complete a questionnaire explaining their level of physical
activity. Based on the results from this questionnaire, participants were categorized into one
of five physical activity levels: "sedentary", "low", "moderate", "high" and "very high".
Participants were also assessed by a nurse practitioner to determine their body fat
classification. Based on this assessment, participants were categorized into one of four
levels: "underweight", "normal", "obese" and "morbidly obese". These ordered responses
are the categories of our two variables: physical_activity_level (five categories)
and body_fat_classification (four categories).
Data setup
For a Goodman and Kruskal's gamma, you will have either two variables (one row
per participant) or three variables (when the data are entered as cell frequencies):
(1) The ordinal variable, physical_activity_level, which has five ordered
categories: "sedentary", "low", "moderate", "high" and "very high";
(2) The ordinal variable, body_fat_classification, which has four ordered
categories: "underweight", "normal", "obese" and "morbidly obese".
(3) The frequencies (i.e., total counts) for the two ordinal variables above (i.e.,
the number of participants for each cell combination). This is captured in the
variable, freq.
Procedure:
1. Click Analyze > Descriptive Statistics > Crosstabs... on the top menu.
2. You will be presented with the Crosstabs dialogue box.
3. Transfer the variable, physical_activity_level, into the Row(s): box, and the
variable, body_fat_classification, into the Column(s): box, by dragging-and-dropping
or by clicking the relevant buttons.
4. Click on the Statistics button. You will be presented with the
Crosstabs: Statistics dialogue box.
5. Select the Gamma tick box in the –Ordinal– area.
6. Click on the Continue button. You will be returned to the Crosstabs dialogue box.
7. Click on the OK button. This will generate the output for Goodman and
Kruskal's gamma.
Interpreting the results for Goodman and Kruskal's gamma
The Case Processing Summary table provides a useful check of your data
to determine the valid sample size, N, and whether you have any missing
data. In our example, there were 250 participants with no missing data.
Finally, you should consult the Symmetric Measures table, which provides
the result of Goodman and Kruskal's gamma, as shown below:
Goodman and Kruskal's gamma is presented in the "Gamma" row of the "Value" column
and is -.509 in this example. This indicates that there is a strong, negative
association between the level of physical activity and body fat classification. In other
words, higher levels of physical activity (e.g., a "very high" level of physical activity) are
associated with a lower body fat classification (e.g., an "underweight" body fat
classification); and vice versa, with lower levels of physical activity (e.g., a "sedentary"
level of physical activity) being associated with a higher body fat classification (e.g., a
"morbidly obese" body fat classification).
Furthermore, the "Approx. Sig." column shows that the statistical significance
value (i.e., p-value) is less than .001. Therefore,
the association between physical activity level and body fat classification is statistically
significant.
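Gamma is built from concordant and discordant pairs of observations: gamma = (C - D) / (C + D). The sketch below illustrates the counting for a crosstabulation whose rows and columns are both in ascending order. The function names and the 3x3 counts are hypothetical, since the 5x4 table from the study is not reproduced in the text.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):      # cells in a lower row
                for m in range(cols):
                    if m > j:                 # lower and to the right
                        concordant += table[i][j] * table[k][m]
                    elif m < j:               # lower and to the left
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

def gamma(table):
    """Goodman and Kruskal's gamma = (C - D) / (C + D)."""
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

# Hypothetical counts: rows = physical activity (low to high),
# columns = body fat classification (low to high).
activity_by_body_fat = [[2, 5, 10],
                        [4, 8, 4],
                        [9, 5, 1]]
print(f"gamma = {gamma(activity_by_body_fat):.3f}")
```

In this made-up table, high activity goes with low body fat, so discordant pairs outnumber concordant ones and gamma is negative, the same direction as the -.509 reported in the example.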
Somers’ d :
Somers' delta (or Somers' d, for short), is a nonparametric measure of the
strength and direction of association that exists between an ordinal
dependent variable and an ordinal independent variable.
For Example: We can use Somers' d to understand whether there is an
association between customer satisfaction and hotel room cleanliness (i.e.,
the ordinal dependent variable is "customer satisfaction", measured on a five
point scale from "very satisfied" to "very dissatisfied", and the ordinal
independent variable is "hotel room cleanliness", measured on a three point
scale from "above average" to "below average").
Interpretation:
When running the Somers' d procedure, start with the Case Processing
Summary table:
The Case Processing Summary table provides a useful check of your data to determine the
valid sample size, N, and whether you have any missing data. In our example, there were 189
participants with no missing data.
Next, you should get a 'feel' for your data using the table showing the crosstabulation of the
data (this will be labelled based on your two variables; in our case,
the hotel_room_cleanliness * customer_satisfaction Crosstabulation table), as shown
below:
Finally, you should consult the Directional Measures table, which provides
the result of Somers' d, as shown below:
Somers' d is presented in the "customer_satisfaction Dependent" row of the
"Value" column and is .603 in this example. This indicates that increased hotel
room cleanliness is associated with increased customer satisfaction.
Furthermore, the "Approx. Sig." column shows that the statistical significance
value (i.e., p-value) is .000, which means p < .0005. Therefore, the association
between the ordinal dependent variable, "customer satisfaction", and ordinal
independent variable, "hotel room cleanliness", is statistically significant.
In our example, you might present the results as follows:
Somers' d was run to determine the association between customer satisfaction
and hotel room cleanliness amongst 189 participants. There was a strong,
positive correlation between customer satisfaction and hotel room cleanliness,
which was statistically significant (d = .603, p < .0005).
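Somers' d uses the same concordant and discordant pairs as gamma, but divides by all pairs that are not tied on the independent variable, which is what makes it asymmetric. The sketch below assumes the independent variable is in the rows and the dependent variable in the columns, both in ascending order. The function names and counts are hypothetical; the 189-participant table is not reproduced in the text.

```python
def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for m in range(cols):
                    if m > j:
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

def somers_d(table):
    """Somers' d with the column variable dependent and the row variable independent."""
    c, d = concordant_discordant(table)
    row_totals = [sum(row) for row in table]
    n = sum(row_totals)
    # Pairs of observations that fall in different rows (not tied on the row variable).
    pairs_not_tied_on_rows = (n * n - sum(r * r for r in row_totals)) / 2
    return (c - d) / pairs_not_tied_on_rows

# Hypothetical counts: rows = room cleanliness (below average to above average),
# columns = customer satisfaction (low to high).
cleanliness_by_satisfaction = [[20, 10, 5],
                               [10, 15, 10],
                               [5, 10, 25]]
print(f"Somers' d = {somers_d(cleanliness_by_satisfaction):.3f}")
```

Swapping rows and columns (transposing the table) gives the other asymmetric value, which corresponds to treating the other variable as dependent.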
Kendall's Tau-b:
Kendall's tau-b assesses the strength and direction of association between two ordinal
variables. It takes ties in the data into account.
1. Open Data in SPSS: Load your dataset.
2. Access Cross-tabulation Analysis: Go to the menu bar.
   1. Click on "Analyze."
   2. Select "Descriptive Statistics."
   3. Choose "Crosstabs."
3. Select Variables: In the "Crosstabs" dialog box:
   1. Choose the ordinal variables you want to analyze.
   2. Place one variable in the "Rows" box and the other in the "Columns" box.
4. Run the Analysis: Click on the "Statistics" button in the Crosstabs dialog box:
   1. Check the box for "Kendall's tau-b" in the "Ordinal" area.
   2. Click "Continue" to return to the Crosstabs dialog box.
   3. Click on the "OK" button to execute the analysis.
Kendall's tau-c: A nonparametric measure of association for ordinal variables
that ignores ties. The sign of the coefficient indicates the direction of the
relationship, and its absolute value indicates the strength, with larger absolute
values indicating stronger relationships. Possible values range from -1 to 1, but
a value of -1 or +1 can be obtained only from square tables.
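The difference between the two coefficients is easiest to see in their denominators. The sketch below uses the standard textbook formulas, with C and D the numbers of concordant and discordant pairs, n the sample size and m the smaller of the number of rows and columns. The function names are made up for this illustration, and the sample data are the gender by party table from the Fisher's Exact Test example, used only because it is the one table given in full.

```python
import math

def concordant_discordant(table):
    """Count concordant (C) and discordant (D) pairs in an ordered r x c table."""
    rows, cols = len(table), len(table[0])
    concordant = discordant = 0
    for i in range(rows):
        for j in range(cols):
            for k in range(i + 1, rows):
                for m in range(cols):
                    if m > j:
                        concordant += table[i][j] * table[k][m]
                    elif m < j:
                        discordant += table[i][j] * table[k][m]
    return concordant, discordant

def kendall_tau_b(table):
    """tau-b: (C - D) scaled by the pairs not tied on each variable."""
    c, d = concordant_discordant(table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    not_tied_rows = (n * n - sum(r * r for r in row_totals)) / 2
    not_tied_cols = (n * n - sum(t * t for t in col_totals)) / 2
    return (c - d) / math.sqrt(not_tied_rows * not_tied_cols)

def kendall_tau_c(table):
    """tau-c: 2m(C - D) / (n^2 (m - 1)), with m = min(rows, columns)."""
    c, d = concordant_discordant(table)
    n = sum(sum(row) for row in table)
    m = min(len(table), len(table[0]))
    return 2 * m * (c - d) / (n * n * (m - 1))

sample = [[8, 4], [4, 9]]
print(f"tau-b = {kendall_tau_b(sample):.3f}, tau-c = {kendall_tau_c(sample):.3f}")
```

Tau-b corrects its denominator for the tied pairs actually present, while tau-c only adjusts for the table dimensions, which is why it suits non-square tables.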
We are going to perform a cross-tabulation of the variables “Prayer
Frequency” and “Fundamentalist” and test for the existence of a relationship
between those two variables, using the SPSS Crosstabs output.
THE ASSUMPTIONS
We use Kendall’s tau-b, Kendall’s tau-c and Gamma to check for a relationship, which
is appropriate because we are analyzing two Ordinal variables
The hypotheses:
We want to test the following null and alternative hypotheses:
Ho: There is no relationship between ''Prayer Frequency'' and ''Fundamentalist''
Ha: There is a relationship between ''Prayer Frequency'' and ''Fundamentalist''
In order to test these hypotheses we use SPSS crosstabs analysis and the Kendall’s tau-
b, Kendall’s tau-c and Gamma statistics.
Level of Significance:
We choose the level of significance alpha = 0.05. The level of significance is
the probability of making a Type I error, that is, of rejecting the null
hypothesis when it is actually true.
Results:
The significance of the Kendall’s tau-b, Kendall’s tau-c and Gamma statistics is
reported as p = .000 (i.e., p < .0005) for all of them. Since the p-values are all less
than 0.05, our previously chosen level of significance, we have enough evidence to
reject the null hypothesis, which indicates that there is a relationship between the
two variables.
Conclusions:
We reject the null hypothesis at the 0.05 level of significance and conclude
that there is a relationship between the variables. The values
of Kendall’s tau-b, Kendall’s tau-c and Gamma are small (0.262, 0.282 and 0.360,
respectively), which indicates a rather weak relationship.
Nominal by Interval: When one variable is categorical and the other is
quantitative, select Eta. The categorical variable must be coded numerically.
Eta: A measure of association that ranges from 0 to 1, with 0 indicating no
association between the row and column variables and values close to 1
indicating a high degree of association. Eta is appropriate for a dependent
variable measured on an interval scale (for example, income) and an
independent variable with a limited number of categories (for example,
gender). Two eta values are computed: one treats the row variable as the
interval variable, and the other treats the column variable as the interval
variable.
Interpret Results:
Eta measures the strength of association between a categorical variable and an
interval-scale variable. Its square, eta squared, is the proportion of variance in
the interval variable that is explained by differences between the categories.
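Eta can be computed from the same sums of squares used in one-way ANOVA: eta is the square root of the between-group sum of squares divided by the total sum of squares. The sketch below is an illustration only; the function name, the group labels and the values are hypothetical.

```python
import math

def eta(groups):
    """Eta for an interval-scale variable split by a categorical variable.

    groups maps each category label to the list of interval-scale values
    observed in that category.
    """
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2
                     for g in groups.values())
    return math.sqrt(ss_between / ss_total)

# Hypothetical data: an interval-scale score recorded in two categories.
scores = {"group A": [1, 2, 3], "group B": [4, 5, 6]}
value = eta(scores)
print(f"eta = {value:.3f}, eta squared = {value ** 2:.3f}")
```

When every category has the same mean, the between-group sum of squares is zero and eta is 0; when all of the variation lies between categories, eta is 1.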
Cohen's kappa
Cohen's kappa is a statistical measure that assesses the level of agreement
between two raters or observers when dealing with categorical or nominal
data. It's particularly useful in cases where there might be agreement by
chance alone.
For a Cohen's kappa, you will have two variables. In this example, these are:
(1) the scores for "Rater 1", Officer1, which reflect Police Officer 1's
decision to rate a person's behaviour as being either "normal" or
"suspicious"; and (2) the scores for "Rater 2", Officer2, which reflect Police
Officer 2's decision to rate a person's behaviour as being either "normal" or
"suspicious".
Assumptions:
1. The response (e.g., judgement) that is made by your two raters is measured on
a categorical scale (i.e., either an ordinal or nominal variable) and the categories need to
be mutually exclusive.
2. The response data are paired observations of the same phenomenon, meaning that both
raters assess the same observations.
3. Each response variable must have the same number of categories and
the crosstabulation must be symmetric (i.e., "square") (e.g., a 2x2 crosstabulation, 3x3
crosstabulation, 4x4 crosstabulation, etc.). For example, a 2x2 crosstabulation means that the
responses of both raters are measured on a dichotomous scale; that is, a nominal scale with
two categories (e.g., no scarring vs scarring).
4. The two raters are independent (i.e., one rater's judgement does not affect the other rater's
judgement).
5. The same two raters are used to judge all observations. This has been referred to as
having fixed or unique raters. If different raters were used for each observation (e.g.,
patient), Cohen's kappa is not the appropriate test to use.
1. Click Analyze > Descriptive Statistics > Crosstabs... on the main menu.
2. You will be presented with the Crosstabs dialogue box.
3. Transfer one variable (e.g., Officer1) into the Row(s): box, and the
second variable (e.g., Officer2) into the Column(s): box.
4. Click on the Statistics button. You will be presented with the Crosstabs:
Statistics dialogue box.
5. Select the Kappa checkbox.
6. Click on the Continue button and you will be returned to
the Crosstabs dialogue box.
7. Click on the Cells button. You will be presented with
the Crosstabs: Cell Display dialogue box.
8. Keep the Observed checkbox selected.
9. Click on the Continue button. You will be returned to
the Crosstabs dialogue box.
10. Click on the OK button to generate the output for Cohen's kappa.
Output of Cohen's kappa:
SPSS Statistics generates two main tables of output for Cohen's kappa:
the Crosstabulation table and Symmetric Measures table
We can use the Crosstabulation table, amongst other things, to understand the degree to
which the two raters (i.e., both police officers) agreed and disagreed on their judgement of
suspicious behaviour. You can see from the table above that of the 100 people evaluated by
the police officers, 85 people displayed normal behaviour as agreed by both police officers.
In addition, both officers agreed that there were seven people who displayed suspicious
behaviour. Therefore, there were eight individuals (i.e., 6 + 2 = 8) for whom the two police
officers could not agree on their behaviour.
You can see that Cohen's kappa (κ) is .593. This is the proportion of
agreement over and above chance agreement. Cohen's kappa (κ) can range
from -1 to +1. A kappa (κ) of .593 represents a moderate strength of
agreement. Furthermore, since p < .001 (i.e., p is less than .001), our kappa
(κ) coefficient is statistically significantly different from zero.
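The kappa value can be reproduced from the counts quoted in the text: 85 agreements on "normal", 7 agreements on "suspicious", and 6 + 2 disagreements among 100 people. The sketch below is an illustration; the function name is made up, and which off-diagonal cell holds the 6 and which the 2 is an assumption (kappa is the same either way).

```python
def cohens_kappa(table):
    """Cohen's kappa for a square agreement table (rater 1 in rows, rater 2 in columns)."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    # Observed proportion of agreement: the diagonal cells.
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Agreement expected by chance, from the raters' marginal proportions.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (n * n)
    return (observed - expected) / (1 - expected)

# Rows: Officer1 (normal, suspicious); columns: Officer2 (normal, suspicious).
# The placement of the 6 and the 2 in the off-diagonal cells is assumed.
officers = [[85, 6],
            [2, 7]]
print(f"kappa = {cohens_kappa(officers):.3f}")
```

Here the observed agreement is 0.92 and the chance agreement is 0.8034, so kappa is (0.92 - 0.8034) / (1 - 0.8034), in line with the .593 reported by SPSS.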
McNemar test
The McNemar test assesses the difference between two related categorical
variables. It is often used to analyze paired categorical data, especially when a
binary outcome (e.g., yes/no, success/failure) is measured on the
same subjects or entities at different points in time or under different conditions.
To perform a McNemar test in SPSS:
1. Data Preparation: Enter the data with one row per subject (or matched pair) and one
dichotomous variable for each of the two measurements; crosstabulating the two variables
gives a 2x2 contingency table.
2. Open SPSS: Start by opening your dataset in SPSS.
3. Conduct the Test:
   1. Go to "Analyze" > "Nonparametric Tests" > "Legacy Dialogs" > "2 Related
   Samples..."
   2. In the dialog box that appears, move your paired categorical variables to the "Test
   Pairs" box.
   3. Select McNemar in the "Test Type" area.
   4. Click "OK" to run the analysis.
Example
A researcher wanted to investigate the impact of an intervention on smoking. In this
hypothetical study, 50 participants were recruited to take part, consisting of 25 smokers and
25 non-smokers. All participants watched an emotive video showing the impact that deaths
from smoking-related cancers had on families. Two weeks after this video intervention, the
same participants were asked whether they remained smokers or non-smokers.
Therefore, participants were categorized as being either smokers or non-smokers before the
intervention and then re-assessed as either smokers or non-smokers after the intervention.
Due to the same participants being measured twice, we have paired-samples. We also have
a dependent variable that is dichotomous with two mutually exclusive categories (i.e.,
"smoker" and "non-smoker"). As a result, a McNemar's test is the appropriate choice to
analyze the data.
Output of the McNemar's test
Crosstabulation Table:
Consulting the bottom-left cell first, you can see that there were 16 participants that were
originally smokers, but following the intervention, they became non-smokers. In the sense
that the intervention was designed to reduce smoking, these participants could be
considered the intervention's successes. However, by consulting the top-right cell, you can
see that five non-smokers actually took up smoking following the intervention! Clearly,
this is not the effect you were looking for, and it is important that you note this in your
report. So, although overall there were more 'positive' changes than 'negative' changes, it
can be enlightening to know the different 'directions of travel' that the participants took.
Test Statistics Table:
Fifty participants were recruited to take part in an intervention
designed to warn about the dangers of smoking. An exact
McNemar's test determined that there was a statistically
significant difference in the proportion of non-smokers pre-
and post-intervention, p = .027.
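The exact McNemar test depends only on the two discordant cells: the 16 smokers who became non-smokers and the 5 non-smokers who took up smoking. Under the null hypothesis each change is equally likely in either direction, so the p-value comes from a binomial distribution with probability 0.5. The sketch below is an illustration with a made-up function name.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts b and c."""
    n = b + c
    k = min(b, c)
    # Probability of k or fewer changes in the rarer direction when p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 16 participants changed from smoker to non-smoker, 5 changed the other way.
p = mcnemar_exact(16, 5)
print(f"exact McNemar p = {p:.3f}")
```

The 20 participants who stayed non-smokers and the 9 who stayed smokers (the remaining counts implied by the 25/25 split) do not enter the calculation, because they carry no information about change.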

More Related Content

PPSX
Evaluating Algebraic Expressions - Math 7 Q2W4 LC1
PPTX
Assessment
PPTX
union and intersection.pptx
PPTX
Share My Lesson: The Slope of a Line
PDF
WORKBOOK ENGLISH GRADE 4
PPTX
Curriculum Mapping
PDF
6.1 Radian Measure
PDF
Inscribed Angles
Evaluating Algebraic Expressions - Math 7 Q2W4 LC1
Assessment
union and intersection.pptx
Share My Lesson: The Slope of a Line
WORKBOOK ENGLISH GRADE 4
Curriculum Mapping
6.1 Radian Measure
Inscribed Angles

What's hot (20)

PPT
Education planning
DOCX
Action Plan in Gardening 2022.docx
PPTX
Triangle congruence-gr.8
PPTX
DM_017 S.2025.pptx..............................
PPTX
nature of the roots and discriminant
DOCX
Lesson plan special angles
PDF
1.1 Real Number Properties
PPTX
Lesson 1: Special Products
PPTX
PERCENTILE : MEASURES OF POSITION FOR GROUPED DATA
PDF
5As Lesson Plan on Pairs of Angles Formed by Parallel Lines Cut by a Transversal
PDF
Grade 9: Mathematics Unit 3 Variation
PDF
Parallel lines and transversals wkst
PPTX
RPMS-PPST-Multiyear-2023.final.pptx
PDF
China and India worksheet
PPTX
4. Classroom Observation Tool (COT).pptx
DOCX
SUMMARY OF INDEX OF MASTERY.docx
DOCX
Periodical / Quarter 3 - Test for Grade 1 Pupils
PDF
Math 9 Curriculum Guide rev.2016
PPT
1.3 midpoint and distance formulas
PPTX
Introduction and Orientation on FUNDAMENTALS OF ABM 1.pptx
Education planning
Action Plan in Gardening 2022.docx
Triangle congruence-gr.8
DM_017 S.2025.pptx..............................
nature of the roots and discriminant
Lesson plan special angles
1.1 Real Number Properties
Lesson 1: Special Products
PERCENTILE : MEASURES OF POSITION FOR GROUPED DATA
5As Lesson Plan on Pairs of Angles Formed by Parallel Lines Cut by a Transversal
Grade 9: Mathematics Unit 3 Variation
Parallel lines and transversals wkst
RPMS-PPST-Multiyear-2023.final.pptx
China and India worksheet
4. Classroom Observation Tool (COT).pptx
SUMMARY OF INDEX OF MASTERY.docx
Periodical / Quarter 3 - Test for Grade 1 Pupils
Math 9 Curriculum Guide rev.2016
1.3 midpoint and distance formulas
Introduction and Orientation on FUNDAMENTALS OF ABM 1.pptx
Ad

Similar to Association and its different measures using SPSS (20)

PPTX
marketing research & applications on SPSS
PPTX
lecture_5.pptx
PPTX
Statistical test in spss
PPTX
How to Choose the Right Statistical Test for Your Assignment
PDF
Katagorisel veri analizi
PPT
Introduction to spss
PPT
Chapter38
PPTX
UNIT 5.pptx
PPTX
Categorical_Data_Analysis_Combined_MSc_Biostatistics.pptx
DOCX
Spss cross tab n chi sq bivariate analysis
PPTX
Summary of statistical tools used in spss
PPTX
ARM Module 5. Advanced research methodology
PPTX
Chi square and t tests, Neelam zafar & group
PPT
Week-7-Slides-Mean-Tests-Parametrics-Test-selection-module.ppt
PDF
Overview of Advance Marketing Research
PPTX
BRM Unit 3 Data Analysis.pptx
PPT
Chi square mahmoud
PPT
Cross_Tabs_lecture.ppt
PPTX
Marketing Research Hypothesis Testing.pptx
marketing research & applications on SPSS
lecture_5.pptx
Statistical test in spss
How to Choose the Right Statistical Test for Your Assignment
Katagorisel veri analizi
Introduction to spss
Chapter38
UNIT 5.pptx
Categorical_Data_Analysis_Combined_MSc_Biostatistics.pptx
Spss cross tab n chi sq bivariate analysis
Summary of statistical tools used in spss
ARM Module 5. Advanced research methodology
Chi square and t tests, Neelam zafar & group
Week-7-Slides-Mean-Tests-Parametrics-Test-selection-module.ppt
Overview of Advance Marketing Research
BRM Unit 3 Data Analysis.pptx
Chi square mahmoud
Cross_Tabs_lecture.ppt
Marketing Research Hypothesis Testing.pptx
Ad


Association and its different measures using SPSS

  • 1. UNIVERSITY OF LUCKNOW Association and its different measures using SPSS Presented By: Ankur Dhangar M.Sc. Biostatistics Sem -3 Roll No. 2210014145008
  • 2. Association and its different measures Association refers to the relationship or dependency between two or more categorical variables. It's about understanding whether the occurrence or distribution of values in one categorical variable is related to or influences the values in another categorical variable. When exploring association between categorical variables: 1. Independence: If two categorical variables are independent, the occurrence of one variable's categories does not affect the distribution of the other variable's categories. For example, there might not be any association between gender and favorite ice cream flavor. 2. Association or Dependency: When there's an association between categorical variables, the occurrence or distribution of values in one variable is related to the values in another variable. For instance, there might be an association between smoking habits (yes/no) and the incidence of a particular health condition (present/absent).
  • 3. Measures of association Association can be assessed with several statistical tests and measures, such as: For nominal variables: 1. Chi-square test 2. Fisher's exact test 3. Phi coefficient and Cramer's V 4. Lambda 5. Uncertainty coefficient. For ordinal variables: 1. Gamma 2. Somers' d 3. Kendall's tau-b 4. Kendall's tau-c. For nominal by interval: 1. Eta. Others: 1. Kappa 2. Risk 3. McNemar, etc.
  • 4. The chi-square test for independence, also called Pearson's chi-square test or the chi-square test of association, is used to discover if there is a relationship between two categorical variables. Assumptions: When you choose to analyse your data using a chi-square test for independence, you need to make sure that the data you want to analyse "passes" two assumptions: Assumption #1: Your two variables should be measured at an ordinal or nominal level (i.e., categorical data). Assumption #2: Your two variables should consist of two or more categorical, independent groups. Example independent variables that meet this criterion include gender (2 groups: Males and Females) and ethnicity (e.g., 3 groups: Caucasian, African American and Hispanic).
  • 5. Example Educators are always looking for novel ways in which to teach statistics to undergraduates as part of a non-statistics degree course (e.g., psychology). With current technology, it is possible to present how-to guides for statistical programs online instead of in a book. However, different people learn in different ways. An educator would like to know whether gender (male/female) is associated with the preferred type of learning medium (online vs. books). Therefore, we have two nominal variables: Gender (male/female) and Preferred Learning Medium (online/books). Setup in SPSS In SPSS Statistics, we created two variables so that we could enter our data: Gender and Preferred_Learning_Medium.
  • 6. Procedure: 1. Click Analyze > Descriptive Statistics > Crosstabs... on the top menu, as shown below: 2. You will be presented with the following Crosstabs dialogue box:
  • 7. 3. Transfer one of the variables into the Row(s): box and the other variable into the Column(s): box. In our example, we will transfer the Gender variable into the Row(s): box and Preferred_Learning_Medium into the Column(s): box. 4. Click on the Statistics button. You will be presented with the following Crosstabs: Statistics dialogue box:
  • 8. 5. Select the Chi-square and Phi and Cramer's V options, as shown below: 6. Click on the Continue button. 7. Click on the Cells button. You will be presented with the following Crosstabs: Cell Display dialogue box:
  • 9. 8. Select Observed from the –Counts– area, and Row, Column and Total from the –Percentages– area. 9. Click on the Continue button. 10. Click on the Format button. Note: This next option is only really useful if you have more than two categories in one of your variables, but we will show it here in case you have. If you don't, you can skip to STEP 12. 11. You will be presented with the following:
  • 10. This option allows you to change the order of the values to either ascending or descending. 12. Once you have made your choice, click on the Continue button. 13. Click on the OK button to generate your output. Output: You will be presented with some tables in the Output Viewer under the title "Crosstabs". The tables of note are presented below: The Crosstabulation Table (Gender*Preferred Learning Medium Crosstabulation)
  • 11. This table allows us to understand that both males and females prefer to learn using online materials versus books. The Chi-Square Tests Table When reading this table we are interested in the results of the "Pearson Chi-Square" row. We can see here that χ²(1) = 0.487, p = .485. This tells us that there is no statistically significant association between Gender and Preferred Learning Medium; that is, the data provide no evidence that the preference for online learning versus books differs between Males and Females.
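As a rough cross-check of the Chi-Square Tests table, the sketch below shows how a Pearson chi-square statistic is built from observed and expected counts, and how a p-value follows from a statistic with 1 degree of freedom. The function names `pearson_chi_square` and `chi2_p_value_df1` are illustrative and are not part of SPSS; the slides do not print the cell counts of the Gender by Preferred Learning Medium table, so only the reported statistic (0.487) is used here.

```python
import math

def pearson_chi_square(table):
    """Return (statistic, degrees of freedom) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of rows and columns.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return stat, df

def chi2_p_value_df1(stat):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    return math.erfc(math.sqrt(stat / 2.0))

# Statistic taken from the "Pearson Chi-Square" row described above.
print(f"p = {chi2_p_value_df1(0.487):.3f}")
```

The printed value should round to the p = .485 reported in the output. The closed form used for the p-value only holds for 1 degree of freedom, which is the case for any 2x2 table.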
  • 12. The Symmetric Measures Table Phi and Cramer's V are both measures of the strength of association. We can see that the strength of association between the variables is very weak. Fisher’s Exact Test Fisher’s Exact Test is used to determine whether or not there is a significant association between two categorical variables. It is typically used as an alternative to the Chi-Square Test of Independence when one or more of the cell counts in a 2×2 table is less than 5.
  • 13. Example: Suppose we want to know whether or not gender is associated with political party preference at a particular college. To explore this, we randomly poll 25 students on campus. The number of students who are Democrats or Republicans, based on gender, is shown in the table below:
                 Democrat   Republican
        Female       8          4
        Male         4          9
    To determine if there is a statistically significant association between gender and political party preference, we can use the following steps to perform Fisher’s Exact Test in SPSS: Step 1: Enter the data. First, enter the data as shown below: Each row shows an individual’s ID, their political party preference, and their gender.
  • 14. Step 2: Perform Fisher’s Exact Test. Click the Analyze tab, then Descriptive Statistics, then Crosstabs:
  • 15. Drag the variable Gender into the box labelled Rows and the variable Party into the box labelled Columns. Then click the button labelled Statistics and make sure that the box next to Chi-square is checked. Then click Continue. Next, click the button labelled Exact and make sure the box next to Exact is checked. Then click Continue.
  • 16. Lastly, click OK to perform Fisher’s Exact Test. Interpret the results Once you click OK, the results of Fisher’s Exact Test will be displayed:
  • 17. The first table displays the number of missing cases in the dataset. We can see that there are 0 missing cases in this example. The second table displays a crosstab of the total number of individuals by gender and political party preference. The third table shows the results of Fisher’s Exact Test. We can see the following two p-values for the test: •Two-sided p-value: .115 •One-sided p-value: .081 The null hypothesis for Fisher’s Exact Test is that the two variables are independent. In this case, our null hypothesis is that gender and political party preference are independent, which is a two-sided test so we would use the two-sided p-value of 0.115. Since this p-value is not less than 0.05, we do not reject the null hypothesis. Thus, we don’t have sufficient evidence to say that there is a significant association between gender and political party preference.
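The two p-values can be reproduced by hand from the hypergeometric distribution. The following is a minimal sketch using the 2x2 counts from the example (Female: 8 Democrat, 4 Republican; Male: 4 Democrat, 9 Republican); the function name `fisher_exact_2x2` is illustrative, and the two-sided rule used (summing all tables no more likely than the observed one) is an assumption about how the two-sided value is defined.

```python
from math import comb

def fisher_exact_2x2(a, b, c, d):
    """Return (two_sided_p, one_sided_p) for the 2x2 table [[a, b], [c, d]].

    The one-sided value is the probability of a top-left count at least as
    large as the observed one, with all margins held fixed.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Unnormalised hypergeometric weights for each feasible top-left count.
    weights = {k: comb(row1, k) * comb(row2, col1 - k) for k in range(lo, hi + 1)}
    observed = weights[a]
    two_sided = sum(w for w in weights.values() if w <= observed) / total
    one_sided = sum(w for k, w in weights.items() if k >= a) / total
    return two_sided, one_sided

two_sided, one_sided = fisher_exact_2x2(8, 4, 4, 9)
print(f"two-sided p = {two_sided:.3f}, one-sided p = {one_sided:.3f}")
```

Working with integer weights avoids floating-point ties when deciding which tables are "as extreme" as the observed one. The results should agree with the .115 and .081 shown in the SPSS output.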
  • 18. Phi coefficient and Cramer's V Phi Coefficient: •Use Case: Measures association between two dichotomous variables in a 2x2 table. •Range: Varies between -1 and 1. •Interpretation: • 1 indicates a perfect association. • 0 indicates no association. • -1 indicates a perfect negative association. •Specificity: Applicable only to 2x2 contingency tables. •Calculation: Derived when analyzing two dichotomous variables using Crosstabs in SPSS with the "Phi and Cramer's V" option selected.
  • 19. Cramer's V: •Use Case: Measures association between categorical variables in contingency tables larger than 2x2. •Range: Varies between 0 and 1. •Interpretation: • 1 indicates a perfect association. • 0 indicates no association. •Applicability: Suitable for larger contingency tables beyond 2x2, providing a measure of association strength. •Calculation: Reported by SPSS Crosstabs when the "Phi and Cramer's V" option is selected. Both coefficients help assess the strength of association between categorical variables, with Phi specific to 2x2 tables and Cramer's V extending to larger contingency tables. They aid in understanding relationships within datasets or research studies.
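Both coefficients rescale the chi-square statistic by the sample size. The sketch below assumes the usual definitions, phi = (ad - bc) / sqrt(product of the four margins) for a 2x2 table and V = sqrt(chi-square / (N * (min(rows, columns) - 1))), and applies them to the gender by party table from the Fisher's exact test example. The function names are illustrative, not SPSS identifiers.

```python
import math

def phi_2x2(a, b, c, d):
    """Signed phi coefficient for the 2x2 table [[a, b], [c, d]]."""
    return (a * d - b * c) / math.sqrt((a + b) * (c + d) * (a + c) * (b + d))

def cramers_v(table):
    """Cramer's V for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = sum(
        (table[i][j] - row_totals[i] * col_totals[j] / n) ** 2
        / (row_totals[i] * col_totals[j] / n)
        for i in range(len(row_totals))
        for j in range(len(col_totals))
    )
    k = min(len(row_totals), len(col_totals)) - 1
    return math.sqrt(chi2 / (n * k))

print(f"phi = {phi_2x2(8, 4, 4, 9):.3f}")
print(f"V   = {cramers_v([[8, 4], [4, 9]]):.3f}")
```

For a 2x2 table the two values coincide in absolute size, which is a quick sanity check on either calculation.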
  • 20. Lambda Coefficient: A measure of association that reflects the proportional reduction in error when values of the independent variable are used to predict values of the dependent variable. A value of 1 means that the independent variable perfectly predicts the dependent variable. A value of 0 means that the independent variable is no help in predicting the dependent variable. Uncertainty coefficient: A measure of association that indicates the proportional reduction in error when values of one variable are used to predict values of the other variable. For example, a value of 0.83 indicates that knowledge of one variable reduces error in predicting values of the other variable by 83%. The program calculates both symmetric and asymmetric versions of the uncertainty coefficient
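To make "proportional reduction in error" concrete, the sketch below computes the asymmetric versions of both coefficients, treating the column variable as dependent. It reuses the gender by party counts from the Fisher's exact test example purely for illustration; the function names are hypothetical, and natural logarithms are assumed for the entropy terms.

```python
import math

def lambda_columns_dependent(table):
    """Lambda for predicting the column category from the row category."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    # Errors when always guessing the overall modal column.
    errors_without = n - max(col_totals)
    # Errors when guessing the modal column within each row.
    errors_with = n - sum(max(row) for row in table)
    return (errors_without - errors_with) / errors_without

def entropy(counts):
    """Shannon entropy (natural log) of a list of counts."""
    total = sum(counts)
    return -sum(c / total * math.log(c / total) for c in counts if c > 0)

def uncertainty_columns_dependent(table):
    """Uncertainty coefficient: share of column entropy removed by knowing the row."""
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(col_totals)
    h_col = entropy(col_totals)
    h_col_given_row = sum(sum(row) / n * entropy(row) for row in table)
    return (h_col - h_col_given_row) / h_col

table = [[8, 4], [4, 9]]  # rows: Female, Male; columns: Democrat, Republican
print(f"lambda = {lambda_columns_dependent(table):.3f}")
print(f"uncertainty coefficient = {uncertainty_columns_dependent(table):.3f}")
```

Lambda counts prediction errors, while the uncertainty coefficient measures the reduction in entropy, so the two can differ noticeably on the same table.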
  • 21. For Ordinal variables: For tables in which both rows and columns contain ordered values. Goodman and Kruskal's gamma: Goodman and Kruskal's gamma measures the strength and direction of association between two ordinal variables; it can also be used when both ordinal variables have just two categories. For example, you could use Goodman and Kruskal's gamma to understand whether there is an association between exam performance (i.e., with two categories: "pass" or "fail") and test anxiety level (i.e., with two categories: "high" or "low"). Assumptions: 1. Your two variables should be measured on an ordinal scale. Examples of ordinal variables include Likert items (e.g., a 7-point scale from "strongly agree" through to "strongly disagree"). 2. There needs to be a monotonic relationship between the two variables.
  • 22. Example A researcher at the Department of Health wants to determine if there is an association between the amount of physical activity people undertake and obesity levels. They recruited 250 people to take part in a study to find out. These participants were randomly sampled from the population. Participants were asked to complete a questionnaire explaining their level of physical activity. Based on the results from this questionnaire, participants were categorized into one of five physical activity levels: "sedentary", "low", "moderate", "high" and "very high". Participants were also assessed by a nurse practitioner to determine their body fat classification. Based on this assessment, participants were categorized into one of four levels: "morbidly obese", "obese", "normal" and "underweight". These ordered responses reflected the categories of our two variables: physical_activity_level (i.e., with five categories: "sedentary", "low", "moderate", "high" and "very high") and body_fat_classification (i.e., with four categories: "morbidly obese", "obese", "normal" and "underweight").
  • 23. Data setup For a Goodman and Kruskal's gamma, you will have either two or three variables: (1) The ordinal variable, physical_activity_level, which has five ordered categories: "sedentary", "low", "moderate", "high" and "very high"; (2) The ordinal variable, body_fat_classification, which has four ordered categories: "underweight", "normal", "obese" and "morbidly obese". (3) The frequencies (i.e., total counts) for the two ordinal variables above (i.e., the number of participants for each cell combination). This is captured in the variable, freq.
  • 24. Procedure: Click Analyze > Descriptive Statistics > Crosstabs... on the top menu, as shown below: You will be presented with the Crosstabs dialogue box, as shown below: Transfer the variable, physical_activity_level, into the Row(s): box, and the variable, body_fat_classification, into the Column(s): box, by dragging-and-dropping or by clicking the relevant buttons
  • 25. Click on the Statistics button. You will be presented with the following Crosstabs: Statistics dialogue box: Select the Gamma tick box in the –Ordinal– area, as shown below: Click on the Continue button. You will be returned to the Crosstabs dialogue box, as shown below:
  • 26. Click on the OK button. This will generate the output for Goodman and Kruskal's gamma. Interpreting the results for Goodman and Kruskal's gamma The Case Processing Summary table provides a useful check of your data to determine the valid sample size, N, and whether you have any missing data. In our example, there were 250 participants with no missing data.
  • 27. Finally, you should consult the Symmetric Measures table, which provides the result of Goodman and Kruskal's gamma, as shown below: Goodman and Kruskal's gamma is presented in the "Gamma" row of the "Value" column and is -.509 in this example. This indicates that there is a strong, negative association between the level of physical activity and body fat classification. In other words, higher levels of physical activity (e.g., a "very high" level of physical activity) are associated with a lower body fat classification (e.g., an "underweight" body fat classification); and vice versa, with lower levels of physical activity (e.g., a "sedentary" level of physical activity) being associated with a higher body fat classification (e.g., a "morbidly obese" body fat classification). Furthermore, the "Approx. Sig." column shows that the statistical significance value (i.e., p-value) is < .001, which means that the p-value is less than .001. Therefore, the association between physical activity level and body fat classification is statistically significant.
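Gamma is built from concordant and discordant pairs: gamma = (C - D) / (C + D), where a pair of cases is concordant if one case is higher on both variables and discordant if it is higher on one and lower on the other. The sketch below shows that calculation for a table whose rows and columns are in ascending order. The 3x3 counts are made up for illustration only, since the slides do not reproduce the 5x4 crosstabulation behind the -.509 result, and the function names are hypothetical.

```python
def concordant_discordant(table):
    """Count concordant and discordant pairs in an ordered r x c table."""
    concordant = discordant = 0
    n_rows, n_cols = len(table), len(table[0])
    for i in range(n_rows):
        for j in range(n_cols):
            for k in range(i + 1, n_rows):  # cases in a higher row category
                for l in range(n_cols):
                    if l > j:  # also higher on the column variable
                        concordant += table[i][j] * table[k][l]
                    elif l < j:  # lower on the column variable
                        discordant += table[i][j] * table[k][l]
    return concordant, discordant

def goodman_kruskal_gamma(table):
    c, d = concordant_discordant(table)
    return (c - d) / (c + d)

# Hypothetical counts, NOT the data from the physical activity example.
illustrative = [
    [10, 5, 2],
    [4, 8, 6],
    [1, 5, 12],
]
print(f"gamma = {goodman_kruskal_gamma(illustrative):.3f}")
```

Pairs tied on either variable are left out of both counts, which is why gamma tends to be larger in absolute value than the tie-adjusted ordinal measures.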
  • 28. Somers’ d : Somers' delta (or Somers' d, for short), is a nonparametric measure of the strength and direction of association that exists between an ordinal dependent variable and an ordinal independent variable. For Example: We can use Somers' d to understand whether there is an association between customer satisfaction and hotel room cleanliness (i.e., the ordinal dependent variable is "customer satisfaction", measured on a five point scale from "very satisfied" to "very dissatisfied", and the ordinal independent variable is "hotel room cleanliness", measured on a three point scale from "above average" to "below average"). Interpretation: when running the Somers' d procedure, start with the Case Processing Summary table:
  • 29. The Case Processing Summary table provides a useful check of your data to determine the valid sample size, N, and whether you have any missing data. In our example, there were 189 participants with no missing data. Next, you should get a 'feel' for your data using the table showing the crosstabulation of the data (this will be labelled based on your two variables; in our case, the hotel_room_cleanliness * customer_satisfaction Crosstabulation table), as shown below: Finally, you should consult the Directional Measures table, which provides the result of Somers' d, as shown below:
  • 30. Somers' d is presented in the "customer_satisfaction Dependent" row of the "Value" column and is .603 in this example. This indicates that increased hotel room cleanliness is associated with increased customer satisfaction. Furthermore, the "Approx. Sig." column shows that the statistical significance value (i.e., p-value) is .000, which means p < .0005. Therefore, the association between the ordinal dependent variable, "customer satisfaction", and ordinal independent variable, "hotel room cleanliness", is statistically significant. In our example, you might present the results as follows: Somers' d was run to determine the association between customer satisfaction and hotel room cleanliness amongst 189 participants. There was a strong, positive correlation between customer satisfaction and hotel room cleanliness, which was statistically significant (d = .603, p < .0005).
  • 31. Kendall's Tau-b: Kendall's tau-b assesses the strength and direction of association between two ordinal variables. It takes ties in the data into account. 1. Open Data in SPSS: Load your dataset. 2. Access Cross-tabulation Analysis: Go to the menu bar. 1. Click on "Analyze." 2. Select "Descriptive Statistics." 3. Choose "Crosstabs." 3. Select Variables: In the "Crosstabs" dialog box: 1. Choose the ordinal variables you want to analyze. 2. Place one variable in the "Rows" box and the other in the "Columns" box. 4. Run the Analysis: Click on the "Statistics" button in the Crosstabs dialog box: 1. Check the box for "Kendall's tau-b" under the "Statistics" list. 2. Click "Continue" to return to the Crosstabs dialog box. 3. Click on the "OK" button to execute the analysis.
  • 32. Kendall's tau-c: A nonparametric measure of association for ordinal variables that ignores ties. The sign of the coefficient indicates the direction of the relationship, and its absolute value indicates the strength, with larger absolute values indicating stronger relationships. Possible values range from -1 to 1, but a value of -1 or +1 can be obtained only from square tables.
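Somers' d, Kendall's tau-b and Kendall's tau-c all share the same numerator, the difference between concordant and discordant pairs, and differ only in the denominator. The sketch below assumes the standard crosstab formulas: with P - Q = 2(C - D), w_r = N² minus the sum of squared row totals and w_c likewise for columns, tau-b = (P - Q) / sqrt(w_r * w_c), tau-c = m(P - Q) / (N²(m - 1)) with m = min(rows, columns), and Somers' d with the column variable dependent = (P - Q) / w_r. Function names and the example counts are hypothetical.

```python
import math

def concordant_minus_discordant(table):
    """C - D for an r x c table with ordered rows and columns."""
    diff = 0
    for i, row in enumerate(table):
        for j, count in enumerate(row):
            for lower_row in table[i + 1:]:
                diff += count * (sum(lower_row[j + 1:]) - sum(lower_row[:j]))
    return diff

def ordinal_measures(table):
    """Return tau-b, tau-c and Somers' d (column variable dependent)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    p_minus_q = 2 * concordant_minus_discordant(table)
    w_r = n ** 2 - sum(t ** 2 for t in row_totals)  # pairs not tied on rows
    w_c = n ** 2 - sum(t ** 2 for t in col_totals)  # pairs not tied on columns
    m = min(len(row_totals), len(col_totals))
    return {
        "tau_b": p_minus_q / math.sqrt(w_r * w_c),
        "tau_c": m * p_minus_q / (n ** 2 * (m - 1)),
        "somers_d": p_minus_q / w_r,
    }

# Hypothetical counts for two three-level ordinal variables.
for name, value in ordinal_measures([[10, 5, 2], [4, 8, 6], [1, 5, 12]]).items():
    print(f"{name} = {value:.3f}")
```

Tau-b corrects for ties through the row and column margins, whereas tau-c instead adjusts for the table's dimensions, which is why the two differ on non-square tables.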
  • 33. We are going to perform a cross-tabulation of the variables “Prayer Frequency” and “Fundamentalist”. We test for the existence of a relationship between those two variables. In order to test for the existence of a relationship, we use the SPSS output shown above. THE ASSUMPTIONS We use Kendall’s tau-b, Kendall’s tau-c and Gamma to check for a relationship, which is appropriate because we are analyzing two ordinal variables. The hypotheses: We want to test the following null and alternative hypotheses: Ho: There is no relationship between ''Prayer Frequency'' and ''Fundamentalist'' Ha: There is a relationship between ''Prayer Frequency'' and ''Fundamentalist'' In order to test these hypotheses we use SPSS crosstabs analysis and the Kendall’s tau-b, Kendall’s tau-c and Gamma statistics.
  • 34. Level of Significance: We choose the level of significance alpha = 0.05. The level of significance is the probability of making a Type I error, that is, the probability of rejecting the null hypothesis when it is actually true. Results: The significance of the Kendall’s tau-b, Kendall’s tau-c and Gamma statistics is reported as p = 0.000 (i.e., p < .0005) for all of them. Since the p-values are all less than 0.05, our previously chosen level of significance, we have enough evidence to reject the null hypothesis. Conclusions: We reject the null hypothesis at the 0.05 level of significance and conclude that there is a relationship between the variables. The values of Kendall’s tau-b, Kendall’s tau-c and Gamma are small (0.262, 0.282 and 0.360 respectively), which indicates a rather weak relationship.
  • 35. Nominal by Interval: When one variable is categorical and the other is quantitative, select Eta. The categorical variable must be coded numerically. Eta: A measure of association that ranges from 0 to 1, with 0 indicating no association between the row and column variables and values close to 1 indicating a high degree of association. Eta is appropriate for a dependent variable measured on an interval scale (for example, income) and an independent variable with a limited number of categories (for example, gender). Two eta values are computed: one treats the row variable as the interval variable, and the other treats the column variable as the interval variable. Interpret Results: Eta measures the strength of association between a categorical (nominal) variable and an interval-scale variable. Its square, eta squared, is the proportion of variance in the interval variable that is accounted for by membership of the categories.
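Eta can be sketched as the square root of the between-group sum of squares divided by the total sum of squares of the interval variable. The example below is a minimal illustration under that definition; the function name `eta` and the income figures are made up for the sketch and do not come from the slides.

```python
import math

def eta(groups):
    """Eta for an interval variable split by a categorical variable.

    `groups` maps each category label to the list of interval-scale values
    observed in that category.
    """
    values = [v for group in groups.values() for v in group]
    grand_mean = sum(values) / len(values)
    ss_total = sum((v - grand_mean) ** 2 for v in values)
    ss_between = sum(
        len(group) * (sum(group) / len(group) - grand_mean) ** 2
        for group in groups.values()
    )
    return math.sqrt(ss_between / ss_total)

# Hypothetical incomes (in thousands) by gender, for illustration only.
incomes = {"male": [30, 35, 40, 45], "female": [32, 38, 44, 50]}
print(f"eta = {eta(incomes):.3f}")
```

Because the calculation depends on which variable supplies the numeric values, swapping the roles of the two variables gives a different eta, which matches the two values SPSS reports.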
  • 36. Cohen's kappa Cohen's kappa is a statistical measure that assesses the level of agreement between two raters or observers when dealing with categorical or nominal data. It's particularly useful in cases where there might be agreement by chance alone. For a Cohen's kappa, you will have two variables. In this example, these are: (1) the scores for "Rater 1", Officer1, which reflect Police Officer 1's decision to rate a person's behaviour as being either "normal" or "suspicious"; and (2) the scores for "Rater 2", Officer2, which reflect Police Officer 2's decision to rate a person's behaviour as being either "normal" or "suspicious".
  • 37. Assumptions: 1. The response (e.g., judgement) that is made by your two raters is measured on a categorical scale (i.e., either an ordinal or nominal variable) and the categories need to be mutually exclusive. 2. The response data are paired observations of the same phenomenon, meaning that both raters assess the same observations. 3. Each response variable must have the same number of categories and the crosstabulation must be symmetric (i.e., "square") (e.g., a 2x2 crosstabulation, 3x3 crosstabulation, 4x4 crosstabulation, etc.). For example, a 2x2 crosstabulation means that the responses of both raters are measured on a dichotomous scale; that is, a nominal scale with two categories (e.g., no scarring vs scarring). 4. The two raters are independent (i.e., one rater's judgement does not affect the other rater's judgement). 5. The same two raters are used to judge all observations. This has been referred to as having fixed or unique raters. If different raters were used for each observation (e.g., patient), Cohen's kappa is not the appropriate test to use.
  • 38. Click Analyze > Descriptive Statistics > Crosstabs... on the main menu: You will be presented with the Crosstabs dialogue box, as shown below:
  • 39. You need to transfer one variable (e.g., Officer1) into the Row(s): box, and the second variable (e.g., Officer2) into the Column(s): box. Click on the Statistics button. You will be presented with the Crosstabs: Statistics dialogue box. Select the Kappa checkbox. You will end up with the dialogue box below:
  • 40. • Click on the Continue button and you will be returned to the Crosstabs dialogue box. • Click on the Cells button. You will be presented with the Crosstabs: Cell Display dialogue box, as shown below: • Keep the Observed checkbox selected, as shown below: • Click on the Continue button. You will be returned to the Crosstabs dialogue box, as shown below:
  • 41. Click on the OK button to generate the output for Cohen's kappa. Output of Cohen's kappa: SPSS Statistics generates two main tables of output for Cohen's kappa: the Crosstabulation table and the Symmetric Measures table.
  • 42. We can use the Crosstabulation table, amongst other things, to understand the degree to which the two raters (i.e., both police officers) agreed and disagreed on their judgement of suspicious behaviour. You can see from the table above that of the 100 people evaluated by the police officers, 85 people displayed normal behaviour as agreed by both police officers. In addition, both officers agreed that there were seven people who displayed suspicious behaviour. Therefore, there were eight individuals (i.e., 6 + 2 = 8) for whom the two police officers could not agree on their behaviour. You can see that Cohen's kappa (κ) is .593. This is the proportion of agreement over and above chance agreement. Cohen's kappa (κ) can range from -1 to +1. A kappa (κ) of .593 represents a moderate strength of agreement. Furthermore, since p < .001 (i.e., p is less than .001), our kappa (κ) coefficient is statistically significantly different from zero.
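The kappa value can be checked from the counts quoted above: kappa = (observed agreement - chance agreement) / (1 - chance agreement). In the sketch below the function name `cohens_kappa` is illustrative, and the placement of the 6 and the 2 in the off-diagonal cells is an assumption, since the slide text only gives their sum; swapping them does not change kappa.

```python
def cohens_kappa(table):
    """Cohen's kappa for a square table of counts (rater 1 rows, rater 2 columns)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    # Proportion of cases on the diagonal, i.e. where both raters agree.
    observed = sum(table[i][i] for i in range(len(table))) / n
    # Agreement expected by chance from the two raters' marginal rates.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n ** 2
    return (observed - expected) / (1 - expected)

# 85 agreed "normal", 7 agreed "suspicious", 6 + 2 disagreements (cell order assumed).
officers = [[85, 6], [2, 7]]
print(f"kappa = {cohens_kappa(officers):.3f}")
```

This should reproduce the .593 in the Symmetric Measures table. Raw agreement here is 92%, so the gap between 0.92 and 0.593 shows how much of that agreement chance alone would produce.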
  • 43. McNemar test The McNemar test is used to assess the association or difference between two related categorical variables. This test is often used to analyze paired categorical data, especially in situations where you're dealing with a binary outcome (e.g., yes/no, success/failure) measured on the same subjects or entities at different points in time or under different conditions. To perform a McNemar test in SPSS: 1. Data Preparation: Arrange your data with one row per subject (or matched pair) and two dichotomous variables, one for each measurement; crosstabulating these two variables gives the 2x2 table that the test is based on. 2. Open SPSS: Start by opening your dataset in SPSS. 3. Conduct the Test: 1. Go to "Analyze" > "Nonparametric Tests" > "Legacy Dialogs" > "2 Related Samples..." 2. In the dialog box that appears, move your paired categorical variables to the "Test Pairs" box. 3. In the "Test Type" area, select the McNemar checkbox. 4. Click "OK" to run the analysis.
  • 44. Example A researcher wanted to investigate the impact of an intervention on smoking. In this hypothetical study, 50 participants were recruited to take part, consisting of 25 smokers and 25 non-smokers. All participants watched an emotive video showing the impact that deaths from smoking-related cancers had on families. Two weeks after this video intervention, the same participants were asked whether they remained smokers or non-smokers. Therefore, participants were categorized as being either smokers or non-smokers before the intervention and then re-assessed as either smokers or non-smokers after the intervention. Due to the same participants being measured twice, we have paired-samples. We also have a dependent variable that is dichotomous with two mutually exclusive categories (i.e., "smoker" and "non-smoker"). As a result, a McNemar's test is the appropriate choice to analyze the data. Output of the McNemar's test Crosstabulation Table:
  • 45. Consulting the bottom-left cell first, you can see that there were 16 participants that were originally smokers, but following the intervention, they became non-smokers. In the sense that the intervention was designed to reduce smoking, these participants could be considered the intervention's successes. However, by consulting the top-right cell, you can see that five non-smokers actually took up smoking following the intervention! Clearly, this is not the effect you were looking for, and it is important that you note this in your report. So, although overall there were more 'positive' changes than 'negative' changes, it can be enlightening to know the different 'directions of travel' that the participants took. Test Statistics Table: Fifty participants were recruited to take part in an intervention designed to warn about the dangers of smoking. An exact McNemar's test determined that there was a statistically significant difference in the proportion of non-smokers pre- and post-intervention, p = .027.
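An exact McNemar's test only uses the two discordant cells: under the null hypothesis, each participant who changed status is equally likely to have changed in either direction, so the smaller count follows a binomial distribution with probability 0.5. The sketch below applies that idea to the 16 smokers who quit and the 5 non-smokers who started; the function name `mcnemar_exact` is illustrative.

```python
from math import comb

def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the two discordant counts."""
    n = b + c
    k = min(b, c)
    # Probability of a split at least this uneven in one direction.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)

# 16 changed from smoker to non-smoker, 5 changed from non-smoker to smoker.
print(f"p = {mcnemar_exact(16, 5):.3f}")
```

The result should match the p = .027 reported in the Test Statistics table. Participants whose status did not change contribute nothing to the test, which is why the concordant cells do not appear in the calculation.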