Non Parametric Tests: Hands on SPSS
N. Uttam Singh, Aniruddha Roy & A. K. Tripathi – 2013 1
Non Parametric Tests: Hands on SPSS
N. Uttam Singh, Aniruddha Roy & A. K. Tripathi
ICAR Research Complex for NEH Region, Umiam, Meghalaya
uttamba@gmail.com, aniruddhaubkv@gmail.com, aktripathi2020@yahoo.co.in
Chapter 1: Introduction
Which is more powerful (parametric or non-parametric tests)?
Parametric Assumptions
Nonparametric Assumptions
Advantages of Nonparametric Tests
Disadvantages of nonparametric tests
A few important points on nonparametric tests
Measurement
Parametric vs. non-parametric tests
Nonparametric Methods
Chapter 2: Tests of relationships between variables
Chi-square Test
Binomial Test
Run Test for Randomness
One-Sample Kolmogorov-Smirnov Test
Chapter 3: Two-Independent-Samples Tests
Mann-Whitney U test
The two-sample Kolmogorov-Smirnov test
Wald-Wolfowitz Runs
Moses Extreme Reactions
Chapter 4: Multiple Independent Samples Tests
Median test
Kruskal-Wallis H
Jonckheere-Terpstra test
Chapter 5: Tests for Two Related Samples
Wilcoxon signed-ranks
McNemar
Marginal homogeneity
Sign test
Chapter 6: Tests for Multiple Related Samples
Friedman
Cochran’s Q
Kendall’s W
Chapter 7: Exact Tests and Monte Carlo Method
The Exact Method
The Monte Carlo Method
When to Use Exact Tests
Test Questions:
References:
Introduction
Nonparametric tests are so called because they make no assumptions about the parameters (such as the mean and
variance) of a distribution, nor do they assume that any particular distribution is being used.
A parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population
distribution(s) from which one's data are drawn.
A non-parametric test is one that makes no such assumptions. In this strict sense, "non-parametric" is essentially a null
category, since virtually all statistical tests assume one thing or another about the properties of the source population(s).
Which is more powerful?
When the assumptions of a parametric test are met, the corresponding non-parametric procedure is generally less powerful
because it uses less of the information in the data. For example, a parametric correlation uses the mean and the deviations
from the mean, while a non-parametric correlation uses only the ordinal position (rank) of each pair of scores.
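The contrast between the two kinds of correlation can be sketched in Python with SciPy. This is only an illustration: the data are hypothetical (y is the cube of x), chosen so that the relationship is perfectly monotonic but not linear.

```python
import numpy as np
from scipy import stats

# Hypothetical data: y is a monotonic but non-linear function of x.
x = np.arange(1, 9, dtype=float)
y = x ** 3

# Pearson works on the actual values (means and deviations from the means).
pearson_r, pearson_p = stats.pearsonr(x, y)

# Spearman works only on the ranks of the values.
spearman_rho, spearman_p = stats.spearmanr(x, y)

print(f"Pearson r    = {pearson_r:.3f}")
print(f"Spearman rho = {spearman_rho:.3f}")
```

Because the ranks of x and y agree exactly, the rank correlation is 1, while the Pearson coefficient is below 1 because the relationship is not a straight line.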
Parametric Assumptions
• The observations must be independent.
• The observations must be drawn from normally distributed populations.
• These populations must have the same variances.
• The means of these normal and homoscedastic populations must be linear combinations of effects due to columns and/or rows.
Nonparametric Assumptions
Certain assumptions are associated with most nonparametric statistical tests, but these are fewer and weaker than
those of parametric tests.
Advantages of Nonparametric Tests
• Probability statements obtained from most nonparametric statistics are exact probabilities, regardless of the shape of the population distribution from which the random sample was drawn.
• If sample sizes as small as N = 6 are used, there is no alternative to a nonparametric test unless the nature of the population distribution is known exactly.
• They are easier to learn and apply than parametric tests.
• They are based on a model that specifies only very general conditions and assumes no specific form for the distribution from which the sample was drawn. Hence nonparametric tests are also known as distribution-free tests.
Disadvantages of nonparametric tests
• Loss of precision; wasteful of data
• Low power
• False sense of security
• Lack of software
• Tests distributions only
• Higher-order interactions are not dealt with
• Parametric models are more efficient if the data permit
• Difficult to compute by hand for large samples
• Tables are not widely available
• In cases where a parametric test would be appropriate, non-parametric tests have less power. In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.
A few important points
• Inferences drawn from parametric tests such as t, F and chi-square may be seriously affected when the parent population’s distribution is not normal.
• The adverse effect can be greater when the sample size is small.
• Thus, when there is doubt about the distribution of the parent population, a nonparametric method should be used.
• In many situations, particularly in the social and behavioral sciences, observations are difficult or impossible to take on numerical scales; a suitable nonparametric test is an alternative in such situations.
Measurement
The four levels of measurement
1. Nominal or Classificatory Scale
• Examples: gender, ethnic background, colors of a spectrum
• In research activities a YES/NO scale is nominal. It has no order and there is no distance between YES and NO.
2. Ordinal or Ranking Scale
• Examples: hardness of rocks, beauty, military ranks
• The simplest ordinal scale is a ranking.
• There is no objective distance between any two points on a subjective scale.
3. Interval Scale
• Examples: temperature in Celsius or Fahrenheit. It is an interval scale because it is assumed to have equidistant points between each of the scale elements.
4. Ratio Scale
• Examples: Kelvin temperature, speed, height, mass or weight
• Ratio data are interval data with a natural zero point.
Parametric vs. non-parametric tests
                        | Parametric                | Non-parametric
Assumed distribution    | Normal                    | Any
Assumed variance        | Homogeneous               | Any
Typical data            | Ratio or Interval         | Ordinal or Nominal
Data set relationships  | Independent               | Any
Usual central measure   | Mean                      | Median
Benefits                | Can draw more conclusions | Simplicity; less affected by outliers
Tests
Situation                        | Parametric test                     | Non-parametric test
Correlation test                 | Pearson                             | Spearman
Independent measures, 2 groups   | Independent-measures t-test         | Mann-Whitney test
Independent measures, >2 groups  | One-way, independent-measures ANOVA | Kruskal-Wallis test
Repeated measures, 2 conditions  | Matched-pair t-test                 | Wilcoxon test
Repeated measures, >2 conditions | One-way, repeated measures ANOVA    | Friedman's test
Nonparametric Methods
For most common parametric tests there is at least one equivalent nonparametric test.
Tests of relationships between variables
Chi-square Test
This goodness-of-fit test compares the observed and expected frequencies in each category to test either that all categories
contain the same proportion of values or that each category contains a user-specified proportion of values.
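Example
The chi-square test could be used to determine whether a basket of fruit contains equal proportions of each kind of fruit.
In the data below, the variable count holds the numeric code of the fruit (1 = orange, 2 = mango, 3 = banana, 4 = lemon),
not a frequency.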
fruits count
orange 1
orange 1
mango 2
banana 3
lemon 4
banana 3
orange 1
lemon 4
lemon 4
orange 1
mango 2
banana 3
lemon 4
banana 3
orange 1
lemon 4
lemon 4
SPSS Steps:
Get the data.
Follow the steps as shown.
Move the variable count into the Test Variable List.
Click OK to get the output.
Interpretation:
The p value is 0.981, which is greater than 0.05. The result is therefore not significant: we fail to reject the null
hypothesis and conclude that the proportions of the four kinds of fruit do not differ significantly.
We could also test whether the basket contains user-specified proportions, for example 10%, 20%, 50%, and 20% for the four
kinds. To do this, select “Values” under Expected Values and add each proportion in turn.
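The same goodness-of-fit test can be sketched in Python with `scipy.stats.chisquare`. The sketch uses only the 17 rows printed in the example, so its result need not equal the SPSS output quoted above. The user-specified proportions are hypothetical, and with expected counts this small the chi-square approximation is rough.

```python
from collections import Counter
from scipy import stats

# Fruit codes exactly as listed in the example rows
# (1 = orange, 2 = mango, 3 = banana, 4 = lemon).
codes = [1, 1, 2, 3, 4, 3, 1, 4, 4, 1, 2, 3, 4, 3, 1, 4, 4]

# Observed frequency of each category, in code order.
tally = Counter(codes)
observed = [tally[k] for k in (1, 2, 3, 4)]

# Null hypothesis 1: all categories are equally likely (the default).
equal = stats.chisquare(observed)

# Null hypothesis 2: user-specified proportions (hypothetical values,
# listed in code order) converted to expected counts.
proportions = [0.10, 0.20, 0.50, 0.20]
expected = [p * len(codes) for p in proportions]
custom = stats.chisquare(observed, f_exp=expected)

print("observed counts:", observed)
print("equal proportions:", equal.statistic, equal.pvalue)
print("specified proportions:", custom.statistic, custom.pvalue)
```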
Binomial Test
The Binomial Test procedure is useful when you want to compare a single sample from a dichotomous variable to an
expected proportion. If the dichotomy does not exist in the data as a variable, one can be dynamically created based upon
a cut point on a scale variable (take age as example from the data). If your variable has more than two outcomes, try the
Chi-Square Test procedure. If you want to compare two dichotomous variables, try the McNemar test in the
Two-Related-Samples Tests procedure.
Example
Suppose we wish to test whether the proportion of females in the variable “gender” differs significantly from 50%, i.e.,
from 0.5. We will request exact p values.
Age Marital_Status Family_Size Land_Holding Achievement Market_Orientation Problem Gender
21 2 1 1 83 17 16 0
40 1 0 0 77 18 17 0
32 1 0 1 79 18 17 0
37 1 2 1 80 18 17 1
40 3 2 1 78 18 17 0
40 1 2 0 78 18 17 1
52 1 0 0 79 24 13 0
35 2 2 1 94 24 20 1
38 2 2 1 81 22 12 0
55 1 0 1 78 18 10 1
35 2 1 0 87 23 17 1
35 3 2 1 89 22 10 0
55 1 1 0 87 23 15 0
40 1 2 1 86 23 14 1
62 1 1 1 80 18 10 1
40 1 1 0 83 24 13 1
48 3 1 1 76 21 14 1
62 1 2 1 84 23 11 0
36 1 0 0 81 26 11 0
35 1 2 1 80 21 11 0
35 1 2 1 77 22 13 1
35 1 1 1 82 16 14 1
18 2 2 0 83 26 10 0
SPSS Steps:
Get the data.
Follow the steps as shown below.
Move the variable gender into the Test Variable List.
Click OK to get the output.
Interpretation:
Since the p value is 1, the result is not significant. We fail to reject the null hypothesis and conclude that the
proportion of females in the variable “gender” does not differ significantly from 50%.
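A minimal sketch of the same exact test in Python, assuming SciPy 1.7 or later (which provides `scipy.stats.binomtest`). The Gender column is copied from the table above; because the test proportion is 0.5, the result does not depend on which code stands for female.

```python
from scipy import stats

# Gender column of the example data, row by row (coded 0/1 as in the table).
gender = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0,
          0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

n = len(gender)
k = sum(gender)  # number of rows coded 1

# Exact two-sided binomial test against a proportion of 0.5.
result = stats.binomtest(k, n, p=0.5, alternative="two-sided")

print(f"{k} of {n} rows coded 1, exact p = {result.pvalue:.3f}")
```

With 11 rows coded 1 and 12 coded 0, the split is as close to 50% as 23 observations allow, which is why the exact p value is 1.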
Run Test for Randomness
The runs test examines whether or not a set of observations constitutes a random sample from an infinite
population. Testing for randomness is of major importance because the assumption of randomness underlies statistical
inference. In addition, tests for randomness are important in time series analysis. Departure from randomness can take
many forms. Each observation is classified as falling below, or at or above, a cut point, and a run is a sequence of
consecutive observations on the same side of the cut point. The cut point is based either on a measure of central tendency
(mean, median, or mode) or on a custom value. A sample with too many or too few runs suggests that the sample is not random.
Example
Let’s see whether the variable “AGE” in the dataset below is random.
Table: Cancer dataset
ID TRT AGE WEIGHIN STAGE TOTALCIN TOTALCW2 TOTALCW4 TOTALCW6
1 0 52 124 2 6 6 6 7
5 0 77 160 1 9 6 10 9
6 0 60 136.5 4 7 9 17 19
9 0 61 179.6 1 6 7 9 3
11 0 59 175.8 2 6 7 16 13
15 0 69 167.6 1 6 6 6 11
21 0 67 186 1 6 11 11 10
26 0 56 158 3 6 11 15 15
31 0 61 212.8 1 6 9 6 8
35 0 51 189 1 6 4 8 7
39 0 46 149 4 7 8 11 11
41 0 65 157 1 6 6 9 6
45 0 67 186 1 8 8 9 10
2 0 46 163.8 2 7 16 9 10
12 1 56 227.2 4 6 10 11 9
14 1 42 162.6 1 4 6 8 7
16 1 44 261.4 2 6 11 11 14
22 1 27 225.4 1 6 7 6 6
24 1 68 226 4 12 11 12 9
34 1 77 164 2 5 7 13 12
37 1 86 140 1 6 7 7 7
42 1 73 181.5 0 8 11 16
44 1 67 187 1 5 7 7 7
50 1 60 164 2 6 8 16
58 1 54 172.8 4 7 8 10 8
SPSS Steps:
Load the data.
Follow the steps as shown.
Select “AGE” in the Test Variable List.
The variable “AGE” must be divided into two separate groups, so we must indicate a cut point. Let us take the median as
the cut point: any value below the median belongs to one group, and any value greater than or equal to the median belongs
to the other group. Now click OK to get the output.
Interpretation:
The p value is 0.450, which is not significant. We therefore fail to reject the hypothesis that the sequence of AGE values is random.
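The runs test is simple enough to sketch by hand in Python. The sketch below uses the AGE column in the order the rows are listed, with the median as the cut point. It relies on the normal approximation with a 0.5 continuity correction, which is an assumption about how the small-sample p value is obtained, not something stated in the text.

```python
import math
import statistics

# AGE column of the cancer dataset, in the order the rows are listed.
age = [52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67,
       46, 56, 42, 44, 27, 68, 77, 86, 73, 67, 60, 54]

cut = statistics.median(age)
# True = at or above the cut point, False = below it.
side = [a >= cut for a in age]

# A new run starts wherever the side changes.
runs = 1 + sum(1 for prev, cur in zip(side, side[1:]) if prev != cur)

n1 = sum(side)        # observations at or above the cut point
n2 = len(side) - n1   # observations below the cut point
n = n1 + n2

# Mean and variance of the number of runs under randomness.
mean_runs = 2 * n1 * n2 / n + 1
var_runs = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))

# Assumption: 0.5 continuity correction for a small sample.
diff = runs - mean_runs
if abs(diff) <= 0.5:
    z = 0.0
else:
    z = (diff - math.copysign(0.5, diff)) / math.sqrt(var_runs)

# Two-sided p value from the standard normal distribution.
p_value = math.erfc(abs(z) / math.sqrt(2))

print(f"runs = {runs}, z = {z:.3f}, p = {p_value:.3f}")
```

Under these assumptions the sketch finds 11 runs and a p value of about 0.45, in line with the output interpreted above.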
One-Sample Kolmogorov-Smirnov Test
The One-Sample Kolmogorov-Smirnov procedure is used to test the null hypothesis that a sample comes from a particular
distribution. Four theoretical distribution functions are available: normal, uniform, Poisson, and exponential. To
compare the distributions of two variables, use the two-sample Kolmogorov-Smirnov test in the
Two-Independent-Samples Tests procedure.
Example: Let us test whether the variable “AGE” in the cancer dataset used for the runs test above follows a normal or a
uniform distribution.
SPSS Steps
Get the data as done before. Then…
Select “AGE” in the Test Variable List.
Check the distribution against which you want to test. Click OK to get the output.
Interpretation:
The p value is 0.997, which is not significant, so we cannot reject the hypothesis that “AGE” is approximately normally
distributed. If the p value were less than 0.05, the result would be significant and we would conclude that AGE does not
follow a normal distribution.
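A minimal sketch of the one-sample test in Python with `scipy.stats.kstest`. It assumes that the parameters of the reference distribution are estimated from the sample itself (mean and sample standard deviation for the normal, minimum and maximum for the uniform). Estimating parameters from the same data makes the test conservative, so the p values are only a rough guide.

```python
import numpy as np
from scipy import stats

# AGE column of the cancer dataset.
age = np.array([52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67,
                46, 56, 42, 44, 27, 68, 77, 86, 73, 67, 60, 54], dtype=float)

# Assumption: normal parameters are estimated from the sample.
mean = age.mean()
sd = age.std(ddof=1)  # sample standard deviation

normal_test = stats.kstest(age, "norm", args=(mean, sd))

# Uniform reference on [min, max]; SciPy's "uniform" takes (loc, scale).
low, high = age.min(), age.max()
uniform_test = stats.kstest(age, "uniform", args=(low, high - low))

print("normal :", normal_test.statistic, normal_test.pvalue)
print("uniform:", uniform_test.statistic, uniform_test.pvalue)
```

For the normal reference the largest gap between the empirical and theoretical distribution functions is about 0.08, which is small for 25 observations, so the p value is close to 1.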
Two-Independent-Samples Tests
The nonparametric tests for two independent samples are useful for determining whether or not the values of a particular
variable differ between two groups. They are especially useful when the assumptions of the t test are not met.
• Mann-Whitney U test: tests for differences between two groups.
• Two-sample Kolmogorov-Smirnov test: tests the null hypothesis that two samples have the same distribution.
• Wald-Wolfowitz runs test: examines whether two random samples come from populations having the same distribution.
• Moses extreme reactions test: examines whether an experimental group shows more extreme responses, in either direction, than a control group.
Example: We want to find out whether sales differ between two designs.
sales design store_size
11 1 1
17 1 2
16 1 3
14 1 4
15 1 5
12 2 1
10 2 2
15 2 3
19 2 4
11 2 5
23 3 1
20 3 2
18 3 3
17 3 4
27 4 1
33 4 2
22 4 3
26 4 4
28 4 5
SPSS Steps:
Open the dataset.
Let us compare designs 1 and 2.
Enter the variable sales in the Test Variable List and design as the Grouping Variable.
Since we are performing a two-independent-samples test, we have to designate which two groups of the factor design we
want to compare. Click “Define Groups”.
Here we type groups 2 and 1. The order is not important; we only have to enter two distinct groups. Then click Continue
and OK to get the output.
Interpretation:
Two p values are displayed: the asymptotic p value, which is appropriate for large samples, and the exact p value, which
remains valid for small samples. Since each group has only five observations, we take the exact p value, 0.548, which is
not significant. We conclude that there is no significant difference in sales between design 1 and design 2.
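The comparison of designs 1 and 2 can be sketched in Python with `scipy.stats.mannwhitneyu` (the `method` argument assumes SciPy 1.7 or later). The exact method in SciPy uses the null distribution of U without a correction for ties, which is an approximation here because the data contain tied values.

```python
from scipy import stats

# Sales for designs 1 and 2, copied from the table above.
design1 = [11, 17, 16, 14, 15]
design2 = [12, 10, 15, 19, 11]

# Exact p value (no correction for ties).
exact = stats.mannwhitneyu(design1, design2,
                           alternative="two-sided", method="exact")

# Asymptotic (normal approximation) p value for comparison.
asymptotic = stats.mannwhitneyu(design1, design2,
                                alternative="two-sided", method="asymptotic")

print(f"U = {exact.statistic}, exact p = {exact.pvalue:.3f}")
print(f"asymptotic p = {asymptotic.pvalue:.3f}")
```

The reported statistic is the U value of the first sample (16); the U value of the second sample is 5 × 5 − 16 = 9.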
Multiple Independent Samples Tests
The nonparametric tests for multiple independent samples are useful for determining whether or not the values of a
particular variable differ between two or more groups. They are especially useful when the assumptions of ANOVA are not
met.
• Median test: tests the null hypothesis that two or more independent samples have the same median. It assumes nothing about the distribution of the test variable, making it a good choice when you suspect that the distribution varies by group.
• Kruskal-Wallis H: a one-way analysis of variance by ranks. It tests the null hypothesis that multiple independent samples come from the same population.
• Jonckheere-Terpstra test: tests for an ordered trend across groups; it is preferable to Kruskal-Wallis when the groups have a natural ordering.
Example:
We want to find out whether sales differ between the designs (comparing more than two samples simultaneously).
SPSS Steps:
Get the data in the SPSS window as done before. Then…
Define the range of the grouping variable (here designs 1 to 4).
Click Continue, then OK to get the output.
Interpretation:
The p value is 0.003, which is significant. We therefore conclude that sales differ significantly between the designs,
meaning that at least two of the groups differ.
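A minimal sketch of the comparison across all four designs, using `scipy.stats.kruskal`. The text does not say which of the three tests produced the output, so the choice of Kruskal-Wallis here is an assumption.

```python
from scipy import stats

# Sales grouped by design, copied from the sales table.
sales_by_design = {
    1: [11, 17, 16, 14, 15],
    2: [12, 10, 15, 19, 11],
    3: [23, 20, 18, 17],
    4: [27, 33, 22, 26, 28],
}

# Kruskal-Wallis H test (SciPy applies the correction for ties).
kw = stats.kruskal(*sales_by_design.values())

print(f"H = {kw.statistic:.3f}, p = {kw.pvalue:.4f}")
```

A significant result only says that the groups are not all alike; pairwise follow-up comparisons (for example Mann-Whitney tests) are needed to see which designs differ.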
Tests for Two Related Samples
The nonparametric tests for two related samples allow you to test for differences between paired scores when you cannot
(or would rather not) make the assumptions required by the paired-samples t test. Procedures are available for testing
nominal, ordinal, or scale variables.
• Wilcoxon signed-ranks: a nonparametric alternative to the paired-samples t test. The only assumptions made by the Wilcoxon test are that the test variable is continuous and that the distribution of the difference scores is reasonably symmetric.
• McNemar: tests the null hypothesis that binary responses are unchanged. As with the Wilcoxon test, the data may be from a single sample measured twice or from two matched samples. Unlike the Wilcoxon test, the McNemar test is designed for nominal or ordinal test variables with binary (two-category) responses.
• Marginal homogeneity: an extension of the McNemar test for multinomial variables, i.e. variables with more than two levels.
• Sign test: like the Wilcoxon test, it is used for continuous data; of the two, the Wilcoxon test is more powerful.
Example: Use the cancer data from the runs test to test whether the condition of the cancer patients at the end of the 2nd
week and the 4th week differs significantly. (Here, the higher the reading, the better the condition.)
Output:
Interpretation:
The p value is 0.006, which is significant. This indicates that the condition of the cancer patients at the end of the 2nd
week differs from that at the end of the 4th week.
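The paired comparison can be sketched in Python with `scipy.stats.wilcoxon`, taking TOTALCW2 and TOTALCW4 as the week-2 and week-4 readings. Two rows of the printed table (ID 42 and ID 50) list one value fewer than the header. The sketch assumes that the missing entry in both rows is TOTALCW6, which is a guess and not something the table confirms.

```python
from scipy import stats

# TOTALCW2 and TOTALCW4 readings, in row order of the cancer dataset.
# Assumption: for ID 42 and ID 50 the missing entry is TOTALCW6,
# so their week-2 and week-4 readings are (11, 16) and (8, 16).
week2 = [6, 6, 9, 7, 7, 6, 11, 11, 9, 4, 8, 6, 8, 16,
         10, 6, 11, 7, 11, 7, 7, 11, 7, 8, 8]
week4 = [6, 10, 17, 9, 16, 6, 11, 15, 6, 8, 11, 9, 9, 9,
         11, 8, 11, 6, 12, 13, 7, 16, 7, 16, 10]

# Wilcoxon signed-ranks test; with SciPy's defaults, pairs with a zero
# difference are dropped and, because zero differences are present,
# the normal approximation is used.
result = stats.wilcoxon(week2, week4)

print(f"T = {result.statistic}, p = {result.pvalue:.4f}")
```

Only three of the nineteen non-zero differences are negative, which is why the test finds a significant change between the two weeks.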
Tests for Multiple Related Samples
The nonparametric tests for multiple related samples are useful alternatives to a repeated measures analysis of variance.
They are especially appropriate for small samples and can be used with nominal or ordinal test variables.
Friedman: a nonparametric alternative to the repeated measures ANOVA. It tests the null hypothesis that
multiple ordinal responses come from the same population. As with the Wilcoxon test for two related samples, the
data may come from repeated measures of a single sample or from the same measure from multiple matched samples.
The only assumptions made by the Friedman test are that the test variables are at least ordinal and that their
distributions are reasonably similar.
Cochran’s Q: tests the null hypothesis that multiple related proportions are the same. Think of the Cochran Q test
as an extension of the McNemar test used to assess change over two times or two matched samples. Unlike the
Friedman test, the Cochran test is designed for use with binary variables.
Kendall’s W: a normalization of the Friedman statistic that can be interpreted as a measure of agreement among raters,
ranging from 0 (no agreement) to 1 (complete agreement).
SPSS steps:
Output
Interpretation:
The p value is less than 0.05. Hence there is a significant difference between the four related measurements, meaning
that at least two of them differ.
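A minimal sketch of the Friedman test in Python with `scipy.stats.friedmanchisquare`. It assumes that the four related measurements are TOTALCIN, TOTALCW2, TOTALCW4 and TOTALCW6 from the cancer dataset, and it uses only the 23 rows that list all four readings; the two incomplete rows are left out.

```python
from scipy import stats

# (TOTALCIN, TOTALCW2, TOTALCW4, TOTALCW6) for the 23 complete rows.
rows = [
    (6, 6, 6, 7), (9, 6, 10, 9), (7, 9, 17, 19), (6, 7, 9, 3),
    (6, 7, 16, 13), (6, 6, 6, 11), (6, 11, 11, 10), (6, 11, 15, 15),
    (6, 9, 6, 8), (6, 4, 8, 7), (7, 8, 11, 11), (6, 6, 9, 6),
    (8, 8, 9, 10), (7, 16, 9, 10), (6, 10, 11, 9), (4, 6, 8, 7),
    (6, 11, 11, 14), (6, 7, 6, 6), (12, 11, 12, 9), (5, 7, 13, 12),
    (6, 7, 7, 7), (5, 7, 7, 7), (7, 8, 10, 8),
]

# One sequence per measurement occasion.
baseline, week2, week4, week6 = zip(*rows)

# Friedman test: readings are ranked within each patient.
result = stats.friedmanchisquare(baseline, week2, week4, week6)

print(f"chi-square = {result.statistic:.3f}, p = {result.pvalue:.6f}")
```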
Exact Tests and Monte Carlo Method
These new methods, the exact and Monte Carlo methods, provide a powerful means for obtaining accurate results when
your data set is small, your tables are sparse or unbalanced, the data are not normally distributed, or the data fail to meet
any of the underlying assumptions necessary for reliable results using the standard asymptotic method.
The Exact Method
By default, IBM® SPSS® Statistics calculates significance levels for the statistics in the Crosstabs and Nonparametric
Tests procedures using the asymptotic method. This means that p values are estimated based on the assumption that the
data, given a sufficiently large sample size, conform to a particular distribution.
However, when the data set is small, sparse, contains many ties, is unbalanced, or is poorly distributed, the asymptotic
method may fail to produce reliable results. In these situations, it is preferable to calculate a significance level based on
the exact distribution of the test statistic. This enables you to obtain an accurate p value without relying on assumptions
that may not be met by your data.
The Monte Carlo Method
Although exact results are always reliable, some data sets are too large for the exact p value to be calculated, yet don’t
meet the assumptions necessary for the asymptotic method. In this situation, the Monte Carlo method provides an
unbiased estimate of the exact p value, without the requirements of the asymptotic method.
The Monte Carlo method is a repeated sampling method. For any observed table, there are many tables, each with the
same dimensions and column and row margins as the observed table. The Monte Carlo method repeatedly samples a
specified number of these possible tables in order to obtain an unbiased estimate of the true p value.
The Monte Carlo method is less computationally intensive than the exact method, so results can often be obtained more
quickly. However, if you have chosen the Monte Carlo method, but exact results can be calculated quickly for your data,
they will be provided.
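The idea of repeated sampling can be sketched in Python. The text describes sampling tables with fixed margins; the sketch below applies the same principle to a rank-sum comparison of designs 1 and 2 from the sales data, by repeatedly reshuffling the ranks between the two groups. The seed, the number of samples and the variable names are illustrative choices, not values taken from SPSS.

```python
import numpy as np
from scipy import stats

# Sales for designs 1 and 2 from the sales table.
design1 = [11, 17, 16, 14, 15]
design2 = [12, 10, 15, 19, 11]

# Rank all ten observations together (tied values share the average rank).
ranks = stats.rankdata(design1 + design2)
n1 = len(design1)

# Test statistic: distance of the first group's rank sum from its expectation.
expected = n1 * (len(ranks) + 1) / 2
observed = abs(ranks[:n1].sum() - expected)

rng = np.random.default_rng(2013)  # fixed seed so the run is repeatable
n_samples = 10_000
hits = 0
for _ in range(n_samples):
    shuffled = rng.permutation(ranks)
    # Count reshuffles at least as extreme as the observed split.
    if abs(shuffled[:n1].sum() - expected) >= observed - 1e-12:
        hits += 1

# Monte Carlo estimate of the two-sided p value.
p_monte_carlo = hits / n_samples
print(f"Monte Carlo p = {p_monte_carlo:.3f}")
```

Increasing the number of samples narrows the sampling error of the estimate at the cost of more computing time.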
When to Use Exact Tests
Calculating exact results can be computationally intensive and time-consuming, and can sometimes exceed the memory limits
of your machine. In general, exact tests can be performed quickly with sample sizes of less than 30. Table 1.1 of IBM SPSS
Exact Tests (see References) gives guidelines for the conditions under which exact results can be obtained quickly.
Test Questions
References
Eldho Varghese and Cini Varghese. Nonparametric Tests. Indian Agricultural Statistics Research Institute, New Delhi - 110 012 (eldho@iasri.res.in, cini_v@iasri.res.in).
Cyrus R. Mehta and Nitin R. Patel. IBM SPSS Exact Tests.
IBM SPSS Statistics Base 20.

More Related Content

PPTX
Non parametric presentation
PDF
Testing of hypothesis
PPTX
t testleri
PDF
t-TEst. :D
PPTX
House Price Prediction.pptx
PPTX
Applications of biostatistics
PPTX
Introduction to principal component analysis (pca)
Non parametric presentation
Testing of hypothesis
t testleri
t-TEst. :D
House Price Prediction.pptx
Applications of biostatistics
Introduction to principal component analysis (pca)

What's hot (20)

PPTX
Testing of hypothesis and tests of significance
PPTX
statistical inference
PPTX
Degree of freedom
PPTX
Statistical analysis of biological data (comaprison of means)
PPTX
Friedman Test- A Presentation
PPTX
Statistics for data science
PPTX
Statistics - ONE WAY ANOVA
PPT
Doe10 factorial2k blocking
PDF
Student's T Test
PPTX
Discriminant function analysis (DFA)
PDF
演講-Meta analysis in medical research-張偉豪
PPTX
Surrogate end points with biomarkers including salient feature
PDF
Point Estimate, Confidence Interval, Hypotesis tests
PPTX
Regression analysis
PDF
Regression analysis and its type
PPTX
STUDENT PERFORMANCE ANALYSIS USING DECISION TREE
PDF
Anomaly/Novelty detection with scikit-learn
PPTX
Non parametric test
PPTX
Research article discussion for simultaneous method HPLC estimation
Testing of hypothesis and tests of significance
statistical inference
Degree of freedom
Statistical analysis of biological data (comaprison of means)
Friedman Test- A Presentation
Statistics for data science
Statistics - ONE WAY ANOVA
Doe10 factorial2k blocking
Student's T Test
Discriminant function analysis (DFA)
演講-Meta analysis in medical research-張偉豪
Surrogate end points with biomarkers including salient feature
Point Estimate, Confidence Interval, Hypotesis tests
Regression analysis
Regression analysis and its type
STUDENT PERFORMANCE ANALYSIS USING DECISION TREE
Anomaly/Novelty detection with scikit-learn
Non parametric test
Research article discussion for simultaneous method HPLC estimation
Ad

Similar to Non parametrict test (20)

PPTX
Statatistic in the Philippines of the current
PPTX
Topic 10 DATA ANALYSIS TECHNIQUES.pptx
PPTX
Alternatives to t test
DOCX
Parametric vs non parametric test
PPTX
Non parametric study; Statistical approach for med student
PPT
Comparison statisticalsignificancetestir
PPTX
Nonparametric tests
PPTX
Week 7 spss 2 2013
PPTX
Research Non Parametric Statistics And its Application
PPTX
PRESENTATION ON TESTS OF SIGNIFICANCE.pptx
PPTX
BASIC STATISTICAL TREATMENT IN RESEARCH.pptx
PDF
Parametric & Non-Parametric tests SPSS WORKSHOPpdf
PPTX
Parametric vs Non-Parametric
PPT
QUANTITATIVE DATA ANALYSIS for powerful research.ppt
PPT
Non-parametric presentationnnnnnnnnnnnnnn
PPTX
BIOSTATISTICS SLIDESHARE.pptx
PPTX
Non parametric test
PDF
Data Analysis using Statistics and Hypotheses Testing.pdf
Statatistic in the Philippines of the current
Topic 10 DATA ANALYSIS TECHNIQUES.pptx
Alternatives to t test
Parametric vs non parametric test
Non parametric study; Statistical approach for med student
Comparison statisticalsignificancetestir
Nonparametric tests
Week 7 spss 2 2013
Research Non Parametric Statistics And its Application
PRESENTATION ON TESTS OF SIGNIFICANCE.pptx
BASIC STATISTICAL TREATMENT IN RESEARCH.pptx
Parametric & Non-Parametric tests SPSS WORKSHOPpdf
Parametric vs Non-Parametric
QUANTITATIVE DATA ANALYSIS for powerful research.ppt
Non-parametric presentationnnnnnnnnnnnnnn
BIOSTATISTICS SLIDESHARE.pptx
Non parametric test
Data Analysis using Statistics and Hypotheses Testing.pdf
Ad

Recently uploaded (20)

PDF
Yoga for life ...........................
PDF
Lifestyle and the Experience of Living a Full Day Without the Use of a Smartp...
PDF
Your Love Marriage Forecast: What the Stars Say
PDF
What Is Intimate Partner Violence - Understanding the Crisis, Data and Solutions
PPTX
Life is Long about a life and to increase it.pptx
PDF
DVIBEWEAR – Custom T-Shirt Fashion Brand Startup Pitch Deck | Style Wahi Jo V...
PDF
The Science-Backed Benefits of Fruit and Vegetable Extracts.pdf
PDF
The Fashion Impact of Los Angeles’ Entertainment Scene by David Shane PR.pdf
DOC
AAMU毕业证学历认证,爱默生学院毕业证ps毕业证
PPTX
Social%20Dance%20(%20Cha%20Cha%20Dance).pptx.pptx
PDF
Step into a new era of fashion where style meets sustainability.
PDF
Global Business Today 10th Edition by Hill Test Bank.pdf
DOCX
Desale Chali is a professional in the field of Electrical and Computer Engine...
PDF
150 Unique Baby Names for 2025 (With Meanings & Origins)
PDF
Equivalent Mass and Its Applications practical ppt
PPTX
Saraf Furniture Reviews – A Story of Trust, Craftsmanship, and Happy Homes.pptx
PDF
the unconditional part of left to own devices take two
PPTX
Lifestyle of Swami Chinmayananda ji Swami ji
PPTX
TLE 8 MANICURE.pptx 1-39U2-048012412048120182
PPTX
Untitled presentation.pptxjkljlkjlkjlkjlkjlkjlkjlkj
Yoga for life ...........................
Lifestyle and the Experience of Living a Full Day Without the Use of a Smartp...
Your Love Marriage Forecast: What the Stars Say
What Is Intimate Partner Violence - Understanding the Crisis, Data and Solutions
Life is Long about a life and to increase it.pptx
DVIBEWEAR – Custom T-Shirt Fashion Brand Startup Pitch Deck | Style Wahi Jo V...
The Science-Backed Benefits of Fruit and Vegetable Extracts.pdf
The Fashion Impact of Los Angeles’ Entertainment Scene by David Shane PR.pdf
AAMU毕业证学历认证,爱默生学院毕业证ps毕业证
Social%20Dance%20(%20Cha%20Cha%20Dance).pptx.pptx
Step into a new era of fashion where style meets sustainability.
Global Business Today 10th Edition by Hill Test Bank.pdf
Desale Chali is a professional in the field of Electrical and Computer Engine...
150 Unique Baby Names for 2025 (With Meanings & Origins)
Equivalent Mass and Its Applications practical ppt
Saraf Furniture Reviews – A Story of Trust, Craftsmanship, and Happy Homes.pptx
the unconditional part of left to own devices take two
Lifestyle of Swami Chinmayananda ji Swami ji
TLE 8 MANICURE.pptx 1-39U2-048012412048120182
Untitled presentation.pptxjkljlkjlkjlkjlkjlkjlkjlkj

Non parametrict test

  • 1. Non Parametric Tests: Hands on SPSS N. Uttam Singh, Aniruddha Roy & A. K. Tripathi – 2013 1 Non Parametric Tests: Hands on SPSS N. Uttam Singh, Aniruddha Roy & A. K. Tripathi ICAR Research Complex for NEH Region, Umiam, Meghalaya uttamba@gmail.com, aniruddhaubkv@gmail.com, aktripathi2020@yahoo.co.in Chapter 1: Introduction Which is more powerful (parametric and non-parametric tests) Parametric Assumptions Nonparametric Assumptions Advantages of Nonparametric Tests Disadvantages of nonparametric tests Few important points on nonparametric test Measurement Parametric vs. non-parametric tests Nonparametric Methods Chapter2: Tests of relationships between variables Chi-square Test Binomial Test Run Test for Randomness One-Sample Kolmogorov-Smirnov Test Chapter 3: Two-Independent-Samples Tests Mann-Whitney U test The two-sample Kolmogorov-Smirnov test Wlad-Walfowitz Run Mozes Extreme Reactions Chapter 4: Multiple Independent Samples Tests Median test Kruskal-Wallis H Jonckheere-terpstra test Chapter 5: Tests for Two Related Samples Wilcoxon signed-ranks McNemar Marginal-homogeinity Sign test Chapter 6: Tests for Multiple Related Samples Friedman Cochran’s Q Kendall’s W Chapter 7: Exact Tests and Monte Carlo Method The Exact Method The Monte Carlo Method When to Use Exact Tests Test Questions: References:
Non Parametric Tests: Hands on SPSS
N. Uttam Singh, Aniruddha Roy & A. K. Tripathi – 2013

These tests are called nonparametric because they make no assumptions about the parameters (such as the mean and variance) of a distribution, nor do they assume that any particular distribution is being used.

Introduction

A parametric statistical test is one that makes assumptions about the parameters (defining properties) of the population distribution(s) from which one's data are drawn.
A non-parametric test is one that makes no such assumptions. In this strict sense, "non-parametric" is essentially a null category, since virtually all statistical tests assume one thing or another about the properties of the source population(s).

Which is more powerful?

When the assumptions of a parametric test hold, non-parametric procedures are generally less powerful because they use less of the information in the data. For example, a parametric correlation uses the mean and the deviations from the mean, while a non-parametric correlation uses only the ordinal position (rank) of each pair of scores.

Parametric Assumptions
• The observations must be independent.
• The observations must be drawn from normally distributed populations.
• These populations must have the same variance.
• The means of these normal and homoscedastic populations must be linear combinations of effects due to columns and/or rows.

Nonparametric Assumptions

Certain assumptions are associated with most nonparametric statistical tests, but these are fewer and weaker than those of parametric tests.

Advantages of Nonparametric Tests
• Probability statements obtained from most nonparametric statistics are exact probabilities, regardless of the shape of the population distribution from which the random sample was drawn.
• With sample sizes as small as N = 6, there is no alternative to a nonparametric test unless the population distribution is known exactly.
• They are easier to learn and apply than parametric tests.
• They are based on a model that specifies only very general conditions.
• No specific form is assumed for the distribution from which the sample was drawn.
• Hence nonparametric tests are also known as distribution-free tests.

Disadvantages of nonparametric tests
• Loss of precision; wasteful of data
• Low power
• False sense of security
• Lack of software
• Testing distributions only
• Higher-order interactions are not dealt with
• Parametric models are more efficient if the data permit
• Difficult to compute by hand for large samples
• Tables are not widely available
• In cases where a parametric test would be appropriate, non-parametric tests have less power. In other words, a larger sample size can be required to draw conclusions with the same degree of confidence.

A few important points
• Inferences drawn from parametric tests such as t, F and chi-square may be seriously affected when the parent population is not normally distributed.
• The adverse effect can be greater when the sample size is small.
• Thus, when there is doubt about the distribution of the parent population, a nonparametric method should be used.
• In many situations, particularly in the social and behavioral sciences, observations are difficult or impossible to take on numerical scales, and a suitable nonparametric test is an alternative in such situations.

Measurement

The 4 levels of measurement
1. Nominal or Classificatory Scale
• Gender, ethnic background, colors of a spectrum
• In research activities a YES/NO scale is nominal. It has no order and there is no distance between YES and NO.
2. Ordinal or Ranking Scale
• Hardness of rocks, beauty, military ranks
• The simplest ordinal scale is a ranking.
• There is no objective distance between any two points on a subjective scale.
3. Interval Scale
• Temperature in Celsius or Fahrenheit. These are interval scales because the points on the scale are assumed to be equidistant.
4. Ratio Scale
• Kelvin temperature, speed, height, mass or weight
• Ratio data are interval data with a natural zero point.

Parametric vs. non-parametric tests

                                  Parametric                            Non-parametric
Assumed distribution              Normal                                Any
Assumed variance                  Homogeneous                           Any
Typical data                      Ratio or interval                     Ordinal or nominal
Data set relationships            Independent                           Any
Usual central measure             Mean                                  Median
Benefits                          Can draw more conclusions             Simplicity; less affected by outliers

Choosing a test                   Parametric test                       Non-parametric test
Correlation test                  Pearson                               Spearman
Independent measures, 2 groups    Independent-measures t-test           Mann-Whitney test
Independent measures, >2 groups   One-way, independent-measures ANOVA   Kruskal-Wallis test
Repeated measures, 2 conditions   Matched-pair t-test                   Wilcoxon test
Repeated measures, >2 conditions  One-way, repeated measures ANOVA      Friedman's test
Nonparametric Methods

For most parametric tests there is at least one nonparametric equivalent.

Tests of relationships between variables

Chi-square Test

This goodness-of-fit test compares the observed and expected frequencies in each category to test either that all categories contain the same proportion of values or that each category contains a user-specified proportion of values.

Examples

The chi-square test could be used to determine if a basket of fruit contains equal proportions of apples, bananas, oranges, and peaches. The dataset used here has four fruit categories (orange, mango, banana and lemon); the variable count holds the numeric code of each fruit (orange = 1, mango = 2, banana = 3, lemon = 4).

fruits   count
orange   1
orange   1
mango    2
banana   3
lemon    4
banana   3
orange   1
lemon    4
lemon    4
orange   1
mango    2
banana   3
lemon    4
banana   3
orange   1
lemon    4
lemon    4
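The goodness-of-fit calculation described above can be sketched in plain Python. This is only an illustration under stated assumptions: it uses the 17 rows legible in the listing, which may be a partial view of the dataset, so its p value is not expected to match the SPSS output. The helper `chi2_sf_df3` is a hypothetical name introduced here and is valid for 3 degrees of freedom only (four categories).

```python
import math
from collections import Counter

# Fruit labels as legible in the listing (assumption: this may be a partial listing).
fruits = [
    "orange", "orange", "mango", "banana", "lemon", "banana", "orange",
    "lemon", "lemon", "orange", "mango", "banana", "lemon", "banana",
    "orange", "lemon", "lemon",
]

observed = Counter(fruits)          # observed frequency per category
n = sum(observed.values())
k = len(observed)
expected = n / k                    # equal proportions under the null hypothesis

# Pearson chi-square statistic: sum of (O - E)^2 / E over the categories
chi_square = sum((o - expected) ** 2 / expected for o in observed.values())


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


assert k - 1 == 3, "the closed-form tail used here only covers 3 degrees of freedom"
p_value = chi2_sf_df3(chi_square)

print(dict(observed))
print(f"chi-square = {chi_square:.3f}, df = {k - 1}, p = {p_value:.3f}")
```

To test user-specified proportions instead of equal ones, the single `expected` value would be replaced by one expected count per category (proportion times `n`).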
SPSS Steps:

Get the data. Follow the steps as shown.
Get the count in the test variable list.
Click OK and get the output as shown below.

Interpretation: Here the p value is 0.981, which is greater than 0.05. The result is therefore not significant: we fail to reject the null hypothesis and conclude that there is no significant difference in the proportions of the four kinds of fruit.

We could also test whether a basket of fruit contains 10% apples, 20% bananas, 50% oranges, and 20% peaches. For this we have to define the expected proportions by selecting the "Values" option and adding each value in turn.

Binomial Test

The Binomial Test procedure is useful when you want to compare a single sample from a dichotomous variable to an expected proportion. If the dichotomy does not exist in the data as a variable, one can be created dynamically from a cut point on a scale variable (age in the data below, for example). If your variable has more than two outcomes, try the Chi-Square Test procedure. If you want to compare two dichotomous variables, try the McNemar test in the Two-Related-Samples Tests procedure.
Example

Say we wish to test whether the proportion of females from the variable "gender" differs significantly from 50%, i.e., from 0.5. We will request exact p values.

Age  Marital_Status  Family_Size  Land_Holding  Achievement  Market_Orientation  Problem  Gender
21   2               1            1             83           17                  16       0
40   1               0            0             77           18                  17       0
32   1               0            1             79           18                  17       0
37   1               2            1             80           18                  17       1
40   3               2            1             78           18                  17       0
40   1               2            0             78           18                  17       1
52   1               0            0             79           24                  13       0
35   2               2            1             94           24                  20       1
38   2               2            1             81           22                  12       0
55   1               0            1             78           18                  10       1
35   2               1            0             87           23                  17       1
35   3               2            1             89           22                  10       0
55   1               1            0             87           23                  15       0
40   1               2            1             86           23                  14       1
62   1               1            1             80           18                  10       1
40   1               1            0             83           24                  13       1
48   3               1            1             76           21                  14       1
62   1               2            1             84           23                  11       0
36   1               0            0             81           26                  11       0
35   1               2            1             80           21                  11       0
35   1               2            1             77           22                  13       1
35   1               1            1             82           16                  14       1
18   2               2            0             83           26                  10       0

SPSS Steps:

Get the data.
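For readers who want to check the exact binomial p value outside SPSS, the calculation can be sketched as follows. The Gender column is copied from the table above; the source does not say which code (0 or 1) denotes females, but against a test proportion of 0.5 the two-sided result is the same either way. The function name `binomial_two_sided_half` is a hypothetical helper, and it handles only the test proportion 0.5.

```python
from math import comb

# Gender column from the table above (23 respondents).
gender = [0, 0, 0, 1, 0, 1, 0, 1, 0, 1, 1, 0, 0, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0]

n = len(gender)
k = sum(gender)   # number of respondents coded 1


def binomial_two_sided_half(k, n):
    """Exact two-sided p value for H0: proportion = 0.5.

    With p = 0.5 every outcome i has probability C(n, i) / 2**n, so the
    smaller tail can be summed with integers and then doubled.
    """
    lower = min(k, n - k)
    tail = sum(comb(n, i) for i in range(lower + 1))
    return min(1.0, 2 * tail / 2 ** n)


p_value = binomial_two_sided_half(k, n)
print(f"n = {n}, coded 1 = {k}, coded 0 = {n - k}, exact two-sided p = {p_value:.3f}")
```

Because the two groups have 11 and 12 members, the observed split is as close to 50% as 23 cases allow, which is why the exact p value is at its maximum.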
Follow the steps as shown below.
Get the variable gender in the test variable list.
Click OK and get the output.

Interpretation: Since the p value is 1, the result is not significant. We fail to reject the null hypothesis and conclude that the proportion of females from the variable "gender" does not differ significantly from 50%.

Run Test for Randomness

The run test is used to examine whether or not a set of observations constitutes a random sample from an infinite population. Testing for randomness is of major importance because the assumption of randomness underlies statistical inference. In addition, tests for randomness are important for time series analysis. Departure from randomness can take many forms.
The test splits the observations into two groups at a cut point and counts the runs, that is, the sequences of consecutive observations falling on the same side of the cut point. The cut point is based either on a measure of central tendency (mean, median, or mode) or on a custom value. A sample with too many or too few runs suggests that the sample is not random.

Example

Let's see whether the variable "AGE" in the dataset below is random.

Table: Cancer dataset

ID  TRT  AGE  WEIGHIN  STAGE  TOTALCIN  TOTALCW2  TOTALCW4  TOTALCW6
1   0    52   124      2      6         6         6         7
5   0    77   160      1      9         6         10        9
6   0    60   136.5    4      7         9         17        19
9   0    61   179.6    1      6         7         9         3
11  0    59   175.8    2      6         7         16        13
15  0    69   167.6    1      6         6         6         11
21  0    67   186      1      6         11        11        10
26  0    56   158      3      6         11        15        15
31  0    61   212.8    1      6         9         6         8
35  0    51   189      1      6         4         8         7
39  0    46   149      4      7         8         11        11
41  0    65   157      1      6         6         9         6
45  0    67   186      1      8         8         9         10
2   0    46   163.8    2      7         16        9         10
12  1    56   227.2    4      6         10        11        9
14  1    42   162.6    1      4         6         8         7
16  1    44   261.4    2      6         11        11        14
22  1    27   225.4    1      6         7         6         6
24  1    68   226      4      12        11        12        9
34  1    77   164      2      5         7         13        12
37  1    86   140      1      6         7         7         7
42  1    73   181.5    0      8         11        16
44  1    67   187      1      5         7         7         7
50  1    60   164      2      6         8         16
58  1    54   172.8    4      7         8         10        8

Note: rows 42 and 50 list only eight values in the source; the missing value is assumed here to be TOTALCW6.
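The run-counting idea can be sketched in a few lines of Python using the AGE column from the table above. This is a hand calculation, not SPSS code: it uses the median as cut point (values below the median in one group, values at or above it in the other), the usual normal approximation for the number of runs, and a 0.5 continuity correction, which is an assumption about the small-sample adjustment.

```python
import math
import statistics

# AGE column of the cancer dataset, in the order listed in the table.
age = [52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67,
       46, 56, 42, 44, 27, 68, 77, 86, 73, 67, 60, 54]

cut = statistics.median(age)
at_or_above = [x >= cut for x in age]     # True = at or above the cut point

n1 = sum(at_or_above)                     # cases at or above the cut point
n2 = len(age) - n1                        # cases below the cut point
n = n1 + n2

# A new run starts whenever two neighbouring cases fall on different sides.
runs = 1 + sum(a != b for a, b in zip(at_or_above, at_or_above[1:]))

# Mean and variance of the number of runs under randomness.
expected_runs = 2 * n1 * n2 / n + 1
var_runs = 2 * n1 * n2 * (2 * n1 * n2 - n) / (n ** 2 * (n - 1))

# Continuity correction of 0.5 toward zero (assumed small-sample adjustment).
diff = runs - expected_runs
if abs(diff) <= 0.5:
    z = 0.0
else:
    z = (diff - math.copysign(0.5, diff)) / math.sqrt(var_runs)

p_value = math.erfc(abs(z) / math.sqrt(2))   # two-sided normal p value

print(f"cut point = {cut}, below = {n2}, at/above = {n1}, runs = {runs}")
print(f"z = {z:.3f}, two-sided p = {p_value:.3f}")
```

A count of runs far below the expected number would point to clustering or trend in the sequence, and a count far above it to systematic alternation.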
SPSS Steps:

Load the data. Follow the steps below.
Select "AGE" in the test variables list.
The variable "AGE" must be divided into two separate groups, so we must indicate a cut point. Let's take the median as the cut point. Any value below the median will belong to one group and any value greater than or equal to the median will belong to the other group.
Now click OK to get the output.

Interpretation: The p value is 0.450, so the result is not significant and we cannot say that AGE is not random.

One-Sample Kolmogorov-Smirnov Test

The One-Sample Kolmogorov-Smirnov procedure is used to test the null hypothesis that a sample comes from a particular distribution. Four theoretical distribution functions are available: normal, uniform, Poisson, and exponential. If we want to compare the distributions of two variables, we use the two-sample Kolmogorov-Smirnov test in the Two-Independent-Samples Tests procedure.

Example: Let us test whether the variable "AGE" in the cancer dataset used for the Run test above follows a normal distribution or a uniform distribution.
SPSS Steps

Get the data as done before. Then…
Select "AGE" in the test variable list.
Check the distribution for which you want to test.
Click OK and get the output.
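The one-sample Kolmogorov-Smirnov statistic for the normal case can also be sketched by hand. The sketch below is illustrative only: it assumes the normal distribution is fitted with the sample mean and sample standard deviation of AGE, and it uses the asymptotic Kolmogorov series for the p value, so small differences from the SPSS output are possible.

```python
import math
import statistics

# AGE column of the cancer dataset used for the run test.
age = [52, 77, 60, 61, 59, 69, 67, 56, 61, 51, 46, 65, 67,
       46, 56, 42, 44, 27, 68, 77, 86, 73, 67, 60, 54]

n = len(age)
mean = statistics.mean(age)
sd = statistics.stdev(age)      # sample standard deviation (n - 1 in the denominator)


def normal_cdf(x):
    """CDF of the normal distribution with the sample mean and sd (assumed fit)."""
    return 0.5 * (1 + math.erf((x - mean) / (sd * math.sqrt(2))))


# D = largest gap between the empirical CDF and the fitted normal CDF,
# checked just before and just after each sorted observation.
d_stat = 0.0
for i, x in enumerate(sorted(age)):
    f = normal_cdf(x)
    d_stat = max(d_stat, (i + 1) / n - f, f - i / n)

# Asymptotic two-sided p value from the Kolmogorov distribution.
lam = math.sqrt(n) * d_stat
series = sum((-1) ** (k - 1) * math.exp(-2 * k * k * lam * lam) for k in range(1, 101))
p_value = min(1.0, max(0.0, 2 * series))

print(f"mean = {mean:.2f}, sd = {sd:.2f}, D = {d_stat:.3f}, p = {p_value:.3f}")
```

Testing against a uniform distribution would only require swapping `normal_cdf` for the CDF of a uniform distribution on the observed range.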
Interpretation: The p value is 0.997, which is not significant, so we cannot reject the hypothesis that "AGE" is approximately normally distributed. If the p value were less than 0.05 we would say it is significant and that AGE does not follow an approximately normal distribution.

Two-Independent-Samples Tests

The nonparametric tests for two independent samples are useful for determining whether or not the values of a particular variable differ between two groups. This is especially true when the assumptions of the t test are not met.
• Mann-Whitney U test: to test for differences between two groups
• The two-sample Kolmogorov-Smirnov test: to test the null hypothesis that two samples have the same distribution
• Wald-Wolfowitz runs: used to examine whether two random samples come from populations having the same distribution
• Moses extreme reactions: an exact test of whether one group shows more extreme values than the other

Example: We want to find out whether the sales are different between two designs.

sales  design  store_size
11     1       1
17     1       2
16     1       3
14     1       4
15     1       5
12     2       1
10     2       2
15     2       3
19     2       4
11     2       5
23     3       1
20     3       2
18     3       3
17     3       4
27     4       1
33     4       2
22     4       3
26     4       4
28     4       5
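The Mann-Whitney comparison of designs 1 and 2 can be sketched in Python with the sales figures from the table above. The sketch computes U from midranks and then an exact two-sided p value by enumerating every possible assignment of ranks to the first group; the enumeration ignores ties, which is an assumption about how the exact significance is defined, and the helper name `midranks` is introduced here for illustration.

```python
from itertools import combinations

# Sales for design 1 and design 2, from the table above.
design1 = [11, 17, 16, 14, 15]
design2 = [12, 10, 15, 19, 11]


def midranks(values):
    """Rank the values from 1..N, giving tied values the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1          # 1-based position of the first copy
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]


n1, n2 = len(design1), len(design2)
ranks = midranks(design1 + design2)
rank_sum1 = sum(ranks[:n1])

u1 = rank_sum1 - n1 * (n1 + 1) / 2
u = min(u1, n1 * n2 - u1)                     # Mann-Whitney U (smaller of the two)

# Exact null distribution: every choice of n1 ranks out of 1..N is equally likely.
total = 0
as_extreme = 0
for subset in combinations(range(1, n1 + n2 + 1), n1):
    u_sub = sum(subset) - n1 * (n1 + 1) // 2
    u_sub = min(u_sub, n1 * n2 - u_sub)
    total += 1
    if u_sub <= u:
        as_extreme += 1

p_exact = min(1.0, as_extreme / total)

print(f"rank sum (design 1) = {rank_sum1}, U = {u}")
print(f"exact two-sided p = {p_exact:.3f} ({as_extreme} of {total} assignments)")
```

With five cases per group there are only 252 possible assignments, which is why an exact p value is cheap to obtain here and preferable to the large-sample approximation.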
SPSS Steps:

Open the dataset.
Let's compare designs 1 and 2.
Enter the variable sales in the test variable list and design as the grouping variable.
Since we are performing a two-independent-samples test, we have to designate which two groups of the factor design we want to compare. So click "Define groups". Here we type groups 2 and 1. The order is not important; we only have to enter two distinct groups. Then click Continue and OK to get the output.

Interpretation: Two p values are displayed: the asymptotic p value, which is appropriate for large samples, and the exact p value, which is valid regardless of sample size. We therefore take the exact p value, i.e. 0.548, which is not significant, and we conclude that there is no significant difference in sales between design group 1 and design group 2.
Multiple Independent Samples Tests

The nonparametric tests for multiple independent samples are useful for determining whether or not the values of a particular variable differ between two or more groups. This is especially true when the assumptions of ANOVA are not met.
• Median test: This method tests the null hypothesis that two or more independent samples have the same median. It assumes nothing about the distribution of the test variable, making it a good choice when you suspect that the distribution varies by group.
• Kruskal-Wallis H: This test is a one-way analysis of variance by ranks. It tests the null hypothesis that multiple independent samples come from the same population.
• Jonckheere-Terpstra test: an exact test, suited to groups that have a natural ordering.

Example: We want to find out whether the sales are different between the designs (comparing more than two samples simultaneously).

SPSS Steps:

Get the data in the SPSS window as done before. Then…
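As a cross-check outside SPSS, the Kruskal-Wallis H statistic for the four designs can be sketched in Python. The sales values are taken from the table in the Two-Independent-Samples example; the sketch assumes Kruskal-Wallis H is the test being run in this example, applies the usual correction for ties, and uses a chi-square approximation with 3 degrees of freedom. The helper names `midrank` and `chi2_sf_df3` are introduced here for illustration.

```python
import math

# Sales by design, from the sales table in the Two-Independent-Samples example.
groups = {
    1: [11, 17, 16, 14, 15],
    2: [12, 10, 15, 19, 11],
    3: [23, 20, 18, 17],
    4: [27, 33, 22, 26, 28],
}

pooled = sorted(v for g in groups.values() for v in g)
n = len(pooled)


def midrank(v):
    """Average 1-based rank of value v in the pooled sample (handles ties)."""
    first = pooled.index(v) + 1
    return first + (pooled.count(v) - 1) / 2


# H = 12 / (N (N + 1)) * sum(R_j^2 / n_j) - 3 (N + 1), with R_j the rank sum of group j
rank_sums = {d: sum(midrank(v) for v in g) for d, g in groups.items()}
h = 12 / (n * (n + 1)) * sum(r ** 2 / len(groups[d]) for d, r in rank_sums.items()) - 3 * (n + 1)

# Correction for ties: divide by 1 - sum(t^3 - t) / (N^3 - N)
tie_sum = sum(t ** 3 - t for t in (pooled.count(v) for v in set(pooled)))
h_corrected = h / (1 - tie_sum / (n ** 3 - n))


def chi2_sf_df3(x):
    """Upper-tail probability of a chi-square variable with exactly 3 df."""
    return math.erfc(math.sqrt(x / 2)) + math.sqrt(2 * x / math.pi) * math.exp(-x / 2)


df = len(groups) - 1
assert df == 3, "the closed-form tail used here only covers 3 degrees of freedom"
p_value = chi2_sf_df3(h_corrected)

print(f"rank sums = {rank_sums}")
print(f"H (tie-corrected) = {h_corrected:.3f}, df = {df}, p = {p_value:.4f}")
```

A significant H only says that at least two designs differ; identifying which ones would need pairwise follow-up comparisons such as the Mann-Whitney test.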
Define the range.
Click Continue, then OK to get the output.

Interpretation: The p value is 0.003, which is significant. Therefore we conclude that there is a significant difference between the groups (that is, at least two groups are different).

Tests for Two Related Samples

The nonparametric tests for two related samples allow you to test for differences between paired scores when you cannot (or would rather not) make the assumptions required by the paired-samples t test. Procedures are available for testing nominal, ordinal, or scale variables.
• Wilcoxon signed-ranks: A nonparametric alternative to the paired-samples t test. The only assumptions made by the Wilcoxon test are that the test variable is continuous and that the distribution of the difference scores is reasonably symmetric.
• McNemar: Tests the null hypothesis that binary responses are unchanged. As with the Wilcoxon test, the data may be from a single sample measured twice or from two matched samples. Unlike the Wilcoxon test, the McNemar test is designed for use with nominal or ordinal test variables that are binary.
• Marginal homogeneity: Used if the variables are multinomial, i.e. if they have more than two levels.
• Sign test: The Wilcoxon and Sign tests are both used for continuous data; of the two, the Wilcoxon test is more powerful.
Example: Use the cancer data from the Run Test example to test whether the condition of the cancer patients at the end of the 2nd week and at the end of the 4th week is significantly different. (Here, the higher the reading, the better the condition.)
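The Wilcoxon signed-ranks calculation for this example can be sketched by hand in Python. Several assumptions are made and should be checked against the real data file: the week-2 and week-4 readings are taken to be TOTALCW2 and TOTALCW4 from the cancer table, the two rows that list only eight values (IDs 42 and 50) are assumed to be missing TOTALCW6 rather than one of these two columns, and the p value uses the normal approximation with a tie correction and no continuity correction.

```python
import math

# TOTALCW2 and TOTALCW4 in table order (IDs 42 and 50 rely on an assumed column alignment).
week2 = [6, 6, 9, 7, 7, 6, 11, 11, 9, 4, 8, 6, 8,
         16, 10, 6, 11, 7, 11, 7, 7, 11, 7, 8, 8]
week4 = [6, 10, 17, 9, 16, 6, 11, 15, 6, 8, 11, 9, 9,
         9, 11, 8, 11, 6, 12, 13, 7, 16, 7, 16, 10]

# Zero differences carry no information about direction and are dropped.
diffs = [b - a for a, b in zip(week2, week4) if b != a]
n = len(diffs)
abs_sorted = sorted(abs(d) for d in diffs)


def midrank(v):
    """Average 1-based rank of |difference| v among all non-zero differences."""
    first = abs_sorted.index(v) + 1
    return first + (abs_sorted.count(v) - 1) / 2


w_plus = sum(midrank(abs(d)) for d in diffs if d > 0)    # ranks of increases
w_minus = sum(midrank(abs(d)) for d in diffs if d < 0)   # ranks of decreases

# Normal approximation with the usual correction for tied absolute differences.
mean_w = n * (n + 1) / 4
tie_term = sum(t ** 3 - t for t in (abs_sorted.count(v) for v in set(abs_sorted))) / 48
var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term

z = (min(w_plus, w_minus) - mean_w) / math.sqrt(var_w)
p_value = math.erfc(abs(z) / math.sqrt(2))               # two-sided

print(f"non-zero differences = {n}, W+ = {w_plus}, W- = {w_minus}")
print(f"z = {z:.3f}, two-sided p = {p_value:.4f}")
```

Since the positive rank sum is much larger than the negative one, most patients have higher readings at week 4 than at week 2.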
Output:

Interpretation: The p value is 0.006, which is significant. This indicates that the condition of the cancer patients at the end of the 2nd week and at the end of the 4th week is different.

Tests for Multiple Related Samples

The nonparametric tests for multiple related samples are useful alternatives to a repeated measures analysis of variance. They are especially appropriate for small samples and can be used with nominal or ordinal test variables.

Friedman: The Friedman test is a nonparametric alternative to the repeated measures ANOVA. It tests the null hypothesis that multiple ordinal responses come from the same population. As with the Wilcoxon test for two related samples, the data may come from repeated measures of a single sample or from the same measure from multiple matched samples. The only assumptions made by the Friedman test are that the test variables are at least ordinal and that their distributions are reasonably similar.

Cochran's Q: It tests the null hypothesis that multiple related proportions are the same. Think of the Cochran Q test as an extension of the McNemar test used to assess change over two times or two matched samples. Unlike the Friedman test, the Cochran test is designed for use with binary variables.

Kendall's W: A normalization of the Friedman statistic that can be interpreted as a measure of agreement.
SPSS steps:

Output:
Interpretation: The p value is less than 0.05. Hence there is a significant difference between the four groups (that is, at least two groups are different).

Exact Tests and Monte Carlo Method

These new methods, the exact and Monte Carlo methods, provide a powerful means for obtaining accurate results when your data set is small, your tables are sparse or unbalanced, the data are not normally distributed, or the data fail to meet any of the underlying assumptions necessary for reliable results using the standard asymptotic method.

The Exact Method

By default, IBM® SPSS® Statistics calculates significance levels for the statistics in the Crosstabs and Nonparametric Tests procedures using the asymptotic method. This means that p values are estimated based on the assumption that the data, given a sufficiently large sample size, conform to a particular distribution. However, when the data set is small, sparse, contains many ties, is unbalanced, or is poorly distributed, the asymptotic method may fail to produce reliable results. In these situations, it is preferable to calculate a significance level based on the exact distribution of the test statistic. This enables you to obtain an accurate p value without relying on assumptions that may not be met by your data.

The Monte Carlo Method

Although exact results are always reliable, some data sets are too large for the exact p value to be calculated, yet don't meet the assumptions necessary for the asymptotic method. In this situation, the Monte Carlo method provides an unbiased estimate of the exact p value, without the requirements of the asymptotic method. The Monte Carlo method is a repeated sampling method. For any observed table, there are many tables, each with the same dimensions and column and row margins as the observed table. The Monte Carlo method repeatedly samples a specified number of these possible tables in order to obtain an unbiased estimate of the true p value. The Monte Carlo method is less computationally intensive than the exact method, so results can often be obtained more quickly. However, if you have chosen the Monte Carlo method, but exact results can be calculated quickly for your data, they will be provided.

When to Use Exact Tests

Calculating exact results can be computationally intensive, time-consuming, and can sometimes exceed the memory limits of your machine. In general, exact tests can be performed quickly with sample sizes of less than 30. Table 1.1 provides a guideline for the conditions under which exact results can be obtained quickly.
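The repeated-sampling idea behind the Monte Carlo method can be illustrated with a small Python sketch on the design 1 versus design 2 sales data. This is not the SPSS algorithm: it simply reshuffles the group labels many times, recomputes the rank sum of the first group from midranks, and reports the share of reshuffles at least as extreme as the observed one, with a 99% confidence interval. The number of samples, the seed and the helper name `midranks` are arbitrary choices made for illustration, and because ties are kept as midranks the estimate need not equal the exact figure reported by SPSS.

```python
import math
import random

# Sales for design 1 and design 2 (two-independent-samples example).
design1 = [11, 17, 16, 14, 15]
design2 = [12, 10, 15, 19, 11]


def midranks(values):
    """Rank the values from 1..N, giving tied values the average of their ranks."""
    ordered = sorted(values)
    rank_of = {}
    for v in set(ordered):
        first = ordered.index(v) + 1
        rank_of[v] = first + (ordered.count(v) - 1) / 2
    return [rank_of[v] for v in values]


n1 = len(design1)
ranks = midranks(design1 + design2)
expected_sum = n1 * (len(ranks) + 1) / 2            # mean rank sum under H0
observed_dev = abs(sum(ranks[:n1]) - expected_sum)  # observed departure from it

rng = random.Random(2013)   # fixed seed so the sketch is repeatable
n_samples = 10000           # hypothetical number of Monte Carlo samples
hits = 0
for _ in range(n_samples):
    # Randomly reassign n1 of the pooled ranks to "design 1".
    resampled = rng.sample(ranks, n1)
    if abs(sum(resampled) - expected_sum) >= observed_dev:
        hits += 1

p_mc = hits / n_samples
half_width = 2.576 * math.sqrt(p_mc * (1 - p_mc) / n_samples)   # 99% interval

print(f"Monte Carlo p = {p_mc:.3f} "
      f"(99% CI {max(0.0, p_mc - half_width):.3f} to {min(1.0, p_mc + half_width):.3f})")
```

Increasing `n_samples` narrows the confidence interval, which is the practical trade-off between Monte Carlo and exact computation: more samples cost more time but bring the estimate closer to the exact p value.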
Test Questions

References

Eldho Varghese and Cini Varghese. Nonparametric Tests. Indian Agricultural Statistics Research Institute, New Delhi - 110 012. eldho@iasri.res.in, cini_v@iasri.res.in
Cyrus R. Mehta and Nitin R. Patel. IBM SPSS Exact Tests.
IBM SPSS Statistics Base 20.