Dr.Dinesh-BIOSTAT-Tests-of-significance-1-min.pdf

Tests of Significance
Dr M. Dinesh Dhamodhar
Reader
Department of Public Health Dentistry

Contents
• Introduction
• Tests of significance
• Parametric test
• Non Parametric test
• Application of tests in dental research
• Conclusion
• Bibliography

• VARIABLE
– A general term for any feature of the unit
which is observed or measured.
• FREQUENCY DISTRIBUTION
– Distribution showing the number of
observations or frequencies at different values or
within certain range of values of the variable.

Tests of significance
• Population
• is any finite collection of elements,
i.e. individuals, items, observations, etc.
✓Statistic –
✓is a quantity describing a sample, namely a function
of observations
✓Parameter –
✓ is a constant describing a population
✓Sample –
✓is a part or subset of the population

Quantity                  Statistic (Latin)   Parameter (Greek)
Mean                      $\bar{x}$           $\mu$
Standard deviation        $s$                 $\sigma$
Variance                  $s^2$               $\sigma^2$
Correlation coefficient   $r$                 $\rho$
Number of subjects        $n$                 $N$

Hypothesis testing
• Hypothesis
• is an assumption about the status of a phenomenon
✓Null hypothesis or hypothesis of no difference –
✓States that there is no difference between a statistic of a sample & a parameter
of the population, or between the statistics of two samples
✓This nullifies the claim that the experimental result is different
from or better than the one observed already
✓Denoted by $H_0$

• Alternate hypothesis –
• Any hypothesis alternate to the null hypothesis, which is to be
tested
• Denoted by $H_1$
Note: the alternate hypothesis is accepted when the
null hypothesis is rejected

Type I & type II errors
• Type I error = α (rejecting $H_0$ when $H_0$ is true)
• Type II error = β (accepting $H_0$ when $H_1$ is true)

Decision        $H_0$ is true   $H_1$ is true
Accept $H_0$    No error        Type II error
Accept $H_1$    Type I error    No error


When the primary concern of a test is to see
whether the null hypothesis can be rejected,
such a test is called a test of significance

α ERROR
• The “P” value is the probability of observing a difference at least as large
as the one found when the null hypothesis is actually true
• Thus the P-value is the chance that the presence of a difference
is concluded when actually there is none

✓Type I error – important- fixed in advance at a
low level
✓Thus α is the maximum tolerable probability
of type I error

• Difference between level of significance (LOS) & P-value:
LOS (α)                                        P-value
1) Maximum tolerable chance of type I error    1) Actual probability of type I error
2) α is fixed in advance                       2) Calculated on the basis of the data, following the test procedure
The P-value can be more than α or less than α, depending on the data
When the P-value is less than α → the result is statistically significant

• The level of significance is usually fixed at 5%
(0.05), 1% (0.01), 0.5% (0.005) or 0.1% (0.001)
• Maximum desirable is the 5% level
• When the P-value is
between 0.05 and 0.01 = statistically significant
less than 0.01 = highly statistically significant
less than 0.005 or 0.001 = very highly significant
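
The cut-offs above can be written as a small helper. This is only an illustrative sketch: the function name `describe_p_value` is hypothetical, and the 0.001 boundary is used for "very highly significant" (the slide also mentions 0.005).

```python
def describe_p_value(p: float, alpha: float = 0.05) -> str:
    """Label a P-value using the cut-offs listed on the slide."""
    if not 0.0 <= p <= 1.0:
        raise ValueError("a probability must lie between 0 and 1")
    if p >= alpha:
        # P-value not below the level of significance
        return "not statistically significant"
    if p < 0.001:
        return "very highly significant"
    if p < 0.01:
        return "highly statistically significant"
    return "statistically significant"


for p in (0.2, 0.03, 0.004, 0.0002):
    print(p, "->", describe_p_value(p))
```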

Sampling Distribution
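[Figure: normal sampling distribution curve centred on $\bar{x}$. The central zone of acceptance of $H_0$ lies between the 95% confidence limits $\bar{x} - 1.96\,SD$ and $\bar{x} + 1.96\,SD$ (the 95% confidence interval); a zone of rejection of $H_0$ lies in each tail, each holding 1/2 α.]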

TESTS IN TEST OF SIGNIFICANCE
• Parametric (normal distribution & normal curve)
– Quantitative data
1) Student ‘t’ test
   a) Paired
   b) Unpaired
2) Z test (for large samples)
3) One way ANOVA
4) Two way ANOVA
– Qualitative data
1) Z – prop test
2) χ² test
• Non-parametric (do not follow normal distribution)
– Qualitative (quantitative converted to qualitative)
1. Mann Whitney U test
2. Wilcoxon rank test
3. Kruskal Wallis test
4. Friedman test

Dr Sandesh N
Parametric                Uses                                              Non-parametric
Paired t test             Test of difference between paired observations    Wilcoxon signed rank test
Two sample t test         Comparison of two groups                          Wilcoxon rank sum test, Mann Whitney U test, Kendall’s S test
One way ANOVA             Comparison of several groups                      Kruskal Wallis test
Two way ANOVA             Comparison of group values on two variables       Friedman test
Correlation coefficient   Measure of association between two variables      Spearman’s rank correlation, Kendall’s rank correlation
Normal test (Z test)                                                        Chi square test

Student ‘t’ test
• Small samples do not follow the normal distribution
• W. S. Gosset described the ‘t’ test under the pen name ‘Student’
• It is the ratio of the observed difference between two means of small samples
to the SE of that difference

Types
Unpaired ‘t’test
Paired ‘t’test

• Criteria for applying ‘t’ test –
• Random samples
• Quantitative data
• Variable follows normal distribution
• Sample size less than 30
• Application of ‘t’ test –
1. Two means of small independent sample
2. Sample mean and population mean
3. Two proportions of small independent samples

Unpaired ‘t’ test
I) Difference between means of two independent samples
Data –
               Group 1        Group 2
Sample size    $n_1$          $n_2$
Mean           $\bar{x}_1$    $\bar{x}_2$
SD             $SD_1$         $SD_2$
1) Null hypothesis: $H_0 : (\bar{x}_1 - \bar{x}_2) = 0$
2) Alternate hypothesis: $H_1 : (\bar{x}_1 - \bar{x}_2) \neq 0$

3) Test criterion
$t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
Here the SE of $(\bar{x}_1 - \bar{x}_2)$ is calculated by
$SE(\bar{x}_1 - \bar{x}_2) = SD \sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}$
where $SD = \sqrt{\dfrac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2}}$
so that
$SE(\bar{x}_1 - \bar{x}_2) = \sqrt{\dfrac{(n_1 - 1)\,SD_1^2 + (n_2 - 1)\,SD_2^2}{n_1 + n_2 - 2} \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

4) Calculate degree of freedom
$df = (n_1 - 1) + (n_2 - 1) = n_1 + n_2 - 2$
5) Compare the calculated value &
the table value
6) Draw conclusions

• Example – difference between caries experience of high
& low socioeconomic groups
Sl no   Details              High socioeconomic group   Low socioeconomic group
I       Sample size          $n_1 = 15$                 $n_2 = 10$
II      DMFT                 $\bar{x}_1 = 2.91$         $\bar{x}_2 = 2.26$
III     Standard deviation   $SD_1 = 0.27$              $SD_2 = 0.22$

$t = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)} = \dfrac{0.65}{0.1027} = 6.33$, $df = 23$
$t_{0.001} = 3.76$, so $t_c > t_{0.001}$
There is a significant difference
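
The unpaired ‘t’ calculation for this example can be reproduced from the summary figures alone. The sketch below uses the pooled-SD formula given under the test criterion; the function name `unpaired_t` is an illustrative assumption, not part of the original slides.

```python
import math


def unpaired_t(n1, mean1, sd1, n2, mean2, sd2):
    """Return (t, df, se) for two independent samples using the pooled SD."""
    df = n1 + n2 - 2
    # Pooled variance: weighted average of the two sample variances
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / df
    # Standard error of the difference between the two means
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / se, df, se


# DMFT example: high vs low socioeconomic group
t, df, se = unpaired_t(15, 2.91, 0.27, 10, 2.26, 0.22)
print(f"SE = {se:.4f}, t = {t:.2f}, df = {df}")
# SE = 0.1027, t = 6.33, df = 23
```

The calculated t is then compared with the table value at df = 23, as in steps 5 and 6.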

Other applications
II) Difference between sample mean & population mean
$t = \dfrac{\bar{x} - \mu}{SE}$, where $SE = \dfrac{SD}{\sqrt{n}}$
$df = n - 1$
III) Difference between two sample proportions
$t = \dfrac{p_1 - p_2}{\sqrt{PQ \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
where $P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $Q = 1 - P$
$df = n_1 + n_2 - 2$

Paired ‘t’ test
• Is applied to paired data of observations from one
sample only, when each individual gives a pair of
observations
• Here the pair of observations are correlated and not
independent, so for application of the ‘t’ test the following
procedure is used –
1. Find the difference for each pair, $x = y_1 - y_2$
2. Calculate the mean of the differences, $\bar{x}$
3. Calculate the SD of the differences & later the SE, $SE = \dfrac{SD}{\sqrt{n}}$
4. Test criterion
$t = \dfrac{\bar{x} - 0}{SE(d)} = \dfrac{\bar{x}}{SD / \sqrt{n}}$
5. Degree of freedom
$df = n - 1$
6. Refer to the ‘t’ table & find the probability
of the calculated value
7. Draw conclusions

• Example – to find out if there is any significant improvement in DAI
scores before and after orthodontic treatment
Sl no   DAI before   DAI after   Difference   Squares
1       30           24          6            36
2       26           23          3            9
3       27           24          3            9
4       35           25          10           100
5       25           23          2            4
Total                            24           158

Mean, $\bar{x} = \dfrac{\sum x}{n} = \dfrac{24}{5} = 4.8$
Sum of squares, $\sum (x - \bar{x})^2 = (6 - 4.8)^2 + (3 - 4.8)^2 + (3 - 4.8)^2 + (10 - 4.8)^2 + (2 - 4.8)^2 = 1.44 + 3.24 + 3.24 + 27.04 + 7.84 = 42.8$
$SD = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{42.8}{4}} = \sqrt{10.7} = 3.271$
$SE = \dfrac{SD}{\sqrt{n}} = \dfrac{3.271}{\sqrt{5}} = 1.463$
$t = \dfrac{\bar{x}}{SE} = \dfrac{4.8}{1.463} = 3.281$
$df = n - 1 = 4$
$t_{0.05} = 2.78$, so $t_c > t_{0.05}$

Hence significant at the 5% level
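
As a check on the arithmetic, the paired ‘t’ statistic can be computed directly from the five before/after pairs in the table. This is a minimal sketch using only the standard library; `paired_t` is a hypothetical helper name.

```python
import math

before = [30, 26, 27, 35, 25]  # DAI before treatment
after = [24, 23, 24, 25, 23]   # DAI after treatment


def paired_t(first, second):
    """Return (t, df, mean, sd, se) of the paired differences."""
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    # Sample SD of the differences (divisor n - 1)
    sd = math.sqrt(sum((d - mean_d) ** 2 for d in diffs) / (n - 1))
    se = sd / math.sqrt(n)
    return mean_d / se, n - 1, mean_d, sd, se


t, df, mean_d, sd, se = paired_t(before, after)
print(f"mean = {mean_d:.1f}, SD = {sd:.3f}, SE = {se:.3f}, t = {t:.2f}, df = {df}")
```

The resulting t is compared with the ‘t’ table value for df = n − 1.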

Z test (Normal test)
• Similar to ‘t’ test in all aspects except that the sample size should be >
30
• In case of normal distribution, the tabulated value of Z at –
5% level: $Z_{0.05} = 1.960$
1% level: $Z_{0.01} = 2.576$
0.1% level: $Z_{0.001} = 3.290$

• Z test can be used for –
1. Comparison of means of two samples –
$Z = \dfrac{\bar{x}_1 - \bar{x}_2}{SE(\bar{x}_1 - \bar{x}_2)}$
where $SE(\bar{x}_1 - \bar{x}_2) = \sqrt{SE_1^2 + SE_2^2} = \sqrt{\dfrac{SD_1^2}{n_1} + \dfrac{SD_2^2}{n_2}}$
2. Comparison of sample mean & population mean
$Z = \dfrac{\bar{x} - \mu}{\sqrt{SD^2 / n}}$

3. Difference between two sample proportions
$Z = \dfrac{p_1 - p_2}{\sqrt{PQ \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$
where $P = \dfrac{n_1 p_1 + n_2 p_2}{n_1 + n_2}$ and $Q = 1 - P$
4. Comparison of sample proportion
(or percentage) with population proportion
(or percentage)
$Z = \dfrac{p - P}{\sqrt{PQ / n}}$
Where p = sample proportion
P = population proportion, $Q = 1 - P$
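
The Z test for the difference between two sample proportions can be sketched as follows. The counts used (60 of 100 versus 45 of 100) are hypothetical numbers chosen only for illustration, and `two_proportion_z` is an assumed helper name.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Z statistic for the difference between two sample proportions."""
    p1, p2 = x1 / n1, x2 / n2
    # Pooled proportion P and its complement Q
    P = (n1 * p1 + n2 * p2) / (n1 + n2)
    Q = 1 - P
    se = math.sqrt(P * Q * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Hypothetical data: 60 of 100 subjects in group 1, 45 of 100 in group 2
z = two_proportion_z(60, 100, 45, 100)
print(f"Z = {z:.3f}")
print("significant at 5% level" if abs(z) > 1.960 else "not significant at 5% level")
```

The calculated Z is compared with the tabulated values 1.960, 2.576 and 3.290 for the 5%, 1% and 0.1% levels.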

Analysis of variance (ANOVA)
• Useful for comparison of means of several groups
• R A Fisher in 1920’s
• Has four models
1. One way classification (one way ANOVA )
2. Single factor repeated measures design
3. Nested or hierarchical design
4. Two way classification (two way ANOVA)

One way ANOVA
• Can be used for comparisons such as –
• Effect of different treatment modalities
• Effect of different obturation techniques on the apical seal, etc.

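Groups (or treatments)   1             2             ...   i             ...   k
Individual values        $x_{11}$      $x_{21}$      ...   $x_{i1}$      ...   $x_{k1}$
                         $x_{12}$      $x_{22}$      ...   $x_{i2}$      ...   $x_{k2}$
                         ...           ...                 ...                 ...
                         $x_{1n}$      $x_{2n}$      ...   $x_{in}$      ...   $x_{kn}$
Calculate
No of observations       $n$           $n$           ...   $n$           ...   $n$
Sum of x values          $T_1$         $T_2$         ...   $T_i$         ...   $T_k$
Sum of squares           $S_1$         $S_2$         ...   $S_i$         ...   $S_k$
Mean of values           $\bar{x}_1$   $\bar{x}_2$   ...   $\bar{x}_i$   ...   $\bar{x}_k$
where
$T_1 = x_{11} + x_{12} + \dots + x_{1n}$
$S_1 = x_{11}^2 + x_{12}^2 + \dots + x_{1n}^2$
$\bar{x}_1 = \dfrac{T_1}{n}$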

ANOVA table
Sl no   Source of variation   Degree of freedom   Sum of squares   Mean sum of squares   F ratio or variance ratio
I       Between groups        $k - 1$             $SS_B$           $S_B^2$               $F = \dfrac{S_B^2}{S_W^2}$ with $(k - 1,\ N - k)$ df
II      Within groups         $N - k$             $SS_W$           $S_W^2$
III     Total                 $N - 1$             $SS_T$           $S_T^2$
where
$SS_B = \sum_i n_i (\bar{x}_i - \bar{x})^2 = \sum_i \dfrac{T_i^2}{n_i} - \dfrac{T^2}{N}$
$SS_W = \sum_i \sum_j (x_{ij} - \bar{x}_i)^2 = \sum_i \sum_j x_{ij}^2 - \sum_i \dfrac{T_i^2}{n_i}$
$SS_T = \sum_i \sum_j (x_{ij} - \bar{x})^2 = \sum_i \sum_j x_{ij}^2 - \dfrac{T^2}{N}$
$S_B^2 = \dfrac{\sum_i n_i (\bar{x}_i - \bar{x})^2}{k - 1}$
$S_W^2 = \dfrac{\sum_i \sum_j x_{ij}^2 - \sum_i T_i^2 / n_i}{N - k}$
$S_T^2 = \dfrac{\sum_i \sum_j x_{ij}^2 - T^2 / N}{N - 1}$

Example – to see whether there is a difference in the number of patients
seen in a given period by practitioners in three group practices
Practice                  A        B        C
Individual values         268      387      161
                          349      264      346
                          323      423      324
                          209      254      293
                          292               239
Calculate
No of observations (n)    5        4        5
Sum of x values           1441     1328     1363
Sum of squares            426899   462910   393583
Mean of values            288.2    332.0    272.6

Total sum of squares
$= \sum x_A^2 + \sum x_B^2 + \sum x_C^2 - \dfrac{\left(\sum x_A + \sum x_B + \sum x_C\right)^2}{n_A + n_B + n_C} = 63861.71$
Between group sum of squares
$= \dfrac{\left(\sum x_A\right)^2}{n_A} + \dfrac{\left(\sum x_B\right)^2}{n_B} + \dfrac{\left(\sum x_C\right)^2}{n_C} - \dfrac{\left(\sum x_A + \sum x_B + \sum x_C\right)^2}{n_A + n_B + n_C} = 8215.71$
Within group sum of squares
$= SS_{total} - SS_{between} = 55646.0$
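
The three sums of squares can be reproduced from the individual values, and the F ratio of the ANOVA table follows from them. This is a hedged sketch: the third value for practice A is taken as 323, the value consistent with the stated sum (1441) and sum of squares (426899); the F ratio is not given in the slides and is computed here only by applying the table's formula. The helper name `one_way_anova` is hypothetical.

```python
groups = {
    "A": [268, 349, 323, 209, 292],  # 323 assumed: matches sum 1441, sum of squares 426899
    "B": [387, 264, 423, 254],
    "C": [161, 346, 324, 293, 239],
}


def one_way_anova(samples):
    """Return (SS total, SS between, SS within, F) for a list of groups."""
    all_values = [x for g in samples for x in g]
    N, k = len(all_values), len(samples)
    correction = sum(all_values) ** 2 / N  # T^2 / N
    ss_total = sum(x * x for x in all_values) - correction
    ss_between = sum(sum(g) ** 2 / len(g) for g in samples) - correction
    ss_within = ss_total - ss_between
    # Mean sums of squares and their ratio
    f_ratio = (ss_between / (k - 1)) / (ss_within / (N - k))
    return ss_total, ss_between, ss_within, f_ratio


ss_total, ss_between, ss_within, f_ratio = one_way_anova(list(groups.values()))
print(f"total = {ss_total:.2f}, between = {ss_between:.2f}, within = {ss_within:.1f}")
print(f"F = {f_ratio:.3f} with (2, 11) df")
```

The F ratio would then be compared with the table value of F for (k − 1, N − k) degrees of freedom.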

Two way ANOVA
• Is used to study the impact of two factors on
variations in a specific variable
• Eg – Effect of age and sex on DMFT value

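Sample values
Blocks        Treatments                                             Sample size   Total           Mean value
              1             2             3             ...   k
i             $x_{11}$      $x_{21}$      $x_{31}$      ...   $x_{k1}$      $k$           $T_{\cdot 1}$   $\bar{x}_{\cdot 1}$
ii            $x_{12}$      $x_{22}$      $x_{32}$      ...   $x_{k2}$      $k$           $T_{\cdot 2}$   $\bar{x}_{\cdot 2}$
..            ...           ...           ...                 ...           ...           ...             ...
n             $x_{1n}$      $x_{2n}$      $x_{3n}$      ...   $x_{kn}$      $k$           $T_{\cdot n}$   $\bar{x}_{\cdot n}$
Sample size   $n$           $n$           $n$           ...   $n$           $nk = N$
Total         $T_{1\cdot}$  $T_{2\cdot}$  $T_{3\cdot}$  ...   $T_{k\cdot}$                $T$
Mean value    $\bar{x}_{1\cdot}$  $\bar{x}_{2\cdot}$  $\bar{x}_{3\cdot}$  ...  $\bar{x}_{k\cdot}$                 $\bar{x}$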

Non parametric tests
• Here the tests do not require the data to follow any specific pattern of
distribution. They are applicable to almost all kinds of distribution
• Chi square test
• Mann Whitney U test
• Wilcoxon signed rank test
• Wilcoxon rank sum test
• Kendall’s S test
• Kruskal wallis test
• Spearman’s rank correlation

Chi square test
• By Karl Pearson & denoted as χ²
• Application
1. Alternate test to find the significance of difference in two or more than two
proportions
2. As a test of association b/n two events in binomial or multinomial samples
3. As a test of goodness of fit

• Requirement to apply chi square test
• Random samples
• Qualitative data
• Lowest expected frequency not less than 5
• Contingency table
• Frequency table where sample classified according to two
different attributes
• 2 rows; 2 columns => 2 × 2 contingency table
• r rows; c columns => r × c contingency table
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
O – observed frequency
E – expected frequency

• Steps
1. State null & alternate hypothesis
2. Make contingency table of the data
3. Determine expected frequency of each cell by
$E = \dfrac{r \times c}{N}$
where r = row total, c = column total, N = total frequency
4. Calculate chi-square of each cell by
$\dfrac{(O - E)^2}{E}$
5. Calculate degree of freedom
$df = (c - 1)(r - 1)$
6. Sum the chi-square of all cells – this gives the
chi-square value of the data
$\chi^2 = \sum \dfrac{(O - E)^2}{E}$
7. Compare the calculated value with the table
value at any LOS
8. Draw conclusions
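
The steps above can be sketched in a few lines. The 2 × 2 table of observed frequencies below is hypothetical, chosen only to illustrate the calculation, and `chi_square` is an assumed helper name.

```python
def chi_square(table):
    """Return (chi-square, df) for an r x c table of observed frequencies."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    N = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected frequency = (row total x column total) / total frequency
            expected = row_totals[i] * col_totals[j] / N
            chi2 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df


# Hypothetical 2 x 2 contingency table (rows: groups, columns: outcome yes/no)
observed = [[30, 20], [15, 35]]
chi2, df = chi_square(observed)
print(f"chi-square = {chi2:.2f}, df = {df}")
# chi-square = 9.09, df = 1
```

For df = 1 the table value at the 5% level is 3.84, so a calculated value of this size would be declared significant.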

• Chi square test only tells the presence or absence of
association
• but does not measure the strength of association

Wilcoxon signed rank test
• Is equivalent to paired ‘t’ test
• Steps
• Exclude any differences which are zero
• Put the remaining differences in ascending order, ignoring the signs
• Give ranks from lowest to highest
• If any differences are equal, then average their ranks
• Sum all the ranks of positive differences – T+
• Sum all the ranks of negative differences – T−

• If there is no difference between the variables then T+ & T− will
be similar, but if there is a difference then one sum will
be large and the other will be much smaller
• T = smaller of T+ & T−
• Compare the T value with the critical value for the 5%, 2%
& 1% significance levels
• A result is significant if T is smaller than the critical value
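
The ranking steps can be sketched as below, reusing the DAI before/after scores from the paired ‘t’ example as input. The helper name `signed_rank_sums` is hypothetical; tied absolute differences receive the average of their ranks, as described in the steps.

```python
def signed_rank_sums(first, second):
    """Return (T+, T-, T) for paired observations."""
    # Exclude zero differences
    diffs = [a - b for a, b in zip(first, second) if a != b]
    ordered = sorted(diffs, key=abs)  # ascending order, ignoring signs
    ranks = {}
    i = 0
    while i < len(ordered):
        j = i
        while j < len(ordered) and abs(ordered[j]) == abs(ordered[i]):
            j += 1
        # Positions i+1 .. j share the average rank
        ranks[abs(ordered[i])] = (i + 1 + j) / 2
        i = j
    t_plus = sum(ranks[abs(d)] for d in diffs if d > 0)
    t_minus = sum(ranks[abs(d)] for d in diffs if d < 0)
    return t_plus, t_minus, min(t_plus, t_minus)


before = [30, 26, 27, 35, 25]
after = [24, 23, 24, 25, 23]
t_plus, t_minus, T = signed_rank_sums(before, after)
print(t_plus, t_minus, T)
```

Here every difference is positive, so T− is zero and T is the smaller sum; T is then compared with the critical value for the number of non-zero pairs.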

Mann Whitney U test
• Is used to determine whether two independent samples have been
drawn from the same population
• It is an alternative to Student ‘t’ test & requires at least ordinal
measurement
$U = n_1 n_2 + \dfrac{n_1 (n_1 + 1)}{2} - R_1$ or $U = n_1 n_2 + \dfrac{n_2 (n_2 + 1)}{2} - R_2$
Where $n_1$, $n_2$ are the sample sizes
$R_1$, $R_2$ are the sums of ranks assigned to groups I & II

Comparison of birth weights of children born to 15 non-smokers (NS)
with those of children born to 14 heavy smokers (HS)
NS 3.9 3.7 3.6 3.7 3.2 4.2 4.0 3.6 3.8 3.3 4.1 3.2 3.5 3.5 2.7
HS 3.1 2.8 2.9 3.2 3.8 3.5 3.2 2.7 3.6 3.7 3.6 2.3 2.3 3.6
Rank assignments
R1 (NS) 26 23 16 21 8 29 27 17 24 12 28 10 15 13 3
R2 (HS) 7 5 6 11 25 14 9 4 20 22 19 2 1 18

Sum of R1= 272 and Sum of R2=163
Difference T=R1 – R2 is 109
The table value of T0.05 is 96 , so reject the H0
We conclude that weights of children born to the
heavy smokers are significantly lower than those of
the children born to the non-smokers (p<0.05)
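
The rank sums and the U statistic for the birth weight data can be sketched as follows. Note one assumption that differs from the slide: tied weights are given the average of their ranks here, whereas the slide assigns distinct ranks to tied values, so the rank sums come out slightly different from 272 and 163. The helper name `rank_sums` is hypothetical.

```python
ns = [3.9, 3.7, 3.6, 3.7, 3.2, 4.2, 4.0, 3.6, 3.8, 3.3, 4.1, 3.2, 3.5, 3.5, 2.7]
hs = [3.1, 2.8, 2.9, 3.2, 3.8, 3.5, 3.2, 2.7, 3.6, 3.7, 3.6, 2.3, 2.3, 3.6]


def rank_sums(a, b):
    """Rank the pooled values (average ranks for ties) and sum ranks per group."""
    pooled = sorted(a + b)
    rank_of = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        rank_of[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    return sum(rank_of[x] for x in a), sum(rank_of[x] for x in b)


r1, r2 = rank_sums(ns, hs)
n1, n2 = len(ns), len(hs)
u1 = n1 * n2 + n1 * (n1 + 1) / 2 - r1
u2 = n1 * n2 + n2 * (n2 + 1) / 2 - r2
print(f"R1 = {r1}, R2 = {r2}, U1 = {u1}, U2 = {u2}")
```

The two U values always add up to n1 × n2, which is a convenient check on the ranking.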

Applications of statistical tests in
Research Methods

Multiple-variable problem
• When the research is interested in the relationship
between more than two variables, use multiple regression
or multivariate analysis

Bibliography
• Biostatistics
– Rao K Vishweswara, 1st edition.
• Methods in Biostatistics
– Mahajan B K, 5th edition.
• Essentials of Medical Statistics
– Kirkwood Betty R, 1st edition.
• Health Research Design and Methodology
– Okolo Eucharia Nnadi.
• Simple Biostatistics
– Indrayan, 1st edition.
• Statistics in Dentistry
– Bulman J S.
