BIOL209: Two Samples
Paul Gardner
April 3, 2017
Paul Gardner BIOL209: Two Samples
Two samples
Some of the most frequently used (and simplest) models include:
Covered in BIOL209:
comparing 2 variances (Fisher’s F test: var.test)
comparing 2 sample means with normal distribution (Student’s t test: t.test)
comparing 2 means with non-normal distribution (Wilcoxon’s test: wilcox.test)
comparing 2 proportions (the binomial test: prop.test) NB. I may not cover this...
comparing 2 variables (“Pearson’s” or “Spearman’s rank” correlation: cor.test)
Covered in BIOL309:
testing for independence in contingency tables using $\chi^2$ (chisq.test)
testing small samples for correlation with Fisher’s exact test (fisher.test)
R. A. Fisher
Who is this Fisher?
R. A. Fisher (1890-1962) was a statistician and geneticist who
contributed to mathematics and genetic measures of
evolutionary selection
Developed analysis of variance (ANOVA) – See Distinguished
Professor David Schiel’s lectures
Fisher’s F test
Fisher’s exact test
Fisher’s method for combining P-values
and lots more...
https://en.wikipedia.org/wiki/Ronald_Fisher
Fisher’s F test
Comparing 2 variances ($s_1^2$ & $s_2^2$)
Sometimes we are interested in comparing the variances of two samples
This can happen when, for example, we want to compare the results of a treatment group and a control
If $s_1^2 \geq s_2^2$, then:
$$F = \frac{s_1^2}{s_2^2}$$
[Figure: two histograms of x (−15 to 15) against Freq., titled "Broad" and "Narrow"]
Fisher’s F test
$$F = \frac{s_1^2}{s_2^2}, \quad s_1^2 > s_2^2$$
$F \geq 1$
What is the null (H0) for this test?
How will we know if there is a significant difference between
the variances? I.e. what is the critical value of F?
What are the assumptions of this test?
normal & independent
David Schiel will talk a LOT more about this test for ANOVAs
An example
Ozone concentrations were collected in 2 market gardens
   gardenB gardenC
1        5       3
2        5       3
3        6       2
4        7       1
5        4      10
6        4       4
7        3       3
8        5      11
9        6       3
10       5      10
[Figure: histograms of [ozone] (pphm) against Frequency for Garden B and Garden C, with mean, median and mode marked]
oz<-read.csv("f.test.data.csv", header=T)
par(mfrow=c(1,2), cex=2.5)
hist(oz$gardenB, col="cornflowerblue", main="Garden B", xlab="[ozone] (pphm)")
hist(oz$gardenC, col="salmon", main="Garden C", xlab="[ozone] (pphm)")
An example
Calculate F
In this example, we have no idea which variance is likely to be
larger (often not the case for control vs expt situations)
Therefore, we use a two-tailed test ($p = 1 - \frac{\alpha}{2}$)
var(oz$gardenB)
[1] 1.333333
var(oz$gardenC)
[1] 14.22222
(F.ratio <- var(oz$gardenC)/var(oz$gardenB))
[1] 10.66667
# we double the probability for the two-tailed test:
2*(1-pf(F.ratio, 9, 9))
[1] 0.001624199
The probability of obtaining an F ratio as large as this, or larger, if the variances were the same (i.e. by chance), is less than 0.002.
NB. This is not the probability that the null is true!
The null is assumed to be true in carrying out the test
(same with the other statistical tests).
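The calculation can also be rerun without the CSV file. The sketch below types the ozone readings in from the table on the example slide, and additionally compares the F ratio with a critical value from `qf`. The variable names (`df.num`, `F.crit`, ...) are made up for illustration and are not from the lecture.

```r
# Ozone readings (pphm) typed in from the example table, so no CSV file is needed
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)
gardenC <- c(3, 3, 2, 1, 10, 4, 3, 11, 3, 10)

# Larger variance (garden C) on top, so that F >= 1
F.ratio <- var(gardenC) / var(gardenB)

# Degrees of freedom are n - 1 for each sample
df.num <- length(gardenC) - 1
df.den <- length(gardenB) - 1

# Two-tailed p value: double the upper-tail probability
p.value <- 2 * pf(F.ratio, df.num, df.den, lower.tail = FALSE)

# Critical value for alpha = 0.05, two-tailed (alpha/2 in the upper tail)
F.crit <- qf(0.975, df.num, df.den)

c(F = F.ratio, critical = F.crit, p = p.value)
```

If the F ratio exceeds the critical value, the p value falls below 0.05, so the two ways of reaching a decision agree.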
RECALL: p values and significance
p is the probability that a test statistic at least as extreme as the one observed could have occurred by chance when the null hypothesis is true
say this 50 times, write on the bathroom wall, remember it!
a result is said to be significant (or unlikely to be due to chance) if p is low
≤ 5% (or α ≤ 0.05)
the significance threshold may change depending upon the field
e.g. $p \leq 3 \times 10^{-7}$ is often used in high-energy physics, $p \leq 10^{-10}$ is often used in genomics
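What "by chance when the null hypothesis is true" means can be seen by simulation. In the sketch below both samples are drawn from the same normal distribution, so the null of equal variances really is true, and we count how often `var.test` still reports p ≤ 0.05. The sample size, number of repeats and seed are arbitrary choices, not values from the lecture.

```r
set.seed(1)     # arbitrary seed so the run is repeatable

n.sim <- 2000   # number of simulated experiments (arbitrary)
n <- 10         # observations per sample (arbitrary)

# Both samples come from the same normal distribution: the null is true
p.values <- replicate(n.sim, var.test(rnorm(n), rnorm(n))$p.value)

# Proportion of experiments declared "significant" at alpha = 0.05
prop.significant <- mean(p.values <= 0.05)
prop.significant
```

The proportion should land close to 0.05: with a 5% threshold, roughly 1 in 20 experiments looks significant even though nothing is going on.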
An example
There is a faster way of running F tests:
var.test(oz$gardenC, oz$gardenB, alternative = "two.sided")
F test to compare two variances
data: oz$gardenC and oz$gardenB
F = 10.667, num df = 9, denom df = 9, p-value = 0.001624
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
2.649449 42.943938
sample estimates:
ratio of variances
10.66667
NB. We shouldn’t use Student’s t test on these data, as the variances are significantly different and the means are the same (5 pphm)
Comparing two means
We now know how to test if our variances are significantly different.
Suppose we wish to see if the means of our samples are significantly different
As usual, we want to:
1. Compute a test statistic
2. Ask how likely our test statistic is if our null hypothesis is true
3. Compare the statistic to a critical value
R is particularly handy, since statistical tables and formulae for many probability distributions have been built into the package
What are some measures we could use to compare two means?
Examples of comparing two means
Genetic variation and associations with disease (or the
severity)
Treatment vs control groups (e.g. with/without fertiliser or
pesticide or greenhouse or ...)
Comparing regions, outcomes, social systems, time-periods, ...
So fundamental that it’s difficult to think of a field where this
wouldn’t be useful!
[Figure: Probability Densities for 2 Normal Distributions; x against Prob.; curves for mu=0.0, sigma=1.0 and mu=1.0, sigma=1.0]
We’ll consider two tests (maybe three)
Student’s t test for when our samples are independent,
variances are constant, and the data is normally distributed
(parametric)
Wilcoxon rank-sum test for when our samples are
independent, but the data is not necessarily normally
distributed (nonparametric)
Kolmogorov-Smirnov test for when our samples are
independent, but the data is not necessarily normally
distributed (nonparametric)
Student’s t Test
“Student” is a pseudonym for W. S. Gosset who first
published the approach in 1908
Prevented from using his own name by his employer, the
Guinness Brewing Company
The test statistic is the number of standard errors by which
the sample means are separated
does this remind you of z scores?
Student’s t Test
We write:
$$t = \frac{\bar{x}_A - \bar{x}_B}{SE_{\text{diff}}}$$
I hope you recall that: $SE_{\bar{x}} = \sqrt{\frac{s^2}{n}}$
A trick called the variance sum law can be used to show that:
$$SE_{\text{diff}} = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$$
Estimating the variance of a difference between 2
independent samples
If $n_A = n_B$:
$$\sum \left[(x_A - \bar{x}_A) - (x_B - \bar{x}_B)\right]^2$$
With a little algebra we can show that:
$$\sigma^2_{\bar{x}_A - \bar{x}_B} = \sigma^2_A + \sigma^2_B$$
NB. Only true if the samples are not correlated
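The variance sum law can be checked numerically. A minimal sketch, assuming two simulated independent normal samples (the sizes, standard deviations and seed are arbitrary and not from the lecture): the variance of the differences should be close to the sum of the two variances, because the covariance between independent samples is close to zero.

```r
set.seed(42)  # arbitrary seed for repeatability

# Two independent samples (simulated; not data from the lecture)
a <- rnorm(1000, mean = 0, sd = 2)   # variance about 4
b <- rnorm(1000, mean = 1, sd = 3)   # variance about 9

var.diff <- var(a - b)               # variance of the differences
var.sum  <- var(a) + var(b)          # what the variance sum law predicts
cov.ab   <- cov(a, b)                # near zero for independent samples

# Exact sample identity: var(a - b) = var(a) + var(b) - 2 * cov(a, b)
all.equal(var.diff, var.sum - 2 * cov.ab)

c(var.diff = var.diff, var.sum = var.sum, cov = cov.ab)
```

The identity in the `all.equal` line holds for any two samples of equal length; the simpler "sum of variances" form is what remains when the covariance term vanishes.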
Let’s try an example
Comparing the ozone concentrations in gardens B and C was inappropriate for Student’s t; what about A vs B?
(oz<-read.csv("t.test.data.csv", header=T))
   gardenA gardenB
1        3       5
2        4       5
3        4       6
4        3       7
5        2       4
6        3       4
7        1       3
8        3       5
9        5       6
10       2       5
#check means:
apply(oz, 2, mean)
gardenA gardenB
3 5
#check variances:
apply(oz, 2, var)
gardenA gardenB
1.333333 1.333333
[Figure: histograms of [ozone] (pphm) against Frequency for Garden A and Garden B, with mean, median and mode marked]
Degrees of freedom, t statistic
d.f. = $n_A + n_B - 2$ ????
d.f. = 10 + 10 − 2 = 18
Since we don’t know in advance which garden has the higher mean ozone concentration, this is a two-tailed test (if we did, then we’d use a one-tailed test).
Therefore we can work out the critical value of Student’s t, with α = 0.05:
qt(0.975,18)
[1] 2.100922
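The choice between one and two tails only changes which quantile is requested from `qt`. A small sketch, using the same α and degrees of freedom as the slide (the variable names are made up for illustration):

```r
alpha <- 0.05
df <- 10 + 10 - 2                        # n_A + n_B - 2

t.two.tailed <- qt(1 - alpha / 2, df)    # alpha split across both tails
t.one.tailed <- qt(1 - alpha, df)        # all of alpha in one tail

c(two.tailed = t.two.tailed, one.tailed = t.one.tailed)
```

The one-tailed critical value is smaller, which is why a one-tailed test should only be used when the direction of the difference was specified in advance.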
More data exploration
Boxplots are a great way to examine data
The notch option is handy:
?boxplot
notch: if notch is TRUE, a notch is drawn in each side of the
boxes. If the notches of two plots do not overlap this is
strong evidence that the two medians differ (Chambers _et
al_, 1983, p. 62). See boxplot.stats for the calculations
used.
attach(oz)
ozone<-c(gardenA,gardenB)
label <- factor( c(rep("A",10), rep("B", 10)) )
boxplot(ozone~label, notch=T, xlab="[ozone] (pphm)", ylab="Garden", col="cornflowerblue", horizontal = T)
[Figure: horizontal notched boxplots of [ozone] (pphm) for Garden A and Garden B]
More data exploration
[Figure: the horizontal notched boxplots of [ozone] (pphm) for Garden A and Garden B, repeated]
The notches of the two plots do not overlap
Therefore the medians are significantly different at the 5%
level
s2A <- var(gardenA)
s2B <- var(gardenB)
s2A/s2B
[1] 1
( mean(gardenA) - mean(gardenB) )/sqrt( s2A/10 + s2B/10 )
[1] -3.872983
The absolute value of the test statistic is greater than the critical
value (2.100922). Therefore we can reject the null hypothesis.
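Instead of comparing with a critical value, the hand-computed t statistic can be turned into a p value with `pt`. The sketch below types the garden A and B readings in directly so it runs on its own; the names `se.diff`, `t.stat` and `p.value` are made up for illustration.

```r
# Ozone readings (pphm) typed in from the garden A vs B table
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

nA <- length(gardenA)
nB <- length(gardenB)

# Standard error of the difference between the two means
se.diff <- sqrt(var(gardenA) / nA + var(gardenB) / nB)

t.stat <- (mean(gardenA) - mean(gardenB)) / se.diff
df <- nA + nB - 2

# Two-tailed p value: probability of |t| at least this large under the null
p.value <- 2 * pt(-abs(t.stat), df)

c(t = t.stat, df = df, p = p.value)
```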
The easy way...
t.test(gardenA, gardenB)
Welch Two Sample t-test
data: gardenA and gardenB
t = -3.873, df = 18, p-value = 0.001115
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-3.0849115 -0.9150885
sample estimates:
mean of x mean of y
3 5
You might describe this result like so:
Ozone concentration was significantly higher in garden B (mean = 5.0 pphm) than in garden A (mean = 3.0 pphm; t = 3.873, p = 0.0011 (2 tailed), d.f. = 18)
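Note that `t.test` runs Welch's version by default (`var.equal = FALSE`), which is why the output is headed "Welch Two Sample t-test". The sketch below, with the data typed in so it runs on its own, requests the classical Student's test with `var.equal = TRUE` and compares the two. Here the sample sizes and variances are equal, so both versions should agree.

```r
# Ozone readings (pphm) typed in from the garden A vs B table
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

welch   <- t.test(gardenA, gardenB)                    # default: Welch's test
student <- t.test(gardenA, gardenB, var.equal = TRUE)  # classical Student's t

# Degrees of freedom and p values from the two versions, side by side
c(welch = unname(welch$parameter), student = unname(student$parameter))
c(welch = welch$p.value, student = student$p.value)
```

When the variances or sample sizes differ, Welch's test gives fewer (often fractional) degrees of freedom, as in the unpaired stream example later on (df = 29.755).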
Wilcoxon Rank-Sum Test
A non-parametric alternative to Student’s t test
The test statistic is computed by:
1. Place both samples into a single array, with their sample
names attached (e.g. “A” and “B”)
2. Then sort the list, keeping the names attached
3. Assign a rank to each value (ties receive an averaged rank)
4. Sum the rank for each sample
5. Significance is determined by the size of the smaller sum of
ranks
[Figure: histograms of a right-skewed and a left-skewed sample, x against Freq., with mean, median and mode marked]
Wilcoxon Rank-Sum Test
What is the null for this test?
What is the minimum value for this statistic?
Hint: $\sum_i i = \frac{1}{2} n(n + 1)$
An example: back to the ozone contaminated gardens
ozone
[1] 3 4 4 3 2 3 1 3 5 2 5 5 6 7 4 4 3 5 6 5
label
[1] A A A A A A A A A A B B B B B B B B B B
Levels: A B
(combined.ranks <- rank(ozone))
[1] 6.0 10.5 10.5 6.0 2.5 6.0 1.0 6.0 15.0 2.5 15.0 15.0 18.5 20.0 10.5
[16] 10.5 6.0 15.0 18.5 15.0
#tapply: Apply a function to each cell of an array, grouped by factors!
tapply(combined.ranks, label, sum)
A B
66 144
We can look up the smaller of the two values (66) in tables of
Wilcoxon rank sums and reject the null if 66 is smaller than
the tabled value at the appropriate significance
In this case, the critical value is 78
Therefore we can again reject the null hypothesis.
The sample means are significantly different.
Critical values table
http://www.real-statistics.com/statistics-tables/wilcoxon-rank-sum-table-independent-samples/
The easy way...
wilcox.test(gardenA, gardenB)
Wilcoxon rank sum test with continuity correction
data: gardenA and gardenB
W = 11, p-value = 0.002988
alternative hypothesis: true location shift is not equal to 0
Warning message:
In wilcox.test.default(gardenA, gardenB) :
cannot compute exact p-value with ties
The function wilcox.test approximates a z value, which it
then uses to compute a p value
In this case, p = 0.002988, which is much less than 0.05, so
we can reject the null
The warning means that p values cannot be computed exactly (doesn’t matter in most cases)
Why is W = 11 & not 66? $\frac{n}{2}(n + 1)$ is subtracted in R (noted in the documentation for wilcox.test)
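The link between the rank sum (66) and R's W (11) can be checked directly. A short sketch with the garden data typed in; `rank.sum.A` and `W.by.hand` are names made up for illustration.

```r
# Ozone readings (pphm) typed in from the garden A vs B table
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

ranks <- rank(c(gardenA, gardenB))      # tied values share an averaged rank
n <- length(gardenA)
rank.sum.A <- sum(ranks[seq_len(n)])    # rank sum for garden A

# Smallest possible rank sum for n values is 1 + 2 + ... + n = n(n + 1)/2
W.by.hand <- rank.sum.A - n * (n + 1) / 2

# suppressWarnings() hides the "cannot compute exact p-value with ties" message
W.from.R <- unname(suppressWarnings(wilcox.test(gardenA, gardenB))$statistic)

c(rank.sum = rank.sum.A, by.hand = W.by.hand, wilcox.test = W.from.R)
```

Subtracting the minimum possible rank sum means W = 0 when every value in the first sample is smaller than every value in the second.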
t vs W
t.test(gardenA, gardenB)
t = -3.873, df = 18, p-value = 0.001115
wilcox.test(gardenA, gardenB)
W = 11, p-value = 0.002988
The non-parametric test is much more appropriate than the t
test when the distribution is not normal
However, the non-parametric test is about 95% as powerful
when the distribution is normal (i.e. increased chance of
falsely accepting the null)
Wilcoxon is more powerful than t in the presence of
outliers/skew
Typically, as here, the t test will give the lower p value
Wilcoxon tests are generally more conservative!
Tests on Paired Samples
Sometimes 2-sample data comes from paired observations
E.g. an individual’s behaviour in the morning vs afternoon, stream health before vs after contamination, well-being before vs after a drug treatment, ...
Recall the variance of a difference:
$$\sigma^2_{\bar{x}_A - \bar{x}_B} = \sum \left[(x_A - \bar{x}_A) - (x_B - \bar{x}_B)\right]^2 = \sum (x_A - \bar{x}_A)^2 + \sum (x_B - \bar{x}_B)^2 - 2 \sum (x_A - \bar{x}_A)(x_B - \bar{x}_B)$$
The covariance of A & B is given by the third term
If the covariance is positive then the variance of the difference
is reduced!
This can make it easier to detect significant differences
between the means!
An example
Kick samples of aquatic invertebrates from 16 rivers
1 sample upstream from a sewage outfall, 1 downstream
(streams<-read.csv("streams.csv", header=T))
   down up
1    20 23
2    15 16
3    10 10
4     5  4
5    20 22
6    15 15
7    10 12
8     5  7
9    20 21
10   15 16
11   10 11
12    5  5
13   20 22
14   15 14
15   10 10
16    5  6
attach(streams)
[Figure: histograms of # invertebrates against Frequency for the Down and Up samples, with mean, median and mode marked]
An example: t.test
If we ignore the fact that the samples are paired (p is rubbish):
t.test(down, up)
Welch Two Sample t-test
data: down and up
t = -0.40876, df = 29.755, p-value = 0.6856
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-5.248256 3.498256
sample estimates:
mean of x mean of y
12.500 13.375
An example: paired t.test
The picture changes completely if we account for the fact that
samples are paired!
The moral of the story:
if you can do a paired t test, then you should always do a
paired test!
In general, if you have information on blocking or spatial
correlation then you should incorporate this information into
your analysis
t.test(down, up, paired=T)
Paired t-test
data: down and up
t = -3.0502, df = 15, p-value = 0.0081
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-1.4864388 -0.2635612
sample estimates:
mean of the differences
-0.875
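Why the picture changes so much can be seen from the covariance term in the variance of a difference. A sketch with the stream counts typed in from the table so that it runs without `streams.csv`; the names `var.if.independent` and `var.paired` are made up for illustration.

```r
# Invertebrate counts typed in from the streams table
down <- c(20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5)
up   <- c(23, 16, 10, 4, 22, 15, 12, 7, 21, 16, 11, 5, 22, 14, 10, 6)

var.if.independent <- var(up) + var(down)   # variance sum law, no covariance term
covariance <- cov(up, down)                 # large and positive: rivers differ a lot
var.paired <- var(up - down)                # variance of the paired differences

# The covariance term accounts for the whole reduction
all.equal(var.paired, var.if.independent - 2 * covariance)

c(independent = var.if.independent, covariance = covariance, paired = var.paired)
```

Most of the variation in these data is between rivers, and pairing removes it, leaving only the variation in the upstream vs downstream difference.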
An example: t on the differences
Same result:
Halved degrees of freedom, but this is compensated for by
reducing the error variance
Blocking always helps!
t.test(up-down)
One Sample t-test
data: up - down
t = 3.0502, df = 15, p-value = 0.0081
alternative hypothesis: true mean is not equal to 0
95 percent confidence interval:
0.2635612 1.4864388
sample estimates:
mean of x
0.875
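The one-sample test on the differences is simple enough to do by hand. A sketch with the stream counts typed in; `d`, `se` and `t.stat` are names made up for illustration.

```r
# Invertebrate counts typed in from the streams table
down <- c(20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5)
up   <- c(23, 16, 10, 4, 22, 15, 12, 7, 21, 16, 11, 5, 22, 14, 10, 6)

d <- up - down               # one difference per river
n <- length(d)

se <- sd(d) / sqrt(n)        # standard error of the mean difference
t.stat <- mean(d) / se       # null hypothesis: mean difference is 0
df <- n - 1                  # 16 pairs give 15 degrees of freedom

p.value <- 2 * pt(-abs(t.stat), df)   # two-tailed

c(mean.diff = mean(d), t = t.stat, df = df, p = p.value)
```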
Pop test: Q1
Given:
$$s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$$
Show that:
$$s^2 = \frac{1}{n - 1} \times \left[\sum x^2 - \frac{\left(\sum x\right)^2}{n}\right]$$
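Before attempting the algebra, it can help to confirm numerically that the two forms agree. This is only a check on one data set (the garden A readings from earlier), not a proof; the names `s2.definition` and `s2.shortcut` are made up for illustration.

```r
x <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)   # garden A ozone readings (pphm)
n <- length(x)

# Definition: sum of squared deviations from the mean, over n - 1
s2.definition <- sum((x - mean(x))^2) / (n - 1)

# Shortcut: needs only the sum of x and the sum of x squared
s2.shortcut <- (sum(x^2) - sum(x)^2 / n) / (n - 1)

c(definition = s2.definition, shortcut = s2.shortcut, var = var(x))
```

The shortcut form is the one that lets a variance be recovered from summary totals alone.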
Pop test: Q2
A colleague has collected some very important independent data on the effects of a drug on tumour growth.
She has computed the following statistics for you:
Control: $n_C = 10$, $\sum x_C^2 = 100{,}334$, $\sum x_C = 998$
Drugged: $n_D = 10$, $\sum x_D^2 = 68{,}811$, $\sum x_D = 823$
What is the t statistic for comparing these two samples?
Recall:
$$t = \frac{\bar{x}_A - \bar{x}_B}{SE_{\text{diff}}}$$
$$SE_{\text{diff}} = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$$
The critical value of t is 1.73 for α = 0.05 (one-tailed) and d.f. = 18. Is the result significant?
Pop test: Q3
You’re given the data for 3 independent samples. The
corresponding histograms are shown below.
1. How would you compare sample 1 & 2?
2. How would you compare sample 2 & 3?
Justify your answers.
[Figure: histograms of Sample 1, Sample 2 and Sample 3, x against Freq., with mean, median and mode marked]
Further reading
Chapter 6 of Crawley (2015) Statistics: An introduction using R.
Excluding binomial, $\chi^2$, contingency tables & Fisher’s exact tests
The End
Paul Gardner BIOL209: Two Samples

More Related Content

PPT
Test of significance (t-test, proportion test, chi-square test)
PDF
Overview of statistics: Statistical testing (Part I)
PPT
T test statistic
PDF
t-tests in R - Lab slides for UGA course FANR 6750
PPTX
tests of significance
PPTX
Tests of Significance.pptx powerpoint presentation
PPTX
Test of significance
PPTX
BIOSTATISTICS T TEST Z TEST F TEST HYPOTHESIS TYPES OF ERROR.pptx
Test of significance (t-test, proportion test, chi-square test)
Overview of statistics: Statistical testing (Part I)
T test statistic
t-tests in R - Lab slides for UGA course FANR 6750
tests of significance
Tests of Significance.pptx powerpoint presentation
Test of significance
BIOSTATISTICS T TEST Z TEST F TEST HYPOTHESIS TYPES OF ERROR.pptx

Similar to Analysis of two samples (20)

PPTX
Student’s t test
PPTX
Summary of statistical tools used in spss
PPTX
Non-parametric.pptx qualitative and quantity data
PPTX
BERD R Short Course (Topic 4).pptxBERD R Short Course (Topic 4).pptx
PPT
6Tests of significance Parametric and Non Parametric tests.ppt
PPTX
Statistical Significance Tests.pptx
PPTX
Stats test
PPTX
Test of significance
PPTX
Stats test
PPTX
K.A.Sindhura-t,z,f tests
PPTX
Lecture 3 about it governanace and how it works
PPT
1 ANOVA.ppt
PPTX
Two variances or standard deviations
PPTX
Epidemiological study design and it's significance
PPTX
All non parametric test
PPTX
All non parametric test
PPTX
non-parametric tests - Dr Smijal Gopalan Marath - Specialist Periodontist - ...
PPTX
Two Variances or Standard Deviations
PPT
parametric hypothesis testing using MATLAB
PPT
section11_Nonparametric.ppt
Student’s t test
Summary of statistical tools used in spss
Non-parametric.pptx qualitative and quantity data
BERD R Short Course (Topic 4).pptxBERD R Short Course (Topic 4).pptx
6Tests of significance Parametric and Non Parametric tests.ppt
Statistical Significance Tests.pptx
Stats test
Test of significance
Stats test
K.A.Sindhura-t,z,f tests
Lecture 3 about it governanace and how it works
1 ANOVA.ppt
Two variances or standard deviations
Epidemiological study design and it's significance
All non parametric test
All non parametric test
non-parametric tests - Dr Smijal Gopalan Marath - Specialist Periodontist - ...
Two Variances or Standard Deviations
parametric hypothesis testing using MATLAB
section11_Nonparametric.ppt
Ad

More from Paul Gardner (20)

PDF
ppgardner-lecture07-genome-function.pdf
PDF
ppgardner-lecture06-homologysearch.pdf
PDF
ppgardner-lecture05-alignment-comparativegenomics.pdf
PDF
ppgardner-lecture04-annotation-comparativegenomics.pdf
PDF
ppgardner-lecture03-genomesize-complexity.pdf
PDF
Does RNA avoidance dictate protein expression level?
PDF
Machine learning methods
PDF
Clustering
PDF
Monte Carlo methods
PDF
The jackknife and bootstrap
PDF
Contingency tables
PDF
Regression (II)
PDF
Regression (I)
PDF
Analysis of covariation and correlation
PDF
Analysis of single samples
PDF
Centrality and spread
PDF
Fundamentals of statistical analysis
PDF
Random RNA interactions control protein expression in prokaryotes
PDF
Avoidance of stochastic RNA interactions can be harnessed to control protein ...
PDF
A meta-analysis of computational biology benchmarks reveals predictors of pro...
ppgardner-lecture07-genome-function.pdf
ppgardner-lecture06-homologysearch.pdf
ppgardner-lecture05-alignment-comparativegenomics.pdf
ppgardner-lecture04-annotation-comparativegenomics.pdf
ppgardner-lecture03-genomesize-complexity.pdf
Does RNA avoidance dictate protein expression level?
Machine learning methods
Clustering
Monte Carlo methods
The jackknife and bootstrap
Contingency tables
Regression (II)
Regression (I)
Analysis of covariation and correlation
Analysis of single samples
Centrality and spread
Fundamentals of statistical analysis
Random RNA interactions control protein expression in prokaryotes
Avoidance of stochastic RNA interactions can be harnessed to control protein ...
A meta-analysis of computational biology benchmarks reveals predictors of pro...
Ad

Recently uploaded (20)

PPTX
Derivatives of integument scales, beaks, horns,.pptx
PPT
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
PPTX
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
PPTX
7. General Toxicologyfor clinical phrmacy.pptx
PDF
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
PDF
The scientific heritage No 166 (166) (2025)
PDF
Placing the Near-Earth Object Impact Probability in Context
PPTX
BIOMOLECULES PPT........................
PPTX
The KM-GBF monitoring framework – status & key messages.pptx
PPTX
Introduction to Cardiovascular system_structure and functions-1
PDF
bbec55_b34400a7914c42429908233dbd381773.pdf
PDF
Sciences of Europe No 170 (2025)
PDF
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
PDF
AlphaEarth Foundations and the Satellite Embedding dataset
PPTX
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
PPTX
GEN. BIO 1 - CELL TYPES & CELL MODIFICATIONS
PDF
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
PPTX
famous lake in india and its disturibution and importance
PDF
An interstellar mission to test astrophysical black holes
PPTX
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...
Derivatives of integument scales, beaks, horns,.pptx
The World of Physical Science, • Labs: Safety Simulation, Measurement Practice
ANEMIA WITH LEUKOPENIA MDS 07_25.pptx htggtftgt fredrctvg
7. General Toxicologyfor clinical phrmacy.pptx
Mastering Bioreactors and Media Sterilization: A Complete Guide to Sterile Fe...
The scientific heritage No 166 (166) (2025)
Placing the Near-Earth Object Impact Probability in Context
BIOMOLECULES PPT........................
The KM-GBF monitoring framework – status & key messages.pptx
Introduction to Cardiovascular system_structure and functions-1
bbec55_b34400a7914c42429908233dbd381773.pdf
Sciences of Europe No 170 (2025)
CAPERS-LRD-z9:AGas-enshroudedLittleRedDotHostingaBroad-lineActive GalacticNuc...
AlphaEarth Foundations and the Satellite Embedding dataset
G5Q1W8 PPT SCIENCE.pptx 2025-2026 GRADE 5
GEN. BIO 1 - CELL TYPES & CELL MODIFICATIONS
VARICELLA VACCINATION: A POTENTIAL STRATEGY FOR PREVENTING MULTIPLE SCLEROSIS
famous lake in india and its disturibution and importance
An interstellar mission to test astrophysical black holes
Protein & Amino Acid Structures Levels of protein structure (primary, seconda...

Analysis of two samples

  • 1. BIOL209: Two Samples Paul Gardner April 3, 2017 Paul Gardner BIOL209: Two Samples
  • 2. Two samples Some of the most frequently used (and simplest) models include: Covered in BIOL209: comparing 2 variances (Fisher’s F test: var.test) comparing 2 sample means with normal distribution (Student’s t test: t.test) comparing 2 means with non-normal distribution (Wilcoxon’s test: wilcox.test) comparing 2 proportions (the binomial test: prop.test) NB. I may not cover this... comparing 2 variables (“Pearson’s” or “Spearman’s rank” correlation: cor.test) Covered in BIOL309: testing for independence in contingency tables using χ2 (chisq.test) testing small samples for correlation with Fisher’s exact test(chisq.test) Paul Gardner BIOL209: Two Samples
  • 3. R. A. Fisher Who is this Fisher? R. A. Fisher (1890-1962) was a statistician and geneticist who contributed to mathematics and genetic measures of evolutionary selection Developed analysis of variance (ANOVA) – See Distinguished Professor David Schiel’s lectures Fisher’s F test Fisher’s exact test Fisher’s method for combining P-values and lots more... https://en.wikipedia.org/wiki/Ronald Fisher Paul Gardner BIOL209: Two Samples
  • 4. Fisher’s F test Comparing 2 variances (s2 1 & s2 2 ,) Sometimes we are interested in comparing the variances of two samples This can happen when, for example, we want to compare the results of a treatment group and a control If s2 1 ≥ s2 2 , then: F = s2 1 s2 2 Broad x Freq. −15 −10 −5 0 5 10 15 0 20 40 60 80 100 120 Narrow x Freq. −15 −10 −5 0 5 10 15 0 100 200 300 400 500 Paul Gardner BIOL209: Two Samples
  • 5. Fisher’s F test F = s2 1 s2 2 , s2 1 > s2 2 F ≥ 1 What is the null (H0) for this test? How will we know if there is a significant difference between the variances? I.e. what is the critical value of F? What are the assumptions of this test? normal & independent David Schiel will talk a LOT more about this test for ANOVAs Paul Gardner BIOL209: Two Samples
  • 6. An example Ozone concentrations were collected in 2 market gardens gardenB gardenC 1 5 3 2 5 3 3 6 2 4 7 1 5 4 10 6 4 4 7 3 3 8 5 11 9 6 3 10 5 10 Garden B [ozone] (pphm) Frequency 0 2 4 6 8 10 12 01234 Garden C [ozone] (pphm) Frequency 0 2 4 6 8 10 12 01234 mean median mode oz<-read.csv("f.test.data.csv", header=T) par(mfrow=c(1,2), cex=2.5) hist(oz$gardenB, col="cornflowerblue", main="Garden B", xlab="[ozone] (pphm)") hist(oz$gardenC, col="salmon", main="Garden C", xlab="[ozone] (pphm)") Paul Gardner BIOL209: Two Samples
  • 7. An example Calculate F In this example, we have no idea which variance is likely to be larger (often not the case for control vs expt situations) Therefore, we use a two-tailed test (p = 1 − α 2 ) var(oz$gardenB) [1] 1.333333 var(oz$gardenC) [1] 14.22222 (F.ratio <- var(oz$gardenC)/var(oz$gardenB)) [1] 10.66667 #we double the probablity for the two-tailed test: 2*(1-pf(F.ratio, 9, 9)) [1] 0.001624199 The probability of obtaining a F ratio as large as this, or larger, if the variances were the same (i.e. by chance), is less than 0.002. NB. This is not the probability that the null is true! The null is assumed to be true in carrying out the test (same with the other statistical tests). Paul Gardner BIOL209: Two Samples
  • 8. RECALL: p values and significance p is the probability that a test statistic could have occured by chance when the null hypothesis is true say this 50 times, write on the bathroom wall, remember it! a result is said to be significant (or unlikely to be due to chance) if p is low ≤ 5% (or α ≤ 0.05) the significance threshold may change depending upon the field e.g. p ≤ 3x10−7 is often used in high-energy physics, p ≤ 10−10 is often used in genomics Paul Gardner BIOL209: Two Samples
  • 9. An example There is a faster way of running F tests: var.test(oz$gardenC, oz$gardenB, alternative = "two.sided") F test to compare two variances data: oz$gardenC and oz$gardenB F = 10.667, num df = 9, denom df = 9, p-value = 0.001624 alternative hypothesis: true ratio of variances is not equal to 1 95 percent confidence interval: 2.649449 42.943938 sample estimates: ratio of variances 10.66667 NB. shouldn’t use Student’s t test on this data as the variance are significantly different and the means are the same (5 pphm) Paul Gardner BIOL209: Two Samples
  • 10. Comparing two means We now know how to test if our variances are significantly different, Suppose we wish to see if our means of our samples are significantly different As usual, we want to: 1. Compute a test statistic 2. How likely is our test statistic if our null hypothesis is true? 3. Compare the statistic to a critical value. R is particularly handy, since many statistical tables and formula for many probability distributions have been built into the package What are some measures we could use to compare two means? Paul Gardner BIOL209: Two Samples
  • 11. Examples of comparing two means Genetic variation and associations with disease (or the severity) Treatment vs control groups (e.g. with/without fertiliser or pesticide or greenhouse or ...) Comparing regions, outcomes, social systems, time-periods, ... So fundamental that it’s difficult to think of a field where this wouldn’t be useful! −3 −2 −1 0 1 2 3 0.00.10.20.30.4 Probability Densities for 2 Normal Distributions x Prob. mu=0.0, sigma=1.0 mu=1.0, sigma=1.0 Paul Gardner BIOL209: Two Samples
  • 12. We’ll consider two tests (maybe three) Student’s t test for when our samples are independent, variances are constant, and the data is normally distributed (parametric) Wilcoxon rank-sum test for when our samples are independent, but the data is not necessarily normally distributed (nonparametric) Kolmogorov-Smirnov test for when our samples are independent, but the data is not necessarily normally distributed (nonparametric) Paul Gardner BIOL209: Two Samples
  • 13. Student’s t Test “Student” is a pseudonym for W. S. Gosset who first published the approach in 1908 Prevented from using his own name by his employer, the Guinness Brewing Company The test statistic is the number of standard errors by which the sample means are separated does this remind you of z scores? Paul Gardner BIOL209: Two Samples
  • 14. Student’s t Test We write: t = ¯xA − ¯xB SEdiff I hope you recall that: SE¯x = s2 n , A trick called the variance sum law, can be used to show that: SEdiff = s2 A nA + s2 B nB Paul Gardner BIOL209: Two Samples
  • 15. Estimating the variance of a difference between 2 independent samples If nA = nB: [(xA − ¯xA) − (xB − ¯xB)]2 With a little algebra we can show that: σ2 ¯xA−¯xB = σ2 A + σ2 B NB. Only true if the samples are not correlated Paul Gardner BIOL209: Two Samples
  • 16. Let’s try an example The ozone concentrations in market gardens, garden B vs C was inappropriate for Student’s t, what about A vs B? (oz<-read.csv("t.test.data.csv", header=T)) gardenA gardenB 1 3 5 2 4 5 3 4 6 4 3 7 5 2 4 6 3 4 7 1 3 8 3 5 9 5 6 10 2 5 #check means: apply(oz, 2, mean) gardenA gardenB 3 5 #check variances: apply(oz, 2, var) gardenA gardenB 1.333333 1.333333 Garden A [ozone] (pphm) Frequency 0 2 4 6 8 10 12 01234 Garden B [ozone] (pphm) Frequency 0 2 4 6 8 10 12 01234 mean median mode Paul Gardner BIOL209: Two Samples
  • 17. Degrees of freedom, t statistic d.f . = nA + nB − 2 ???? d.f . = 10 + 10 − 2 = 18 Since, we’re not testing (or don’t know) in advance which garden has the higher mean ozone concentration, this is a two-tailed test (if we did, then we’d use a one-tailed test). Therefore we can work out the critical value of Student’s t, with α = 0.05: qt(0.975,18) [1] 2.100922 Paul Gardner BIOL209: Two Samples
  • 18. More data exploration Boxplots are a great way to examine data The notch option is is handy: ?boxplot notch: if notch is TRUE, a notch is drawn in each side of the boxes. If the notches of two plots do not overlap this is strong evidence that the two medians differ (Chambers _et al_, 1983, p. 62). See boxplot.stats for the calculations used. attach(oz) ozone<-c(gardenA,gardenB) label <- factor( c(rep("A",10), rep("B", 10)) ) boxplot(ozone~label, notch=T, xlab="[ozone] (pphm)", ylab="Garden", col="cornflowerblue", horizontal = T) AB 1 2 3 4 5 6 7 [ozone] (pphm) Garden Paul Gardner BIOL209: Two Samples
  • 19. More data exploration AB 1 2 3 4 5 6 7 [ozone] (pphm) Garden The notches of the two plots do not overlap Therefore the medians are significantly different at the 5% level s2A <- var(gardenA) s2B <- var(gardenB) s2A/s2B [1] 1 ( mean(gardenA) - mean(gardenB) )/sqrt( s2A/10 + s2B/10 ) [1] -3.872983 The absolute value of the test statistic is greater than the critical value (2.100922). Therefore we can reject the null hypothesis. Paul Gardner BIOL209: Two Samples
  • 20. The easy way... t.test(gardenA, gardenB) Welch Two Sample t-test data: gardenA and gardenB t = -3.873, df = 18, p-value = 0.001115 alternative hypothesis: true difference in means is not equal to 0 95 percent confidence interval: -3.0849115 -0.9150885 sample estimates: mean of x mean of y 3 5 You might describe this result like so: Ozone concentration was significantly higher in garden B (mean = 5.0 pphm) than in garden A (mean = 3.0 pphm; t = 3.873, p = 0.0011 (2 tailed), d.f . = 18) Paul Gardner BIOL209: Two Samples
  • 21. Wilcoxon Rank-Sum Test A non-parametric alternative to Student’s t test The test statistic is computed by: 1. Place both samples into a single array, with their sample names attached (e.g. “A” and “B”) 2. Then sort the list, keeping the names attached 3. Assign a rank to each value (ties receive an averaged rank) 4. Sum the rank for each sample 5. Significance is determined by the size of the smaller sum of ranks Right skew x Freq. 0 5 10 15 20 050150 mean median mode Left skew x Freq. 0 5 10 15 20 040100 Paul Gardner BIOL209: Two Samples
  • 22. Wilcoxon Rank-Sum Test
    What is the null for this test? What is the minimum value for this statistic?
    Hint: $\sum_{i=1}^{n} i = \frac{1}{2} n(n + 1)$
  • 23. An example: back to the ozone-contaminated gardens
        ozone
         [1] 3 4 4 3 2 3 1 3 5 2 5 5 6 7 4 4 3 5 6 5
        label
         [1] A A A A A A A A A A B B B B B B B B B B
        Levels: A B
        (combined.ranks <- rank(ozone))
         [1]  6.0 10.5 10.5  6.0  2.5  6.0  1.0  6.0 15.0  2.5 15.0 15.0 18.5 20.0 10.5
        [16] 10.5  6.0 15.0 18.5 15.0
        # tapply: apply a function to each cell of an array, grouped by factors
        tapply(combined.ranks, label, sum)
          A   B
         66 144
    We can look up the smaller of the two values (66) in tables of Wilcoxon rank sums and reject the null if 66 is smaller than the tabled value at the appropriate significance level. In this case the critical value is 78, so we can again reject the null hypothesis: the two samples differ significantly in location.
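Instead of a printed table, a critical rank sum can be sketched from R's `qwilcox`, which works on the scale of the statistic reported by `wilcox.test` (rank sum minus n(n + 1)/2). This is an illustration under assumptions: the exact distribution ignores ties, which these data contain, and the names `u_crit` and `rank_sum_crit` are hypothetical. The result is expected to correspond to the tabled value used above, but it is worth checking against a table.

```r
# Ozone data as printed on this slide
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)
nA <- length(gardenA)
nB <- length(gardenB)

# Rank sum for garden A from the combined ranking
rank_sum_A <- sum(rank(c(gardenA, gardenB))[seq_len(nA)])

# qwilcox() gives the smallest value whose lower-tail probability reaches
# 0.025, so one less is the largest value that is still significant
# in a two-tailed test at alpha = 0.05
u_crit <- qwilcox(0.025, nA, nB) - 1

# Convert back to the rank-sum scale used in printed tables
rank_sum_crit <- u_crit + nA * (nA + 1) / 2

rank_sum_A <= rank_sum_crit   # TRUE means the null is rejected
```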
  • 25. The easy way...
        wilcox.test(gardenA, gardenB)
            Wilcoxon rank sum test with continuity correction
        data: gardenA and gardenB
        W = 11, p-value = 0.002988
        alternative hypothesis: true location shift is not equal to 0
        Warning message:
        In wilcox.test.default(gardenA, gardenB) : cannot compute exact p-value with ties
    The function wilcox.test approximates a z value, which it then uses to compute a p value. In this case p = 0.002988, which is much less than 0.05, so we can reject the null. The warning means that the p value cannot be computed exactly because of the tied ranks (this doesn’t matter in most cases).
    Why is W = 11 and not 66? R subtracts $\frac{n}{2}(n + 1)$ from the rank sum of the first sample, where n is the size of that sample (noted in the documentation for wilcox.test). Here 66 − (10 × 11)/2 = 11.
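The relationship between the rank sum of 66 and R's W = 11 can be checked directly. A minimal sketch, assuming the garden values printed on slide 23; it also shows the equivalent pair-counting view of W (pairs where the garden A value exceeds the garden B value, with ties counted as one half).

```r
# Ozone data as printed on slide 23
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)
nA <- length(gardenA)

# Rank sum of the first sample, minus nA * (nA + 1) / 2
rank_sum_A <- sum(rank(c(gardenA, gardenB))[seq_len(nA)])
W_from_ranks <- rank_sum_A - nA * (nA + 1) / 2

# Same quantity by counting pairs: A > B scores 1, a tie scores 0.5
W_from_pairs <- sum(outer(gardenA, gardenB, ">")) +
  0.5 * sum(outer(gardenA, gardenB, "=="))

# Value reported by wilcox.test (the tie warning is suppressed here)
W_reported <- unname(suppressWarnings(wilcox.test(gardenA, gardenB))$statistic)

c(W_from_ranks, W_from_pairs, W_reported)
```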
  • 26. t vs W
        t.test(gardenA, gardenB)
        t = -3.873, df = 18, p-value = 0.001115
        wilcox.test(gardenA, gardenB)
        W = 11, p-value = 0.002988
    The non-parametric test is much more appropriate than the t test when the distribution is not normal. However, the non-parametric test is only about 95% as powerful as the t test when the distribution is normal (i.e. there is an increased chance of falsely accepting the null). Wilcoxon is more powerful than t in the presence of outliers/skew. Typically, as here, the t test will give the lower p value: Wilcoxon tests are generally more conservative!
  • 27. Tests on Paired Samples
    Sometimes 2-sample data come from paired observations, e.g. an individual’s behaviour in the morning vs afternoon, stream health before vs after contamination, well-being before vs after a drug treatment, ...
    Recall the variance of a difference:
        $\sigma^2_{\bar{x}_A - \bar{x}_B} \propto \sum [(x_A - \bar{x}_A) - (x_B - \bar{x}_B)]^2 = \sum (x_A - \bar{x}_A)^2 + \sum (x_B - \bar{x}_B)^2 - 2 \sum (x_A - \bar{x}_A)(x_B - \bar{x}_B)$
    The covariance of A & B is given by the third term. If the covariance is positive then the variance of the difference is reduced! This can make it easier to detect significant differences between the means!
  • 28. An example
    Kick samples of aquatic invertebrates from 16 rivers: 1 sample upstream from a sewage outfall, 1 downstream.
        (streams <- read.csv("streams.csv", header = TRUE))
           down up
        1    20 23
        2    15 16
        3    10 10
        4     5  4
        5    20 22
        6    15 15
        7    10 12
        8     5  7
        9    20 21
        10   15 16
        11   10 11
        12    5  5
        13   20 22
        14   15 14
        15   10 10
        16    5  6
        attach(streams)
    [Figure: histograms of the number of invertebrates (vs. frequency) in the Down and Up samples, with mean, median and mode marked]
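The variance-of-a-difference identity recalled on slide 27 can be checked on these data. The following sketch types the `down` and `up` columns in directly from the table above rather than reading `streams.csv`; `var_diff` and `var_from_terms` are illustrative names.

```r
# Invertebrate counts copied from the table on this slide
down <- c(20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5)
up   <- c(23, 16, 10, 4, 22, 15, 12, 7, 21, 16, 11, 5, 22, 14, 10, 6)

# Variance of the paired differences, computed directly
var_diff <- var(down - up)

# The same quantity built from the three terms of the identity:
# var(A) + var(B) - 2 * cov(A, B)
var_from_terms <- var(down) + var(up) - 2 * cov(down, up)

all.equal(var_diff, var_from_terms)

# A positive covariance is what shrinks the variance of the difference
cov(down, up) > 0
var_diff < var(down) + var(up)
```

Because upstream and downstream counts from the same river rise and fall together, the covariance term removes most of the between-river variation.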
  • 29. An example: t.test
    If we ignore the fact that the samples are paired (p is rubbish):
        t.test(down, up)
            Welch Two Sample t-test
        data: down and up
        t = -0.40876, df = 29.755, p-value = 0.6856
        alternative hypothesis: true difference in means is not equal to 0
        95 percent confidence interval:
         -5.248256 3.498256
        sample estimates:
        mean of x mean of y
           12.500    13.375
  • 30. An example: paired t.test
    The picture changes completely if we account for the fact that the samples are paired! The moral of the story: if you can do a paired t test, then you should always do a paired test! In general, if you have information on blocking or spatial correlation then you should incorporate this information into your analysis.
        t.test(down, up, paired = TRUE)
            Paired t-test
        data: down and up
        t = -3.0502, df = 15, p-value = 0.0081
        alternative hypothesis: true difference in means is not equal to 0
        95 percent confidence interval:
         -1.4864388 -0.2635612
        sample estimates:
        mean of the differences
                         -0.875
  • 31. An example: t on the differences
    Same result. The degrees of freedom are halved, but this is compensated for by the reduced error variance. Blocking always helps!
        t.test(up - down)
            One Sample t-test
        data: up - down
        t = 3.0502, df = 15, p-value = 0.0081
        alternative hypothesis: true mean is not equal to 0
        95 percent confidence interval:
         0.2635612 1.4864388
        sample estimates:
        mean of x
            0.875
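The equivalence of the paired test and the one-sample test on the differences can be confirmed by computing the statistic by hand. A sketch assuming the `down` and `up` values from the table on slide 28; `t_manual` is an illustrative name.

```r
# Invertebrate counts copied from the table on slide 28
down <- c(20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5, 20, 15, 10, 5)
up   <- c(23, 16, 10, 4, 22, 15, 12, 7, 21, 16, 11, 5, 22, 14, 10, 6)

# Paired differences, one per river
d <- up - down

# One-sample t statistic on the differences: mean / standard error
t_manual <- mean(d) / (sd(d) / sqrt(length(d)))

paired     <- t.test(down, up, paired = TRUE)
one_sample <- t.test(d)

# The two tests agree up to the sign of t (down - up vs up - down)
c(manual = t_manual,
  paired = unname(paired$statistic),
  one_sample = unname(one_sample$statistic))
c(paired = paired$p.value, one_sample = one_sample$p.value)
```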
  • 32. Pop test: Q1
    Given: $s^2 = \frac{\sum (x - \bar{x})^2}{n - 1}$
    Show that: $s^2 = \frac{1}{n - 1} \times \left[ \sum x^2 - \frac{(\sum x)^2}{n} \right]$
  • 33. Pop test: Q2
    A colleague has collected some very important independent data on the effects of a drug on tumour growth. She has computed the following statistics for you:
        Control: $n_C = 10$, $\sum x_C^2 = 100{,}334$, $\sum x_C = 998$
        Drugged: $n_D = 10$, $\sum x_D^2 = 68{,}811$, $\sum x_D = 823$
    What is the t statistic for comparing these two samples? Recall:
        $t = \frac{\bar{x}_A - \bar{x}_B}{SE_{diff}}$, where $SE_{diff} = \sqrt{\frac{s_A^2}{n_A} + \frac{s_B^2}{n_B}}$
    The critical value of t is 1.73 for α = 0.05 (one-tailed) and d.f. = 18. Is the result significant?
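The same kind of calculation, working only from n, Σx and Σx², can be sketched in R. To avoid giving away the answer, the example below uses the ozone garden data from slide 23 rather than the tumour data. The helpers `var_from_sums` and `t_from_sums` are hypothetical names introduced for this illustration and rely on the identity from Q1.

```r
# Hypothetical helper: sample variance from n, sum(x) and sum(x^2)
var_from_sums <- function(n, sum_x, sum_x2) {
  (sum_x2 - sum_x^2 / n) / (n - 1)
}

# Hypothetical helper: two-sample t statistic from the same summaries
t_from_sums <- function(nA, sum_xA, sum_x2A, nB, sum_xB, sum_x2B) {
  s2A <- var_from_sums(nA, sum_xA, sum_x2A)
  s2B <- var_from_sums(nB, sum_xB, sum_x2B)
  se_diff <- sqrt(s2A / nA + s2B / nB)
  (sum_xA / nA - sum_xB / nB) / se_diff
}

# Ozone data as printed on slide 23, reduced to summary statistics
gardenA <- c(3, 4, 4, 3, 2, 3, 1, 3, 5, 2)
gardenB <- c(5, 5, 6, 7, 4, 4, 3, 5, 6, 5)

t_ozone <- t_from_sums(length(gardenA), sum(gardenA), sum(gardenA^2),
                       length(gardenB), sum(gardenB), sum(gardenB^2))
t_ozone   # matches the t statistic computed on slide 19
```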
  • 34. Pop test: Q3
    You’re given the data for 3 independent samples. The corresponding histograms are shown below.
        1. How would you compare samples 1 & 2?
        2. How would you compare samples 2 & 3?
    Justify your answers.
    [Figure: histograms (x vs. frequency) of Sample 1, Sample 2 and Sample 3, with mean, median and mode marked]
  • 35. Further reading
    Chapter 6 of Crawley (2015) Statistics: An introduction using R, excluding the binomial, χ2, contingency table & Fisher’s exact tests.
  • 36. The End