ANOVA
One-way Single-Factor Models
KARAN DESAI-11BIE001 
DHRUV PATEL-11BIE024 
VISHAL DERASHRI -11BIE030 
HARDIK MEHTA-11BIE037 
MALAV BHATT-11BIE056
DEFINITION
• Analysis of variance (ANOVA) is a collection of
statistical models, and their associated procedures
(such as partitioning "variation" among and between groups),
used to analyze the differences between group means.
It was developed by R. A. Fisher. In the ANOVA setting, the
observed variance in a particular variable is partitioned
into components attributable to different sources of
variation.
Sir Ronald Aylmer Fisher FRS was an English statistician,
evolutionary biologist, geneticist, and eugenicist.
Why ANOVA?
• Do we need to compare the means of more than two
populations?
• Do we need to compare populations that each contain
several subgroups or levels?
Problem with multiple t-tests
• One alternative is to run a separate t-test on every pair
of groups. One problem with this approach is that the
number of tests grows quickly as the number of groups
increases.
• The probability of making at least one Type I error
increases as the number of tests increases.
• If the probability of a Type I error is set at 0.05 for
each test and 10 independent t-tests are done, the overall
probability of at least one Type I error for the set of
tests is 1 – (0.95)^10 = 0.40 instead of 0.05.
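The arithmetic on this slide can be checked with a few lines of Python. This is a minimal sketch that assumes the tests are independent, which is what the formula 1 – (1 – α)^m requires; the function name `familywise_error_rate` is ours and does not come from the slides.

```python
def familywise_error_rate(alpha: float, num_tests: int) -> float:
    """Probability of at least one Type I error across independent tests."""
    return 1.0 - (1.0 - alpha) ** num_tests


if __name__ == "__main__":
    # Show how quickly the overall error rate grows with the number of tests.
    for m in (1, 3, 10, 20):
        rate = familywise_error_rate(0.05, m)
        print(f"{m:2d} tests at alpha = 0.05 -> {rate:.2f}")
```

With real pairwise t-tests on the same groups the tests are not fully independent, so the figure is best read as an approximation of how fast the risk accumulates.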
• In its simplest form, ANOVA provides a statistical test of
whether or not the means of several groups are equal,
and therefore generalizes the t-test to more than two
groups. Because doing multiple two-sample t-tests would
result in an increased chance of committing a Type I
error, ANOVA is useful for comparing three or more
means (of groups or variables) for statistical
significance.
• Another way to describe the multiple comparisons
problem is to think about the meaning of an alpha
level of 0.05.
• An alpha of 0.05 implies that, when the null hypothesis
is true, we expect on average one Type I error in every
20 tests: 1/20 = 0.05.
• This means that, by chance alone, a true null hypothesis
will be incorrectly rejected about once in every 20 tests.
• As the number of tests increases, the probability
of finding a ‘significant’ result by chance
increases.
Importance of ANOVA
• ANOVA is an important test because
it enables us to see, for example, how
effective two different types of treatment
are and how durable they are.
• In effect, an ANOVA can tell us how well a
treatment works, how long it lasts, and how
budget-friendly it will be.
CLASSIFICATION OF ANOVA MODELS
1. Fixed-effects model:
The fixed-effects model of analysis of
variance applies to situations in which the experimenter
applies one or more treatments to the subjects of the
experiment to see if the response variable values
change. This allows the experimenter to estimate the
ranges of response variable values that the treatment
would generate in the population as a whole.
2. Random-effects model:
Random-effects models are used
when the treatments are not fixed. This occurs when the
various factor levels are sampled from a larger
population. Because the levels themselves are random
variables, some assumptions and the method of
contrasting the treatments (a multi-variable
generalization of simple differences) differ from the
fixed-effects model.
3. Mixed-effects model:
A mixed-effects model contains experimental
factors of both fixed-effects and random-effects types, with
appropriately different interpretations and analysis for the two types.
Example: Teaching experiments could be performed by a
university department to find a good introductory textbook, with each
text considered a treatment. The fixed-effects model would
compare a list of candidate texts. The random-effects model would
determine whether important differences exist among a list of
randomly selected texts. The mixed-effects model would
compare the (fixed) incumbent texts to randomly selected
alternatives.
ASSUMPTIONS
• The response variable is normally distributed in each population
• Variances of the dependent variable are equal in all
populations
• Random samples; independent scores
One-way Single-Factor ANOVA
ONE-WAY ANOVA
• One factor (manipulated variable)
• One response variable
• Two or more groups to compare
USEFULNESS
• Similar to the t-test
• More versatile than the t-test
• Compares one parameter (response variable)
between two or more groups
Remember that…
• Standard deviation (s):
$s = \sqrt{\dfrac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n - 1}}$
• In this case: Degrees of freedom (df)
df = Number of observations or groups – 1
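The standard deviation formula can be written out directly. The sketch below is illustrative only: the function name `sample_std` and the scores are hypothetical and do not come from the slides. It divides by n – 1 degrees of freedom, as in the formula, and cross-checks against the standard library.

```python
import math
import statistics


def sample_std(values):
    """Sample standard deviation with n - 1 degrees of freedom."""
    n = len(values)
    mean = sum(values) / n
    # Numerator of the variance: sum of squared deviations from the mean.
    squared_deviations = sum((x - mean) ** 2 for x in values)
    return math.sqrt(squared_deviations / (n - 1))


if __name__ == "__main__":
    scores = [82, 75, 91, 68, 79]  # hypothetical values, not from the slides
    print(sample_std(scores))
    print(statistics.stdev(scores))  # standard-library cross-check
```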
ANOVA
• ANOVA (ANalysis Of VAriance) is a natural extension of the t-test, used to
compare the means of more than 2 populations.
• Basic Question: Even if the true means of k populations were
equal (e.g. $\mu_1 = \mu_2 = \mu_3 = \mu_4$ for k = 4), we cannot expect the sample means
($\bar{x}_1, \bar{x}_2, \bar{x}_3, \bar{x}_4$) to be equal. So when we get different values
for the $\bar{x}$’s,
  – How much is due to randomness?
  – How much is due to the fact that we are sampling from
    different populations with possibly different $\mu_j$’s?
ANOVA TERMINOLOGY
• Response Variable (y)
  – What we are measuring
• Experimental Units
  – The individual unit that we will measure
• Factors
  – Independent variables whose values can change to affect
    the outcome of the response variable, y
• Levels of Factors
  – Values of the factors
• Treatments
  – The combination of the levels of the factors applied to an
    experimental unit
Example
We want to know how combinations of different
amounts of water (1 ac-ft, 3 ac-ft, 5 ac-ft) and
different fertilizers (A, B, C) affect crop yields.
• Response variable
  – Crop yield (bushels/acre)
• Experimental unit
  – Each acre that receives a treatment
• Factors (2)
  – Water and fertilizer
• Levels (3 for Water; 3 for Fertilizer)
  – Water: 1, 3, 5; Fertilizer: A, B, C
• Treatments (9 = 3x3)
  – 1A, 3A, 5A, 1B, 3B, 5B, 1C, 3C, 5C
Total Treatments
                 Fertilizer A   Fertilizer B   Fertilizer C
Water 1 AC-FT    Treatment 1    Treatment 2    Treatment 3
Water 3 AC-FT    Treatment 4    Treatment 5    Treatment 6
Water 5 AC-FT    Treatment 7    Treatment 8    Treatment 9
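The nine treatments are simply every pairing of a water level with a fertilizer. A minimal sketch of that enumeration follows; the names `WATER_LEVELS_AC_FT`, `FERTILIZERS`, and `enumerate_treatments` are ours, and the numbering assumes the row-by-row order of the table (water varying slowest).

```python
from itertools import product

WATER_LEVELS_AC_FT = (1, 3, 5)
FERTILIZERS = ("A", "B", "C")


def enumerate_treatments(water_levels, fertilizers):
    """Return every (water, fertilizer) combination, water varying slowest."""
    return list(product(water_levels, fertilizers))


if __name__ == "__main__":
    treatments = enumerate_treatments(WATER_LEVELS_AC_FT, FERTILIZERS)
    for number, (water, fertilizer) in enumerate(treatments, start=1):
        print(f"Treatment {number}: {water} ac-ft of water, fertilizer {fertilizer}")
```

With more factors the same call extends naturally, since the number of treatments is the product of the number of levels of each factor.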
Single Factor ANOVA
Basic Assumptions
• If we focus on only one factor (e.g. fertilizer type in the
previous example), this is called single factor ANOVA.
• In this case, levels and treatments are the same thing since
there are no combinations between factors.
• Assumptions for Single Factor ANOVA
1. Each population in the comparison has a
normal distribution
2. The standard deviations of the populations (although
unknown) are assumed to be equal (i.e. $\sigma_1 = \sigma_2 = \sigma_3 = \sigma_4$)
3. Sampling is:
  – Random
  – Independent
Example
• The university would like to know if the delivery mode of the
introductory statistics class affects performance in the
class, as measured by scores on the final exam.
• The class is given in four different formats:
  – Lecture
  – Text Reading
  – Videotape
  – Internet
• The final exam scores from random samples of students from
each of the four teaching formats were recorded.
Samples
Summary
• There is a single factor under observation – teaching format
• There are k = 4 different treatments (or levels of teaching
format)
• The numbers of observations (experimental units) are $n_1 = 7$, $n_2 = 8$,
$n_3 = 6$, $n_4 = 5$; total number of
observations, n = 26
• Treatment means: $\bar{x}_1 = 76$, $\bar{x}_2 = 65$, $\bar{x}_3 = 75$, $\bar{x}_4 = 74$
• Grand mean (of all 26 observations): $\bar{\bar{x}} = 72$
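Because the groups have different sizes, the grand mean is the size-weighted average of the treatment means rather than their plain average. The sketch below recomputes it from the summary figures; the raw scores are not reproduced in this transcript, and the function and constant names are ours.

```python
SAMPLE_SIZES = (7, 8, 6, 5)          # n1..n4 from the Summary slide
TREATMENT_MEANS = (76, 65, 75, 74)   # treatment means from the Summary slide


def grand_mean(sizes, means):
    """Weighted average of group means, weighted by group size."""
    total_observations = sum(sizes)
    total_score = sum(n * m for n, m in zip(sizes, means))
    return total_score / total_observations


if __name__ == "__main__":
    print(grand_mean(SAMPLE_SIZES, TREATMENT_MEANS))
```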
Why aren’t all the $\bar{x}$’s the same?
• There is variability due to the different treatments --
Between Treatment Variability (Treatment)
• There is variability due to randomness within each
treatment -- Within Treatment Variability (Error)
BASIC CONCEPT
If the average Between Treatment Variability is “large”
compared to the average Within Treatment Variability,
we can reasonably conclude that there really are
differences among the population means (i.e. at least
one μj differs from the others).
Basic Questions
• Given this basic concept, the natural questions are:
  – What is “variability” due to treatment and due to error,
    and how are they measured?
  – What is “average variability” due to treatment and due
    to error, and how are they measured?
  – What is “large”?
    How much larger than the observed average
    variability due to error does the observed average
    variability due to treatment have to be before we
    are convinced that there are differences in the true
    population means (the μ’s)?
How Is “Total” Variability Measured?
Variability is defined as the Sum of Squared Deviations (from the
grand mean). So,
• SST (Total Sum of Squares)
  – Sum of Squared Deviations of all observations from the
    grand mean. (McClave uses SSTotal)
• SSTr (Between Treatment Sum of Squares)
  – Sum of Squared Deviations Due to Different Treatments.
    (McClave uses SST)
• SSE (Within Treatment Sum of Squares)
  – Sum of Squared Deviations Due to Error
SST = SSTr + SSE
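The identity SST = SSTr + SSE can be verified numerically. The raw exam scores are not available in this transcript, so the sketch below uses a small hypothetical data set; the function name `sums_of_squares` is also ours.

```python
def sums_of_squares(groups):
    """Return (SST, SSTr, SSE) for a list of groups of observations."""
    all_values = [x for group in groups for x in group]
    grand_mean = sum(all_values) / len(all_values)

    # Total: every observation against the grand mean.
    sst = sum((x - grand_mean) ** 2 for x in all_values)

    sstr = 0.0
    sse = 0.0
    for group in groups:
        group_mean = sum(group) / len(group)
        # Between treatments: group mean against grand mean, once per observation.
        sstr += len(group) * (group_mean - grand_mean) ** 2
        # Within treatments: each observation against its own group mean.
        sse += sum((x - group_mean) ** 2 for x in group)
    return sst, sstr, sse


if __name__ == "__main__":
    # Hypothetical scores for three groups; not the data from the slides.
    groups = [[80, 85, 90], [70, 75, 80], [60, 65, 70]]
    sst, sstr, sse = sums_of_squares(groups)
    print(sst, sstr, sse)
    print(abs(sst - (sstr + sse)) < 1e-9)
```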
How is “Average” Variability Measured?
“Average” Variability is measured in:
Mean Square Values (MSTr and MSE)
• Found by dividing SSTr and SSE by their respective
degrees of freedom
ANOVA TABLE
Variability                SS     DF                                  Mean Square (MS)
Between Tr. (Treatment)    SSTr   DFTr = k – 1 (# treatments – 1)     MSTr = SSTr/DFTr
Within Tr. (Error)         SSE    DFE = n – k (DFT – DFTr)            MSE = SSE/DFE
TOTAL                      SST    DFT = n – 1 (# observations – 1)
Formula for Calculating SST
Calculating SST
Just like the numerator of the variance, assuming all (26)
entries come from one population:
$SST = \sum_{j} \sum_{i} (x_{ij} - \bar{\bar{x}})^2 = (82 - 72)^2 + \dots + (81 - 72)^2 = 4394$
Formula for Calculating SSTr
Calculating SSTr (Between Treatment Variability)
Replace all entries within each treatment by its
mean – now all the variability is between (not
within) treatments: seven entries of 76, eight of 65,
six of 75, and five of 74.
$SSTr = \sum_{j} n_j (\bar{x}_j - \bar{\bar{x}})^2 = 7(76 - 72)^2 + 8(65 - 72)^2 + 6(75 - 72)^2 + 5(74 - 72)^2 = 578$
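SSTr needs only the group sizes, the treatment means, and the grand mean, so it can be recomputed from the summary figures alone. A minimal sketch, with names of our own choosing:

```python
SAMPLE_SIZES = (7, 8, 6, 5)          # n_j from the Summary slide
TREATMENT_MEANS = (76, 65, 75, 74)   # treatment means from the Summary slide
GRAND_MEAN = 72


def between_treatment_ss(sizes, means, grand_mean):
    """SSTr: size-weighted squared deviations of group means from the grand mean."""
    return sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))


if __name__ == "__main__":
    print(between_treatment_ss(SAMPLE_SIZES, TREATMENT_MEANS, GRAND_MEAN))
```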
Formula for Calculating SSE
Calculating SSE (Within Treatment Variability)
The difference between SST and SSTr:
$SSE = SST - SSTr = 4394 - 578 = 3816$
Can we Conclude a Difference Among
the 4 Teaching Formats?
We conclude that at least one population mean differs
from the others if the average between treatment
variability is large compared to the average within
treatment variability, that is if MSTr/MSE is “large”.
• When the population means are all equal, the ratio of the
two measures of variability for these normally distributed
random variables has an F distribution, and the
F-statistic (= MSTr/MSE) is compared to a critical
F-value from an F distribution with:
  – Numerator degrees of freedom = DFTr
  – Denominator degrees of freedom = DFE
• If the ratio of MSTr to MSE (the F-statistic) exceeds
the critical F-value, we can conclude that at least one
population mean differs from the others.
Can We Conclude Different Teaching
Formats Affect Final Exam Scores?
The F-test
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_A$: At least one $\mu_j$ differs from the others
Select α = .05.
Reject $H_0$ (Accept $H_A$) if:
$F = \dfrac{MSTr}{MSE} > F_{\alpha, DFTr, DFE} = F_{.05, 3, 22} = 3.05$
Hand Calculations for the F-test
$MSTr = \dfrac{SSTr}{DFTr} = \dfrac{578}{3} = 192.67$
$MSE = \dfrac{SSE}{DFE} = \dfrac{3816}{22} = 173.45$
$F = \dfrac{192.67}{173.45} = 1.11$
$1.11 < 3.05$
Cannot conclude there is a difference among the μj’s
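The hand calculation can be reproduced from the sums of squares and the counts k = 4 and n = 26. In this sketch the function name `f_statistic` is ours, and the critical value 3.05 is taken from the slide rather than computed.

```python
def f_statistic(sstr, sse, num_treatments, num_observations):
    """Return (MSTr, MSE, F) for a one-way ANOVA."""
    df_treatment = num_treatments - 1               # k - 1
    df_error = num_observations - num_treatments    # n - k
    mstr = sstr / df_treatment
    mse = sse / df_error
    return mstr, mse, mstr / mse


if __name__ == "__main__":
    SST, SSTR = 4394, 578      # values from the slides
    SSE = SST - SSTR           # within-treatment sum of squares
    CRITICAL_F = 3.05          # F(.05, 3, 22) as quoted on the slide
    mstr, mse, f = f_statistic(SSTR, SSE, num_treatments=4, num_observations=26)
    print(f"MSTr = {mstr:.2f}, MSE = {mse:.2f}, F = {f:.2f}")
    print("Reject H0" if f > CRITICAL_F else "Cannot reject H0")
```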
Excel Approach
EXCEL OUTPUT
p-value = .365975 > .05
Cannot conclude differences
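The Excel output itself is not reproduced in this transcript. As a rough cross-check that needs no statistics package, the p-value can be approximated by simulation: draw all four groups from the same normal population many times and count how often the F-statistic is at least as large as the observed one. This is a Monte Carlo sketch, not the method Excel uses; the function names, the seed, and the number of simulations are our assumptions, and the result should land near the reported value only up to simulation noise.

```python
import random


def one_way_f(groups):
    """F-statistic (MSTr / MSE) for a list of groups of observations."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    means = [sum(g) / len(g) for g in groups]
    sstr = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    sse = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    return (sstr / (k - 1)) / (sse / (n - k))


def simulated_p_value(observed_f, sizes, num_simulations=20000, seed=0):
    """Share of null-hypothesis simulations whose F is at least observed_f."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(num_simulations):
        # Under H0 every group is drawn from the same normal population.
        groups = [[rng.gauss(0.0, 1.0) for _ in range(n)] for n in sizes]
        if one_way_f(groups) >= observed_f:
            exceed += 1
    return exceed / num_simulations


if __name__ == "__main__":
    observed = (578 / 3) / (3816 / 22)  # F from the hand calculation
    p = simulated_p_value(observed, sizes=(7, 8, 6, 5))
    print(f"simulated p-value: {p:.3f}")
```

The simulation uses a standard normal population because the F-statistic does not change when every observation is shifted or rescaled by the same amount.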
REVIEW
• ANOVA Situation and Terminology
  – Response variable, Experimental Units, Factors,
    Levels, Treatments, Error
• Basic Concept
  – If the “average variability” between treatments is “a
    lot” greater than the “average variability” due to error,
    conclude that at least one mean differs from the
    others.
• Single Factor Analysis
  – By Hand
  – By Excel
