Hazilah Mohd Amin Analysis of Variance (ANOVA)
Goals. After completing this material, you should be able to: recognize situations in which to use analysis of variance (ANOVA); perform a single-factor hypothesis test for comparing more than two means and interpret the results.
The F-Distribution. Analysis-of-variance procedures rely on the F-distribution. There are infinitely many F-distributions, and we identify an F-distribution (and its F-curve) by its degrees of freedom. An F-distribution has two numbers of degrees of freedom: one for the numerator and one for the denominator.
Key Fact: F-distribution curve (figure).
Find Critical Value: Example. Find the F value for 8 df for the numerator, 14 df for the denominator, and 0.05 area in the right tail of the F-distribution curve. Critical value: $F_{\alpha,\,\text{df numerator},\,\text{df denominator}} = F_{0.05,\,8,\,14} = ?$
Table 12.1 (p. 534). Critical value: $F_{0.05,\,8,\,14} = 2.70$
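The table lookup can be cross-checked numerically. The sketch below is only an illustration, not part of the original slides: it integrates the F density with the midpoint rule and finds the critical value by bisection, using the Python standard library alone. The function names `f_pdf`, `f_cdf` and `f_critical` are hypothetical, and the approach assumes a numerator df of at least 2 and a critical value below 100.

```python
import math


def f_pdf(x, d1, d2):
    """Density of the F-distribution with (d1, d2) degrees of freedom, for x > 0."""
    log_beta = (math.lgamma(d1 / 2) + math.lgamma(d2 / 2)
                - math.lgamma((d1 + d2) / 2))
    log_pdf = ((d1 / 2) * math.log(d1 / d2)
               + (d1 / 2 - 1) * math.log(x)
               - ((d1 + d2) / 2) * math.log(1 + d1 * x / d2)
               - log_beta)
    return math.exp(log_pdf)


def f_cdf(x, d1, d2, steps=4000):
    """Area under the F-curve from 0 to x, approximated by the midpoint rule."""
    h = x / steps
    return h * sum(f_pdf((i + 0.5) * h, d1, d2) for i in range(steps))


def f_critical(alpha, d1, d2):
    """Value whose right-tail area is alpha, found by bisection on the CDF."""
    lo, hi = 0.0, 100.0  # assumption: the critical value lies below 100
    for _ in range(50):
        mid = (lo + hi) / 2
        if f_cdf(mid, d1, d2) < 1 - alpha:
            lo = mid  # too little area to the left: move right
        else:
            hi = mid
    return (lo + hi) / 2


crit = f_critical(0.05, 8, 14)
print(round(crit, 2))  # compare with the table value 2.70
```

In practice a statistics library would do this lookup directly; the hand-rolled version is used here only to keep the example free of dependencies.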
Hypotheses of One-Way ANOVA
H0: All population means are equal, i.e., no treatment effect (no variation in means among groups).
HA: At least one population mean is different, i.e., there is a treatment effect. This does not mean that all population means are different (some pairs may be the same).
The analysis of variance is a procedure that tests whether differences exist between two or more population means.
One-Factor ANOVA  All Means are the same: The Null Hypothesis is True  (No Treatment Effect)
One-Factor ANOVA. At least one mean is different: the null hypothesis is NOT true (a treatment effect is present).
One-Way Analysis of Variance
 
 
One-Factor ANOVA F Test: Example 1
You want to see if three different golf clubs yield different distances. You randomly select five measurements from trials on an automated driving machine for each club. At the .05 significance level, is there a difference in mean distance?
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
One-Way Analysis of Variance: Solution of Example 1. The data are interval. The problem objective is to compare the mean distances of three types of golf club. We hypothesize that the three population means are equal.
Solution: Defining the Hypotheses. $H_0: \mu_1 = \mu_2 = \mu_3$; $H_1$: At least two means differ.
Notation. Independent samples are drawn from k populations (treatments). Sample j (j = 1, …, k) consists of the observations $x_{1j}, x_{2j}, \ldots, x_{n_j j}$, with sample size $n_j$ and sample mean $\bar{x}_j$. X is the "response variable"; its values are called "responses".
Terminology. In the context of this problem: Response variable: distance. Experimental unit: the golf club for which we record distance figures. Factor or treatment: the criterion by which we classify the populations (the treatments). In this problem the factor is the type of golf club.
The rationale of the name Analysis of Variance (ANOVA). We are testing the difference between means, so why "analysis of variance"? Two types of variability are employed when testing for the equality of the population means: within samples and between samples.
One-Way Analysis of Variance. Graphical demonstration: employing two types of variability, within samples and between samples.
Figure: two plots of Treatment 1, Treatment 2 and Treatment 3 with the same sample means. A small variability within the samples makes it easier to draw a conclusion about the population means. When the sample means are the same as before but the within-sample variability is larger, it is harder to draw a conclusion about the population means.
One-Factor ANOVA Example: Scatter Diagram. Figure: scatter diagram of distance (vertical axis, 190 to 270) against club (1, 2, 3) for the Example 1 data. From the scatter diagram, we can clearly see the difference between the sample means because the within-sample variability is small.
Test Statistic (F), Critical Value & Rejection Criterion
The hypothesis test: $H_0: \mu_1 = \mu_2 = \dots = \mu_k$; $H_A$: At least two population means are different.
Test statistic: $F = \dfrac{MSB}{MSW}$, where MSB is the mean square between samples and MSW is the mean square within samples.
Degrees of freedom: $df_1 = k - 1$ (k = number of levels or treatments); $df_2 = n - k$ (n = sum of sample sizes from all populations).
Rejection region: $F > F_{\alpha,\,k-1,\,n-k}$
One-Factor ANOVA Example: Computations
Club 1   Club 2   Club 3
254      234      200
263      218      222
241      235      197
237      227      206
251      216      204
Sample means: $\bar{x}_1 = 249.2$, $\bar{x}_2 = 226.0$, $\bar{x}_3 = 205.8$; grand mean $\bar{\bar{x}} = 227.0$
$n_1 = 5$, $n_2 = 5$, $n_3 = 5$, $n = 15$, $k = 3$
SSB = 4716.4, SSW = 1119.6
MSB = 4716.4 / (3 − 1) = 2358.2
MSW = 1119.6 / (15 − 3) = 93.3
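The computations on this slide can be reproduced step by step. The following is a minimal sketch in plain Python using the golf-club data from Example 1; the variable names are illustrative and not taken from the slides.

```python
# Distances for each club, taken from the Example 1 table
clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

k = len(clubs)                                   # number of treatments
n = sum(len(v) for v in clubs.values())          # total number of observations
grand_mean = sum(sum(v) for v in clubs.values()) / n
means = {name: sum(v) / len(v) for name, v in clubs.items()}

# Between-samples sum of squares: spread of the sample means around the grand mean
ssb = sum(len(v) * (means[name] - grand_mean) ** 2 for name, v in clubs.items())
# Within-samples sum of squares: spread of each observation around its own sample mean
ssw = sum((x - means[name]) ** 2 for name, v in clubs.items() for x in v)

msb = ssb / (k - 1)
msw = ssw / (n - k)
f_stat = msb / msw

print(means, grand_mean)
print(round(ssb, 1), round(ssw, 1), round(msb, 1), round(msw, 1), round(f_stat, 3))
```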
One-Factor ANOVA Example: Solution
$H_0: \mu_1 = \mu_2 = \mu_3$
$H_A$: $\mu_i$ not all equal
$\alpha = .05$; $df_1 = k - 1 = 3 - 1 = 2$; $df_2 = n - k = 15 - 3 = 12$
Critical value: $F_{\alpha,\,k-1,\,n-k} = F_{.05,\,2,\,12} = 3.885$
Test statistic: $F = MSB / MSW = 2358.2 / 93.3 = 25.275$
Decision: The test statistic F is greater than the critical value, so reject $H_0$ at $\alpha = 0.05$.
Conclusion: There is evidence that at least one $\mu_i$ differs from the rest.
ANOVA Single Factor: Excel Output
EXCEL: Tools | Data Analysis | ANOVA: Single Factor
SUMMARY
Groups   Count   Sum    Average   Variance
Club 1   5       1246   249.2     108.2
Club 2   5       1130   226       77.5
Club 3   5       1029   205.8     94.2
ANOVA
Source of Variation   SS       df   MS       F        P-value    F crit
Between Groups        4716.4   2    2358.2   25.275   4.99E-05   3.885
Within Groups         1119.6   12   93.3
Total                 5836.0   14
F crit: $F_{\alpha,\,k-1,\,n-k} = F_{.05,\,2,\,12} = 3.885$
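For readers without Excel, the SUMMARY block and the last two columns of the ANOVA block can be approximated in Python. This is a sketch under one stated assumption: the tail-area formula used below is a closed form that holds only when the numerator df equals 2, as it does here; for other numerator df a statistics library or an F table is needed.

```python
import statistics

clubs = {
    "Club 1": [254, 263, 241, 237, 251],
    "Club 2": [234, 218, 235, 227, 216],
    "Club 3": [200, 222, 197, 206, 204],
}

# SUMMARY block: count, sum, average and sample variance for each group
summary = {
    name: (len(v), sum(v), statistics.mean(v), statistics.variance(v))
    for name, v in clubs.items()
}
for name, row in summary.items():
    print(name, *row)

# ANOVA block: P-value and F crit for F = 25.275 with (2, 12) degrees of freedom.
# Assumption: for numerator df = 2 the right-tail area of the F-distribution is
# (1 + 2F/df2) ** (-df2/2); this shortcut does not apply to other numerator df.
f_stat, df1, df2, alpha = 25.275, 2, 12, 0.05
p_value = (1 + df1 * f_stat / df2) ** (-df2 / 2)
f_crit = (df2 / 2) * (alpha ** (-2 / df2) - 1)

print(p_value, f_crit)
```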
Rationale I: Variability Between Samples. If $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ is true, we would expect all the sample means to be close to one another. If the alternative hypothesis is true, at least some of the sample means would differ. Thus, we measure the variability between sample means (and hence MSB or MSTr).
Rationale II: Variability Within Samples. Large variability within the samples weakens the "ability" of the sample means to represent their corresponding population means. Therefore, even though sample means may markedly differ from one another, we have to consider the "within-samples variability" (and hence MSW or MSE).
Interpreting the One-Factor ANOVA F Statistic
The F statistic is the ratio of the between estimate of variance to the within estimate of variance.
The ratio can never be negative.
$df_1 = k - 1$ will typically be small; $df_2 = n - k$ will typically be large.
If $H_0: \mu_1 = \mu_2 = \dots = \mu_k$ is true, the F ratio should be close to 1 (SSB is small because the sample means differ little).
If $H_0$ is false, the F ratio will tend to be larger than 1 (SSB is large because the sample means differ a lot).
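A small simulation can make the "close to 1 under $H_0$" statement concrete. The sketch below is an illustration only: the common mean of 230 and standard deviation of 10 are arbitrary assumptions, not values from the slides. It repeatedly draws three samples of five from the same normal population and computes the F ratio each time.

```python
import random

random.seed(1)  # fixed seed so the run is repeatable


def f_ratio(groups):
    """F = MSB / MSW for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    msb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups) / (k - 1)
    msw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g) / (n - k)
    return msb / msw


def simulate_null(k=3, size=5, mu=230.0, sigma=10.0):
    """One F ratio when all k populations share the same mean (H0 true)."""
    return f_ratio([[random.gauss(mu, sigma) for _ in range(size)] for _ in range(k)])


trials = 5000
f_values = [simulate_null() for _ in range(trials)]

mean_f = sum(f_values) / trials
share_rejected = sum(f > 3.885 for f in f_values) / trials  # 3.885 = F(.05; 2, 12)

print(mean_f, share_rejected)
```

With (2, 12) degrees of freedom the theoretical mean of F under $H_0$ is $df_2 / (df_2 - 2) = 1.2$, and about 5% of the simulated ratios should exceed the critical value 3.885, which is what the significance level means.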
Example 2. A study was conducted to determine if the drying time for a certain paint is affected by the type of applicator used. The data in the table on the next screen represent the drying time (in minutes) for 3 different applicators when the paint was applied to standard wallboard. Is there any evidence to suggest the type of applicator has a significant effect on the paint drying time at the 0.05 level? Notation: the type of applicator is the factor, and each applicator is a level (treatment) of that factor; hence k = 3.
Notation Used in ANOVA
                 Factor Levels
Replication      Sample from   Sample from   Sample from   ...   Sample from
                 Level 1       Level 2       Level 3             Level k
n = 1            x_{1,1}       x_{2,1}       x_{3,1}       ...   x_{k,1}
n = 2            x_{1,2}       x_{2,2}       x_{3,2}       ...   x_{k,2}
n = 3            x_{1,3}       x_{2,3}       x_{3,3}       ...   x_{k,3}
...
Column totals    T_1           T_2           T_3           ...   T_k          T
T = grand total = sum of all x's = $\sum x = \sum T_i$
Sample Results (table of drying times by applicator, with the sample means $\bar{x}_1$, $\bar{x}_2$, $\bar{x}_3$).
Solution
Assumptions: The data (samples) were randomly collected and all observations are independent. The populations are (approximately) normally distributed. The populations have equal variances.
The null and the alternative hypotheses:
$H_0: \mu_1 = \mu_2 = \mu_3$ (the mean drying time is the same for each applicator)
$H_a$: At least one mean is different (not all drying time means are equal)
Partition of Total Variation
Total variation SST can be split into two parts: SST = SSB + SSW
Sum of Squares Total (SST) = Variation Due to Factor/Treatment (SSB) + Variation Due to Random Sampling (SSW)
SSB is commonly referred to as: Sum of Squares Between (SSB), Sum of Squares Treatment (SSTr), Sum of Squares Factor, Sum of Squares Among, Sum of Squares Explained, Among Groups Variation.
SSW is commonly referred to as: Sum of Squares Within (SSW), Sum of Squares Error (SSE), Sum of Squares Unexplained, Within Groups Variation.
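In symbols, the partition can be written as follows. This uses the standard textbook definitions rather than formulas recovered from the slides, with $x_{ij}$ the i-th observation in sample j, $\bar{x}_j$ the mean of sample j, and $\bar{\bar{x}}$ the grand mean.

```latex
\begin{align*}
SST &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( x_{ij} - \bar{\bar{x}} \right)^2 \\
SSB &= \sum_{j=1}^{k} n_j \left( \bar{x}_j - \bar{\bar{x}} \right)^2 \\
SSW &= \sum_{j=1}^{k} \sum_{i=1}^{n_j} \left( x_{ij} - \bar{x}_j \right)^2 \\
SST &= SSB + SSW
\end{align*}
```

For the golf-club example this gives 5836.0 = 4716.4 + 1119.6.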
 
Calculator: $\sum x$ and $\sum x^2$. Enter the $x_i$ data, then retrieve $\sum x$ and $\sum x^2$.
Enter statistics (SD) mode: Mode Mode 1
Clear old data: Shift Clr 1 =
Enter the $x_i$ data: 39.1 DT 39.4 DT 31.1 DT 33.7 DT 30.5 DT 34.6 DT … 29.5 DT
Find $\sum x$: Shift S-SUM 2 = 616.5
Find $\sum x^2$: Shift S-SUM 1 = 20,316.69
Variation Sums of Squares
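Once the calculator has returned $\sum x = 616.5$ and $\sum x^2 = 20{,}316.69$, the total sum of squares follows from the shortcut formula $SST = \sum x^2 - (\sum x)^2 / n$. The sketch below applies it with n = 19, the sample size used in the degrees-of-freedom calculation for this example. SSB and SSW also need the per-applicator totals, which are not reproduced in this transcript, so they are not computed here.

```python
n = 19              # total number of drying-time observations
sum_x = 616.5       # sum of all observations, from the calculator
sum_x2 = 20316.69   # sum of squared observations, from the calculator

# Shortcut formula for the total sum of squares
sst = sum_x2 - sum_x ** 2 / n

print(round(sst, 2))
```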
Mean Square. The mean square for the factor being tested and for the error is obtained by dividing the sum-of-squares value by the corresponding number of degrees of freedom.
Numerator degrees of freedom = df(factor) = k − 1 = 3 − 1 = 2
Denominator degrees of freedom = df(error) = n − k = 19 − 3 = 16
df(total) = n − 1 = 19 − 1 = 18
Calculations: MSB = SSB / df(factor) and MSW = SSW / df(error).
One-Way ANOVA Table
An ANOVA table is often used to record the sums of squares and to organize the rest of the calculations. Format for the ANOVA table:
Source of Variation   df      SS                MS                    F ratio
Between Samples       k − 1   SSB               MSB = SSB / (k − 1)   F = MSB / MSW
Within Samples        n − k   SSW               MSW = SSW / (n − k)
Total                 n − 1   SST = SSB + SSW
The sums of squares and the degrees of freedom must check:
SS(factor) + SS(error) = SS(total), or SSB + SSW = SST
df(factor) + df(error) = df(total), or df(between) + df(within) = df(total)
The Completed ANOVA Table The Complete ANOVA Table: The Test Statistic:
Solution Continued: The Results
Critical value: $F_{\alpha,\,k-1,\,n-k} = F_{.05,\,2,\,16} = 3.63$
The test statistic F = 4.27 is in the rejection region (F > 3.63).
a. Decision: Reject $H_0$ at $\alpha = 0.05$.
b. Conclusion: There is evidence to suggest the three population means are not all the same. The type of applicator has a significant effect on the paint drying time at the 0.05 level of significance.
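The decision above can be checked without an F table. This sketch relies on the same stated assumption as any shortcut for this case: with a numerator df of 2, the right-tail area of the F-distribution has the closed form $(1 + 2F/df_2)^{-df_2/2}$, which can be inverted to give the critical value. It does not generalize to other numerator df.

```python
alpha, df1, df2 = 0.05, 2, 16
f_stat = 4.27  # test statistic reported on the slide

# Closed forms valid only for numerator df = 2 (assumption stated above)
f_crit = (df2 / 2) * (alpha ** (-2 / df2) - 1)
p_value = (1 + df1 * f_stat / df2) ** (-df2 / 2)

reject_h0 = f_stat > f_crit
print(round(f_crit, 2), round(p_value, 3), reject_h0)
```

Comparing F with the critical value and comparing the p-value with $\alpha$ are two views of the same rule, so they always lead to the same decision.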
One-Way ANOVA F-Test: Exercise 1
You're a trainer for Microsoft Corp. Is there any evidence to suggest the type of training method has a significant effect on the learning time at the 0.05 level? The data in the table represent the learning times (in hours) of 12 people using 4 different training methods.
M1   M2   M3   M4
10   11   13   18
9    16   8    23
5    9    9    25
Answer: Critical value = 4.07. Test statistic = 11.6.
Hey! Let's get our hands dirty… using SPSS.
One-Way Analysis of Variance Using SPSS. Suppose we want to know whether students who have to work many hours outside school to support themselves find their grades suffering. We examine this question by comparing the GPAs of students who work various hours outside school, using the data in the Student file. File > Open > Student
One-Way Analysis of Variance Using SPSS. First examine the average GPA for each of the three work categories (0 hrs, 1-19 hrs, >20 hrs), stored in WorkCat. Graph > Boxplot, then choose Simple and click Define. Select GPA as the Variable and WorkCat for the Category Axis. Click Options.
After clicking Options…, clear the check box Display groups defined by missing values, click Continue and then OK. You'll get the box-plot shown on the slide.
What is the box-plot telling us? There is some variation across the groups: the median GPAs (the dark line in the middle of each box) differ slightly between groups. So, should we attribute the observed differences to sampling error, or do the groups genuinely differ? Neither the box-plot nor the medians offer decisive evidence. Hence we need ANOVA.
One-Way Analysis of Variance Using SPSS. We are testing: $H_0: \mu_1 = \mu_2 = \mu_3$; $H_1$: At least two means differ. Before attempting ANOVA, we need to review the ANOVA assumptions: (i) independent samples, (ii) normality, (iii) equality of variances. We can test both (ii) and (iii). Analyze > Descriptive Statistics > Explore
Analyze > Descriptive Statistics > Explore. In the Explore dialog box, select GPA as the Dependent List variable, WorkCat as the Factor List variable and Plots as the Display. Next, click Plots… Since we are interested in a normality test, select the option marked on the screenshot and deselect only the option marked for removal. Click Continue and OK. See the next slide…
The output has several parts; let us focus on the tests of normality. The Kolmogorov-Smirnov test assesses whether there is a significant departure from normality in the population distribution of each of the 3 groups. $H_0$: the distributions are normal. Look at the p-values: all are > 0.05, so do not reject $H_0$. Hence there is no evidence of non-normality.
One-Way Analysis of Variance Using SPSS. We still need to validate the homogeneity-of-variance assumption. We do this within ANOVA. Analyze > Compare Means > One-Way ANOVA. The Dependent List variable is GPA and the Factor variable is WorkCat. Click Options,
One-Way Analysis of Variance Using SPSS. Under Statistics, select Descriptive and Homogeneity of variance test. Click Continue and OK. $H_0$: the variances are equal. The One-Way ANOVA output consists of many parts. Look at the p-value of the homogeneity test: it is > 0.05. Hence do not reject $H_0$.
The normality and homogeneity-of-variance assumptions are met, so let us find out whether students who work various hours outside school differ in their GPAs. The p-value of .000 is very small, hence we reject $H_0$ and conclude that the mean GPAs are not all the same. Where are the differences? That calls for a post-hoc test…
End of ANOVA See U Later…
One-Way ANOVA F-Test: Exercise 1 Solution
You're a trainer for Microsoft Corp. Is there any evidence to suggest the type of training method has a significant effect on the learning time at the 0.05 level? The data in the table represent the learning times (in hours) of 12 people using 4 different training methods.
M1   M2   M3   M4
10   11   13   18
9    16   8    23
5    9    9    25
Summary Table: Solution
Source of Variation    Degrees of Freedom   Sum of Squares   Mean Square (Variance)   F
Treatment (Methods)    4 − 1 = 3            348              116                      11.6
Error                  12 − 4 = 8           80               10
Total                  12 − 1 = 11          428
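The summary table can be rebuilt from the raw learning times. The helper `one_way_anova` below is a hypothetical name introduced for this sketch, not something from the slides; it returns every entry of the table and lets us confirm that the sums of squares and the degrees of freedom add up.

```python
def one_way_anova(groups):
    """Return the entries of a one-way ANOVA table for a list of samples."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n
    ssb = sum(len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups)
    ssw = sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)
    sst = sum((x - grand_mean) ** 2 for g in groups for x in g)
    msb = ssb / (k - 1)
    msw = ssw / (n - k)
    return {
        "df_between": k - 1, "df_within": n - k, "df_total": n - 1,
        "SSB": ssb, "SSW": ssw, "SST": sst,
        "MSB": msb, "MSW": msw, "F": msb / msw,
    }


# Learning times in hours; each inner list is one training method (a column of the table)
methods = [
    [10, 9, 5],     # M1
    [11, 16, 9],    # M2
    [13, 8, 9],     # M3
    [18, 23, 25],   # M4
]

table = one_way_anova(methods)
for key, value in table.items():
    print(key, value)
```

Since F = 11.6 exceeds the critical value of 4.07 quoted in the exercise, the null hypothesis of equal mean learning times is rejected at the 0.05 level.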
One-Way ANOVA F-Test: Solution
$H_0: \mu_1 = \mu_2 = \mu_3 = \mu_4$
$H_a$: Not all equal
$\alpha = .05$; $\nu_1 = 3$, $\nu_2 = 8$
Critical value: 4.07
Test statistic: $F = \dfrac{MSB}{MSE} = \dfrac{116}{10} = 11.6$
Decision: Reject $H_0$ at $\alpha = .05$.
Conclusion: There is evidence that the population means are different.


Editor's Notes

  • #4: Change to page 800
  • #5: Change to page 803
  • #11: Change to page 803
  • #12: Delete slide and insert procedure 16.1 (steps 1-4) from page 813
  • #13: Delete slide and insert procedure 16.1 (steps 5-7 critical value approach) from page 813
  • #35: Change to page 803
  • #42: You randomly assign 3 people to each method, making sure that they are similar in intelligence, etc.
  • #55: You randomly assign 3 people to each method, making sure that they are similar in intelligence, etc.