2. Analysis of multi-location experiments
• The crop improvement process is lengthy and
involves several stages.
• Phenotypic vs. genotypic
• What does "environment" mean (year, location)?
– Relation to agro-ecological classification
– On-farm and on-station experiments?
• Genotype by environment interaction (GEI)? (If
GEI is significant, the G and E main effects have little value.)
4. Determinants of Multi-location Trials
• Replication within sites
– How many replications are reasonable?
– Is a single replication possible?
– What happens if some of the plots are damaged at a location (LOC)?
• Design of experiments at each site [RCBD, lattice, etc.]
– Can we combine different designs?
– Should we send the same design/randomization to collaborating
centers?
5. What will happen if?
– The design changes from LOC to LOC? (Some designs,
such as RCBD and the lattice family, may not be
combined.)
– Different numbers of REP at each LOC? [No problem for RCBD, but a small
challenge for IB (incomplete block) designs]
– Different management at each LOC? [Affects the productivity
of a site compared to the others]
– Different types of precursor crop?
6. Form of Combined Analysis
• When data are combined over locations and years,
the ANOVA structure takes different forms. There are
three possibilities:
– The same locations and randomization are used each year
– The same locations are used each year but a different
randomization is adopted
– Different locations are used each year
The structure of the ANOVA is slightly different for each of them.
7. Analysis of multi-location experiments
• Multi-location experiments are important in
agronomic research for checking the responses of
genotypes to different environments.
• They also help to estimate and predict yield accurately
and to assess stability.
• The conventional analysis of variance usually begins
with a separate analysis of each environment.
• Thus, the trials must be designed so that
sufficient information (error df) is available for
valid interpretation of the results.
8. Analysis of multiple experiments
• Results of the separate analyses are used to draw inferences about a
particular site and to obtain the information, such as
means and variances, needed for further analysis.
• Once the separate analyses are done, it is necessary to
check variance homogeneity before combining the data.
• This is because, if the error variances of the environments are not
homogeneous, the data cannot be combined as they
are.
9. Analysis of multiple experiments
• If the variances are not homogeneous, the data must first be transformed to satisfy the
assumption. The data are then combined and analyzed
on the transformed scale (Gomez and Gomez, 1984).
• E.g. suppose the yield data (kg/ha) cover 11 environments (4 locations
over the 4 years 2004 to 2007), with 4 replications and
16 genotypes.
• Homogeneity is checked (e.g. with Bartlett's test) before the combined
analysis. The error mean squares (MSE) range from 147390.36 for environment 9 in
year 1 to 558294.26 for environment 2
in year 2. The ratio of the largest MSE to the smallest is reported as
2.8, i.e. < 3, so the combined analysis can proceed.
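The homogeneity check described above can be sketched as follows. This is a minimal illustration, not the procedure used for the trial: the error mean squares and degrees of freedom below are hypothetical, and the function names are chosen here for illustration. It computes the largest-to-smallest MSE ratio and Bartlett's chi-square statistic from the per-site error mean squares.

```python
import math

def fmax_ratio(error_mean_squares):
    """Ratio of the largest to the smallest error mean square."""
    return max(error_mean_squares) / min(error_mean_squares)

def bartlett_statistic(variances, dfs):
    """Bartlett's chi-square statistic for k variances with their error df."""
    k = len(variances)
    total_df = sum(dfs)
    # Pooled variance: df-weighted average of the individual variances
    pooled = sum(df * v for df, v in zip(dfs, variances)) / total_df
    numerator = total_df * math.log(pooled) - sum(
        df * math.log(v) for df, v in zip(dfs, variances))
    correction = 1 + (sum(1 / df for df in dfs) - 1 / total_df) / (3 * (k - 1))
    return numerator / correction

# Hypothetical error mean squares from four separate site analyses
mse = [210000.0, 150000.0, 330000.0, 280000.0]
# Error df per site for an RCBD with 16 genotypes and 4 replications: (16-1)*(4-1)
dfs = [45, 45, 45, 45]

ratio = fmax_ratio(mse)
chi2 = bartlett_statistic(mse, dfs)
print("largest/smallest MSE:", round(ratio, 2))
print("Bartlett chi-square:", round(chi2, 2))
```

The Bartlett statistic is compared with a chi-square value on k - 1 degrees of freedom, where k is the number of environments; the ratio is compared with the rule-of-thumb limit of 3 quoted above.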
10. Table. Combined analysis of variance for grain yield (kg/ha) of
sixteen barley genotypes and checks evaluated in the food barley
regional variety trial (early set), 2004-2007
(Sinana, Robe, Adaba and Agarfa).
Source of variation              DF   Anova SS        Mean Square
Replication                       3     2648607.59       882869.19**
Year                              3   369334216.52    123111405.50**
Environment                       3   150626256.98     50208752.32**
Genotype (G)                     15    31847374.07      2123158.27**
Genotype x environment           45    29472311.54       654940.25**
Genotype x year                  45    38871603.98       863813.42**
Year x environment                4    93664439.48     23416109.87**
Genotype x environment x year    60     5938387.81        98973.13ns
Error                           525   122948150.90       234186.95
Total                           703   848716162.93
** = significant at p<0.01; ns = not significant. DF values are those implied by Mean Square = SS/DF and match Table 2.
12. Interpretation of conventional ANOVA
• GEI non-significant (note: GEI could exist but be attributed to
random variation)
– This means either the MS for GEI is small or the
residual mean square (RMS) is high
– In the latter case, two scenarios:
• Presence of damaged plots or outliers
• Variance heterogeneity
13. Conclusions
• The year, environment and genotype effects were highly
significant (p<0.01).
• The genotype x environment and genotype x year
interactions were also highly significant (p<0.01).
The performance of the genotypes differed across
environments and across years. Hence, different
genotypes yielded best in different environments
and years, which makes it difficult to identify stable genotypes.
• However, the genotype x environment x year interaction was
not significant (p>0.05). The interaction of
genotypes with environments was similar in all years.
15. Interpretation….
• GEI significant (note: varying levels of evidence)
– Does the level of evidence matter? (5%, 1%, ...)
• (Some say no; only its existence matters)
• But if the evidence is weak, the GEI might have been
driven by a few genotypes (also remember the factorial rule)
– The GEI might be due to one or more
outliers (replace them and re-test)
• In general, develop the habit of examining residuals
– Cross-over and non-cross-over types (strong
evidence signals the cross-over type)
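The difference between cross-over and non-cross-over interaction can be illustrated with a small sketch. It assumes a table of genotype means by environment and flags a cross-over whenever the genotype ranking changes between environments. The function names and data are hypothetical, and a rank change in observed means is only descriptive, not a significance test.

```python
def rank_order(values):
    """Genotype indices sorted from the highest to the lowest value."""
    return sorted(range(len(values)), key=lambda i: values[i], reverse=True)

def has_crossover(means):
    """means[i][j] is the mean of genotype i in environment j.

    Returns True if the genotype ranking differs between any two environments.
    """
    n_env = len(means[0])
    orders = [rank_order([row[j] for row in means]) for j in range(n_env)]
    return any(order != orders[0] for order in orders[1:])

# Hypothetical means (t/ha): two genotypes in two environments
non_crossover = [[3.0, 4.0],   # genotype 1 is best in both environments
                 [2.0, 3.5]]   # the gap changes (interaction) but ranks do not
crossover = [[3.0, 2.0],       # genotype 1 is best in environment 1 only
             [2.0, 3.0]]       # genotype 2 is best in environment 2

print(has_crossover(non_crossover))
print(has_crossover(crossover))
```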
16. Genotype Stability Analysis
Definition: the ability of a genotype to maintain a near-
constant phenotype for the character of interest across
variable environments (wide adaptation).
The performance of a genotype changes due to:
– Environmental pressure/stress
– Differences in the genotype's ability to adapt to the stress factor
17. Stability Analysis…
Challenge 1: Wide adaptability is often in
conflict with predictability.
• Performance in a specific environment
is sacrificed
Stability and adaptability: used interchangeably?
– Stability: temporal variation
– Adaptability: spatial variation
18. Stability Analysis…
Challenge 2: There are various measures of stability (and
different concepts), reflecting the philosophy of the
researchers. How to choose one?
Challenge 3: Two types of views: statistical
and biological.
19. Stability Statistics: Univariate
• Variance of a genotype across environments, $S_i^2$ (1)
$S_i^2 = \sum_j (x_{ij} - \bar{x}_{i.})^2 / (e - 1)$, where $x_{ij}$ is the mean of the $i$th genotype in the $j$th environment and $e$ is the number of environments
• Coefficient of variation, CV (Francis and Kannenberg, 1978) (2)
• Mean variance component for pairwise GEI, $\theta_i$ (Plaisted and Peterson, 1959) (3)
Note that methods 3-6 are similar, as they all share
$\sum_j (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$ as the main component of their formulas.
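Statistics (1) and (2) can be computed directly from a genotype's means across environments. The sketch below uses hypothetical means and function names, and it assumes the CV is the standard deviation across environments expressed as a percentage of the genotype mean.

```python
import math

def environmental_variance(row):
    """S_i^2: variance of one genotype's means across e environments."""
    e = len(row)
    mean = sum(row) / e
    return sum((x - mean) ** 2 for x in row) / (e - 1)

def coefficient_of_variation(row):
    """CV_i (%): standard deviation across environments relative to the mean."""
    mean = sum(row) / len(row)
    return 100 * math.sqrt(environmental_variance(row)) / mean

# Hypothetical genotype means (t/ha) in three environments
means = {"G1": [2.0, 3.0, 4.0], "G2": [2.9, 3.0, 3.1]}
for name, row in means.items():
    print(name, round(environmental_variance(row), 3),
          round(coefficient_of_variation(row), 2))
```

Both genotypes have the same mean here, but G2 varies far less across environments and would be rated more stable by either statistic.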
20. Stability Statistics: Univariate
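• Variance component for GEI, $\theta_{(i)}$ (Plaisted, 1960) (4)
• Ecovalence, $W_i^2$ (Wricke, 1962) (5)
$W_i^2 = \sum_j (x_{ij} - \bar{x}_{i.} - \bar{x}_{.j} + \bar{x}_{..})^2$, where $\bar{x}_{i.}$ is the $i$th genotype mean and $\bar{x}_{.j}$ is the $j$th environment mean
• Stability variance, $\sigma_i^2$ (Shukla, 1972a) (6)
$B_s = \sum_j (x_{ij} - \bar{x}_{.j})(\bar{x}_{.j} - \bar{x}_{..}) / \sum_j (\bar{x}_{.j} - \bar{x}_{..})^2$
• Regression coefficient, $b_i$ (Finlay and Wilkinson, 1963) (7)
$b_i = \sum_j x_{ij}(\bar{x}_{.j} - \bar{x}_{..}) / \sum_j (\bar{x}_{.j} - \bar{x}_{..})^2$, intercept $= \bar{x}_{i.} - b_i \bar{x}_{..}$
Reference values: $b_i = 0$ or $b_i = 1$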
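The ecovalence and the regression coefficient above can be sketched from a genotype-by-environment table of means. The data and function names are hypothetical; the slope is computed on the environmental index (environment mean minus grand mean), as in the formula for $b_i$.

```python
def marginal_means(x):
    """Genotype means, environment means and grand mean of a G x E table."""
    g, e = len(x), len(x[0])
    g_means = [sum(row) / e for row in x]
    e_means = [sum(x[i][j] for i in range(g)) / g for j in range(e)]
    grand = sum(g_means) / g
    return g_means, e_means, grand

def ecovalence(x):
    """W_i^2: sum over environments of the squared GEI effects of genotype i."""
    g_means, e_means, grand = marginal_means(x)
    return [
        sum((row[j] - g_means[i] - e_means[j] + grand) ** 2
            for j in range(len(e_means)))
        for i, row in enumerate(x)
    ]

def regression_slopes(x):
    """b_i: slope of genotype i on the environmental index."""
    _, e_means, grand = marginal_means(x)
    index = [m - grand for m in e_means]
    ss_index = sum(d * d for d in index)
    return [sum(row[j] * index[j] for j in range(len(index))) / ss_index
            for row in x]

# Hypothetical means: rows are genotypes, columns are environments
x = [[1.0, 2.0, 3.0],   # follows the environment means exactly
     [2.0, 2.0, 2.0],   # constant yield in every environment
     [0.0, 2.0, 4.0]]   # responds twice as strongly as average
print(ecovalence(x))
print(regression_slopes(x))
```

The slopes average to 1 across genotypes by construction. The second genotype has $b_i = 0$ and the first has $b_i = 1$, the two reference values noted above.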
21. Stability Statistics: Univariate
• $b_i$ and the deviation-from-regression parameter, $\delta_i^2$ (Eberhart and Russell, 1966) (9)
$b_i = \sum_j x_{ij}(\bar{x}_{.j} - \bar{x}_{..}) / \sum_j (\bar{x}_{.j} - \bar{x}_{..})^2$, intercept $= \bar{x}_{i.}$
• Superiority measure, $P_i$ (Lin and Binns, 1988) (11)
[For the $i$th genotype, $P_i$ is the sum over environments of the squared difference between its mean and the mean of the best cultivar in that environment, divided by $2e$]
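The two statistics on this slide can be sketched as follows, with hypothetical data and function names. Two assumptions apply: the deviation mean square is the residual sum of squares from the regression on the environmental index divided by e - 2, without the pooled-error correction of the full Eberhart and Russell procedure, and $P_i$ is the mean squared distance from the best genotype in each environment, divided by 2.

```python
def deviation_mean_squares(x):
    """Residual mean square of each genotype around its regression line."""
    g, e = len(x), len(x[0])
    e_means = [sum(x[i][j] for i in range(g)) / g for j in range(e)]
    grand = sum(e_means) / e
    index = [m - grand for m in e_means]          # environmental index
    ss_index = sum(d * d for d in index)
    result = []
    for row in x:
        g_mean = sum(row) / e
        slope = sum(row[j] * index[j] for j in range(e)) / ss_index
        residuals = [row[j] - g_mean - slope * index[j] for j in range(e)]
        result.append(sum(r * r for r in residuals) / (e - 2))
    return result

def superiority(x):
    """P_i: squared distance to the best genotype per environment, over 2e."""
    g, e = len(x), len(x[0])
    best = [max(x[i][j] for i in range(g)) for j in range(e)]
    return [sum((row[j] - best[j]) ** 2 for j in range(e)) / (2 * e)
            for row in x]

# Hypothetical means: rows are genotypes, columns are environments
x = [[1.0, 3.0, 2.0],
     [2.0, 2.0, 5.0]]
print(deviation_mean_squares(x))
print(superiority(x))
```

A small $P_i$ means the genotype stays close to the best performer in every environment, so unlike the regression statistics it rewards high yield as well as consistency.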
22. • Finlay and Wilkinson (1963) defined a genotype with
$b_i = 0$ as stable.
• Eberhart and Russell (1966) defined a genotype with
$b_i = 1$ and a deviation-from-regression mean
square ($S_{di}^2$) of 0 as stable.
• Perkins and Jinks (1968) regressed the GxE effects
(rather than the genotype means) on the environmental
index.
24. Table 2. Analysis of variance for grain yield for barley multi-location yield trials
(BRVTES04) in the highlands of Bale
Source             DF   Anova SS   Mean Square   % explained variation
Total             703   845.324
Replication (R)     3     2.634      0.878**
Year (Y)            3   369.168    123.056**     52.51
Environment (E)     3   150.604     50.201**     21.42
Genotype (G)       15    31.886      2.126**      4.54
G*E                45    29.462      0.655**      4.19
G*Y                45    38.845      0.863**      5.53
YR*LOC              4    93.727     23.432**     13.33
G*E*Y              60     5.979      0.100ns      0.85
Error             525   123.020      0.234
** = significant at p<0.01; ns = not significant.
Coefficient of variation (%) = 19.85 and grand mean = 2.44 t/ha
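The F ratios and the coefficient of variation can be recomputed from the mean squares in Table 2. The sketch below assumes, for illustration only, that every term is tested against the pooled error mean square; in a mixed model some terms would be tested against an interaction mean square instead.

```python
import math

# Mean squares, error mean square and grand mean as reported in Table 2
error_ms = 0.234
grand_mean = 2.44  # t/ha
mean_squares = {
    "Year": 123.056,
    "Environment": 50.201,
    "Genotype": 2.126,
    "G*E": 0.655,
    "G*Y": 0.863,
    "G*E*Y": 0.100,
}

# F ratio of each term against the pooled error (an assumption, see above)
f_ratios = {term: ms / error_ms for term, ms in mean_squares.items()}

# CV (%) = 100 * sqrt(error mean square) / grand mean
cv_percent = 100 * math.sqrt(error_ms) / grand_mean

for term, f in f_ratios.items():
    print(term, round(f, 2))
print("CV (%):", round(cv_percent, 2))
```

The G*E*Y ratio is below 1, which is consistent with the non-significant three-way interaction reported in the conclusions. The recomputed CV differs slightly from 19.85 because the table values are rounded.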
32. Random vs fixed effects in the analysis
• The mixed-model procedure is a generalization of the standard
linear model.
• It handles data that exhibit correlation and non-constant
variability.
• A mixed model has two components: fixed and random
effects.
• Fixed effects are attributable to factors such as
variety and breed, whereas random effects represent
unknown random variability.
• Fixed effects: e.g. varieties.
• Random effects: e.g. blocks, locations, years. They determine the error
term to be used for testing the effects.
33. Random vs fixed effects in the analysis
• For fixed effects, the conclusions apply
only to the factor levels used in the study.
• For random effects, the conclusions apply to
the population from which the factor levels were
drawn.
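As an illustration of how the fixed/random choice changes the test, consider a two-way model for a multi-location trial. This is a sketch under the assumption that genotypes are fixed and environments are random; the symbols are introduced here and do not come from the slides.

```latex
% y_{ijk}: yield of genotype i in replicate k of environment j
% G_i fixed; E_j, (GE)_{ij}, B_{k(j)} and \varepsilon_{ijk} random
y_{ijk} = \mu + G_i + E_j + (GE)_{ij} + B_{k(j)} + \varepsilon_{ijk},
\qquad
E_j \sim N(0, \sigma_E^2), \quad
(GE)_{ij} \sim N(0, \sigma_{GE}^2), \quad
\varepsilon_{ijk} \sim N(0, \sigma^2)

% With environments random, the genotype effect is tested against
% the interaction mean square rather than the pooled error:
F_G = \frac{MS_G}{MS_{GE}}
```

If environments were also treated as fixed, the genotype mean square would be tested against the pooled error mean square, and the conclusions would apply only to the environments actually sampled.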
34. Genotype stability
• The purpose of multi-location trials is to investigate the
overall performance of different varieties over
locations.
• Some genotypes depend strongly on the
combination of important site
characteristics, which indicates instability, while others
depend less on them and are more stable.
• To identify these genotypes, a regression of
genotype yield on site characteristics is useful.
35. Genotype stability
• Hence, the mean yield of all genotypes at each site
is used as the measure of site performance, and each
genotype's yield is regressed on the overall mean yield
of all genotypes at that site.
• Several procedures are used, such as the AMMI
model (additive main effects and multiplicative
interaction), the GGE biplot and Eberhart and Russell's
regression approach. SAS, AMMI,
GGEbiplot, MSTAT-C, GenStat and other software can be used
for the analysis.