The experimental design used is fully replicated (D1). The Additive Main effects and Multiplicative Interaction (AMMI) model is based on frequentist statistics (section \@ref(section-freq)). The analysis can be broken down in two steps [@gauch_statistical_2006][@gauch_statistical_2008][@yan_gge_2007] :
$Y_{ijk} = \mu + \alpha_{i} + \theta_{j} + rep_{k}(\theta_{j}) + (\alpha\theta){ij} + \varepsilon{ijk}; \quad \varepsilon_{ijk} \sim \mathcal{N} (0,\sigma^2)$
With,
Or, if there are several years in the data set:
$Y_{ijkl} = \mu + \alpha_{i} + \theta_{j} + \beta_{l} + (\alpha\theta){ij} + (\alpha\beta){il} + (\theta\beta){jl} + rep{k}((\theta\beta){jl}) + \varepsilon{ijkl}; \quad \varepsilon_{ijkl} \sim \mathcal{N} (0,\sigma^2)$
With,
and all other effects are the same as in the previous model.
a PCA that focus on the germplasm $\times$ location interaction :
$(\alpha\theta){ij} = \sum{n}^{N} \lambda_{n} \gamma_{in} \omega_{jn}$ ^[Note than in PPBstats
, the PCA is done on $\alpha\theta_{ij} + \varepsilon_{ijkl}$. The residuals is in the last dimension of the PCA]
which can also be written :
$(\alpha\theta){ij} = \sum{n}^{N} (\sqrt{\lambda_{n}} \gamma_{in}) (\sqrt{\lambda_{n}} \omega_{jn})$
With,
The data are double centered on location and germplasm. The PCA studies the structure of the interaction matrix. The locations are the variables and the germplasms are the individuals.
This PCA allows to detect
PPBstats
For AMMI analysis, you can follow these steps (Figure \@ref(fig:main-workflow)):
format_data_PPBstats()
model_GxE()
and gxe_analysis = "AMMI"
check_model()
and vizualise it with plot()
mean_comparisons()
and vizualise it with plot()
biplot_data()
and plot()
parameters_groups()
and visualise it with plot()
data(data_model_GxE) data_model_GxE = format_data_PPBstats(data_model_GxE, type = "data_agro") head(data_model_GxE)
To run model GxE on the dataset, used the function model_GxE
.
You can run it on one variable.
out_ammi = model_GxE(data_model_GxE, variable = "y1", gxe_analysis = "AMMI")
out_ammi
is a list containing three elements :
info
: a list with variable and gxe_analysisout_ammi$info
ANOVA
a list with five elements :
model
r
out_ammi$ANOVA$model
anova_model
r
out_ammi$ANOVA$anova_model
germplasm_effects
a list of two elements :
effects
r
out_ammi$ANOVA$germplasm_effects$effects
intra_variance
r
out_ammi$ANOVA$germplasm_effects$intra_variance
location_effects
r
out_ammi$ANOVA$location_effects$effects
interaction_matrix
r
out_ammi$ANOVA$interaction_matrix
PCA
: PCA object from FactoMineRout_ammi$PCA
The tests to check the model are explained in section \@ref(check-model-freq).
Once the model is run, it is necessary to check if the outputs can be taken with confidence.
This step is needed before going ahead in the analysis (in fact, object used in the next functions must come from check_model()
).
out_check_ammi = check_model(out_ammi)
out_check_ammi
is a list containing four elements :
model_GxE
the output from the modeldata_ggplot
a list containing information for ggplot:data_ggplot_residuals
a list containing :data_ggplot_normality
data_ggplot_skewness_test
data_ggplot_kurtosis_test
data_ggplot_shapiro_test
data_ggplot_qqplot
data_ggplot_variability_repartition_pie
data_ggplot_var_intra
Once the computation is done, you can visualize the results with plot()
p_out_check_ammi = plot(out_check_ammi)
p_out_check_ammi
is a list with:
residuals
histogram
: histogram with the distribution of the residuals
r
p_out_check_ammi$residuals$histogram
qqplot
r
p_out_check_ammi$residuals$qqplot
points
r
p_out_check_ammi$residuals$points
variability_repartition
: pie with repartition of SumSq for each factor
p_out_check_ammi$variability_repartition
variance_intra_germplasm
: repartition of the residuals for each germplasm (see Details for more information)
With the hypothesis than the micro-environmental variation is equaly distributed on all the individuals (i.e. all the plants), the distribution of each germplasm represent the intra-germplasm variance.
This has to been seen with caution:p_out_check_ammi$variance_intra_germplasm
pca_composante_variance
: variance caught by each dimension of the PCA run on the interaction matrixp_out_check_ammi$pca_composante_variance
The method to compute mean comparison are explained in section \@ref(mean-comp-check-freq).
Get mean comparisons with mean_comparisons()
.
out_mean_comparisons_ammi = mean_comparisons(out_check_ammi, p.adj = "bonferroni")
out_mean_comparisons_ammi
is a list of three elements:
info
: a list with variable and gxe_analysisdata_ggplot_LSDbarplot_germplasm
data_ggplot_LSDbarplot_location
data_ggplot_LSDbarplot_year
p_out_mean_comparisons_ammi = plot(out_mean_comparisons_ammi)
p_out_mean_comparisons_ammi
is a list of three elements with barplots :
For each element of the list, there are as many graph as needed with nb_parameters_per_plot
parameters per graph.
Letters are displayed on each bar. Parameters that do not share the same letters are different regarding type I error (alpha) and alpha correction.
The error I (alpha) and the alpha correction are displayed in the title.
germplasm
: mean comparison for germplasmpg = p_out_mean_comparisons_ammi$germplasm names(pg) pg$`1`
location
: mean comparison for locationpl = p_out_mean_comparisons_ammi$location names(pl) pl$`1`
year
: mean comparison for year
Here there are no more than 2 years in the data set so it is NULL
p_out_mean_comparisons_ammi$year
The biplot represents information about the percentages of total variation explained by the two axes. It has to be linked to the total variation caught by the interaction. If the total variation is small, then the biplot is useless. If the total variation is high enought, then the biplot is useful if the two first dimension represented catch enought variation (the more the better).
out_biplot_ammi = biplot_data(out_check_ammi)
p_out_biplot_ammi = plot(out_biplot_ammi)
p_out_biplot_ammi
is a list of three elements :
ecovalence
Ecovalence from @wricke_uber_1962 give part of interaction variance taken by germplasm and location. This parameter provides indications on the contribution of each variety to the interaction term and therefore on its stability over the different farms with respect to the productivity potential of each of them. It is generally described as a dynamic stability indicator [@becker_stability_1988]. A low ecovalence means low interaction, i.e. more stability.
Ecovalance of germplasm $i$ is $W_{i}=\sum_{i}^{n} (\eta_{i}\nu_{j})^{2}$
Ecovalance of location $j$ is $W_{j}=\sum_{j}^{n} (\eta_{i}\nu_{j})^{2}$.
Ecovalances are represented in fonction of mean effects by germplasm and location.
p_out_biplot_ammi$ecovalence
interaction
which displays the interaction matrixp_out_biplot_ammi$interaction
biplot
being a list of four elements :p_out_biplot_ammi$biplot$simple_biplot
Regarding the other elements of the list, it returns NULL
as these visualisation is only done for gxe_analysis = "GGE"
.
p_out_biplot_ammi$biplot$which_won_where p_out_biplot_ammi$biplot$mean_vs_stability p_out_biplot_ammi$biplot$discrimitiveness_vs_representativeness
In order to cluster locations or germplasms, you may use mulivariate analysis on a matrix with several variables in columns and parameter in rows.
This is done with parameter_groups()
which do a PCA on this matrix.
Clusters are done based on HCPC method as explained here
Lets' have an example with three variables.
First run the models
out_ammi_2 = model_GxE(data_model_GxE, variable = "y2", gxe_analysis = "AMMI") out_ammi_3 = model_GxE(data_model_GxE, variable = "y3", gxe_analysis = "AMMI")
Then check the models
out_check_ammi_2 = check_model(out_ammi_2) out_check_ammi_3 = check_model(out_ammi_3)
Then run the function for germplasm. It can also be done on location or intra germplasm variance
out_parameter_groups = parameter_groups( list("y1" = out_check_ammi, "y2" = out_check_ammi_2, "y3" = out_check_ammi_3), "germplasm" )
out_parameter_groups
is list of two elements:
obj.pca
: the PCA object from FactoMineR::PCA()
clust
, a list of two elements:res.hcpc
: the HCPC object from FactoMineR::HCPC()
clust
: the dataframe with cluster assigned to each individualVisualize outputs with plot
p_germplasm_group = plot(out_parameter_groups)
p_germplasm_group
is list of two elements :
pca
: a list with three elements on the PCA on the group of parameters :
composante_variance
: variance caught by each dimension of the PCA
r
p_germplasm_group$pca$composante_variance
ind
: graph of individuals
r
p_germplasm_group$pca$ind
var
: graph of variables
r
p_germplasm_group$pca$var
clust
: output from factextra::fviz_cluster()
, a list of number of cluster + 1 element
cl = p_germplasm_group$clust names(cl)
cl$cluster_all
cl$cluster_1
list_out_check_model = list(out_check_ammi, out_check_ammi_2, out_check_ammi_3) names(list_out_check_model) = c("ammi_1", "ammi_2", "ammi_3") post_hoc_variation(list_out_check_model)
If you wish to apply the AMMI workflow to several variables, you can use lapply()
with the following code :
workflow_gxe = function(x, gxe){ out_gxe = model_GxE(data_model_GxE, variable = x, gxe_analysis = gxe) out_check_gxe = check_model(out_gxe) p_out_check_gxe = plot(out_check_gxe) out_mean_comparisons_gxe = mean_comparisons(out_check_gxe, p.adj = "bonferroni") p_out_mean_comparisons_gxe = plot(out_mean_comparisons_gxe) out_biplot_gxe = biplot_data(out_check_gxe) p_out_biplot_gxe = plot(out_biplot_gxe) out = list( "out_gxe" = out_gxe, "out_check_gxe" = out_check_gxe, "p_out_check_gxe" = p_out_check_gxe, "out_mean_comparisons_gxe" = out_mean_comparisons_gxe, "p_out_mean_comparisons_gxe" = p_out_mean_comparisons_gxe, "out_biplot_gxe" = out_biplot_gxe, "p_out_biplot_gxe" = p_out_biplot_gxe ) return(out) } vec_variables = c("y1", "y2", "y3") out = lapply(vec_variables, workflow_gxe, "AMMI") names(out) = vec_variables out_parameter_groups = parameter_groups( list("y1" = out$y1$out_check_gxe, "y2" = out$y2$out_check_gxe, "y3" = out$y3$out_check_gxe), "germplasm" ) p_germplasm_group = plot(out_parameter_groups)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.