AMMI (M6a) {#ammi}

Theory of the model

The experimental design used is fully replicated (D1). The Additive Main effects and Multiplicative Interaction (AMMI) model is based on frequentist statistics (section \@ref(section-freq)). The analysis can be broken down in two steps [@gauch_statistical_2006][@gauch_statistical_2008][@yan_gge_2007] :

  1. an ANOVA with the following model :

$Y_{ijk} = \mu + \alpha_{i} + \theta_{j} + rep_{k}(\theta_{j}) + (\alpha\theta){ij} + \varepsilon{ijk}; \quad \varepsilon_{ijk} \sim \mathcal{N} (0,\sigma^2)$

With,

Or, if there are several years in the data set:

$Y_{ijkl} = \mu + \alpha_{i} + \theta_{j} + \beta_{l} + (\alpha\theta){ij} + (\alpha\beta){il} + (\theta\beta){jl} + rep{k}((\theta\beta){jl}) + \varepsilon{ijkl}; \quad \varepsilon_{ijkl} \sim \mathcal{N} (0,\sigma^2)$

With,

$(\alpha\theta){ij} = \sum{n}^{N} \lambda_{n} \gamma_{in} \omega_{jn}$ ^[Note than in PPBstats, the PCA is done on $\alpha\theta_{ij} + \varepsilon_{ijkl}$. The residuals is in the last dimension of the PCA]

which can also be written :

$(\alpha\theta){ij} = \sum{n}^{N} (\sqrt{\lambda_{n}} \gamma_{in}) (\sqrt{\lambda_{n}} \omega_{jn})$

With,

The data are double centered on location and germplasm. The PCA studies the structure of the interaction matrix. The locations are the variables and the germplasms are the individuals.

This PCA allows to detect

Steps with PPBstats

For AMMI analysis, you can follow these steps (Figure \@ref(fig:main-workflow)):

Format the data

data(data_model_GxE)
data_model_GxE = format_data_PPBstats(data_model_GxE, type = "data_agro")
head(data_model_GxE)

Run the model

To run model GxE on the dataset, used the function model_GxE. You can run it on one variable.

out_ammi = model_GxE(data_model_GxE, variable = "y1", gxe_analysis = "AMMI")

out_ammi is a list containing three elements :

out_ammi$info
out_ammi$PCA

Check and visualize model outputs

The tests to check the model are explained in section \@ref(check-model-freq).

Check the model

Once the model is run, it is necessary to check if the outputs can be taken with confidence. This step is needed before going ahead in the analysis (in fact, object used in the next functions must come from check_model()).

out_check_ammi = check_model(out_ammi)

out_check_ammi is a list containing four elements :

Visualize outputs

Once the computation is done, you can visualize the results with plot()

p_out_check_ammi = plot(out_check_ammi)

p_out_check_ammi is a list with:

p_out_check_ammi$variability_repartition
p_out_check_ammi$variance_intra_germplasm
p_out_check_ammi$pca_composante_variance

Get and visualize mean comparisons

The method to compute mean comparison are explained in section \@ref(mean-comp-check-freq).

Get mean comparisons

Get mean comparisons with mean_comparisons().

out_mean_comparisons_ammi = mean_comparisons(out_check_ammi, p.adj = "bonferroni")

out_mean_comparisons_ammi is a list of three elements:

Visualize mean comparisons
p_out_mean_comparisons_ammi = plot(out_mean_comparisons_ammi)

p_out_mean_comparisons_ammi is a list of three elements with barplots :

For each element of the list, there are as many graph as needed with nb_parameters_per_plot parameters per graph. Letters are displayed on each bar. Parameters that do not share the same letters are different regarding type I error (alpha) and alpha correction. The error I (alpha) and the alpha correction are displayed in the title.

pg = p_out_mean_comparisons_ammi$germplasm
names(pg)
pg$`1`
pl = p_out_mean_comparisons_ammi$location
names(pl)
pl$`1`
p_out_mean_comparisons_ammi$year

Get and visualize biplot

The biplot represents information about the percentages of total variation explained by the two axes. It has to be linked to the total variation caught by the interaction. If the total variation is small, then the biplot is useless. If the total variation is high enought, then the biplot is useful if the two first dimension represented catch enought variation (the more the better).

Get biplot
out_biplot_ammi = biplot_data(out_check_ammi)
Visualize biplot
p_out_biplot_ammi = plot(out_biplot_ammi)

p_out_biplot_ammi is a list of three elements :

Ecovalence from @wricke_uber_1962 give part of interaction variance taken by germplasm and location. This parameter provides indications on the contribution of each variety to the interaction term and therefore on its stability over the different farms with respect to the productivity potential of each of them. It is generally described as a dynamic stability indicator [@becker_stability_1988]. A low ecovalence means low interaction, i.e. more stability.

Ecovalance of germplasm $i$ is $W_{i}=\sum_{i}^{n} (\eta_{i}\nu_{j})^{2}$

Ecovalance of location $j$ is $W_{j}=\sum_{j}^{n} (\eta_{i}\nu_{j})^{2}$.

Ecovalances are represented in fonction of mean effects by germplasm and location.

p_out_biplot_ammi$ecovalence
p_out_biplot_ammi$interaction
p_out_biplot_ammi$biplot$simple_biplot

Regarding the other elements of the list, it returns NULL as these visualisation is only done for gxe_analysis = "GGE".

p_out_biplot_ammi$biplot$which_won_where
p_out_biplot_ammi$biplot$mean_vs_stability
p_out_biplot_ammi$biplot$discrimitiveness_vs_representativeness

Get and vizualise groups of parameters

Get groups of parameters

In order to cluster locations or germplasms, you may use mulivariate analysis on a matrix with several variables in columns and parameter in rows.

This is done with parameter_groups() which do a PCA on this matrix.

Clusters are done based on HCPC method as explained here

Lets' have an example with three variables.

First run the models

out_ammi_2 = model_GxE(data_model_GxE, variable = "y2", gxe_analysis = "AMMI")
out_ammi_3 = model_GxE(data_model_GxE, variable = "y3", gxe_analysis = "AMMI")

Then check the models

out_check_ammi_2 = check_model(out_ammi_2)
out_check_ammi_3 = check_model(out_ammi_3)

Then run the function for germplasm. It can also be done on location or intra germplasm variance

out_parameter_groups = parameter_groups(
  list("y1" = out_check_ammi, "y2" = out_check_ammi_2, "y3" = out_check_ammi_3), 
  "germplasm"
  )

out_parameter_groups is list of two elements:

Visualize groups of parameters

Visualize outputs with plot

p_germplasm_group = plot(out_parameter_groups)

p_germplasm_group is list of two elements :

cl = p_germplasm_group$clust
names(cl)
cl$cluster_all
cl$cluster_1

post hoc analysis to visualize variation repartition for several variables

list_out_check_model = list(out_check_ammi, out_check_ammi_2, out_check_ammi_3)
names(list_out_check_model) = c("ammi_1", "ammi_2", "ammi_3")
post_hoc_variation(list_out_check_model)

Apply the workflow to several variables {#workflow_gxe}

If you wish to apply the AMMI workflow to several variables, you can use lapply() with the following code :

workflow_gxe = function(x, gxe){
  out_gxe = model_GxE(data_model_GxE, variable = x, gxe_analysis = gxe)

  out_check_gxe = check_model(out_gxe)
  p_out_check_gxe = plot(out_check_gxe)

  out_mean_comparisons_gxe = mean_comparisons(out_check_gxe, p.adj = "bonferroni")
  p_out_mean_comparisons_gxe = plot(out_mean_comparisons_gxe)

  out_biplot_gxe = biplot_data(out_check_gxe)
  p_out_biplot_gxe = plot(out_biplot_gxe)

  out = list(
    "out_gxe" = out_gxe,
    "out_check_gxe" = out_check_gxe,
    "p_out_check_gxe" = p_out_check_gxe,
    "out_mean_comparisons_gxe" = out_mean_comparisons_gxe,
    "p_out_mean_comparisons_gxe" = p_out_mean_comparisons_gxe,
    "out_biplot_gxe" = out_biplot_gxe,
    "p_out_biplot_gxe" = p_out_biplot_gxe
  )

  return(out)
}

vec_variables = c("y1", "y2", "y3")

out = lapply(vec_variables, workflow_gxe, "AMMI")
names(out) = vec_variables

out_parameter_groups = parameter_groups(
  list("y1" = out$y1$out_check_gxe, "y2" = out$y2$out_check_gxe, "y3" = out$y3$out_check_gxe), 
  "germplasm" )

p_germplasm_group = plot(out_parameter_groups)


priviere/PPBstats documentation built on May 6, 2021, 1:20 a.m.