Classic ANOVA (M4a) {#classic-anova}

Theory of the model

The experimental design used is fully replicated (D1) at one location. The model is based on frequentist statistics (section \@ref(section-freq)). The tests to check the model are explained in section \@ref(check-model-freq). The methods to compute mean comparisons are explained in section \@ref(mean-comp-check-freq). The analysis is based on the following model:

$Y_{ijk} = \mu + \alpha_{i} + rep_{k} + \varepsilon_{ijk}; \quad \varepsilon_{ijk} \sim \mathcal{N} (0,\sigma^2)$

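where $Y_{ijk}$ is the phenotypic value for replication $k$ of germplasm $i$ at location $j$ (a single location here), $\mu$ is the general mean, $\alpha_{i}$ is the effect of germplasm $i$, $rep_{k}$ is the effect of replication $k$, and $\varepsilon_{ijk}$ is the residual.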
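To make the model concrete, here is a minimal sketch in base R that simulates data with a germplasm effect and a replication effect, then fits the corresponding linear model with `lm()`. The data, the effect sizes and the column names (`germplasm`, `block`, `y`) are hypothetical, and this is a stand-in for illustration, not the code that model_anova() runs internally.

```r
# Hypothetical simulated data: 5 germplasms x 3 replications at one location.
set.seed(1)
d <- expand.grid(
  germplasm = factor(paste0("germ-", 1:5)),
  block = factor(1:3)                       # the replication index k
)

mu <- 10                                    # general mean
alpha <- c(-1, -0.5, 0, 0.5, 1)             # germplasm effects (assumed values)
rep_effect <- c(-0.2, 0, 0.2)               # replication effects (assumed values)

# Y = mu + alpha_i + rep_k + epsilon, with epsilon ~ N(0, sigma^2)
d$y <- mu +
  alpha[as.integer(d$germplasm)] +
  rep_effect[as.integer(d$block)] +
  rnorm(nrow(d), mean = 0, sd = 0.3)

# Fit the additive model and build the ANOVA table.
fit <- lm(y ~ germplasm + block, data = d)
anova_table <- anova(fit)
print(anova_table)
```

The ANOVA table has one row per term of the model (germplasm, replication) plus one row for the residuals.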

Steps with PPBstats

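For a classic ANOVA analysis, you can follow these steps (Figure \@ref(fig:main-workflow)): format the data with format_data_PPBstats(), run the model with model_anova(), check and visualize the model outputs with check_model() and plot(), get and visualize mean comparisons with mean_comparisons() and plot(), and get and visualize groups of parameters with parameter_groups() and plot().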

Format the data

A subset of data_model_GxE is used in this example.

data(data_model_GxE)
data_model_anova = droplevels(dplyr::filter(data_model_GxE, location == "loc-1"))  
data_model_anova = format_data_PPBstats(data_model_anova, type = "data_agro")
head(data_model_anova)

Run the model

To run the model on the dataset, use the function model_anova(). It runs on one variable at a time.

out_anova = model_anova(data_model_anova, variable = "y1")

out_anova is a list containing two elements:

out_anova$info

Check and visualize model outputs

The tests to check the model are explained in section \@ref(check-model-freq).
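As a rough illustration of what checking such a model involves, the sketch below applies two common residual diagnostics in base R: a Shapiro-Wilk test for normality and a Bartlett test for homogeneity of variance across germplasms. The data are simulated and the choice of tests is an assumption made for this example. The diagnostics actually produced by check_model() are the ones described in the section referenced above.

```r
# Hypothetical simulated data: 6 germplasms x 4 replications.
set.seed(2)
d <- expand.grid(
  germplasm = factor(paste0("germ-", 1:6)),
  block = factor(1:4)
)
d$y <- 10 + rnorm(nrow(d))

fit <- lm(y ~ germplasm + block, data = d)
res <- residuals(fit)

# Normality of the residuals (Shapiro-Wilk test).
norm_test <- shapiro.test(res)

# Homogeneity of the residual variance across germplasms (Bartlett test).
var_test <- bartlett.test(res, d$germplasm)

print(norm_test$p.value)
print(var_test$p.value)
```

Small p-values would suggest that the normality or equal-variance assumption of the model is questionable for the data at hand.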

Check the model

Once the model has been run, check whether its outputs can be trusted. This step is required before going further in the analysis: the objects passed to the next functions must come from check_model().

out_check_anova = check_model(out_anova)

out_check_anova is a list containing four elements:

Visualize outputs

Once the computation is done, you can visualize the results with plot().

p_out_check_anova = plot(out_check_anova)

p_out_check_anova is a list with:

p_out_check_anova$variability_repartition
p_out_check_anova$variance_intra_germplasm
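The names of these two plots suggest a breakdown of the variation by factor and a look at the variance within each germplasm. Under that assumption, the same quantities can be sketched in base R from an ANOVA table and from per-germplasm variances. The data and column names are hypothetical, and the computation is only an illustration, not necessarily what the package plots.

```r
# Hypothetical simulated data: 5 germplasms x 3 replications.
set.seed(3)
d <- expand.grid(
  germplasm = factor(paste0("germ-", 1:5)),
  block = factor(1:3)
)
d$y <- 10 + as.integer(d$germplasm) * 0.5 + rnorm(nrow(d), sd = 0.4)

fit <- lm(y ~ germplasm + block, data = d)
tab <- anova(fit)

# Share of the total sum of squares attributed to each term (in percent).
ss <- tab[["Sum Sq"]]
names(ss) <- rownames(tab)
share <- 100 * ss / sum(ss)
print(round(share, 1))

# Variance of the observations within each germplasm.
var_intra <- tapply(d$y, d$germplasm, var)
print(round(var_intra, 3))
```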

Get and visualize mean comparisons

The methods to compute mean comparisons are explained in section \@ref(mean-comp-check-freq).
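As a generic illustration of pairwise mean comparisons with a multiple-testing correction, the sketch below uses `pairwise.t.test()` from base R with a Bonferroni adjustment on simulated data. It ignores the replication effect and is not necessarily the procedure used by mean_comparisons(). The data and column names are hypothetical.

```r
# Hypothetical simulated data: 4 germplasms, 5 observations each.
set.seed(4)
d <- data.frame(germplasm = factor(rep(paste0("germ-", 1:4), each = 5)))
true_means <- c(10, 10, 12, 14)             # assumed germplasm means
d$y <- rnorm(nrow(d), mean = true_means[as.integer(d$germplasm)], sd = 0.5)

# All pairwise t-tests between germplasms, Bonferroni-adjusted p-values.
pw <- pairwise.t.test(d$y, d$germplasm, p.adjust.method = "bonferroni")

# Lower-triangular matrix of adjusted p-values, one cell per pair.
print(pw$p.value)
```

Each cell of the matrix is the adjusted p-value for one pair of germplasms. Pairs with a value below the chosen alpha are declared different.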

Get mean comparisons

Get mean comparisons with mean_comparisons().

out_mean_comparisons_anova = mean_comparisons(out_check_anova, p.adj = "bonferroni")

out_mean_comparisons_anova is a list of two elements:

Visualize mean comparisons

p_out_mean_comparisons_anova = plot(out_mean_comparisons_anova)

p_out_mean_comparisons_anova is a list of one element with barplots:

For each element of the list, there are as many graphs as needed, with nb_parameters_per_plot parameters per graph. Letters are displayed on each bar. Parameters that do not share a letter are significantly different, given the type I error (alpha) and the alpha correction. The type I error (alpha) and the alpha correction are displayed in the title.
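To see what an alpha correction does in practice, here is a small worked example with `p.adjust()` from base R. The raw p-values are made up for illustration. With the Bonferroni method, each p-value is multiplied by the number of comparisons (and capped at 1) before being compared to alpha.

```r
alpha <- 0.05

# Hypothetical raw p-values from three pairwise comparisons.
p_raw <- c(0.004, 0.020, 0.030)

# Bonferroni: multiply each p-value by the number of tests, cap at 1.
p_bonf <- p.adjust(p_raw, method = "bonferroni")
print(p_bonf)

# Only comparisons still below alpha after correction are declared different.
significant <- p_bonf < alpha
print(significant)
```

Here the adjusted values are 0.012, 0.06 and 0.09, so only the first comparison remains significant at alpha = 0.05, whereas all three would have been without correction.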

pg = p_out_mean_comparisons_anova$germplasm
names(pg)
pg$`1`

Get and visualize groups of parameters

Get groups of parameters

To cluster locations or germplasms, you can use multivariate analysis on a matrix with variables in columns and parameters in rows.

This is done with parameter_groups(), which runs a PCA on this matrix.

Clusters are built with the HCPC method (hierarchical clustering on principal components).
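The idea of a PCA followed by hierarchical clustering can be sketched in base R with `prcomp()`, `hclust()` and `cutree()`. This is a simplified stand-in for HCPC, not the implementation used by parameter_groups(). The matrix values, the number of components kept and the number of clusters are arbitrary choices made for this example.

```r
# Hypothetical matrix: 8 germplasms (rows) x 3 variables (columns).
set.seed(5)
m <- matrix(
  rnorm(8 * 3),
  nrow = 8,
  dimnames = list(paste0("germ-", 1:8), c("y1", "y2", "y3"))
)

# PCA on centered and scaled variables.
pca <- prcomp(m, center = TRUE, scale. = TRUE)

# Hierarchical clustering (Ward criterion) on the first two components.
hc <- hclust(dist(pca$x[, 1:2]), method = "ward.D2")

# Cut the tree into 3 clusters (arbitrary choice for this sketch).
clusters <- cutree(hc, k = 3)
print(clusters)
```

The result is a named vector giving the cluster assigned to each germplasm.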

Let's look at an example with three variables.

First run the models

out_anova_2 = model_anova(data_model_anova, variable = "y2")
out_anova_3 = model_anova(data_model_anova, variable = "y3")

Then check the models

out_check_anova_2 = check_model(out_anova_2)
out_check_anova_3 = check_model(out_anova_3)

Then run the function for germplasm.

out_parameter_groups = parameter_groups(
  list("y1" = out_check_anova, "y2" = out_check_anova_2, "y3" = out_check_anova_3), 
  "germplasm"
  )

out_parameter_groups is a list of two elements:

Visualize groups of parameters

Visualize the outputs with plot().

p_germplasm_group = plot(out_parameter_groups)

p_germplasm_group is a list of two elements:

cl = p_germplasm_group$clust
names(cl)
cl$cluster_all
cl$cluster_1

Post hoc analysis to visualize variation repartition for several variables

list_out_check_model = list("anova_1" = out_check_anova, "anova_2" = out_check_anova_2, "anova_3" = out_check_anova_3)
post_hoc_variation(list_out_check_model)

Apply the workflow to several variables

If you wish to apply the ANOVA workflow to several variables, you can use lapply() with the following code:

workflow_anova = function(x, data){
  out_anova = model_anova(data, variable = x)

  out_check_anova = check_model(out_anova)
  p_out_check_anova = plot(out_check_anova)

  out_mean_comparisons_anova = mean_comparisons(out_check_anova, p.adj = "bonferroni")
  p_out_mean_comparisons_anova = plot(out_mean_comparisons_anova)

  out = list(
    "out_anova" = out_anova,
    "out_check_anova" = out_check_anova,
    "p_out_check_anova" = p_out_check_anova,
    "out_mean_comparisons_anova" = out_mean_comparisons_anova,
    "p_out_mean_comparisons_anova" = p_out_mean_comparisons_anova
  )

  return(out)
}

vec_variables = c("y1", "y2", "y3")

out = lapply(vec_variables, workflow_anova, data_model_anova)
names(out) = vec_variables

list_out_check_model = list("anova_1" = out$y1$out_check_anova, "anova_2" = out$y2$out_check_anova, "anova_3" = out$y3$out_check_anova)

out_parameter_groups = parameter_groups(list_out_check_model, "germplasm" )
p_germplasm_group = plot(out_parameter_groups)

p_post_hoc_variation = post_hoc_variation(list_out_check_model)


priviere/PPBstats documentation built on May 6, 2021, 1:20 a.m.