Study local adaptation

Workflow and function relations in PPBstats regarding local adaptation analysis

Figure \@ref(fig:main-workflow-family-4-HALF) displays the functions and their relationships. Table \@ref(tab:function-descriptions-workflow-family-4-HALF) describes each of the main functions.

You can get more information on each function by typing ?function_name in your R session. Note that check_model(), mean_comparisons() and plot() are S3 methods. Therefore, type ?check_model, ?mean_comparisons or ?plot.PPBstats to see the general features, then look at the help pages of the specific functions for details.

knitr::include_graphics("figures/main-functions-agro-family-4-HALF.png")

| function name | description |
| --- | --- |
| design_experiment | Provides experimental design for the different situations corresponding to the chosen family of analysis |
| format_data_PPBstats | Check and format the data to be used in PPBstats functions |
| HA_to_LF | Transform home away data to local foreign data |
| LF_to_HA | Transform local foreign data to home away data |
| model_home_away | Run home away model |
| model_local_foreign | Run local foreign model |
| check_model | Check if the model went well |
| mean_comparisons | Get mean comparisons |
| plot | Build ggplot objects to visualize output |

Table: (#tab:function-descriptions-workflow-family-4-HALF) Function description.

Home away {#home-away}

Home away analysis is used to study local adaptation. Away in a location refers to a germplasm that has not been grown or selected in a given location. Home in a location refers to a germplasm that has been grown or selected in a given location.
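As a minimal sketch of this definition, assuming a hypothetical data frame with one column for the location where a germplasm was grown (`location`) and one for the location it originates from (`origin`), the version label can be derived in base R. The column names and values are illustrative only and are not the format required by PPBstats.

```r
# Hypothetical trial records: where each germplasm was grown and where it comes from
trial <- data.frame(
  germplasm = c("germ-1", "germ-1", "germ-2", "germ-2"),
  origin    = c("loc-1",  "loc-1",  "loc-2",  "loc-2"),
  location  = c("loc-1",  "loc-2",  "loc-1",  "loc-2"),
  stringsAsFactors = FALSE
)

# A germplasm is "home" when grown in its location of origin, "away" otherwise
trial$version <- ifelse(trial$location == trial$origin, "home", "away")

print(trial)
```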

The following model takes germplasm and location effects into account in order to better study the version (home or away) effect [@blanquart_practical_2013]. The model is based on frequentist statistics (section \@ref(section-freq)).

$Y_{ijkm} = \mu + \alpha_i + \theta_j + \omega_{k_{ij}} + (\omega \times \alpha)_{k_{ij}j} + rep(\theta)_{mj} + \varepsilon_{ijkm}; \quad \varepsilon_{ijkm} \sim \mathcal{N} (0,\sigma^2)$

with

The comparison of all germplasm in all locations in sympatric or allopatric situations (measured by the version effect $\omega_{k_{ij}}$) gives a global measure of local adaptation [@blanquart_practical_2013]. The interaction effect $(\omega \times \alpha)_{k_{ij}j}$ gives information on specific adaptation to each location.
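A crude descriptive counterpart of the version effect is the difference between the mean of all home (sympatric) observations and the mean of all away (allopatric) observations. The sketch below uses made-up values and hypothetical column names. It ignores germplasm and location effects, which the model accounts for, so it is only a first look at the data and not the estimate returned by PPBstats.

```r
# Made-up observations: two germplasms grown in two locations
obs <- data.frame(
  germplasm = rep(c("germ-1", "germ-2"), each = 2),
  location  = rep(c("loc-1", "loc-2"), times = 2),
  version   = c("home", "away", "away", "home"),
  y1        = c(12, 9, 8, 11)
)

# Mean of the variable for each version (home = sympatric, away = allopatric)
version_means <- tapply(obs$y1, obs$version, mean)

# Raw sympatric minus allopatric difference
sa_contrast <- version_means[["home"]] - version_means[["away"]]
print(sa_contrast)
```

A positive value suggests that germplasm tend to perform better at home, but only the model separates this from germplasm and location effects.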

If there is more than one year, the model can be written:

$Y_{ijklm} = \mu + \alpha_i + \theta_j + \beta_l + \omega_{k_{ij}} + (\theta \times \beta)_{jl} + (\omega \times \alpha)_{k_{ij}j} + rep(\theta \times \beta)_{mjl} + (\omega \times \alpha \times \beta)_{k_{ij}jl} + \varepsilon_{ijklm}; \quad \varepsilon_{ijklm} \sim \mathcal{N} (0,\sigma^2)$

with

The interaction $(\omega \times \alpha \times \beta)_{k_{ij}jl}$ gives information on specific adaptation to each location for a given year.

A type III anova is done here as the data are not orthogonal.
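To illustrate why non-orthogonal data matter, the sketch below uses a small made-up unbalanced data set and base R only. With the default sequential (type I) sums of squares, the sum of squares attributed to a factor depends on the order in which terms enter the model, whereas type III sums of squares do not depend on that order. The factor names `a` and `b` and the values are hypothetical. This is not the code PPBstats runs internally.

```r
# Unbalanced two-factor layout: cell counts differ, so a and b are not orthogonal
d <- data.frame(
  a = c("a1", "a1", "a1", "a2", "a2", "a2"),
  b = c("b1", "b1", "b2", "b1", "b2", "b2"),
  y = c(1, 2, 4, 3, 6, 7)
)

# Sequential sums of squares with a entered first, then with a entered last
ss_a_first <- anova(lm(y ~ a + b, data = d))["a", "Sum Sq"]
ss_a_last  <- anova(lm(y ~ b + a, data = d))["a", "Sum Sq"]

# The two values differ because part of the variation is shared between a and b
print(c(a_first = ss_a_first, a_last = ss_a_last))
```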

Steps with PPBstats

For home away analysis, you can follow these steps (Figure \@ref(fig:main-workflow-family-4-HALF)):

Format data
data("data_agro_HA")
data_agro_HA = format_data_PPBstats(data_agro_HA, type = "data_agro_HA")
head(data_agro_HA)

Here, version indicates whether the germplasm is away or home, and group indicates the location the germplasm comes from.

Describe the data
p = plot(data_agro_HA, vec_variables = "y1", plot_type = "barplot")

p is a list with as many elements as variables. For each variable, there are three elements:

p$y1$home_away_merged
p$y1$home_away_merged_per_germplasm
p$y1$home_away_per_germplasm$`germ-1`$version
p$y1$home_away_per_germplasm$`germ-1`$origin

If you have several years in the data, you can set the argument f_grid = "year" in order to have one plot per year. In the example below, it is not really relevant because there is only one year in the data set!

p = plot(data_agro_HA, vec_variables = "y1", plot_type = "barplot", f_grid = "year")
p$y1$home_away_merged_per_germplasm
Run the model

To run the home away model on the dataset, use the function model_home_away(). You can run it on one variable.

out_ha = model_home_away(data_agro_HA, "y1")

out_ha is a list containing three elements :

out_ha$info
Check and visualize model outputs

The tests to check the model are explained in section \@ref(check-model-freq).

Check the model

Once the model is run, it is necessary to check whether the outputs can be trusted. This step is needed before going further in the analysis (in fact, the objects used by the following functions must come from check_model()).

out_check_ha = check_model(out_ha)

out_check_ha is a list containing four elements :

Visualize outputs

Once the computation is done, you can visualize the results with plot()

p_out_check_ha = plot(out_check_ha)

p_out_check_ha is a list with:

p_out_check_ha$variability_repartition
p_out_check_ha$variance_intra_germplasm
Get and visualize mean comparisons

The methods used to compute mean comparisons are explained in section \@ref(mean-comp-check-freq). Here, the computation is based on emmeans.

Get mean comparisons

Get mean comparisons with mean_comparisons().

out_mean_comparisons_ha = mean_comparisons(out_check_ha, p.adj = "tukey")

out_mean_comparisons_ha is a list of five elements:

Visualize mean comparisons
p_out_mean_comparisons_ha = plot(out_mean_comparisons_ha)

p_out_mean_comparisons_ha is a list of three elements with barplots :

For each element of the list, there are as many graphs as needed, with nb_parameters_per_plot parameters per graph. Letters are displayed on each bar. Parameters that do not share a letter are significantly different given the type I error (alpha) and the alpha correction. The type I error (alpha) and the alpha correction are displayed in the title.

When comparing versions for each germplasm, differences are displayed with stars. The stars correspond to the p-value:

| pvalue | stars |
| --- | --- |
| $< 0.001$ | `***` |
| $[0.001 , 0.05]$ | `**` |
| $[0.05 , 0.01]$ | `*` |
| $> 0.01$ | . |

pvg = p_out_mean_comparisons_ha$"version:germplasm"
pvg
pg = p_out_mean_comparisons_ha$germplasm
pg$`1`
pl = p_out_mean_comparisons_ha$location
pl$`1`
post hoc analysis to visualize variation repartition for several variables

First run the models

out_ha_2 = model_home_away(data_agro_HA, "y2")
out_ha_3 = model_home_away(data_agro_HA, "y3")

Then check the models

out_check_ha_2 = check_model(out_ha_2)
out_check_ha_3 = check_model(out_ha_3)
list_out_check_model = list("ha_1" = out_check_ha, "ha_2" = out_check_ha_2, "ha_3" = out_check_ha_3)
post_hoc_variation(list_out_check_model)
Apply the workflow to several variables

If you wish to apply the home away workflow to several variables, you can use lapply() with the following code:

workflow_home_away = function(x, data){
  out_home_away = model_home_away(data, variable = x)

  out_check_home_away = check_model(out_home_away)
  p_out_check_home_away = plot(out_check_home_away)

  out_mean_comparisons_home_away = mean_comparisons(out_check_home_away, p.adj = "bonferroni")
  p_out_mean_comparisons_home_away = plot(out_mean_comparisons_home_away)

  out = list(
    "out_home_away" = out_home_away,
    "out_check_home_away" = out_check_home_away,
    "p_out_check_home_away" = p_out_check_home_away,
    "out_mean_comparisons_home_away" = out_mean_comparisons_home_away,
    "p_out_mean_comparisons_home_away" = p_out_mean_comparisons_home_away
  )

  return(out)
}

vec_variables = c("y1", "y2", "y3")

out = lapply(vec_variables, workflow_home_away, data_agro_HA)
names(out) = vec_variables

list_out_check_model = list("ha_1" = out$y1$out_check_home_away, "ha_2" = out$y2$out_check_home_away, "ha_3" = out$y3$out_check_home_away)

p_post_hoc_variation = post_hoc_variation(list_out_check_model)
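When the number of variables grows, the list passed to post_hoc_variation() can also be assembled programmatically rather than written by hand. The sketch below shows the pattern in base R on a stand-in `out` list whose elements are plain character strings. In a real session, `out` would be the result of the lapply() call above and each element would hold PPBstats objects. The "ha_" naming scheme simply mirrors the names used above.

```r
# Stand-in for the `out` list returned by lapply(); real elements would be PPBstats objects
vec_variables <- c("y1", "y2", "y3")
out <- lapply(vec_variables, function(x) {
  list(out_check_home_away = paste("check for", x))
})
names(out) <- vec_variables

# Pull the checked model out of each per-variable result
list_out_check_model <- lapply(out, function(o) o$out_check_home_away)

# Rename the elements ha_1, ha_2, ... in the order of vec_variables
names(list_out_check_model) <- paste0("ha_", seq_along(list_out_check_model))

str(list_out_check_model)
```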

Local foreign {#local-foreign}

Another way to study the local adaptation of germplasm to their location of origin is to compare their behavior in their original location with their behavior in other locations: if the former is greater than the latter, the germplasm is better adapted to its original location than to the other locations.

Local in a location refers to a germplasm that has been grown or selected in a given location. Foreign in a location refers to a germplasm that has not been grown or selected in a given location.
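As a rough descriptive sketch of the local versus foreign contrast, assuming the comparison of interest is made within each location, the difference between the mean of local germplasm and the mean of foreign germplasm can be computed per location in base R. The data and column names below are made up for illustration and do not reflect the PPBstats data format or its model-based estimates.

```r
# Made-up observations: in each location, one local germplasm and two foreign ones
obs <- data.frame(
  location = c("loc-1", "loc-1", "loc-1", "loc-2", "loc-2", "loc-2"),
  version  = c("local", "foreign", "foreign", "local", "foreign", "foreign"),
  y1       = c(10, 8, 6, 9, 10, 12)
)

# Mean of the variable for each location x version cell
cell_means <- tapply(obs$y1, list(obs$location, obs$version), mean)

# Local minus foreign difference within each location
lf_diff <- cell_means[, "local"] - cell_means[, "foreign"]
print(lf_diff)
```

A positive difference in a location means that, in these raw data, the local germplasm outperformed the foreign ones there.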

The following model takes germplasm and location effects into account in order to better study the version (local or foreign) effect [@blanquart_practical_2013]:

$Y_{ijkm} = \mu + \alpha_i + \theta_j + \omega_{k_{ij}} + (\omega \times \theta)_{k_{ij}j} + rep(\theta)_{mj} + \varepsilon_{ijkm}; \quad \varepsilon_{ijkm} \sim \mathcal{N} (0,\sigma^2)$

with

As for the home away model, the version effect $\omega_{k_{ij}}$ gives a global measure of local adaptation of germplasm to their location of origin [@blanquart_practical_2013]. The interaction effect $(\omega \times \theta)_{k_{ij}j}$ gives information on specific adaptation to each germplasm.

If there is more than one year, the model can be written:

$Y_{ijklm} = \mu + \alpha_i + \theta_j + \beta_l + \omega_{k_{ij}} + (\theta \times \beta)_{jl} + (\omega \times \alpha)_{k_{ij}i} + rep(\theta \times \beta)_{mjl} + (\omega \times \alpha \times \beta)_{k_{ij}il} + \varepsilon_{ijklm}; \quad \varepsilon_{ijklm} \sim \mathcal{N} (0,\sigma^2)$

with

The interaction $(\omega \times \theta \times \beta)_{k_{ij}jl}$ gives information on specific adaptation to each germplasm for a given year.

A type III anova is done here as the data are not orthogonal.

Steps with PPBstats

For local foreign analysis, you can follow these steps (Figure \@ref(fig:main-workflow-family-4-HALF)):

Format data
data("data_agro_LF")
data_agro_LF = format_data_PPBstats(data_agro_LF, type = "data_agro_LF")
head(data_agro_LF)
Describe the data
p = plot(data_agro_LF, vec_variables = "y1", plot_type = "barplot")

p is a list with as many elements as variables. For each variable, there are three elements:

p$y1$local_foreign_merged
p$y1$local_foreign_merged_per_location
p$y1$local_foreign_per_location$`loc-1`$version
p$y1$local_foreign_per_location$`loc-1`$origin

If you have several years in the data, you can set the argument f_grid = "year" in order to have one plot per year. In the example below, it is not really relevant because there is only one year in the data set!

p = plot(data_agro_LF, vec_variables = "y1", plot_type = "barplot", f_grid = "year")
p$y1$local_foreign_merged_per_location
Run the model

To run the local foreign model on the dataset, use the function model_local_foreign(). You can run it on one variable.

out_lf = model_local_foreign(data_agro_LF, "y1")

out_lf is a list containing three elements :

out_lf$info
Check and visualize model outputs

The tests to check the model are explained in section \@ref(check-model-freq).

Check the model

Once the model is run, it is necessary to check whether the outputs can be trusted. This step is needed before going further in the analysis (in fact, the objects used by the following functions must come from check_model()).

out_check_lf = check_model(out_lf)

out_check_lf is a list containing four elements :

Visualize outputs

Once the computation is done, you can visualize the results with plot()

p_out_check_lf = plot(out_check_lf)

p_out_check_lf is a list with:

p_out_check_lf$variability_repartition
p_out_check_lf$variance_intra_germplasm
Get and visualize mean comparisons

The methods used to compute mean comparisons are explained in section \@ref(mean-comp-check-freq). Here, the computation is based on emmeans.

Get mean comparisons

Get mean comparisons with mean_comparisons().

out_mean_comparisons_lf = mean_comparisons(out_check_lf, p.adj = "tukey")

out_mean_comparisons_lf is a list of five elements:

Visualize mean comparisons
p_out_mean_comparisons_lf = plot(out_mean_comparisons_lf)

p_out_mean_comparisons_lf is a list of three elements with barplots :

For each element of the list, there are as many graphs as needed, with nb_parameters_per_plot parameters per graph. Letters are displayed on each bar. Parameters that do not share a letter are significantly different given the type I error (alpha) and the alpha correction. The type I error (alpha) and the alpha correction are displayed in the title.

When comparing versions for each germplasm, differences are displayed with stars. The stars correspond to the p-value:

| pvalue | stars |
| --- | --- |
| $< 0.001$ | `***` |
| $[0.001 , 0.05]$ | `**` |
| $[0.05 , 0.01]$ | `*` |
| $> 0.01$ | . |

pvg = p_out_mean_comparisons_lf$"version:location"
pvg
pg = p_out_mean_comparisons_lf$germplasm
pg$`1`
pl = p_out_mean_comparisons_lf$location
pl$`1`
post hoc analysis to visualize variation repartition for several variables

First run the models

out_lf_2 = model_local_foreign(data_agro_LF, "y2")
out_lf_3 = model_local_foreign(data_agro_LF, "y3")

Then check the models

out_check_lf_2 = check_model(out_lf_2)
out_check_lf_3 = check_model(out_lf_3)
list_out_check_model = list("lf_1" = out_check_lf, "lf_2" = out_check_lf_2, "lf_3" = out_check_lf_3)
post_hoc_variation(list_out_check_model)
Apply the workflow to several variables

If you wish to apply the local foreign workflow to several variables, you can use lapply() with the following code:

workflow_local_foreign = function(x, data){
  out_local_foreign = model_local_foreign(data, variable = x)

  out_check_local_foreign = check_model(out_local_foreign)
  p_out_check_local_foreign = plot(out_check_local_foreign)

  out_mean_comparisons_local_foreign = mean_comparisons(out_check_local_foreign, p.adj = "bonferroni")
  p_out_mean_comparisons_local_foreign = plot(out_mean_comparisons_local_foreign)

  out = list(
    "out_local_foreign" = out_local_foreign,
    "out_check_local_foreign" = out_check_local_foreign,
    "p_out_check_local_foreign" = p_out_check_local_foreign,
    "out_mean_comparisons_local_foreign" = out_mean_comparisons_local_foreign,
    "p_out_mean_comparisons_local_foreign" = p_out_mean_comparisons_local_foreign
  )

  return(out)
}

vec_variables = c("y1", "y2", "y3")

out = lapply(vec_variables, workflow_local_foreign, data_agro_LF)
names(out) = vec_variables

list_out_check_model = list("lf_1" = out$y1$out_check_local_foreign, "lf_2" = out$y2$out_check_local_foreign, "lf_3" = out$y3$out_check_local_foreign)

p_post_hoc_variation = post_hoc_variation(list_out_check_model)


priviere/PPBstats documentation built on May 6, 2021, 1:20 a.m.