
The `gwas` function calculates the likelihood ratio for each marker under the empirical Bayesian framework. The method allows analysis with multiple populations. `gwas2` is computationally optimized. `gwas3` was designed for multiple random populations.


`y` |
Numeric vector of observations ( |

`gen` |
Numeric matrix containing the genotypic data. A matrix with |

`fam` |
Numeric vector of length |

`chr` |
Numeric vector indicating the number of markers in each chromosome. The sum of |

`window` |
Numeric. If specified, genetic distance between markers is used for moving window strategy. Window must be specified in Morgans ( |

`fixed` |
Logical. If TRUE, markers are treated as fixed effects and hence evaluated through Wald statistics. If markers are specified as fixed, the argument 'window' is not applicable. |

`EIG` |
Output of the R function 'eigen'. It is used for user-defined kinship matrix. |

`cov` |
Numeric vector of length |

`Phe` |
Numeric matrix of observations ( |

`ge` |
Logical. If TRUE, meta-analysis (function gwasGE) will be done for the |

`ammi` |
Integer. It indicates the number of principal components used to represent |

`ByEnv` |
List of objected output from |

The empirical Bayes model (Wang et al. 2016) is recreated with a special incidence matrix to optimize the information provided by the subpopulations. Each locus is recoded as a vector with length *f* equal to the number of subpopulations, or NAM families, as the interaction of locus by family. For example, a heterozygous locus from an individual from subpopulation 2 is coded as [ 1, 0, 1, ... , *f* ], a locus homozygous for the reference allele from any subpopulation is coded as [ 2, 0, 0, ... , *f* ] and a locus homozygous for the founder allele from an individual from subpopulation 1 is coded as [ 0, 2, 0, ... , *f* ]. The base model for genome scanning is described by:

*y = Xb + Zu + g + e*

That includes the fixed effect (*Xb*), the marker (*Zu*), the polygene (*g*) and the residuals (*e*). If the *window* term is specified, the model for genome scanning is expanded as follows:

*y = Xb + Zu[k-1] + Zu + Zu[k+1] + g - g[k] + e*

This model includes three extra terms: the left-side genome (*Zu[k-1]*), the right-side genome (*Zu[k+1]*) and the subtraction of the window polygene (*-g[k]*). Windows are based on genetic distance, which is computed using the Kosambi map function. The recombination rate is estimated under the assumption that markers are ordered and that genotypes are recombinant inbred lines.
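As a rough illustration of how genetic distance can define a window, the base R sketch below applies the Kosambi map function to recombination fractions between adjacent ordered markers. The recombination fractions, the helper name `kosambi`, and the rule "a marker belongs to the window if it lies within `window` Morgans of the focal marker" are assumptions made for this example; the package's internal implementation may differ.

```r
# Kosambi map function: recombination fraction r (0 <= r < 0.5) -> Morgans
kosambi <- function(r) 0.25 * log((1 + 2 * r) / (1 - 2 * r))

# Hypothetical recombination fractions between five adjacent, ordered markers
r_adjacent <- c(0.01, 0.05, 0.10, 0.02)

# Cumulative map position (Morgans), placing the first marker at 0
pos <- c(0, cumsum(kosambi(r_adjacent)))

# Markers lying within 0.1 Morgans of the focal marker k (assumed window rule)
k <- 3
window <- 0.1
in_window <- which(abs(pos - pos[k]) <= window)
print(round(pos, 4))
print(in_window)
```

Under this assumed rule, markers outside `in_window` would contribute to the left-side and right-side genome terms rather than to the window itself.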

The polygenic term is calculated only once (Zhang et al. 2010) using eigendecomposition with a GEMMA-like algorithm (Zhou and Stephens 2012). Efficient inversion of the capacitance matrix is obtained through the Woodbury matrix identity. Models and algorithms are described in more detail by Xavier et al. (2015) and Wei and Xu (2016).
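The Woodbury step can be illustrated with a small base R sketch. It is not the package's code: the matrices `Z`, `G` and the residual variance are hypothetical, and the example only checks numerically that inverting the small *f* x *f* capacitance matrix reproduces the direct inverse of the full *n* x *n* covariance matrix.

```r
set.seed(1)
n <- 50   # number of observations (hypothetical)
f <- 3    # number of marker-by-family effects (hypothetical)

Z <- matrix(rnorm(n * f), n, f)   # hypothetical incidence matrix
G <- diag(c(0.5, 1.0, 2.0))       # hypothetical covariance of marker effects
ve <- 1.5                         # hypothetical residual variance
Rinv <- diag(1 / ve, n)           # inverse of the diagonal residual covariance

# Capacitance matrix: G^{-1} + Z' R^{-1} Z, only f x f
Cap <- solve(G) + t(Z) %*% Rinv %*% Z

# Woodbury: (R + Z G Z')^{-1} = R^{-1} - R^{-1} Z Cap^{-1} Z' R^{-1}
V_inv_woodbury <- Rinv - Rinv %*% Z %*% solve(Cap, t(Z) %*% Rinv)

# Direct n x n inversion, for comparison only
V_inv_direct <- solve(diag(ve, n) + Z %*% G %*% t(Z))

max_diff <- max(abs(V_inv_woodbury - V_inv_direct))
print(max_diff)
```

The benefit is that the only matrix passed to `solve` for each marker is *f* x *f*, which stays small regardless of the number of individuals.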

To analyze large datasets, one can avoid memory issues by using the function *gwas2*, but note that the argument 'window' is not implemented for *gwas2*. This function also allows a user-defined kinship through the argument EIG, and the use of a numeric covariate vector through the argument cov.
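Since EIG is documented as the output of the R function `eigen`, a user-defined kinship could be prepared along the following lines. This is a minimal sketch with simulated genotypes; the kinship definition used here (cross-product of centered genotypes divided by the number of markers) is an assumption chosen for illustration, and any symmetric relationship matrix could take its place.

```r
set.seed(2)
n <- 20    # individuals (hypothetical)
m <- 100   # markers (hypothetical)

# Simulated genotypes coded 0/1/2; a real analysis would use the 'gen' matrix
gen <- matrix(sample(0:2, n * m, replace = TRUE), n, m)

# Assumed kinship: cross-product of column-centered genotypes
W <- scale(gen, center = TRUE, scale = FALSE)
K <- tcrossprod(W) / m

# Eigendecomposition of the kinship, i.e. the kind of object EIG expects
EIG <- eigen(K, symmetric = TRUE)

# Sanity check: the decomposition reconstructs K
K_rebuilt <- EIG$vectors %*% diag(EIG$values) %*% t(EIG$vectors)
print(max(abs(K - K_rebuilt)))

# The object would then be supplied through the EIG argument, e.g.
# gwas2(y, gen, fam, chr, EIG = EIG)   # not run here; requires the NAM package
```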

When multi-environmental trials are the target of mapping, one may use the function *gwasGE* to perform the analysis by environment, followed by a "meta-analysis" that combines the results. This strategy provides an idea of the variation in QTL effects due to environment, genetic background (provided by the stratification factor) and the interaction between environment and genetics.

An alternative to this method is mega-analysis, where one can provide the stratification factor as a combination of subpopulation and environment. Meta-analysis can be performed in a single step with the function `gwasGE`, or users can perform multiple association analyses using `gwas3` and then perform meta-analysis with `meta3`. In `gwasGE`, the same genotype will often appear more than once in the phenotypic and genotypic data, so that phenotypes are provided as a matrix. The statistical details of the meta-analysis are available in the vignette *Background for Meta-analysis*.
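For the mega-analysis route, the combined stratification factor can be built in base R before calling any association function. The sketch below is an assumption about how one might do this, with made-up `fam` and `env` vectors for data stacked across two environments; it only shows how each subpopulation-by-environment combination receives its own numeric code.

```r
# Hypothetical stacked data: 8 records, 2 families observed in 2 environments
fam <- c(1, 1, 2, 2, 1, 1, 2, 2)
env <- c("E1", "E1", "E1", "E1", "E2", "E2", "E2", "E2")

# One numeric level per family-by-environment combination
fam_env <- as.integer(interaction(fam, env, drop = TRUE))

print(fam_env)
print(table(fam_env))
```

The resulting `fam_env` vector would take the place of `fam` as the stratification factor, with the phenotype and genotype rows stacked in the same order.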

The function `gwas3` is an alternative for association analysis and meta-analysis, also solved in the empirical Bayes framework for multiple populations. Unlike `gwas`, `gwas2` and `gwasGE`, this function does not set a reference allele and analyzes each marker as the interaction of allele by stratification factor (i.e., family or subpopulation). Therefore, `gwas3` is compatible with any allele coding.
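For contrast, the reference-allele coding described earlier for the locus-by-family incidence matrix can be sketched in base R. The function `recode_locus` and its input `alt` (the count of non-reference, founder alleles per individual) are hypothetical names introduced for this example, and the layout of one reference column followed by one column per family is an interpretation of the coded examples given above, not the package's internal code.

```r
# Recode one locus as reference column + one column per family (assumed layout)
recode_locus <- function(alt, fam, f) {
  stopifnot(length(alt) == length(fam),
            all(alt %in% 0:2),
            all(fam %in% seq_len(f)))
  X <- matrix(0, nrow = length(alt), ncol = f + 1,
              dimnames = list(NULL, c("ref", paste0("fam", seq_len(f)))))
  X[, "ref"] <- 2 - alt                       # copies of the reference allele
  X[cbind(seq_along(alt), fam + 1)] <- alt    # founder alleles, by family
  X
}

# Three individuals: heterozygous in family 2, homozygous reference in
# family 3, homozygous founder allele in family 1
alt <- c(1, 0, 2)
fam <- c(2, 3, 1)
X <- recode_locus(alt, fam, f = 3)
print(X)
```

Each row sums to two alleles, and the first, second and third rows correspond to the [ 1, 0, 1, ... ], [ 2, 0, 0, ... ] and [ 0, 2, 0, ... ] examples respectively.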

For further statistical background:

1) `system(paste('open',system.file("doc","gwa_description.pdf",package="NAM")))`

2) `system(paste('open',system.file("doc","gwa_ge_interactions.pdf",package="NAM")))`

The function gwas returns a list containing the method deployed (*Method*), a summary of predicted parameters and statistical tests (*PolyTest*), estimated genetic map for NAM panels (*MAP*) and the marker names (*SNPs*).

Alencar Xavier, Tiago Pimenta, Qishan Wang and Shizhong Xu

Wang, Q., Wei, J., Pan, Y., & Xu, S. (2016). An efficient empirical Bayes method for genomewide association studies. Journal of Animal Breeding and Genetics, 133(4), 253-263.

Wei, J., & Xu, S. (2016). A Random Model Approach to QTL Mapping in Multi-parent Advanced Generation Inter-cross (MAGIC) Populations. Genetics, 202(2), 471-486.

Xavier, A., Xu, S., Muir, W. M., & Rainey, K. M. (2015). NAM: Association Studies in Multiple Populations. Bioinformatics, 31(23), 3862-3864.

Zhang, Z., et al. (2010). Mixed linear model approach adapted for genome-wide association studies. Nature Genetics, 42, 355-360.

Zhou, X., & Stephens, M. (2012). Genome-wide efficient mixed-model analysis for association studies. Nature genetics, 44(7), 821-824.
