View source: R/association_functions.R

calc_association    R Documentation

Run case/control or quantitative association tests on SNP data.

Description

Runs several different association tests on SNP data. The response variable must have only two different categories (as in case/control) for most test types, although the "gmmat.score" method supports quantitative traits. Tests may be broken up by sample-specific facets.

Usage

calc_association(
  x,
  facets = NULL,
  response,
  method = "gmmat.score",
  w = c(0, 1, 2),
  formula = NULL,
  family.override = FALSE,
  maxiter = 500,
  sampleID = NULL,
  Gmaf = 0,
  par = FALSE,
  cleanup = TRUE,
  verbose = FALSE
)

Arguments

x

snpRdata object

facets

character, default NULL. Categorical metadata variables by which to break up analysis. See Facets_in_snpR for more details.

response

character. Name of the column containing the response variable of interest. Must match a column name in sample metadata. For most methods, the response must be categorical with exactly two categories; the "gmmat.score" method also supports quantitative responses.

method

character, default "gmmat.score". Specifies association method. Options:

  • gmmat.score: Mixed linear model approach corrected for population/family structure, based on Chen et al. (2016).

  • armitage: Armitage association test, based on Armitage (1955).

  • odds_ratio: Log odds ratio test.

  • chisq: Chi-squared test.

See Details for more information.

w

numeric, default c(0, 1, 2). Weight for each genotype (homozygote 1, heterozygote, homozygote 2) when using the Armitage association method. See Details.

formula

character, default NULL, in which case response ~ 1 is used. Null formula for the response variable, as described in formula.

family.override

family object or FALSE, default FALSE. Provides an alternative model family object to use for GMMAT GWAS regression. If FALSE, uses gaussian, link = "identity" for a quantitative phenotype and binomial, link = "logit" for a categorical phenotype.

maxiter

numeric, default 500. Maximum iterations to use when fitting the glmm when using the gmmat.score option.

sampleID

character, default NULL. Optional, the name of a column in the sample metadata to use as a sampleID when using the gmmat.score option.

Gmaf

numeric, default 0. If using the "gmmat.score" option, provides an optional minor allele frequency filter used when constructing the relatedness matrix.

par

numeric or FALSE, default FALSE. Number of parallel cores to use for computation.

cleanup

logical, default TRUE. If TRUE, files produced by and for the "gmmat.score" option will be removed after completion.

verbose

logical, default FALSE. If TRUE, some progress updates will be printed to the console.
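
The family objects named under family.override are ordinary R model families built with the constructors in the stats package. The sketch below only shows how the two defaults named above can be constructed and inspected; it does not call calc_association, and the variable names are illustrative.

```r
# Build the two default families named in the family.override description.
quantitative_family <- gaussian(link = "identity")
categorical_family  <- binomial(link = "logit")

# Each family object records its error distribution and link function.
family_summary <- data.frame(
  family = c(quantitative_family$family, categorical_family$family),
  link   = c(quantitative_family$link, categorical_family$link)
)
print(family_summary)
```

An object built this way is the kind of value family.override is described as accepting; whether a given alternative family suits a particular phenotype is a modelling decision that this sketch does not address.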

Details

Four methods are available: "gmmat.score", "armitage", "odds_ratio", and "chisq". For the Armitage approach, weights should be provided to the "w" argument, which specifies the weight for each possible genotype (homozygote 1, heterozygote, homozygote 2). The default, c(0, 1, 2), specifies an additive model. The "gmmat.score" method uses the approach described in Chen et al. (2016) and implemented in the glmmkin and glmm.score functions of the GMMAT package. For this method, a 'G' genetic relatedness matrix is first created using the Gmatrix function of the AGHmatrix package according to Yang et al. (2010).
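
To illustrate how the genotype weights change the model being tested, the sketch below applies a trend test to a single SNP with base R's prop.trend.test, a chi-squared test for trend in proportions that is closely related to the Armitage test. This is a stand-in for illustration, not snpR's own implementation, and the genotype counts are made up.

```r
# Hypothetical genotype counts at one SNP (illustrative only).
# Order: homozygote 1, heterozygote, homozygote 2.
cases    <- c(10, 25, 30)
controls <- c(30, 25, 10)
totals   <- cases + controls

# Additive model: evenly spaced weights, as in the default w = c(0, 1, 2).
additive <- prop.trend.test(cases, totals, score = c(0, 1, 2))

# Dominant model for allele 2: heterozygotes score like homozygote 2.
dominant <- prop.trend.test(cases, totals, score = c(0, 1, 1))

# Recessive model for allele 2: only homozygote 2 carries weight.
recessive <- prop.trend.test(cases, totals, score = c(0, 0, 1))

# Collect the p-value from each weighting scheme.
p_values <- sapply(
  list(additive = additive, dominant = dominant, recessive = recessive),
  function(test) test$p.value
)
print(p_values)
```

In calc_association, the equivalent choice would be made through the "w" argument, for example w = c(0, 1, 1) for the dominant weighting shown here.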

Responses with more than two categories are not currently supported; support is under development.

Facets are specified as described in Facets_in_snpR. NULL and "all" facet specifications function as described.

Note that if all individuals in one facet level have one of the two possible phenotypes, each association test will return NA or NaN. As a result, if the response variable overlaps perfectly with the facet variable, all results will be NA or NaN.
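
One way to spot this situation before running a test is to cross-tabulate the facet variable against the response. The sketch below uses a small made-up metadata table in base R; with a snpRdata object, the same columns would come from the sample metadata (for example via sample.meta(x)). The column names "pop" and "phenotype" are illustrative.

```r
# Hypothetical sample metadata (illustrative only): every sample in
# population "B" is a case, so that facet level has a single phenotype.
meta <- data.frame(
  pop       = c("A", "A", "A", "A", "B", "B", "B"),
  phenotype = c("case", "control", "case", "control", "case", "case", "case")
)

# Cross-tabulate facet levels (rows) against phenotypes (columns).
counts <- table(meta$pop, meta$phenotype)
print(counts)

# Facet levels in which some phenotype never occurs.
untestable <- rownames(counts)[rowSums(counts == 0) > 0]
print(untestable)
```

Any facet level listed in untestable contains only one phenotype, which is the case described above as returning NA or NaN.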

If the chisq method is used, a column will also be returned that specifies which allele (major or minor) is associated with which phenotype.

Value

A snpRdata object with the resulting association test results merged into the stats socket.

Author(s)

William Hemstrom

Keming Su

Avani Chitre

References

Armitage (1955). Tests for Linear Trends in Proportions and Frequencies. Biometrics.

Chen et al. (2016). Control for Population Structure and Relatedness for Binary Traits in Genetic Association Studies via Logistic Mixed Models. American Journal of Human Genetics.

Yang et al. (2010). Common SNPs explain a large proportion of the heritability for human height. Nature Genetics.

Examples

  # add a dummy phenotype and run an association test.
  x <- stickSNPs
  sample.meta(x)$phenotype <- sample(c("A", "B"), nsamps(stickSNPs), TRUE)
  x <- calc_association(x, facets = c("pop"), response = "phenotype", method = "armitage")
  get.snpR.stats(x, "pop", "association")


hemstrow/snpR documentation built on March 20, 2024, 7:03 a.m.