mPhen: A function for the genetic association testing of multiple...

Description Usage Arguments Value Note Author(s) References Examples

View source: R/mPhen.R

Description

mPhen performs association testing between genetic variants (SNPs; CNVs) and multiple phenotypes. The primary purpose is for modelling and testing multiple phenotypes jointly by performing an ordinal regression where SNPs are treated as the outcome and multiple phenotypes are predictors; this can have large increases in statistical power to detect genotype-phenotype associations over the univariate approach (method described in O'Reilly et al. 2012, see below). However, mPhen can also be used to perform standard univariate linear regression (SNP as predictor) and univariate ordinal regression (SNP as outcome) on the phenotypes under study. mPhen can be applied to directly genotyped or imputed data.

Usage

1
2
3
mPhen(genoData, phenoData, phenotypes = "all",
      covariates = NULL, resids = NULL, strats = NULL, 
      opts = mPhen.options(c("regression","pheno.input")))

Arguments

genoData

Either a matrix (for directly measured genotypes) or 3 dimensional array (for imputed genotypes). The first dimension (rows) corresponds to individuals, and row.names are inidividual IDs. The second dimension corresponds to SNPS, with col.names equal to the snp identifiers. For directly measured genotypes, the value in each cell is a numeric genotype( i.e. AA = 0, AB=1, BB = 2). For imputed genotypes, the 3rd dimension corresponds to genotypes (with dimnames(genoData)[[3]]) equal to a numeric vector corresponding to genotype values. The values in these cells are the probability of each genotype multiplied by 1000. For copy number genotypes the numeric values correspond to numbers of copies. An example provided by 'snps' and 'snps.imputed'

phenoData

Matrix containing phenotype data, where each row corresponds to an individual and row.names are individual IDs. Each column contains data on a certain phenotype across the sample of individuals (can be quantitative, case/control or ordinal. Must be numeric); the column header provides the phenotype name. An example is provided by 'pheno'.

phenotypes

Vector of phenotype names, to be tested. If value is 'all' then all phenotypes are included after removing covariates and residuals.

covariates

Vector of phenotypes, from phenoData, to be considered as covariates to be controlled for in the regression (Default is no covariates).

resids

Vector of residuals, from phenoData, alternative way to adjust for covariates, which pre-calculates offset terms to use in the per SNP regression (Default is no residuals).

strats

Statification vector (i.e. cases/controls, exposed/not exposed, male/female etc), from phenoData (Default is no stratification).

opts

A list of options, which is obtained from mPhen.options(c("regression","pheno.input")). To get more information about these options, type mPhen.options(c("regression","pheno.input"),descr=TRUE)

Value

Returns a list, with two items. The first item (Results) is a Results is a 4 dimensional matrix, with dimensions [strata, snps, phenotypes, result_type], where result_type includes beta, pvalue and Nobs. The second item is a vector of minor allele frequencies.

Note

The user should remember that the genotype data file is always a matrix of at least a column, hence if taking a subset of 1 SNP in the non-imputed genotype data matrix, the option drop = FALSE should be used (see the example below)

Author(s)

Lachlan Coin, Federico Calboli, Clive Hoggart, Paul O'Reilly, Yotsawat Pomyen.

Maintainer, Federico Calboli f.calboli@imperial.ac.uk

References

O'Reilly et al. 2012. MultiPhen: Joint model of multiple phenotypes can increase discovery in GWAS. http://dx.plos.org/10.1371/journal.pone.0034861

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
data(snps); data(snps.imputed); data(pheno)
opts = mPhen.options(c("regression","pheno.input"))
res = mPhen(snps, pheno, phenotypes = "all",
      covariates = c('testPheno3', 'testPheno4'),opts = opts) 
# performs a MultiPhen analysis, with snp as outcome,
# and phenotypes testPheno1, testPheno2 as predictors, 
#with testPheno3 and testPheno4 as covariates using ordinal regression

res = mPhen(snps, pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = c('testPheno3', 'testPheno4'), resids = 'testPheno5', opts = opts) 
# the same as above, with the fifth phenotype as residual 

res = mPhen(snps[,2, drop = FALSE], pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = 'testPheno3',  opts = opts) 
# please note the use use of drop = FALSE if analysing only one SNP


res = mPhen(snps.imputed, pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = 'testPheno3',  opts = opts) 
# for imputed data

Example output

Loading required package: MASS
Loading required package: abind
Loading required package: epitools
Loading required package: meta
Loading 'meta' package (version 4.9-5).
Type 'help(meta)' for a brief overview.
[1] "excluding 0 samples based on exclusion criteria"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
Warning message:
In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
NULL
[1] "polr failed, using Gaussian"
NULL
[1] "polr failed, using Gaussian"
NULL
[1] "polr failed, using Gaussian"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
4: glm.fit: fitted probabilities numerically 0 or 1 occurred 
5: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
6: glm.fit: fitted probabilities numerically 0 or 1 occurred 

MultiPhen documentation built on Feb. 9, 2020, 5:07 p.m.