mPhen: A function for the genetic association testing of multiple...
In MultiPhen: A Package to Test for Multi-Trait Association

Description Usage Arguments Value Note Author(s) References Examples

mPhen performs association testing between genetic variants (SNPs; CNVs) and multiple phenotypes. The primary purpose is for modelling and testing multiple phenotypes jointly by performing an ordinal regression where SNPs are treated as the outcome and multiple phenotypes are predictors; this can have large increases in statistical power to detect genotype-phenotype associations over the univariate approach (method described in O'Reilly et al. 2012, see below). However, mPhen can also be used to perform standard univariate linear regression (SNP as predictor) and univariate ordinal regression (SNP as outcome) on the phenotypes under study. mPhen can be applied to directly genotyped or imputed data.

1
2
3

mPhen(genoData, phenoData, phenotypes = "all",
      covariates = NULL, resids = NULL, strats = NULL, 
      opts = mPhen.options(c("regression","pheno.input")))

`genoData`	Either a matrix (for directly measured genotypes) or 3 dimensional array (for imputed genotypes). The first dimension (rows) corresponds to individuals, and row.names are inidividual IDs. The second dimension corresponds to SNPS, with col.names equal to the snp identifiers. For directly measured genotypes, the value in each cell is a numeric genotype( i.e. AA = 0, AB=1, BB = 2). For imputed genotypes, the 3rd dimension corresponds to genotypes (with dimnames(genoData)[[3]]) equal to a numeric vector corresponding to genotype values. The values in these cells are the probability of each genotype multiplied by 1000. For copy number genotypes the numeric values correspond to numbers of copies. An example provided by 'snps' and 'snps.imputed'
`phenoData`	Matrix containing phenotype data, where each row corresponds to an individual and row.names are individual IDs. Each column contains data on a certain phenotype across the sample of individuals (can be quantitative, case/control or ordinal. Must be numeric); the column header provides the phenotype name. An example is provided by 'pheno'.
`phenotypes`	Vector of phenotype names, to be tested. If value is 'all' then all phenotypes are included after removing covariates and residuals.
`covariates`	Vector of phenotypes, from phenoData, to be considered as covariates to be controlled for in the regression (Default is no covariates).
`resids`	Vector of residuals, from phenoData, alternative way to adjust for covariates, which pre-calculates offset terms to use in the per SNP regression (Default is no residuals).
`strats`	Statification vector (i.e. cases/controls, exposed/not exposed, male/female etc), from phenoData (Default is no stratification).

opts

A list of options, which is obtained from mPhen.options(c("regression","pheno.input")). To get more information about these options, type mPhen.options(c("regression","pheno.input"),descr=TRUE)

Returns a list, with two items. The first item (Results) is a Results is a 4 dimensional matrix, with dimensions [strata, snps, phenotypes, result_type], where result_type includes beta, pvalue and Nobs. The second item is a vector of minor allele frequencies.

The user should remember that the genotype data file is always a matrix of at least a column, hence if taking a subset of 1 SNP in the non-imputed genotype data matrix, the option drop = FALSE should be used (see the example below)

Lachlan Coin, Federico Calboli, Clive Hoggart, Paul O'Reilly, Yotsawat Pomyen.

Maintainer, Federico Calboli f.calboli@imperial.ac.uk

O'Reilly et al. 2012. MultiPhen: Joint model of multiple phenotypes can increase discovery in GWAS. http://dx.plos.org/10.1371/journal.pone.0034861

data(snps); data(snps.imputed); data(pheno)
opts = mPhen.options(c("regression","pheno.input"))
res = mPhen(snps, pheno, phenotypes = "all",
      covariates = c('testPheno3', 'testPheno4'),opts = opts) 
# performs a MultiPhen analysis, with snp as outcome,
# and phenotypes testPheno1, testPheno2 as predictors, 
#with testPheno3 and testPheno4 as covariates using ordinal regression

res = mPhen(snps, pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = c('testPheno3', 'testPheno4'), resids = 'testPheno5', opts = opts) 
# the same as above, with the fifth phenotype as residual 

res = mPhen(snps[,2, drop = FALSE], pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = 'testPheno3',  opts = opts) 
# please note the use use of drop = FALSE if analysing only one SNP


res = mPhen(snps.imputed, pheno, phenotypes = c('testPheno1', 'testPheno2'), 
      covariates = 'testPheno3',  opts = opts) 
# for imputed data

Loading required package: MASS
Loading required package: abind
Loading required package: epitools
Loading required package: meta
Loading 'meta' package (version 4.9-5).
Type 'help(meta)' for a brief overview.
[1] "excluding 0 samples based on exclusion criteria"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
Warning message:
In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
[1] "excluding 0 samples based on exclusion criteria"
NULL
[1] "polr failed, using Gaussian"
NULL
[1] "polr failed, using Gaussian"
NULL
[1] "polr failed, using Gaussian"
Warning messages:
1: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
2: glm.fit: fitted probabilities numerically 0 or 1 occurred 
3: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
4: glm.fit: fitted probabilities numerically 0 or 1 occurred 
5: In var(var1, na.rm = TRUE) :
  Calling var(x) on a factor x is deprecated and will become an error.
  Use something like 'all(duplicated(x)[-1L])' to test for a constant vector.
6: glm.fit: fitted probabilities numerically 0 or 1 occurred