maps: Multi-locus Association test for a Primary trait and its...


Description

maps performs a multi-locus association test for a dichotomous primary trait and a quantitative secondary phenotype. It adopts a random-effects model with two variance components, and the SNPs can come from a gene or from any set selected by the user. The test allows missing genotypes as long as they are missing completely at random (MCAR).

Usage

maps(data, formula = NULL, subset = NULL, nperm = 1e+05, 
       rho = seq(-1, 1, length.out = 21), 
       kappa = seq(0, 1, len = 21), na.rm = FALSE, 
       seed = 0, nthread = NULL, plot.pval = FALSE)

Arguments

data

a data frame containing all variables specified in formula.

formula

an object of class "Formula". The user can specify two outcomes on the left-hand side. The covariates and the variables of interest can be specified on the right-hand side. See 'Details' below.

subset

an optional vector specifying a subset of observations to be used in the testing process.

nperm

an integer specifying the number of replicates used in the Monte Carlo test.

rho

a vector of values of the effect correlation parameter used in the tuning process. Specific choices of rho lead to special cases of the MAPS test. See 'Details' below.

kappa

a vector of values of the variance component proportion parameter used in the tuning process. Specific choices of kappa lead to special cases of the MAPS test. See 'Details' below.

na.rm

a logical indicating whether to exclude individuals with missing values on the variables of interest. The current version always sets it to FALSE, because using more samples can increase power. See 'Details' below.

seed

an integer used as the random seed.

nthread

an integer specifying the number of threads used to parallelize the Monte Carlo test. By default, all available threads are used.

plot.pval

a logical indicating whether to draw a plot of the unadjusted p-values for each pair of rho and kappa. See 'Details' below.

Details

The formula is parsed by the package Formula. On the left-hand side, the binary outcome must be specified first, followed by the continuous outcome, separated by |. On the right-hand side, the covariates must be specified first, followed by the variables of interest, separated by |. A valid formula, e.g.,

SMOKE | CIG_PER_DAY ~ AGE + SEX | SNP1 + SNP2 + SNP3

means that the variable SMOKE is the binary outcome and CIG_PER_DAY is the continuous outcome. Both outcomes are adjusted for AGE and SEX. The function tests the joint association effect of the three SNPs, i.e., SNP1, SNP2, and SNP3.
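The layout of such a two-part formula can be inspected directly with the Formula package. The snippet below is a minimal sketch that only splits the example formula into its parts; the object names (binary_part, snp_names, and so on) are made up for illustration, and it does not show how maps itself processes the formula internally.

```r
# Sketch only: splits the example formula with the Formula package.
# Object names are illustrative and are not part of maps.
library(Formula)

f <- Formula(SMOKE | CIG_PER_DAY ~ AGE + SEX | SNP1 + SNP2 + SNP3)

# First left-hand part (binary outcome) with the covariates
binary_part <- formula(f, lhs = 1, rhs = 1)
# Second left-hand part (continuous outcome) with the covariates
continuous_part <- formula(f, lhs = 2, rhs = 1)
# Second right-hand part only: the variables of interest (SNPs)
snp_part <- formula(f, lhs = 0, rhs = 2)

outcome_names <- c(all.vars(binary_part)[1], all.vars(continuous_part)[1])
covariate_names <- all.vars(formula(f, lhs = 0, rhs = 1))
snp_names <- all.vars(snp_part)

print(outcome_names)
print(covariate_names)
print(snp_names)
```

Swapping the order of the two parts on either side would change which variables are treated as outcomes, covariates, or SNPs, so the order matters.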

Ge's minp algorithm is used to evaluate the final p-value, accounting for the multiple comparisons introduced by tuning the parameters rho and kappa. It produces unadjusted statistics, which are returned as $obs.rank. See 'Value' below. The generic function plot can be used to visualize these statistics (standardized as p-values), which gives some intuition about the optimally chosen $rho.opt and $kappa.opt.
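The idea behind a min-p adjustment can be illustrated with a small base R sketch. It is a simplified single-step version written for this page, not the package's implementation: the null and observed statistics are simulated, and every name in it (null_stat, obs_stat, p_adj, and so on) is hypothetical. Each column stands for one candidate pair of rho and kappa.

```r
# Simplified single-step min-p adjustment on simulated statistics.
# Illustration only; not the code used inside maps.
set.seed(1)
B <- 2000  # number of Monte Carlo replicates (hypothetical)
K <- 5     # number of tuning-parameter pairs (hypothetical)

null_stat <- matrix(rchisq(B * K, df = 1), nrow = B, ncol = K)
obs_stat <- rchisq(K, df = 1) + 3  # made-up observed statistics

# Unadjusted p-value per pair: share of null replicates at least as large
p_obs <- (colSums(sweep(null_stat, 2, obs_stat, ">=")) + 1) / (B + 1)

# p-value of every null replicate within its own column
p_null <- apply(null_stat, 2, function(s) {
  (B + 1 - rank(s, ties.method = "min")) / B
})

# Null distribution of the smallest p-value across the K pairs
min_null <- apply(p_null, 1, min)

# Adjusted p-value: how often the null minimum is as small as the observed one
p_adj <- (sum(min_null <= min(p_obs)) + 1) / (B + 1)

print(p_obs)
print(p_adj)
```

Because the minimum is taken over all pairs in every replicate, the adjusted p-value in this sketch is never smaller than the smallest unadjusted one, which is the price paid for searching over rho and kappa.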

One of the major problems with multi-locus tests is that they usually do not allow missing genotypes. In practice, the user has to exclude any individual with even one missing entry. Although the SNPs included in a gene generally pass quality control, e.g., missing rate < 2%, a substantial proportion of individuals can still be excluded when testing the association, especially for a large gene. This can reduce the statistical power or, more seriously, bias the inference. We propose to generalize the score tests by using modified scores defined on all observed genotypes, which provides more flexibility in real applications. Please refer to our paper for more details.
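The cost of complete-case exclusion is easy to see in a small simulation. The sketch below uses made-up numbers (2000 individuals, 50 SNPs, a 2% MCAR missing rate, and an arbitrary allele frequency) and only counts how many individuals survive listwise deletion; it does not use maps.

```r
# Simulated illustration of listwise deletion; all settings are made up.
set.seed(2019)
n <- 2000  # individuals
m <- 50    # SNPs in a hypothetical large gene

# Genotypes coded 0/1/2 with an arbitrary allele frequency of 0.3
geno <- matrix(rbinom(n * m, size = 2, prob = 0.3), nrow = n, ncol = m)

# Knock out entries completely at random at a 2% rate
geno[matrix(runif(n * m) < 0.02, nrow = n, ncol = m)] <- NA

snp_missing <- colMeans(is.na(geno))        # per-SNP missing rate
complete_frac <- mean(complete.cases(geno)) # individuals with no missing SNP

print(range(snp_missing))
print(complete_frac)
```

Each SNP passes a typical missing-rate filter on its own, yet only a minority of individuals have all 50 genotypes observed, since the expected complete fraction is about 0.98^50.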

maps has some special cases. If rho = 0 and kappa = 0.5, it is MAPS_{0,1/2} in our paper. If rho = 0 and kappa varies, it is MAPS_0. If kappa = 0.5 and rho varies, it is MAPS_cor. If both rho and kappa vary, it is MAPS_opt.
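The four cases correspond to different choices of the rho and kappa arguments. The sketch below collects them in a list and counts the number of (rho, kappa) pairs each one implies; using the default grids from 'Usage' for the "varying" cases is an assumption, and the settings list is a hypothetical helper, not part of the package. The maps call is left as a comment because it needs the package and the example data.

```r
# Argument settings for the special cases; `settings` is an illustrative helper.
rho_grid <- seq(-1, 1, length.out = 21)   # default rho from Usage
kappa_grid <- seq(0, 1, length.out = 21)  # default kappa from Usage

settings <- list(
  "MAPS_{0,1/2}" = list(rho = 0, kappa = 0.5),
  "MAPS_0" = list(rho = 0, kappa = kappa_grid),
  "MAPS_cor" = list(rho = rho_grid, kappa = 0.5),
  "MAPS_opt" = list(rho = rho_grid, kappa = kappa_grid)
)

# Number of (rho, kappa) pairs implied by each setting
n_pairs <- vapply(settings,
                  function(s) length(s$rho) * length(s$kappa),
                  integer(1))
print(n_pairs)

# With the package loaded and `data` and `formula` available as in 'Examples':
# fits <- lapply(settings, function(s) {
#   maps(data, formula, rho = s$rho, kappa = s$kappa)
# })
```

Only the first setting fixes both parameters, so it is the only one for which no tuning over rho and kappa takes place.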

Value

maps returns an object of class "maps" containing p-value and other useful information.

An object of class "maps" is a list containing some of the following components (depending on the values of rho and kappa):

pval

the final p-value for the MAPS test. This p-value is adjusted for multiple comparisons if necessary. See 'Details'.

rho.opt

the optimally chosen rho.

kappa.opt

the optimally chosen kappa.

nperm

the number of replicates used in calculating the p-value.

rho

the vector of values of the effect correlation parameter used in the tuning process.

kappa

the vector of values of the variance component proportion parameter used in the tuning process.

refine

if TRUE, a larger nperm should be tried to obtain a more stable estimate of the p-value.

obs.rank

a vector containing unadjusted statistics produced by Ge's minp algorithm.

stat

a vector of statistics used to estimate the final p-value $pval.

$pval is always returned.

Author(s)

Han Zhang <han.zhang2@nih.gov>

References

Zhang, H., Wu, C.O., Yang, Y., Berndt, S., and Yu, K. (2015) A multi-locus genetic association test for a dichotomous trait and its secondary phenotype. Submitted.

Wu, C.O., Zheng, G., and Kwak, M. (2013) A joint regression analysis for genetic association studies with outcome stratified samples. Biometrics 69, 417–426.

Ge, Y., Dudoit, S., and Speed, T. (2003) Resampling-based multiple testing for microarray data analysis. Test 12, 1–77.

Examples

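data(test) # load the example data
obj <- maps(data, formula) # rho and kappa both vary: MAPS_opt in our paper
obj$pval # final p-value, adjusted for the tuning of rho and kappa
plot(obj) # unadjusted p-values for each pair of rho and kappa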

zhangh12/MAPS documentation built on May 4, 2019, 10:16 p.m.