seqAssocGLMM_spaBurden: Burden tests


View source: R/assoc_aggregate.r

Description

Calculates burden-test p-values using mixed models and the saddlepoint approximation (SPA) method to account for case-control imbalance.

Usage

seqAssocGLMM_spaBurden(gdsfile, modobj, units, wbeta=AggrParamBeta,
    summac=3, dsnode="", spa.pval=0.05, var.ratio=NaN, res.savefn="",
    res.compress="LZMA", parallel=FALSE, verbose=TRUE, verbose.maf=TRUE)

Arguments

gdsfile

a SeqArray GDS filename, or a GDS object

modobj

an R object for SAIGE model parameters

units

a list of units of selected variants, with S3 class "SeqUnitListClass" defined in the SeqArray package

wbeta

weights for per-variant effects, computed from each variant's MAF with the beta density dbeta(); a length-two vector, or a matrix with two rows for multiple sets of beta parameters; by default, both beta(1,1) and beta(1,25) are used

summac

a threshold for the weighted sum of minor allele counts (checking >= summac)

dsnode

"" for automatically searching the GDS nodes "genotype" and "annotation/format/DS", or use a user-defined GDS node in the file

spa.pval

the p-value threshold for SPA adjustment, 0.05 by default

var.ratio

NaN for using the estimated variance ratio in the model fitting, or a user-defined variance ratio

res.savefn

an RData or GDS file name, "" for no saving

res.compress

the compression method for the output file; it should be one of LZMA, LZMA_RA, ZIP, ZIP_RA and none

parallel

FALSE (serial processing), TRUE (multicore processing), a numeric value for the number of cores, or other value; parallel is passed to the argument cl in seqParallel, see seqParallel for more details

verbose

if TRUE, show information

verbose.maf

if TRUE, show summary of MAFs in units
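As a rough illustration of the wbeta argument, the base-R sketch below shows how beta-density weights behave across a range of MAFs. The MAF values are made up, and the column-per-parameter-pair layout of the matrix is an assumption based on the "matrix with two rows" wording above, not a statement about the package internals.

```r
# Hypothetical minor allele frequencies for five variants in one unit
maf <- c(0.001, 0.005, 0.01, 0.05, 0.2)

# beta(1,1) is the uniform density: every variant gets the same weight
w_flat <- dbeta(maf, 1, 1)

# beta(1,25) puts more weight on rarer variants
w_rare <- dbeta(maf, 1, 25)

# Assumed layout for several parameter sets: two rows (shape1, shape2),
# one column per parameter pair
wbeta <- cbind(c(1, 1), c(1, 25))

print(rbind(maf = maf, w_flat = w_flat, w_rare = w_rare))
```

A length-two vector such as c(1, 25) would then request a single weighting scheme instead of both.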

Details

The original SAIGE R package uses 0.05 as the threshold on unadjusted p-values for deciding whether to calculate SPA-adjusted p-values. If var.ratio=NaN, the average of the variance ratios (mean(modobj$var.ratio$ratio)) is used instead. For more details on the SAIGE algorithm, please refer to the SAIGE paper [Zhou et al. 2018] (see the References section).
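The two defaults described above can be sketched in base R as follows. The modobj list here is a hypothetical stand-in that mocks only the var.ratio field named in the text, the p-values are made up, and the strict "<" comparison is an assumption; the package may implement the check differently.

```r
# Hypothetical stand-in for a fitted null model; only the field named in
# the Details text is mocked
modobj <- list(var.ratio = data.frame(ratio = c(0.95, 1.00, 1.05)))

# var.ratio = NaN falls back to the average of the estimated ratios
var.ratio <- NaN
vr <- if (is.nan(var.ratio)) mean(modobj$var.ratio$ratio) else var.ratio

# Only unadjusted p-values below the threshold are flagged for SPA
# adjustment (strict "<" is assumed here)
spa.pval <- 0.05
p_unadj <- c(0.8, 0.04, 1e-6)   # made-up unadjusted p-values
needs_spa <- p_unadj < spa.pval

print(vr)
print(needs_spa)
```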

Value

Returns a data.frame with the following components if not saving to a file:

chr

chromosome

start

a starting position

end

an ending position

numvar

the number of variants in a window

summac

the weighted sum of minor allele counts

beta

beta coefficient (odds ratio if binary outcomes)

SE

standard error for the beta coefficient

pval

adjusted p-value with the Saddlepoint approximation

p.norm

p-values based on asymptotic normality (could be 0 if too small, e.g., pnorm(-50) = 0 in R); used for checking only

cvg

whether the SPA algorithm converges or not for the adjusted p-value
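The underflow mentioned for p.norm can be reproduced in base R. This is only a sketch of the numerical behaviour with a made-up test statistic z, not the package's own computation.

```r
# A very large test statistic (hypothetical value)
z <- 50

# The two-sided normal p-value underflows to exactly zero in double precision
p_norm <- 2 * pnorm(-abs(z))
print(p_norm)

# On the log scale the same tail probability remains finite
log_p <- log(2) + pnorm(-abs(z), log.p = TRUE)
print(log_p)
```

A zero in p.norm therefore indicates a value below the smallest representable double rather than a literal probability of zero.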

Author(s)

Xiuwen Zheng

References

Zhou W, Nielsen JB, Fritsche LG, Dey R, Gabrielsen ME, Wolford BN, LeFaive J, VandeHaar P, Gagliano SA, Gifford A, Bastarache LA, Wei WQ, Denny JC, Lin M, Hveem K, Kang HM, Abecasis GR, Willer CJ, Lee S. Efficiently controlling for case-control imbalance and sample relatedness in large-scale genetic association studies. Nat Genet. 2018 Sep;50(9):1335-1341.

See Also

seqAssocGLMM_spaACAT_V, seqAssocGLMM_spaACAT_O

Examples

library(SAIGEgds)

# open a GDS file
fn <- system.file("extdata", "grm1k_10k_snp.gds", package="SAIGEgds")
gdsfile <- seqOpen(fn)

# load phenotype
phenofn <- system.file("extdata", "pheno.txt.gz", package="SAIGEgds")
pheno <- read.table(phenofn, header=TRUE, as.is=TRUE)
head(pheno)

# fit the null model
glmm <- seqFitNullGLMM_SPA(y ~ x1 + x2, pheno, gdsfile, trait.type="binary")

# get a list of variant units for burden tests
units <- seqUnitSlidingWindows(gdsfile, win.size=500, win.shift=250)

# run the burden tests on each sliding window
assoc <- seqAssocGLMM_spaBurden(gdsfile, glmm, units)
head(assoc)

# close the GDS file
seqClose(gdsfile)

SAIGEgds documentation built on Nov. 8, 2020, 7:46 p.m.