assocTestAggregate: Aggregate Association Testing

Description Usage Arguments Details Value Author(s) Examples

Description

assocTestAggregate performs aggregate association tests using the null model fit with fitNullModel.

Usage

1
2
3
4
5
6
7
## S4 method for signature 'SeqVarIterator'
assocTestAggregate(gdsobj, null.model, AF.max=1,
                   weight.beta=c(1,1), weight.user=NULL,
                   test=c("Burden", "SKAT", "SMMAT"),
                   burden.test=c("Score", "Wald"), rho=0,
                   pval.method=c("davies", "kuonen", "liu"),
                   verbose=TRUE)

Arguments

gdsobj

An object of class SeqVarIterator from the package SeqVarTools containing the genotype data for the variants and samples to be used for the analysis.

null.model

A null model object returned by fitNullModel.

AF.max

A numeric value specifying the upper bound on the alternate allele frequency for variants to be included in the analysis.

weight.beta

A numeric vector of length two specifying the two parameters of the Beta distribution used to determine variant weights; weights are given by dbeta(MAF, a, b), where MAF is the minor allele frequency, and a and b are the two parameters specified here. weight.beta = c(1,25) gives the Wu weights; weight.beta = c(0.5, 0.5) is proportional to the Madsen-Browning weights; and weight.beta = c(1,1) gives a weight of 1 to all variants. This input is ignored when weight.user is not NULL.

weight.user

A character string specifying the name of a variable in the variantData slot of gdsobj to be used as variant weights. When left NULL (the default), the weights specified by weight.beta will be used.

test

A character string specifying the type of test to be performed. The possibilities are "Burden" (default), "SKAT", or "SMMAT". When this is set to "SKAT" and the parameter rho has multiple values, a SKAT-O test is performed.

burden.test

A character string specifying the type of Burden test to perform when test = "Burden". The possibilities are "Score" and "Wald". "Score" can be used for any null.model. "Wald" can not be used when the null.model is from a mixed model with a binary outcome variable.

rho

A numeric value (or vector of numeric values) in [0,1] specifying the rho parameter for SKAT. When rho = 0, a standard SKAT test is performed. When rho = 1, a score burden test is performed. When rho is a vector of values, SKAT-O is performed using each of those values as the search space for the optimal rho.

pval.method

A character string specifying which method to use to calculate SKAT p-values. "davies" (the default) uses numerical integration; "kuonen" uses a saddlepoint method; and "liu" uses a moment matching approximation. If the davies method generates an error, kuonen is tried, and then liu as a last resort.

verbose

Logical indicator of whether updates from the function should be printed to the console; the default is TRUE.

Details

The type of aggregate unit tested depends on the class of iterator used for gdsobj. Options include sliding windows, specific ranges of variants or selection of individual variants (ranges with width 1). See SeqVarIterator for more details.

The effect size estimate is for each copy of the alternate allele. For multiallelic variants, each alternate allele is tested separately.

The SMMAT test is a hybrid of SKAT and the burdren test.

Value

A list with the following items:

results

A data.frame containing the results from the main analysis. Each row is a separate aggregate test:

If gdsobj is a SeqVarWindowIterator:

chr

The chromosome value

start

The start position of the window

end

The end position of the window

Always:

n.site

The number of variant sites included in the test.

n.alt

The number of alternate alleles included in the test.

n.sample.alt

The number of samples with an observed alternate allele at any variant in the aggregate set.

If test is "Burden":

If burden.test is "Score":

Score

The value of the score function

Score.SE

The estimated standard error of the Score

Score.Stat

The score Z test statistic

Score.pval

The score p-value

If burden.test is "Wald":

Est

The effect size estimate for a one unit increase in the burden value

Est.SE

The estimated standard error of the effect size estimate

Wald.Stat

The Wald Z test statistic

Wald.pval

The Wald p-value

If test is "SKAT":

Q_rho

The SKAT test statistic for the value of rho specified. There will be as many of these variables as there are rho values chosen.

pval_rho

The SKAT p-value for the value of rho specified. There will be as many of these variables as there are rho values chosen.

err_rho

Takes value 1 if there was an error in calculating the p-value for the value of rho specified when using the "kuonen" or "davies" methods; 0 otherwise. When there is an error, the p-value returned is from the "liu" method. There will be as many of these variables as there are rho values chosen.

When length(rho) > 1 and SKAT-O is performed:

min.pval

The minimum p-value among the p-values calculated for each choice of rho.

opt.rho

The optimal rho value; i.e. the rho value that gave the minimum p-value.

pval_SKATO

The SKAT-O p-value after adjustment for searching across multiple rho values.

If test is "SMMAT":

pval_burden

The burden test p-value

pval_hybrid

The SMMAT p-value

err

Takes value 1 if there was an error calculating the hybrid p-value; 0 otherwise. If err=1, pval_hybrid is set to pval_burden.

variantInfo

A list with as many elements as aggregate tests performed. Each element of the list is a data.frame providing information on the variants used in the aggregate test with results presented in the corresponding row of results. Each of these data.frames has the following information:

variant.id

The variant ID

chr

The chromosome value

pos

The base pair position

n.obs

The number of samples with non-missing genotypes

freq

The estimated alternate allele frequency

weight

The weight assigned to the variant in the analysis.

Author(s)

Matthew P. Conomos, Stephanie M. Gogarten, Tamar Sofer, Ken Rice, Chaoyu Yu

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
library(SeqVarTools)
library(Biobase)
library(GenomicRanges)

# open a sequencing GDS file
gdsfile <- seqExampleFileName("gds")
gds <- seqOpen(gdsfile)

# simulate some phenotype data
data(pedigree)
pedigree <- pedigree[match(seqGetData(gds, "sample.id"), pedigree$sample.id),]
pedigree$outcome <- rnorm(nrow(pedigree))

# construct a SeqVarData object
seqData <- SeqVarData(gds, sampleData=AnnotatedDataFrame(pedigree))

# fit the null model
nullmod <- fitNullModel(seqData, outcome="outcome", covars="sex")

# burden test - Range Iterator
gr <- GRanges(seqnames=rep(1,3), ranges=IRanges(start=c(1e6, 2e6, 3e6), width=1e6))
iterator <- SeqVarRangeIterator(seqData, variantRanges=gr)
assoc <- assocTestAggregate(iterator, nullmod, test="Burden")
assoc$results
lapply(assoc$variantInfo, head)

# SKAT test - Window Iterator
seqSetFilterChrom(seqData, include="22")
iterator <- SeqVarWindowIterator(seqData)
assoc <- assocTestAggregate(iterator, nullmod, test="SKAT")
head(assoc$results)
head(assoc$variantInfo)

# SKAT-O test - List Iterator
seqResetFilter(iterator)
gr <- GRangesList(
  GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(16e6, 17e6), width=1e6)),
  GRanges(seqnames=rep(22,2), ranges=IRanges(start=c(18e6, 20e6), width=1e6)))
iterator <- SeqVarListIterator(seqData, variantRanges=gr)
assoc <- assocTestAggregate(iterator, nullmod, test="SKAT", rho=seq(0, 1, 0.25))
assoc$results
assoc$variantInfo

# user-specified weights
seqResetFilter(iterator)
variant.id <- seqGetData(gds, "variant.id")
weights <- data.frame(variant.id, weight=runif(length(variant.id)))
variantData(seqData) <- AnnotatedDataFrame(weights)
iterator <- SeqVarListIterator(seqData, variantRanges=gr)
assoc <- assocTestAggregate(iterator, nullmod, test="Burden", weight.user="weight")
assoc$results
assoc$variantInfo

seqClose(seqData)

UW-GAC/genesis2 documentation built on May 6, 2019, 3:29 p.m.