rsnpset.pvalue: RSNPset P-value Function
In RSNPset: Efficient Score Statistics for Genome-Wide SNP Set Analysis



Calculate observed, resampling, FWER-adjusted, and FDR-adjusted p-values for statistics from the function rsnpset().

rsnpset.pvalue(result, pval.transform=FALSE, qfun=function(x){qvalue(x)$qvalue})

`result`	Result from `rsnpset()`, an "RSNPset" S3 class object. Required.
`pval.transform`	Boolean indicating if the resampling p-values should be computed by comparing the observed p-value to the resampling p-values (`TRUE`). If not (`FALSE`), they are computed by comparing the observed statistics to the resampling statistics (may not be appropriate for `r.method="permutation"`). Note that `rsnpset()` must be run with `B > 0` in order to use `pval.transform=TRUE`. Default is `FALSE`.
`qfun`	Function used to calculate false discovery rate adjusted p-values. See below. Default is `function(x){qvalue(x)$qvalue}`.

See below.

An S3 class RSNPset.pvalue object that extends data.frame, with one row for each of the K SNP sets in result, columns W, rank, m, and two or more additional columns of p-values. Two columns, p and q, are always returned. If rsnpset() was run with B > 0, the columns pB and qB are included as well. If pval.transform=TRUE, the returned p-value columns are p, pB, PB, q, and qB.

Column	Definition
`W`	Observed statistic
`rank`	Rank of the variance matrix for the observed data
`m`	Number of SNPs analyzed in the SNP set

Column	P-value	Definition
`p`	Asymptotic*	`pchisq(W,rank,lower.tail=FALSE)`
`pB`	Resampling**	See below.
`PB`	Family-wise error adjusted***	See below.
`q`	False discovery rate adjusted	`qfun(p)`
`qB`	Resampling FDR adjusted	`qfun(pB)`

* For W and rank from rsnpset().

** By default, the unadjusted resampling p-values are computed by comparing the observed statistics to the resampling statistics. Note that a large number of replications may be required in order to account for multiple testing. For each SNP set, the value for pB is sum(W <= Wb)/B, where W is the observed statistic for the SNP set, Wb is a vector of resampling statistics, and B is the number of replications. If pval.transform=TRUE, then for each SNP set, the value for pB is sum(p > pb)/B, where p is the observed p-value and pb is a vector of the p-values of the B resampling statistics. Because pB could otherwise be 0 for some SNP sets, pmax(pB,1/B) is returned instead.

*** The column PB is only returned if pval.transform=TRUE. For each SNP set, the value for PB is sum(p > Zb)/B, where Zb is a vector of length B. Each element of Zb is the smallest resampling p-value across all K SNP sets for the bth replication. Because PB could otherwise be 0 for some SNP sets, pmax(PB,1/B) is returned instead.
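The p-value definitions above can be sketched in base R. This is only an illustration of the stated formulas, not the package's internal code: the toy sizes K and B, the chi-squared stand-ins for W and Wb, the use of one fixed rank per SNP set, and the names pb, pB_stat, and pB_pval are all assumptions made for the example.

```r
set.seed(42)
K <- 5                                   # toy number of SNP sets (assumed)
B <- 200                                 # toy number of replications (assumed)
rank <- rep(3, K)                        # assumed rank for each SNP set

# Stand-ins for the observed and resampling statistics (not real rsnpset() output)
W  <- rchisq(K, df = 3)                          # one observed statistic per set
Wb <- matrix(rchisq(K * B, df = 3), nrow = K)    # row k: B resampling statistics of set k

# Asymptotic p-values: pchisq(W, rank, lower.tail=FALSE)
p <- pchisq(W, rank, lower.tail = FALSE)

# pval.transform=FALSE: sum(W <= Wb)/B per set, floored at 1/B
pB_stat <- pmax(rowSums(W <= Wb) / B, 1 / B)

# pval.transform=TRUE: convert the resampling statistics to p-values first,
# then compute sum(p > pb)/B per set, floored at 1/B
pb <- matrix(pchisq(Wb, rank, lower.tail = FALSE), nrow = K)
pB_pval <- pmax(rowSums(p > pb) / B, 1 / B)

# FWER adjustment: Zb[b] is the smallest resampling p-value
# across all K sets in replication b
Zb <- apply(pb, 2, min)
PB <- pmax(sapply(p, function(pk) sum(pk > Zb)) / B, 1 / B)

print(data.frame(W, rank, p, pB_stat, pB_pval, PB))
```

Since each element of Zb is no larger than the corresponding resampling p-value of any single set, PB is never smaller than the transformed pB for the same SNP set.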

The qvalue() function, used by default in qfun, can fail for small numbers of replications/SNP sets. To overcome this, the qfun argument can be used to define a new q-value function, or to assign arguments for the qvalue() function. For example:

qfun=function(x){qvalue(x, robust=TRUE)$qvalue}
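As a further illustration of defining a new function for qfun, the sketch below uses the Benjamini-Hochberg adjustment from base R (p.adjust() in the stats package) in place of qvalue(). The name bh_qfun and the p-values are made up for the example, and it assumes that qfun only needs to map a numeric vector of p-values to an adjusted vector of the same length, as the definitions qfun(p) and qfun(pB) suggest.

```r
# Hypothetical replacement for the default qfun:
# Benjamini-Hochberg adjusted p-values from stats::p.adjust()
bh_qfun <- function(x) p.adjust(x, method = "BH")

p <- c(0.001, 0.02, 0.04, 0.30, 0.80)   # made-up p-values for five SNP sets
q <- bh_qfun(p)
print(q)
```

Under that assumption it could be passed as rsnpset.pvalue(res, qfun=bh_qfun), where res is a result from rsnpset(). Note that Benjamini-Hochberg adjusted p-values are not the same quantity as the q-values estimated by qvalue(), so the q and qB columns would differ from the default output.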

This function computes p-values for the statistics from the function rsnpset.

For sorting and reviewing the p-values, see summary.RSNPset.pvalue.

More information on qvalue.

library(RSNPset)

n <- 200    # Number of patients
m <- 1000   # Number of SNPs

set.seed(123)
G <- matrix(rnorm(n*m), n, m)   # Normalized SNP expression levels
rsids <- paste0("rs", 1:m)      # SNP rsIDs 
colnames(G) <- rsids
 
K <- 15                         # Number of SNP sets
genes <- paste0("XYZ", 1:K)     # Gene names 
gsets <- lapply(sample(3:50, size=K, replace=TRUE), sample, x=rsids)
names(gsets) <- genes

# Survival outcome
time <- rexp(n, 1/10)           # Survival time
event <- rbinom(n, 1, 0.9)      # Event indicator

## Not run: 
# Optional parallel backend
library(doParallel)
registerDoParallel(cores=8) 
## End(Not run)

# B >= 1000 is typically recommended
res <- rsnpset(Y=time, delta=event, G=G, snp.sets=gsets, score="cox", 
               B=50, r.method="permutation", ret.rank=TRUE)
rsnpset.pvalue(res, pval.transform=TRUE)