genePvals: Permutation p-Values for Gene Expression
In annavesely/sumSome: True Discovery Guarantee by Sum-Based Tests

Description

This function computes p-value combinations for different permutations of gene expression data. Each gene's p-value comes from a two-sample t-test of the null hypothesis that its mean expression value is the same in the two populations.
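The idea can be sketched in base R. This is a minimal illustration, not the package's implementation: the helpers `rowPvals` and `permPvals` are hypothetical names, and the equal-variance t-test is an assumption, since this page does not say which variant is used. Row 1 of the result uses the observed labels (the identity permutation); the remaining rows use random relabelings.

```r
# Hypothetical helper: two-sample t-test p-value for each gene (row of expr).
# Assumption: equal-variance, two-sided t-test; the package may differ.
rowPvals <- function(expr, labels) {
  groups <- factor(labels)
  apply(expr, 1, function(x) {
    t.test(x ~ groups, var.equal = TRUE)$p.value
  })
}

# Hypothetical helper: B x (number of genes) matrix of p-values,
# where row 1 corresponds to the observed labels (identity permutation).
permPvals <- function(expr, labels, B = 10, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  out <- matrix(NA_real_, nrow = B, ncol = nrow(expr))
  out[1, ] <- rowPvals(expr, labels)
  for (b in seq_len(B - 1)) {
    # shuffle the population labels across samples
    out[b + 1, ] <- rowPvals(expr, sample(labels))
  }
  out
}

set.seed(1)
toyExpr <- matrix(rnorm(5 * 8), nrow = 5)   # 5 genes, 8 samples
toyLabels <- rep(c(1, 2), each = 4)
pv <- permPvals(toyExpr, toyLabels, B = 10, seed = 1)
dim(pv)   # permutations in rows, genes in columns
```

The layout (permutations in rows, genes in columns) follows the description of `statistics` in the Value section.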

Usage

genePvals(expr, labels, alternative = "two.sided", alpha = 0.05, B = 200, seed = NULL,
          truncFrom = NULL, truncTo = 0.5, type = "vovk.wang", r = 0, rand = FALSE)

Arguments

`expr`	matrix where rows correspond to genes, and columns to samples.
`labels`	numeric/character vector with two levels, denoting the population of each sample.
`alternative`	direction of the alternative hypothesis (`greater`, `lower`, `two.sided`).
`alpha`	significance level.
`B`	number of permutations, including the identity.
`seed`	seed for the random number generator, used to make the random permutations reproducible.
`truncFrom`	truncation parameter: values greater than `truncFrom` are truncated. If `NULL`, it is set to `alpha`.
`truncTo`	truncation parameter: truncated values are set to `truncTo`. If `NULL`, p-values are not truncated.
`type`	p-value combination among `edgington`, `fisher`, `pearson`, `liptak`, `cauchy`, `harmonic`, `vovk.wang` (see details).
`r`	parameter for Vovk and Wang's p-value combination.
`rand`	logical, `TRUE` to compute p-values from the permutation distribution.

Details

A p-value p is transformed as follows, depending on `type`.

Edgington: p (Edgington, 1972)
Fisher: -2log(p) (Fisher, 1925)
Pearson: 2log(1-p) (Pearson, 1933)
Liptak: qnorm(1-p) (Liptak, 1958; Stouffer et al., 1949)
Cauchy: tan[(0.5-p)pi], where pi is approximately 3.142 (Liu and Xie, 2020)
Harmonic mean: 1/p (Wilson, 2019)
Vovk and Wang: p^r (log(p) for r=0) (Vovk and Wang, 2020)

An error message is returned if the transformation produces infinite values.

For Vovk and Wang, r=0 corresponds to Fisher, and r=-1 to the harmonic mean.
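The transformations listed above can be written out directly in base R. The helper `transformP` below is a hypothetical name used only for illustration; it is a sketch of the formulas on this page, not the package's internal code. It also checks the two Vovk and Wang correspondences numerically: with r = 0 the transformation is log(p), which equals Fisher's -2log(p) up to the constant factor -2, and with r = -1 it is exactly 1/p.

```r
# Hypothetical helper mirroring the transformations listed on this page.
transformP <- function(p, type = "vovk.wang", r = 0) {
  out <- switch(type,
    edgington = p,
    fisher    = -2 * log(p),
    pearson   = 2 * log(1 - p),
    liptak    = qnorm(1 - p),
    cauchy    = tan((0.5 - p) * pi),
    harmonic  = 1 / p,
    vovk.wang = if (r == 0) log(p) else p^r,
    stop("unknown type: ", type)
  )
  # Mirror the documented behaviour of failing on infinite values.
  if (any(is.infinite(out))) stop("transformation produced infinite values")
  out
}

p <- c(0.01, 0.2, 0.5)
transformP(p, "fisher")
transformP(p, "liptak")

# r = 0 matches Fisher up to the constant factor -2
all.equal(-2 * transformP(p, "vovk.wang", r = 0), transformP(p, "fisher"))
# r = -1 is the harmonic-mean transformation
all.equal(transformP(p, "vovk.wang", r = -1), transformP(p, "harmonic"))
```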

The truncation parameters must satisfy truncTo >= truncFrom. Because Pearson's and Liptak's transformations are infinite at p = 1, truncTo must be strictly smaller than 1 for these methods.
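A short sketch of the truncation rule and of why the upper limit matters. The helper `truncateP` is a hypothetical name, and its default `truncFrom = 0.05` stands in for the documented behaviour of using `alpha` when `truncFrom` is `NULL`; the package's own code may differ.

```r
# Hypothetical helper: p-values greater than truncFrom are set to truncTo.
# Assumption: truncFrom defaults to 0.05 here, standing in for alpha.
truncateP <- function(p, truncFrom = 0.05, truncTo = 0.5) {
  if (is.null(truncTo)) return(p)      # NULL truncTo: no truncation
  stopifnot(truncTo >= truncFrom)      # truncTo not smaller than truncFrom
  ifelse(p > truncFrom, truncTo, p)
}

p <- c(0.001, 0.04, 0.3, 1)
truncateP(p)                   # 0.3 and 1 are replaced by 0.5
truncateP(p, truncTo = NULL)   # unchanged

# Pearson's and Liptak's transformations are infinite at p = 1
2 * log(1 - 1)
qnorm(1 - 1)

# After truncation to 0.5 the Liptak-transformed values are finite
qnorm(1 - truncateP(p))
```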

The significance level alpha should be in the interval [1/B, 1).
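As a general property of permutation tests, a p-value built from B permutations (identity included) cannot be smaller than 1/B, which is a plausible reading of the lower limit. The check below is a hypothetical sketch of the stated constraint (`checkAlpha` is not a package function), useful for seeing which combinations of `alpha` and `B` fall inside the interval.

```r
# Hypothetical helper: validate alpha against the interval [1/B, 1).
checkAlpha <- function(alpha, B) {
  if (alpha < 1 / B || alpha >= 1) {
    stop("alpha must be in [1/B, 1); here 1/B = ", 1 / B)
  }
  invisible(TRUE)
}

checkAlpha(0.05, B = 200)   # accepted: 1/200 = 0.005 <= 0.05

# With only 10 permutations, alpha = 0.05 is below 1/B = 0.1
res <- try(checkAlpha(0.05, B = 10), silent = TRUE)
inherits(res, "try-error")
```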

Value

genePvals returns an object of class sumGene, containing:

statistics: numeric matrix of p-values, where columns correspond to genes and rows to permutations. The first permutation is the identity.
alpha: significance level
truncFrom: transformed first truncation parameter
truncTo: transformed second truncation parameter

Author(s)

Anna Vesely.

References

Goeman J. J. and Solari A. (2011). Multiple testing for exploratory research. Statistical Science, doi: 10.1214/11-STS356.

Vesely A., Finos L., and Goeman J. J. (2023). Permutation-based true discovery guarantee by sum tests. Journal of the Royal Statistical Society, Series B (Statistical Methodology), doi: 10.1093/jrsssb/qkad019.

Examples

library(sumSome)

# simulate 20 samples of 100 genes
set.seed(42)
expr <- matrix(c(rnorm(1000, mean = 0, sd = 10), rnorm(1000, mean = 13, sd = 10)), ncol = 20)
rownames(expr) <- seq(100)
labels <- rep(c(1,2), each = 10)

# simulate pathways
pathways <- lapply(seq(3), FUN = function(x) sample(rownames(expr), 3*x))

# create object of class sumGene
res <- genePvals(expr = expr, labels = labels, alpha = 0.2, seed = 42, type = "liptak")
res
summary(res)

# confidence bound for the number of true discoveries and the TDP within pathways
out <- geneAnalysis(res, pathways = pathways)
out

annavesely/sumSome documentation built on Jan. 28, 2025, 8:15 a.m.