genePvals: Permutation p-Values for Gene Expression


genePvals    R Documentation

Permutation p-Values for Gene Expression

Description

This function computes p-value combinations across permutations of gene expression data. Each gene's p-value is obtained from a two-sample t test of the null hypothesis that the gene's mean expression value is the same in the two populations.
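As a rough illustration of the per-gene test described above, the following base R sketch computes one two-sample t test p-value per row of a toy expression matrix. The helper name gene_pvalue, the toy data, and the choice var.equal = TRUE are assumptions made for this sketch; the package's internal computation may differ. Note that base R's t.test names the one-sided alternatives "less" and "greater", whereas genePvals uses "lower" and "greater".

```r
# Toy data: 5 genes (rows) by 20 samples (columns); values are illustrative only
set.seed(1)
expr <- matrix(rnorm(5 * 20), nrow = 5)
labels <- rep(c(1, 2), each = 10)

# Hypothetical helper: two-sample t test for a single gene.
# var.equal = TRUE is an assumption of this sketch, not a documented package setting.
gene_pvalue <- function(x, labels, alternative = "two.sided") {
  groups <- unique(labels)
  t.test(x[labels == groups[1]], x[labels == groups[2]],
         alternative = alternative, var.equal = TRUE)$p.value
}

# One p-value per gene
pvals <- apply(expr, 1, gene_pvalue, labels = labels)
pvals
```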

Usage

genePvals(expr, labels, alternative = "two.sided", alpha = 0.05, B = 200, seed = NULL,
          truncFrom = NULL, truncTo = 0.5, type = "vovk.wang", r = 0, rand = FALSE)

Arguments

expr

matrix where rows correspond to genes, and columns to samples.

labels

numeric/character vector with two levels, denoting the population of each sample.

alternative

direction of the alternative hypothesis (greater, lower, two.sided).

alpha

significance level.

B

number of permutations, including the identity.

seed

seed for the random number generator.

truncFrom

truncation parameter: values greater than truncFrom are truncated. If NULL, it is set to alpha.

truncTo

truncation parameter: truncated values are set to truncTo. If NULL, p-values are not truncated.

type

p-value combination, one of edgington, fisher, pearson, liptak, cauchy, harmonic, vovk.wang (see Details).

r

parameter for Vovk and Wang's p-value combination.

rand

logical; if TRUE, p-values are computed from the permutation distribution.

Details

A p-value p is transformed as follows.

  • Edgington: p (Edgington, 1972)

  • Fisher: -2log(p) (Fisher, 1925)

  • Pearson: 2log(1-p) (Pearson, 1933)

  • Liptak: qnorm(1-p) (Liptak, 1958; Stouffer et al., 1949)

  • Cauchy: tan[(0.5-p)pi] with pi=3.142 (Liu and Xie, 2020)

  • Harmonic mean: 1/p (Wilson, 2019)

  • Vovk and Wang: p^r (log(p) for r=0) (Vovk and Wang, 2020)

An error message is returned if the transformation produces infinite values.

For Vovk and Wang, r=0 corresponds to Fisher, and r=-1 to the harmonic mean.
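The transformations listed above can be written directly in base R. This is a minimal sketch: the names transforms and vovk_wang are hypothetical and are not the package's internal functions. It also shows why r=0 and r=-1 are described as corresponding to Fisher and to the harmonic mean: log(p) is Fisher's transformation divided by -2, and p^(-1) equals 1/p.

```r
# Hypothetical lookup of the transformations listed in Details
transforms <- list(
  edgington = function(p) p,
  fisher    = function(p) -2 * log(p),
  pearson   = function(p) 2 * log(1 - p),
  liptak    = function(p) qnorm(1 - p),
  cauchy    = function(p) tan((0.5 - p) * pi),
  harmonic  = function(p) 1 / p
)

# Vovk and Wang: p^r, with log(p) used for r = 0
vovk_wang <- function(p, r = 0) if (r == 0) log(p) else p^r

p <- c(0.01, 0.2, 0.5)

# r = 0 matches Fisher up to the factor -2; r = -1 matches the harmonic mean
all.equal(vovk_wang(p, r = 0), transforms$fisher(p) / -2)
all.equal(vovk_wang(p, r = -1), transforms$harmonic(p))

# Pearson and Liptak are infinite at p = 1
transforms$pearson(1)
transforms$liptak(1)
```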

The truncation parameters should satisfy truncTo >= truncFrom. Because Pearson's and Liptak's transformations are infinite at p = 1, truncTo should be strictly smaller than 1 for these methods.
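The truncation rule described for truncFrom and truncTo can be sketched as follows. The helper truncate_pvals is hypothetical and only illustrates the rule as documented (values greater than truncFrom are set to truncTo); it is not the package's own code.

```r
# Hypothetical helper: p-values above truncFrom are replaced by truncTo
truncate_pvals <- function(p, truncFrom, truncTo) {
  stopifnot(truncTo >= truncFrom)  # constraint stated in Details
  p[p > truncFrom] <- truncTo
  p
}

p <- c(0.01, 0.04, 0.30, 0.90)
truncated <- truncate_pvals(p, truncFrom = 0.05, truncTo = 0.5)
truncated
```

With truncTo below 1, no truncated value can reach the point p = 1 where Pearson's and Liptak's transformations become infinite.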

The significance level alpha should be in the interval [1/B, 1).
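A quick way to check this constraint before calling the function is sketched below. The helper alpha_is_valid is hypothetical. With the default B = 200, the lower end of the interval is 1/200 = 0.005.

```r
# Hypothetical check that alpha lies in [1/B, 1)
alpha_is_valid <- function(alpha, B) {
  alpha >= 1 / B && alpha < 1
}

ok_default <- alpha_is_valid(0.05, B = 200)   # default alpha and B
too_small  <- alpha_is_valid(0.001, B = 200)  # below 1/B
too_large  <- alpha_is_valid(1, B = 200)      # the interval excludes 1
```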

Value

genePvals returns an object of class sumGene, containing

  • statistics: numeric matrix of p-values, where columns correspond to genes, and rows to permutations. The first permutation is the identity

  • alpha: significance level

  • truncFrom: transformed first truncation parameter

  • truncTo: transformed second truncation parameter
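To make the layout of the statistics matrix concrete, the base R sketch below builds a matrix with one row per permutation and one column per gene, with the identity permutation in the first row. It is an illustration under assumptions (toy data, random label shuffles, equal-variance t tests, hypothetical names perms and pvals_for) and does not reproduce what genePvals stores internally.

```r
set.seed(42)
m <- 4    # genes
n <- 10   # samples
B <- 6    # permutations, including the identity
expr <- matrix(rnorm(m * n), nrow = m)
labels <- rep(c(1, 2), each = n / 2)

# B label vectors: the first is the observed labelling (identity),
# the remaining B - 1 are random shuffles
perms <- c(list(labels), replicate(B - 1, sample(labels), simplify = FALSE))

# Hypothetical helper: per-gene t test p-values for one labelling
pvals_for <- function(lab) {
  apply(expr, 1, function(x) {
    t.test(x[lab == 1], x[lab == 2], var.equal = TRUE)$p.value
  })
}

# Rows correspond to permutations, columns to genes
statistics <- t(sapply(perms, pvals_for))
dim(statistics)
```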

Author(s)

Anna Vesely.

References

Goeman J. J. and Solari A. (2011). Multiple testing for exploratory research. Statistical Science, doi: 10.1214/11-STS356.

Vesely A., Finos L., and Goeman J. J. (2023). Permutation-based true discovery guarantee by sum tests. Journal of the Royal Statistical Society, Series B (Statistical Methodology), doi: 10.1093/jrsssb/qkad019.

See Also

Permutation statistics for gene expression using t scores: geneScores

True discovery guarantee for cluster analysis: geneAnalysis

Examples

# simulate 20 samples of 100 genes
set.seed(42)
expr <- matrix(c(rnorm(1000, mean = 0, sd = 10), rnorm(1000, mean = 13, sd = 10)), ncol = 20)
rownames(expr) <- seq(100)
labels <- rep(c(1,2), each = 10)

# simulate pathways
pathways <- lapply(seq(3), FUN = function(x) sample(rownames(expr), 3*x))

# create object of class sumGene
res <- genePvals(expr = expr, labels = labels, alpha = 0.2, seed = 42, type = "liptak")
res
summary(res)

# confidence bound for the number of true discoveries and the TDP within pathways
out <- geneAnalysis(res, pathways = pathways)
out

annavesely/sumSome documentation built on Jan. 28, 2025, 8:15 a.m.