geneAnalysis: True Discovery Guarantee for Pathway Analysis of Gene...


geneAnalysis    R Documentation

True Discovery Guarantee for Pathway Analysis of Gene Expression Data

Description

This function uses permutation t-statistics or p-values to provide a true discovery guarantee for gene pathway analysis. It computes lower confidence bounds for the number of true discoveries and for the true discovery proportion within each pathway. The bounds are simultaneous over all sets and remain valid under post-hoc selection.
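
The guarantee described above can be stated compactly. The notation below is an assumption introduced for illustration and is not taken from the package: S is any set of genes (for instance a pathway), a(S) is the unknown number of truly active genes in S, and d(S) is the reported lower bound (TD).

```latex
% Simultaneous lower confidence bound on the number of true discoveries
\[
  P\bigl( d(S) \le a(S) \ \text{for all sets } S \bigr) \ge 1 - \alpha .
\]
% Corresponding lower bound on the true discovery proportion of S
\[
  \mathrm{TDP}(S) = \frac{a(S)}{|S|} \ \ge\ \frac{d(S)}{|S|} .
\]
```

Because the first statement holds for all sets at once, the bound for a pathway stays valid even when that pathway is chosen after looking at the data.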

Usage

geneAnalysis(sumGene, pathways = NULL, nMax = 50, silent = FALSE)

Arguments

sumGene

an object of class sumGene, as returned by the functions geneScores and genePvals.

pathways

list of character vectors containing gene names (one vector per pathway). If NULL, the whole gene set is considered.
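
As a minimal sketch of the expected shape of pathways, the snippet below builds a list with one character vector per pathway and checks it against the available gene names. The gene identifiers, the list names and the checks are hypothetical and use base R only. Whether geneAnalysis uses the list names, and how it treats genes that are not in the data, is not stated here.

```r
# Hypothetical gene identifiers; in practice these would be rownames(expr)
genes <- as.character(seq_len(100))

# One character vector of gene names per pathway (list names are illustrative)
pathways <- list(
  pathwayA = c("1", "5", "12"),
  pathwayB = c("3", "7", "42", "99"),
  pathwayC = c("5", "200")  # "200" is deliberately not among the measured genes
)

# Check that every element is a character vector
is_char <- vapply(pathways, is.character, logical(1))

# Count, per pathway, the gene names that do not appear in the data
unknown <- lapply(pathways, setdiff, y = genes)
n_unknown <- lengths(unknown)

is_char
n_unknown
```

A check of this kind makes mismatched identifiers visible before the analysis is run, for example numeric row names that were converted to character.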

nMax

maximum number of iterations per pathway.

silent

logical, FALSE to print a summary of active pathways.

Value

geneAnalysis returns a data frame containing, for each pathway,

  • size: size of the pathway (number of genes)

  • TD: lower (1-alpha)-confidence bound for the number of true discoveries

  • maxTD: maximum value of TD that could be found under convergence of the algorithm

  • TDP: lower (1-alpha)-confidence bound for the true discovery proportion

  • maxTDP: maximum value of TDP that could be found under convergence of the algorithm.

Author(s)

Anna Vesely.

References

Goeman J. J. and Solari A. (2011). Multiple testing for exploratory research. Statistical Science, doi: 10.1214/11-STS356.

Vesely A., Finos L., and Goeman J. J. (2023). Permutation-based true discovery guarantee by sum tests. Journal of the Royal Statistical Society, Series B (Statistical Methodology), doi: 10.1093/jrsssb/qkad019.

See Also

Permutation statistics for gene expression: geneScores, genePvals

Examples

# simulate 20 samples of 100 genes
set.seed(42)
expr <- matrix(c(rnorm(1000, mean = 0, sd = 10), rnorm(1000, mean = 13, sd = 10)), ncol = 20)
rownames(expr) <- seq(100)
labels <- rep(c(1,2), each = 10)

# simulate pathways
pathways <- lapply(seq(3), FUN = function(x) sample(rownames(expr), 3*x))

# create object of class sumGene
res <- geneScores(expr = expr, labels = labels, alpha = 0.2, seed = 42)
res
summary(res)

# confidence bound for the number of true discoveries and the TDP within pathways
out <- geneAnalysis(res, pathways = pathways)
out

annavesely/sumSome documentation built on Jan. 28, 2025, 8:15 a.m.