fgseaSimple: Runs preranked gene set enrichment analysis.

View source: R/fgsea.R

fgseaSimple R Documentation

Runs preranked gene set enrichment analysis.

Description

The function takes about O(nk^{3/2}) time, where n is the number of permutations and k is the maximal pathway size. Setting the 'maxSize' parameter to a value of about 500 is therefore strongly recommended.
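To get a feel for this estimate, the relative cost of two settings can be compared with plain arithmetic. The snippet below is only an illustration: `rel_cost` is a hypothetical helper, not part of fgsea, and it ignores the constant factor hidden in the O() notation.

```r
# Hypothetical helper (not part of fgsea): cost estimate up to a constant
# factor, n * k^(3/2), with n = permutations and k = maximal pathway size.
rel_cost <- function(nperm, max_size) nperm * max_size^(3 / 2)

# Same number of permutations, two different caps on pathway size.
capped   <- rel_cost(nperm = 10000, max_size = 500)
uncapped <- rel_cost(nperm = 10000, max_size = 5000)

# A tenfold larger maximal pathway costs about 10^1.5 times more.
ratio <- uncapped / capped
print(ratio)
```

Under this estimate, raising the size cap is more expensive than raising the number of permutations by the same factor, which is why a moderate 'maxSize' is suggested.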

Usage

fgseaSimple(
  pathways,
  stats,
  nperm,
  minSize = 1,
  maxSize = length(stats) - 1,
  scoreType = c("std", "pos", "neg"),
  nproc = 0,
  gseaParam = 1,
  BPPARAM = NULL
)

Arguments

pathways

List of gene sets to check.

stats

Named vector of gene-level stats. Names should be the same as in 'pathways'.

nperm

Number of permutations to do. The minimal possible nominal p-value is about 1/nperm.

minSize

Minimal size of a gene set to test. All pathways below the threshold are excluded.

maxSize

Maximal size of a gene set to test. All pathways above the threshold are excluded.

scoreType

This parameter defines the GSEA score type. Possible options are ("std", "pos", "neg"). By default ("std") the enrichment score is computed as in the original GSEA. The "pos" and "neg" score types are intended to be used for one-tailed tests (i.e. when one is interested only in positive ("pos") or negative ("neg") enrichment).

nproc

If not equal to zero, sets BPPARAM to use nproc workers (default = 0).

gseaParam

GSEA parameter value. All gene-level statistics are raised to the power of 'gseaParam' before the GSEA enrichment scores are calculated.

BPPARAM

Parallelization parameter used in bplapply. Can be used to specify a cluster to run on. If it is not initialized explicitly or by setting 'nproc', the default value 'bpparam()' is used.
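How 'gseaParam' and 'scoreType' enter the score can be sketched in base R. This is a minimal illustration of the standard weighted running-sum statistic, not the code fgsea runs internally; the function `enrichment_score` and the toy gene names are assumptions made for the example.

```r
# Illustrative enrichment score for one gene set (not fgsea's internal code).
# stats: named numeric vector; gene_set: character vector of gene names.
enrichment_score <- function(stats, gene_set, gseaParam = 1,
                             scoreType = c("std", "pos", "neg")) {
  scoreType <- match.arg(scoreType)
  stats <- sort(stats, decreasing = TRUE)      # rank genes by statistic
  in_set <- names(stats) %in% gene_set
  weights <- abs(stats)^gseaParam              # gseaParam is the exponent
  step_up <- ifelse(in_set, weights / sum(weights[in_set]), 0)
  step_down <- ifelse(in_set, 0, 1 / sum(!in_set))
  running <- cumsum(step_up - step_down)       # running-sum statistic
  switch(scoreType,
    pos = max(running),                        # positive deviation only
    neg = min(running),                        # negative deviation only
    std = if (max(running) >= -min(running)) max(running) else min(running)
  )
}

# Toy ranking of six hypothetical genes.
stats <- c(g1 = 3, g2 = 2, g3 = 1, g4 = -1, g5 = -2, g6 = -3)

# Effect of gseaParam: weighted (1) versus unweighted (0) steps.
es_weighted   <- enrichment_score(stats, c("g1", "g4"), gseaParam = 1)
es_unweighted <- enrichment_score(stats, c("g1", "g4"), gseaParam = 0)

# Effect of scoreType on a set concentrated at the bottom of the ranking.
es_std <- enrichment_score(stats, c("g5", "g6"), scoreType = "std")
es_pos <- enrichment_score(stats, c("g5", "g6"), scoreType = "pos")

print(c(weighted = es_weighted, unweighted = es_unweighted,
        std = es_std, pos = es_pos))
```

In this toy ranking the bottom-heavy set gets a strongly negative score with "std", while "pos" reports only its (near zero) positive deviation, which is the behaviour one would want from a one-tailed test for positive enrichment.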

Value

A table with GSEA results. Each row corresponds to a tested pathway. The columns are the following:

  • pathway – name of the pathway as in 'names(pathways)';

  • pval – an enrichment p-value;

  • padj – a BH-adjusted p-value;

  • ES – enrichment score, same as in Broad GSEA implementation;

  • NES – enrichment score normalized to mean enrichment of random samples of the same size;

  • nMoreExtreme – a number of times a random gene set had a more extreme enrichment score value;

  • size – size of the pathway after removing genes not present in 'names(stats)';

  • leadingEdge – vector with indexes of leading edge genes that drive the enrichment, see http://software.broadinstitute.org/gsea/doc/GSEAUserGuideTEXT.htm#_Running_a_Leading.
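The relation between the pval and padj columns can be illustrated with base R's `p.adjust`. The p-values and pathway names below are made up for the example, and the snippet only shows the BH adjustment itself, not how fgsea computes it internally.

```r
# Nominal p-values for four hypothetical pathways (made-up numbers).
pval <- c(pathwayA = 0.01, pathwayB = 0.04, pathwayC = 0.03, pathwayD = 0.20)

# Benjamini-Hochberg adjustment across all tested pathways.
padj <- p.adjust(pval, method = "BH")
print(padj)

# Pathways passing a 10% false discovery rate threshold.
significant <- names(padj)[padj < 0.1]
print(significant)
```

Because the adjustment depends on how many pathways were tested, filtering pathways with 'minSize' and 'maxSize' before the run also changes the adjusted values.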

Examples

data(examplePathways)
data(exampleRanks)
fgseaRes <- fgseaSimple(examplePathways, exampleRanks, nperm=10000, maxSize=500)
# Testing only one pathway is implemented in a more efficient manner
fgseaRes1 <- fgseaSimple(examplePathways[1], exampleRanks, nperm=10000)
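A possible follow-up to the examples above is to fix the random seed, run serially, and look at the top pathways. This is a sketch under a few assumptions: the result is treated as a plain table with the columns listed in the Value section, `BiocParallel::SerialParam()` is chosen here only to keep the run single-process, and the guard skips the code when fgsea is not installed.

```r
# Guarded so the snippet is skipped when fgsea is not installed.
if (requireNamespace("fgsea", quietly = TRUE)) {
  data(examplePathways, package = "fgsea")
  data(exampleRanks, package = "fgsea")

  set.seed(42)  # permutations are random, so fix the seed before the run
  res <- fgsea::fgseaSimple(examplePathways, exampleRanks,
                            nperm = 1000, maxSize = 500,
                            BPPARAM = BiocParallel::SerialParam())

  # Order pathways by nominal p-value and show the most significant ones.
  res <- res[order(res$pval), ]
  print(head(res[, c("pathway", "pval", "padj", "NES", "size")]))
}
```

With nperm = 1000 the smallest p-values in the table are limited to about 1/1000, so a larger 'nperm' is needed when finer resolution among the top pathways matters.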

ctlab/fgsea documentation built on Dec. 21, 2024, 1:55 p.m.