CondSEA: Performs Condition Set Enrichment Analysis

Description

Condition Set Enrichment Analysis (CondSEA) can be seen as a Gene-SEA performed over rows (as opposed to columns) of a matrix of GEPs. It tells how much a pathway is consistently dysregulated under a set of conditions (such as a set of drug treatments, disease states, cell types, etc.) when compared to a statistical background of other conditions.

Usage

CondSEA(rp_peps, pgset, bgset = "all", collections = "all",
  details = TRUE, rankingFun = rankPEPsByRows.ES, usecache = FALSE,
  sortoutput = TRUE)

Arguments

rp_peps

A repository created with createRepository, and containing PEPs created with buildPEPs.

pgset

A vector of condition names. Corresponding PEPs must exist in all the pathway collections currently in rp_peps.

bgset

The background against which to compare pgset. If set to "all" (default), all the remaining PEPs will be used. If provided, the corresponding PEPs must exist in all the pathway collections currently in rp_peps.

collections

A subset of the collection names returned by getCollections. If set to "all" (default), all the collections in rp_peps will be used.

details

If TRUE (default) rank details will be reported for each condition in pgset.

rankingFun

The function used to rank PEPs row-wise. By default rankPEPsByRows.ES is used, which ranks using gene set enrichment scores (see details).

usecache

If set to TRUE, the computed ranked matrix will be stored in the repository (see details). FALSE by default.

sortoutput

If TRUE (default) the output gene sets will be sorted in order of increasing p-value.

Details

For each pathway, all conditions are ranked by how much they dysregulate it (from the most UP-regulating to the most DOWN-regulating). Then, a Kolmogorov-Smirnov (KS) test is performed to compare the ranks assigned to conditions in pgset against the ranks assigned to conditions in bgset. A positive (negative) Enrichment Score (ES) of the KS test indicates that the pathway is UP- (DOWN-) regulated by pgset as compared to bgset. A p-value is associated with each ES.
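The comparison described above can be illustrated for a single pathway with base R alone. This is a minimal sketch, not the package's implementation: the helper name ksEnrichment and the toy data are hypothetical, and the signed ES is taken here as the largest-magnitude difference between the two empirical distribution functions, which is an assumption about how the sign is derived.

```r
# Hypothetical helper (not part of gep2pep): signed two-sample KS
# statistic computed on the ranks of one pathway.
# ranks: named numeric vector, rank 1 = most UP-regulating condition
# pgset, bgset: character vectors of condition names
ksEnrichment <- function(ranks, pgset, bgset) {
  x <- as.numeric(ranks[pgset])
  y <- as.numeric(ranks[bgset])
  x <- x[!is.na(x)]
  y <- y[!is.na(y)]
  # Evaluate both empirical CDFs at every observed rank
  grid <- sort(unique(c(x, y)))
  d <- ecdf(x)(grid) - ecdf(y)(grid)
  # Signed ES (assumption): the deviation with the largest magnitude.
  # It is positive when pgset ranks concentrate near the top.
  es <- d[which.max(abs(d))]
  pv <- ks.test(x, y)$p.value
  list(ES = es, PV = pv)
}

# Toy data: 10 conditions ranked for one pathway
ranks <- setNames(1:10, paste0("cond", 1:10))
pgset <- c("cond1", "cond2", "cond3")
bgset <- setdiff(names(ranks), pgset)

up <- ksEnrichment(ranks, pgset, bgset)
# Reversing the ranks moves pgset to the bottom of the ranking
down <- ksEnrichment(setNames(10:1, names(ranks)), pgset, bgset)
```

In this toy case pgset occupies the top three ranks, so the ES is positive; with the ranking reversed the ES changes sign while the p-value stays the same.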

When PEPs are obtained from drug-induced gene expression profiles, CondSEA is the Drug-Set Enrichment Analysis [1].

The rankingFun must take as input PEPs like those loaded from the repository and return a matrix of row-wise ranks. Each row must contain ranks from 1 to the number of PEPs minus the number of NAs in the row.
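A custom ranking function satisfying this contract might look like the sketch below. It assumes that the PEPs object is a list holding an "ES" matrix with pathways in rows and conditions in columns; that layout, and the names rankByRowsSketch and checkRowRanks, are illustrative assumptions rather than documented API.

```r
# Hypothetical ranking function. ASSUMPTION: `peps` is a list with an
# "ES" matrix (pathways in rows, conditions in columns).
rankByRowsSketch <- function(peps) {
  es <- peps[["ES"]]
  # Rank -ES so that rank 1 is the most UP-regulating condition;
  # na.last = "keep" leaves NA entries unranked.
  ranked <- t(apply(-es, 1, rank, ties.method = "first", na.last = "keep"))
  dimnames(ranked) <- dimnames(es)
  ranked
}

# Hypothetical checker: each row must hold the ranks 1..k, where k is
# the number of non-NA entries in that row.
checkRowRanks <- function(ranked) {
  all(apply(ranked, 1, function(r) {
    r <- r[!is.na(r)]
    setequal(r, seq_along(r))
  }))
}

# Toy PEPs: 2 pathways, 4 conditions, one missing value
toy <- list(ES = matrix(
  c( 0.9, -0.2, 0.4,  NA,
    -0.5,  0.1, 0.7, 0.3),
  nrow = 2, byrow = TRUE,
  dimnames = list(c("pathwayA", "pathwayB"), paste0("cond", 1:4))
))

ranked <- rankByRowsSketch(toy)
checkRowRanks(ranked)
```

The row containing an NA is ranked from 1 to 3 and the complete row from 1 to 4, which is the property the checker verifies.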

When usecache=TRUE, the ranked matrix is permanently stored in HDF5 format, and subsequent calls to CondSEA will load only the necessary ranks from disk (not the whole matrix). The correct cached data is identified by the alphabetically sorted set union(pgset, bgset), by the collection name, and by the ranking function. Additional calls to CondSEA with variations of these inputs will create additional cache entries. Cached data is hidden in the repository by default; it can be printed with rp_peps$print(all=TRUE) and cleared with clearCache(rp_peps).
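To make the identification rule concrete, the sketch below builds an illustrative key from the three inputs named above. The function cacheKeySketch, the key format, and the condition names are hypothetical; the way the package actually labels cached data is not described here.

```r
# Hypothetical illustration only: the real key format used by the
# package is an assumption.
cacheKeySketch <- function(pgset, bgset, collection, rankingFunName) {
  # The set of conditions is order-independent once sorted
  conditions <- sort(union(pgset, bgset))
  paste(c(collection, rankingFunName, conditions), collapse = "|")
}

k1 <- cacheKeySketch(c("drugB", "drugA"), c("drugC", "drugD"),
                     "c3_TFT", "rankPEPsByRows.ES")
# Same conditions listed in a different order
k2 <- cacheKeySketch(c("drugA", "drugB"), c("drugD", "drugC"),
                     "c3_TFT", "rankPEPsByRows.ES")
# A different background condition
k3 <- cacheKeySketch(c("drugA", "drugB"), c("drugC", "drugE"),
                     "c3_TFT", "rankPEPsByRows.ES")

identical(k1, k2)  # same union, collection, and function: same key
identical(k1, k3)  # different union: a separate cache entry
```

Because only the union of pgset and bgset enters the key, moving a condition from the background into pgset would, under this reading, reuse the same ranked matrix.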

Value

A list of 2 elements, named "CondSEA" and "details". The "CondSEA" entry is a 2-column matrix including ESs and p-values (see details) for each pathway database and condition. The "details" entry reports the rank of each condition in pgset for each pathway.

References

[1] Napolitano F. et al, Drug-set enrichment analysis: a novel tool to investigate drug mode of action. Bioinformatics 32, 235-241 (2016).

See Also

getResults, getDetails, clearCache

Examples

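library(gep2pep)

## creating a temporary repository from the sample pathway collections
db <- loadSamplePWS()
repo_path <- file.path(tempdir(), "gep2pepTemp")
rp <- createRepository(repo_path, db)

## building PEPs from the sample gene expression profiles
geps <- loadSampleGEP()
buildPEPs(rp, geps)

## running CondSEA on two conditions against all the remaining ones
pgset <- c("(+)_chelidonine", "(+/_)_catechin")
psea <- CondSEA(rp, pgset)

## extracting the results for the "c3_TFT" collection
res <- getResults(psea, "c3_TFT")

## getting the names of the top pathways
setId2setName(loadCollection(rp, "c3_TFT"), rownames(res))

## removing the temporary repository
unlink(repo_path, recursive = TRUE)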

franapoli/gep2pep documentation built on May 30, 2019, 4:34 p.m.