multienrichjam: Prepare MultiEnrichMap data from enrichList

multienrichjam R Documentation

Prepare MultiEnrichMap data from enrichList

Description

Prepare MultiEnrichMap data from enrichList

Usage

multienrichjam(
  enrichList,
  enrichLabels = NULL,
  p_cutoff = 0.05,
  min_count = 3,
  top_enrich_n = 20,
  geneHitList = NULL,
  geneHitIM = NULL,
  colorV = NULL,
  enrichNumLimit = 4,
  nEM = 500,
  descriptionCurateFrom = c("^Genes annotated by the GO term "),
  descriptionCurateTo = c(""),
  descriptionCurateFn = fixSetLabels,
  geneDelim = "[,/ ]+",
  returnType = c("Mem", "list"),
  verbose = FALSE,
  ...
)

Arguments

enrichList

list of enrichResult objects, whose names are used in subsequent derived results.

enrichLabels

character vector of enrichment labels to use, as an optional alternative to names(enrichList).

geneHitList

list of character vectors, list of numeric vectors whose names represent genes, or NULL. When NULL, the gene hit list for each enrichment result is inferred from the enrichment results themselves; however, this option may incompletely represent which genes were statistical hits. Note that geneHitList and geneHitIM serve the same purpose, and either can be supplied.

geneHitIM

numeric matrix with gene rows, enrichment columns, and numeric values indicating the presence and/or direction of change for each gene. Note that geneHitList and geneHitIM serve the same purpose and either can be supplied.
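The relationship between the two input forms can be sketched in base R. This is only an illustration with made-up gene names and hypothetical variable names (gene_hit_list, gene_hit_im); it is not necessarily how the package performs the conversion internally.

```r
# Hypothetical gene hits for two enrichment results (illustrative data only)
gene_hit_list <- list(
  A = c("TP53", "MYC", "EGFR"),
  B = c("MYC", "KRAS"))

# Every gene observed in any list becomes a matrix row
all_genes <- sort(unique(unlist(gene_hit_list)))

# Fill 1 where a gene is a hit in that enrichment, 0 otherwise
gene_hit_im <- sapply(gene_hit_list, function(genes) {
  as.numeric(all_genes %in% genes)
})
rownames(gene_hit_im) <- all_genes

# Gene rows, enrichment columns
gene_hit_im
```

A directional matrix would follow the same layout, with signed values such as 1 and -1 in place of simple presence flags.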

colorV

character vector of colors, length equal to length(enrichList), used to assign specific colors to each enrichment result.

enrichNumLimit

numeric value indicating the -log10(P-value) above which each color gradient is considered the maximum color, useful to apply a fixed threshold for each color gradient.

nEM

integer number, to define the maximum number of pathway nodes to include in the EnrichMap igraph network. This argument is passed to enrichMapJam().

descriptionCurateFn

function, default fixSetLabels(), used to curate each pathway description into a user-friendly label. When NULL, this step is skipped.

geneDelim

character pattern used with strsplit() to split multiple gene values into a list of vectors. The default for enrichResult objects is "/", but the default for other sources is often ",". The default pattern "[,/ ]+" splits by either "/", ",", or whitespace " ".
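As a quick check of the default pattern, the following base R sketch applies strsplit() to three made-up gene strings, one per delimiter style. The variable names are hypothetical.

```r
# Default delimiter pattern described above
gene_delim <- "[,/ ]+"

# Illustrative gene strings using "/", ", " and whitespace as delimiters
gene_strings <- c("TP53/MYC/EGFR", "KRAS, BRAF", "AKT1 MTOR")

# strsplit() returns a list with one character vector per input string
gene_sets <- strsplit(gene_strings, gene_delim)
gene_sets
```

Because the pattern ends with "+", a run such as ", " is treated as a single delimiter and does not produce empty gene values.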

returnType

character string indicating the output class, default "Mem":

  • 'Mem' returns Mem S4 object

  • 'list' returns legacy list format (deprecated)

verbose

logical indicating whether to print verbose output. For verbose to cascade to internal functions, use verbose=2.

...

additional arguments are passed to internal functions.

  • topEnrichListBySource() is used to take the top topEnrichN pathways from each enrichment, and can also subset by other criteria such as descriptionGrep, nameGrep, sourceSubset, subsetSets, etc.

nrow, ncol, byrow

optional arguments used to customize igraph node shape "coloredrectangle", useful when the number of enrichList results is larger than around 4. It defines the number of columns and rows used for each node, to display enrichment result colors, and whether to fill colors by row when byrow=TRUE, or by column when byrow=FALSE.
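The effect of byrow on the layout can be previewed with base R matrix(), which uses the same nrow, ncol, and byrow semantics. This sketch only shows the grid arrangement of six made-up colors; it does not draw the "coloredrectangle" node shape itself.

```r
# Six illustrative enrichment colors to arrange in a 2 x 3 grid
colors <- c("red", "orange", "gold", "green", "blue", "purple")

# byrow=TRUE fills the first row left to right, then the second row
by_row <- matrix(colors, nrow = 2, ncol = 3, byrow = TRUE)

# byrow=FALSE fills the first column top to bottom, then the next column
by_col <- matrix(colors, nrow = 2, ncol = 3, byrow = FALSE)

by_row
by_col
```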

subsetSets

character vector of optional set names to use in the analysis, useful to analyze only a specific subset of known pathways.

overlapThreshold

numeric value between 0 and 1, indicating the Jaccard overlap score above which two pathways will be linked in the EnrichMap igraph network. By default, pathways whose Jaccard overlap exceeds 0.1 are connected, meaning the shared genes make up more than about 10% of the genes in the two pathways combined. Note that the Jaccard coefficient is adversely affected when pathway sets differ in size by more than about 5-fold.
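The Jaccard score and its sensitivity to set size can be illustrated with a small base R sketch. The helper jaccard() and the gene sets are hypothetical, written only for this example.

```r
# Jaccard coefficient: shared genes divided by all genes in either set
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

set_small <- paste0("gene", 1:10)
set_other <- paste0("gene", 6:15)    # similar size, shares 5 genes with set_small
set_large <- paste0("gene", 1:100)   # 10-fold larger, contains all of set_small

overlap_threshold <- 0.1

# 5 shared genes out of 15 total
jaccard(set_small, set_other) > overlap_threshold

# 10 shared genes out of 100 total
jaccard(set_small, set_large) > overlap_threshold
```

The second comparison shows the size effect: every gene in the small set is present in the large set, yet the score does not exceed the threshold because the union is dominated by the larger set.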

cutoffRowMinP

numeric value between 0 and 1, indicating the enrichment P-value required by at least one enrichment result, to be retained in downstream analyses. This P-value can be confirmed in the returned list element "enrichIM", which is a matrix of P-values by pathway and enrichment.
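The filter can be pictured as a row-wise minimum on a P-value matrix shaped like "enrichIM". The sketch below uses made-up values and hypothetical variable names, and it assumes the comparison is "less than or equal to"; the package may apply the cutoff differently.

```r
# Illustrative P-value matrix: pathway rows, enrichment columns
enrich_im <- matrix(
  c(0.001, 0.20,
    0.30,  0.40,
    0.04,  0.01),
  ncol = 2, byrow = TRUE,
  dimnames = list(c("pathway1", "pathway2", "pathway3"), c("A", "B")))

cutoff_row_min_p <- 0.05

# Smallest P-value observed for each pathway across all enrichments
row_min_p <- apply(enrich_im, 1, min, na.rm = TRUE)

# Keep pathways where at least one enrichment meets the cutoff (assumed <=)
keep <- row_min_p <= cutoff_row_min_p
enrich_im_kept <- enrich_im[keep, , drop = FALSE]
enrich_im_kept
```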

enrichBaseline

numeric value indicating the -log10(P-value) at which colors are defined as non-blank in color gradients. This value is typically derived from cutoffRowMinP to ensure that colors are only applied when a pathway meets this significance threshold.
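To show how enrichBaseline and enrichNumLimit bound a color gradient, here is a minimal sketch that maps P-values to a 0-to-1 intensity. The linear scaling and the variable names are assumptions made for illustration; the gradient actually used by the package, including any enrichLens adjustment, is not described here.

```r
# Illustrative P-values for one enrichment
p_values <- c(0.5, 0.05, 0.001, 1e-8)

enrich_baseline <- -log10(0.05)   # derived from a P-value cutoff of 0.05
enrich_num_limit <- 4             # -log10(P) treated as the maximum color

score <- -log10(p_values)

# Cap the score at the limit, then scale linearly between baseline and limit
# (assumed linear scaling, for illustration only)
intensity <- (pmin(score, enrich_num_limit) - enrich_baseline) /
  (enrich_num_limit - enrich_baseline)

# Scores below the baseline are left blank (no color)
intensity[score < enrich_baseline] <- NA
intensity
```

In this sketch, a P-value of 0.5 receives no color, 0.05 sits at the start of the gradient, and anything at or beyond a -log10(P-value) of 4 receives the maximum color.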

enrichLens

numeric value indicating the "lens" to apply to color gradients, where numbers above 0 make the color ramp more compressed, so colors are more vivid at lower numeric values.

topEnrichN

integer value indicating the maximum number of rows to retain from each enrichList table, by source. Set topEnrichN=0 or topEnrichN=NULL to disable subsetting for the top rows.

pathGenes, geneHits

character values indicating the colnames that contain the number of pathway genes, and the number of gene hits, respectively.

Details

This function performs most of the work of comparing multiple enrichment results. It takes a list of enrichResult objects, generates an overall pathway-gene incidence matrix, assembles a pathway-to-P-value matrix, and creates EnrichMap and CnetPlot igraph network objects. It also applies node shapes and colors consistent with the colors used for each enrichment result.

By default, each enrichment result table is subsetted for the top n=20 pathways sorted by pathway source, defined by colnames c("Source", "Category"). For data without a source column, the overall enrichment results are sorted to take the top 20. Once the top 20 from each enrichment table are selected, the combined set of pathways is used to retain these pathways from all enrichment tables. In this way, a significant enrichment result from one table will still be compared to a non-significant result from another table.
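The two-step selection described above (top pathways per source, then the union applied to every table) can be sketched in base R. The tables, column names ("Name", "Source", "P"), and the helper top_by_source() are hypothetical and written only for this illustration; the package uses topEnrichListBySource() for the real step.

```r
# Build a small illustrative enrichment table from a vector of P-values
make_table <- function(p) {
  data.frame(
    Name = paste0("set", 1:6),
    Source = rep(c("CP", "GO"), each = 3),
    P = p,
    stringsAsFactors = FALSE)
}
tab_a <- make_table(c(0.001, 0.2, 0.5, 0.01, 0.3, 0.6))
tab_b <- make_table(c(0.4, 0.002, 0.7, 0.5, 0.6, 0.003))

# Names of the top n pathways within each source, ranked by P-value
top_by_source <- function(df, n) {
  pieces <- lapply(split(df, df$Source), function(d) {
    head(d[order(d$P), , drop = FALSE], n)
  })
  unlist(lapply(pieces, function(d) d$Name), use.names = FALSE)
}

top_n <- 1  # the documented default is 20; 1 keeps the example small

# Union of the top pathways selected from each table
keep_names <- union(top_by_source(tab_a, top_n), top_by_source(tab_b, top_n))

# Apply the same pathway set to every table
tab_a_kept <- tab_a[tab_a$Name %in% keep_names, ]
tab_b_kept <- tab_b[tab_b$Name %in% keep_names, ]
tab_a_kept
```

Here "set2" is retained in tab_a_kept even though its P-value there is 0.2, because it was the top "CP" pathway in tab_b.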

The default values for topEnrichN and related arguments are intended for enrichment results from MSigDB, which has colnames c("Source","Category") and represents 100 or more combinations of sources and categories. The default values will select the top 20 entries from the canonical pathways, after curating the canonical pathway categories to one "CP" source value.

To disable the top pathway filtering, set topEnrichN=0.

Colors can be defined for each enrichment result using the argument colorV, otherwise colors are assigned using colorjam::rainbowJam().

Value

Mem S4 object by default (returnType="Mem"), or, when returnType="list", a legacy list object containing various result formats:

  • colorV: named vector of colors assigned to each enrichment, where names match the names of each enrichment in enrichList.

See Also

Other jam enrichment functions: add_pathway_direction(), multiEnrichMap(), topEnrichBySource()

Examples

## See the Vignette for a full walkthrough example


jmw86069/jamenrich documentation built on June 13, 2025, 6:16 a.m.