ora: Performs an overrepresentation analysis, (optionally)...
In lianos/sparrow: Take command of set enrichment analyses through a unified interface

View source: R/do.ora.R

ora	R Documentation

Performs an overrepresentation analysis, (optionally) accounting for bias.

Description

This function wraps limma::kegga() to perform biased overrepresntation analysis over gene set collection stored in a GeneSetDb (gsd) object. Its easiest to use this function when the biases and selection criteria are stored as columns of the input data.frame dat.

Usage

ora(
  x,
  gsd,
  selected = "significant",
  groups = NULL,
  feature.bias = NULL,
  universe = NULL,
  restrict.universe = FALSE,
  plot.bias = FALSE,
  ...,
  as.dt = FALSE
)

plot_ora_bias(x, selected, feature.bias, ...)

Arguments

`x`	A data.frame with feature-level statistics. Minimally, this should have a `"feature_id"` (character) column, but read on ...
`gsd`	The GeneSetDb
`selected`	Either the name of a logical column in `dat` used to subset out the features to run the enrichement over, or a character vector of `"feature_id"`s that are selected from `dat[["feature_id"]]`.
`groups`	Encodes groups of features that we can use to test selected features individual, as well as "all" together. This can be specified by: (1) specifying a name of a column in `dat` to split the enriched features into subgroups. (2) A named list of features to intersect with `selected`. By default this is `NULL`, so we only run enrichment over all elements in `selected`. See examples for details.
`feature.bias`	If `NULL` (default), no bias is used in enrichment analysis. Otherwise, can be the name of a column in `dat` to extract a numeric bias vector (gene length, GC content, average expression, etc.) or a named (using featureIds) numeric vector of the same. The BiasedUrn CRAN package is required when this is not NULL.
`universe`	Defaults to all elements in `dat[["feature_id"]]`.
`restrict.universe`	See same parameter in `limma::kegga()`
`plot.bias`	See `plot` parameter in `limma::kegga()`. You can generate this plot without running `ora` using the `plot_ora_bias()`, like so: `plot_ora_bias(dat, selected = selected, groups = groups, feature.bias = feature.bias)`
`...`	parameters passed to `conform()`
`as.dt`	If `FALSE` (default), the data.frame like thing that this funciton returns will be set to a data.frame. Set this to `TRUE` to keep this object as a `data.table`

Details

In principle, this test does what goseq does, however I found that sometimes calling goseq would throw errors within goseq::nullp() when calling makesplines. I stumbled onto this implementation when googling for these errors and landing here: https://support.bioconductor.org/p/65789/#65914

The meat and potatoes of this function's code was extracted from limma::kegga(), written by Gordon Smyth and Yifang Hu.

Note that the BiasedUrn CRAN package needs to be installed to support biased enrichment testing

Value

A data.frame of pathway enrichment. The last N colums are enrichment statistics per pathway, grouped by the groups parameter. P.all are the stats for all selected features, and the remaingin ⁠P.*⁠ columns are for the features specifed by groups.

Functions

plot_ora_bias(): plots the bias of coviarate to DE / selected status. Code taken from limma::kegga()

References

Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010). Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11, R14. http://genomebiology.com/2010/11/2/R14

Examples

dgestats <- exampleDgeResult()
gdb <- randomGeneSetDb(dgestats)

# Run enrichmnent without accounting for any bias
nobias <- ora(dgestats, gdb, selected = "selected", groups = "direction",
              feature.bias = NULL)

# Run enrichment and account for gene length
lbias <- ora(dgestats, gdb, selected = "selected",
             feature.bias = "effective_length")

# plot length bias with DGE status
plot_ora_bias(dgestats, "selected", "effective_length")

# induce length bias and see what is the what ...............................
biased <- dgestats[order(dgestats$pval),]
biased$effective_length <- sort(biased$effective_length, decreasing = TRUE)
plot_ora_bias(biased, "selected", "effective_length")
etest <- ora(biased, gdb, selected = "selected",
             groups = "direction",
             feature.bias = "effective_length")

lianos/sparrow documentation built on July 17, 2025, 3:14 a.m.