ora: Performs an overrepresentation analysis, (optionally)...

View source: R/do.ora.R

Description

This function wraps limma::kegga() to perform biased overrepresentation analysis over the gene set collection stored in a GeneSetDb (gsd) object. It's easiest to use this function when the biases and selection criteria are stored as columns of the input data.frame dat.
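As a rough illustration of what an unbiased overrepresentation test computes for a single gene set, the base-R sketch below runs a one-sided hypergeometric test. The feature ids, the gene set, and the helper name ora_one_set are hypothetical and not part of multiGSEA; this is a sketch of the general technique, not the package's implementation.

```r
# Toy universe of 20 features; all ids are made up for illustration.
universe <- sprintf("gene%02d", 1:20)
selected <- universe[1:6]  # features called "significant"
gene_set <- c("gene01", "gene02", "gene03", "gene15", "gene16")

# Hypothetical helper: P(overlap >= observed) under the hypergeometric model.
ora_one_set <- function(gene_set, selected, universe) {
  gene_set <- intersect(gene_set, universe)
  selected <- intersect(selected, universe)
  overlap <- length(intersect(gene_set, selected))
  phyper(overlap - 1,
         m = length(gene_set),                     # set members in the universe
         n = length(universe) - length(gene_set),  # non-members in the universe
         k = length(selected),                     # number of selected features
         lower.tail = FALSE)
}

pval <- ora_one_set(gene_set, selected, universe)
```

A biased test replaces the plain hypergeometric distribution with one that weights features by a covariate such as gene length, which is where the BiasedUrn package comes in.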

Usage

ora(
  gsd,
  dat,
  selected = "significant",
  groups = NULL,
  feature.bias = NULL,
  universe = NULL,
  restrict.universe = FALSE,
  plot.bias = FALSE,
  ...,
  as.dt = FALSE,
  .pipelined = FALSE
)

Arguments

gsd

The GeneSetDb object holding the gene set collection to test.

dat

A data.frame with feature-level statistics. Minimally, this should have a "feature_id" (character) column, but read on ...

selected

Either the name of a logical column in dat used to subset out the features to run the enrichment over, or a character vector of "feature_id"s that are selected from dat[["feature_id"]].

groups

Encodes groups of features that we can use to test selected features individually, as well as "all" together. This can be specified as either (1) the name of a column in dat used to split the selected features into subgroups, or (2) a named list of features to intersect with selected. By default this is NULL, so we only run enrichment over all elements in selected. See examples for details.

feature.bias

If NULL (default), no bias is used in enrichment analysis. Otherwise, can be the name of a column in dat to extract a numeric bias vector (gene length, GC content, average expression, etc.) or a named (using featureIds) numeric vector of the same. The BiasedUrn CRAN package is required when this is not NULL.

universe

Defaults to all elements in dat[["feature_id"]].

restrict.universe

See the same parameter in limma::kegga().

plot.bias

See the plot parameter in limma::kegga(). You can generate this plot without running ora by calling plot_ora_bias(), like so: plot_ora_bias(dat, selected = selected, groups = groups, feature.bias = feature.bias)
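The alternative ways of specifying selected, groups, and feature.bias described above can be sketched in base R as follows. The data.frame contents, the padj and logFC column names, and the 0.05 cutoff are made up for illustration; only the "feature_id" column name and the default selected = "significant" come from this page.

```r
# Toy feature-level statistics; every value here is made up.
dat <- data.frame(
  feature_id = sprintf("gene%02d", 1:8),
  logFC = c(2.1, -1.8, 0.2, 1.5, -0.1, -2.4, 0.3, 1.9),
  padj = c(0.001, 0.004, 0.60, 0.02, 0.85, 0.003, 0.40, 0.01),
  effective_length = c(1200, 800, 2500, 3100, 950, 1700, 400, 2200),
  stringsAsFactors = FALSE
)

# `selected` as a logical column of dat ...
dat$significant <- dat$padj < 0.05
# ... or as a character vector of feature ids.
selected_ids <- dat$feature_id[dat$significant]

# `groups` as a column of dat ...
dat$direction <- ifelse(dat$logFC > 0, "up", "down")
# ... or as a named list of feature ids (to be intersected with `selected`).
group_list <- split(dat$feature_id, dat$direction)

# `feature.bias` as a named numeric vector instead of a column name.
bias <- setNames(dat$effective_length, dat$feature_id)
```

With inputs shaped like this, the column-name form would presumably be passed as selected = "significant", groups = "direction", feature.bias = "effective_length", and the vector/list form as selected = selected_ids, groups = group_list, feature.bias = bias.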

Details

In principle, this test does what goseq does. However, I found that calling goseq would sometimes throw errors within goseq::nullp() when calling makesplines. I stumbled onto this implementation when googling for these errors and landing here: https://support.bioconductor.org/p/65789/#65914

The meat and potatoes of this function's code was extracted from limma::kegga(), written by Gordon Smyth and Yifang Hu.

Note that the BiasedUrn CRAN package needs to be installed to support biased enrichment testing.
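One way to guard against a missing BiasedUrn installation before requesting a biased test is sketched below. It uses only base R; the variable name has_biasedurn and the message wording are illustrative.

```r
# Check whether BiasedUrn can be loaded, without attaching it.
has_biasedurn <- requireNamespace("BiasedUrn", quietly = TRUE)

if (!has_biasedurn) {
  message("BiasedUrn is not installed; run install.packages(\"BiasedUrn\") ",
          "before setting feature.bias.")
}
```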

Value

A data.frame of pathway enrichment. The last N columns are enrichment statistics per pathway, grouped by the groups parameter. P.all holds the statistics for all selected features, and the remaining P.* columns are for the features specified by groups.
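A minimal sketch of working with the P.* columns follows. The res data.frame is a hand-made stand-in for an ora() result: the set names, the P.up and P.down column names, and all p-values are assumptions. Whether ora() already returns adjusted p-values is not stated here, so the adjustment step is shown only as an illustration.

```r
# Hypothetical result shaped like the description above; all values are made up.
res <- data.frame(
  name = c("SET_A", "SET_B", "SET_C"),
  P.all = c(0.001, 0.20, 0.04),
  P.up = c(0.0005, 0.50, 0.30),
  P.down = c(0.40, 0.10, 0.01),
  stringsAsFactors = FALSE
)

# Find the per-group p-value columns and BH-adjust each one separately.
pcols <- grep("^P\\.", names(res), value = TRUE)
padj <- as.data.frame(lapply(res[pcols], p.adjust, method = "BH"))
names(padj) <- sub("^P\\.", "padj.", pcols)
res <- cbind(res, padj)

# Rank gene sets by the p-value computed over all selected features.
res <- res[order(res$P.all), ]
```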

References

Young, M. D., Wakefield, M. J., Smyth, G. K., Oshlack, A. (2010). Gene ontology analysis for RNA-seq: accounting for selection bias. Genome Biology 11, R14. http://genomebiology.com/2010/11/2/R14

Examples

dgestats <- exampleDgeResult("human", "ensembl")
gdb <- getMSigGeneSetDb("h", "human", "ensembl")

# Run enrichment without accounting for any bias
nobias <- ora(gdb, dgestats, selected = "selected", groups = "direction",
              feature.bias = NULL)

# Run enrichment and account for gene length
lbias <- ora(gdb, dgestats, selected = "selected",
             feature.bias = "effective_length")

# Plot length bias against DGE status
plot_ora_bias(dgestats, "selected", "effective_length")

# Induce a length bias by giving the most significant genes the longest
# effective lengths, then see how the bias plot and enrichment change
biased <- dgestats[order(dgestats$pval),]
biased$effective_length <- sort(biased$effective_length, decreasing = TRUE)
plot_ora_bias(biased, "selected", "effective_length")
etest <- ora(gdb, biased, selected = "selected",
             groups = "direction",
             feature.bias = "effective_length")

lianos/multiGSEA documentation built on Nov. 17, 2020, 1:26 p.m.