Model-based gene set analysis (MGSA)

Description

This method is a wrapper for the mgsa methods from the Bioconductor package mgsa, which must be available on the system for the methods to run. The model-based gene set analysis (MGSA) analyzes all categories at once by embedding them in a Bayesian network, naturally taking overlap between categories into account and avoiding the need for multiple testing correction. Please consult the mgsa help page for more details.

Arguments

query

A CMAPCollection, GeneSet, or GeneSetCollection object containing the 'query' gene sets to compare against the 'sets'

sets

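A CMAPCollection, GeneSetCollection, or GeneSet object containing the gene sets to compare the 'query' against. An NChannelSet is also accepted (see Methods); it is thresholded on the fly with induceCMAPCollection using the 'element', 'lower', 'higher' and 'min.set.size' arguments.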

universe

A character vector of gene ids for all genes that could potentially be of interest, e.g. all genes represented on a microarray, all annotated genes, etc.

keep.scores

Logical: store the identifiers of the genes detected in 'query' and 'sets'? (Default: FALSE) The size of the generated CMAPResults object increases with the number of gene sets it contains. For very large collections, setting this parameter to TRUE may require large amounts of memory.

element

A character string corresponding to the assayDataElementName of the NChannelSet object to be thresholded on the fly with induceCMAPCollection.

lower

The 'lower' threshold passed to induceCMAPCollection.

higher

The 'higher' threshold passed to induceCMAPCollection.

min.set.size

Minimum number of genes a gene set induced by induceCMAPCollection must contain to be included in the analysis (Default: 5).
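The interplay of 'lower', 'higher' and 'min.set.size' can be sketched in base R. This is only an illustration of the thresholding idea, not the implementation used by induceCMAPCollection: the toy score matrix, the strict comparisons and the +1/-1 membership coding are all assumptions made for this example.

```r
# Toy z-score matrix: 10 genes (rows) x 2 instances (columns). Hypothetical data.
z <- matrix(0, nrow = 10, ncol = 2,
            dimnames = list(paste0("gene", 1:10), c("instA", "instB")))
z[1:4, "instA"] <- 3     # four genes above the 'higher' threshold
z[5:6, "instA"] <- -3    # two genes below the 'lower' threshold
z[1:2, "instB"] <- 2.5   # only two genes pass in instB

lower <- -2
higher <- 2
min.set.size <- 5

# Assumed coding: +1 above 'higher', -1 below 'lower', 0 otherwise
membership <- (z > higher) - (z < lower)

# Drop induced sets with fewer than min.set.size member genes
set_sizes <- colSums(membership != 0)
kept <- membership[, set_sizes >= min.set.size, drop = FALSE]

colnames(kept)
```

In this toy case instA has six member genes and is kept, while instB has only two and is dropped.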

...

Additional arguments passed to the mgsa function from the mgsa package, including the following:

  • alpha: Grid of values for the parameter alpha. Values represent probabilities of false-positive events and hence must be in [0,1]. numeric.

  • beta: Grid of values for the parameter beta. Values represent probabilities of false-negative events and hence must be in [0,1]. numeric.

  • p: Grid of values for the parameter p. Values represent probabilities of term activity and therefore must be in [0,1]. numeric.

  • steps: The number of steps of each run of the MCMC sampler. integer of length 1. A recommended value is 1e6 or greater.

  • restarts: The number of different runs of the MCMC sampler. integer of length 1. Must be greater than or equal to 1. A recommended value is 5 or greater.

  • threads: The number of threads that should be used for concurrent restarts. A value of 0 means to use all available cores. Defaults to 'getOption("mc.cores", default=0)', so mgsa uses all available cores unless the mc.cores option is set.
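As a rough sketch of how these arguments might be supplied, the code below builds parameter grids, checks that they lie in [0,1], and resolves the thread count from the mc.cores option. The grid values and the reduced 'steps' and 'restarts' settings are illustrative assumptions chosen to keep the run short; they are not recommendations. The call itself is only attempted when both gCMAP and mgsa are installed.

```r
# Illustrative grids (assumed values); every entry must lie in [0, 1]
alpha_grid <- seq(0.01, 0.3, length.out = 10)  # false-positive probabilities
beta_grid  <- seq(0.1, 0.8, length.out = 10)   # false-negative probabilities
p_grid     <- seq(0.05, 0.3, length.out = 10)  # term activity probabilities

for (grid in list(alpha_grid, beta_grid, p_grid)) {
  stopifnot(is.numeric(grid), all(grid >= 0 & grid <= 1))
}

# 0 (the fallback used here) means "use all available cores"
threads <- getOption("mc.cores", default = 0)

if (requireNamespace("gCMAP", quietly = TRUE) &&
    requireNamespace("mgsa", quietly = TRUE)) {
  library(gCMAP)
  data(gCMAPData)
  sets <- induceCMAPCollection(gCMAPData, "z", higher = 2, lower = -2)

  # The extra arguments are forwarded to mgsa through '...'
  res <- mgsa_score(sets, sets, universe = featureNames(gCMAPData),
                    alpha = alpha_grid, beta = beta_grid, p = p_grid,
                    steps = 1e5, restarts = 2, threads = threads)
}
```

Setting options(mc.cores = 2) before the call would cap the number of threads at two instead of using every core.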

Value

A CMAPResults object. The reported p-values represent '1-marginal posterior probability'. For the 'effect' column, the p-values have been transformed to z-scores using a standard normal distribution.
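The relationship between posterior probabilities, reported p-values and z-scores can be sketched in base R. The helper p_to_z below is hypothetical and is not part of gCMAP or mgsa; this page does not state whether a one-tailed or two-tailed conversion is applied, so the sketch exposes that choice as an assumed 'tails' argument.

```r
# Hypothetical marginal posterior probabilities for three gene sets
posterior <- c(setA = 0.95, setB = 0.50, setC = 0.01)

# Reported "p-value": 1 - marginal posterior probability
pval <- 1 - posterior

# Hypothetical helper: convert p-values to z-scores under a standard normal.
# tails = 1 treats pval as an upper-tail probability, tails = 2 splits it
# across both tails. Which convention the package uses is an assumption here.
p_to_z <- function(pval, tails = 2) {
  stopifnot(tails %in% c(1, 2), all(pval >= 0 & pval <= 1))
  qnorm(pval / tails, lower.tail = FALSE)
}

z_one_tailed <- p_to_z(pval, tails = 1)
z_two_tailed <- p_to_z(pval, tails = 2)
```

Under either convention, a higher posterior probability gives a smaller p-value and therefore a larger z-score.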

Methods

signature(query = "GeneSet", sets = "CMAPCollection", universe = "character")
signature(query = "GeneSet", sets = "NChannelSet", universe = "character")
signature(query = "SignedGeneSet", sets = "CMAPCollection", universe = "character")
signature(query = "SignedGeneSet", sets = "NChannelSet", universe = "character")
signature(query = "GeneSetCollection", sets = "CMAPCollection", universe = "character")
signature(query = "GeneSetCollection", sets = "NChannelSet", universe = "character")
signature(query = "GeneSetCollection", sets = "GeneSetCollection", universe = "character")
signature(query = "GeneSet", sets = "GeneSetCollection", universe = "character")
signature(query = "GeneSet", sets = "GeneSet", universe = "character")
signature(query = "CMAPCollection", sets = "CMAPCollection", universe = "character")
signature(query = "CMAPCollection", sets = "GeneSetCollection", universe = "character")

Note

This Bayesian approach does not require any additional correction of p-values for multiple testing. For consistency, the returned CMAPResults object contains a padj column duplicating the content of the pval column.

See Also

mgsa

Examples

if (is.element("mgsa", installed.packages()[, 1])) {
  library(gCMAP)  # provides gCMAPData, induceCMAPCollection and mgsa_score
  require("mgsa", character.only = TRUE)

  ## threshold the example z-scores on the fly to induce gene sets
  data(gCMAPData)
  gene.set.collection <- induceCMAPCollection(gCMAPData, "z",
                                              higher = 2, lower = -2)

  ## compare all gene sets in the gene.set.collection
  ## to each other
  universe <- featureNames(gCMAPData)
  mgsa_score(gene.set.collection, gene.set.collection,
             universe = universe)
}
