getMSigGeneSetDb: Fetches a 'GeneSetDb' from geneset collections defined in...


View source: R/get-msigdb.R

Description

This provides versioned gene sets from the gene set collections defined in MSigDB. Collections are retrieved by name, e.g. c("H", "C2", "C7").

Usage

getMSigGeneSetDb(
  collection = "H",
  species = "human",
  id.type = c("ensembl", "entrez", "symbol"),
  with.kegg = FALSE,
  allow_multimap = TRUE,
  min_ortho_sources = 2,
  promote_subcategory_to_collection = TRUE,
  version = NULL,
  ...
)

Arguments

collection

A character vector specifying the collections you want (c1, c2, ..., c7, h). By default only the hallmark collection is loaded. Setting this to NULL loads all collections. Alternatively, you can include named subsets of collections, like "reactome". Refer to the Details section for more information.

species

The species to retrieve gene sets for: "human" or "mouse".

with.kegg

The Broad distributes the latest versions of the KEGG genesets as part of the c2 collection. These genesets come with a restricted license, so by default we do not return them as part of the GeneSetDb. To include the KEGG gene sets when asking for the c2 collection, set this flag to TRUE.

allow_multimap, min_ortho_sources

Configure how orthology mapping is handled: whether multimappers are allowed, and what level of database support is required. See the help for msigdb.data::msigdb_retrieve().

version

the version of the MSigDB database to use.
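A minimal sketch of passing the orthology arguments when asking for mouse gene sets. It assumes the multiGSEA package and its MSigDB data source are installed. The values chosen for allow_multimap and min_ortho_sources are illustrative only and are not recommendations; their exact semantics are documented in msigdb.data::msigdb_retrieve().

```r
# Illustrative (hypothetical) orthology settings: disallow multimappers and
# keep the default level of database support from the Usage signature.
ortho.args <- list(allow_multimap = FALSE, min_ortho_sources = 2)

# Guarded so the snippet does nothing when multiGSEA is not installed.
if (requireNamespace("multiGSEA", quietly = TRUE)) {
  gdb.mouse <- multiGSEA::getMSigGeneSetDb(
    "H",
    species = "mouse",
    id.type = "entrez",
    allow_multimap = ortho.args$allow_multimap,
    min_ortho_sources = ortho.args$min_ortho_sources
  )
  print(gdb.mouse)
}
```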

Details

Some subsets of the curated gene sets within C2 can be retrieved by name, like "reactome", "kegg", "biocarta", and "pid". You can, for instance, call this function with collection = c("reactome", "H"), and the reactome subset of C2 will be returned along with all of the hallmark gene sets. When invoked like this, these "blessed" subsets are promoted out of the C2 collection and into collections of their own. This happens when promote_subcategory_to_collection = TRUE (the default).

The GO gene sets in C5 are likewise promoted out of C5 and into their own "GO_MP", "GO_BP", and "GO_MF" collections.
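The promotion behavior described above can be sketched roughly as follows: the reactome subset is requested together with the hallmark collection, once with promotion on and once with it off. This assumes the multiGSEA package and its MSigDB data source are installed. geneSets() is assumed here to be the multiGSEA accessor that tabulates gene sets with a collection column, so treat the inspection step as illustrative rather than definitive.

```r
# Collections to request: the "reactome" subset of C2 plus the hallmark sets.
collections <- c("reactome", "H")

# Guarded so the snippet does nothing when multiGSEA is not installed.
if (requireNamespace("multiGSEA", quietly = TRUE)) {
  # Promotion on: reactome should appear as its own collection.
  gdb.promoted <- multiGSEA::getMSigGeneSetDb(
    collections, species = "human", id.type = "entrez",
    promote_subcategory_to_collection = TRUE
  )

  # Promotion off: going by the description above, the reactome gene sets
  # are expected to remain filed under C2.
  gdb.nested <- multiGSEA::getMSigGeneSetDb(
    collections, species = "human", id.type = "entrez",
    promote_subcategory_to_collection = FALSE
  )

  # Assumption: geneSets() returns a table with a `collection` column.
  print(table(multiGSEA::geneSets(gdb.promoted)$collection))
  print(table(multiGSEA::geneSets(gdb.nested)$collection))
}
```

Comparing the two tables is a quick way to check which collection names your downstream code should filter on.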

Value

a GeneSetDb object

KEGG Gene Sets

Due to the licensing restrictions on the KEGG collections, they are not returned from this function unless they are explicitly asked for. You can ask for them by either (i) querying for the "c2" collection while setting with.kegg = TRUE; or (ii) explicitly calling with collection = "kegg".
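The two routes to the KEGG gene sets might look like the sketch below. It assumes the multiGSEA package and its MSigDB data source are installed, and that your use complies with the KEGG license. The species and id.type values are arbitrary choices for the example.

```r
# Identifier type used for both calls (an arbitrary choice for this example).
id.type <- "entrez"

# Guarded so the snippet does nothing when multiGSEA is not installed.
if (requireNamespace("multiGSEA", quietly = TRUE)) {
  # (i) Ask for all of C2 and opt in to the KEGG gene sets.
  gdb.c2.kegg <- multiGSEA::getMSigGeneSetDb(
    "c2", species = "human", id.type = id.type, with.kegg = TRUE
  )

  # (ii) Ask for the KEGG subset by name.
  gdb.kegg <- multiGSEA::getMSigGeneSetDb(
    "kegg", species = "human", id.type = id.type
  )

  print(gdb.c2.kegg)
  print(gdb.kegg)
}
```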

MSigDB Versions

We recently switched to using the msigdbr package as the source of truth for these, so v7 is the earliest version of the MSigDB collections we make available. Version 6 collections are available in the following (deprecated) packages:

Citing the Molecular Signatures Database

To cite your use of the Molecular Signatures Database (MSigDB), please reference Subramanian, Tamayo, et al. (2005, PNAS 102, 15545-15550) and one or more of the following as appropriate:

Examples

## Not run: 
  gdb <- getMSigGeneSetDb(c("h", "reactome"), "human", "entrez")
  gdb.h.entrez <- getMSigGeneSetDb(c("h", "c2"), "human", "entrez")
  gdb.h.ens <- getMSigGeneSetDb(c("h", "c2"), "human", "ensembl")
  gdb.m.entrez <- getMSigGeneSetDb(c("h", "c2"), "mouse", "entrez")

## End(Not run)

lianos/multiGSEA documentation built on Nov. 17, 2020, 1:26 p.m.