loadGSC | R Documentation |
Load a gene set collection, to be used in runGSA, from a file in GMT, SBML or SIF format, or optionally from a data.frame.
loadGSC(file, type = "auto", addInfo)
file |
a character string, giving the name of the file containing the gene set collection. Optionally an object that can be coerced into a two-column data.frame, the first column containing genes and the second gene sets, representing all "gene"-to-"gene set" connections. |
type |
a character string giving the file type. Can be either of "auto", "gmt", "sbml", "sif" or "data.frame". |
addInfo |
an optional data.frame with two columns, the first containing the gene set names and the second containing additional information for each gene set. Some additional info may load automatically from the different file types. |
This function is used to create a gene-set collection object to be used with runGSA.
The "gmt" files available from the Molecular Signatures Database (http://www.broadinstitute.org/gsea/msigdb/) can be loaded using loadGSC. This website is a valuable resource and contains several different collections of gene sets.
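As a rough illustration of the GMT layout, the sketch below writes a two-line toy file (each line holds a set name, a description and the member genes, separated by tabs) and reads it back with base R. The set names, descriptions and gene names are made up for the example, and the call to loadGSC only runs if the piano package happens to be installed.

```r
# Write a tiny GMT file: one gene set per line, tab-separated fields
# (set name, description, then the member genes). All values are toy data.
gmt_lines <- c(
  "SET_A\tfirst toy set\tg1\tg2\tg3",
  "SET_B\tsecond toy set\tg2\tg4"
)
gmt_file <- tempfile(fileext = ".gmt")
writeLines(gmt_lines, gmt_file)

# Base-R view of what the file encodes: a named list of gene vectors.
fields <- strsplit(readLines(gmt_file), "\t", fixed = TRUE)
toy_sets <- lapply(fields, function(x) x[-(1:2)])
names(toy_sets) <- vapply(fields, `[`, character(1), 1)

# Load the same file with piano, but only if the package is available.
if (requireNamespace("piano", quietly = TRUE)) {
  gsc_gmt <- piano::loadGSC(gmt_file, type = "gmt")
  print(gsc_gmt)
}
```

A file downloaded from the Molecular Signatures Database would be passed to loadGSC in the same way, in place of the temporary file used here.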
By using the functionality of e.g. the biomaRt package, a gene-set collection with custom gene names (matching the statistics used in runGSA) can easily be compiled into a two-column data.frame (column order: genes, gene sets) and loaded with type="data.frame".
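A minimal sketch of the two-column layout follows. The gene and gene set names are hypothetical; in practice the connections might come from an annotation query (for instance via biomaRt), which is not run here. The loadGSC call is guarded so the snippet also runs where piano is not installed.

```r
# Hypothetical gene-to-gene-set connections, one row per connection.
gene_sets <- data.frame(
  gene    = c("g1", "g2", "g3", "g2", "g4"),
  geneSet = c("s1", "s1", "s1", "s2", "s2"),
  stringsAsFactors = FALSE
)

# Column order matters: genes first, gene sets second.
stopifnot(ncol(gene_sets) == 2)

# The grouping that these rows represent: genes split by gene set.
by_set <- split(gene_sets$gene, gene_sets$geneSet)

# Load the collection with piano, but only if the package is available.
if (requireNamespace("piano", quietly = TRUE)) {
  gsc_df <- piano::loadGSC(gene_sets, type = "data.frame")
  print(gsc_df)
}
```

Note that a gene (here g2) may appear in several gene sets; each membership is simply its own row.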
If a sif-file is used it is assumed that the first column contains gene sets and the third column contains genes.
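The column convention for SIF input can be sketched as follows. The three-line file is toy data, and the interaction label "contains" in the middle column is an arbitrary placeholder chosen for the example. The base-R part only shows how the first and third columns map onto the (genes, gene sets) layout; the loadGSC call runs only if piano is installed.

```r
# Toy SIF file: gene set, interaction label (placeholder), gene.
sif_lines <- c(
  "s1\tcontains\tg1",
  "s1\tcontains\tg2",
  "s2\tcontains\tg2"
)
sif_file <- tempfile(fileext = ".sif")
writeLines(sif_lines, sif_file)

# Read the three columns as plain character data.
sif <- read.delim(sif_file, header = FALSE, quote = "", as.is = TRUE)

# Reorder to the (genes, gene sets) layout: third column, then first.
sif_pairs <- data.frame(
  gene    = sif[[3]],
  geneSet = sif[[1]],
  stringsAsFactors = FALSE
)

# Load the file with piano, but only if the package is available.
if (requireNamespace("piano", quietly = TRUE)) {
  gsc_sif <- piano::loadGSC(sif_file, type = "sif")
  print(gsc_sif)
}
```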
A genome-scale metabolic model in SBML format can be used to define gene sets. In this case, metabolites will be the gene sets, containing all the genes that code for enzymes catalyzing reactions in which the metabolite takes part. In order to load an SBML file, libSBML and rsbml must be installed. Note that SBML loading is an experimental feature: it is highly dependent on the version and format of the SBML file and requires the file to contain gene associations for the reactions. By examining the returned GSC object it is easy to see if the correct gene sets were loaded.
A list-like object of class GSC containing two elements. The first is gsc, a list of the gene sets, each element a character vector of genes. The second element is addInfo, a data.frame containing the optional additional information.
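One way to check a returned object is to look at the two elements directly. The sketch below uses hypothetical gene sets and descriptions; if piano is not installed it falls back to a plain list with the same two elements, which is only a stand-in for illustration and not a real GSC object.

```r
# Hypothetical connections and per-set descriptions.
gene_sets <- data.frame(
  gene    = c("g1", "g2", "g3", "g2", "g4"),
  geneSet = c("s1", "s1", "s1", "s2", "s2"),
  stringsAsFactors = FALSE
)
info <- data.frame(
  geneSet     = c("s1", "s2"),
  description = c("toy set one", "toy set two"),
  stringsAsFactors = FALSE
)

if (requireNamespace("piano", quietly = TRUE)) {
  gsc_obj <- piano::loadGSC(gene_sets, type = "data.frame", addInfo = info)
} else {
  # Stand-in with the documented layout (gsc and addInfo), illustration only.
  gsc_obj <- list(
    gsc     = split(gene_sets$gene, gene_sets$geneSet),
    addInfo = info
  )
}

# Number of genes per gene set, and all distinct genes in the collection.
set_sizes <- vapply(gsc_obj$gsc, length, integer(1))
all_genes <- unique(unlist(gsc_obj$gsc, use.names = FALSE))

print(set_sizes)
print(gsc_obj$addInfo)
```

Unexpectedly small or empty gene sets in such a summary are a quick hint that the input file was not parsed as intended.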
Leif Varemo piano.rpkg@gmail.com and Intawat Nookaew piano.rpkg@gmail.com
piano, runGSA
# Randomly generated gene sets:
g <- sort(paste("g",floor(runif(100)*500+1),sep=""))
g <- c(g,sort(paste("g",floor(runif(900)*1000+1),sep="")))
g <- c(g,sort(paste("g",floor(runif(1000)*2000+1),sep="")))
s <- paste("s",floor(rbeta(2000,0.9,1.7)*50+1),sep="")

# Make data.frame:
gsc <- cbind(g,s)

# Load gene set collection from data.frame:
gsc <- loadGSC(gsc)