GeneSet-class: Class "GeneSet"

Description

A GeneSet contains a set of gene identifiers. Each gene set has a geneIdType, indicating how the gene identifiers should be interpreted (e.g., as Entrez identifiers), and a collectionType, indicating the origin of the gene set (perhaps including additional information about the set, as in the BroadCollection type).

Conversion between identifiers, subsetting, and logical (set) operations can be performed. Relationships between genes and phenotype in a GeneSet can be summarized using coloring to create a GeneColorSet. A GeneSet can be exported to XML with toBroadXML.

Objects from the Class

Construct a GeneSet with a GeneSet method (e.g., from a character vector of gene names, or an ExpressionSet), or from gene sets stored as XML (locally or on the internet; see getBroadSets).
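
As a minimal sketch of the first route, the following builds a gene set from a character vector. The gene symbols and the set name are made up for illustration; the named-argument style follows the GeneSet(geneIds=...) call in the Examples section.

```r
library(GSEABase)

## Hypothetical gene symbols, chosen only for illustration
ids <- c("IL3", "IL4", "IL5", "CSF2")

## Slot values can be supplied as named arguments to the constructor
gs <- GeneSet(geneIds = ids,
              setName = "cytokines",
              geneIdType = SymbolIdentifier())

geneIds(gs)      # the identifiers supplied above
setName(gs)      # the name given above
geneIdType(gs)   # a SymbolIdentifier object
```

Stating the geneIdType at construction avoids the default NullIdentifier type, which carries no information about how the identifiers are encoded.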

Slots

setName:

Object of class "ScalarCharacter" containing a short name (single word is best) to identify the set.

setIdentifier:

Object of class "ScalarCharacter" containing a (unique) identifier for the set.

geneIdType:

Object of class "GeneIdentifierType" containing information about how the gene identifiers are encoded. See GeneIdentifierType and related classes.

geneIds:

Object of class "character" containing the gene identifiers, to be interpreted according to geneIdType.

collectionType:

Object of class "CollectionType" containing information about how the geneIds were collected, including perhaps additional information unique to the collection methodology. See CollectionType and related classes.

shortDescription:

Object of class "ScalarCharacter" containing a short (one-line) description of the gene set.

longDescription:

Object of class "ScalarCharacter" providing a longer description (e.g., like an abstract) of the gene set.

organism:

Object of class "ScalarCharacter" naming the organism from which the gene set is derived.

pubMedIds:

Object of class "character" containing PubMed ids related to the gene set.

urls:

Object of class "character" containing urls used to construct or manipulate the gene set.

contributor:

Object of class "character" identifying who created the gene set.

version:

Object of class "Versions" containing a version number, manually curated (i.e., by the contributor) to provide a consistent way of tracking a gene set.

creationDate:

Object of class "character" containing the character string representation of the date on which the gene set was created.

Methods

Gene set construction:

GeneSet

See GeneSet methods and getBroadSets for convenient construction.

Slot access (e.g., setName) and replacement (e.g., setName<-):

collectionType<-

signature(object = "GeneSet", value = "CollectionType")

collectionType

signature(object = "GeneSet")

contributor<-

signature(object = "GeneSet", value = "character")

contributor

signature(object = "GeneSet")

creationDate<-

signature(object = "GeneSet", value = "character")

creationDate

signature(object = "GeneSet")

description<-

signature(object = "GeneSet", value = "character")

description

signature(object = "GeneSet")

geneIds<-

signature(object = "GeneSet", value = "character")

geneIds

signature(object = "GeneSet")

longDescription<-

signature(object = "GeneSet", value = "character")

longDescription

signature(object = "GeneSet")

organism<-

signature(object = "GeneSet", value = "character")

organism

signature(object = "GeneSet")

pubMedIds<-

signature(object = "GeneSet", value = "character")

pubMedIds

signature(object = "GeneSet")

setIdentifier<-

signature(object = "GeneSet", value = "character")

setIdentifier

signature(object = "GeneSet")

setName<-

signature(object = "GeneSet", value = "character")

setName

signature(object = "GeneSet")

geneIdType<-

signature(object = "GeneSet", verbose=FALSE, value = "character"), signature(object = "GeneSet", verbose=FALSE, value = "GeneIdentifierType"): These methods attempt to coerce geneIds from the current type to the type named by value. Successful coercion requires an appropriate method for mapIdentifiers.

geneIdType

signature(object = "GeneSet")

setVersion<-

signature(object = "GeneSet", value = "Versions")

setVersion

signature(object = "GeneSet")

urls<-

signature(object = "GeneSet", value = "character")

urls

signature(object = "GeneSet")
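
The accessor and replacement methods listed above can be sketched as follows. The gene set, its name, and the description are hypothetical, and the PubMed id is a placeholder rather than a real reference.

```r
library(GSEABase)

## Hypothetical gene set used only to demonstrate the accessors
gs <- GeneSet(geneIds = c("IL4", "IL5"), geneIdType = SymbolIdentifier())

## Replacement methods take the new value on the right-hand side
setName(gs) <- "example_set"
organism(gs) <- "Homo sapiens"
description(gs) <- "Two interleukin genes"   # one-line description
pubMedIds(gs) <- "00000000"                  # placeholder, not a real PubMed id

## Accessor methods read the values back
setName(gs)
organism(gs)
description(gs)
pubMedIds(gs)
```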

Logical and subsetting operations:

union

signature(x = "GeneSet", y = "GeneSet"): ...

|

signature(e1 = "GeneSet", e2 = "GeneSet"): calculate the logical ‘or’ (union) of two gene sets. The sets must contain elements of the same geneIdType.

|

signature(e1 = "GeneSet", e2 = "character"), signature(e1 = "character", e2 = "GeneSet"): calculate the logical ‘or’ (union) of a gene set and a character vector, i.e., add the geneIds named in the character vector to the gene set.

intersect

signature(x = "GeneSet", y = "GeneSet"):

&

signature(e1 = "GeneSet", e2 = "GeneSet"): calculate the logical ‘and’ (intersection) of two gene sets.

&

signature(e1 = "GeneSet", e2 = "character"), signature(e1 = "character", e2 = "GeneSet"): calculate the logical ‘and’ (intersection) of a gene set and a character vector, creating a new gene set containing only those genes named in the character vector.

setdiff

signature(x = "GeneSet", y = "GeneSet"), signature(x = "GeneSet", y = "character"), signature(x = "character", y = "GeneSet"): calculate the logical set difference between two gene sets, or between a gene set and a character vector.

[

signature(x = "GeneSet", i="character"), signature(x = "GeneSet", i="numeric"): subset the gene set by index (i="numeric") or value (i="character"). Genes are re-ordered as required.

[

signature(x = "ExpressionSet", i = "GeneSet"): subset the expression set, using genes in the gene set to select features. Genes in the gene set are coerced to appropriate annotation type if necessary (by consulting the annotation slot of the expression set, and using geneIdType<-).

[[

signature(x = "GeneSet"): select a single gene from the gene set.

$

signature(x = "GeneSet"): select a single gene from the gene set, allowing partial matching.
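
A short sketch of the set operations and subsetting methods described above, including the variants that take a character vector. The gene symbols and set names are hypothetical; both sets are given the same geneIdType, as the set operations require.

```r
library(GSEABase)

## Hypothetical symbols; both sets share the same geneIdType
a <- GeneSet(geneIds = c("IL3", "IL4", "IL5"), setName = "a",
             geneIdType = SymbolIdentifier())
b <- GeneSet(geneIds = c("IL5", "CSF2"), setName = "b",
             geneIdType = SymbolIdentifier())

geneIds(a | b)           # union of the two sets
geneIds(a & b)           # intersection of the two sets
geneIds(setdiff(a, b))   # genes in 'a' that are not in 'b'

## A character vector can stand in for one of the operands
geneIds(a | "EGR1")             # adds EGR1 to the genes in 'a'
geneIds(a & c("IL4", "EGR1"))   # keeps only genes of 'a' named on the right

## Subsetting by position or by identifier
geneIds(a[1:2])
geneIds(a["IL4"])
a[[1]]                   # a single identifier
```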

Useful additional methods include:

GeneColorSet

signature(type = "GeneSet"): create a 'color' gene set from a GeneSet, containing information about phenotype. This method has a required argument phenotype, a character string describing the phenotype for which color is available. See GeneColorSet.

mapIdentifiers

Use the code in the examples to list available methods. These convert genes from one GeneIdentifierType to another. See mapIdentifiers and specific methods in GeneIdentifierType for additional detail.

incidence

Summarize shared membership in genes across gene sets. See incidence-methods.

toGmt

Export to 'GMT' format file. See toGmt.

show

signature(object = "GeneSet"): display a short summary of the gene set.

details

signature(object = "GeneSet"): display additional information about the gene set. See details.

initialize

signature(.Object = "GeneSet"): Used internally during gene set construction.
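
As a rough illustration of incidence and toGmt, the sketch below summarizes two hypothetical gene sets and writes them to a temporary GMT file. It assumes that toGmt accepts a GeneSetCollection and a file path; see the toGmt and incidence-methods help pages for the authoritative signatures.

```r
library(GSEABase)

## Hypothetical gene sets with distinct set names
a <- GeneSet(geneIds = c("IL3", "IL4", "IL5"), setName = "a",
             geneIdType = SymbolIdentifier())
b <- GeneSet(geneIds = c("IL5", "CSF2"), setName = "b",
             geneIdType = SymbolIdentifier())

## Incidence matrix: rows correspond to gene sets, columns to genes
m <- incidence(a, b)
m

## Assumption: toGmt operates on a collection of gene sets
gsc <- GeneSetCollection(a, b)
tmp <- tempfile(fileext = ".gmt")
toGmt(gsc, tmp)
readLines(tmp)   # one line per gene set
unlink(tmp)
```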

Author(s)

Martin Morgan <Martin.Morgan@RoswellPark.org>

See Also

GeneColorSet, CollectionType, GeneIdentifierType

Examples

## Empty gene set
GeneSet()
## Gene set from ExpressionSet
data(sample.ExpressionSet)
gs1 <- GeneSet(sample.ExpressionSet[100:109])
## GeneSet from Broad XML; 'fl' could be a url
fl <- system.file("extdata", "Broad.xml", package="GSEABase")
gs2 <- getBroadSets(fl)[[1]] # actually, a list of two gene sets
## GeneSet from list of geneIds
geneIds <- geneIds(gs2) # any character vector would do
gs3 <- GeneSet(geneIds=geneIds)
## unspecified set type, so...
is(geneIdType(gs3), "NullIdentifier") == TRUE
## update set type to match encoding of identifiers
geneIdType(gs2)
geneIdType(gs3) <- SymbolIdentifier()

## Convert between set types; this consults the 'annotation'
## information encoded in the 'AnnotationIdentifier' set type and the
## corresponding annotation package.
## Not run: 
gs4 <- gs1
geneIdType(gs4) <- EntrezIdentifier()

## End(Not run)

## logical (set) operations
gs5 <- GeneSet(sample.ExpressionSet[100:109], setName="subset1")
gs6 <- GeneSet(sample.ExpressionSet[105:114], setName="subset2")
## intersection: 5 'genes'; note the set name '(subset1 & subset2)'
gs5 & gs6
## union: 15 'genes'; note the set name
gs5 | gs6
## an identity
gs7 <- gs5 | gs6
gs8 <- setdiff(gs5, gs6) | (gs5 & gs6) | setdiff(gs6, gs5)
identical(geneIds(gs7), geneIds(gs8))
identical(gs7, gs8) == FALSE # gs7 and gs8 setNames differ

## output
tmp <- tempfile()
toBroadXML(gs2, tmp)
noquote(readLines(tmp))
## must be BroadCollection() collectionType 
try(toBroadXML(gs1))
gs9 <- gs1
collectionType(gs9) <- BroadCollection()
toBroadXML(gs9, tmp)
unlink(tmp)
toBroadXML(gs9) # no connection --> character vector
## list of gene sets --> vector of Broad GENESET XML
gs10 <- getBroadSets(fl) # two sets
entries <- sapply(gs10, function(x) toBroadXML(x))

## list mapIdentifiers available for GeneSet
showMethods("mapIdentifiers", classes="GeneSet", inherit=FALSE)

Example output

setName: NA 
geneIds:  (total: 0)
geneIdType: Null
collectionType: Null 
details: use 'details(object)'


[1] TRUE
geneIdType: Symbol
setName: (subset1 & subset2) 
geneIds: 31344_at, 31345_at, ..., 31348_at (total: 5)
geneIdType: Annotation (hgu95av2)
collectionType: ExpressionSet 
details: use 'details(object)'
setName: (subset1 | subset2) 
geneIds: 31339_at, 31340_at, ..., 31353_f_at (total: 15)
geneIdType: Annotation (hgu95av2)
collectionType: ExpressionSet 
details: use 'details(object)'
[1] TRUE
[1] TRUE
[1] "/work/tmp/tmp/RtmpBY5tXq/file78575830d2ca"
[1] <?xml version="1.0"?>
[2] <GENESET STANDARD_NAME="chr5q23" SYSTEMATIC_NAME="c1:101" ORGANISM="Human" EXTERNAL_DETAILS_URL="file://usr/lib/R/site-library/GSEABase/extdata/Broad.xml,http://www.broad.mit.edu/gsea/msigdb/cards/chr5q23.xml,http://genome.ucsc.edu/cgi-bin/hgTracks?position=5q23" CATEGORY_CODE="C1" CONTRIBUTOR="Broad Institute" DESCRIPTION_FULL="Genes in cytogenetic band chr5q23" DESCRIPTION_BRIEF="Genes in cytogenetic band chr5q23" MEMBERS_SYMBOLIZED="ZNF474,CCDC100,ANKRD43,NRG2,LOC391828,IL4,PACAP,SLC12A2,LOC644659,DTWD2,PRRC1,EGR1,LOC389322,FNIP1,MGC32805,SRFBP1,CSNK1G3,LOC644854,PGGT1B,LOC728682,FLJ33630,LOC644754,CAMLG,CSS3,LOX,UBE2B,FTMT,LOC728460,GDF9,LOC644146,LOC728711,LOC728342,CDO1,LOC402229,ADRA1B,MRPS5P3,LMNB1,NME5,LOC644100,LOC391827,RNF14,LOC391825,NEUROG1,LOC474341,FBN2,LOC133629,PRR16,LOC728612,ITGA2,FABP6,IRF1,FLJ27505,SLC27A6,PPIC,LOC391824,CSF2,LOC348958,LOC644557,ADAMTS2,SNX2,ARGFXP1,RPS17P2,GPX3,ZNF608,GRAMD3,IL5,CTXN3,HBEGF,IL3,LOC401206,RAD50,SNX24,ACTBP4,LOC153277,LOC340069,FLJ44606,LOC133609,PPP2CA,COMMD10,RNUXA,TNFAIP8,MARCH3,FLJ90650,PTMAP2,SEMA6A,LOC728586"/>
Error in toBroadXML(gs1) : 
  toBroadXML requires 'BroadCollection', got 'ExpressionSetCollection'
[1] "/work/tmp/tmp/RtmpBY5tXq/file78575830d2ca"
[1] "<?xml version=\"1.0\"?>\n<GENESET STANDARD_NAME=\"NA\" SYSTEMATIC_NAME=\"ip-172-31-1-5:30807:Tue May 21 07:27:55 2019:16\" ORGANISM=\"Homo sapiens\" EXTERNAL_DETAILS_URL=\"www.lab.not.exist\" CATEGORY_CODE=\"C1\" CONTRIBUTOR=\"Pierre Fermat\" PMID=\"\" DESCRIPTION_FULL=\"An example object of expression set (ExpressionSet) class\" DESCRIPTION_BRIEF=\"Smoking-Cancer Experiment\" MEMBERS_SYMBOLIZED=\"31339_at,31340_at,31341_at,31342_at,31343_at,31344_at,31345_at,31346_at,31347_at,31348_at\"/>"
Function: mapIdentifiers (package GSEABase)
what="GeneSet", to="GeneIdentifierType", from="AnnDbBimap"
what="GeneSet", to="GeneIdentifierType", from="GeneIdentifierType"
what="GeneSet", to="GeneIdentifierType", from="NullIdentifier"
what="GeneSet", to="GeneIdentifierType", from="environment"
what="GeneSet", to="GeneIdentifierType", from="missing"
what="GeneSet", to="NullIdentifier", from="GeneIdentifierType"

GSEABase documentation built on Dec. 13, 2020, 2 a.m.