The function constructs a category membership matrix
from a list of gene identifiers and their annotated GO categories.
For each of the GO categories stated in categ,
all less specific terms (ancestors) are also included. Thus, one need
only obtain the most specific set of GO term mappings, which
can be obtained from Bioconductor annotation packages or via biomaRt.
The ancestor relationships are obtained from the GO.db package.
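The ancestor expansion described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the term identifiers, the `ancestors` list and the gene names are all made up, whereas the real relationships come from GO.db.

```r
# Toy ancestor map (hypothetical; real relationships come from GO.db).
ancestors <- list(
  "GO:C" = c("GO:B", "GO:A"),
  "GO:B" = "GO:A",
  "GO:D" = "GO:A"
)

# Most specific annotations only: one entry per (gene, category) pair.
genes <- c("gene1", "gene1", "gene2")
categ <- c("GO:C", "GO:D", "GO:B")

# Expand each pair so that it also covers the category's ancestors.
expanded <- do.call(rbind, lapply(seq_along(genes), function(i) {
  terms <- c(categ[i], ancestors[[categ[i]]])
  data.frame(gene = genes[i], term = terms, stringsAsFactors = FALSE)
}))
expanded <- unique(expanded)

# Build the membership matrix: rows are categories, columns are genes.
rows <- sort(unique(expanded$term))
cols <- unique(genes)
m <- matrix(FALSE, nrow = length(rows), ncol = length(cols),
            dimnames = list(rows, cols))
m[cbind(expanded$term, expanded$gene)] <- TRUE

m
rowSums(m)  # number of genes in each category
```

In this toy setup, gene1 ends up as a member of every term, because its two specific annotations between them reach all ancestors.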
Character vector with (arbitrary) gene identifiers. They will be used for the column names of the resulting matrix.
A character vector of the same length as the vector of gene identifiers, giving the GO category annotated to each gene.
The function requires the GO.db package.
For subsequent analyses, it is often useful to remove categories that have only a small number of members. Use the normal matrix subsetting syntax for this, see example.
If a GO category in
categ is not found in the GO annotation
package, a warning will be generated, and no ancestors
for that GO category are added (but that category itself will be part
of the returned adjacency matrix).
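The handling of unknown categories can be illustrated with a small base R sketch. It is a hedged illustration only: `known_terms`, `toy_ancestors` and `expand_term` are hypothetical names introduced here, and the warning text is not the one produced by the package.

```r
# Hypothetical stand-ins for the GO annotation data.
known_terms   <- c("GO:A", "GO:B")
toy_ancestors <- list("GO:B" = "GO:A")

# Return the term together with its ancestors. An unknown term
# triggers a warning and is returned on its own, without ancestors.
expand_term <- function(term) {
  if (!(term %in% known_terms)) {
    warning("GO category not found: ", term)
    return(term)
  }
  c(term, toy_ancestors[[term]])
}

expand_term("GO:B")                    # known term: itself plus ancestors
suppressWarnings(expand_term("GO:X"))  # unknown term: kept, no ancestors
```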
The adjacency matrix of the bipartite category membership graph. Rows are categories and columns are genes.
library("Category")  ## provides cateGOry

g = cateGOry(c("CG2671", "CG2671", "CG2950"),
             c("GO:0090079", "GO:0001738", "GO:0003676"),
             sparse=TRUE)
g

rowSums(g)   ## number of genes in each category

## Filter out categories with fewer than minMemb or more than maxMemb members.
## This is toy data; in real applications, a choice of minMemb higher
## than 2 will be more appropriate.
filter = function(x, minMemb = 2, maxMemb = 35)
  ((x>=minMemb) & (x<=maxMemb))

g[filter(rowSums(g)),,drop=FALSE ]