get_topgo | R Documentation |
This function carries out a topGO gene ontology enrichment analysis on a data set with custom protein/gene IDs and GO terms. Its main input is a data frame with three specific columns: cluster numbers, gene IDs, and GO terms. Alternatively, these can be supplied as three individual vectors.
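The two input forms described above can be sketched in base R. This is a minimal illustration only: `check_topgo_input` is a hypothetical helper that is not part of the package, and the gene IDs and cluster values are made up for the example.

```r
# Hypothetical helper (not part of the package): checks that a data frame
# carries the three column names the documentation asks for.
check_topgo_input <- function(df) {
  required <- c("GeneID", "Gene.ontology.IDs", "cluster")
  missing_cols <- setdiff(required, colnames(df))
  if (length(missing_cols) > 0) {
    stop("missing column(s): ", paste(missing_cols, collapse = ", "))
  }
  invisible(TRUE)
}

# Form 1: three vectors with the same length and order (made-up values)
gene_ids <- c("geneA", "geneB", "geneC")
go_ids   <- c("GO:0006412", "GO:0015979;GO:0006412", "GO:0046148")
clusters <- c(1, 1, 2)

# Form 2: the same information combined into one data frame
input_df <- data.frame(
  GeneID = gene_ids,
  Gene.ontology.IDs = go_ids,
  cluster = clusters,
  stringsAsFactors = FALSE
)

check_topgo_input(input_df)
```

Either form should carry the same information; the data frame form simply bundles the three vectors under the expected column names.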
get_topgo(
  df = NULL,
  GeneID = NULL,
  Gene.ontology.IDs = NULL,
  cluster = NULL,
  selected.cluster,
  topNodes = 50
)
df |
an (optional) data.frame with the three columns named as specified below ('GeneID', 'Gene.ontology.IDs', 'cluster') |
GeneID |
(character) The column containing gene IDs, alternatively a vector |
Gene.ontology.IDs |
(character) The column containing a list of GO terms for each gene, alternatively a vector with same order and length as 'GeneID' |
cluster |
(numeric, factor, character) the column containing a grouping variable, alternatively a vector with same order and length as 'GeneID' |
selected.cluster |
(character) the name of the group that is to be compared to the background. Must be one of the values in 'cluster'. If not specified, the first factor level is used (alphabetical order). |
topNodes |
(numeric) the max number of GO terms (nodes) to be returned by the function. |
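To see which group would be picked when 'selected.cluster' is left unspecified, the first factor level of a grouping vector can be inspected directly. The sketch below assumes the default is derived with `factor()`, which the documentation implies but does not show; the cluster labels are made up.

```r
# Made-up grouping variable with three group names
cluster <- c("late", "early", "mid", "early", "late")

# factor() sorts the unique values; the first level would be the default
cluster_levels <- levels(factor(cluster))
default_cluster <- cluster_levels[1]
print(default_cluster)

# Numeric groups are sorted numerically before becoming level labels
numeric_levels <- levels(factor(c(10, 2, 1)))
print(numeric_levels)
```

Passing 'selected.cluster' explicitly avoids depending on this ordering.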
a data.frame with topGO gene enrichment results
# The get_topgo function requires the topGO package
# as an additional dependency that is not automatically
# attached with this package.
library(topGO)

# a list of arbitrary GO terms
go_terms <- c(
  "GO:0006412", "GO:0015979", "GO:0046148", "GO:1901566", "GO:0042777", "GO:0006614",
  "GO:0016114", "GO:0006605", "GO:0090407", "GO:0031564", "GO:0032784", "GO:0052889",
  "GO:0032787", "GO:0043953", "GO:0046394", "GO:0042168", "GO:0009124", "GO:0006090",
  "GO:0016108", "GO:0016109", "GO:0016116", "GO:0016117", "GO:0065002", "GO:0006779",
  "GO:0072330", "GO:0046390", "GO:0006754", "GO:0018298", "GO:0006782", "GO:0022618",
  "GO:0042255", "GO:0046501", "GO:0070925", "GO:0071826", "GO:0006783", "GO:0009156"
)

# construct a sample data set with 26 different genes in 2 different groups
# and test which (randomly sampled) GO terms might be enriched in both groups.
# We randomly sample 1 to 3 GO terms per gene. They need to be formatted as one
# string of GO terms separated by "; ".
df <- data.frame(
  GeneID = LETTERS,
  cluster = rep(c(1, 2), each = 13),
  Gene.ontology.IDs = sapply(1:26, function(x)
    paste(sample(go_terms, sample(1:3, 1)), collapse = ";")
  ),
  stringsAsFactors = FALSE
)

# test if GO terms are enriched in group 1 against background
get_topgo(df, selected.cluster = 1, topNodes = 5)
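The example stores each gene's GO terms as one delimited string. As a rough sketch of how such strings can be turned into a per-gene list of GO IDs (the gene-to-GO shape that topGO's custom annotation interface works with), base R `strsplit` is enough. How get_topgo parses the column internally is not shown here, so the separator pattern below is an assumption chosen to accept both ";" and "; ". The gene names are made up.

```r
# Made-up genes with GO terms stored as one delimited string each
go_strings <- c(
  geneA = "GO:0006412; GO:0015979",
  geneB = "GO:0046148",
  geneC = "GO:0006412;GO:0042777;GO:0006614"
)

# Split on a semicolon followed by optional whitespace (assumed separator);
# strsplit keeps the vector names, giving a named list: gene ID -> GO IDs
gene2go <- strsplit(go_strings, ";\\s*")

# Number of GO terms per gene
print(lengths(gene2go))
```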