get_topgo: Convenience wrapper to the topGO package (Rahnenführer et al.)

Description

This function carries out a topGO gene ontology enrichment analysis on a data set with custom protein/gene IDs and GO terms. Its main input is a data frame with three specific columns: cluster numbers, gene IDs, and GO terms. Alternatively, these can be supplied as three individual vectors of the same length and order.

Usage

get_topgo(
  df = NULL,
  GeneID = NULL,
  Gene.ontology.IDs = NULL,
  cluster = NULL,
  selected.cluster,
  topNodes = 50
)

Arguments

df

an (optional) data.frame with the three columns named as specified below ('GeneID', 'Gene.ontology.IDs', 'cluster')

GeneID

(character) The column containing gene IDs, alternatively a vector

Gene.ontology.IDs

(character) The column containing the GO terms for each gene, formatted as one string of terms separated by "; ", alternatively a vector with same order and length as 'GeneID'
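
To illustrate the expected format, the following minimal sketch uses only base R to show how one "; "-separated string per gene corresponds to a set of GO terms. The names `go_strings` and `go_per_gene` are hypothetical, and the splitting shown here is an assumption about the input format, not a description of the function's internals.

```r
# Hypothetical input: one string of GO terms per gene, separated by "; "
go_strings <- c(
  A = "GO:0006412; GO:0015979",
  B = "GO:0046148",
  C = "GO:0006412; GO:0042777; GO:0006614"
)

# Split each string into its individual GO terms (fixed = TRUE: no regex)
go_per_gene <- strsplit(go_strings, "; ", fixed = TRUE)

# Number of GO terms annotated to each gene
lengths(go_per_gene)
```

A quick check like this can reveal entries that use a different separator before the enrichment is run.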

cluster

(numeric, factor, character) the column containing a grouping variable, alternatively a vector with same order and length as 'GeneID'

selected.cluster

(character) the name of the group that is to be compared to the background. Must be one of the values in 'cluster'. If not specified, the first factor level is used (alphabetical order).
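
The default group can be previewed with base R before calling the function. This is a small sketch assuming the grouping variable is converted with `factor()`, as the wording "first factor level" suggests; the vector names are made up for illustration. Note that numeric and character inputs can sort differently.

```r
# Hypothetical grouping vectors
cluster_chr <- c("stress", "control", "stress", "control")
cluster_num <- c(10, 2, 10, 2)

# First factor level of a character vector (sorted alphabetically)
levels(factor(cluster_chr))[1]

# Numeric input is sorted numerically before conversion to levels ...
levels(factor(cluster_num))[1]

# ... whereas the same values stored as character are sorted as strings
levels(factor(as.character(cluster_num)))[1]
```

Passing `selected.cluster` explicitly avoids relying on this ordering.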

topNodes

(numeric) the maximum number of GO terms (nodes) to be returned by the function.

Value

a data.frame with topGO gene ontology enrichment results

Examples


# The get_topgo function requires the topGO package as an
# additional dependency; it is not automatically attached
# with this package.
library(topGO)

# a list of arbitrary GO terms
go_terms <- c(
  "GO:0006412", "GO:0015979", "GO:0046148", "GO:1901566", "GO:0042777", "GO:0006614",
  "GO:0016114", "GO:0006605", "GO:0090407", "GO:0031564", "GO:0032784", "GO:0052889",
  "GO:0032787", "GO:0043953", "GO:0046394", "GO:0042168", "GO:0009124", "GO:0006090",
  "GO:0016108", "GO:0016109", "GO:0016116", "GO:0016117", "GO:0065002", "GO:0006779",
  "GO:0072330", "GO:0046390", "GO:0006754", "GO:0018298", "GO:0006782", "GO:0022618",
  "GO:0042255", "GO:0046501", "GO:0070925", "GO:0071826", "GO:0006783", "GO:0009156"
)

# construct a sample data set with 26 different genes in 2 different groups
# and test which (randomly sampled) GO terms might be enriched in both groups.
# We randomly sample 1 to 3 GO terms per gene. They need to be formatted as one 
# string of GO terms separated by "; ".
df <- data.frame(
  GeneID = LETTERS,
  cluster = rep(c(1, 2), each = 13),
  Gene.ontology.IDs = sapply(1:26, 
    function(x) paste(sample(go_terms, sample(1:3, 1)), collapse = "; ")
  ),
  stringsAsFactors = FALSE
)

# test if GO terms are enriched in group 1 against background
get_topgo(df, selected.cluster = 1, topNodes = 5)
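
According to the argument descriptions, the same analysis can also be run without a data frame by passing three vectors directly. The following is a sketch under that assumption, not a tested recipe: the names `go_pool`, `gene_ids`, `groups`, and `go_ids` are hypothetical, and `get_topgo` is only called if the function and the topGO package are available in the session.

```r
# Hypothetical input vectors, all of the same length and order
set.seed(1)
go_pool <- c(
  "GO:0006412", "GO:0015979", "GO:0046148",
  "GO:1901566", "GO:0042777", "GO:0006614"
)
gene_ids <- LETTERS
groups <- rep(c(1, 2), each = 13)

# 1 to 3 randomly sampled GO terms per gene, joined by "; "
go_ids <- sapply(
  seq_along(gene_ids),
  function(i) paste(sample(go_pool, sample(1:3, 1)), collapse = "; ")
)

# The three vectors must line up element by element
stopifnot(
  length(gene_ids) == length(go_ids),
  length(gene_ids) == length(groups)
)

# Assumption: the package providing get_topgo has already been loaded
if (exists("get_topgo") && requireNamespace("topGO", quietly = TRUE)) {
  library(topGO)
  res <- get_topgo(
    GeneID = gene_ids,
    Gene.ontology.IDs = go_ids,
    cluster = groups,
    selected.cluster = 1,
    topNodes = 5
  )
  head(res)
}
```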


m-jahn/R-tools documentation built on Feb. 5, 2023, 1:05 p.m.