In ZhenWei10/golite: Fast and simple GOEA

knitr::opts_chunk$set(echo = TRUE)

Installation

Install the package with:

library(devtools)
install_github("ZhenWei10/golite")

Run GO enrichment analysis

The necessary input of goea should be the gene set gene ID, background gene ID, and the orgDb object.

library(golite)
library(magrittr)
library(org.Hs.eg.db)

Use randomly sampled gene IDs for gene set and background.

library(TxDb.Hsapiens.UCSC.hg19.knownGene)
txdb = TxDb.Hsapiens.UCSC.hg19.knownGene
all_eids_hg19 <- names(genes(txdb))

set.seed(1)
eids_bg <- sample(all_eids_hg19, 3500)
eids_set <- sample(eids_bg,300)

Run GO enrichment analysis given gene set and background.

goea(gene_set = eids_set,
     back_ground = eids_bg,
     orgDb = org.Hs.eg.db,
     interpret_term = T) %>% head(.,10) %>% knitr::kable(.,"markdown")

The function can be vectorized, i.e. the input can be a list of multiple gene sets.

eids_sets <- lapply(1:10,function(x) sample(eids_bg,300)) 

goea(gene_set = eids_sets,
     back_ground = eids_bg,
     orgDb = org.Hs.eg.db) %>% summary

Run GO slim enrichment analysis

GO slim is a subset of GO terms that can be defined at here.

goea(gene_set = eids_set,
     back_ground = eids_bg,
     orgDb = org.Hs.eg.db,
     interpret_term = T,
     GO_Slim = T)  %>% head(.,10) %>% knitr::kable(.,"markdown")

you could set EASE_score = TRUE to get a more conservative p value.

For more information of EASE, please see here.

goea(gene_set = eids_sets,
     back_ground = eids_bg, 
     orgDb = org.Hs.eg.db,
     GO_Slim = T,
     EASE_Score = F) %>% lapply(.,function(x)x$p) %>% unlist %>% hist(main = "normal hypergeometric")

goea(gene_set = eids_sets,
         back_ground = eids_bg,
     orgDb = org.Hs.eg.db,
     GO_Slim = T,
     EASE_Score = T) %>% lapply(.,function(x)x$p) %>% unlist %>% hist(main = "EASE score")