NOTE: This package currently accepts only AGI codes (*A. thaliana*). This will change, however.
This package provides a client for GO-Term enrichment via the API of PANTHER. It takes a vector of gene IDs, sends it to PANTHER, and reformats the response into a handy dataframe. This dataframe also includes the gene IDs that are associated with the GO-Term in question.
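Since the input is a plain vector of gene IDs and only AGI codes are accepted, it can help to screen the vector before sending it off. The following is a minimal sketch in base R; `is_agi()` is a hypothetical helper, not a function of `{oracl}`, and it assumes locus-level AGI codes of the form `AT1G01010` (chromosome 1 to 5, or `C`/`M` for the organelles). Gene-model suffixes such as `.1` are deliberately rejected here, and whether the package itself tolerates lowercase IDs is not stated, so the sketch normalises to uppercase.

```r
# Hypothetical helper (not part of {oracl}): TRUE for IDs that look like AGI locus codes.
# Pattern: "AT", chromosome 1-5 or C/M (organelles), "G", then five digits.
is_agi <- function(ids) {
  grepl("^AT[1-5CM]G[0-9]{5}$", toupper(trimws(ids)))
}

# Toy input containing two entries that are not valid AGI locus codes.
ids <- c("AT1G01010", "at5g67640", "ATCG00020", "Os01g0100100", "AT1G0101")

valid <- is_agi(ids)

# Keep only valid IDs, normalised to uppercase and de-duplicated.
clean_ids <- unique(toupper(trimws(ids[valid])))
```

The cleaned vector `clean_ids` could then be used as the geneset argument in place of the raw input.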
```r
devtools::install_github("lmuenter/oracl")
```

## Usage

In this example, we identify overrepresented GO-Terms for an example dataset provided with the package. Note that we specify the *Biological Process* ontology by setting `ont = "bp"` in `oracl::oraclient()`. Other options are `ont = "mf"` (Molecular Function) and `ont = "cc"` (Cellular Component).

```r
# load package
library(oracl)

# Get a set of AGI-codes.
gs <- oracl:::GS01

# Get a background geneset (optional)
bg <- oracl:::background

# conduct GO-Term ORA via PANTHER
bp.df <- oraclient(gs, bg = bg, ont = "bp", fdr.thresh = 0.05)
```

```r
# Load Packages
library(ggplot2)

# Make a plot
volcano.p <- volcanoracl(bp.df)

# The plot `volcano.p` is a ggplot-object.
# We can change its attributes!
volcano.p + scale_colour_gradientn(colours = "steelblue")
```
## `{oracl}` with a list of genesets

When several genesets are analysed, it may be handy to combine their overrepresented terms in one dataframe. This is especially useful for plotting.
```r
# obtain a list of genesets
gs.ls <- list(
  oracl:::GS01,
  oracl:::GS02,
  oracl:::GS03
)

# get background geneset
bg <- oracl:::background

# set names of list elements (vital for later)
names(gs.ls) <- c("GS01", "GS02", "GS03")

# get overrepresented GO-terms
bp.ls <- lapply(gs.ls, oraclient, bg = bg, ont = "bp", fdr.thresh = 0.05)

# get ONE dataframe (ID-column `grouping` specifies the geneset)
bp.ls.df <- oracl_list_to_df(bp.ls)
```
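To see why the list names matter, here is a rough base-R sketch of how a named list of result data frames can be stacked into one data frame with a `grouping` column. This is only an illustration of the idea and not the implementation of `oracl_list_to_df()`; the helper `combine_with_grouping()`, the toy data, and the column names `term` and `fdr` are all made up for the example.

```r
# Toy stand-ins for per-geneset results; column names are invented for illustration.
res.ls <- list(
  GS01 = data.frame(term = c("GO:0006950", "GO:0009628"), fdr = c(0.001, 0.020)),
  GS02 = data.frame(term = "GO:0006950", fdr = 0.004)
)

# Hypothetical helper: tag each data frame with its list name, then stack the rows.
combine_with_grouping <- function(ls) {
  # Without names there is nothing to fill the grouping column with.
  stopifnot(!is.null(names(ls)), all(nzchar(names(ls))))
  tagged <- Map(function(df, nm) {
    df$grouping <- rep(nm, nrow(df))
    df
  }, ls, names(ls))
  out <- do.call(rbind, unname(tagged))
  rownames(out) <- NULL
  out
}

res.df <- combine_with_grouping(res.ls)
```

In this sketch an unnamed list fails early, because the names are the only source of the group labels that later drive the facets.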
We can now plot overrepresented GO-Terms using the group information in the column `bp.ls.df$grouping`.

Here, we facet the plot by the grouping variable (stored in `bp.ls.df$grouping`). We also specify the desired number of columns, the position of the facet label, and whether each facet shows only the labels found in its own dataset (change these settings according to your data!):
```r
oraclot(bp.ls.df, top_n = 5) +
  facet_wrap(grouping ~ ., ncol = 1, strip.position = "right", scales = "free_y") +
  scale_color_viridis_c()
```

We can also make a faceted volcano plot:

```r
volcanoracl(bp.ls.df, top_n = 5) +
  facet_wrap(grouping ~ ., nrow = 1)
```
Gene IDs and Organism. Currently, only *Arabidopsis thaliana* (L.) Heynh. can be investigated.

Cognate genes. In order to save resources, the API of PANTHER does not report gene sets back (personal communication). Gene IDs reported by `{oracl}` are therefore only approximations. In essence, for every enriched GO-Term, the underlying geneset is semantically compared to a gene-to-GO-term dataset. These datasets are included in `{oracl}` (see `oracl/data/goterms`). They were generated by conducting ORA on the PANTHER website with all available AGI codes. To obtain the necessary datasets, all results (without Bonferroni correction) were exported to .json, parsed, and reformatted.
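The lookup of cognate genes described above can be pictured as a set intersection between the input geneset and the genes annotated to each enriched term. The sketch below is a simplified illustration under that assumption, not the package's actual code; the object names (`go2gene`, `enriched`, `cognate.df`) and the toy annotations are hypothetical.

```r
# Toy gene-to-GO-term table; the real datasets ship with the package.
go2gene <- list(
  "GO:0006950" = c("AT1G01010", "AT1G01020", "AT2G01050"),
  "GO:0009628" = c("AT1G01020", "AT3G01090")
)

# Input geneset and the terms assumed to have come back as enriched.
gs <- c("AT1G01020", "AT2G01050", "AT5G01100")
enriched <- c("GO:0006950", "GO:0009628")

# For each enriched term, keep the input genes that are annotated to it.
cognate <- lapply(go2gene[enriched], function(annotated) intersect(gs, annotated))

# Collapse to one row per term, with gene IDs as a comma-separated string.
cognate.df <- data.frame(
  term = names(cognate),
  genes = vapply(cognate, paste, character(1), collapse = ",", USE.NAMES = FALSE),
  stringsAsFactors = FALSE
)
```

A gene is listed for a term whenever it occurs in both the input and the bundled annotation, which may differ from the set PANTHER used internally.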
- Functions for automated plotting.
- Make other organisms available.
- Implement redundancy removal using `{rrvgo}`.
- Automate gene-symbol mapping using `{org.At.tair.db}`.