library(knitr) knitr::opts_chunk$set( error = FALSE, tidy = FALSE, message = FALSE, fig.align = "center") options(width = 100) options(markdown.HTML.stylesheet = "custom.css")
Note: On Aug 19 2019 GREAT released version 4 which supports
hg38 genome and removes some ontologies such pathways.
hg19 as default.
hg38 can be specified by argument
species = "hg38".
To use the older versions such as 3.0.0, specify as
submitGreatJob(..., version = "3").
GREAT (Genomic Regions Enrichment of Annotations Tool) is a popular web-based tool to associate biological functions to genomic regions. The rGREAT package makes GREAT anlaysis automatic by first constructing a HTTP POST request according to user's input and retrieving results from GREAT web server afterwards.
Load the package:
The input data is either a
GRanges object or a BED-format data frame, no matter it is sorted or not.
In following example, we use a data frame which is randomly generated.
set.seed(123) bed = circlize::generateRandomBed(nr = 1000, nc = 0) bed[1:2, ]
Submit genomic regions by
submitGreatJob(). Before submitting, genomic regions will be sorted and overlapping regions will be merged.
The returned variable
job is a
GreatJob class instance which can be used to retrieve results from
GREAT server and stored results which are already downloaded.
job = submitGreatJob(bed)
You can get the summary of your job by directly calling
More parameters can be set for the job:
job = submitGreatJob(bed, species = "mm9") job = submitGreatJob(bed, bg, species = "mm9") job = submitGreatJob(bed, adv_upstream = 10, adv_downstream = 2, adv_span = 2000) job = submitGreatJob(bed, rule = "twoClosest", adv_twoDistance = 2000) job = submitGreatJob(bed, rule = "oneClosest", adv_oneDistance = 2000)
Also you can choose different versions of GREAT for the analysis.
job = submitGreatJob(bed, version = "3.0") job = submitGreatJob(bed, version = "2.0")
Available parameters are (following content is copied from GREAT website):
species: "hg38", "hg19", "mm10", "mm9" are supported in GREAT version 4.x.x, "hg19", "mm10", "mm9", "danRer7" are supported in GREAT version 3.x.x and "hg19", "hg18", "mm9", "danRer7" are supported in GREAT version 2.x.x.
bgChoise: Background regions.
data. If this value is set to
bgargument should be specified
includeCuratedRegDoms: Whether to include curated regulatory domains.
rule: How to associate genomic regions to genes.
basalPlusExt: mode 'Basal plus extension'. Gene regulatory domain definition: Each gene is assigned a basal regulatory domain of a minimum distance upstream and downstream of the TSS (regardless of other nearby genes). The gene regulatory domain is extended in both directions to the nearest gene's basal domain but no more than the maximum extension in one direction.
adv_upstream: proximal extension to upstream (unit: kb)
adv_downstream: proximal extension to downstream (unit: kb)
adv_span: maximum extension (unit: kb)
twoClosest: mode 'Two nearest genes'. Gene regulatory domain definition: Each gene is assigned a regulatory domain that extends in both directions to the nearest gene's TSS but no more than the maximum extension in one direction.
adv_twoDistance: maximum extension (unit: kb)
oneClosest: mode 'Single nearest gene'. Gene regulatory domain definition: Each gene is assigned a regulatory domain that extends in both directions to the midpoint between the gene's TSS and the nearest gene's TSS but no more than the maximum extension in one direction.
adv_oneDistance: maximum extension (unit: kb)
job, we can now retrieve results from GREAT. The first and the primary results are
the tables which contain enrichment statistics for the analysis. By default it will retrieve
results from three GO Ontologies and all pathway ontologies. All tables contains statistics
for all terms no matter they are significant or not. Users can then make filtering yb self-defined cutoff.
There is a column for adjusted p-values by "BH" method. Other p-value adjustment methods can be applied by
The returned value of
getEnrichmentTables() is a list of data frames in which each one corresponds
to tables for single ontology. The structure of data frames are same as the tables on GREAT website.
tb = getEnrichmentTables(job) names(tb) tb[][1:2, ]
Information stored in
job will be updated after retrieving enrichment tables.
You can get results by either specifying the ontologies or by the pre-defined categories (categories already contains pre-defined sets of ontologies):
tb = getEnrichmentTables(job, ontology = c("GO Molecular Function", "BioCyc Pathway")) tb = getEnrichmentTables(job, category = c("GO"))
As you have seen in the previous messages and results, The enrichment tables contain no associated genes.
However, you can set
download_by = 'tsv' in
getEnrichmentTables() to download the complete
tables, but due to the restriction from GREAT web server, only the top 500 regions can be retreived.
tb2 = getEnrichmentTables(job, download_by = "tsv") nrow(tb2[["GO Molecular Function"]]) head(tb2[["GO Molecular Function"]])
All available ontology names for given species can be get by
and all available ontology categories can be get by
availableCategories(). Here you do not
need to provide species information because
job already contains it.
availableOntologies(job) availableCategories(job) availableOntologies(job, category = "GO")
Association between genomic regions and genes can be get by
The function will make the three plots which are same as on GREAT website and returns a
object which contains the gene-region associations.
res = plotRegionGeneAssociationGraphs(job) res[1:2, ]
For those regions that are not associated with any genes under current settings,
distTSS columns will be
You can also choose only plotting one of the three figures.
plotRegionGeneAssociationGraphs(job, type = 1)
By specifying ontology and term ID, you can get the association in a certain term.
Here the term ID is from the first column of the data frame which is returned by
res = plotRegionGeneAssociationGraphs(job, ontology = "GO Molecular Function", termID = "GO:0004984") res[1:2, ]
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.