find.top.GO.slim.terms: Find top enriched GO slim terms

Description

This function finds the top enriched Gene Ontology (functional annotation) slim terms in gene lists.

Usage

find.top.GO.slim.terms(gene_lists, all_genes, GOmappingfile, output_file, 
                       topNum = 20, GO_slim_id, heatmap = FALSE)

Arguments

gene_lists

the name of an .xlsx file containing the gene lists: either user-specified genes or associated genes found by select.associated.genes() or select.associated.orthologs().

all_genes

a character vector giving the population of all genes.

GOmappingfile

a character string giving the path of the GO mapping file, which maps gene IDs to GO terms.

output_file

a character string specifying the name of a .txt file in which to store the output of this function: the top enriched GO slim terms in the input gene lists.

topNum

an integer specifying the number of top GO slim terms to be included in the results. Defaults to 20.

GO_slim_id

a character vector containing the GO IDs of all GO slim terms.

heatmap

a logical value specifying whether to output a heatmap of the top enriched GO slim terms. The heatmap shows the enrichment results across all samples for the GO slim terms that are top enriched in at least one biological sample. If heatmap = TRUE, this function outputs a PDF file named "Top enriched GO slim terms across samples.pdf". Defaults to FALSE.

Details

To use this function, please download the GO mapping file of the species of interest from http://geneontology.org/page/download-annotations. Please make sure that this file is in R's working directory and set GOmappingfile to the file's name.
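
Because the mapping file is looked up by name in the working directory, it can help to check for it before a long run. The following is a minimal sketch; check_mapping_file() is a hypothetical helper and not part of TROM.

```r
# Hypothetical helper (not part of TROM): fail early if the GO mapping file
# is not where R will look for it.
check_mapping_file <- function(path) {
  if (!file.exists(path)) {
    stop("GO mapping file '", path, "' not found; working directory is ",
         getwd())
  }
  invisible(path)
}

# Example with the file name used in the Examples section:
# check_mapping_file("gene_association_fb_example.txt")
```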

gene_lists can be the output .xlsx file of select.associated.orthologs(), the output .xlsx file of select.associated.genes(), or an .xlsx file of the same format that contains user-provided gene lists. Users who want to use the overlap genes or overlap orthologs can find them in the output .xlsx files of ws.trom(), ws.trom.orthologs() or bs.trom(). They can select the columns of interest, save them in a new .xlsx file, and pass the name of that file to gene_lists.

The top enriched GO slim terms are also written to the .txt file named by output_file.
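
The p-values reported by this function come from a hypergeometric test. As a rough illustration of that kind of test, and not necessarily the exact computation TROM performs, the sketch below computes the expected count and an upper-tail p-value for one term. The helper name go_enrichment(), its argument names, the choice of the upper tail, and the toy counts are all assumptions made for this illustration.

```r
# Hypothetical helper (not part of TROM).
#   k: occurrences of the term in the sample
#   K: occurrences of the term in the population
#   n: sample size
#   N: population size
go_enrichment <- function(k, K, n, N) {
  # Expected occurrences in the sample if the sample were drawn at random.
  expected <- n * K / N
  # Upper-tail probability P(X >= k) under the hypergeometric distribution.
  p_value <- phyper(k - 1, K, N - K, n, lower.tail = FALSE)
  list(expected = expected, p_value = p_value)
}

# Toy numbers: a term annotated to 50 of 1000 genes, seen 12 times
# in a list of 100 genes.
res <- go_enrichment(k = 12, K = 50, n = 100, N = 1000)
res$expected
res$p_value
```

An observed count well above the expected count gives a small p-value, which is what ranks a term near the top.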

Value

A list of length 6*(number of biological samples). List elements are ordered by biological sample: the first 6 elements correspond to the first sample, the next 6 to the second sample, and so on. For each sample, the 6 elements are, in order:

a character vector giving the top GO slim IDs.
a character vector giving the corresponding top GO slim terms.
a vector giving the number of occurrences of the top GO slim IDs in the population.
a vector giving the observed number of occurrences of the top GO slim IDs in the sample.
a vector giving the expected number of occurrences of the top GO slim IDs in the sample.
a character vector giving the p-values from a hypergeometric test.
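
Since the result is a flat list, it can be convenient to regroup it into one block of 6 elements per sample. The following is a minimal sketch on made-up data; split_by_sample(), the field labels, and the toy values are assumptions for illustration and are not provided by TROM.

```r
# Hypothetical helper (not part of TROM): regroup a flat result list into
# one named 6-element block per biological sample.
split_by_sample <- function(res, sample_names = NULL) {
  # Labels chosen here for readability; TROM does not define them.
  fields <- c("GO_id", "GO_term", "n_population",
              "n_observed", "n_expected", "p_value")
  stopifnot(length(res) %% 6 == 0)
  n_samples <- length(res) / 6
  out <- lapply(seq_len(n_samples), function(i) {
    block <- res[(6 * (i - 1) + 1):(6 * i)]
    names(block) <- fields
    block
  })
  if (!is.null(sample_names)) names(out) <- sample_names
  out
}

# Toy result for two samples with one top term each (made-up values).
toy <- list("GO:0003677", "DNA binding", 50, 12, 5, "1e-03",
            "GO:0005634", "nucleus", 200, 30, 20, "2e-02")
by_sample <- split_by_sample(toy, c("sample1", "sample2"))
by_sample$sample2$GO_term
```

With a real result, the same call would take the object returned by find.top.GO.slim.terms() in place of toy.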

Author(s)

Jingyi Jessica Li, Wei Vivian Li

References

Li WV, Chen Y and Li JJ (2016). TROM: A Testing-Based Method for Finding Transcriptomic Similarity of Biological Samples. Statistics in Biosciences. DOI: 10.1007/s12561-016-9163-y

Li JJ, Huang H, Bickel PJ, & Brenner SE (2014). Comparison of D. melanogaster and C. elegans developmental stages, tissues, and cells by modENCODE RNA-seq data. Genome Research, 24(7), 1086-1101.

See Also

find.top.GO.terms

Examples

## Find top enriched GO slim terms in the developmental stages of D. melanogaster

## To run this example, please download the file "gene_association_fb_example.txt" from 
## https://ucla.box.com/GO-mapping-file.
## Please move "gene_association_fb_example.txt" to R's working directory.

## dm_gene_expr.rda can be downloaded and unzipped from
## http://www.stat.ucla.edu/~jingyi.li/packages/TROM/TROM_Rdata.zip.

## Not run: 
load("dm_gene_expr.rda")
dm_genes_all <- as.character(dm_gene_expr[, 1])
data(GO_slim_id)
gene_lists <- system.file("dm_associated_genes.xlsx", package = "TROM")
dm_stage_GO_slim <- find.top.GO.slim.terms(
  gene_lists = gene_lists,
  all_genes = dm_genes_all,
  GOmappingfile = "gene_association_fb_example.txt",
  output_file = "top 20 enriched GO slim terms in fly stage-associated genes.txt",
  GO_slim_id = GO_slim_id,
  topNum = 20,
  heatmap = FALSE)
## End(Not run)

TROM documentation built on May 1, 2019, 8:07 p.m.