get_anno_genes: Get genes that are annotated to GO-categories

Description

Given a vector of GO-IDs, e.g. c('GO:0072025','GO:0072221'), this function returns all genes that are annotated to those GO-categories. This includes genes that are annotated to any of the child nodes of a GO-category.

Usage

get_anno_genes(go_ids, database = 'Homo.sapiens', genes = NULL, annotations = NULL,
    term_df = NULL, graph_path_df = NULL, godir = NULL)

Arguments

go_ids

character() vector of GO-IDs, e.g. c('GO:0051082', 'GO:0042254').

database

optional character() defining an OrganismDb or OrgDb annotation package from Bioconductor, like 'Mus.musculus' (mouse) or 'org.Pt.eg.db' (chimp).

genes

optional character() vector of gene-symbols. If defined, only annotations of those genes are returned.

annotations

optional data.frame() with two character() columns: gene-symbols and GO-categories. Alternative to 'database'.

term_df

optional data.frame() with an ontology 'term' table. Alternative to the default integrated GO-graph or godir. Also needs graph_path_df.

graph_path_df

optional data.frame() with an ontology 'graph_path' table. Alternative to the default integrated GO-graph or godir. Also needs term_df.

godir

optional character() specifying a directory that contains the ontology tables 'term.txt' and 'graph_path.txt'. Alternative to the default integrated GO-graph or term_df + graph_path_df.

Details

Besides the default 'Homo.sapiens', other OrganismDb or OrgDb packages from Bioconductor, like 'Mus.musculus' (mouse) or 'org.Pt.eg.db' (chimp), can be used. Alternatively, a data.frame() with annotations can be provided directly; it is then searched for the input GO-categories and their child nodes.
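
The lookup on a custom annotations data.frame() can be pictured with a minimal base-R sketch. The GO-IDs, the gene names, the 'children' list and the helper lookup_genes() below are hypothetical; they only illustrate the idea of matching a category together with its child nodes and are not the package's implementation.

```r
# Toy annotations: one row per gene / GO-category pair (hypothetical values)
annotations = data.frame(
    gene  = c('G1', 'G2', 'G3', 'G4'),
    go_id = c('GO:parent', 'GO:child1', 'GO:child2', 'GO:other'),
    stringsAsFactors = FALSE
)

# Toy ontology: each category maps to itself plus all of its child nodes
children = list(
    'GO:parent' = c('GO:parent', 'GO:child1', 'GO:child2'),
    'GO:other'  = 'GO:other'
)

# Hypothetical helper: collect the genes annotated to a category or its children
lookup_genes = function(go_ids, annotations, children) {
    hits = lapply(go_ids, function(id) {
        genes = annotations$gene[annotations$go_id %in% children[[id]]]
        data.frame(go_id = rep(id, length(genes)), gene = genes,
                   stringsAsFactors = FALSE)
    })
    out = do.call(rbind, hits)
    # order by GO-ID and gene, as described for the real output
    out = out[order(out$go_id, out$gene), ]
    rownames(out) = NULL
    out
}

res = lookup_genes(c('GO:parent', 'GO:other'), annotations, children)
print(res)
```

In this toy case 'GO:parent' picks up G2 and G3 even though they are only annotated to its child nodes.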

By default the package's integrated GO-graph is used to find child nodes, but a custom ontology can be defined, too. For details on how to use a custom ontology with term_df + graph_path_df or godir please refer to the package's vignette. The advantage of term_df + graph_path_df over godir is that the latter reads the files 'term.txt' and 'graph_path.txt' from disk and therefore takes longer.
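
As a rough sketch of the difference between the two options: the tables can be read from disk once and the resulting data.frames reused, which presumably matters most when get_anno_genes is called several times. The file contents below are toy placeholders written to a temporary folder, not a real ontology, and the read.table() settings assume tab-separated files without a header row. The get_anno_genes calls are left as comments because they need GOfuncR and real ontology tables; see the vignette for the authoritative procedure.

```r
# Toy stand-ins for 'term.txt' and 'graph_path.txt' (placeholder content only)
godir = file.path(tempdir(), 'toy_ontology')
dir.create(godir, showWarnings = FALSE)
writeLines(c('1\troot\tdomain\tGO:toy1', '2\tleaf\tdomain\tGO:toy2'),
           file.path(godir, 'term.txt'))
writeLines(c('1\t1\t1', '2\t1\t2', '3\t2\t2'),
           file.path(godir, 'graph_path.txt'))

# Read a tab-separated table without header (assumed file layout)
read_tab = function(file) {
    read.table(file, sep = '\t', quote = '', comment.char = '', as.is = TRUE)
}

# Read both tables once ...
term_df = read_tab(file.path(godir, 'term.txt'))
graph_path_df = read_tab(file.path(godir, 'graph_path.txt'))

# ... and reuse the in-memory tables (needs GOfuncR and real tables):
# get_anno_genes(go_ids, term_df = term_df, graph_path_df = graph_path_df)
# instead of letting the function read the files from disk:
# get_anno_genes(go_ids, godir = godir)
```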

Value

A data.frame() with two columns: GO-IDs (character()) and the annotated genes (character()). The output is ordered by GO-ID and gene-symbol.
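
A short sketch of how a result of this shape could be processed further with base R. The data.frame below is a hand-made toy with hypothetical values; the column name 'go_id' matches the usage in the Examples, while 'gene' is an assumption.

```r
# Toy result shaped like the described output (hypothetical values)
anno_genes = data.frame(
    go_id = c('GO:toy1', 'GO:toy1', 'GO:toy2'),
    gene  = c('G1', 'G3', 'G2'),
    stringsAsFactors = FALSE
)

# Genes per GO-category as a named list
genes_per_go = split(anno_genes$gene, anno_genes$go_id)

# Number of annotated genes per GO-category
n_genes = table(anno_genes$go_id)
```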

Author(s)

Steffi Grote

References

[1] Ashburner, M. et al. (2000). Gene Ontology: tool for the unification of biology. Nature Genetics 25, 25-29.

See Also

get_anno_categories
get_ids
get_names
get_child_nodes
get_parent_nodes

Examples

## find all genes that are annotated to GO:0000109
## ("nucleotide-excision repair complex")
get_anno_genes(go_ids='GO:0000109')

## find out which genes from a set of genes
## are annotated to some GO-categories
genes = c('AGTR1', 'ANO1', 'CALB1', 'GYG1', 'PAX2')
gos = c('GO:0001558', 'GO:0005536', 'GO:0072205', 'GO:0006821')
anno_genes = get_anno_genes(go_ids=gos, genes=genes)
# add the names and domains of the GO-categories
cbind(anno_genes, get_names(anno_genes$go_id)[,2:3])

## find all annotations to GO-categories containing 'serotonin receptor'
sero_ids = get_ids('serotonin receptor')
sero_anno = get_anno_genes(go_ids=sero_ids$go_id)
# merge with names of GO-categories
head(merge(sero_ids, sero_anno))

GOfuncR documentation built on Nov. 8, 2020, 8:27 p.m.