prepGO: Download and prepare gene ontology

prepGO R Documentation

Download and prepare gene ontology

Description

prepGO downloads and prepares gene ontology (GO) databases for gene-set enrichment analysis.

Usage

prepGO(
  species = "human",
  abbr_species = NULL,
  gaf_url = NULL,
  obo_url = NULL,
  db_path = "~/proteoQ/dbs/go",
  type = c("biological_process", "cellular_component", "molecular_function"),
  filename = NULL,
  overwrite = FALSE
)

Arguments

species

Character string; the name of a species for the convenient preparation of GO. The convenience feature is available for species in c("human", "mouse", "rat"), with "human" being the default. The argument is not required for other species; instead, users provide values for the arguments abbr_species, gaf_url and obo_url.

abbr_species

Two-letter character string; the abbreviated name of the species as used in org.Xx.eg.db. The value of abbr_species is determined automatically if the species is one of c("human", "mouse", "rat"). For other species, users need to provide it; for example, abbr_species = "Ce" for fetching the org.Ce.eg.db package in the namespace of proteoQ.

For analyses against gene ontology and Molecular Signatures, the abbreviation is also used as a prefix to distinguish identical biological terms across species; e.g., GO~0072686 mitotic spindle becomes hs_GO~0072686 mitotic spindle for human and mm_GO~0072686 mitotic spindle for mouse.

gaf_url

A URL to a GO Annotation File (GAF). A valid web address is required for species other than c("human", "mouse", "rat"). At the NULL default, and with the species being one of c("human", "mouse", "rat"), the link is determined automatically; users can override the default GAF by providing their own URL.

obo_url

A URL to GO terms in OBO format. At the NULL default, the web address is determined automatically. Users can override the default OBO by providing their own URL.

db_path

Character string; the local path for database(s). The default is "~/proteoQ/dbs/go".

type

Character vector; the GO namespaces to be included. The default is to include all of c("biological_process", "cellular_component", "molecular_function"). For example, at type = c("biological_process", "cellular_component"), terms of molecular_function will be excluded.

filename

Character string; an output file name. At the NULL default, the name is determined automatically for the given species; e.g., go_hs.rds for human data. The file is saved as an .rds object for use with prnGSPA.

overwrite

Logical; if TRUE, overwrite the downloaded database(s). The default is FALSE.
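
The naming conventions described under abbr_species and filename can be illustrated with a small base-R sketch. The helpers tag_term() and default_go_file() are hypothetical and are not part of proteoQ; they only mimic the documented patterns (a lower-case species prefix on GO terms, and go_hs.rds / go_mm.rds as default file names), and the actual internal implementation may differ.

```r
# Hypothetical helpers, for illustration only (not part of proteoQ)

# Prefix a GO term with the lower-case species abbreviation
tag_term <- function(term, abbr_species) {
  paste0(tolower(abbr_species), "_", term)
}

# Build a default-style output file name from the abbreviation
default_go_file <- function(abbr_species) {
  paste0("go_", tolower(abbr_species), ".rds")
}

term <- "GO~0072686 mitotic spindle"

tag_term(term, "Hs")   # "hs_GO~0072686 mitotic spindle"
tag_term(term, "Mm")   # "mm_GO~0072686 mitotic spindle"
default_go_file("Hs")  # "go_hs.rds"
default_go_file("Mm")  # "go_mm.rds"
```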

Examples


library(proteoQ)

# `human` and `mouse` with a default OBO;
# outputs under `db_path`
prepGO("human")
prepGO("mouse")

# head(readRDS(file.path("~/proteoQ/dbs/go/go_hs.rds")))
# head(readRDS(file.path("~/proteoQ/dbs/go/go_mm.rds")))

# enrichment analysis with custom `GO`
prnGSPA(
  gset_nms = c("~/proteoQ/dbs/go/go_hs.rds",
               "~/proteoQ/dbs/go/go_mm.rds")
)

# `mouse` with a slim OBO
prepGO(
  species = "mouse",
  obo_url = "http://current.geneontology.org/ontology/subsets/goslim_mouse.obo",
  filename = "mm_slim.rds"
)

# `worm` not available for default GO preparation
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("org.Ce.eg.db")

library(org.Ce.eg.db)

prepGO(
  # species = "worm",
  abbr_species = "Ce",
  gaf_url = "http://current.geneontology.org/annotations/wb.gaf.gz",
  obo_url = "http://purl.obolibrary.org/obo/go/go-basic.obo",
  filename = "go_ce.rds"
)
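
The type argument can be used to keep only some of the GO namespaces. The following is a minimal sketch, assuming a working proteoQ installation, network access and the default db_path; the output file name go_hs_bp_cc.rds is an arbitrary choice made here for illustration.

```r
library(proteoQ)

# `human` restricted to two of the three GO namespaces;
# terms of `molecular_function` are excluded
prepGO(
  species = "human",
  type = c("biological_process", "cellular_component"),
  filename = "go_hs_bp_cc.rds"  # arbitrary name, chosen for this example
)

# inspect the saved object (assumes the default `db_path`)
go_hs <- readRDS(file.path("~/proteoQ/dbs/go", "go_hs_bp_cc.rds"))
head(go_hs)
```

The saved file can then be passed to prnGSPA through gset_nms in the same way as the default go_hs.rds.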



qzhang503/proteoQ documentation built on Dec. 14, 2024, 12:27 p.m.