getAnnotation: Annotation downloader

View source: R/annotation.R

getAnnotation    R Documentation

Description

For Ensembl-based annotations, this function connects to the EBI's BioMart service using the package biomaRt and downloads annotation elements (gene co-ordinates, exon co-ordinates, gene identifications, biotypes etc.) for each of the supported organisms. For UCSC/RefSeq annotations, it connects to the respective SQL databases if the package RMySQL is present; otherwise it downloads flat files and builds a temporary SQLite database to run the necessary build queries. See the help page of metaseqr2 for a list of supported organisms.
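
Which UCSC/RefSeq route would be taken on a given machine can be checked in advance. The following is a minimal sketch, assuming only that the choice depends on whether RMySQL is installed, as described above. The variable names are illustrative and not part of metaseqR2.

```r
# Check whether the optional RMySQL package is installed (without attaching it).
hasRMySQL <- requireNamespace("RMySQL", quietly = TRUE)

# Describe the route that the description above implies for UCSC/RefSeq sources.
route <- if (hasRMySQL) {
    "direct SQL queries against the UCSC public databases"
} else {
    "flat-file download into a temporary SQLite database"
}
message("UCSC/RefSeq annotation would be fetched via: ", route)
```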

Usage

    getAnnotation(org, type, refdb = "ensembl", ver = NULL,
        rc = NULL)

Arguments

org

the organism for which to download annotation (one of the supported ones).

type

"gene", "exon" or "utr". Same as the countType in metaseqr2.

refdb

the online source to use to fetch annotation. It can be "ensembl" (default), "ucsc" or "refseq". In the latter two cases, an SQL connection is opened with the UCSC public databases.

ver

the version of the annotation to use.

rc

Fraction of cores to use. Same as the rc in buildAnnotationDatabase.
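
Because a call may involve a slow network download, it can help to validate the arguments first. The sketch below uses a hypothetical helper, checkAnnotationArgs, which is not part of metaseqR2. It checks type and refdb against the values documented above and assumes that rc, when given, is a single fraction in (0, 1].

```r
# Hypothetical helper (not part of metaseqR2): validate arguments against the
# values documented above before making a potentially slow network call.
checkAnnotationArgs <- function(type = c("gene", "exon", "utr"),
                                refdb = c("ensembl", "ucsc", "refseq"),
                                rc = NULL) {
    type <- match.arg(type)
    refdb <- match.arg(refdb)
    # Assumption: rc, when given, is a single fraction in (0, 1].
    if (!is.null(rc)) {
        stopifnot(is.numeric(rc), length(rc) == 1, rc > 0, rc <= 1)
    }
    list(type = type, refdb = refdb, rc = rc)
}

args <- checkAnnotationArgs(type = "exon", refdb = "ucsc", rc = 0.5)
# The validated values could then be passed on, e.g.
# mm10Exons <- getAnnotation("mm10", args$type, refdb = args$refdb, rc = args$rc)
```

A misspelled value such as type = "genes" stops with an error from match.arg before any download is attempted.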

Value

A data frame with the canonical (not isoforms!) genes or exons of the requested organism. When type="gene", the data frame has the following columns: chromosome, start, end, gene_id, gc_content, strand, gene_name, biotype. When type="exon", the data frame has the following columns: chromosome, start, end, exon_id, gene_id, strand, gene_name, biotype. When type="utr", the data frame has the following columns: chromosome, start, end, transcript_id, gene_id, strand, gene_name, biotype. The gene_id, transcript_id and exon_id correspond to Ensembl, UCSC or RefSeq gene, transcript and exon accessions respectively. The gene_name corresponds to HUGO nomenclature gene names.
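
To illustrate the gene-level layout without a network call, the sketch below builds a toy data frame with the columns listed above and runs two typical summaries on it. All values are made up, and the 1-based inclusive coordinate convention used for the length calculation is an assumption, not something stated on this page.

```r
# Toy stand-in for the result of getAnnotation(org, "gene"); values are made up.
genes <- data.frame(
    chromosome = c("chr1", "chr1", "chr2"),
    start = c(1000L, 5000L, 200L),
    end = c(2000L, 7500L, 900L),
    gene_id = c("G1", "G2", "G3"),
    gc_content = c(41.2, 55.0, 38.7),
    strand = c("+", "-", "+"),
    gene_name = c("GeneA", "GeneB", "GeneC"),
    biotype = c("protein_coding", "lncRNA", "protein_coding"),
    stringsAsFactors = FALSE
)

# Confirm that the columns match the documented gene-level layout.
expectedCols <- c("chromosome", "start", "end", "gene_id", "gc_content",
    "strand", "gene_name", "biotype")
stopifnot(identical(colnames(genes), expectedCols))

# Gene length, assuming 1-based inclusive coordinates (an assumption).
genes$length <- genes$end - genes$start + 1L

# Number of genes per biotype.
biotypeCounts <- table(genes$biotype)
```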

Note

The data frame that is returned contains only "canonical" chromosomes for each organism. It does not contain haplotypes or random locations and does not contain chromosome M.
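
The effect of this restriction can be sketched on a toy vector of chromosome names. This is not the filter used by the package. It assumes UCSC-style "chr" prefixes and numeric or X/Y chromosome names, so organism-specific names would need a different pattern.

```r
# Toy chromosome names; assumes UCSC-style "chr" prefixes (an assumption).
chrs <- c("chr1", "chr2", "chrX", "chrY", "chrM", "chr1_random",
    "chr6_cox_hap2", "chrUn_gl000220")

# Keep autosomes and sex chromosomes; drop chrM, haplotypes and random contigs.
isCanonical <- grepl("^chr([0-9]+|X|Y)$", chrs)
canonical <- chrs[isCanonical]
```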

Author(s)

Panagiotis Moulos

Examples

mm10Genes <- getAnnotation("mm10","gene")

pmoulos/metaseqR2 documentation built on March 14, 2024, 8:15 p.m.