loadAnnotation: Load a sitadela simple annotation element

loadAnnotation    R Documentation

Description

This function loads an annotation element from a local sitadela annotation database. If the annotation is not found and the organism is supported, the annotation is fetched and created on the fly, but it is not imported into the local database. To build, update or add annotations in the local database, use addAnnotation instead.

Usage

    loadAnnotation(genome, refdb, 
        type = c("gene", "transcript", "utr",
            "transutr", "transexon", "exon"), 
        version="auto", wtv = FALSE,
        db = getDbPath(), summarized = FALSE,
        asdf = FALSE, rc = NULL)

Arguments

genome

a sitadela supported organism or a custom organism name imported by the user.

refdb

a sitadela supported annotation source or a custom name imported by the user.

type

the transcriptional unit annotation level to load. It can be one of "gene" (default), "transcript", "utr", "transexon", "transutr", "exon". See Details for further explanation of each option.

version

the version of the annotation to use. See Details.

wtv

load annotations with versioned genes and transcripts when/where available.

db

same as the db in addAnnotation.

summarized

if TRUE, retrieve summarized, non-overlapping elements where appropriate (e.g. exons).

asdf

return the result as a data.frame (default FALSE).

rc

same as the rc in addAnnotation.

Details

Regarding genome, it can be "hg18", "hg19" or "hg38" for human genomes, "mm9" or "mm10" for mouse genomes, "rn5" or "rn6" for rat genomes, "dm3" or "dm6" for the drosophila genome, "danrer7", "danrer10" or "danrer11" for the zebrafish genome, "pantro4" or "pantro5" for the chimpanzee genome, "susscr3" or "susscr11" for the pig genome, "tair10" for the Arabidopsis thaliana genome and "equcab2" or "equcab3" for the Equus caballus genome. Finally, it can be a user-defined name for a custom organism which has been imported to the annotation database by the user using a GTF/GFF file, for example genome="mm10_p1".
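The genome codes listed above can be collected into a small lookup for checking user input before calling loadAnnotation. This is a minimal base R sketch, not part of sitadela: the list is transcribed by hand from the text above, and both `supportedGenomes` and `isBuiltInGenome` are hypothetical names introduced only for illustration.

```r
# Genome codes transcribed from Details, grouped by organism.
# Illustrative only: this list is not read from sitadela itself.
supportedGenomes <- list(
    human = c("hg18", "hg19", "hg38"),
    mouse = c("mm9", "mm10"),
    rat = c("rn5", "rn6"),
    drosophila = c("dm3", "dm6"),
    zebrafish = c("danrer7", "danrer10", "danrer11"),
    chimpanzee = c("pantro4", "pantro5"),
    pig = c("susscr3", "susscr11"),
    arabidopsis = c("tair10"),
    horse = c("equcab2", "equcab3")
)

# Hypothetical helper: TRUE if the code is one of the genomes listed above.
# Custom organisms imported by the user (e.g. "mm10_p1") are not covered.
isBuiltInGenome <- function(genome) {
    genome %in% unlist(supportedGenomes, use.names = FALSE)
}

isBuiltInGenome("hg19")
isBuiltInGenome("mm10_p1")
```

A custom organism name fails this check by design, since such names exist only in the user's own annotation database.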

Regarding type, it defines the level of transcriptional unit (gene, transcript, 3' UTR, exon) coordinates to be loaded or fetched if not present. The following types are supported:

  • "gene": canonical gene coordinates are retrieved from the chosen database.

  • "transcript": all transcript coordinates are retrieved from the chosen database.

  • "utr": all 3' UTR coordinates are retrieved from the chosen database, grouped per gene.

  • "transutr": all 3' UTR coordinates are retrieved from the chosen database, grouped per transcript.

  • "transexon": all exon coordinates are retrieved from the chosen database, grouped per transcript.

  • "exon": all exon coordinates are retrieved from the chosen database.
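The Usage signature gives type as a character vector of choices with "gene" first, which is the usual R idiom for an argument resolved with match.arg. The sketch below shows that idiom in isolation; `resolveType` is a hypothetical helper, and whether loadAnnotation resolves type in exactly this way internally is an assumption.

```r
# Hypothetical helper, not part of sitadela: shows how a vector-of-choices
# default such as `type` is commonly resolved in R with match.arg().
resolveType <- function(type = c("gene", "transcript", "utr",
        "transutr", "transexon", "exon")) {
    match.arg(type)
}

resolveType()             # no value supplied: the first choice is used
resolveType("transexon")  # a valid choice is returned unchanged
```

With this idiom, a value that matches none of the choices raises an error instead of silently falling back to the default.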

Regarding version, this is an integer denoting the version of the annotation to use from the local annotation database or to fetch on the fly. For Ensembl, it corresponds to Ensembl releases, while for UCSC/RefSeq, it is the date on which the annotation was created locally.

Value

The function returns a GenomicRanges object or, if asdf = TRUE, a data.frame with the requested annotation.
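The effect of asdf on the returned value can be sketched as follows. This assumes that sitadela is installed and that a local database exists at the package path used in the Examples section; when either is missing, the guard skips the call and nothing is loaded. The variable name `res` is introduced here only for illustration.

```r
# Path used in this page's Examples; the file may not exist on every system.
db <- file.path(system.file(package = "sitadela"), "annotation.sqlite")

res <- NULL
if (nzchar(system.file(package = "sitadela")) && file.exists(db)) {
    # asdf = TRUE asks for a data.frame instead of a GenomicRanges object
    res <- sitadela::loadAnnotation(genome = "hg19", refdb = "ensembl",
        type = "gene", db = db, asdf = TRUE)
}

if (is.null(res)) {
    message("sitadela or its local database is not available; nothing loaded")
} else {
    str(res)
}
```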

Author(s)

Panagiotis Moulos

Examples

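library(sitadela)

# Path to a local annotation database inside the installed package directory
db <- file.path(system.file(package="sitadela"),
    "annotation.sqlite")
# Load gene-level annotation only when the database file exists
if (file.exists(db))
    gr <- loadAnnotation(genome="hg19",refdb="ensembl",
        type="gene",db=db)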

pmoulos/sitadela documentation built on March 19, 2024, 2:02 a.m.