retrieveanno: Retrieve and Combine Annotation Information

View source: R/preprocessing-retrieveanno.R

Description

This function filters GENCODE annotations to keep only entries of type "transcript". It then separates transcripts of protein-coding genes (MANE_Select) from transcripts of long non-coding RNA genes (lncRNA, Ensembl_canonical).

Usage

retrieveanno(exptabpath, gencodepath, saveobjectpath = NA, showtime = FALSE,
    verbose = TRUE)

Arguments

exptabpath

Path to the experiment table, a CSV file with columns named 'condition', 'replicate', 'strand', and 'path'.

gencodepath

Path to the GENCODE annotation file.

saveobjectpath

Path to the directory where intermediate R objects are saved. Default is 'NA', in which case no R objects are saved.

showtime

Logical. If 'TRUE', displays timing information. Default is 'FALSE'.

verbose

Logical. If 'TRUE', provides detailed messages during execution. Default is 'TRUE'.
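
As an illustration of the experiment table expected by 'exptabpath', the sketch below builds a small CSV file with the four required columns and checks that they are present. The condition names, strand values, and file names are placeholders chosen for this example, and the column check is a simplified stand-in for the validation that retrieveanno performs, not its actual code.

```r
## Hypothetical experiment table: all values below are placeholders.
exptab <- data.frame(
    condition = c("ctrl", "ctrl", "treated", "treated"),
    replicate = c(1, 1, 1, 1),
    strand = c("plus", "minus", "plus", "minus"),
    path = c("ctrl_rep1_plus.bg", "ctrl_rep1_minus.bg",
        "treated_rep1_plus.bg", "treated_rep1_minus.bg"),
    stringsAsFactors = FALSE)

## Write the table to a temporary CSV file and read it back.
exptabfile <- tempfile(fileext = ".csv")
utils::write.csv(exptab, exptabfile, row.names = FALSE)
reloaded <- utils::read.csv(exptabfile, header = TRUE)

## Simplified check that the four documented columns are present.
requiredcols <- c("condition", "replicate", "strand", "path")
missingcols <- setdiff(requiredcols, colnames(reloaded))
if (length(missingcols) > 0)
    stop("Missing columns: ", paste(missingcols, collapse = ", "))
```

The path stored in 'exptabfile' could then be passed as 'exptabpath', provided the files listed in the 'path' column exist.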

Details

The function performs the following steps:

1. Reads experimental data from the provided CSV file and validates it.
2. Reads genomic annotations from the GENCODE file and filters for transcripts.
3. Separately processes protein-coding and long non-coding RNA transcripts:
   - For protein-coding genes, selects the most representative (MANE_Select or Ensembl_canonical) transcripts.
   - For long non-coding RNAs, filters out transcripts with undesirable evidence levels.
4. Combines these annotations into a single data frame, labeling each transcript with its biotype.
5. Optionally saves the resulting data frame as an RDS file in the specified directory.
6. Optionally reports the total time taken for analysis.
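
The filtering and combining steps described above can be sketched in base R on a toy table. This is a minimal illustration and not the package's implementation: the column layout mimics a GTF file read without a header (V3 is the feature type, V9 the attribute string), the attribute strings are invented, and treating "transcript_support_level 5" as the undesirable evidence level is an assumption made for this example.

```r
## Toy stand-in for a GENCODE GTF table (invented values).
gtf <- data.frame(
    V1 = rep("chr13", 5),
    V3 = c("gene", "transcript", "transcript", "transcript", "exon"),
    V4 = c(100L, 100L, 500L, 900L, 100L),
    V5 = c(400L, 400L, 800L, 1200L, 200L),
    V7 = c("+", "+", "-", "+", "+"),
    V9 = c(
        "gene_type \"protein_coding\";",
        "transcript_type \"protein_coding\"; tag \"MANE_Select\";",
        "transcript_type \"lncRNA\"; tag \"Ensembl_canonical\";",
        paste("transcript_type \"lncRNA\"; tag \"Ensembl_canonical\";",
            "transcript_support_level \"5\";"),
        "transcript_type \"protein_coding\";"),
    stringsAsFactors = FALSE)

## Keep only the "transcript" entries.
transcripts <- gtf[gtf$V3 == "transcript", ]

## Protein-coding transcripts: selected here by the MANE_Select tag.
protcod <- transcripts[grepl("MANE_Select", transcripts$V9, fixed = TRUE), ]

## lncRNA transcripts: must carry both lncRNA and Ensembl_canonical.
islncrna <- grepl("lncRNA", transcripts$V9, fixed = TRUE) &
    grepl("Ensembl_canonical", transcripts$V9, fixed = TRUE)
lncrna <- transcripts[islncrna, ]

## Assumption: support level 5 stands for "undesirable evidence".
lowevidence <- grepl("transcript_support_level \"5\"", lncrna$V9,
    fixed = TRUE)
lncrna <- lncrna[!lowevidence, ]

## Label each set with its biotype and combine into one data frame.
protcod$biotype <- "protein-coding"
lncrna$biotype <- "lncRNA"
combined <- rbind(protcod, lncrna)
```

In this toy case 'combined' holds one protein-coding and one lncRNA transcript; the gene and exon rows and the low-evidence lncRNA are dropped.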

Value

A data frame containing the combined annotation information for protein-coding and long non-coding RNA transcripts. If 'saveobjectpath' is not 'NA', the object is also saved as an RDS file in the specified directory.
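
For readers unfamiliar with RDS files, the sketch below shows how a data frame is saved to and restored from such a file with base R. The data frame, its column names, and the file name are hypothetical; the file name that retrieveanno uses inside 'saveobjectpath' is not stated here.

```r
## Hypothetical annotation table with a biotype label per transcript.
annotations <- data.frame(
    chrom = c("chr13", "chr13"),
    start = c(100L, 500L),
    end = c(400L, 800L),
    biotype = c("protein-coding", "lncRNA"),
    stringsAsFactors = FALSE)

## Save to an RDS file in a directory (file name chosen for this example).
outdir <- tempdir()
rdsfile <- file.path(outdir, "annotations-sketch.rds")
saveRDS(annotations, file = rdsfile)

## Restore the object in a later session.
restored <- readRDS(rdsfile)
```

An RDS file stores a single R object, so 'restored' is a data frame with the same columns and values as 'annotations'.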

Examples

## Paths to the example experiment table and GENCODE annotation shipped with tepr
exptabpath <- system.file("extdata", "exptab-preprocessing.csv", package = "tepr")
gencodepath <- system.file("extdata", "gencode-chr13.gtf", package = "tepr")

## Retrieve the combined annotations without progress messages
allannobed <- retrieveanno(exptabpath, gencodepath, verbose = FALSE)


tepr documentation built on June 8, 2025, 10:46 a.m.