getBiotypes: Assigning Transcript Biotypes

View source: R/BioTIP_update_04202022.R

getBiotypes R Documentation

Assigning Transcript Biotypes

Description

The getBiotypes() function classifies both coding and noncoding transcripts into biotypes using the most recent GENCODE annotations. It can also be used to define potential lncRNAs, given an available genome transcriptome assembly (a GTF file) or any genomic loci of interest.

Usage

getBiotypes(full_gr, gencode_gr, intron_gr = NULL, minoverlap = 1L)

Arguments

full_gr

A GRanges object containing either coding or noncoding transcripts. The columns of each GRanges object require unique identifiers. For further details, refer to the GenomicRanges package.

gencode_gr

A GRanges object containing a human Chr21 GENCODE reference annotation. A metadata column, "biotype", describes the transcript type.

intron_gr

A GRanges object containing the coordinates of non-coding transcripts. Optional; defaults to NULL.

minoverlap

An integer giving the minimum number of overlapping positions required between two ranges for them to count as overlapping (default 1L). For more information about the minoverlap argument, refer to the IRanges package.
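
As a minimal sketch of how the inputs and minoverlap interact, the snippet below builds two toy GRanges objects and calls findOverlaps() directly. It assumes getBiotypes() forwards minoverlap to findOverlaps(), as the Details section suggests. The object names (toy_full_gr, toy_gencode_gr), coordinates, and biotypes are invented for illustration and are not part of the package.

```r
library(GenomicRanges)

# Hypothetical toy inputs; all coordinates and biotypes are invented.
toy_full_gr <- GRanges("chr21", IRanges(start = 1, end = 100))
toy_gencode_gr <- GRanges(
  seqnames = "chr21",
  ranges   = IRanges(start = c(96, 500), end = c(200, 600)),
  biotype  = c("protein_coding", "lincRNA")  # metadata column named "biotype"
)

# The query shares exactly 5 positions (96-100) with the first annotation.
hits_default <- findOverlaps(toy_full_gr, toy_gencode_gr, minoverlap = 1L)
hits_strict  <- findOverlaps(toy_full_gr, toy_gencode_gr, minoverlap = 10L)

length(hits_default)  # 1: a 5-position overlap satisfies minoverlap = 1L
length(hits_strict)   # 0: the same overlap is shorter than 10 positions
```

Raising minoverlap is therefore a way to ignore transcripts that only graze an annotated feature.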

Details

For details of findOverlaps, type.partialOverlap, type.50Overlap, type.toPlot, queryHits, and subjectHits, see GenomicRanges https://www.bioconductor.org/packages/release/bioc/html/GenomicRanges.html, IRanges https://www.bioconductor.org/packages/release/bioc/html/IRanges.html, and BiocManager http://bioconductor.org/install/index.html.
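
To illustrate how queryHits() and subjectHits() are typically used after findOverlaps(), here is a hedged sketch that copies a biotype label from a toy annotation onto overlapping query ranges. It is not the internal implementation of getBiotypes(); the objects toy_query and toy_annot and their coordinates are invented for illustration.

```r
library(GenomicRanges)

# Hypothetical toy data: only the first query range overlaps an annotation.
toy_query <- GRanges("chr21", IRanges(start = c(100, 9000), end = c(300, 9500)))
toy_annot <- GRanges(
  seqnames = "chr21",
  ranges   = IRanges(start = c(250, 4000), end = c(800, 4500)),
  biotype  = c("protein_coding", "lincRNA")
)

hits <- findOverlaps(toy_query, toy_annot)

# queryHits()/subjectHits() return the paired row indices of each overlap.
q_idx <- queryHits(hits)
s_idx <- subjectHits(hits)

# Start with NA for every query range, then fill in the overlapping ones.
toy_query$biotype <- rep(NA_character_, length(toy_query))
toy_query$biotype[q_idx] <- toy_annot$biotype[s_idx]

toy_query$biotype
```

Query ranges with no overlap keep NA, which makes unannotated loci easy to pick out afterwards.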

Value

A GRanges object containing the classified transcript biotypes.

Note

Replace the PATH_FILE when loading your data locally.

Author(s)

Zhezhen Wang and Biniam Feleke

Source

Reference GRCh37 genome: https://www.gencodegenes.org/human/release_25lift37.html

For details on the GTF format, visit Ensembl: https://useast.ensembl.org/info/website/upload/gff.html

References

Wang, Z. Z., J. M. Cunningham and X. H. Yang (2018). 'CisPi: a transcriptomic score for disclosing cis-acting disease-associated lincRNAs.' Bioinformatics 34(17): 664-670. PMID: 30423099

Examples

# Input datasets from our package's data folder
library(GenomicRanges)
data("gencode")
data("intron")
data("ILEF")
data("cod")  # assumption: 'cod' ships with the package like the datasets above

# Converting datasets to GRanges objects
gencode_gr <- GRanges(gencode)
ILEF_gr <- GRanges(ILEF)
cod_gr <- GRanges(cod)
intron_gr <- GRanges(intron)

# Filtering non-coding transcripts
getBiotypes(ILEF_gr, gencode_gr, intron_gr)

## Not run: getBiotypes(intron_gr)

xyang2uchicago/NPS documentation built on Nov. 7, 2023, 1 a.m.