GFF_convert: Flatten GTF files into GFF format

Description Usage Arguments Value Note Author(s) References Examples

View source: R/GFF_convert.R

Description

The GFF_convert function takes a GTF file input and flattens in into a GFF file, which contains discrete, non-overlapping exon bin definitions. Transcripts per exon bin are annotated, and normal gene names are added, in addition the Ensembl IDs.

Usage

1
GFF_convert(annot.GTF, GTF.File, GFF.File)

Arguments

annot.GTF

This argument must be passed GTF file annotations, either in the form of an rtracklayer object from the rtracklayer import function, or a data table directly imported into R from a GTF file, using tab delimiters. The function will automatically detect the formatting of the file in these cases.

GTF.File

A full file path, including file name & extension, to the GTF file which will be flattened. This parameter is required. An example file path for a GTF file would be: /Users/username/path/to/file.gtf

GFF.File

A full file path to which the flattened GFF file can be written. This argument is not required, although it is strongly recommended that GFF files be saved for future use. An example file path for a GFF file would be: /Users/username/path/to/file.gff

Value

This function returns a GFF data frame, which contains non-overlapping exon bins, flattened from a GTF file. When the function GFF_convert is run, its results should usually be assigned to a variable name, such as 'GFF'.

Note

The GFF_convert function is based on the ideas from the authors of DEXSeq (Anders et al., 2012). Their dexseq_prepare_annotations.py pipeline for flattening GTF files into GFF format was translated into R. Great thanks are owed to Dylan Siriwardena (MSc) for helping write the initial translated R version.

Author(s)

R. Matthew Tanner

References

Anders, S., Reyes, A., and Huber, W. (2012). Detecting differential usage of exons from RNA-seq data. Genome Research, 22(10):2008-17. Lawrence M, Huber W, Pag\'es H, Aboyoun P, Carlson M, et al. (2013) Software for Computing and Annotating Genomic Ranges. PLoS Comput Biol 9(8): e1003118.doi:10.1371/journal.pcbi.1003118

Examples

1
2
3
4
5
# load the sub-sampled GTF file path from the ExCluster package
GTF_file <- system.file("extdata","sub_gen.v23.gtf", package = "ExCluster")
# now run GTF_file without assigning a GFF_file to write out
# assign this output to the GFF variable
GFF <- GFF_convert(GTF.File=GTF_file)

ExCluster documentation built on Nov. 8, 2020, 6:48 p.m.