R/exportToGtf.R

#' Export GRangesList to GTF
#' 
#' Export the features in a \code{GRangesList} generated by 
#' \code{getFeatureRanges} to a GTF file. Each individual range within a 
#' list element is written as an "exon", each list element as a 
#' "transcript" (spanning the full range of its exons), and all features 
#' sharing the same \code{gene_id} are aggregated into a single "gene" 
#' entry in the GTF file. 
#' 
#' @param grl \code{GRangesList} object, typically generated by 
#'   \code{getFeatureRanges}
#' @param filepath Path to output GTF file
#' 
#' @author Charlotte Soneson
#' 
#' @export
#' 
#' @return Nothing of interest; the function is called for its side effect 
#'   of writing a GTF file to \code{filepath}.
#' 
#' @importFrom rtracklayer export
#' @importFrom BiocGenerics sort unlist
#' @importFrom S4Vectors split
#' @importFrom methods is
#' 
#' @examples
#'   ## Get feature ranges
#'   grl <- getFeatureRanges(
#'     gtf = system.file("extdata/small_example.gtf", package = "eisaR"),
#'     featureType = c("spliced", "intron"),
#'     intronType = "separate",
#'     flankLength = 5L,
#'     joinOverlappingIntrons = FALSE,
#'     verbose = TRUE
#'   )
#'   
#'   ## Export GTF
#'   exportToGtf(grl = grl, filepath = file.path(tempdir(), "exported.gtf"))
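#'   
#'   ## Optional sanity check (a sketch, not part of the original example):
#'   ## read the exported file back in and tabulate the feature types to see
#'   ## how the ranges were labelled. Assumes rtracklayer is installed.
#'   gtf <- rtracklayer::import(file.path(tempdir(), "exported.gtf"))
#'   table(gtf$type)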
#' 
exportToGtf <- function(grl, filepath) {
    if (!is(grl, "GRangesList")) {
        stop("'grl' must be a GRangesList")
    }
    
    ## "Exons" (individual exons for spliced transcripts, unspliced transcripts
    ## and introns separately)
    gre <- BiocGenerics::unlist(grl)
    
    ## "Transcripts" (full transcript range for spliced transcripts, unspliced
    ## transcripts and introns separately)
    grt <- BiocGenerics::unlist(range(grl))
    grt$exon_id <- NA
    grt$exon_rank <- NA
    grt$transcript_id <- names(grt)
    grt$gene_id <- gre$gene_id[match(grt$transcript_id, gre$transcript_id)]
    grt$type <- "transcript"
    
    ## "Gene"
    grg <- S4Vectors::split(gre, gre$gene_id)
    grg <- BiocGenerics::unlist(range(grg))
    grg$exon_id <- NA
    grg$exon_rank <- NA
    grg$transcript_id <- NA
    grg$gene_id <- names(grg)
    grg$type <- "gene"
    
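    ## Combine exon, transcript and gene entries, sort them by genomic
    ## position and write the result to the GTF file
    grtot <- BiocGenerics::sort(c(gre, grt, grg))
    rtracklayer::export(grtot, con = filepath, format = "gtf")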
}


eisaR documentation built on Nov. 8, 2020, 8:26 p.m.