getFeatureRanges: Generate a GRangesList object with feature ranges

View source: R/getFeatureRanges.R

getFeatureRangesR Documentation

Generate a GRangesList object with feature ranges

Description

Generate a GRangesList object with genomic ranges for (any combination of) spliced transcripts, unspliced transcripts and introns.

Usage

getFeatureRanges(
  gtf,
  featureType = c("spliced", "intron"),
  intronType = "separate",
  flankLength = 90L,
  joinOverlappingIntrons = FALSE,
  collapseIntronsByGene = FALSE,
  keepIntronInFeature = FALSE,
  verbose = TRUE
)

Arguments

gtf

Path to gtf file.

featureType

Character vector indicating the type(s) of features to extract, any subset of c("spliced", "intron", "unspliced").

intronType

Character vector indicating how to define the introns (only used if "intron" is part of featureType). Has to be either "separate" (introns are defined for each transcript separately) or "collapse" (transcripts of the same gene are first collapsed before introns are defined as any non-exonic part of the gene locus).

flankLength

Integer scalar indicating the length of the flanking sequence added to each side of each extracted intron (only used if "intron" is included in featureType).

joinOverlappingIntrons

Logical scalar indicating whether two introns that overlap (after adding the flanking sequences) should be joined into one feature.

collapseIntronsByGene

Logical scalar indicating whether to collapse overlapping intron ranges by gene after extraction.

keepIntronInFeature

Logical scalar indicating whether introns (after adding the flank length) should be restricted to stay within the boundaries of the feature they were generated from. For backwards compatibility, the default is FALSE.

verbose

Logical scalar, whether to print out progress messages.

Value

Returns a GRangesList object where each element represents one extracted feature. The metadata of this object contains two data.frames mapping corresponding identifiers between the different feature types, as well as a list of all features for each type.

Author(s)

Charlotte Soneson

Examples

  ## Get feature ranges
  grl <- getFeatureRanges(
    gtf = system.file("extdata/small_example.gtf", package = "eisaR"),
    featureType = c("spliced", "intron"),
    intronType = "separate",
    flankLength = 5L,
    joinOverlappingIntrons = FALSE,
    collapseIntronsByGene = FALSE,
    verbose = TRUE
  )
  
  ## GRangesList
  grl
  
  ## Corresponding transcript/gene IDs
  S4Vectors::metadata(grl)$corrtx
  S4Vectors::metadata(grl)$corrgene
  
  ## List of features of different types
  S4Vectors::metadata(grl)$featurelist
  
  ## Get feature sequences
  if (requireNamespace("BSgenome", quietly = TRUE) &&
      requireNamespace("GenomicFeatures", quietly = TRUE)) {
    library(BSgenome)
    genome <- Biostrings::readDNAStringSet(
      system.file("extdata/small_example_genome.fa", package = "eisaR"))
    seqs <- GenomicFeatures::extractTranscriptSeqs(x = genome, 
                                                   transcripts = grl)
    seqs
  }
 

fmicompbio/eisaR documentation built on Dec. 11, 2024, 1:05 a.m.