getGRcoverageFromBw: Get coverage for GRanges from bigWig files

getGRcoverageFromBw    R Documentation

Get coverage for GRanges from bigWig files

Description

Get coverage for GRanges from bigWig files

Usage

getGRcoverageFromBw(
  gr,
  bwUrls,
  addGaps = FALSE,
  gap_feature_type = "gap",
  default_feature_type = "exon",
  feature_type_colname = "feature_type",
  use_memoise = FALSE,
  memoise_coverage_path = "coverage_memoise",
  do_shiny_progress = FALSE,
  verbose = FALSE,
  ...
)

Arguments

gr

GRanges object

bwUrls

character vector of full file paths or web URLs to bigWig files, suitable for use by rtracklayer::import().

addGaps

logical indicating whether gaps between GRanges should be added to the query. Gaps are determined using getGRgaps(). In practice, addGaps=TRUE also loads the coverage data between exons, which can span a substantially larger region than the exons themselves. When addGaps=FALSE the coverage data is not loaded in intron/gap regions, and is therefore not displayed in downstream plots such as sashimi plots.

feature_type_colname, gap_feature_type, default_feature_type

When addGaps=TRUE a new column named using feature_type_colname is added to values(gr), whose value for gap regions is gap_feature_type. When feature_type_colname is already present in gr, it is not modified; otherwise the column is created with value default_feature_type. By default, this function adds a column "feature_type" with value "gap" for gap regions.
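The gap and feature type logic described above can be illustrated with a minimal base R sketch. It uses a plain data.frame as a stand-in for a GRanges object and computes the gaps by hand, so it is not the implementation used by getGRgaps(). The example coordinates are made up, and the sketch assumes the exons are sorted and non-overlapping on a single chromosome.

```r
# Exons on one chromosome, as a plain data.frame stand-in for a GRanges object.
exons <- data.frame(
  seqnames = "chr1",
  start = c(100, 300, 700),
  end = c(200, 450, 900)
)

feature_type_colname <- "feature_type"
gap_feature_type <- "gap"
default_feature_type <- "exon"

# Create the feature type column only when it is not already present.
if (!feature_type_colname %in% colnames(exons)) {
  exons[[feature_type_colname]] <- default_feature_type
}

# Each gap runs from one exon end + 1 to the next exon start - 1
# (assumes exons are sorted and do not overlap).
n <- nrow(exons)
gaps <- data.frame(
  seqnames = exons$seqnames[-n],
  start = exons$end[-n] + 1,
  end = exons$start[-1] - 1
)
gaps[[feature_type_colname]] <- gap_feature_type

# Combine exons and gaps, sorted by position, to form the coverage query.
query <- rbind(exons, gaps)
query <- query[order(query$start), ]
rownames(query) <- NULL
print(query)
```

In this sketch the query alternates between "exon" and "gap" rows, which is the region set for which coverage would be loaded when addGaps=TRUE.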

use_memoise

logical indicating whether to use memoise::memoise() to store coverage data in cache files, which can be re-used in subsequent R sessions, given consistent values for memoise_coverage_path. Note that the primary reason to use memoise during this step is that cache files are stored for each bigWig file and not for the set of bigWig files. For example, adding one bigWig file to bwUrls creates one new memoise cache file, and re-uses any pre-existing memoise cache files for the previously cached bwUrls entries.

memoise_coverage_path

character path to the folder used to store coverage data in memoise cache files. By default, the folder is a subfolder of the current working directory (see getwd()), so use an absolute path if the cache should be re-used from any working directory.
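The per-file caching behavior can be sketched with the memoise package directly. This is an illustration of the idea only, not the internal code of getGRcoverageFromBw(): load_one_bw() is a hypothetical stand-in for the real bigWig import, the file names are made up, and the cache folder is placed in a temporary directory.

```r
library(memoise)

# Hypothetical stand-in for the real bigWig import; counts how often it runs.
load_count <- 0
load_one_bw <- function(bwUrl) {
  load_count <<- load_count + 1
  nchar(bwUrl)  # placeholder for coverage data
}

# File-based cache, analogous in spirit to memoise_coverage_path.
cache_dir <- tempfile("coverage_memoise")
load_one_bw_m <- memoise(load_one_bw, cache = cache_filesystem(cache_dir))

# First request: two files, two real loads.
bwUrls <- c("sampleA.bw", "sampleB.bw")
cov1 <- lapply(bwUrls, load_one_bw_m)

# Second request adds one file: only the new file triggers a real load.
bwUrls2 <- c(bwUrls, "sampleC.bw")
cov2 <- lapply(bwUrls2, load_one_bw_m)

print(load_count)
```

Because the memoised function is called once per file, each file gets its own cache entry, so extending bwUrls does not invalidate the entries already stored.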

do_shiny_progress

logical indicating whether to update a shiny progress bar, using shiny::setProgress(). It assumes the progress bar has already been initialized.

verbose

logical indicating whether to print verbose output.

...

additional arguments are ignored.

Details

This function takes a GRanges object that defines genomic regions, and loads coverage data for those regions from each file in bwUrls.

Note that this function uses rtracklayer::import.bw(), which according to the rtracklayer documentation does not work on the Windows platform.

Update in version 0.0.68.900: This function was updated in two subtle ways to work around a bug in rtracklayer::import.bw(), which returns data sorted by chromosome in the order the chromosomes are indexed in the bigWig file, and within each chromosome returns entries in the order requested. getGRcoverageFromBw() was updated to:

  1. Confirm that the input gr GRanges contains names, assigning names as needed.

  2. Order the output coverage from rtracklayer::import.bw() by names(gr), so that the coverage is returned in the identical order as requested.

The updates above were made outside the scope of memoise file caching, so stored coverage cache files remain valid, and the order of named entries returned from the cache depends upon the order requested. If the cached coverage contains no names, entries are returned in the same order as stored. However, it is possible that the cache will be invalidated by the addition of names to gr; it is unclear exactly how deeply memoise checks such things.
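The naming and reordering steps can be sketched in base R. This simulates the ordering problem rather than calling rtracklayer: the region strings, the chromosome order of the bigWig index, and the placeholder coverage values are all assumptions made for illustration.

```r
# Requested regions, in request order; names are assigned if missing (step 1).
regions <- c("chr2:100-200", "chr1:100-200", "chr2:500-600", "chr1:500-600")
if (is.null(names(regions))) {
  names(regions) <- paste0("region", seq_along(regions))
}
region_chr <- sub(":.*", "", regions)

# Hypothetical order of chromosomes in the bigWig file index.
bw_chr_order <- c("chr1", "chr2")

# Simulate the import: grouped by chromosome in index order,
# keeping request order within each chromosome.
returned <- regions[order(match(region_chr, bw_chr_order))]
coverage <- lapply(returned, function(x) nchar(x))  # placeholder coverage

# Step 2: restore the requested order by matching on names.
coverage_ordered <- coverage[names(regions)]

print(names(coverage))
print(names(coverage_ordered))
```

Matching on names rather than on position is what makes the result independent of the order in which the file returns its chromosomes.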

Value

DataFrame object, whose colnames are defined using either names(bwUrls), or jamba::makeNames(basename(bwUrls)) followed by removing the .bw or .bigWig file extension, case-insensitively. Each column is of type IRanges::NumericList-class, which is a list of numeric coverage values.
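The column naming rule can be approximated in base R as follows. derive_colnames() is a hypothetical helper written for this illustration, and make.unique() is used as a base R stand-in for jamba::makeNames(), which formats duplicate suffixes differently. The sketch also assumes that the file extension is removed only from names derived from file names, which the description above does not state explicitly.

```r
# Hypothetical helper approximating how output column names are derived.
derive_colnames <- function(bwUrls) {
  # Use names(bwUrls) when they are supplied.
  if (!is.null(names(bwUrls))) {
    return(names(bwUrls))
  }
  # Otherwise use unique file names; make.unique() is a stand-in
  # for jamba::makeNames().
  cn <- make.unique(basename(bwUrls))
  # Remove a trailing .bw or .bigWig extension, ignoring case.
  sub("\\.(bw|bigwig)$", "", cn, ignore.case = TRUE)
}

unnamed_urls <- c(
  "/data/bigwig/sampleA.bw",
  "https://example.org/tracks/sampleB.bigWig"
)
named_urls <- c(control = "/data/bigwig/sampleC.BW")

print(derive_colnames(unnamed_urls))
print(derive_colnames(named_urls))
```

Supplying names(bwUrls) is the simplest way to control the column labels, for example when file names are long or not informative.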

See Also

Other jam GRanges functions: addGRLgaps(), addGRgaps(), annotateGRLfromGRL(), annotateGRfromGR(), assignGRLexonNames(), closestExonToJunctions(), combineGRcoverage(), exoncov2polygon(), findOverlapsGRL(), flattenExonsBy(), getFirstStrandedFromGRL(), getGRLgaps(), getGRgaps(), grl2df(), jam_isDisjoint(), make_ref2compressed(), sortGRL(), spliceGR2junctionDF(), stackJunctions()

Other jam RNA-seq functions: assignGRLexonNames(), closestExonToJunctions(), combineGRcoverage(), defineDetectedTx(), detectedTxInfo(), exoncov2polygon(), flattenExonsBy(), groups2contrasts(), internal_junc_score(), makeTx2geneFromGtf(), make_ref2compressed(), prepareSashimi(), runDiffSplice(), sortSamples(), spliceGR2junctionDF()


jmw86069/splicejam documentation built on April 21, 2024, 4:57 p.m.