makeGRangesBRG: Constructing and checking for base-pair resolution GRanges...

View source: R/dataset_functions.R

makeGRangesBRGR Documentation

Constructing and checking for base-pair resolution GRanges objects

Description

makeGRangesBRG splits up all ranges in dataset.gr to be each 1 basepair wide. For any range that is split up, all metadata information belonging to that range is inherited by its daughter ranges, and therefore the transformation is non-destructive. isBRG checks whether an object is a basepair resolution GRanges object.

Usage

makeGRangesBRG(dataset.gr, ncores = getOption("mc.cores", 2L))

isBRG(x)

Arguments

dataset.gr

A disjoint GRanges object, or a list of such objects.

ncores

If dataset.gr is a list, the number of cores to use for computations.

x

Object to be tested.

Details

Note that makeGRangesBRG doesn't perform any transformation on the metadata in the input. This function assumes that for an input GRanges object, any metadata for each range is equally correct when inherited by each individual base in that range. In other words, the dataset's "signal" (usually readcounts) fundamentally belongs to a single basepair position.

Value

makeGRangesBRG returns a GRanges object for which length(output) == sum(width(dataset.gr)), and for which all(width(output) == 1).

isBRG(x) returns TRUE if x is a GRanges object with the above characteristics.

Motivation

The motivating case for this function is a bigWig file (e.g. one imported by rtracklayer), as bigWig files typically use run-length compression on the data signal (the 'score' column), such that adjacent bases sharing the same signal are combined into a single range. As basepair-resolution genomic data is typically sparse, this compression has a minimal impact on memory usage, and removing it greatly enhances data handling as each index (each range) of the GRanges object corresponds to a single genomic position.

Generating basepair-resolution GRanges from whole reads

If working with a GRanges object containing whole reads, one can obtain base-pair resolution information by using the strand-specific function GenomicRanges::resize to select a single base from each read: set width = 1 and use the fix argument to choose the strand-specific 5' or 3' end. Then, strand-specific coverage can be calculated using getStrandedCoverage.

On the use of GRanges instead of GPos

The GPos class is a more suitable container for data of this type, as the GPos class is specific to 1-bp-wide ranges. However, in early testing, we encountered some kind of compatibility limitations with the newer GPos class, and have not re-tested it since. If you have feedback on switching to this class, please contact the author. Users can readily coerce a basepair-resolution GRanges object to a GPos object via gp <- GPos(gr, score = score(gr)).

Author(s)

Mike DeBerardine

See Also

getStrandedCoverage, GenomicRanges::resize()

Examples

if (.Platform$OS.type == "unix") {

    #--------------------------------------------------#
    # Make a bigWig file single width
    #--------------------------------------------------#

    # get local address for an included bigWig file
    bw_file <- system.file("extdata", "PROseq_dm6_chr4_plus.bw",
                           package = "BRGenomics")

    # BRGenomics::import_bigWig automatically applies makeGRangesBRG;
    # therefore will import using rtracklayer
    bw <- rtracklayer::import.bw(bw_file)
    strand(bw) <- "+"

    range(width(bw))
    length(bw)

    # make basepair-resolution (single-width)
    gr <- makeGRangesBRG(bw)

    isBRG(gr)
    range(width(gr))
    length(gr)
    length(gr) == sum(width(bw))
    sum(score(gr)) == sum(score(bw) * width(bw))

    #--------------------------------------------------#
    # Reverse using getStrandedCoverage
    #--------------------------------------------------#
    # -> for more examples, see getStrandedCoverage

    undo <- getStrandedCoverage(gr, ncores = 1)

    isBRG(undo)
    range(width(undo))
    length(undo) == length(bw)
    all(score(undo) == score(bw))

}

mdeber/BRGenomics documentation built on Aug. 3, 2024, 3:43 a.m.