binRegion: Divide region into similarly sized bins.


View source: R/binning_aggregation.R

Description

Given a start, end, and number of bins, this function splits the region into bins. Because of rounding, the bins are only approximately the same size (they should not differ in size by more than 1).

Usage

binRegion(start, end, bins, idDF = NULL, strand = "*")

Arguments

start

Coordinate for beginning of range/region.

end

Coordinate for end of range/region.

bins

Number of bins to divide this range/region into.

idDF

A string/vector of strings that gives the chromosome (e.g. "chr1") for the given start and end values.

strand

"strand" column of the data.table (or single strand value if binRegion is only used on one region). Default is "*".

Details

Use case: take a set of regions, like CG islands, and bin them; now you can aggregate signal scores across the bins, giving you an aggregate signal in bins across many regions of the same type.

In theory, this just runs on 3 values, but you can run it inside a data.table j expression to divide a bunch of regions in the same way.
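The arithmetic for a single region can be sketched in base R as below. This is only an illustration of the idea, not MIRA's actual implementation: the helper name `sketchBinOneRegion` is hypothetical, and the choice to let adjacent bins share a break point is an assumption of the sketch.

```r
# Hypothetical helper, not part of MIRA: split one region into `bins` bins.
sketchBinOneRegion <- function(start, end, bins) {
    # bins + 1 evenly spaced break points, rounded to whole coordinates
    breaks <- round(seq(start, end, length.out = bins + 1))
    data.frame(
        start = breaks[-length(breaks)],  # every break except the last
        end = breaks[-1],                 # every break except the first
        binID = seq_len(bins)
    )
}

# One region from 100 to 500 divided into 15 bins
oneRegion <- sketchBinOneRegion(100, 500, 15)
binWidths <- oneRegion$end - oneRegion$start
print(oneRegion)
print(range(binWidths))
```

Rounding the break points is what makes the bin widths unequal: 400 is not divisible by 15, so some bins end up one coordinate wider than others.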

Value

A data.table, expanded to nrow = number of bins, with these id columns:

id: region ID

binID: repeating ID (this is the value to aggregate across)

ubinID: unique bin IDs
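To illustrate why binID is the value to aggregate across, here is a minimal sketch with a toy table. The table layout is assumed from the column descriptions above, the `score` column and its values are made up for illustration, and base R is used so the sketch has no dependencies (with data.table you would group with `by = binID` instead).

```r
# Toy stand-in for binned output: 3 regions, 4 bins each (assumed layout)
toyBins <- data.frame(
    id = rep(1:3, each = 4),   # region ID
    binID = rep(1:4, times = 3),  # repeats within every region
    ubinID = 1:12              # unique across all bins
)

# Made-up signal score for each bin, for illustration only
toyBins$score <- c(1, 2, 3, 4,
                   3, 4, 5, 6,
                   5, 6, 7, 8)

# Average the score over all regions, separately for each bin position
meanByBin <- tapply(toyBins$score, toyBins$binID, mean)
print(meanByBin)
```

Grouping by binID pools the first bin of every region, the second bin of every region, and so on, which gives one aggregate value per bin position. Grouping by ubinID would not pool anything, since every bin has its own value.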

Examples

library(data.table)
start <- c(100, 1000, 3000)
end <- c(500, 1400, 3400)
chr <- c("chr1", "chr1", "chr2")
strand <- c("*", "*", "*")
# strand not included in object 
# since MIRA assumes "*" already unless given something else
regionsToBinDT <- data.table(chr, start, end)
numberOfBins <- 15
# data.table "j command" using column names and numberOfBins variable
binnedRegionDT <- regionsToBinDT[, binRegion(start, end, numberOfBins, chr)]

databio/MIRA documentation built on April 16, 2020, 9:53 p.m.