Determine the count for a variety of local neighborhoods around a bin pair, for use in computing peak enrichment statistics.
an InteractionSet object containing bin pair counts, generated by
an integer scalar, specifying the number of bins to consider as the local neighborhood
an integer scalar, specifying the number of bins to exclude from the neighborhood
a string or integer scalar, specifying the assay containing bin pair counts in
a character vector containing 4 unique names for the neighborhood regions A-D, see below
An object of the same type as
data is returned, containing additional matrices in the
Each matrix contains the counts for one neighborhood for each bin pair in each library.
The area of each neighborhood is also returned in the
mcols and named with the
Consider the coordinates of the interaction space in terms of bins, and focus on any particular bin pair (named here as the target bin pair).
This target bin pair is characterized by four neighborhood regions, from A to D.
Region A (named
"quadrant") is a square with side lengths equal to
flank, positioned such that the target bin pair lies at the corner furthest from the diagonal (only used for intra-chromosomal targets).
Region B (named
"vertical") is a vertical rectangle with dimensions
(1, flank*2+1), containing the target bin pair at the center.
Region C (named
"horizontal") is the horizontal counterpart to B.
Region D (named
"surrounding") is a square with side lengths equal to
flank*2+1, where the target bin pair is positioned in the center.
Obviously, the target bin pair itself is excluded in the definition of each neighborhood.
exclude is positive, additional bin pairs closest to the target will also be excluded.
For example, region A* is constructed with
exclude instead of
flank, and the resulting area is excluded from region A (and so on for all other regions).
This avoids problems where diffuse interactions are imperfectly captured by the target bin pair, such that genuine interactions spill over into the neighborhood.
Spill-over is undesirable as it will inflate the size of the neighborhood counts for genuine interactions.
Setting a larger
exclude ensures that this does not occur.
The size of
flank requires consideration, as it defines the size of each neighborhood region.
If the value is too large, other peaks may be included in the background such that the neighborhood count size is inflated.
On the other hand, if
flank is too small, there will not be enough neighborhood bin pairs to dilute the increase in counts from spill-over.
Both scenarios result in a decrease in enrichment values and loss of power to detect punctate events.
The default value of 5 seems to work well, though users may wish to test several values for themselves.
For each bin pair, the other bin pairs in
data that belong to its neighborhood are identified.
The sum of counts across these bin pairs is computed for each library and stored in a matrix.
This is repeated for each type of neighborhood (A-D), and the matrices are named based on
The area of each neighborhood is also computed in terms of the number of bin pairs contained by the neighborhood.
Note that the neighborhood area includes bin pairs that are missing from
data, as these are assumed to have a count of zero.
filterPeaks for how these neighborhood counts are used to assess the “peak-ness” of each bin pair.
Rao S et al. (2014). A 3D map of the human genome at kilobase resolution reveals principles of chromatin looping. Cell. 159, 1665-1690.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19
# Setting up the object. a <- 10 b <- 20 regions <- GRanges(rep(c("chrA", "chrB"), c(a, b)), IRanges(c(1:a, 1:b), c(1:a, 1:b))) set.seed(23943) all.anchor1 <- sample(length(regions), 50, replace=TRUE) all.anchor2 <- as.integer(runif(50, 1, all.anchor1+1)) data <- InteractionSet(matrix(as.integer(rnbinom(200, mu=10, size=10)), 50, 4), GInteractions(anchor1=all.anchor1, anchor2=all.anchor2, regions=regions, mode="reverse"), colData=DataFrame(lib.size=1:4*1000), metadata=List(width=1)) data$totals <- colSums(assay(data)) # Getting peaks. head(enrichedPairs(data)) head(enrichedPairs(data, flank=3)) head(enrichedPairs(data, flank=1)) head(enrichedPairs(data, exclude=1))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.