rep_mat_lookup: Repeated Hi-C matrix lookup

View source: R/methods_matrix_lookup.R

rep_mat_lookup    R Documentation

Description

Analyses such as aggregate peak analysis (APA) and paired-end spatial chromatin analysis (PE-SCAn) require looking up regions of a Hi-C matrix repeatedly. This function provides a common method for performing and summarising these repeated lookups.

Usage

rep_mat_lookup(
  explist,
  anchors,
  rel_pos,
  shift = 0,
  outlier_filter = c(0, 1),
  raw = FALSE
)

Arguments

explist

Either a single GENOVA contacts object or a list of GENOVA contacts objects.

anchors

A matrix with two columns containing pre-computed anchor indices. See the anchors documentation.

rel_pos

An integer vector of relative positions, in bins, giving the range around each anchor to look up.

shift

An integer of length 1 indicating by how many basepairs the anchors should be shifted. Essentially performs a circular permutation of size shift to give a reasonable estimate of the background. The argument is ignored when shift <= 0.

outlier_filter

A numeric of length 2 with values in [0, 1], indicating the quantiles of the data to be used as thresholds. Data outside these thresholds are set to the nearest threshold. Setting this to c(0, 1) performs no outlier correction.

raw

A logical of length 1: should the raw array underlying the summary matrices be returned in the output? Should be TRUE if the intention is to use the quantify function.
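
The thresholding described for outlier_filter can be illustrated with a minimal base R sketch. It assumes plain quantile clamping on a toy vector and is not GENOVA's internal code; the data and variable names are made up for illustration.

```r
# Toy data with one extreme value; not taken from a real Hi-C experiment.
x <- c(1, 2, 3, 4, 100)
outlier_filter <- c(0, 0.75)

# Quantiles at the two requested probabilities act as thresholds.
thresholds <- quantile(x, probs = outlier_filter, names = FALSE)

# Values outside the thresholds are set to the nearest threshold.
clamped <- pmin(pmax(x, thresholds[1]), thresholds[2])

# With c(0, 1) the thresholds are the minimum and maximum, so nothing changes.
no_filter <- quantile(x, probs = c(0, 1), names = FALSE)
unchanged <- pmin(pmax(x, no_filter[1]), no_filter[2])
```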

Details

For each row in the anchors argument, the region of the Hi-C matrix corresponding to that anchor is looked up. These data are then summarised by taking the mean of each position relative to the anchor across all anchors.
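
As a rough illustration of the lookup and summary, the sketch below works on a small dense matrix in base R. It assumes anchors are plain bin indices into that matrix, which is a simplification; real GENOVA contacts objects store their data differently, and the values here are made up.

```r
# Toy symmetric matrix of 20 x 20 bins with distance-decaying values.
n <- 20
mat <- outer(seq_len(n), seq_len(n), function(i, j) 1 / (1 + abs(i - j)))

# Two-column matrix of anchor bin indices (row bin, column bin).
anchors <- cbind(c(5, 8, 12), c(10, 15, 17))
rel_pos <- -2:2

# One slice per anchor: rows and columns offset by rel_pos around the anchor.
lookup <- vapply(
  seq_len(nrow(anchors)),
  function(i) mat[anchors[i, 1] + rel_pos, anchors[i, 2] + rel_pos],
  matrix(0, length(rel_pos), length(rel_pos))
)

# Mean of each relative position across all anchors.
summary_mat <- apply(lookup, c(1, 2), mean)
```

Here lookup is a length(rel_pos) x length(rel_pos) x number-of-anchors array, comparable in spirit to what is returned when raw = TRUE, and summary_mat is its per-position mean.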

Anchors are subject to a filtering step wherein anchors are discarded when they are within rel_pos range of a chromosome start or end. This ensures the anchors all report data from the same chromosome in the x- or y-direction.
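
The boundary filter can be sketched as follows. The chromosome bin ranges and the in_bounds helper are hypothetical and only meant to show the idea; the actual filtering in GENOVA may differ in detail.

```r
# Hypothetical bin ranges per chromosome: bins 1-100 are chr1, 101-180 chr2.
chrom_start <- c(chr1 = 1, chr2 = 101)
chrom_end   <- c(chr1 = 100, chr2 = 180)

anchors <- cbind(c(3, 50, 99, 105, 150), c(20, 70, 100, 130, 170))
rel_pos <- -5:5

# TRUE when every looked-up bin around `idx` stays on the chromosome of `idx`.
in_bounds <- function(idx, rel_pos, starts, ends) {
  chrom <- findInterval(idx, starts)  # index of the chromosome holding idx
  idx + min(rel_pos) >= starts[chrom] & idx + max(rel_pos) <= ends[chrom]
}

# An anchor is kept only if both of its bins pass the check.
keep <- in_bounds(anchors[, 1], rel_pos, chrom_start, chrom_end) &
  in_bounds(anchors[, 2], rel_pos, chrom_start, chrom_end)
filtered <- anchors[keep, , drop = FALSE]
```

In this toy case the first, third and fourth anchors are dropped because at least one of their bins lies within five bins of a chromosome boundary.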

For shifted anchors that would otherwise be discarded, an attempt is first made to shift them in the opposite direction.

When a region corresponding to a non-shifted anchor is looked up and is found to have no contacts within that region, it is discarded. Shifted regions are only discarded when the corresponding non-shifted anchor is discarded.
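
A minimal sketch of this rule, assuming the looked-up regions are held in three-dimensional arrays with one slice per anchor (the array layout and values are assumptions for illustration):

```r
# Toy lookup array: 3 x 3 regions for four anchors (made-up values).
lookup <- array(0, dim = c(3, 3, 4))
lookup[, , 1] <- 1
lookup[2, 2, 3] <- 5
lookup[, , 4] <- 2

# Matching shifted regions, used as background.
shifted <- array(1, dim = c(3, 3, 4))

# A non-shifted region is kept only if it holds at least one contact.
has_contacts <- apply(lookup, 3, function(region) sum(region) > 0)

lookup_kept <- lookup[, , has_contacts, drop = FALSE]
# Shifted regions follow the fate of their non-shifted counterpart.
shifted_kept <- shifted[, , has_contacts, drop = FALSE]
```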

Anchors typically contain a 'type' attribute which informs rep_mat_lookup how the lookup should occur. The APA, PESCAn and ARA anchors look up regions of dimensions length(rel_pos) x length(rel_pos). The ARA anchors transpose these square regions when given a '-' direction before the summary occurs. The ATA anchors look up square regions of size anchors[, 2] - anchors[, 1] and resize these to a max(rel_pos) square region through bilinear interpolation before the summary.
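
The two type-specific steps, transposition for '-' direction ARA anchors and bilinear resizing for ATA anchors, might look roughly like this in base R. The resize_bilinear helper is hypothetical and written for this example only; it is not a GENOVA function and GENOVA's own interpolation may handle edges differently.

```r
# ARA-style orientation: transpose the region for a '-' direction.
region <- matrix(1:9, 3, 3)
direction <- "-"
oriented <- if (direction == "-") t(region) else region

# Resize a square matrix (at least 2 x 2) to out_n x out_n by bilinear
# interpolation. Hypothetical helper for illustration.
resize_bilinear <- function(m, out_n) {
  in_n <- nrow(m)
  # Fractional source coordinate for each target row/column.
  pos <- seq(1, in_n, length.out = out_n)
  lo <- pmin(floor(pos), in_n - 1)  # lower neighbour index
  w <- pos - lo                     # weight of the upper neighbour
  # Interpolate along rows first, then along columns.
  rows <- (1 - w) * m[lo, , drop = FALSE] + w * m[lo + 1, , drop = FALSE]
  t((1 - w) * t(rows[, lo, drop = FALSE]) +
      w * t(rows[, lo + 1, drop = FALSE]))
}

# ATA-style resize: a 2 x 2 region stretched to max(rel_pos) = 3 bins.
rel_pos <- 1:3
small <- matrix(c(0, 2, 2, 4), 2, 2)
resized <- resize_bilinear(small, max(rel_pos))
```

Resizing every region to a common size is what allows regions of different lengths to be averaged position by position.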

Value

A list of the same length as explist wherein list elements contain the results of the repeated lookup per experiment.

Resolution recommendation

10kb-40kb

See Also

Other aggregate repeated matrix lookup analyses: APA(), ARA(), ATA(), CSCAn(), PESCAn()


robinweide/GENOVA documentation built on March 14, 2024, 11:16 p.m.