APA: Aggregate Peak Analysis

View source: R/ARMLA.R

APA R Documentation

Aggregate Peak Analysis

Description

Performs multiple matrix lookups in Hi-C matrices for a two-dimensional set of locations, for example loops.

Usage

APA(
  explist,
  bedpe,
  dist_thres = NULL,
  size_bin = NULL,
  size_bp = NULL,
  outlier_filter = c(0, 0.995),
  anchors = NULL,
  raw = TRUE
)

Arguments

explist

Either a single GENOVA contacts object or a list of GENOVA contacts objects.

bedpe

A BEDPE-formatted data.frame with the following 6 columns:

  1. A character giving the chromosome names of the first coordinate.

  2. An integer giving the start positions of the first coordinate.

  3. An integer giving the end positions of the first coordinate.

  4. A character giving the chromosome names of the second coordinate.

  5. An integer giving the start positions of the second coordinate.

  6. An integer giving the end positions of the second coordinate.
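As an illustration of this layout, the following base R sketch builds a small data.frame with the six columns in the order listed above. The column names and coordinates are invented for the example; this page only specifies the column order and types.

```r
# Hypothetical loop calls; names and coordinates are invented for illustration.
loops <- data.frame(
  chrom1 = c("chr1", "chr1", "chr2"),
  start1 = c(1000000L, 5200000L, 300000L),
  end1   = c(1010000L, 5210000L, 310000L),
  chrom2 = c("chr1", "chr1", "chr2"),
  start2 = c(1500000L, 5900000L, 1100000L),
  end2   = c(1510000L, 5910000L, 1110000L),
  stringsAsFactors = FALSE
)

# Check the layout: (character, integer, integer) for each of the two coordinates.
col_types <- vapply(loops, class, character(1))
stopifnot(
  ncol(loops) == 6,
  identical(unname(col_types), rep(c("character", "integer", "integer"), 2))
)
```

A data.frame shaped like this could then be passed as the bedpe argument.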

dist_thres

An integer vector of length 2 indicating the minimum and maximum distances in basepairs between anchor points.
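A distance filter of this kind can be sketched in base R as follows. This is not GENOVA's implementation: filter_by_distance is a hypothetical helper, and it assumes that distance is measured between the midpoints of the two coordinates, that the thresholds are inclusive, and that both coordinates lie on the same chromosome. None of these details is stated on this page.

```r
# Hypothetical helper, not part of GENOVA.
# Assumption: distance is taken between interval midpoints, thresholds inclusive.
filter_by_distance <- function(bedpe, dist_thres) {
  stopifnot(length(dist_thres) == 2)
  mid1 <- (bedpe[[2]] + bedpe[[3]]) / 2   # midpoint of first coordinate
  mid2 <- (bedpe[[5]] + bedpe[[6]]) / 2   # midpoint of second coordinate
  d <- abs(mid2 - mid1)
  keep <- d >= min(dist_thres) & d <= max(dist_thres)
  bedpe[keep, , drop = FALSE]
}

# Toy BEDPE with anchor distances of 50 kb, 500 kb and 4.9 Mb.
toy <- data.frame(
  chrom1 = "chr1",
  start1 = c(100000L, 100000L, 100000L),
  end1   = c(110000L, 110000L, 110000L),
  chrom2 = "chr1",
  start2 = c(150000L, 600000L, 5000000L),
  end2   = c(160000L, 610000L, 5010000L)
)

# Keep only pairs between 100 kb and 1 Mb apart.
kept <- filter_by_distance(toy, c(100000, 1000000))
```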

size_bin, size_bp

size_bin is the size of the lookup regions in bins (e.g. a value of 21 yields an output with 10 Hi-C bins both up- and downstream of the anchor). When both size_bin and size_bp are NULL (default), size_bin is internally set to 21. size_bp is an alternative parametrisation of the lookup region size, expressed in basepairs. size_bp is ignored when size_bin is set.

outlier_filter

A numeric vector of length 2 with values in [0, 1] indicating the quantiles of the data to be used as thresholds. Data outside these thresholds are set to the nearest threshold. Setting this to c(0, 1) performs no outlier correction.
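The effect of such a filter can be sketched in base R as below. clamp_to_quantiles is a hypothetical helper written for this illustration, not a GENOVA function, and which values APA() computes the quantiles over internally is not stated on this page.

```r
# Hypothetical helper: set values outside the given quantiles to the nearest threshold.
clamp_to_quantiles <- function(x, probs = c(0, 0.995)) {
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  pmin(pmax(x, limits[1]), limits[2])
}

set.seed(1)
x <- c(rexp(999), 1000)   # simulated values with one extreme outlier

clamped   <- clamp_to_quantiles(x)            # outlier pulled down to the 99.5% quantile
unchanged <- clamp_to_quantiles(x, c(0, 1))   # thresholds are the minimum and maximum

stopifnot(max(clamped) < 1000)
stopifnot(isTRUE(all.equal(unchanged, x)))
```

With c(0, 1) the thresholds coincide with the minimum and maximum of the data, which is why no value is altered.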

anchors

(Optional) A matrix with two columns containing pre-computed anchor indices. If this is set, skips calculation of anchor indices and uses this argument instead. See anchors_APA().

raw

A logical of length 1: should the raw array underlying the summary matrices be returned in the output? Should be TRUE if the intention is to use the quantify function.

Details

For each row in the 'bedpe' or 'anchors' argument, a size_bin x size_bin region centered on that location is retrieved. These data are then summarised by taking the mean of every element in these matrices across all locations.
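The lookup-and-average idea can be sketched on toy data with base R. This is a simplified illustration, not the rep_mat_lookup implementation: the contact matrix is simulated, the anchor bin indices are invented, and edge handling, empty anchors and outlier filtering are left out.

```r
set.seed(42)
n_bins <- 100

# Toy symmetric matrix standing in for the contacts of one Hi-C experiment.
mat <- matrix(rpois(n_bins^2, lambda = 5), n_bins, n_bins)
mat <- mat + t(mat)

# Hypothetical anchors: one (row bin, column bin) pair per location.
anchors <- cbind(c(20, 35, 50), c(40, 60, 80))

size_bin <- 21
flank    <- (size_bin - 1) / 2   # 10 bins up- and downstream of the anchor
offsets  <- seq(-flank, flank)

# One size_bin x size_bin window per anchor, stacked into an
# n x size_bin x size_bin array.
raw <- array(NA_real_, dim = c(nrow(anchors), size_bin, size_bin))
for (i in seq_len(nrow(anchors))) {
  raw[i, , ] <- mat[anchors[i, 1] + offsets, anchors[i, 2] + offsets]
}

# Element-wise mean across locations gives the summary matrix.
signal <- apply(raw, c(2, 3), mean)
```

The central element signal[11, 11] is the mean contact value at the anchors themselves, and the surrounding elements describe the local background.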

The 'bedpe' argument is converted internally to an 'anchors' object.

Value

An APA_discovery object containing the following slots:

signal

An array with the dimensions size_bin x size_bin x length(explist) containing mean contact values for bins surrounding the anchors.

signal_raw

A list of length(explist) elements, one per contacts object, in which each element is an n x size_bin x size_bin array with contact values for each anchor. 'n' is the number of non-empty valid anchors.

Resolution recommendation

10kb-20kb

See Also

The rep_mat_lookup function that performs the lookup and summary for the APA function and others.
The discovery class for a general description of discovery classes.
The visualise function for visualisation of the results.
The quantify function for quantification of loop strengths.
The anchors documentation for more information about anchors.

Other aggregate repeated matrix lookup analyses: ARA(), ATA(), CSCAn(), PESCAn(), rep_mat_lookup()

Examples

## Not run: 
# Typical usage: APA for loops
apa <- APA(list(WT = WT_10kb, KO = KO_10kb), bedpe = WT_loops)

# Alternative usage with pre-calculated anchors
anchors <- anchors_APA(WT_10kb$ABS, resolution(WT_10kb),
  bedpe = WT_loops
)
apa <- APA(list(WT = WT_10kb, KO = KO_10kb), anchors = anchors)

# Visualising results
visualise(apa)

## End(Not run)

robinweide/GENOVA documentation built on March 14, 2024, 11:16 p.m.