ARA: Aggregate Region Analysis

View source: R/ARMLA.R

ARA    R Documentation

Aggregate Region Analysis

Description

Extracts Hi-C matrices centered around regions and averages the results for all regions. Can take orientations of regions into account.

Usage

ARA(
  explist,
  bed,
  shift = 1e+06,
  size_bin = NULL,
  size_bp = NULL,
  outlier_filter = c(0, 1),
  anchors = NULL,
  raw = TRUE,
  strand = NULL
)

Arguments

explist

Either a single GENOVA contacts object or a list of GENOVA contacts objects.

bed

A BED-formatted data.frame with the following 3 columns:

  1. A character giving the chromosome names.

  2. An integer with start positions.

  3. An integer with end positions.

Note that rows wherein the second column is larger than the third column are considered to be in the reverse direction.
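
As a minimal sketch of the expected input, the data.frame below encodes two forward sites and one reverse site. The object name `sites`, the column names and the coordinates are made up for illustration; only the three-column layout and the start > end convention come from the description above.

```r
# Hypothetical BED-like data.frame; names and coordinates are illustrative only.
sites <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(1200000L, 5400000L, 3100000L),
  end   = c(1200500L, 5399500L, 3100500L),
  stringsAsFactors = FALSE
)

# Rows where start > end are the ones considered to be in the reverse direction.
is_reverse <- sites$start > sites$end
is_reverse
```

Here only the second row has a start position larger than its end position, so it is the only one that would be read as reverse-oriented.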

shift

An integer of length 1 indicating by how many basepairs the anchors should be shifted. Essentially performs a circular permutation of size shift to give a reasonable estimate of background. The argument is ignored when shift <= 0.

size_bin, size_bp

The size of the lookup regions in bins (e.g. a value of 21 yields an output with 10 Hi-C bins both upstream and downstream of the anchor). When NULL (default), it is internally set to 21, provided size_bp is also NULL. size_bp is an alternative parametrisation of the lookup regions, expressed in basepairs; it is not used when size_bin is set.
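
The arithmetic behind the 21-bin example can be sketched as follows. This is plain illustrative R, not code taken from GENOVA; the names `flank` and `offsets` are assumptions introduced here.

```r
# Illustrative arithmetic only; not taken from the GENOVA source.
size_bin <- 21L

# One bin holds the anchor itself; the rest is split evenly on both sides.
flank <- (size_bin - 1L) %/% 2L

# Relative bin positions covered by the lookup, centred on the anchor (0).
offsets <- seq(-flank, flank)

flank
length(offsets)
```

An odd size_bin keeps the anchor bin in the exact centre of the output matrix.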

outlier_filter

A numeric of length 2 with values in [0, 1], giving the lower and upper quantiles of the data to be used as thresholds. Data outside these thresholds are set to the nearest threshold. Setting this to c(0, 1) performs no outlier correction.
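
The thresholding described above resembles quantile-based clipping. The sketch below shows that idea on a plain numeric vector; `clip_to_quantiles` is a hypothetical helper written for this illustration, and how GENOVA applies the quantiles internally (for example, over which part of the data) is not specified here.

```r
# Hypothetical helper illustrating quantile clipping; not part of GENOVA.
clip_to_quantiles <- function(x, probs = c(0, 1)) {
  # Lower and upper thresholds taken from the data themselves.
  limits <- quantile(x, probs = probs, na.rm = TRUE, names = FALSE)
  # Values outside the thresholds are set to the nearest threshold.
  pmin(pmax(x, limits[1]), limits[2])
}

x <- c(1, 2, 3, 4, 100)

# c(0, 1) uses the minimum and maximum as thresholds, so nothing changes.
unfiltered <- clip_to_quantiles(x, c(0, 1))

# Narrower quantiles pull extreme values towards the bulk of the data.
filtered <- clip_to_quantiles(x, c(0.25, 0.75))
```

With the default quantile type in R, the 0.25 and 0.75 quantiles of `x` are 2 and 4, so the extreme value 100 is replaced by 4 in `filtered`.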

anchors

(Optional) A matrix with two columns containing pre-computed anchor indices. If this is set, skips calculation of anchor indices and uses this argument instead. See anchors_ARA().

raw

A logical of length 1: should the raw array underlying the summary matrices be returned in the output? Should be TRUE if the intention is to use the quantify function.

strand

A character vector of length nrow(bed). Overrides any attempt to infer strand from start > end information.

Details

By default, ARA also calculates the results for shifted anchors and normalises the "obsexp" slot by off-diagonal bands.

The 'bed' argument can take in oriented entries, wherein entries with a start site larger than the end site are considered to be in the reverse direction. The reverse sites are flipped during analysis, so the orientation is the same as in the forward sites.
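
One way to picture the flip is to reverse both the row and the column order of a square lookup matrix, which swaps upstream and downstream while leaving the anchor bin in the centre. The toy example below assumes that this is the relevant operation; the actual GENOVA internals may differ.

```r
# Toy 3 x 3 lookup centred on the anchor bin; values are placeholders.
m <- matrix(1:9, nrow = 3)

# Reverse rows and columns so that upstream and downstream trade places.
flipped <- m[rev(seq_len(nrow(m))), rev(seq_len(ncol(m)))]

m
flipped
```

The centre element is unchanged by this operation, while the corners are exchanged.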

Value

An ARA_discovery object containing the following slots:

obsexp

An array with the dimensions size_bin x size_bin x length(explist) containing fold change values for the signal over the (banded) median of the shifted values.

signal

An array with the dimensions size_bin x size_bin x length(explist) containing mean contact values for bins surrounding the anchors.

signal_raw

A list of length(explist) elements, one for each contacts object, wherein each element is an n x size_bin x size_bin array with contact values for each anchor. 'n' is the number of non-empty valid anchors.

shifted

An array with the dimensions size_bin x size_bin x length(explist) containing mean contact values for bins that are shift basepairs away from the anchors.

shifted_raw

A list of length(explist) elements, one for each contacts object, wherein each element is an n x size_bin x size_bin array with contact values for each shifted anchor. 'n' is the number of non-empty valid (unshifted) anchors.
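
The dimensions listed above suggest how a raw array relates to a summary matrix: averaging over the first (anchor) dimension yields a size_bin x size_bin matrix. The sketch below uses a mock array filled with random numbers in place of a real signal_raw element; whether the signal slot is exactly this average (for instance after outlier filtering) is an assumption.

```r
set.seed(1)
n <- 5          # number of anchors in the mock data
size_bin <- 3   # small window to keep the example readable

# Mock stand-in for one element of signal_raw: n x size_bin x size_bin.
raw <- array(runif(n * size_bin * size_bin), dim = c(n, size_bin, size_bin))

# Average over the anchor dimension, keeping the two bin dimensions.
avg <- apply(raw, c(2, 3), mean)

dim(avg)
```

Each cell of `avg` is the mean of the corresponding cell across all mock anchors.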

Resolution recommendation

10kb-40kb

See Also

The rep_mat_lookup function that performs the lookup and summary for the ARA function and others.
The discovery class for a general description of discovery classes.
The visualise function for visualisation of the results.
The quantify function for quantification of ARA results.
The anchors documentation for more information about anchors.

Other aggregate repeated matrix lookup analyses: APA(), ATA(), CSCAn(), PESCAn(), rep_mat_lookup()

Examples

## Not run: 
# Typical usage
ara <- ARA(list(WT_20kb, KO_20kb), ctcf_sites)

# Alternative usage with pre-calculated anchors
anchors <- anchors_ARA(WT_20kb$IDX, ctcf_sites)
ara <- ARA(list(WT_20kb, KO_20kb), anchors = anchors)

# Visualisation
visualise(ara)

## End(Not run)

robinweide/GENOVA documentation built on March 14, 2024, 11:16 p.m.