anchors: Anchors for Hi-C

anchors R Documentation

Anchors for Hi-C

Description

Anchors are indices to the matrix in a contacts object. Anchor functions translate genomic coordinates, typically BED-formatted data.frames, into indices corresponding to locations in the Hi-C matrix. Anchors are used in aggregate analysis functions to indicate where parts of the matrix should be looked up. Anchors come in different types, depending on the aggregate analysis functions they are used for.
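
As a rough illustration of what "translating genomic coordinates into indices" means, the base-R sketch below looks up the bin that contains the midpoint of each BED entry. The toy index table, its column names, the helper coord_to_index() and the use of midpoints are all assumptions made for this example; they are not GENOVA's actual data structures or implementation.

```r
# Toy stand-in for an index table: one row per 20 kb bin (hypothetical layout).
res <- 20000L
idx <- data.frame(
  chrom = rep(c("chr1", "chr2"), each = 5),
  start = rep((0:4) * res, times = 2),
  index = 1:10,
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the index of the bin containing each position.
coord_to_index <- function(idx, chrom, pos, res) {
  bin_start <- (pos %/% res) * res
  key <- paste(idx$chrom, idx$start)
  idx$index[match(paste(chrom, bin_start), key)]
}

bed <- data.frame(
  chrom = c("chr1", "chr2"),
  start = c(45000L, 10000L),
  end   = c(47000L, 12000L),
  stringsAsFactors = FALSE
)

# Use the midpoint of every BED entry as the position to look up.
mid <- (bed$start + bed$end) %/% 2L
coord_to_index(idx, bed$chrom, mid, res)
```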

Usage

anchors_PESCAn(
  IDX,
  res,
  bed,
  dist_thres = c(5000000L, Inf),
  min_compare = 10L,
  mode = c("cis", "trans", "both")
)

anchors_CSCAn(
  IDX,
  res,
  bedlist,
  dist_thres = c(50000, 2e+06),
  min_compare = 10L,
  mode = c("cis", "trans", "both"),
  group_direction = FALSE
)

anchors_APA(
  IDX,
  res,
  bedpe,
  dist_thres = c(0, Inf),
  mode = c("both", "cis", "trans")
)

anchors_ATA(IDX, bed, dist_thres = c(225000, Inf), padding = 1)

anchors_ARA(IDX, bed, strand = NULL)

anchors_extendedloops(IDX, res, bedpe, dist_thres = c(30000, 3e+06))

Arguments

IDX

The indices slot of a GENOVA contacts object.

res

The resolution attribute of a GENOVA contacts object. Not used for ATA or ARA.

bed, bedlist

A BED-formatted data.frame with the following 3 columns:

  1. A character giving the chromosome names.

  2. An integer with start positions.

  3. An integer with end positions.

For the bedlist variant, a named list of such data.frames (C-SCAn only).
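
A minimal sketch of inputs in this shape, built with base R. The column names, the coordinates and the list names 'enhancers' and 'promoters' are made up for illustration; the description above only fixes the order and types of the three columns.

```r
# A BED-formatted data.frame: chromosome, start, end (hypothetical values).
bed <- data.frame(
  chrom = c("chr1", "chr1", "chr2"),
  start = c(100000L, 500000L, 250000L),
  end   = c(105000L, 505000L, 255000L),
  stringsAsFactors = FALSE
)

# A bedlist: a named list of such data.frames.
bedlist <- list(enhancers = bed, promoters = bed[1:2, ])
```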

dist_thres

An integer vector of length 2 indicating the minimum and maximum distances in basepairs between anchor points. For ATA-type anchors, the minimum and maximum sizes of TADs.

min_compare

An integer vector of length 1 indicating the minimum number of pairwise interactions on a chromosome to consider. PE-SCAn and C-SCAn only.

mode

A character vector of length 1 indicating which interactions to retain. Possible values: "cis", "trans" or "both". PE-SCAn, C-SCAn and APA only.

group_direction

A logical of length 1 which when TRUE will mirror groups for anchors where the left anchor location is larger than the right anchor location. Left and right refer to bedlist elements generating combinations. CSCAn only.

bedpe

A BEDPE-formatted data.frame with the following 6 columns. APA and extended loops only:

  1. A character giving the chromosome names of the first coordinate.

  2. An integer giving the start positions of the first coordinate.

  3. An integer giving the end positions of the first coordinate.

  4. A character giving the chromosome names of the second coordinate.

  5. An integer giving the start positions of the second coordinate.

  6. An integer giving the end positions of the second coordinate.
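
The six-column layout can be checked before use with a few lines of base R. The helper is_bedpe() and the example 'loops' data are hypothetical and only mirror the column list above; GENOVA may validate its input differently.

```r
# Hypothetical BEDPE-formatted data.frame with two loops.
loops <- data.frame(
  chrom1 = c("chr1", "chr1"),
  start1 = c(100000L, 400000L),
  end1   = c(110000L, 410000L),
  chrom2 = c("chr1", "chr1"),
  start2 = c(600000L, 900000L),
  end2   = c(610000L, 910000L),
  stringsAsFactors = FALSE
)

# Hypothetical check: columns 1 and 4 are character, the others numeric.
is_bedpe <- function(x) {
  is.data.frame(x) && ncol(x) >= 6 &&
    is.character(x[[1]]) && is.numeric(x[[2]]) && is.numeric(x[[3]]) &&
    is.character(x[[4]]) && is.numeric(x[[5]]) && is.numeric(x[[6]])
}

is_bedpe(loops)
```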

padding

A numeric of length 1 to determine the padding around TADs, expressed in TAD widths. ATA only.

strand

A character vector of length nrow(bed). Overrules an attempt to infer strand from start > end information. ARA only.

Details

Anchors are calculated internally by the aggregate repeated matrix lookup analysis functions, but can also be supplied through the 'anchors' argument of these functions.

Anchors are specific for a resolution of a contacts object and cannot be interchanged freely between resolutions.
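
To see why anchors cannot be reused across resolutions, consider a simplified binning scheme in which bins on a single chromosome are numbered from 1. The helper bin_at() is an assumption made for this sketch and is not how GENOVA assigns indices; the point is only that the same coordinate lands in a different bin when the bin size changes.

```r
# Hypothetical 1-based bin number of a position on a single chromosome.
bin_at <- function(pos, res) pos %/% res + 1L

pos <- 1234567L
bin_at(pos, 10000L)  # 124
bin_at(pos, 20000L)  # 62
```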

The 'mode' argument determines which pairwise interactions are reported for APA, PE-SCAn and C-SCAn. "cis" returns pairwise interactions within a chromosome, "trans" returns interactions between chromosomes, and "both" returns both types of interactions.
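
The effect of 'mode' can be sketched on a toy table of pairs. The helper filter_mode() and its column names are hypothetical. The sketch uses match.arg(), a common R idiom in which the first element of the default vector becomes the default value; this is consistent with the defaults shown under Usage, but whether GENOVA resolves 'mode' this way is an assumption.

```r
# Toy pairwise interactions, described only by the chromosome of each side.
pairs <- data.frame(
  chrom1 = c("chr1", "chr1", "chr2"),
  chrom2 = c("chr1", "chr2", "chr2"),
  stringsAsFactors = FALSE
)

# Hypothetical filter that keeps cis, trans or both kinds of pairs.
filter_mode <- function(pairs, mode = c("cis", "trans", "both")) {
  mode <- match.arg(mode)  # first choice is the default
  is_cis <- pairs$chrom1 == pairs$chrom2
  switch(mode,
    cis   = pairs[is_cis, , drop = FALSE],
    trans = pairs[!is_cis, , drop = FALSE],
    both  = pairs
  )
}

filter_mode(pairs)           # within-chromosome pairs only
filter_mode(pairs, "trans")  # between-chromosome pairs only
```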

Value

An anchors object: a matrix with two columns.

Anchor types

PE-SCAn anchors

anchors_PESCAn() takes all pairwise interactions of genomic coordinates from a BED-like data.frame and maps these to indices of the Hi-C matrix. It is used within the PESCAn function. Whether these pairwise interactions are allowed to cross chromosome boundaries is determined by the 'mode' argument, which defaults to "cis" to only take pairwise interactions on the same chromosome.
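
The idea of "all pairwise interactions, filtered by distance" can be sketched in base R for bins on a single chromosome. The bin indices are made up, and treating the 'dist_thres' bounds as inclusive is an assumption; this is not GENOVA's implementation.

```r
bins <- c(10L, 40L, 400L, 900L)  # toy bin indices on one chromosome
res <- 20000L                    # bin size in basepairs
dist_thres <- c(5e6, Inf)        # minimum and maximum distance in basepairs

# All unordered pairs of bins, one pair per row.
pairs <- t(combn(bins, 2))

# Keep pairs whose genomic distance falls within the thresholds.
d <- (pairs[, 2] - pairs[, 1]) * res
keep <- d >= dist_thres[1] & d <= dist_thres[2]
anch <- pairs[keep, , drop = FALSE]
anch
```

Only the pair of bins 10 and 40 is dropped here, since those bins lie 600 kb apart, below the 5 Mb minimum.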

C-SCAn anchors

anchors_CSCAn(), like anchors_PESCAn(), takes all pairwise interactions of genomic coordinates, but crosswise between unique combinations of BED-like data.frames in the bedlist argument. It is used within the CSCAn function. The result has a group attribute that keeps track of the combination of BED-like data.frames from which each anchor originated.
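
A base-R sketch of crosswise pairing with a group label. The list of toy bin indices stands in for a bedlist that has already been mapped to bins, and the "A-B" label format is invented for this example. The sketch only pairs distinct list elements; whether an element is also combined with itself is not stated above and is left out here.

```r
# Toy bin indices per bedlist element (hypothetical values).
bedlist_bins <- list(A = c(10L, 50L), B = c(20L, 300L), C = 100L)

# Every unique combination of two list elements.
combos <- combn(names(bedlist_bins), 2, simplify = FALSE)

# Cross every bin of the left element with every bin of the right element
# and record which combination each pair came from.
anch <- do.call(rbind, lapply(combos, function(cmb) {
  g <- expand.grid(left = bedlist_bins[[cmb[1]]],
                   right = bedlist_bins[[cmb[2]]])
  g$group <- paste(cmb, collapse = "-")
  g
}))
anch
```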

APA anchors

anchors_APA() takes a BEDPE-formatted data.frame and translates the coordinates in the first 3 and last 3 columns to indices of the Hi-C matrix. It is used within the APA function. The 'mode' argument defaults to "both", but can be set to "cis" or "trans" to retain only those interactions.
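
The sketch below shows the shape of such a translation: each side of a BEDPE row becomes one bin, giving a two-column matrix. It assumes a single chromosome with bins numbered from 1 and uses the midpoint of each interval; the helper to_bin() and these conventions are assumptions for illustration, not GENOVA's method.

```r
res <- 20000L

# Hypothetical BEDPE-formatted loops on one chromosome.
bedpe <- data.frame(
  chrom1 = c("chr1", "chr1"),
  start1 = c(100000L, 400000L),
  end1   = c(110000L, 410000L),
  chrom2 = c("chr1", "chr1"),
  start2 = c(600000L, 900000L),
  end2   = c(610000L, 910000L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: 1-based bin containing the midpoint of an interval.
to_bin <- function(start, end, res) ((start + end) %/% 2L) %/% res + 1L

# One row per loop, one column per loop anchor.
anch <- cbind(to_bin(bedpe$start1, bedpe$end1, res),
              to_bin(bedpe$start2, bedpe$end2, res))
anch
```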

Extended loops anchors

anchors_extendedloops() takes the same input as anchors_APA(), but transforms these coordinates to combinations of 5' and 3' anchors outside existing loops to get 'extended' loops. Based on the extended loops algorithm described in Haarhuis et al. (2017).

ATA anchors

anchors_ATA() takes the genomic coordinates of TADs in a BED-formatted data.frame and translates these to indices of the Hi-C matrix. It is used within the ATA function. In contrast to the PE-SCAn anchors and APA anchors, ATA anchors are positions on the matrix's diagonal. The 'padding' argument controls by how much the region around a TAD is expanded. Since TADs have variable sizes, ATA anchors can be calculated without a resolution.
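
A small sketch of size filtering and padding on TAD coordinates. It assumes that 'padding' is applied on each side of the TAD and that the 'dist_thres' bounds are inclusive; neither detail is specified above, and the TAD coordinates are made up.

```r
# Hypothetical TAD calls.
tads <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(1000000L, 3000000L),
  end   = c(1500000L, 4000000L),
  stringsAsFactors = FALSE
)
dist_thres <- c(225000, Inf)  # minimum and maximum TAD size
padding <- 1                  # expressed in TAD widths

# Drop TADs outside the size range.
width <- tads$end - tads$start
keep <- width >= dist_thres[1] & width <= dist_thres[2]
tads <- tads[keep, ]
width <- width[keep]

# Extend every TAD by 'padding' times its own width on both sides (assumed).
padded <- data.frame(
  chrom = tads$chrom,
  start = tads$start - padding * width,
  end   = tads$end + padding * width,
  stringsAsFactors = FALSE
)
padded
```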

ARA anchors

anchors_ARA() takes a BED-formatted data.frame and translates the coordinates to Hi-C matrix indices on the diagonal. It is used within the ARA function. In contrast to other anchors, ARA anchors can take on a directionality. If the start position is larger than the end position, the anchor is assigned the reverse direction. Otherwise, it is given the forward direction.
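
The direction rule can be sketched as follows. The helper infer_strand() and the "+" and "-" encoding are assumptions for this example; the text above only states that start > end means reverse and that a supplied 'strand' overrules the inference.

```r
# Hypothetical sites; the second has start > end.
bed <- data.frame(
  chrom = c("chr1", "chr1"),
  start = c(100000L, 900000L),
  end   = c(100500L, 899500L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: use 'strand' when given, otherwise infer from start > end.
infer_strand <- function(bed, strand = NULL) {
  if (!is.null(strand)) {
    stopifnot(length(strand) == nrow(bed))
    return(strand)
  }
  ifelse(bed$start > bed$end, "-", "+")
}

infer_strand(bed)                # "+" "-"
infer_strand(bed, c("-", "-"))   # explicit strand overrules the inference
```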

See Also

bed2idx for general genomic coordinates to Hi-C index conversion.

Examples

## Not run: 
# PE-SCAn
anch <- anchors_PESCAn(WT_20kb$IDX, attr(WT_20kb, "resolution"),
                       super_enhancers)
PESCAn(list(WT_20kb, KO_20kb), anchors = anch)

# APA
anch <- anchors_APA(WT_20kb$IDX, attr(WT_20kb, "resolution"), loops)
APA(list(WT_20kb, KO_20kb), anchors = anch)

# APA with extended loops
ex_anch <- anchors_extendedloops(WT_20kb$IDX, attr(WT_20kb, "resolution"),
                                 loops)
APA(list(WT_20kb, KO_20kb), anchors = ex_anch)

# ATA
anch <- anchors_ATA(WT_10kb$IDX, tads)
ATA(list(WT_10kb, KO_10kb), anchors = anch)

# ARA
anch <- anchors_ARA(WT_20kb$IDX, ctcf_sites)
ARA(list(WT_20kb, KO_20kb), anchors = anch)

## End(Not run)

robinweide/GENOVA documentation built on March 14, 2024, 11:16 p.m.