ContactMatrix-class: ContactMatrix class and constructors

ContactMatrix classR Documentation

ContactMatrix class and constructors

Description

The ContactMatrix class contains a matrix where rows and columns represent genomic loci. Each entry of the matrix contains information about the interaction between the loci represented by the corresponding row/column, e.g., contact frequencies. Coordinates of the loci are also contained within this class.

Usage

## S4 method for signature 'ANY,numeric,numeric,GRanges'
ContactMatrix(matrix, anchor1, anchor2, regions, metadata=list())

## S4 method for signature 'ANY,GRanges,GRanges,GenomicRanges_OR_missing'
ContactMatrix(matrix, anchor1, anchor2, regions, metadata=list())

## S4 method for signature 'missing,missing,missing,GenomicRanges_OR_missing'
ContactMatrix(matrix, anchor1, anchor2, regions, metadata=list())

Arguments

matrix

Any matrix-like object containing interaction data.

anchor1, anchor2

Either a pair of numeric vectors containing indices to regions or a pair of GRanges objects. In both cases, anchor1 and anchor2 should have lengths equal to the number of rows and columns in matrix, respectively.

regions

A GRanges object containing the coordinates of the interacting regions. This argument is optional for InteractionSet,ANY,GRanges,GRanges-method.

metadata

A list containing experiment-wide metadata - see ?Annotated for more details.

Details

The ContactMatrix class inherits from the Annotated class, with several additional slots:

matrix:

A matrix-like object.

anchor1:

An integer vector specifying the index of the first interacting region.

anchor2:

An integer vector specifying the index of the second interacting region.

regions:

A sorted GRanges object containing the coordinates of all interacting regions.

Each entry of anchor1 corresponds to a row in matrix, while each entry of anchor2 corresponds to a column. Each entry of matrix represents an interaction between the corresponding entries in anchor1 and anchor2, i which point to the relevant coordinates in regions for each locus.

ContactMatrix objects can be constructed by specifying numeric vectors as anchor1 and anchor2 in the ContactMatrix function. These vectors will define the regions corresponding to the rows and columns of the matrix. Specifically, each value of the vector acts as an index to specify the relevant coordinates from regions. This means that the range of entries must lie within [1, length(regions)].

Alternatively, ContactMatrix objects can be constructed by directly supplying the GRanges of the interacting loci in ContactMatrix. If regions is not specified, this will be constructed automatically from the two sets of supplied GRanges. If regions is supplied, exact matching will be performed to identify the indices in regions corresponding to the regions in the supplied GRanges. Missing values are not tolerated and will cause an error to be raised.

Both methods will return an ContactMatrix object containing all of the specified information. Sorting of regions is also performed automatically, with re-indexing of all anchor indices to preserve the correct pairings between regions.

Value

For the constructors, a ContactMatrix object is returned.

Choosing between matrix classes

The ContactMatrix class provides support for any matrix-like object that implements dim, rbind, cbind, [ and t methods. The choice of class depends on the type of data and the intended application. Some of the common choices are described in more detail here:

  • Base matrices are simple to generate and are most efficient for dense data. This is often sufficient for most use cases where small regions of the interaction space are being examined.

  • Sparse matrices from the Matrix package are useful for large areas of the interaction space where most entries are zero. This reduces memory usage compared to a dense representation (though conversely, is less efficient for dense data). Note that all numeric values are coerced to double-precision types, which may take up more memory than a direct integer representation. Another issue is how missing values should be interpreted in the sparseMatrix – see ?inflate for more details.

  • Packed symmetric matrices from the Matrix package provide some memory savings for symmetric regions of the interaction space.

  • Delayed or HDF5-backed matrices from the DelayedArray and HDF5Array packages allow very large matrices to be represented without loading into memory.

Author(s)

Aaron Lun

See Also

ContactMatrix-access, ContactMatrix-subset, ContactMatrix-sort, Annotated-class, Matrix-class

Examples

set.seed(1000)
N <- 30
all.starts <- sample(1000, N)
all.ends <- all.starts + round(runif(N, 5, 20))
all.regions <- GRanges(rep(c("chrA", "chrB"), c(N-10, 10)),
    IRanges(all.starts, all.ends))

Nr <- 10
Nc <- 20
all.anchor1 <- sample(N, Nr)
all.anchor2 <- sample(N, Nc)
counts <- matrix(rpois(Nr*Nc, lambda=10), Nr, Nc)
x <- ContactMatrix(counts, all.anchor1, all.anchor2, all.regions)

# Equivalent construction:
ContactMatrix(counts, all.regions[all.anchor1], 
    all.regions[all.anchor2])
ContactMatrix(counts, all.regions[all.anchor1], 
    all.regions[all.anchor2], all.regions)

# Also works directly with Matrix objects.
counts2 <- Matrix::Matrix(counts)
x2 <- ContactMatrix(counts2, all.anchor1, all.anchor2, all.regions)
counts2 <- as(counts2, "dgCMatrix")
x2 <- ContactMatrix(counts2, all.anchor1, all.anchor2, all.regions)

LTLA/InteractionSet documentation built on July 3, 2023, 8:44 a.m.