bnbc: Normalize Contact Matrices with BNBC

View source: R/bnbc.R

bnbcR Documentation

Normalize Contact Matrices with BNBC

Description

Applies BNBC method to normalize contact matrices.

Usage

bnbc(cg, batch, threshold = NULL, step = NULL, qn = TRUE, nbands
= NULL, mod = NULL, mean.only = FALSE, tol = 5, bstart = 2, verbose = TRUE)

Arguments

cg

A ContactGroup object.

batch

A single batch indicator variable.

threshold

The maximum distance interacting loci are allowed to be separated by.

step

The step size, or the number of bases a contact matrix cell represents.

qn

Whether to apply quantile normalization on each band matrix. Defaults to TRUE.

bstart

The first band to normalize. Defaults to 2.

nbands

The last band to normalize. Defaults to nrow(cg) - 1.

mod

A model matrix specifying which sample information is to be preserved by ComBat. Optional.

mean.only

Whether ComBat should not correct for batch effect in the variances of band matrix rows. Defaults to FALSE, which means variances are corrected. Set to TRUE if there is only one observation per batch.

tol

The number of significant digits for which the mean value of a band matrix must be greater than 0 to be processed by ComBat.

verbose

Should the function print progress?

Details

Normalization and batch correction is performed in a band-wise manner, correcting all samples' observations of one matrix off-diagonal (which we refer to as a matrix "band") at a time. For each matrix band, we collect all samples' observations into a single matrix. We then apply quantile normalization to ensure distributional similarity across samples. Finally, we perform batch effect correction using ComBat on this matrix. Each samples' matrix band is then replaced with its corrected version. We refer to this process of Band-Wise Normalization and Batch Correction as BNBC.

This function applies BNBC to the set of contact matrices and returns a ContactGroup object with matrix bands bstart:nbands corrected. For those rows in the matrix bands which cannot be corrected we set all elements to 0.

Very high bands contain little data in Hi-C experiments, and we don't recommend to analyze those or apply this function to high bands, see the nbands argument to the function.

We recommend performing bnbc on contact matrices which have been converted to log-CPM and smoothed, see the example.

Value

A ContactGroup object for which matrix bands bstart:nbands have had BNBC applied.

References

Johnson, W.E., Li, C. and Rabinovic, A. Adjusting batch effects in microarray expression data using empirical Bayes methods. Biostatistics 2007, 8:118-127. \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1093/biostatistics/kxj037")}

Fletez-Brant et al. Distance-dependent between-sample normalization for Hi-C experiments. In preparation.

See Also

ContactGroup, logCPM, boxSmoother, band

Examples

data(cgEx)
batches <- colData(cgEx)$Batch
cgEx.cpm <- logCPM(cgEx)
cgEx.smooth <- boxSmoother(cgEx, 5, mc.cores=1)
cgEx.bnbc <- bnbc(cgEx.smooth, batches, 1e7, 4e4, bstart=2, nbands=4)

hansenlab/bnbc documentation built on Feb. 4, 2024, 7:20 a.m.