recenter_nmatlist: Re-center coverage matrix data

recenter_nmatlist R Documentation

Re-center coverage matrix data

Description

Re-center coverage matrix data so that each row is centered on its calculated coverage summit.

Usage

recenter_nmatlist(
  nmatlist,
  recenter_heatmap = 1,
  recenter_range = NULL,
  recenter_invert = FALSE,
  spar = 0.5,
  edge_buffer = 0,
  empty_value = 0,
  verbose = FALSE,
  ...
)

Arguments

nmatlist

list of normalizedMatrix objects

recenter_heatmap

numeric index (default 1) of one or more entries in nmatlist to use for re-centering.

recenter_range

numeric (default NULL) with optional maximum distance from the target (center) of coverage in nmatlist. For example, if nmatlist data spans -50kb to +50kb, but peaks are no wider than 1kb, consider using recenter_range=1000 so that the recentering will only use coverage data -1000bp to +1000bp at most.

recenter_invert

logical indicating whether to invert the coverage, therefore effectively taking the minimum signal. This value is recycled to length(recenter_heatmap) such that each heatmap can individually be inverted as relevant.

empty_value

numeric value used to fill empty positions created at the "edges" of the recentered matrix data. Default is 0; other values may not be well supported.

verbose

logical indicating whether to print verbose output.

...

additional arguments are ignored.

spar, edge_buffer

numeric values passed to summit_from_vector()

Details

Coverage matrix data is provided as nmatlist, which is a list of normalizedMatrix objects (see EnrichmentHeatmap). One or more entries are selected with recenter_heatmap, and the summit is calculated for each row after smoothing the coverage with smooth.spline().

For each row, the summit position is therefore interpolated.
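The smoothing step can be illustrated with a minimal sketch in base R. It is not the internal code of recenter_nmatlist() or summit_from_vector(); the simulated coverage vector, the bin positions, and the finer prediction grid are assumptions made for illustration only.

```r
# Hypothetical single row of coverage: 41 bins spanning -1000 to +1000 bp
x <- seq(-1000, 1000, by = 50)
set.seed(1)
# simulated peak centered near +200 bp, with a small amount of noise
y <- 10 * exp(-((x - 200) / 150)^2) + rnorm(length(x), sd = 0.2)

# Fit a smoothing spline; spar = 0.5 mirrors the default argument value
fit <- smooth.spline(x = x, y = y, spar = 0.5)

# Predict on a finer grid so the summit may fall between the original bins
x_fine <- seq(min(x), max(x), by = 10)
y_fine <- predict(fit, x_fine)$y

# Interpolated summit position, in bp relative to the region center
summit <- x_fine[which.max(y_fine)]
summit
```

Because the fitted curve is evaluated between bins, the reported summit is not restricted to the original bin positions.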

Coverage matrix data nmatlist is then shifted to recenter peaks across all coverage files, and the summit offset is stored as an attribute attr(nmatlist, "summit_offset").
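The shifting step can be sketched for a single row as follows. The helper shift_row() is hypothetical and not part of platjam; it only illustrates how positions vacated at the edge might be filled with empty_value, assuming the offset is expressed in whole columns.

```r
# Hypothetical helper (not part of platjam): shift a numeric vector by
# `offset` columns, filling vacated positions with `empty_value`.
# Positive offsets move data to the right, negative offsets to the left.
shift_row <- function(x, offset, empty_value = 0) {
   n <- length(x)
   out <- rep(empty_value, n)
   # source index that supplies each output position
   idx <- seq_len(n) - offset
   keep <- idx >= 1 & idx <= n
   out[keep] <- x[idx[keep]]
   out
}

row <- c(0, 0, 1, 5, 9, 4, 1)
# the summit is in column 5 and the center is column 4,
# so the row is shifted one column to the left
shifted <- shift_row(row, offset = -1)
shifted
```

The column that moves past the left edge is dropped, and the column opened on the right edge receives empty_value.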

Use Case

The input represents sequence coverage data for a set of genome regions of interest, for example ChIP-seq peaks or ATAC-seq peaks, where it may be useful for the center of enrichment to represent the position with highest coverage. Many peak or regional enrichment calling tools do not provide a summit position, the summit position may not be accurate, or the summit across multiple sets of merged peaks may not be available.

This method can use coverage data across multiple nmatlist matrix data to calculate a collective summit position.
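One plausible way to picture a collective summit is shown in the sketch below. It assumes the matrices are combined by simple summation before the summit is located, which may differ from what recenter_nmatlist() does internally; the matrices m1 and m2 are hypothetical toy data, and no smoothing is applied.

```r
# Two hypothetical coverage matrices with matching rows (regions)
# and columns (bins)
m1 <- rbind(c(0, 1, 4, 1, 0),
            c(0, 0, 1, 3, 1))
m2 <- rbind(c(0, 2, 3, 0, 0),
            c(0, 0, 0, 2, 1))

# Assumption: combine signals by summation; in practice each matrix
# may need to be scaled first so that one signal does not dominate
combined <- m1 + m2

# column with the highest combined signal in each row
summit_col <- apply(combined, 1, which.max)

# offset of each summit relative to the center column
center_col <- ceiling(ncol(combined) / 2)
summit_offset <- summit_col - center_col
summit_offset
```

Each row receives one offset, which could then be applied to that row in every matrix so that all coverage files stay aligned.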

Recommendation

The recommended workflow is to create coverage matrix data for a region wider than used for the final figure, so that the re-centering can be performed while maintaining coverage throughout the desired range.

Value

A list of normalizedMatrix objects, in the same format as the input nmatlist.

See Also

Other jam coverage heatmap functions: coverage_matrix2nmat(), get_nmat_ceiling(), nmathm_row_order(), nmatlist2heatmaps(), restrand_nmatlist(), validate_heatmap_params(), zoom_nmatlist(), zoom_nmat()

Examples

## There is a small example file to use for testing
# library(jamba)
cov_file1 <- system.file("data", "tss_coverage.matrix", package="platjam");
cov_file2 <- system.file("data", "h3k4me1_coverage.matrix", package="platjam");
cov_files <- c(cov_file1, cov_file2);
names(cov_files) <- gsub("[.]matrix",
   "",
   basename(cov_files));
nmatlist <- coverage_matrix2nmat(cov_files, verbose=FALSE);

nmatlist2heatmaps(nmatlist,
   title="Input data",
   transform=c("log2signed", "sqrt"));

nmatlist1 <- recenter_nmatlist(nmatlist)
nmatlist2heatmaps(nmatlist1,
   title="Input data, recentered by tss signal",
   transform=c("log2signed", "sqrt"));

nmatlist2i <- recenter_nmatlist(nmatlist, recenter_heatmap=2, recenter_invert=TRUE)
nmatlist2heatmaps(nmatlist2i,
   title="Input data, recentered by inverted h3k4me1 signal",
   transform=c("log2signed", "sqrt"));
head(data.frame(summit_name=attr(nmatlist2i[[1]], "summit_name")))

nmatlist2is <- restrand_nmatlist(nmatlist2i, restrand_heatmap=2, recenter_invert=FALSE)
nmatlist2heatmaps(nmatlist2is,
   title="Input data, recentered by inverted h3k4me1 signal,\nrestranded by tss",
   transform=c("log2signed", "sqrt"));

# summarize recenter and restrand output
head(data.frame(
   row=attr(nmatlist2is[[1]], "dimnames")[[1]],
   summit_name=attr(nmatlist2is[[1]], "summit_name"),
   restrand=attr(nmatlist2is[[1]], "restrand")))


jmw86069/platjam documentation built on Sept. 26, 2024, 3:31 p.m.