coverage_matrix2nmat: Import genome coverage matrix files
In jmw86069/platjam: Platform Jam, biological platform importers.

coverage_matrix2nmat

R Documentation

Import genome coverage matrix files

Description

Import genome coverage matrix files

Usage

coverage_matrix2nmat(
  x = NULL,
  filename = NULL,
  signal_name = NULL,
  target_name = "target",
  background = 0,
  smooth = FALSE,
  target_is_single_point = FALSE,
  signal_is_categorical = FALSE,
  mat_grep = "[-0-9]+:[-0-9]+",
  upstream_grep = "^[-]",
  downstream_grep = "^[^-]",
  target_grep = "^0$",
  verbose = FALSE,
  ...
)

Arguments

`x`	`data.frame` or compatible object containing genome coverage data, or a character file path. When `x` is not supplied, `filename` is used to import data. When `x` is a filename, it is used to populate `filename`, then data is imported into `x`.
`filename`	character path to a genome coverage file. When `x` is supplied, this argument is ignored. When `filename` is used, only the first file is imported.
`signal_name`	The name of signal regions. It is only used for printing the object. When `signal_name` is `NULL`, the `signal_name` is derived from `names(filename)` if available, then `basename(filename)`, or `"signal"` when only `x` is supplied.
`target_name`	The name of the target regions. It is only used for printing the object.
`background`	numeric value containing the background value in the matrix.
`smooth`	logical indicating whether to apply smoothing on rows.
`target_is_single_point`, `signal_is_categorical`	logical indicating whether the target region is a single point, and whether signal matrix is categorical, respectively.
`mat_grep`	character regular expression pattern used to identify colnames which contain coverage data. The default pattern expects the format `"-200:-100"`.
`upstream_grep`	character regular expression pattern used to identify upstream colnames from values that match `mat_grep`. The default assumes any region beginning with `"-"` is negative and upstream of the central target region.
`downstream_grep`	character regular expression pattern used to identify downstream colnames from values that match `mat_grep`. The default assumes all colnames which are not upstream are therefore downstream.
`target_grep`	character regular expression pattern used to identify a colname referring to the `target`, which by default can only be `"0"`. Otherwise, no target region is defined.
`verbose`	logical indicating whether to print verbose output.
`...`	additional arguments are ignored.
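
As a rough illustration of how the default patterns partition column names, the sketch below applies them with base R `grep()`. The column names are hypothetical, and testing `target_grep` against all colnames (rather than only those matching `mat_grep`) is an assumption; the function's internal logic may differ.

```r
## Hypothetical column names, as might appear in a coverage matrix file
cn <- c("Gene ID", "-200:-100", "-100:0", "0:100", "100:200");

## Default patterns, copied from the Usage section
mat_grep <- "[-0-9]+:[-0-9]+";
upstream_grep <- "^[-]";
downstream_grep <- "^[^-]";
target_grep <- "^0$";

## Columns that hold coverage data
mat_cols <- grep(mat_grep, cn, value=TRUE);

## Upstream and downstream columns, taken from the coverage columns
upstream_cols <- grep(upstream_grep, mat_cols, value=TRUE);
downstream_cols <- grep(downstream_grep, mat_cols, value=TRUE);

## Assumption: target_grep is tested against all colnames
target_cols <- grep(target_grep, cn, value=TRUE);

print(list(
   upstream=upstream_cols,
   downstream=downstream_cols,
   target=target_cols));
```

With these names, no column is exactly `"0"`, so no target region would be defined, and `"Gene ID"` is the only column that does not match `mat_grep`.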

Details

This function imports a genome coverage data matrix and returns an object of class normalizedMatrix, compatible with the package "EnrichedHeatmap".

The conversion function EnrichedHeatmap::as.normalizedMatrix() exists; however, this function does not call it and instead defines the attributes directly. In future, this function may change to call that function.

Value

normalizedMatrix numeric matrix, where additional metadata is stored in the object attributes. See EnrichedHeatmap::as.normalizedMatrix() for more details about the metadata. The rownames are defined by the first colname which does not match mat_grep, which by default is "Gene ID"; if there is no such column, rownames are NULL.
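
The rowname rule described above can be sketched in base R as follows. This is a minimal illustration using a hypothetical two-row `data.frame`; it does not reproduce the function's implementation and does not set the normalizedMatrix attributes.

```r
## Hypothetical coverage data with one identifier column
df <- data.frame(check.names=FALSE, stringsAsFactors=FALSE,
   "Gene ID"=c("geneA", "geneB"),
   "-100:0"=c(1, 2),
   "0:100"=c(3, 4));

## Identify coverage columns using the default mat_grep pattern
is_cov <- grepl("[-0-9]+:[-0-9]+", colnames(df));

## Numeric matrix made from the coverage columns only
m <- as.matrix(df[, is_cov, drop=FALSE]);

## Rownames come from the first column that does not match mat_grep,
## otherwise they are set to NULL
id_col <- which(!is_cov);
if (length(id_col) > 0) {
   rownames(m) <- as.character(df[[id_col[1]]]);
} else {
   rownames(m) <- NULL;
}
print(m);
```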

Examples

## A small example file is included for testing.
## Note: the second assignment overrides the first, so the
## h3k4me1 coverage matrix is the file that is imported.
cov_file <- system.file("data", "tss_coverage.matrix", package="platjam");
cov_file <- system.file("data", "h3k4me1_coverage.matrix", package="platjam");
## system.file() returns "" when the file is not found
if (nzchar(cov_file)) {
   nmat <- coverage_matrix2nmat(cov_file);
   jamba::printDebug("signal_name: ",
      attr(nmat, "signal_name"));

   if (suppressPackageStartupMessages(require(EnrichedHeatmap))) {
      color <- "red3";
      signal_name <- attr(nmat, "signal_name");
      ## partition rows into k clusters using log-transformed signal
      k <- 6;
      set.seed(123);
      partition <- kmeans(log10(1+nmat), centers=k)$cluster;
      ## enriched heatmap split by cluster
      EH <- EnrichedHeatmap(log10(1+nmat),
         split=partition,
         pos_line=FALSE,
         use_raster=TRUE,
         col=jamba::getColorRamp(color, n=10),
         top_annotation=ComplexHeatmap::HeatmapAnnotation(
            lines=anno_enriched(gp=grid::gpar(col=colorjam::rainbowJam(k)))
         ),
         axis_name_gp=grid::gpar(fontsize=8),
         name=signal_name,
         column_title=signal_name
      );
      ## narrow heatmap showing the cluster assignment of each row
      PHM <- Heatmap(partition,
         use_raster=TRUE,
         col=structure(colorjam::rainbowJam(k),
            names=as.character(seq_len(k))),
         name="partition",
         show_row_names=FALSE,
         width=grid::unit(3, "mm"));
      draw(PHM + EH, main_heatmap=2);
   }
}

jmw86069/platjam documentation built on April 12, 2025, 1:41 p.m.