# colStats: Row and Column Summary Statistics

Package: matter (A framework for rapid prototyping with file-based data structures)

## Description

These functions calculate summary statistics over the rows or columns of a matrix, optionally for each level of a grouping variable, and with implicit row/column centering and scaling if desired.

## Usage

    ## S4 method for signature 'ANY'
    colStats(x, stat, groups,
        na.rm = FALSE, tform = identity,
        col.center = NULL, col.scale = NULL,
        row.center = NULL, row.scale = NULL,
        drop = TRUE, BPPARAM = bpparam(), ...)

    ## S4 method for signature 'ANY'
    rowStats(x, stat, groups,
        na.rm = FALSE, tform = identity,
        col.center = NULL, col.scale = NULL,
        row.center = NULL, row.scale = NULL,
        drop = TRUE, BPPARAM = bpparam(), ...)

## Arguments

| Argument | Description |
|---|---|
| `x` | A matrix on which to calculate summary statistics. |
| `stat` | The name of the summary statistic(s) to compute over the rows or columns of a matrix. Allowable values include: "min", "max", "prod", "sum", "mean", "var", "sd", "any", "all", and "nnzero". |
| `groups` | A factor or vector giving the grouping. If not provided, no grouping will be used. |
| `na.rm` | If TRUE, remove NA values before summarizing. |
| `tform` | A dimensionality-preserving transformation to be applied to the matrix (e.g., log() or sqrt()). |
| `col.center` | A vector of column centers to subtract from each row. (Or a matrix with a column for each level of groups.) |
| `col.scale` | A vector of column scaling factors to divide each row by. (Or a matrix with a column for each level of groups.) |
| `row.center` | A vector of row centers to subtract from each column. (Or a matrix with a column for each level of groups.) |
| `row.scale` | A vector of row scaling factors to divide each column by. (Or a matrix with a column for each level of groups.) |
| `drop` | If only a single summary statistic is calculated, return the results as a vector (or matrix) rather than a list. |
| `BPPARAM` | An optional instance of BiocParallelParam. See documentation for bplapply. |
| `...` | Additional arguments. |
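To make the grouping and centering/scaling arguments concrete, the following is a minimal base R sketch of the quantities they describe for the column case. It does not call matter and is not its implementation. The variable names are illustrative, and it assumes that `groups` labels the rows when summarizing over columns.

```r
# Base R illustration only; this does not call matter.
set.seed(1)
x <- matrix(rnorm(24), nrow = 6, ncol = 4)          # 6 rows, 4 columns
groups <- factor(c("a", "a", "a", "b", "b", "b"))   # assumed: one label per row

# Column means within each group: one result column per level of `groups`.
group_col_means <- sapply(levels(groups), function(g) {
  colMeans(x[groups == g, , drop = FALSE])
})

# Column centering and scaling: subtract each column's center from, and
# divide each column's scale into, every entry of that column.
col_center <- colMeans(x)
col_scale <- apply(x, 2, sd)
x_scaled <- sweep(sweep(x, 2, col_center, "-"), 2, col_scale, "/")
```

Here `group_col_means` has one row per column of `x` and one column per group level, which matches the matrix layout described under Value.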

## Details

The summary statistics are calculated over chunks of the matrix using `colstreamStats` and `rowstreamStats`. For matter objects, iteration is performed over the major dimension for I/O efficiency.
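The idea of computing a statistic over chunks can be sketched in base R as follows. This is only an illustration of chunk-wise accumulation, not the code behind the streaming functions. The chunk size and variable names are hypothetical, and chunking over rows is an arbitrary choice for the example.

```r
# Base R illustration of chunk-wise accumulation; not matter's implementation.
x <- matrix(as.numeric(1:20), nrow = 5, ncol = 4)
chunk_size <- 2  # hypothetical chunk size chosen for illustration

# Split the row indices into consecutive chunks.
row_ids <- seq_len(nrow(x))
chunks <- split(row_ids, ceiling(row_ids / chunk_size))

# Accumulate running column sums and row counts one chunk at a time,
# so only one block of rows needs to be in memory at once.
col_sum <- numeric(ncol(x))
n <- 0
for (idx in chunks) {
  block <- x[idx, , drop = FALSE]
  col_sum <- col_sum + colSums(block)
  n <- n + nrow(block)
}
chunked_mean <- col_sum / n

# The chunked result agrees with the all-at-once computation.
stopifnot(isTRUE(all.equal(chunked_mean, colMeans(x))))
```

Statistics such as sums, means, minima, and maxima combine across chunks in this way. Variance and standard deviation need additional running quantities.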

## Value

A list with one element for each stat requested, where each element is either a vector (if no grouping variable is provided) or a matrix in which each column corresponds to a different level of groups.

If drop=TRUE and only a single statistic is requested, the result is unlisted and returned as a vector or matrix.
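The return shape described above can be mimicked with a small base R helper. `summarize_cols` is a hypothetical function written for this illustration. It is not part of matter and ignores grouping, centering, and scaling.

```r
# Hypothetical helper mimicking the documented return shape; not matter's code.
summarize_cols <- function(x, stat, drop = TRUE) {
  # One list element per requested statistic, named after the statistic.
  out <- lapply(stat, function(s) apply(x, 2, match.fun(s)))
  names(out) <- stat
  # With a single statistic and drop = TRUE, return the bare vector.
  if (drop && length(out) == 1L) out[[1L]] else out
}

x <- matrix(as.numeric(1:12), nrow = 3, ncol = 4)
single <- summarize_cols(x, "mean")                  # numeric vector, one value per column
several <- summarize_cols(x, c("mean", "sd"))        # list with elements "mean" and "sd"
kept <- summarize_cols(x, "mean", drop = FALSE)      # list of length 1
```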

## Author(s)

Kylie A. Bemis