The matrixStats package provides highly optimized functions for
computing common summaries over rows and columns of matrices,
e.g. rowQuantiles(). There are also functions that operate on
vectors, e.g. logSumExp(). Their implementations strive to minimize
both memory usage and processing time. They are often substantially
faster than good old apply() solutions. The calculations are
mostly implemented in C, which allows us to optimize beyond what is
possible in plain R. The package installs out-of-the-box on all
common operating systems, including Linux, macOS and Windows.
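As a small illustration of the two kinds of functions mentioned above, the following sketch calls rowQuantiles() on a matrix and logSumExp() on a vector. The data and the variable names (m, lx, q, lse, naive) are made up for illustration, and the sketch assumes that matrixStats is installed.

```r
library(matrixStats)

# Illustrative data: a small matrix and a vector of log-scale values
set.seed(42)
m <- matrix(rnorm(5 * 100), nrow = 5, ncol = 100)
lx <- c(-1000, -1001, -1002)

# Row-wise quartiles: one output row per row of 'm', one column per probability
q <- rowQuantiles(m, probs = c(0.25, 0.5, 0.75))

# log(sum(exp(lx))) computed in a numerically stable way
lse <- logSumExp(lx)

# The naive formula underflows here, because exp(-1000) is 0 in double precision
naive <- log(sum(exp(lx)))
```

Here the naive expression evaluates to -Inf, whereas logSumExp() returns a finite value close to -999.59, which is why a dedicated vector function is useful for log-scale arithmetic.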
With a matrix
> x <- matrix(rnorm(20 * 500), nrow = 20, ncol = 500)
it is many times faster to calculate medians column by column using
> mu <- matrixStats::colMedians(x)
than using
> mu <- apply(x, MARGIN = 2, FUN = median)
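Moreover, when performing calculations on a subset of rows and/or columns (here of a matrix x with at least 158 rows and 3000 columns, i.e. larger than the 20-by-500 matrix above), using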
> mu <- colMedians(x, rows = 33:158, cols = 1001:3000)
is much faster and more memory efficient than
> mu <- apply(x[33:158, 1001:3000], MARGIN = 2, FUN = median)
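The subsetted comparison above can be tried with a sketch along the following lines. The matrix y and its dimensions (200 by 4000) are hypothetical, chosen only so that rows 33:158 and columns 1001:3000 exist; the timings are not reproduced here because they depend on the machine and package version.

```r
library(matrixStats)

# Hypothetical matrix, large enough to contain rows 33:158 and columns 1001:3000
set.seed(1)
y <- matrix(rnorm(200 * 4000), nrow = 200, ncol = 4000)

# Column medians restricted to a subset of rows and columns
mu1 <- colMedians(y, rows = 33:158, cols = 1001:3000)

# Same calculation via apply() on an explicitly extracted submatrix
mu2 <- apply(y[33:158, 1001:3000], MARGIN = 2, FUN = median)

# Both approaches should agree up to floating-point tolerance
stopifnot(isTRUE(all.equal(mu1, mu2)))

# Rough timings of the two approaches (elapsed time is in t1["elapsed"], t2["elapsed"])
t1 <- system.time(colMedians(y, rows = 33:158, cols = 1001:3000))
t2 <- system.time(apply(y[33:158, 1001:3000], MARGIN = 2, FUN = median))
```

A single system.time() call is only a coarse measurement; repeating the calls, or using a dedicated benchmarking package, gives more stable numbers.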
For formal benchmarking of matrixStats functions relative to alternatives, see the Benchmark reports.
The objectives of the matrixStats package are to perform operations on matrices (i) as fast as possible, while (ii) not using unnecessary amounts of memory. These objectives drive the design, including the choice of the different defaults.