# rowmean: Give Column Means of a Matrix-like Object, Based on a...

In analytics: Regression Outlier Detection, Stationary Bootstrap, Testing Weak Stationarity, NA Imputation, and Other Tools for Data Analysis

## Description

Compute column (weighted) means across rows of a numeric matrix-like object for each level of a grouping variable.

## Usage

```r
rowmean(M, group = rownames(M), w = FALSE, reord = FALSE,
        na_rm = FALSE, big = TRUE, ...)
```

## Arguments

| Argument | Description |
|---|---|
| `M` | a matrix, data frame or vector of numeric data. Missing values are allowed. A numeric vector will be treated as a column vector. |
| `group` | a vector or factor giving the grouping, with one element per row of `M`. Default: rownames of `M`. |
| `w` | a vector giving the weights that must be applied to each of the stacked blocks of an original object. |
| `reord` | if TRUE, then the result will be in order of `sort(unique(group))`; if FALSE (the default), it will be in the order that groups were encountered. |
| `na_rm` | logical (TRUE or FALSE). Should NA (including NaN) values be discarded? |
| `big` | is your object big and integer overflow is likely? If TRUE, then `M` is multiplied by 1.0 to ensure values are of type double (perhaps taking more RAM). |
| `...` | other arguments to be passed to or from methods. |
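The `reord` and `na_rm` arguments resemble the `reorder` and `na.rm` arguments of base `rowsum()`. The sketch below uses only base R, not `rowmean` itself, to illustrate the behaviour described above: group means that skip missing values, returned in the order the groups are first encountered. Whether `rowmean` computes them this way internally is an assumption.

```r
# Base-R illustration only; this is not the package's implementation.
M <- matrix(c(1, NA, 3, 4,
              5, 6, 7, 8), ncol = 2)
g <- c("b", "a", "b", "a")

# Per-group column sums, skipping NAs, in order of first appearance ("b", then "a")
sums <- rowsum(M, g, reorder = FALSE, na.rm = TRUE)

# Per-group count of non-missing values in each column
not_na <- !is.na(M)
counts <- rowsum(not_na * 1, g, reorder = FALSE)

means_encountered <- sums / counts

# The same means with rows sorted by group label
means_sorted <- means_encountered[sort(rownames(means_encountered)), , drop = FALSE]
```

Dividing by the count of non-missing values, rather than by the group size, keeps the mean of a group from being biased downward when some of its entries are `NA`.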

## Details

This function is a wrapper for base function `rowsum` which allows one to compute the (weighted) mean instead of the sum, while handling integer overflow.
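As a rough illustration of the idea, group means can be obtained from base `rowsum()` by dividing each group's column sums by the group size. The following is a minimal base-R sketch under that assumption, not the package's own code; the variable names are chosen here for illustration.

```r
# Base-R sketch of a group mean built on rowsum(); not the package's own code.
M <- matrix(1:8, ncol = 2)          # integer matrix
g <- c("A", "B", "A", "B")

M_dbl <- M * 1.0                    # promote to double, as `big = TRUE` is described to do
sums <- rowsum(M_dbl, g)            # one row of column sums per group, sorted by label
n <- table(g)[rownames(sums)]       # group sizes, aligned with the rows of `sums`
group_means <- sums / as.vector(n)  # row i is divided by the size of group i

group_means
# Group A has means 2 and 6; group B has means 3 and 7
```

Converting to double before summing means the sums are accumulated in floating point, so they cannot exceed the integer range.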

Note: although data frames are allowed, keep in mind that data frames do not allow duplicate row names. Hence, if you have a data frame with more than one group and want to rely on row names for the grouping, you may want to use `as.matrix()` to convert it to an object of class matrix.
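The row-name restriction can be seen with base R alone. This short sketch (variable names are illustrative) shows that assigning duplicated row names to a data frame fails, while the same labels are accepted once the object is converted to a matrix.

```r
df <- data.frame(x = 1:4, y = 5:8)
labels <- c("A", "B", "A", "B")

# Assigning duplicated row names to a data frame raises an error
failed <- suppressWarnings(
  tryCatch({
    rownames(df) <- labels
    FALSE
  }, error = function(e) TRUE)
)
failed

# A matrix accepts duplicated row names
M <- as.matrix(df)
rownames(M) <- labels
rownames(M)
```

Alternatively, as in the Examples, the data frame can be kept as is and the grouping supplied explicitly through the `group` argument.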

To compute the mean over all the rows of a matrix (i.e. a single group) use `colMeans`, which should be even faster.
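For the single-group case, a small base-R check shows that `colMeans()` gives the same numbers as summing all rows into one group and dividing by the number of rows. The variable names are illustrative only.

```r
onegroup <- matrix(1:40, ncol = 2)

# Direct column means
cm <- colMeans(onegroup)
cm  # 10.5 30.5

# Equivalent single-group computation via rowsum()
single <- rowsum(onegroup * 1.0, rep(1, nrow(onegroup))) / nrow(onegroup)
all.equal(as.vector(single), cm)
```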

## Value

A matrix-like object containing the means by group, with one row per unique value of `group`. If the supplied object explicitly has just one group, base function `colMeans` is called for maximum efficiency, and a numeric vector containing the mean of each column is returned.

## Author(s)

Albert Dorador

## See Also

`rowsum`

## Examples

```r
library(analytics)

# Groups taken from the row names
A <- matrix(1:8, ncol = 2)
rownames(A) <- c("A", "B", "A", "B")
rowmean(A)

# Groups supplied explicitly
B <- matrix(1:40, ncol = 2)
gr <- rep(1:5, 4)
B.mean <- rowmean(B, group = gr)
sum(B.mean[, 1])*4 == sum(B[, 1]) #basic sanity check
sum(B.mean[, 2])*4 == sum(B[, 2]) #basic sanity check

# Data frame input
dfB <- as.data.frame(B)
gr <- rep(1:5, 4)
dfB.mean <- rowmean(dfB, group = gr)

# Large input
numbers <- rnorm(1e7, mean = 3)
C <- matrix(numbers, ncol = 5)
gr <- rep(1:20, 1e5)
rowmean(C, group = gr) # Handles Big Data fast

# Vector input, treated as a column vector
vec <- 1:10
gr <- rep(1:2, 5)
rowmean(vec, gr)

# A single group: each comparison is against one column's mean
onegroup = matrix(1:40, ncol = 2)
gr = rep(1,20)
rowmean(onegroup, gr) == mean(onegroup[,1])
rowmean(onegroup, gr) == mean(onegroup[,2])

# Weighted means over two stacked blocks
numbers <- rnorm(30, mean = 3)
D <- matrix(numbers, ncol = 3)
num_blocks <- 2
gr <- rep(1:5, num_blocks)
rownames(D) <- gr
rowmean(D, w = c(0.1,0.9))
rowmean(D, w = c(0,1))
rowmean(D, w = c(0.5,0.5))
rowmean(D)
```
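The weighted calls in the examples pass one weight per stacked block. One plausible reading, and it is only an assumption about what `w` means, is that every row in block `b` receives weight `w[b]` and each group's mean is the weighted average of its rows. The helper `block_weighted_mean` below is hypothetical and written in base R; it is not part of the package.

```r
set.seed(42)
D <- matrix(rnorm(30, mean = 3), ncol = 3)
num_blocks <- 2
gr <- rep(1:5, num_blocks)

# Hypothetical helper: assumes the blocks are stacked row-wise and equally sized
block_weighted_mean <- function(M, group, w) {
  rows_per_block <- nrow(M) / length(w)
  row_w <- rep(w, each = rows_per_block)   # expand block weights to row weights
  # Weighted column sums per group, divided by the total weight of each group
  rowsum(M * row_w, group) / as.vector(rowsum(row_w, group))
}

wm_unequal <- block_weighted_mean(D, gr, c(0.1, 0.9))
wm_second  <- block_weighted_mean(D, gr, c(0, 1))      # only the second block counts
wm_equal   <- block_weighted_mean(D, gr, c(0.5, 0.5))  # plain group mean
```

Under this reading, equal weights reproduce the unweighted group means, and `w = c(0, 1)` returns the rows of the second block unchanged.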

analytics documentation built on May 2, 2019, 3:37 p.m.