# big_colstats: Standard univariate statistics

In bigstatsr: Statistical Tools for Filebacked Big Matrices


## Standard univariate statistics

### Description

Standard univariate statistics for the columns of a Filebacked Big Matrix. Currently, only `sum` and `var` are implemented; `mean` and `sd` can easily be deduced from them (see Examples).
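How the mean and standard deviation follow from the sum and variance can be sketched in base R. This is only an illustration: it uses an ordinary matrix `M` as a stand-in for an FBM (an assumption made so that the snippet runs without the package) and computes the two statistics with `colSums()` and `apply()` instead of calling `big_colstats()`.

```r
# Base R stand-in: an ordinary matrix instead of an FBM (assumption for illustration)
set.seed(1)
M <- matrix(rnorm(200), nrow = 20, ncol = 10)

# The two column statistics described above, computed here with base R
stats <- data.frame(sum = colSums(M), var = apply(M, 2, var))

# Deduce the mean and the standard deviation from them
n <- nrow(M)               # number of rows the statistics were computed over
means <- stats$sum / n     # mean = sum / n
sds <- sqrt(stats$var)     # sd = sqrt(var)

# Cross-check against direct computation
stopifnot(isTRUE(all.equal(means, colMeans(M))))
stopifnot(isTRUE(all.equal(sds, apply(M, 2, sd))))
```

When only a subset of rows is used, `n` should be the number of rows in that subset, not the total number of rows.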

### Usage

```r
big_colstats(X, ind.row = rows_along(X), ind.col = cols_along(X), ncores = 1)
```

### Arguments

| Argument | Description |
|---|---|
| `X` | An object of class FBM. |
| `ind.row` | An optional vector of the row indices that are used. If not specified, all rows are used. Don't use negative indices. |
| `ind.col` | An optional vector of the column indices that are used. If not specified, all columns are used. Don't use negative indices. |
| `ncores` | Number of cores used. Default doesn't use parallelism. You may use `nb_cores()`. |
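Because negative indices are not accepted, rows or columns to exclude have to be turned into a vector of positive indices first. A minimal sketch of one way to do this in base R follows; the dimensions and the `excluded` vector are hypothetical values chosen for illustration, and the final call is shown only as a comment because it needs an existing FBM `X`.

```r
# Hypothetical number of rows and rows to leave out (illustration only)
n_rows <- 10
excluded <- c(2, 5, 9)

# Positive indices of the rows to keep, used instead of -excluded
keep <- setdiff(seq_len(n_rows), excluded)
print(keep)

# With an FBM `X` that has 10 rows, the result could then be passed on, e.g.
# big_colstats(X, ind.row = keep)
```

The same approach applies to `ind.col`, with the number of columns in place of `n_rows`.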

### Value

A data frame with two numeric columns, `sum` and `var`, holding the corresponding column statistics.

### See Also

`colSums`, `apply`

### Examples

```r
set.seed(1)

X <- big_attachExtdata()

# Check the results
str(test <- big_colstats(X))

# Only with the first 100 rows
ind <- 1:100
str(test2 <- big_colstats(X, ind.row = ind))
plot(test$sum, test2$sum)
abline(lm(test2$sum ~ test$sum), col = "red", lwd = 2)

X.ind <- X[ind, ]
all.equal(test2$sum, colSums(X.ind))
all.equal(test2$var, apply(X.ind, 2, var))

# deduce mean and sd
# note that they are also implemented in big_scale()
means <- test2$sum / length(ind) # if using all rows,