cor_columns: Number of columns which are highly correlated to other...


Description

Number of columns which are highly correlated to other columns

Usage

cor_columns(x, abs_cutoff = 0.5, size = 1000, mc = 1, ...)

Arguments

x

a matrix, correlation is calculated by columns

abs_cutoff

cutoff for the absolute correlation. It can be a numeric vector containing more than one cutoff.

size

size of each block, i.e. the number of columns per block

mc

number of cores used when the blocks are processed in parallel

...

further arguments passed to cor

Details

For each column, the function counts the other columns whose absolute correlation coefficient with it is larger than abs_cutoff. The calculation involves the pairwise correlation of all columns in the matrix. When the matrix has a very large number of columns, the full set of pairwise correlations is too large for R to store as a single vector. This function avoids the problem by splitting the columns into k blocks and processing the blocks sequentially or in parallel.
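The block-wise idea described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions, not the package's actual implementation: the helper names split_by_block and count_cor_blockwise are hypothetical, only a single cutoff is handled, and the blocks are processed sequentially (no mc argument).

```r
# Hypothetical helper: split column indices 1..n into consecutive blocks
# holding at most `size` indices each.
split_by_block = function(n, size) {
    split(seq_len(n), ceiling(seq_len(n) / size))
}

# Hypothetical sketch: for each column of x, count the other columns whose
# absolute correlation with it is larger than `cutoff`.
# `...` is forwarded to stats::cor (e.g. method = "spearman").
count_cor_blockwise = function(x, cutoff = 0.5, size = 1000, ...) {
    blocks = split_by_block(ncol(x), size)
    counts = numeric(ncol(x))
    for (bi in blocks) {
        for (bj in blocks) {
            # Only a length(bi) x length(bj) slice of the full correlation
            # matrix is held in memory at any time.
            cm = abs(cor(x[, bi, drop = FALSE], x[, bj, drop = FALSE], ...))
            hit = cm > cutoff
            # A column must not be counted as correlated with itself.
            hit[outer(bi, bj, "==")] = FALSE
            counts[bi] = counts[bi] + rowSums(hit, na.rm = TRUE)
        }
    }
    counts
}

# Small usage example: 23 columns processed in blocks of 5.
set.seed(1)
m = matrix(rnorm(50 * 23), nrow = 50)
count_cor_blockwise(m, cutoff = 0.2, size = 5)
```

Each pair of blocks is visited twice here (once in each order), which keeps the sketch short at the cost of redundant work. A tuned version could visit only the upper triangle of block pairs and update both blocks at once.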

The code is partially adapted from https://rmazing.wordpress.com/2013/02/22/bigcor-large-correlation-matrices-in-r/

Value

A matrix giving, for each column, the number of other columns that correlate with it above the given absolute correlation cutoff(s).

Author(s)

Zuguang Gu <z.gu@dkfz.de>

Examples

## Not run: 
mat = matrix(rnorm(20000 * 20), ncol = 20000, nrow = 20)
cor_columns(mat, abs_cutoff = c(0.5, 0.6, 0.7))

## End(Not run)

jokergoo/epik documentation built on Sept. 28, 2019, 9:20 a.m.