fastCor is a helper function that computes the Pearson correlation matrix
for the HiClimR and validClimR functions. It is similar
to the cor function in R but uses a faster implementation on 64-bit
machines (an optimized
BLAS library is highly recommended).
fastCor also uses a memory-efficient algorithm that allows splitting the data matrix and
computing only the upper-triangular part of the correlation matrix. It can be used to
compute the correlation matrix for the columns of any data matrix.
integer number greater than or equal to one, giving the number of splits (nSplit) to divide the data matrix into
logical; if TRUE (upperTri = TRUE), only the upper-triangular half of the correlation matrix is computed
logical; whether to print processing information
The fastCor function computes the correlation matrix by
calling the cross product function in the Basic Linear Algebra Subroutines
(BLAS) library used by R. A significant performance improvement can be
achieved when building R on 64-bit machines with an optimized BLAS library,
such as ATLAS, OpenBLAS, or the commercial Intel MKL.
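As a rough illustration of the idea (not the package's actual source), a Pearson correlation matrix can be obtained from a single cross product once each column has been centered and scaled to unit length. The helper name crossprodCor is hypothetical, and the sketch assumes a numeric matrix with no missing values and no constant columns.

```r
## Hypothetical helper (an assumption, not part of HiClimR):
## Pearson correlation of the columns of x via one cross product.
crossprodCor <- function(x) {
  x <- as.matrix(x)
  ## Center each column on its mean
  xc <- sweep(x, 2, colMeans(x), "-")
  ## Scale each centered column to unit Euclidean length
  xs <- sweep(xc, 2, sqrt(colSums(xc^2)), "/")
  ## t(xs) %*% xs, evaluated by the BLAS library R is linked against
  crossprod(xs)
}

## Compare with base R's cor() on random data
set.seed(1)
x <- matrix(rnorm(200 * 5), nrow = 200, ncol = 5)
all.equal(crossprodCor(x), cor(x))
```

Because the heavy lifting is a single matrix product, any speed-up from an optimized BLAS carries over directly to this kind of computation.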
For big data, the memory required to allocate the square matrix of correlations
may exceed the total amount of physical memory available, resulting in
“Error: cannot allocate vector of size...”.
fastCor allows splitting the data matrix into
nSplit splits and computes only the
upper-triangular part of the correlation matrix with
upperTri = TRUE.
This almost halves memory use, which can be very important for big data.
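To put the saving in numbers, the sketch below estimates the storage needed for the correlations of N variables at 8 bytes per double-precision value. The helper corMemoryGB is hypothetical, and it assumes that the upper-triangular half excludes the diagonal, which is an assumption about the storage layout rather than something stated here.

```r
## Hypothetical helper (an assumption, not part of HiClimR):
## approximate memory, in GiB, for the correlations of N variables.
corMemoryGB <- function(N, upperTri = FALSE) {
  ## Number of stored values: full square matrix, or strict upper triangle
  nValues <- if (upperTri) N * (N - 1) / 2 else N^2
  ## 8 bytes per double, converted to GiB
  nValues * 8 / 1024^3
}

corMemoryGB(50000)                   # full N-by-N matrix
corMemoryGB(50000, upperTri = TRUE)  # upper-triangular half only
```

For large N the second figure is just under half of the first, since N(N-1)/2 approaches N^2/2.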
If nSplit > 1, the correlation matrix (or the upper-triangular part if
upperTri = TRUE) will be allocated and filled with the computed correlation
sub-matrix for each split. The first
n-1 splits have equal size, while
the last split may include any remaining columns.
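The block-wise filling described above might look like the following sketch. It is not the package's implementation: the function name splitCor and the way split boundaries are chosen are assumptions made for illustration, and the code assumes complete data (no NA values), no constant columns, and nSplit no larger than the number of columns.

```r
## Hypothetical sketch (an assumption, not HiClimR's code): fill the
## correlation matrix one pair of column splits at a time.
splitCor <- function(x, nSplit = 1) {
  x <- as.matrix(x)
  N <- ncol(x)
  stopifnot(nSplit >= 1, nSplit <= N)

  ## Standardize columns once: center, then scale to unit length
  xc <- sweep(x, 2, colMeans(x), "-")
  xs <- sweep(xc, 2, sqrt(colSums(xc^2)), "/")

  ## The first nSplit - 1 splits have equal size; the last takes the rest
  size   <- N %/% nSplit
  starts <- (seq_len(nSplit) - 1) * size + 1
  ends   <- c(seq_len(nSplit - 1) * size, N)

  r <- matrix(NA_real_, N, N)
  for (i in seq_len(nSplit)) {
    I <- starts[i]:ends[i]
    for (j in i:nSplit) {
      J <- starts[j]:ends[j]
      ## Correlation sub-matrix between split i and split j
      block <- crossprod(xs[, I, drop = FALSE], xs[, J, drop = FALSE])
      r[I, J] <- block
      r[J, I] <- t(block)  # mirror into the lower triangle
    }
  }
  r
}

set.seed(1)
x <- matrix(rnorm(100 * 10), nrow = 100, ncol = 10)
all.equal(splitCor(x, nSplit = 3), cor(x))
```

Only blocks with j >= i are computed, so an upper-triangular variant would simply skip the mirroring step; each cross product works on a slice of the columns rather than the whole matrix at once.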
An (N rows by
N columns) correlation matrix.
Hamada Badr <email@example.com>, Ben Zaitchik <firstname.lastname@example.org>, and Amin Dezfuli <email@example.com>.
Hamada S. Badr, Zaitchik, B. F. and Dezfuli, A. K. (2015): A Tool for Hierarchical Climate Regionalization, Earth Science Informatics, 1-10, http://dx.doi.org/10.1007/s12145-015-0221-7.
Hamada S. Badr, Zaitchik, B. F. and Dezfuli, A. K. (2014): Hierarchical Climate Regionalization, CRAN, http://cran.r-project.org/package=HiClimR.
bigcor: Large correlation matrices in R, https://rmazing.wordpress.com.
require(HiClimR)

## Load test case data
x <- TestCase$x

## Use fastCor function to compute the correlation matrix
t0 <- proc.time() ; xcor <- fastCor(t(x)) ; proc.time() - t0
## compare with cor function
t0 <- proc.time() ; xcor0 <- cor(t(x)) ; proc.time() - t0

## Not run: 
## Split the data into 10 splits and return upper-triangular half only
xcor10 <- fastCor(t(x), nSplit = 10, upperTri = TRUE)

## End(Not run)