Performance enhancements and SciDB matrix support for the s4vd biclustering method of Lee, Shen, Huang and Marron (Biclustering via Sparse Singular Value Decomposition, M. Lee, H. Shen, J. Huang, J. S. Marron, Biometrics 66, pp. 1087-1095, December 2010).
The package vignette summarizes the modifications: s4vdp4.pdf
NOTE: This package relies on recent versions of the scidb package for R.
Install this package directly from GitHub using the devtools package:
library("devtools")
install_github(repo="s4vdp4", username="Paradigm4")
Install the latest SciDB package for R with:
install_github(repo="SciDBR", username="Paradigm4", ref="laboratory", quick=TRUE)
library("s4vd")
data(lung)
A = lung[1:2000,]
cat("Starting standard s4vd\n")
t1 = proc.time()
x = biclust(A, method=BCssvd, K=1)
print(proc.time()-t1)
library("s4vdp4")
cat("In-memory P4 s4vd\n")
X = A
t1 = proc.time()
x1 = biclust(X, method=BCssvd, K=1)
print(proc.time()-t1)
cat("Partly in-database P4 s4vd\n")
X = as.scidb(A)
t1 = proc.time()
x2 = biclust(X, method=BCssvd, K=1)
print(proc.time()-t1)
We've moved the large matrix-vector products and matrix factorizations into SciDB. However, two vectors are returned to R for additional processing in each iteration, and this back-and-forth data transfer is a bottleneck. We'd like to move more (all?) of the algorithm into SciDB in a future version, while still scripting the overall program from R.
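To make the per-iteration traffic concrete, here is a minimal base-R sketch of a rank-1 sparse SVD iteration. It is not the s4vdp4 implementation: the function name `ssvd_rank1` and the fixed penalties `lambda_u` and `lambda_v` are hypothetical, and the sketch uses plain soft-thresholding rather than the penalty selection used by the actual method. It only illustrates which steps are large matrix-vector products (candidates for in-database execution) and which steps operate on the two vectors u and v.

```r
# Soft-thresholding: shrink each entry toward zero by lambda.
soft <- function(z, lambda) sign(z) * pmax(abs(z) - lambda, 0)

# Hypothetical rank-1 sparse SVD sketch with fixed penalties (an assumption).
ssvd_rank1 <- function(X, lambda_u = 0, lambda_v = 0, iter = 100, tol = 1e-8) {
  s <- svd(X, nu = 1, nv = 1)          # matrix factorization (start values)
  u <- s$u[, 1]
  v <- s$v[, 1]
  for (i in seq_len(iter)) {
    # Large matrix-vector product, then vector-only post-processing of v.
    v_new <- soft(drop(crossprod(X, u)), lambda_v)
    if (all(v_new == 0)) stop("lambda_v is too large: v is entirely zero")
    v_new <- v_new / sqrt(sum(v_new^2))
    # Large matrix-vector product, then vector-only post-processing of u.
    u_new <- soft(drop(X %*% v_new), lambda_u)
    if (all(u_new == 0)) stop("lambda_u is too large: u is entirely zero")
    u_new <- u_new / sqrt(sum(u_new^2))
    delta <- max(sqrt(sum((u_new - u)^2)), sqrt(sum((v_new - v)^2)))
    u <- u_new
    v <- v_new
    if (delta < tol) break
  }
  list(u = u, v = v, d = drop(crossprod(u, X %*% v)))
}

# Synthetic data with a planted 20 x 5 block (illustration only).
set.seed(1)
X <- matrix(rnorm(200 * 30), 200, 30)
X[1:20, 1:5] <- X[1:20, 1:5] + 5
fit <- ssvd_rank1(X, lambda_u = 1, lambda_v = 1)
rows <- which(fit$u != 0)   # rows selected for the bicluster
cols <- which(fit$v != 0)   # columns selected for the bicluster
```

In this sketch only the two products involving X touch the full matrix. The thresholding and normalization act on vectors of length nrow(X) and ncol(X), which corresponds to the part that currently travels between SciDB and R in each iteration.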