Description Details Author(s) See Also Examples
This package provides a parallel shared-memory programming paradigm for R, very similar to threaded programming in C/C++. This enables the programmer to write simpler, clearer code. Furthermore, in some applications this package produces significantly faster code, compared to versions written for other parallel R libraries. It also allows placing very large matrices in secondary storage, while treating them as being in shared memory.
Package: | Rdsm |
Type: | Package |
Version: | 2.1.1 |
Date: | 2014-02-16 |
License: | GPL (>= 2) |
List of functions:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 | initialization, run at manager:
mgrinit(): initialize system
mgrmakevar(): create a shared variable
mgrmakelock(): create a lock
makebarr(): create a barrier
called by applications:
barr(): barrier call
rdsmlock(): lock operation (via realrdsmlock())
rdsmunlock(): unlock operation (via realrdsmunlock())
application utilities:
getidxs(): partition a set of indices for work assignment
getmatrix(): allow a matrix to be referenced regardless of
whether it is specified as a bigmemory object,
a bigmemory descriptor, or via a quoted name
stoprdsm() shut down cluster and clean up files
|
Built-in variables accessible by the threads, at the worker nodes:
1 2 3 | myinfo$nwrkrs: total number of threads
myinfo$id: this thread's ID number
|
To run, set up a cluster via the parallel package); we'll refer to
the R process from which this is done as the manager; the processes
running in the cluster will be called workers. Create the application's
shared variables from the manager, using mgrmakevar()
. Launch
the worker threads, again from the manager, by the parallel call
clusterEvalQ()
or clusterCall()
. One typically codes so
that the results are in shared variables. See examples below, and more
in the examples/
directory in this distribution.
The shared variables are read to/written by any of the workers and the
manager. In fact, while an Rdsm application is running, other R
processes on the same machine (or a different machine sharing the same
file system, if the variables are filebacked) can access the shared
variables. See the file ExternalAccess.txt
in the doc/
directory.
Rdsm uses the bigmemory library to store its shared variables. Though the latter can work on a (physical) cluster of several machines sharing a file system, Rdsm does not run on such systems at this time.
Further documentation in the doc/
directory.
Norm Matloff <matloff@cs.ucdavis.edu>
mgrinit
,
mgrmakevar
,
mgrmakelock
,
barr
,
rdsmlock
,
rdsmunlock
,
getidxs
,
getmatrix
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 | library(parallel)
c2 <- makeCluster(2) # form 2-thread Snow cluster
mgrinit(c2) # initialize Rdsm
mgrmakevar(c2,"m",2,2) # make a 2x2 shared matrix
m[,] <- 3 # 2x2 matrix of all 3s
# example of shared memory:
# at each thread, set id to Rdsm built-in ID variable for that thread
clusterEvalQ(c2,id <- myinfo$id)
clusterEvalQ(c2,m[1,id] <- id^2) # assignment executed by each thread
m[,] # top row of m should now be (1,4)
# matrix multiplication; the product u %*% v is computed, product
# placed in w
# note again: mmul() call will be executed by each thread
mmul <- function(u,v,w) {
require(parallel)
# decide which rows of u this thread will work on
myidxs <- splitIndices(nrow(u),myinfo$nwrkrs)[[myinfo$id]]
# multiply this thread's part of u with v, placing the product in the
# corresponding part of w
w[myidxs,] <- u[myidxs,] %*% v[,]
invisible(0)
}
# create test matrices
mgrmakevar(c2,"a",6,2)
mgrmakevar(c2,"b",2,6)
mgrmakevar(c2,"c",6,6)
# give them values
a[,] <- 1:12
b[,] <- 1 # all 1s
clusterExport(c2,"mmul") # send mmul() to the threads
clusterEvalQ(c2,mmul(a,b,c)) # run the threads
c[,] # check results
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.