Most parallel R workflows duplicate large objects into every worker process, which wastes RAM and time. memshare stores big objects once in shared memory and lets workers attach to them as ordinary R vectors/matrices via ALTREP views. You also get apply/lapply-style APIs that manage sharing for you.
This vignette is a quick, practical guide; for technical details, see [Thrun and Märte, 2025].
install.packages("memshare")                      # CRAN
# remotes::install_github("yourname/memshare")    # dev
Requirements: R ≥ 4.0, C++17 toolchain.
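Before installing, it can help to confirm that the session meets the R version requirement. The snippet below is a minimal sketch using only base R; it does not check the C++17 toolchain, which is only needed when building from source.

```r
# R version check: memshare needs R >= 4.0 (per the requirements above)
r_ok <- getRversion() >= "4.0.0"

# TRUE if memshare is already installed in the current library paths
pkg_ok <- requireNamespace("memshare", quietly = TRUE)

message("R >= 4.0: ", r_ok, "; memshare installed: ", pkg_ok)
```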
Matrix apply with shared memory (memApply)
library(memshare)
set.seed(1)
n <- 10000; p <- 2000
X <- matrix(rnorm(n * p), n, p)   # numeric/double matrix
y <- rnorm(n)

# Correlate each column with y, in parallel, without copying X to workers
res <- memApply(
  X = X, MARGIN = 2,
  FUN = function(v, y) cor(v, y),
  VARS = list(y = y)              # shared side data
)
str(res)
What happened?
X and y were placed in shared memory; workers received views (ALTREP) instead of copies.
Each worker extracted the i-th column as v, ran FUN(v, y), and returned a result. All views were released automatically at the end.
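The per-column step described above can be written out serially in base R. This is only an illustration of the semantics (extract column i as v, call FUN(v, y)), not memshare's implementation; the small sizes and the name `serial` are choices made for this sketch. The result also gives a reference to compare memApply output against.

```r
set.seed(1)
n <- 200; p <- 5                      # small sizes so this runs instantly
X <- matrix(rnorm(n * p), n, p)
y <- rnorm(n)
FUN <- function(v, y) cor(v, y)

# What each worker does for its share of columns, written as a serial loop
serial <- vapply(seq_len(ncol(X)), function(i) FUN(X[, i], y), numeric(1))

# Same computation via base apply(), passing y as an extra argument to FUN
reference <- apply(X, 2, FUN, y = y)
stopifnot(isTRUE(all.equal(serial, reference)))
```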
List apply with shared memory (memLapply)
list_length <- 1000
d <- 200
L <- lapply(1:list_length, function(i) matrix(rnorm(d * d), d, d))
w <- rnorm(d)

ans <- memLapply(L, function(el, w) el %*% w, VARS = list(w = w))
length(ans); dim(ans[[1]])
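To see why sharing matters at this size, the arithmetic below estimates the raw payload of the list and what it would cost if every worker held its own copy. The worker count of 8 is a hypothetical value for illustration, and the estimate ignores R's small per-object headers.

```r
list_length <- 1000
d <- 200
bytes_per_double <- 8                       # R doubles are 8 bytes each
payload <- list_length * d * d * bytes_per_double
workers <- 8                                # hypothetical worker count

to_mb <- function(b) b / 1e6
cat("one copy:", to_mb(payload), "MB\n")
cat("copied into each of", workers, "workers:", to_mb(payload * workers), "MB\n")
```

With shared memory the list is held once (about 320 MB here) regardless of the number of workers.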
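ns <- "demo"
X <- matrix(rnorm(1e6), 1000, 1000)
registerVariables(ns, list(X = X))   # place X in shared memory
vw <- retrieveViews(ns, "X")         # named list of ALTREP views
mean(vw$X[, 1])
releaseViews(ns, "X")                # drop the view first
releaseVariables(ns, "X")            # then free the shared object
Shared objects live in a namespace (here "demo"). Unload the package (or release views/variables) to clean up. Memory is freed once no views remain.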
score <- function(v, a, b) sum((v - a)^2) / (1 + b)   # any column-wise work
ns <- "scores"
a <- rnorm(nrow(X))   # same length as a column of X
b <- runif(1)
out <- memApply(X = X, MARGIN = 2, FUN = score,
                VARS = list(a = a, b = b), NAMESPACE = ns)
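Because the names in VARS have to line up with FUN's argument names, a quick pre-flight check can catch typos before any workers start. The helper `missing_args` below is hypothetical (it is not part of memshare) and uses only base R; it works for ordinary closures, not for primitive functions, whose formals are NULL.

```r
# Hypothetical helper: names in VARS that FUN does not accept as arguments
missing_args <- function(FUN, VARS) {
  setdiff(names(VARS), names(formals(FUN)))
}

score <- function(v, a, b) sum((v - a)^2) / (1 + b)

missing_args(score, list(a = 1, b = 2))      # character(0): all names match
missing_args(score, list(a = 1, beta = 2))   # "beta": would not bind to b
```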
Reuse the same namespace to avoid re-registering large objects.
ns <- "reuse"
registerVariables(ns, list(X = X))
pass1 <- memApply("X", 2, function(v) sd(v), NAMESPACE = ns)
pass2 <- memApply("X", 2, function(v) mean(v), NAMESPACE = ns)
releaseVariables(ns, "X")
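When several passes share one registration, an error in any pass would skip the final releaseVariables() call. One way to guard against that is a small wrapper built on base R's on.exit(). The helper `with_registered` is hypothetical and not part of memshare; it only calls the functions already shown in this vignette, and the usage part runs only if memshare is installed.

```r
# Hypothetical helper: register `vars` in namespace `ns`, run body_fun(ns),
# and always release the variables afterwards, even if body_fun() errors.
with_registered <- function(ns, vars, body_fun) {
  memshare::registerVariables(ns, vars)
  on.exit(memshare::releaseVariables(ns, names(vars)), add = TRUE)
  body_fun(ns)
}

if (requireNamespace("memshare", quietly = TRUE)) {
  X <- matrix(rnorm(1e4), 100, 100)
  passes <- with_registered("reuse", list(X = X), function(ns) {
    list(
      sd   = memshare::memApply("X", 2, function(v) sd(v),   NAMESPACE = ns),
      mean = memshare::memApply("X", 2, function(v) mean(v), NAMESPACE = ns)
    )
  })
}
```

The design choice here is to tie the lifetime of the shared object to a single function call, so cleanup does not depend on reaching the last line of a script.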
FUN's first argument must be the vector/list element (v for memApply, el for memLapply).
VARS must use exactly the same names as the corresponding arguments in FUN's signature.
Use clusterExport for small copied objects; big ones belong in VARS.
Call releaseViews() in workers (handled automatically by memApply/memLapply), and releaseVariables() in the master when done.

X is a numeric matrix (double) or a character name of a registered object; VARS is either a named list (to register) or a character vector of existing names.
Check viewList() in workers; any remaining views prevent releaseVariables() from reclaiming memory.
If NAMESPACE is missing and FUN is an inline lambda, the default namespace is "unnamed". Prefer an explicit NAMESPACE in production.

registerVariables(namespace, variableList): put objects into shared memory.
retrieveViews(namespace, variableNames): get ALTREP views (workers).
releaseViews(namespace, variableNames): release worker views.
releaseVariables(namespace, variableNames): free objects (master).
memApply(X, MARGIN, FUN, NAMESPACE = NULL, VARS = NULL, MAX.CORES = NULL): matrix apply with shared memory.
memLapply(X, FUN, NAMESPACE = NULL, VARS = NULL, MAX.CORES = NULL): list apply with shared memory.

[Thrun and Märte, 2025] Thrun, M.C., Märte, J.: Memshare: Memory Sharing for Multicore Computation in R with an Application to Feature Selection by Mutual Information using PDE, The R Journal, in revision, 2025.