shard_crossprod: Parallel crossprod() using shard views + output buffers

View source: R/kernels.R

shard_crossprod    R Documentation

Description

Computes crossprod(X, Y) (i.e. t(X) %*% Y) using:

  • shared/mmap-backed inputs (one copy),

  • block views (no slice materialization),

  • BLAS-3 dgemm in each tile,

  • an explicit shared output buffer (no gather/bind spikes).
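The tiling idea in the list above can be illustrated in base R. This is only a serial sketch, not the package's implementation: col_blocks() and tiled_crossprod() are hypothetical helpers, an ordinary matrix stands in for the shared output buffer, and X[, i] copies each slice where the real function is described as using views.

```r
# Hypothetical helper: split 1..n into contiguous column blocks of at most
# `size` columns (the last block may be shorter).
col_blocks <- function(n, size) {
  starts <- seq(1L, n, by = size)
  lapply(starts, function(s) s:min(s + size - 1L, n))
}

# Hypothetical serial stand-in for the tiled computation.
tiled_crossprod <- function(X, Y, block_x, block_y) {
  stopifnot(is.matrix(X), is.matrix(Y), nrow(X) == nrow(Y))
  out <- matrix(0, ncol(X), ncol(Y))   # preallocated ncol(X) x ncol(Y) result
  for (i in col_blocks(ncol(X), block_x)) {
    for (j in col_blocks(ncol(Y), block_y)) {
      # Each tile is an independent product t(X[, i]) %*% Y[, j],
      # written straight into its own region of the output.
      out[i, j] <- crossprod(X[, i, drop = FALSE], Y[, j, drop = FALSE])
    }
  }
  out
}

set.seed(1)
X <- matrix(rnorm(2000), 100, 20)
Y <- matrix(rnorm(2000), 100, 20)
Z <- tiled_crossprod(X, Y, block_x = 8, block_y = 10)

# The tiled result agrees with the direct computation.
stopifnot(isTRUE(all.equal(Z, crossprod(X, Y))))
```

Because every tile writes to a disjoint block of the output, tiles can be computed in any order without a final gather or bind step, which is presumably what makes the pattern suitable for parallel workers.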

Usage

shard_crossprod(
  X,
  Y,
  workers = NULL,
  block_x = "auto",
  block_y = "auto",
  backing = c("mmap", "shm"),
  materialize = c("auto", "never", "always"),
  materialize_max_bytes = 512 * 1024^2,
  diagnostics = TRUE
)

Arguments

X, Y

Double matrices with the same number of rows.

workers

Number of worker processes.

block_x, block_y

Tile sizes over ncol(X) and ncol(Y). Use "auto" (default) to autotune on the current machine.
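As a rough guide to how explicit tile sizes translate into work items, the sketch below counts tiles for given block sizes. It assumes contiguous column blocks with a shorter final block, and that a block size larger than the column count yields a single block; tile_grid() is a hypothetical helper, not a shard function.

```r
# Hypothetical helper: number of tiles along each dimension and in total,
# for p = ncol(X) and v = ncol(Y).
tile_grid <- function(p, v, block_x, block_y) {
  x_tiles <- ceiling(p / block_x)
  y_tiles <- ceiling(v / block_y)
  c(x_tiles = x_tiles, y_tiles = y_tiles, total = x_tiles * y_tiles)
}

# Two 20-column matrices with block_x = 50 and block_y = 10:
grid <- tile_grid(p = 20, v = 20, block_x = 50, block_y = 10)
grid
# x_tiles = 1, y_tiles = 2, total = 2
```

Under these assumptions, larger blocks mean fewer and bigger matrix products, while smaller blocks give more tiles to spread across workers.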

backing

Backing for shared inputs and output buffer ("mmap" or "shm").

materialize

Whether to return the result as a standard R matrix: "never" (return buffer handle), "always", or "auto" (materialize if estimated output size is below materialize_max_bytes).

materialize_max_bytes

Threshold for "auto" materialization.
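To get a feel for the default threshold, the following sketch applies the documented "auto" rule to a few output shapes. It assumes the size estimate is 8 bytes per double element of the ncol(X) x ncol(Y) result and a strict "below" comparison; would_materialize() is a hypothetical helper, and the package's actual estimate may differ.

```r
# Hypothetical helper: TRUE if a p x v double matrix is estimated to be
# smaller than max_bytes (default mirrors materialize_max_bytes).
would_materialize <- function(p, v, max_bytes = 512 * 1024^2) {
  est_bytes <- as.numeric(p) * as.numeric(v) * 8   # 8 bytes per double
  est_bytes < max_bytes
}

small <- would_materialize(20, 20)        # 3200 bytes: TRUE
large <- would_materialize(10000, 10000)  # 8e8 bytes vs 536870912: FALSE
```

Under this assumption the default of 512 * 1024^2 bytes corresponds to a square result of just under 8192 x 8192 doubles.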

diagnostics

Whether to collect shard_map diagnostics.

Details

This function is intended as an ergonomic entry point for the "wow" path: for common patterns, users should not have to call share(), view_block(), buffer(), tiles2d(), and shard_map() by hand.

Value

A list with:

  • buffer: shard_buffer for the result (p x v, where p = ncol(X) and v = ncol(Y))

  • value: materialized matrix if requested, otherwise NULL

  • run: the underlying shard_result from shard_map

  • tile: chosen tile sizes

Examples


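# Two 100 x 20 double matrices with matching row counts
X <- matrix(rnorm(2000), 100, 20)
Y <- matrix(rnorm(2000), 100, 20)
# Explicit tile sizes instead of "auto", computed with two worker processes
res <- shard_crossprod(X, Y, block_x = 50, block_y = 10, workers = 2)
# Stop the worker pool once the computation has finished
pool_stop()
# 20 x 20 result; materialized under the default materialize = "auto"
# because 20 * 20 doubles is far below materialize_max_bytes
res$value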


shard documentation built on April 3, 2026, 9:08 a.m.