| shard_map | R Documentation |
Core parallel execution engine with supervision, shared inputs, and output buffers.
Executes a function over shards in parallel with worker supervision, shared inputs, and explicit output buffers. This is the primary entry point for shard's parallel execution model.
shard_map(
shards,
fun = NULL,
borrow = list(),
out = list(),
kernel = NULL,
scheduler_policy = NULL,
autotune = NULL,
dispatch_mode = c("rpc_chunked", "shm_queue"),
dispatch_opts = NULL,
workers = NULL,
chunk_size = 1L,
profile = c("default", "memory", "speed"),
mem_cap = "2GB",
recycle = TRUE,
cow = c("deny", "audit", "allow"),
seed = NULL,
diagnostics = TRUE,
packages = NULL,
init_expr = NULL,
timeout = 3600,
max_retries = 3L,
health_check_interval = 10L
)
shards |
A |
fun |
Function to execute per shard. Receives the shard descriptor
as first argument, followed by borrowed inputs and outputs. You can also
select a registered kernel via |
borrow |
Named list of shared inputs. These are exported to workers once and reused across shards. Treated as read-only by default. |
out |
Named list of output buffers (from |
kernel |
Optional. Name of a registered kernel (see |
scheduler_policy |
Optional list of scheduling hints (advanced). Currently:
|
autotune |
Optional. Online autotuning for scalar-N sharding (advanced).
When Accepted values:
|
dispatch_mode |
Dispatch mode (advanced). |
dispatch_opts |
Optional list of dispatch-mode specific knobs (advanced). Currently:
|
workers |
Integer. Number of worker processes. If NULL, uses existing
pool or creates one with |
chunk_size |
Integer. Shards to batch per worker dispatch (default 1). Higher values reduce RPC overhead but may hurt load balancing. |
profile |
Execution profile: |
mem_cap |
Memory cap per worker (e.g., "2GB"). Workers exceeding this are recycled. |
recycle |
Logical or numeric. If TRUE, recycle workers on RSS drift. If numeric, specifies drift threshold (default 0.5 = 50% growth). |
cow |
Copy-on-write policy for borrowed inputs: |
seed |
Integer. RNG seed for reproducibility. If NULL, no seed is set. |
diagnostics |
Logical. Collect detailed diagnostics (default TRUE). |
packages |
Character vector. Additional packages to load in workers. |
init_expr |
Expression to evaluate in each worker on startup. |
timeout |
Numeric. Seconds to wait for each shard (default 3600). |
max_retries |
Integer. Maximum retries per shard on failure (default 3). |
health_check_interval |
Integer. Check worker health every N shards (default 10). |
A shard_result object containing:
results: List of results from each shard (if fun returns values)
failures: Any permanently failed shards
diagnostics: Timing, memory, and worker statistics
pool_stats: Pool-level statistics
blocks <- shards(1000, workers = 2)
result <- shard_map(blocks, function(shard) {
sum(shard$idx^2)
}, workers = 2)
pool_stop()
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.