get.cluster.worker.subsets: Get subsets to be distributed to workers


View source: R/helpers.r

Description

Get subsets to be distributed to workers.

Usage

get.cluster.worker.subsets(
  num.vals,
  dim.size,
  dim.axes,
  axis.to.split.on,
  min.num.chunks = 1
)

Arguments

num.vals

The maximum number of values to process at once.

dim.size

The sizes of the dimensions of the data to be processed.

dim.axes

The axes of the data, as returned by nc.get.dim.axes.

axis.to.split.on

The axis (X, Y, T, etc.) to split the data on.

min.num.chunks

The minimum number of chunks to generate, even if the chunks are considerably smaller than num.vals.

Details

Given the maximum number of values to process at once (num.vals), the sizes of the dimensions (dim.size), the corresponding axes (dim.axes), the axis to split on (axis.to.split.on), and optionally the minimum number of chunks to return (min.num.chunks), this function returns a list of lists of subsets suitable for passing to nc.put.var.subsets.by.axes or nc.get.var.subsets.by.axes.

This is useful when you want to limit memory consumption while still reading as much data as possible at one time, to make the best use of available I/O bandwidth.
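To make the chunking idea concrete, here is a minimal base R sketch of one way such a split could be computed. It is not the package's implementation: the helper name sketch.axis.chunks, the even spacing of chunk boundaries, and the exact shape of the returned list are assumptions made for illustration only.

```r
# Hypothetical helper, NOT part of ncdf4.helpers: split one axis into
# contiguous index ranges so that each chunk holds at most roughly
# num.vals values (never fewer than one slice per chunk).
sketch.axis.chunks <- function(num.vals, dim.size, dim.axes, axis.to.split.on,
                               min.num.chunks = 1) {
  # Position of the dimension that carries the requested axis.
  axis.idx <- which(dim.axes == axis.to.split.on)
  stopifnot(length(axis.idx) == 1)
  axis.len <- dim.size[axis.idx]

  # Number of values in one slice along the split axis
  # (product of the sizes of all other dimensions).
  vals.per.slice <- prod(dim.size[-axis.idx])

  # How many slices fit within num.vals; always at least one.
  slices.per.chunk <- max(1, floor(num.vals / vals.per.slice))

  # Honour min.num.chunks, but never produce more chunks than slices.
  num.chunks <- max(min.num.chunks, ceiling(axis.len / slices.per.chunk))
  num.chunks <- min(num.chunks, axis.len)

  # Spread the chunk boundaries as evenly as possible along the axis.
  starts <- floor((0:(num.chunks - 1)) * axis.len / num.chunks) + 1
  ends <- c(starts[-1] - 1, axis.len)

  # One list per chunk, keyed by the axis name (assumed return shape).
  lapply(seq_len(num.chunks), function(i) {
    chunk <- list(starts[i]:ends[i])
    names(chunk) <- axis.to.split.on
    chunk
  })
}

axes <- c(lon = "X", lat = "Y", time = "T")
chunks.y <- sketch.axis.chunks(1E7, c(128, 64, 50000), axes, "Y")
chunks.t <- sketch.axis.chunks(1E7, c(128, 64, 50000), axes, "T")
length(chunks.y)
length(chunks.t)
```

In this sketch, splitting on Y gives one chunk per latitude row, because a single Y slice (128 x 50000 values) is already more than half of num.vals. Splitting on T packs many time steps into each chunk, because a single T slice holds only 128 x 64 values.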

Value

A list of lists describing subsets in a suitable form to be passed to nc.put.var.subsets.by.axes or nc.get.var.subsets.by.axes.

Examples

## Get subsets for an example 128 x 64 x 50000 (X, Y, T) grid, split along the Y axis
subsets <- get.cluster.worker.subsets(1E7, c(128, 64, 50000),
                                      c(lon="X", lat="Y", time="T"), "Y")

ncdf4.helpers documentation built on Oct. 15, 2021, 5:19 p.m.