chunk_map: Apply functions to all partitions, but small chunks each time

View source: R/generics.R

chunk_map    R Documentation

Description

Apply a function across all partitions of x, processing only a small chunk at a time.

Usage

chunk_map(x, map_fun, reduce, max_nchunks, chunk_size, ...)

Arguments

x

a LazyArray or R array

map_fun

function to apply to each chunk

reduce

similar to reduce in partition_map

max_nchunks

maximum number of chunks. If the number of chunks would exceed this limit, chunk_size is recalculated.

chunk_size

integer chunk size. If chunk_size is too small (so that it would produce too many chunks), it is ignored.

...

ignored or passed to other methods
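
The interplay between chunk_size and max_nchunks can be pictured with a small base-R sketch. The helper resolve_chunk_size() is hypothetical and the rule it applies is an assumption made for illustration; the rule lazyarray actually uses may differ.

```r
# Hypothetical helper, not part of lazyarray: one plausible way a requested
# chunk size could be adjusted when it would yield too many chunks.
resolve_chunk_size <- function(n, chunk_size, max_nchunks) {
  nchunks <- ceiling(n / chunk_size)
  if (nchunks > max_nchunks) {
    # Too many chunks: enlarge the chunk size so the limit is respected.
    chunk_size <- ceiling(n / max_nchunks)
  }
  chunk_size
}

size_kept   <- resolve_chunk_size(50, chunk_size = 10, max_nchunks = Inf)
size_raised <- resolve_chunk_size(50, chunk_size = 1,  max_nchunks = 5)
```

Under this assumed rule, a chunk size of 10 is kept when max_nchunks is Inf, while a chunk size of 1 with at most 5 chunks is raised to 10.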

Details

The difference between chunk_map and partition_map is the margin, or direction, along which the mapping function is applied. In partition_map, the mapping function is applied to each partition; if x is a matrix, this means each column. chunk_map generates small chunks along all dimensions except the last and applies the mapping function to each chunk. If x is a matrix, it splits the rows into chunks and applies the mapping function to each block of rows.
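
The two directions can be illustrated on an ordinary matrix with base R alone. This is only a sketch of the idea, not how lazyarray implements it; chunk_rows() is a hypothetical helper introduced for the illustration.

```r
# Base-R illustration only; chunk_rows() is hypothetical, not a lazyarray function.
m <- matrix(1:100, ncol = 2)

# Partition-style mapping: one call per column (the last dimension).
per_partition <- lapply(seq_len(ncol(m)), function(j) sum(m[, j]))

# Chunk-style mapping: split the rows into blocks of at most `chunk_size`
# rows, keeping every column in each block.
chunk_rows <- function(mat, chunk_size) {
  rows <- seq_len(nrow(mat))
  idx <- split(rows, ceiling(rows / chunk_size))
  lapply(idx, function(i) mat[i, , drop = FALSE])
}
chunks <- chunk_rows(m, chunk_size = 10)
per_chunk <- lapply(chunks, colSums)

length(per_partition)  # one result per column
length(per_chunk)      # one result per block of rows
```

Each element of per_chunk here has one value per column, because a chunk spans all columns but only a subset of rows.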

Value

If reduce is missing, chunk_map returns a list of results, one per chunk, each produced by map_fun, so the length of the list equals the number of chunks mapped. If reduce is a function, the list of results is passed to reduce and chunk_map returns the value that reduce produces.
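
The return contract described above can be mimicked in base R as follows. The function map_then_reduce() is a hypothetical stand-in written for this illustration and operates on a plain list of matrices; it is not lazyarray's implementation.

```r
# Hypothetical stand-in for the map/reduce contract; not lazyarray code.
map_then_reduce <- function(chunks, map_fun, reduce) {
  results <- lapply(chunks, map_fun)
  if (missing(reduce)) {
    return(results)   # list with one element per chunk
  }
  reduce(results)     # reduce receives the whole list at once
}

chunks <- list(matrix(1:4, ncol = 2), matrix(5:8, ncol = 2))

# Without reduce: a list of per-chunk column sums.
res_list <- map_then_reduce(chunks, colSums)

# With reduce: the per-chunk sums are added together.
res_total <- map_then_reduce(chunks, colSums, function(res) Reduce(`+`, res))
```

Note that reduce is called once with the full list of results, so it is responsible for combining them (for example with Reduce or do.call).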

See Also

partition_map

Examples


x <- as.lazymatrix(matrix(1:100, ncol = 2))
x

# Set max_nchunks = Inf and chunk_size = 10 so that the total number of
# chunks is about nrow(x)/10 and each chunk contains at most 10 rows
chunk_map(x, function(chunk){chunk[1:2,]}, chunk_size = 10, max_nchunks = Inf)

# For each chunk, calculate the column means, then average the chunk means
chunk_map(x, function(chunk) {
  colMeans(chunk)
}, function(chunk_means) {
  Reduce('+', chunk_means) / length(chunk_means)
})

colMeans(x[])



dipterix/lazyarray documentation built on June 30, 2023, 6:30 a.m.