get.distances.chunked: Call a function once per chunk & save output as file (breaks...

View source: R/get.distances.chunked.R

get.distances.chunked    R Documentation

Call a function once per chunk & save output as file (breaks large input data into chunks)

Description

Calls the get.distances function in chunks when the list of frompoints is so long that it taxes RAM (e.g., 11 million blocks), saving the output for each chunk as a separate .RData file in the current working directory or the specified folder.

Usage

get.distances.chunked(
  frompoints,
  topoints,
  fromchunksize,
  tochunksize,
  FUN = get.distances,
  folder = getwd(),
  ...
)

Arguments

frompoints

Required matrix or data.frame of lat/lon values that can be passed to the get.distances function (colnames 'lat' and 'lon').

topoints

Required matrix or data.frame of lat/lon values that can be passed to the get.distances function (colnames 'lat' and 'lon').

fromchunksize

Required, number specifying how many points to analyze at a time (per chunk).

tochunksize

(Not yet implemented; currently all topoints are used at once.) Number specifying how many topoints to analyze at a time (per chunk).

FUN

Optional function, get.distances by default. No other value is currently allowed.

folder

Optional path specifying where to save .RData files, default is getwd()

...

Other parameters to pass to get.distances, such as radius or units
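
The chunking idea behind these arguments can be illustrated with a minimal sketch. This is not the package's implementation: toy.distances is a hypothetical stand-in for get.distances (it uses simple planar distance), and chunked.sketch, the object name output, and the chunk file naming scheme are all assumptions made only for illustration.

```r
# Hypothetical stand-in for get.distances: planar distance between every
# frompoint and every topoint, returned as a crosstab-style matrix.
toy.distances <- function(frompoints, topoints) {
  dlat <- outer(frompoints[, "lat"], topoints[, "lat"], "-")
  dlon <- outer(frompoints[, "lon"], topoints[, "lon"], "-")
  sqrt(dlat^2 + dlon^2)
}

# Illustrative only: process frompoints in chunks and save one file per chunk.
chunked.sketch <- function(frompoints, topoints, fromchunksize,
                           FUN = toy.distances, folder = tempdir(), ...) {
  n <- nrow(frompoints)
  stopifnot(n > 0, fromchunksize >= 1)
  starts <- seq(1, n, by = fromchunksize)
  filenames <- character(length(starts))
  for (i in seq_along(starts)) {
    rows <- starts[i]:min(starts[i] + fromchunksize - 1, n)
    # drop = FALSE keeps a one-row chunk as a matrix or data.frame
    output <- FUN(frompoints[rows, , drop = FALSE], topoints, ...)
    filenames[i] <- file.path(folder, paste0("chunk", i, ".RData"))
    save(output, file = filenames[i])
  }
  filenames
}

frompoints <- cbind(lat = c(40, 41, 42, 43, 44), lon = c(-75, -76, -77, -78, -79))
topoints <- cbind(lat = c(40.5, 42.5), lon = c(-75.5, -77.5))
files <- chunked.sketch(frompoints, topoints, fromchunksize = 2)
length(files)  # 3 chunks: rows 1-2, rows 3-4, and row 5
```

Only one chunk of results is held in memory at a time, which is the point of saving each chunk to disk before moving on to the next.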

Details

File sizes when results are returned in crosstab format (the FASTEST option, which also avoids the need for rownums; those take more than twice as long and produce a file 1.25x as large):
80 MB file per chunk with 1k blocks x 11k topoints per chunk:
y <- get.distances.chunked(testpoints(11e6), testpoints(11000), 1e3, units = 'km', return.crosstab = TRUE)
800 MB file per chunk with 10k blocks x 11k topoints per chunk:
y <- get.distances.chunked(testpoints(11e6), testpoints(11000), 1e4, units = 'km', return.crosstab = TRUE)
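
As a rough cross-check on these figures, the in-memory size of one crosstab chunk can be estimated as rows x columns x 8 bytes, assuming the distances are stored as double-precision numbers. The helper chunk.megabytes below is hypothetical and not part of the package. Sizes on disk can differ from this estimate, in part because save() compresses its output by default.

```r
# Rough in-memory size of one crosstab chunk, assuming 8-byte doubles.
chunk.megabytes <- function(fromchunksize, n.topoints) {
  fromchunksize * n.topoints * 8 / 1e6
}

chunk.megabytes(1e3, 11000)  # 88
chunk.megabytes(1e4, 11000)  # 880

# Number of chunk files needed for 11 million frompoints
n.files.small <- ceiling(11e6 / 1e3)  # 11000
n.files.large <- ceiling(11e6 / 1e4)  # 1100
```

A smaller fromchunksize keeps each file small but produces many more files, so the choice is a trade-off between memory per chunk and the number of files to manage.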

Value

Returns a character vector of the filenames of the saved .RData output files, which are in the current working directory or the specified folder.
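
One possible way to read the saved files back is sketched here. The name of the object stored inside each .RData file is not documented on this page, so the hypothetical helper read.chunk simply takes the first object it finds. The two example files are created only so the sketch runs on its own.

```r
# Create two small example .RData files so the sketch is self-contained.
folder <- file.path(tempdir(), "chunk-demo")
dir.create(folder, showWarnings = FALSE)
filenames <- file.path(folder, c("chunk1.RData", "chunk2.RData"))
output <- matrix(1:4, nrow = 2); save(output, file = filenames[1])
output <- matrix(5:8, nrow = 2); save(output, file = filenames[2])

# Load a file into its own environment so nothing in the workspace is
# overwritten, then return the first object that was stored in it.
read.chunk <- function(filename) {
  env <- new.env()
  loaded <- load(filename, envir = env)  # load() returns the object names
  get(loaded[1], envir = env)
}

chunks <- lapply(filenames, read.chunk)
combined <- do.call(rbind, chunks)
dim(combined)  # 4 2
```

For very large inputs, combining every chunk in memory may defeat the purpose of chunking; processing one file at a time with read.chunk is likely the safer pattern.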

See Also

ff and others related to parallelization, etc.


ejanalysis/proxistat documentation built on April 2, 2024, 10:13 a.m.