parallel_get: Parallelised version of basic_get, for requesting larger...

View source: R/fct_odata_get.R

parallel_get    R Documentation

Parallelised version of basic_get, for requesting larger amounts of data.

Description

This uses multiple cores to make concurrent API requests, and then merges the individual results. Some upfront work is required to determine the series of smaller requests, so this function shouldn't be used for "small" requests.
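
The split, fetch, and merge pattern described above can be sketched roughly as follows. This is only an illustration, not the package's actual implementation: fake_data and fetch_chunk are hypothetical stand-ins for the remote data and for a single API request, and the use of parallel::mclapply is an assumption.

```r
library(parallel)

# Hypothetical stand-in for the remote data set (not part of the package).
fake_data <- data.frame(ResourceID = 1:25, Value = (1:25) * 2)

# Hypothetical stand-in for one API request: returns the rows whose
# splitting column lies within bounds = c(lower, upper).
fetch_chunk <- function(bounds) {
  keep <- fake_data$ResourceID >= bounds[1] & fake_data$ResourceID <= bounds[2]
  fake_data[keep, ]
}

# Upfront work: derive the series of smaller requests from the splitting column.
rows_per_query <- 10
ids <- sort(unique(fake_data$ResourceID))
groups <- split(ids, ceiling(seq_along(ids) / rows_per_query))
ranges <- lapply(groups, range)

# Cap the worker count at the cores available; forking is unavailable on
# Windows, so fall back to a single core there.
max_cores <- 4
n_cores <- if (.Platform$OS.type == "windows") {
  1L
} else {
  max(1L, min(max_cores, detectCores(), na.rm = TRUE))
}

# Fetch the chunks concurrently, then merge them into one data frame.
chunks <- mclapply(ranges, fetch_chunk, mc.cores = n_cores)
result <- do.call(rbind, chunks)
rownames(result) <- NULL
```

Computing the ranges is the cost that makes this approach a poor fit for small requests: the splitting column has to be inspected before any data request is sent.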

FIXME: support a 'query' that includes $filter; this would require merging it with the $filter generated from 'splitting_col'.
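
One possible way to do that merge, shown only as a sketch: merge_filters is a hypothetical helper that does not exist in the package, and the filter expressions are made-up examples. It relies on the OData convention that two filter expressions can be combined with the and operator, with parentheses preserving the grouping of each part.

```r
# Hypothetical helper (not part of the package): combine a user-supplied
# OData filter expression with the one generated from the splitting column.
merge_filters <- function(user_filter, generated_filter) {
  # With no user filter, the generated filter is used on its own.
  if (is.null(user_filter) || !nzchar(user_filter)) {
    return(generated_filter)
  }
  # Parenthesise both sides so any 'or' inside either expression keeps
  # its original grouping.
  paste0("(", user_filter, ") and (", generated_filter, ")")
}

combined <- merge_filters("Value gt 100", "ResourceID eq 'ABC'")
```

The sketch assumes the user's filter has already been extracted from 'query' as a bare expression, without the leading $filter= part.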

Usage

parallel_get(
  endpoint,
  entity = "",
  query = "",
  timeout = 10,
  splitting_col = "ResourceID",
  max_cores = 4,
  rows_per_query = 10000
)

Arguments

endpoint

API endpoint. Required.

entity

Data entity.

query

Query part of the URL, as a character string (not URL-encoded).

timeout

Timeout for the GET request(s), in seconds.

splitting_col

The column on which to split the overall request into bite-size portions.

max_cores

Maximum number of cores. This will be overridden if fewer cores are available.

rows_per_query

The approximate number of rows per individual request. This can be used to tune performance.

Value

A data frame containing the requested data.


xaviermiles/statsnz.odata documentation built on April 14, 2022, 12:53 p.m.