req_perform_parallel: Perform a list of requests in parallel

View source: R/multi-req.R

req_perform_parallelR Documentation

Perform a list of requests in parallel

Description

This variation on req_perform_sequential() performs multiple requests in parallel. Exercise caution when using this function; it's easy to pummel a server with many simultaneous requests. Only use it with hosts designed to serve many files at once, which are typically web servers, not API servers.

req_perform_parallel() has a few limitations:

  • Will not retrieve a new OAuth token if it expires part way through the requests.

  • Does not perform throttling with req_throttle().

  • Does not attempt retries as described by req_retry().

  • Only consults the cache set by req_cache() before/after all requests.

If any of these limitations are problematic for your use case, we recommend req_perform_sequential() instead.

Usage

req_perform_parallel(
  reqs,
  paths = NULL,
  pool = NULL,
  on_error = c("stop", "return", "continue"),
  progress = TRUE
)

Arguments

reqs

A list of requests.

paths

An optional character vector of paths, if you want to download the response bodies to disk. If supplied, must be the same length as reqs.

pool

Optionally, a curl pool made by curl::new_pool(). Supply this if you want to override the defaults for total concurrent connections (100) or concurrent connections per host (6).

on_error

What should happen if one of the requests fails?

  • stop, the default: stop iterating with an error.

  • return: stop iterating, returning all the successful responses received so far, as well as an error object for the failed request.

  • continue: continue iterating, recording errors in the result.

progress

Display a progress bar for the status of all requests? Use TRUE to turn on a basic progress bar, use a string to give it a name, or see progress_bars to customize it in other ways. Not compatible with req_progress(), as httr2 can only display a single progress bar at a time.

Value

A list, the same length as reqs, containing responses and possibly error objects, if on_error is "return" or "continue" and one of the responses errors. If on_error is "return" and it errors on the ith request, the ith element of the result will be an error object, and the remaining elements will be NULL. If on_error is "continue", it will be a mix of requests and error objects.

Only httr2 errors are captured; see req_error() for more details.

Examples

# Requesting these 4 pages one at a time would take 2 seconds:
request_base <- request(example_url())
reqs <- list(
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5"),
  request_base |> req_url_path("/delay/0.5")
)
# But it's much faster if you request in parallel
system.time(resps <- req_perform_parallel(reqs))

# req_perform_parallel() will fail on error
reqs <- list(
  request_base |> req_url_path("/status/200"),
  request_base |> req_url_path("/status/400"),
  request("FAILURE")
)
try(resps <- req_perform_parallel(reqs))

# but can use on_error to capture all successful results
resps <- req_perform_parallel(reqs, on_error = "continue")

# Inspect the successful responses
resps |> resps_successes()

# And the failed responses
resps |> resps_failures() |> resps_requests()

r-lib/httr2 documentation built on Jan. 11, 2025, 10:21 a.m.