req_perform_parallel: Perform a list of requests in parallel
View source: R/req-perform-parallel.R
This variation on req_perform_sequential() performs multiple requests in parallel. Never use it without req_throttle(); otherwise it's too easy to pummel a server with a very large number of simultaneous requests.
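A minimal sketch of the recommended pattern (the paths and rate limits here are placeholders, not recommendations):

library(httr2)

reqs <- list(
  request(example_url()) |> req_url_path("/get"),
  request(example_url()) |> req_url_path("/html")
)
# Throttle every request before performing them in parallel:
reqs <- lapply(reqs, req_throttle, capacity = 30, fill_time_s = 60)
resps <- req_perform_parallel(reqs)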
While running, you'll get a progress bar that looks like [working] (1 + 4) -> 5 -> 5. The string tells you the current status of the queue (e.g. working, waiting, errored), followed by (the number of pending requests + pending retried requests) -> the number of active requests -> the number of complete requests.
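Reading that example string piece by piece:

# [working] (1 + 4) -> 5 -> 5
#   status:   working
#   pending:  1 request, plus 4 pending retries
#   active:   5 requests
#   complete: 5 requests
# (Set progress = FALSE to disable the bar entirely.)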
The main limitation of req_perform_parallel() is that it assumes req_throttle() and req_retry() apply across all requests. This means, for example, that if request 1 is throttled, but request 2 is not, req_perform_parallel() will wait for request 1 before performing request 2. This makes it most suitable for performing many parallel requests to the same host, rather than a mix of different hosts. It's probably possible to remove these limitations, but it's enough work that I'm unlikely to do it unless I know that people would find it useful: so please let me know!
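A hedged sketch of that behaviour (the paths and limits are placeholders):

# One request is tightly throttled; the other has no throttle at all:
slow <- request(example_url()) |>
  req_url_path("/get") |>
  req_throttle(capacity = 1, fill_time_s = 10)  # at most 1 request per 10s
fast <- request(example_url()) |>
  req_url_path("/html")

# Throttling is applied across the whole queue, so once `slow` is waiting
# for a token, `fast` ends up waiting behind it too:
resps <- req_perform_parallel(list(slow, slow, fast))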
Additionally, it does not respect the max_tries argument to req_retry(), because if you have five requests in flight and the first one gets rate limited, it's likely that all the others will be too. This also means that the circuit breaker is never triggered.
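To make the caveat concrete, a minimal sketch (endpoint and limits are placeholders): req_retry() still determines which errors count as transient, but the max_tries cap is not enforced per request in the parallel queue.

req <- request(example_url()) |>
  req_url_path("/status/200") |>
  req_throttle(capacity = 100, fill_time_s = 60) |>
  req_retry(max_tries = 3)

# Under req_perform_sequential(), a transient failure of this request
# would give up after 3 tries; under req_perform_parallel() it won't.
resps <- req_perform_parallel(list(req))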
req_perform_parallel(
reqs,
paths = NULL,
pool = deprecated(),
on_error = c("stop", "return", "continue"),
progress = TRUE,
max_active = 10
)
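For example, to lower the concurrency cap from its default of 10 while collecting errors instead of stopping (a sketch; reqs is any list of throttled requests):

resps <- req_perform_parallel(reqs, on_error = "continue", max_active = 4)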
A list, the same length as reqs, containing responses and possibly error objects, if on_error is "return" or "continue" and one of the responses errors. If on_error is "return" and it errors on the ith request, the ith element of the result will be an error object, and the remaining elements will be NULL. If on_error is "continue", it will be a mix of responses and error objects.

Only httr2 errors are captured; see req_error() for more details.
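When on_error = "return", you can pick the result apart with base R (a sketch; the predicates below are not part of httr2):

resps <- req_perform_parallel(reqs, on_error = "return")
errored   <- vapply(resps, inherits, logical(1), what = "error")
unstarted <- vapply(resps, is.null, logical(1))
which(errored)    # index of the request that failed
which(unstarted)  # requests that were never performed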
# Requesting these 4 pages one at a time would take 2 seconds:
request_base <- request(example_url()) |>
req_throttle(capacity = 100, fill_time_s = 60)
reqs <- list(
request_base |> req_url_path("/delay/0.5"),
request_base |> req_url_path("/delay/0.5"),
request_base |> req_url_path("/delay/0.5"),
request_base |> req_url_path("/delay/0.5")
)
# But it's much faster if you request in parallel
system.time(resps <- req_perform_parallel(reqs))
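# All four 0.5s requests run concurrently, so the elapsed time should be
# roughly 0.5s rather than the ~2s a sequential run would take.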
# By default (on_error = "stop"), req_perform_parallel() fails on error
reqs <- list(
request_base |> req_url_path("/status/200"),
request_base |> req_url_path("/status/400"),
request("FAILURE")
)
try(resps <- req_perform_parallel(reqs))
# but on_error = "continue" keeps going, capturing successes and failures
resps <- req_perform_parallel(reqs, on_error = "continue")
# Inspect the successful responses
resps |> resps_successes()
# And the failed responses
resps |> resps_failures() |> resps_requests()
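# resps_failures() keeps the error objects; resps_requests() maps them
# back to the requests that produced them.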