RCurl: General Network (HTTP/FTP/...) Client Interface for R

getURIAsynchronous

R Documentation

Download multiple URIs concurrently, with inter-leaved downloads

Description

This function allows the caller to specify multiple URIs to download at the same time. All the requests are submitted, and the replies are then processed as data becomes available on each connection. The responses are therefore processed in an interleaved fashion: a chunk of the response to one request is processed, followed by a chunk of the response to a different request.

Downloading documents asynchronously involves some trade-offs. Switching between the different streams and detecting when input is available on any of them involves a little more processing and so increases the consumption of CPU cycles. On the other hand, the total time to download can be substantially shorter. See https://www.omegahat.net/RCurl/concurrent.xml for more details. This trade-off is common in concurrent, parallel and asynchronous computing.

getURI calls this function if more than one URI is specified and async is TRUE, which is the default in this case. One can also download the contents of multiple URIs serially, i.e. one after the other, by calling getURI with async set to FALSE.
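As a rough illustration of the two modes described above, the sketch below fetches the same URIs through getURI, first concurrently and then serially, and times each call. It assumes network access and reuses the URIs from the Examples section; the timings depend on the network and are only indicative.

```r
library(RCurl)

# URIs reused from the Examples section; any reachable URIs would do.
uris <- c("https://www.omegahat.net/RCurl/index.html",
          "https://www.omegahat.net/RCurl/philosophy.xml")

# More than one URI: async defaults to TRUE, so getURI hands the
# work to getURIAsynchronous.
t_async <- system.time(txt_async <- getURI(uris))

# Force one-after-the-other downloads.
t_serial <- system.time(txt_serial <- getURI(uris, async = FALSE))

# Compare the elapsed (wall-clock) times of the two approaches.
c(async = t_async[["elapsed"]], serial = t_serial[["elapsed"]])
```

With only two small documents the difference may be negligible; the saving is expected to grow with the number of URIs and the latency of each server.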

Usage

getURIAsynchronous(url, ..., .opts = list(), write = NULL,
                   curl = getCurlHandle(),
                   multiHandle = getCurlMultiHandle(), perform = Inf,
                  .encoding = integer(), binary = rep(NA, length(url)))

Arguments

`url`	a character vector identifying the URIs to download.
`...`	named arguments to be passed to `curlSetOpt` when creating each of the different `curlHandle` objects.
`.opts`	a named list or `CURLOptions` object identifying the curl options for the handle. This is merged with the values of ... to create the actual options for the curl handle in the request.
`write`	an object giving the functions or routines that are to be called when input is waiting on the different HTTP response streams. By default, a separate callback function is associated with each input stream. This is necessary for the results to be meaningful: if a single reader were used, it would be called for all streams in a haphazard order and the content would be interleaved. One can, however, do interesting things using a single object.
`curl`	the prototypical curlHandle that is duplicated and used for each of the requests.
`multiHandle`	this is a curl handle for performing asynchronous requests.
`perform`	a number which specifies the maximum number of calls to `curlMultiPerform` that are to be made in this function call. This is typically either 0 for no calls or `Inf` meaning process the requests until completion. One may find alternative values useful, such as 1 to ensure that the requests are dispatched.
`.encoding`	an integer or a string that explicitly identifies the encoding of the content that is returned by the HTTP server in its response to our query. The possible strings are ‘UTF-8’ or ‘ISO-8859-1’ and the integers should be specified symbolically as `CE_UTF8` and `CE_LATIN1`. Note that, by default, the package attempts to process the header of the HTTP response to determine the encoding. This argument is used when such information is erroneous and the caller knows the correct encoding.
`binary`	a logical vector identifying whether each URI has binary content or simple text.
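The following is a minimal sketch of how several of these arguments might be combined, not a definitive recipe. It assumes network access, reuses the URIs from the Examples section, and uses multiTextGatherer from RCurl to supply one collector per URI as the write argument. The option values (followlocation, timeout) are illustrative choices rather than recommendations.

```r
library(RCurl)

# URIs reused from the Examples section.
uris <- c("https://www.omegahat.net/RCurl/index.html",
          "https://www.omegahat.net/RCurl/philosophy.xml")

# One text gatherer per URI, so the chunks of the different
# responses are not mixed together.
gatherers <- multiTextGatherer(uris)

# followlocation is passed via ...; timeout via .opts.
# The two sets of options are merged for each request.
res <- getURIAsynchronous(uris,
                          write = gatherers,
                          followlocation = TRUE,
                          .opts = list(timeout = 30))

# Read the collected text directly from the gatherers, which works
# regardless of the form of the value returned by the call above.
txt <- sapply(gatherers, function(g) g$value())
nchar(txt)
```

Reading the text from the gatherers themselves keeps the sketch independent of whether the call returns the contents or the list described under Value.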

Details

This uses curlMultiPerform and the multi/asynchronous interface for libcurl.

Value

The return value depends on the run-time characteristics of the call. If the call merely specifies the URIs to be downloaded, the result is a named character vector. The names identify the URIs and the elements of the vector are the contents of the corresponding URI.

If the requests are not performed or completed (i.e. perform is zero or too small a value to process all the chunks) a list with 2 elements is returned. These elements are:

`multiHandle`	the curl multi-handle, of class `MultiCURLHandle`. This can be used in further calls to `curlMultiPerform`.
`write`	the `write` argument (after it was potentially expanded to a list). This can then be used to fetch the results of the requests when the requests are completed in the future.
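The deferred case can be sketched as follows, under some assumptions: network access is available, the URIs are those from the Examples section, the requests are finished with RCurl's complete function rather than by calling curlMultiPerform in a hand-written loop, and each default element of write is assumed to provide a value() function that returns the collected content.

```r
library(RCurl)

# URIs reused from the Examples section.
uris <- c("https://www.omegahat.net/RCurl/index.html",
          "https://www.omegahat.net/RCurl/philosophy.xml")

# perform = 0: set up the requests but make no calls to curlMultiPerform.
pending <- getURIAsynchronous(uris, perform = 0)
names(pending)   # expected to be "multiHandle" and "write"

# ... other work could be done here ...

# Block until all the requests on the multi-handle are finished.
complete(pending$multiHandle)

# Retrieve the content collected by each reader (assumes value() exists).
txt <- sapply(pending$write, function(w) w$value())
names(txt) <- uris
nchar(txt)
```

Setting the names explicitly avoids relying on how the write list happens to be named internally.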

Author(s)

Duncan Temple Lang <duncan@r-project.org>

References

Curl homepage https://curl.se/

Examples

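  library(RCurl)

  # Download both documents concurrently.
  uris = c("https://www.omegahat.net/RCurl/index.html",
           "https://www.omegahat.net/RCurl/philosophy.xml")
  txt = getURIAsynchronous(uris)
  names(txt)   # the names identify the URIs
  nchar(txt)   # number of characters in each document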

RCurl documentation built on Sept. 11, 2024, 8:36 p.m.
