...) Client Interface for R

basicTextGatherer

R Documentation

Cumulate text across callbacks (from an HTTP response)

Description

These functions create callback functions that can be used with the libcurl engine, which passes information to us as it becomes available as part of the HTTP response.

basicTextGatherer is a generator function that returns a closure which is used to cumulate text provided in callbacks from the libcurl engine when it reads the response from an HTTP request.

debugGatherer can be used with the debugfunction libcurl option in a call. The associated update function is called whenever libcurl has information about the headers, the data, or general messages about the request.

These functions return a list of functions. Each time one calls basicTextGatherer or debugGatherer, one gets a new, separate collection of functions. However, each collection of functions (or instance) shares the variables across the functions and across calls. This allows them to store data persistently across the calls without using a global variable. In this way, we can have multiple instances of the collection of functions, with each instance updating its own local state and not interfering with those of the others.
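The shared-state behaviour described above is ordinary R closure semantics. As a minimal sketch in base R (the name `makeCollector` is hypothetical and this is not RCurl's own code), each call to the generator creates a fresh environment holding its own `txt` variable, so two instances never see each other's data:

```r
# Hypothetical generator, for illustration only; not part of RCurl.
makeCollector <- function() {
  txt <- character()                 # state private to this instance
  list(
    update = function(str) {
      txt <<- c(txt, str)            # modify the variable in the enclosing environment
      invisible(length(txt))
    },
    value = function() paste(txt, collapse = "")
  )
}

a <- makeCollector()
b <- makeCollector()

a$update("first ")
a$update("second")
b$update("other")

a$value()   # text collected by 'a' only
b$value()   # text collected by 'b' only
```

No global variable is involved: the state lives in the environment created by each call to the generator.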

We use an S3 class named RCurlCallbackFunction to indicate that the collection of functions can be used as a callback. The update function is the one that is actually used as the callback function in the CURL option. The value function can be invoked to get the current state that has been accumulated by the update function. This is typically used when the request is complete. One can reuse the same collection of functions across different requests, in which case the information is cumulated across them. Sometimes it is convenient to reuse the object but reset the state to its original empty value, as if it had been created afresh. The reset function in the collection permits this.
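The update, value and reset life cycle can be tried without a network connection. The following is a sketch under stated assumptions: `makeGatherer` is a hypothetical stand-in written in base R, not the object that basicTextGatherer actually returns, and the update function is called by hand where libcurl would normally call it.

```r
# Hypothetical stand-in for a gatherer; the real object comes from basicTextGatherer().
makeGatherer <- function(txt = character()) {
  update <- function(str) {
    txt <<- c(txt, str)
    nchar(str, type = "bytes")       # report how many bytes were handled
  }
  value <- function(collapse = "") {
    if (is.null(collapse)) txt else paste(txt, collapse = collapse)
  }
  reset <- function() {
    txt <<- character()              # back to the empty state
    invisible(NULL)
  }
  # The class tag mirrors the class name mentioned above; the object is only a sketch.
  structure(list(update = update, value = value, reset = reset),
            class = "RCurlCallbackFunction")
}

h <- makeGatherer()
h$update("page one;")                # first simulated request
h$update("page two;")                # second request: text keeps cumulating
both <- h$value()

h$reset()                            # discard everything collected so far
h$update("page three;")
fresh <- h$value()
```

After the reset, `fresh` holds only the text from the third simulated request, while `both` holds the text cumulated across the first two.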

multiTextGatherer is used when we are downloading multiple URIs concurrently in a single libcurl operation. This merely uses the tools of basicTextGatherer applied to each of several URIs. See getURIAsynchronous.

Usage

basicTextGatherer(txt = character(), max = NA, value = NULL,
                    .mapUnicode = TRUE)
multiTextGatherer(uris, binary = rep(NA, length(uris)))
debugGatherer()

Arguments

`txt`	an initial character vector to start things. We allow this to be specified so that one can initialize the content.
`max`	if specified as an integer this controls the total number of characters that will be read. If more are read, the function tells libcurl to stop!
`uris`	for `multiTextGatherer`, this is either the number or the names of the uris being downloaded and for which we need a separate writer function.
`value`	if specified, a function that is called when retrieving the text, usually after the completion of the request and the processing of the response. This function can be used to convert the result into a different format, e.g. parse an XML document or read values from a table in the text.
`.mapUnicode`	a logical value that controls whether the resulting text is processed to map components of the form \uxxxx to their appropriate Unicode representation.
`binary`	a logical vector that indicates which URIs yield binary content
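To illustrate how the `max` and `value` arguments might interact with the callback, here is a sketch in base R. It assumes that the stop signal is a return value that differs from the number of bytes received, which is how libcurl write callbacks report an error. `makeLimitedGatherer` is a hypothetical name, and the exact test RCurl applies internally may differ.

```r
# Hypothetical sketch; not the implementation used by basicTextGatherer().
makeLimitedGatherer <- function(max = NA, value = NULL) {
  txt <- character()
  update <- function(str) {
    txt <<- c(txt, str)
    total <- sum(nchar(txt))
    if (!is.na(max) && total > max)
      return(0L)                     # assumed stop signal: fewer bytes than received
    nchar(str, type = "bytes")
  }
  getValue <- function() {
    combined <- paste(txt, collapse = "")
    # Apply the user-supplied converter, if any, to the collected text.
    if (is.null(value)) combined else value(combined)
  }
  list(update = update, value = getValue)
}

g <- makeLimitedGatherer(max = 10L, value = toupper)
r1 <- g$update("abcdef")             # 6 characters so far, under the limit
r2 <- g$update("ghijkl")             # 12 characters, over the limit
result <- g$value()                  # converter applied on retrieval
```

Here `toupper` stands in for a real converter such as an XML parser.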

Details

The update function is called when the libcurl engine finds sufficient data on the stream from which it is reading the response. libcurl cumulates these bytes and hands them to a C routine in this package, which calls the actual gathering function (or a suitable replacement) returned as the update component from this function.
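The flow described above can be imitated in plain R. In this sketch, `deliverInChunks` is a hypothetical driver that plays the role of libcurl and the package's C routine: it splits a response body into fixed-size pieces and hands each one to the callback. The chunk size and the abort check are assumptions for illustration, since real chunk boundaries are chosen by libcurl.

```r
# Hypothetical driver standing in for libcurl plus the package's C routine.
deliverInChunks <- function(body, chunkSize, callback) {
  starts <- seq(1L, nchar(body), by = chunkSize)
  for (s in starts) {
    chunk <- substr(body, s, s + chunkSize - 1L)
    handled <- callback(chunk)
    # Assumed convention: a byte count other than the chunk's size aborts the transfer.
    if (handled != nchar(chunk, type = "bytes"))
      stop("callback asked for the transfer to stop")
  }
  invisible(length(starts))
}

pieces <- character()
update <- function(str) {
  pieces <<- c(pieces, str)          # accumulate one element per callback
  nchar(str, type = "bytes")
}

body <- "<html><body>hello</body></html>"
nCalls <- deliverInChunks(body, 8L, update)
reassembled <- paste(pieces, collapse = "")
```

The callback is invoked once per chunk, and pasting the pieces back together recovers the original body.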

Value

Both the basicTextGatherer and debugGatherer functions return an object of class RCurlCallbackFunction. basicTextGatherer extends this with the class RCurlTextHandler and debugGatherer extends this with the class RCurlDebugHandler. Each of these has the same basic structure, being a list of 3 functions.

`update`	the function that is called with the text from the callback routine and which processes this text by accumulating it into a vector
`value`	a function that returns the text cumulated across the callbacks. This takes an argument `collapse` (and additional ones) that are handed to `paste`. If the value of `collapse` is given as `NULL`, the vector of elements containing the different text for each callback is returned. This is convenient when debugging or if one knows something about the nature of the callbacks, e.g. a regular chunk size that lets one identify records in a natural way.
`reset`	a function that resets the internal state to its original, empty value. This can be used to reuse the same object across requests but to avoid cumulating new input with the material from previous requests.

multiTextGatherer returns a list with an element corresponding to each URI. Each element is an object obtained by calling basicTextGatherer, i.e. a collection of 3 functions with shared state.
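The structure returned for several URIs, and the effect of `collapse = NULL`, can be sketched as follows. Both `makeGatherer` and `makeMultiGatherer` are hypothetical base R stand-ins, the example.org addresses are placeholders, and the update functions are called by hand in place of a real concurrent download.

```r
# Hypothetical single gatherer with update and value only.
makeGatherer <- function() {
  txt <- character()
  list(
    update = function(str) {
      txt <<- c(txt, str)
      nchar(str, type = "bytes")
    },
    value = function(collapse = "") {
      if (is.null(collapse)) txt else paste(txt, collapse = collapse)
    }
  )
}

# Hypothetical counterpart of multiTextGatherer: one gatherer per URI.
# 'uris' is either a count or a character vector of names.
makeMultiGatherer <- function(uris) {
  if (is.numeric(uris)) {
    lapply(seq_len(uris), function(i) makeGatherer())
  } else {
    structure(lapply(uris, function(u) makeGatherer()), names = uris)
  }
}

uris <- c("https://example.org/a", "https://example.org/b")
g <- makeMultiGatherer(uris)

g[[uris[1]]]$update("A1")            # chunks arriving for the first URI
g[[uris[1]]]$update("A2")
g[[uris[2]]]$update("B1")            # chunk arriving for the second URI

collected <- sapply(g, function(h) h$value())
parts <- g[[uris[1]]]$value(collapse = NULL)   # one element per callback
unnamed <- makeMultiGatherer(3L)
```

Each element keeps its own state, so text for one URI never mixes with text for another, and `collapse = NULL` exposes the individual pieces received for a single URI.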

Author(s)

Duncan Temple Lang

References

Curl homepage https://curl.se/

Examples

if(url.exists("https://www.omegahat.net/RCurl/index.html")) withAutoprint({
  txt = getURL("https://www.omegahat.net/RCurl/index.html", write = basicTextGatherer())

  h = basicTextGatherer()
  txt = getURL("https://www.omegahat.net/RCurl/index.html", write = h$update)
    ## Cumulate across pages.
  txt = getURL("https://www.omegahat.net/index.html", write = h$update)


  headers = basicTextGatherer()
  txt = getURL("https://www.omegahat.net/RCurl/index.html",
               header = TRUE, headerfunction = headers$update)

     ## Now read the headers.
  cat(headers$value())
  headers$reset()


    ## Debugging callback
  d = debugGatherer()
  x = getURL("https://www.omegahat.net/RCurl/index.html", debugfunction = d$update, verbose = TRUE)
  cat(names(d$value()))
  d$value()[["headerIn"]]


    ## This hung on Solaris
    ## 2022-02-08 philosophy.html is malformed UTF-8
  uris = c("https://www.omegahat.net/RCurl/index.html",
           "https://www.omegahat.net/RCurl/philosophy.html")
## Not run: 
  g = multiTextGatherer(uris)
  txt = getURIAsynchronous(uris,  write = g)
  names(txt) # no names this way
  nchar(txt)

   # Now don't use names for the gatherer elements.
  g = multiTextGatherer(length(uris))
  txt = getURIAsynchronous(uris,  write = g)
  names(txt)
  nchar(txt)

## End(Not run)
})


## Not run: 
 Sys.setlocale(,"en_US.latin1")
 Sys.setlocale(,"en_US.UTF-8")
 uris = c("https://www.omegahat.net/RCurl/index.html",
          "https://www.omegahat.net/RCurl/philosophy.html")
 g = multiTextGatherer(uris)
 txt = getURIAsynchronous(uris,  write = g)

## End(Not run)

RCurl documentation built on Sept. 11, 2024, 8:36 p.m.
