The RevoScaleR functions typically have several arguments beyond those used by dplyrXdf verbs. While you usually don't need to touch these, it can sometimes be useful to do so. For example, when using `mutate` or `transmute`, you could specify more complicated transformations via a `transformFunc`. Similarly, rather than chaining together a `mutate` and a `summarise` (which would involve creating an intermediate file), you could incorporate the variable transformation into the `summarise` itself.
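As a rough sketch of the first idea: a `transformFunc` is an ordinary R function that receives a chunk of data as a named list of vectors and returns a named list containing the new or modified variables. The helper name `addDelay`, the column names `arr_delay` and `dep_delay`, and the data source `flightsXdf` are assumptions made for illustration only. The function is exercised on a plain list so that the example runs without RevoScaleR; the dplyrXdf call is shown as a comment.

```r
# Hypothetical transformFunc: takes one chunk of data as a named list of
# vectors and returns the list with a new variable added.
addDelay <- function(varlist) {
    varlist$delay <- (varlist$arr_delay + varlist$dep_delay) / 2
    varlist
}

# Try it on a plain list standing in for one chunk of data.
chunk <- list(arr_delay = c(10, -4, 30), dep_delay = c(20, 0, 10))
result <- addDelay(chunk)
print(result$delay)

# With dplyrXdf, assuming an Xdf data source named flightsXdf that
# contains these two columns, the function could be passed along
# these lines:
#
# flightsTrans <- mutate(flightsXdf,
#     .rxArgs = list(transformFunc = addDelay,
#                    transformVars = c("arr_delay", "dep_delay")))
```

Writing the transformation as a standalone function makes it easy to check on a small in-memory list before running it against a large file.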
Most of the one-table dplyrXdf verbs accept an `.rxArgs` argument as a way of transmitting these extra arguments to the underlying RevoScaleR code. This should be a named list specifying the names and values of the arguments to be passed. The exact arguments will vary depending on the verb in question; here is a list of the verbs and the underlying RevoScaleR function that they call:
- `subset`, `filter` and `select`: `rxDataStep`
- `mutate` and `transmute`: `rxDataStep`
- `summarise`: depending on the method chosen, `rxCube` or `rxSummary`
- `arrange`: `rxSort`
- `rename`: `rxDataStep` (only if data movement is required)
- `distinct`: `rxDataStep`
- `factorise`: `rxFactors`
- `doXdf`: `rxDataStep`
- `persist`: `rxDataStep`
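The idea of forwarding a named list of arguments can be pictured with a toy example in base R. This is not dplyrXdf's actual source: `fakeDataStep` and `fakeVerb` are hypothetical stand-ins, and `rowsPerRead` and `reportProgress` are used only because they are argument names that `rxDataStep` accepts. The sketch shows how the names in the list become argument names in the underlying call.

```r
# Stand-in for an underlying function such as rxDataStep (NOT the real one);
# it simply reports the settings it received.
fakeDataStep <- function(data, rowsPerRead = -1, reportProgress = 2) {
    list(nrow = nrow(data),
         rowsPerRead = rowsPerRead,
         reportProgress = reportProgress)
}

# Hypothetical verb: combines its own arguments with the user's .rxArgs
# list and forwards everything with do.call().
fakeVerb <- function(data, .rxArgs = list()) {
    do.call(fakeDataStep, c(list(data = data), .rxArgs))
}

# Names in the list must match argument names of the underlying function.
out <- fakeVerb(mtcars, .rxArgs = list(rowsPerRead = 10000, reportProgress = 0))
str(out)

# Without .rxArgs, the underlying defaults apply.
defaults <- fakeVerb(mtcars)
```

Because the list is matched by name, a misspelled or unsupported name would raise an "unused argument" error in this toy version, which is one reason to check the argument list of the function a given verb calls.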
You should use the `.rxArgs` argument with caution, as some verbs may modify the data as part of their normal functioning, so the results you get back may not be as expected. It's also easy to write convoluted code that makes your dplyrXdf pipelines harder to read. However, when working with big datasets this feature can help save a lot of processing time by avoiding unnecessary disk traffic.
The following one-table verbs don't support the `.rxArgs` argument:
- `group_by`: this verb doesn't do any processing; it only sets things up for subsequent verbs.
- `do`: the underlying functionality is provided by data frames and `dplyr::do`.