do: Do arbitrary operations on a tbl
In RevolutionAnalytics/dplyrXdf: Tools for working with Microsoft R Server Xdf files and the dplyr package


The do verb converts the data to a data frame before running the operations. The do_xdf verb (formerly doXdf) keeps the data in Xdf format, so it is less limited by memory.

## S3 method for class 'RxFileData'
do(.data, ...)

## S3 method for class 'grouped_tbl_xdf'
do(.data, ...)

do_xdf(.data, ...)

doXdf(.data, ...)

## S3 method for class 'RxFileData'
do_xdf(.data, ...)

## S3 method for class 'grouped_tbl_xdf'
do_xdf(.data, ...)

## S3 method for class 'RxDataSource'
do(.data, ...)

## S3 method for class 'RxDataSource'
do_xdf(.data, ...)

`.data`	A tbl for an Xdf data source; or a raw Xdf data source.
`...`	Expressions to apply.

The difference between the do and do_xdf verbs is that the former converts the data into a data frame before running the expressions on it, while the latter passes the data as Xdf files. do is thus more flexible in the expressions it can run (basically anything that works with data frames), whereas do_xdf is better able to handle large datasets. The final output from do_xdf must still fit in memory, since both verbs return a data frame (see below).

do_xdf was called doXdf in previous versions of this package; it has been renamed to match dplyr's snake_case naming convention.

To run expressions on a grouped Xdf tbl, do and do_xdf split the data into one file per group, and the expressions are evaluated on each file. Note, however, that this may be slow if you have a large number of groups; and, for do, the operation will be limited by memory if the number of rows per group is large.
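As a rough mental model of the grouped behaviour described above, the same split-then-apply pattern can be sketched in base R on an in-memory data frame. This is only an analogy, not the package's implementation: the verbs here work on one Xdf file per group rather than on a list of data frames, and `groups` and `fits` are illustrative names.

```r
# Conceptual analogue only (base R, in memory): split by group, then apply.
groups <- split(mtcars, mtcars$cyl)   # one data frame per value of cyl

# Evaluate the same expression on each piece, here a linear model
fits <- lapply(groups, function(df) lm(mpg ~ wt, data = df))

# One fitted model per group, named by the group value
names(fits)
sapply(fits, function(m) coef(m)[["wt"]])
```

The cost grows with the number of pieces, which is the reason a large number of groups can be slow.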

The do and do_xdf verbs always return a data frame, unlike the other verbs for Xdf objects. This is because they are meant to execute code that can return arbitrarily complex objects, and Xdf files can only store atomic data.
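To illustrate why a data frame is a suitable container for such results, here is a sketch in base R of a data frame holding fitted models in a list column, which a store limited to atomic columns could not represent. The exact shape of the output from do and do_xdf is not shown here; `res` and its column names are hypothetical.

```r
# Fit one model per group (plain base R, in memory)
models <- lapply(split(mtcars, mtcars$cyl),
                 function(df) lm(mpg ~ wt, data = df))

# A data frame with one row per group and the models in a list column
res <- data.frame(cyl = names(models), stringsAsFactors = FALSE)
res$m <- I(unname(models))   # I() keeps the list as a single list column

class(res$m[[1]])   # each cell holds a complete model object
nrow(res)
```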

do in package dplyr

library(dplyrXdf)  # requires Microsoft R Server / RevoScaleR

mtx <- as_xdf(mtcars, overwrite=TRUE)

# unnamed arg
do(mtx, {
    # work on a local copy so the new columns are part of the returned data frame
    df <- .
    df$mpg2 <- 2 * df$mpg
    df$cyl2 <- 2 * df$cyl
    df
})

do_xdf(mtx, rxDataStep(., transformFunc=function(.data) {
    .data$mpg2 <- 2 * .data$mpg
    .data$cyl2 <- 2 * .data$cyl
    .data
}))

# named arg
do(mtx, m=lm(mpg ~ cyl, data=.))

do_xdf(mtx, m=rxLinMod(mpg ~ cyl, data=.))

# fitting multiple models to subsets of the data
if(require("nycflights13")) {
    flx <- as_xdf(flights, overwrite=TRUE)
    flx %>%
        group_by(carrier) %>%
        do(m=lm(arr_delay ~ dep_time, data=.))

    # with do_xdf: useful if each subset is very large, but called code must be Xdf-aware
    flx %>%
        group_by(carrier) %>%
        do_xdf(m2=rxLinMod(arr_delay ~ dep_time, data=.))
}

RevolutionAnalytics/dplyrXdf documentation built on June 3, 2019, 9:08 p.m.
