setops: Set operations on data sources

Description

Methods for the dplyr set-operation verbs (intersect, setdiff, setequal, union and union_all) on RevoScaleR data sources.

Usage

## S3 method for class 'RxFileData'
intersect(x, y, ...)

## S3 method for class 'RxFileData'
setdiff(x, y, ...)

## S3 method for class 'RxFileData'
setequal(x, y, ...)

## S3 method for class 'RxFileData'
union_all(x, y, .outFile = tbl_xdf(x), .rxArgs)

## S3 method for class 'RxFileData'
union(x, y, .outFile = tbl_xdf(x), .rxArgs, ...)

## S3 method for class 'RxDataSource'
intersect(x, ...)

## S3 method for class 'RxDataSource'
setdiff(x, ...)

## S3 method for class 'RxDataSource'
setequal(x, ...)

## S3 method for class 'RxDataSource'
union(x, ...)

## S3 method for class 'RxDataSource'
union_all(x, ...)

Arguments

x, y

Data sources.

...

Not currently used.

.outFile

Output format for the returned data. If not supplied, an xdf tbl is created; if NULL, a data frame is returned; if a character string naming a file, an Xdf file is saved at that location.

.rxArgs

A list of RevoScaleR arguments. See rxArgs for details.

Details

Currently, only union and union_all are supported for RevoScaleR data sources; intersect, setdiff and setequal are not. The union is carried out with rxDataStep(append="rows"), which can be much faster than rxMerge(type="union").
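
To make the difference between the two supported operations concrete, here is a minimal sketch of their row semantics on ordinary data frames, using only base R. It is an analogy for what the results contain, not the dplyrXdf implementation, which works on Xdf files; the data frames a and b are made up for illustration.

```r
# Two small illustrative data frames with one row in common (id = 2)
a <- data.frame(id = c(1, 2), grp = c("x", "y"))
b <- data.frame(id = c(2, 3), grp = c("y", "z"))

# union_all semantics: append the rows of b to a, keeping duplicates
all_rows <- rbind(a, b)
nrow(all_rows)       # 4

# union semantics: append, then drop duplicated rows
distinct_rows <- unique(all_rows)
nrow(distinct_rows)  # 3
```

The shared row (id = 2) appears twice in all_rows and once in distinct_rows, which is the behaviour to expect from union_all and union respectively.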

Value

An object representing the combined data. Its type depends on the .outFile argument: if missing, an xdf tbl object; if NULL, a data frame; if a filename, an Xdf data source referencing a file saved to that location.
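
The three cases might look like the sketch below. It is an illustration under assumptions rather than tested output: it assumes dplyrXdf and the RevoScaleR package it relies on are installed, and the output file name is hypothetical. The guard skips the example where the package is unavailable.

```r
# Sketch only: requires dplyrXdf and RevoScaleR; skipped if unavailable
if (requireNamespace("dplyrXdf", quietly = TRUE)) {
    library(dplyrXdf)
    mtx <- as_xdf(mtcars, overwrite=TRUE)

    # .outFile missing: the result is an xdf tbl
    res_tbl <- union_all(mtx, mtx)

    # .outFile = NULL: the result is a data frame
    res_df <- union_all(mtx, mtx, .outFile=NULL)

    # .outFile = file name: the result is an Xdf data source for that file
    # (hypothetical path in the session's temporary directory)
    out_path <- file.path(tempdir(), "mtcars_union_all.xdf")
    res_xdf <- union_all(mtx, mtx, .outFile=out_path)
}
```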

See Also

setops and bind_rows in package dplyr; rbind in base R

Examples

# create an Xdf copy of mtcars to work with
mtx <- as_xdf(mtcars, overwrite=TRUE)
# union removes duplicated rows
tbl <- union(mtx, mtx)
nrow(tbl)

# union_all doesn't remove duplicated rows
tbl2 <- union_all(mtx, mtx)
nrow(tbl2)

RevolutionAnalytics/dplyrXdf documentation built on June 3, 2019, 9:08 p.m.