Join two data sources together
## S3 method for class 'RxFileData'
left_join(x, y, by = NULL, copy = FALSE,
suffix = c(".x", ".y"), .outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxFileData'
right_join(x, y, by = NULL, copy = FALSE,
suffix = c(".x", ".y"), .outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxFileData'
inner_join(x, y, by = NULL, copy = FALSE,
suffix = c(".x", ".y"), .outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxFileData'
full_join(x, y, by = NULL, copy = FALSE,
suffix = c(".x", ".y"), .outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxFileData'
semi_join(x, y, by = NULL, copy = FALSE,
.outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxFileData'
anti_join(x, y, by = NULL, copy = FALSE,
.outFile = tbl_xdf(x), .rxArgs, ...)
## S3 method for class 'RxDataSource'
left_join(x, ...)
## S3 method for class 'RxDataSource'
right_join(x, ...)
## S3 method for class 'RxDataSource'
full_join(x, ...)
## S3 method for class 'RxDataSource'
inner_join(x, ...)
## S3 method for class 'RxDataSource'
anti_join(x, ...)
## S3 method for class 'RxDataSource'
semi_join(x, ...)
x, y        Data sources to join.
by          Character vector of variables to join by. See
copy        If the data sources are not stored in the same filesystem, whether to copy y to x's location.
.outFile    Output format for the returned data. If not supplied, create an xdf tbl; if
.rxArgs     A list of RevoScaleR arguments. See
...         Not currently used.
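As a rough illustration of the `by` argument, the sketch below uses plain dplyr on in-memory data frames, since these methods implement the dplyr join generics. It assumes that the Xdf methods accept the same forms of `by` as dplyr does, including a named vector for differently named key columns; that has not been checked against dplyrXdf itself.

```r
library(dplyr)

# band_members has columns name, band; band_instruments has name, plays
by_name <- left_join(band_members, band_instruments, by = "name")

# band_instruments2 stores the same key under "artist" instead of "name";
# a named vector maps the left-hand key to the right-hand key
by_renamed <- left_join(band_members, band_instruments2,
                        by = c("name" = "artist"))

# Both joins keep every row of band_members and add the "plays" column
names(by_renamed)
```

Naming the key explicitly avoids relying on whatever columns the two sources happen to share.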
These functions merge two datasets together, using rxMerge.
For best performance, avoid merging on factor variables or on variables with mismatched types, especially in Spark. rxMerge is strict about its inputs, so dplyrXdf may have to transform the data first to ensure that the merge succeeds.
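One way to follow this advice is to align the key types before the data is turned into a data source. The sketch below is only an illustration on ordinary data frames with made-up example data (`members` and `instruments` are hypothetical names); the cleaned data frames could then be passed to `as_xdf` as in the Examples.

```r
library(dplyr)

# Hypothetical data: the join key is a factor on one side, character on the other
members <- data.frame(name = factor(c("Mick", "John", "Paul")),
                      band = c("Stones", "Beatles", "Beatles"))
instruments <- data.frame(name = c("John", "Paul", "Keith"),
                          plays = c("guitar", "bass", "guitar"),
                          stringsAsFactors = FALSE)

# Convert the factor key to character so both keys have the same type
members$name <- as.character(members$name)
stopifnot(identical(class(members$name), class(instruments$name)))

# The join key now has a matching, non-factor type on both sides
joined <- left_join(members, instruments, by = "name")
```

Doing the conversion once up front means the join itself should not need to coerce the key on every call.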
Currently, merging in Spark has a few limitations. Only Xdf (in HDFS) and Spark data sources (RxHiveData, RxOrcData and RxParquetData) can be merged, and only the "standard" join operations are supported: left_join, right_join, inner_join and full_join. Moreover, Xdf files in HDFS can only be merged in the Spark compute context (not in the Hadoop or local compute contexts).
An object representing the joined data. This depends on the .outFile argument: if missing, it will be an xdf tbl object; if NULL, a data frame; and if a filename, an Xdf data source referencing a file saved to that location.
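The three cases can be sketched as follows. This assumes a local compute context with dplyrXdf and RevoScaleR installed; the file name `joined.xdf` is a hypothetical choice for illustration.

```r
library(dplyr)     # supplies the band_members and band_instruments data
library(dplyrXdf)

bmembx <- as_xdf(band_members, overwrite=TRUE)
binstx <- as_xdf(band_instruments, overwrite=TRUE)

# .outFile not supplied: the result is an xdf tbl
res_tbl <- left_join(bmembx, binstx)

# .outFile = NULL: the result is returned as a data frame
res_df <- left_join(bmembx, binstx, .outFile=NULL)

# .outFile is a filename (hypothetical path): the result is an Xdf data
# source referencing a file saved to that location
res_xdf <- left_join(bmembx, binstx, .outFile="joined.xdf")
```

Returning a data frame is convenient for small results, while the file-backed forms keep larger results on disk.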
See also: join in package dplyr, rxMerge
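library(dplyr)     # provides the band_members and band_instruments example data
library(dplyrXdf)

# convert the example data frames to Xdf data sources
bmembx <- as_xdf(band_members, overwrite=TRUE)
binstx <- as_xdf(band_instruments, overwrite=TRUE)

# the four standard joins
left_join(bmembx, binstx)
right_join(bmembx, binstx)
inner_join(bmembx, binstx)
full_join(bmembx, binstx)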