Generate tbl_xdf data source object
xdf: A
file: The filename to use for the tbl_xdf – this is the output filename to use when writing the data. By default, a random filename is generated.
createCompositeSet: Whether to create a composite Xdf file (see below).
fileSystem: The filesystem in which to save the Xdf file.
...: Further arguments passed to
dplyrXdf uses the tbl_xdf class as part of its file management tasks. A tbl_xdf object specifies the file to which a dplyrXdf verb will save its output, and from which the next verb in a pipeline will read its input.
Like an RxXdfData object, a tbl_xdf object is a pointer to a file on disk that stores the actual data. A tbl_xdf also includes information on whether the file was generated as part of a pipeline; if so, subsequent verbs will know to delete the file when they return. This way, only the final output of a pipeline is retained.
In general, you should never need to create a tbl_xdf object manually.
Since a tbl_xdf is an RxXdfData object, all RevoScaleR functions that can work with Xdf files should also work with tbl_xdf objects. For example, you can pass the output from a dplyrXdf pipeline straight to a RevoScaleR or MicrosoftML modelling function like rxLinMod or rxNeuralNet. If you encounter code that only works with base RxXdfData objects (e.g. if it uses checks like if(class(obj) == "RxXdfData") {...}), you can strip off the tbl information with as_xdf(obj). See the examples below.
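The reason an exact class comparison rejects a tbl_xdf can be illustrated in base R without RevoScaleR installed. The following is a minimal sketch using two hypothetical S4 classes, `BaseSource` and `DerivedSource`, as stand-ins for RxXdfData and tbl_xdf; the slot names are invented for illustration, and `as()` here is only the base-R analogue of what as_xdf is described as doing, not its actual implementation.

```r
library(methods)

# Hypothetical stand-ins for RxXdfData and tbl_xdf, defined only so that
# this sketch runs without RevoScaleR or dplyrXdf.
setClass("BaseSource", slots = c(file = "character"))
setClass("DerivedSource", contains = "BaseSource",
         slots = c(isTemp = "logical"))

obj <- new("DerivedSource", file = "data.xdf", isTemp = TRUE)

# An exact comparison on class() rejects the subclass ...
strict_check <- class(obj) == "BaseSource"

# ... whereas an inheritance test accepts it.
inherit_check <- is(obj, "BaseSource")

# Coercing to the parent class drops the subclass information,
# after which the exact comparison passes.
base_obj <- as(obj, "BaseSource")
strict_after <- class(base_obj) == "BaseSource"
```

When you control the checking code, testing with `is()` or `inherits()` avoids the need for a conversion step; the conversion is mainly useful for third-party code that you cannot change.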
An object of S4 class tbl_xdf, which inherits from RxXdfData.
There are actually two kinds of Xdf files: standard and composite. A composite Xdf file is a directory containing multiple data and metadata files, which the RevoScaleR functions treat as a single dataset. While Xdf files in the native filesystem can be in either format, those in HDFS must be composite.
tbl_xdf()
# create an Xdf data source, and base a tbl_xdf object on it
xdf <- RxXdfData("file", createCompositeSet=TRUE)
tbl_xdf(xdf)
## Not run:
# create a tbl_xdf in HDFS
tbl_xdf(fileSystem=RxHdfsFileSystem())
## End(Not run)
# example of code that requires a base RxXdfData object
my_model <- function(data, formula)
{
    if(class(data) != "RxXdfData")
        stop("must supply Xdf data source")
    rxLinMod(formula, data=data)
}
mtx <- as_xdf(mtcars, overwrite=TRUE)
tbl <- select(mtx, mpg, wt, disp)
## Not run:
# this will fail
my_model(tbl, mpg ~ wt + disp)
## End(Not run)
# use as_xdf() to convert back to RxXdfData
my_model(as_xdf(tbl), mpg ~ wt + disp)