Subset a data source by rows and/or columns
.data | A data source object, or tbl wrapping the same.
subset | Logical expression indicating rows to keep.
select | Columns to select. See
.outFile | Output format for the returned data. If not supplied, create an xdf tbl; if NULL, return a data frame; if a filename, save an Xdf file at that location.
.rxArgs | A list of RevoScaleR arguments. See
... | Other arguments passed to lower-level functions.
This is a method for the subset generic from base R. It combines the effects of the filter and select verbs, allowing you to subset a RevoScaleR data source (typically an Xdf file) by rows and columns simultaneously. The advantage of this for an Xdf file is that it significantly reduces the amount of I/O compared to doing the row and column subsetting in separate steps.
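The equivalence described above can be sketched by running the same query both ways. This is a minimal illustration, assuming dplyr and dplyrXdf are attached and a working RevoScaleR installation is available; the names two_step and one_step are illustrative only.

```r
library(dplyr)
library(dplyrXdf)

mtx <- as_xdf(mtcars, overwrite=TRUE)

# separate steps: row filtering first, then column selection
two_step <- mtx %>% filter(mpg > 20) %>% select(mpg, cyl)

# combined: rows and columns are subsetted in a single call
one_step <- subset(mtx, mpg > 20, c(mpg, cyl))

# the two results should have the same shape
dim(two_step)
dim(one_step)
```

Both forms should return the same rows and columns; the combined call is the one the text above describes as needing less I/O.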
If the select argument is missing, subset returns all the columns in the data; this is different from the select verb, which returns no columns if no arguments are provided.
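A short sketch of that default, assuming dplyrXdf and RevoScaleR are available: with select omitted, only the row filter applies and every column is kept. The name all_cols is illustrative only.

```r
library(dplyrXdf)

mtx <- as_xdf(mtcars, overwrite=TRUE)

# no select argument: filter rows only, keep all columns
all_cols <- subset(mtx, mpg > 20)

# the column names should match those of the original data
names(all_cols)
names(mtcars)
```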
An object representing the subsetted data. This depends on the .outFile argument: if missing, it will be an xdf tbl object; if NULL, a data frame; and if a filename, an Xdf data source referencing a file saved to that location.
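The three return types can be compared side by side. This is a hedged sketch, assuming dplyrXdf and RevoScaleR are available; the output path under tempdir() and the variable names are illustrative choices, not part of the package.

```r
library(dplyrXdf)

mtx <- as_xdf(mtcars, overwrite=TRUE)

# .outFile missing: the result is an xdf tbl
res_tbl <- subset(mtx, mpg > 20, c(mpg, cyl))
class(res_tbl)

# .outFile=NULL: the result is an ordinary data frame
res_df <- subset(mtx, mpg > 20, c(mpg, cyl), .outFile=NULL)
is.data.frame(res_df)

# .outFile as a filename: the result references an Xdf file at that location
out_path <- file.path(tempdir(), "mtcars_subset.xdf")
res_xdf <- subset(mtx, mpg > 20, c(mpg, cyl), .outFile=out_path)
file.exists(out_path)
```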
subset in base R, filter, select, rxDataStep
library(dplyrXdf)

# create an Xdf data source from the built-in mtcars data frame
mtx <- as_xdf(mtcars, overwrite=TRUE)

# keep rows with mpg > 20 and only the mpg and cyl columns
tbl <- subset(mtx, mpg > 20, c(mpg, cyl))
dim(tbl)

# transform and filter simultaneously with .rxArgs
tbl2 <- subset(mtx, mpg > 20, c(mpg, cyl), .rxArgs=list(transforms=list(mpg2=2 * mpg)))
dim(tbl2)
names(tbl2)

# save to a persistent Xdf file
subset(mtx, mpg > 20, c(mpg, cyl), .outFile="mtcars_subset.xdf")