Convert columns in an Xdf file to factor
factorise(.data, ...)
factorize(.data, ...)
## S3 method for class 'RxXdfData'
factorise(.data, ..., .outFile = tbl_xdf(.data), .rxArgs)
## S3 method for class 'RxFileData'
factorise(.data, ..., .outFile = tbl_xdf(.data), .rxArgs)
all_character(vars = .varTypes)
all_integer(vars = .varTypes)
all_numeric(vars = .varTypes)
## S3 method for class 'data.frame'
factorise(.data, ...)
## S3 method for class 'RxDataSource'
factorise(.data, ...)
.data: A data source.
...: Variables to convert to factors.
.outFile: Output format for the returned data. If not supplied, create an xdf tbl; if NULL, return a data frame; if a filename, save an Xdf file to that location.
.rxArgs: A list of RevoScaleR arguments.
The selector functions listed in select also work with factorise. In addition, you can use the following:
all_character(): selects all character variables
all_integer(): selects all integer variables, i.e. those of type "logical" and "integer"
all_numeric(): selects all numeric variables, i.e. those of type "numeric", "Date", "POSIXct", "logical" and "integer"
If no variables are specified, all character variables will be converted to factors.
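The type-based selectors above can be pictured as simple filters over a named vector of column types. The following is a minimal base R sketch, not the package's implementation: the `var_types` vector is a hypothetical stand-in for `.varTypes`, and the `sel_*` names are made up for illustration.

```r
# Hypothetical stand-in for .varTypes: column name -> storage type
var_types <- c(name = "character", flag = "logical", n = "integer",
               x = "numeric", when = "Date")

# all_character(): character columns only (also the default selection
# when no variables are specified)
sel_character <- names(var_types)[var_types == "character"]

# all_integer(): columns of type "logical" or "integer"
sel_integer <- names(var_types)[var_types %in% c("logical", "integer")]

# all_numeric(): the broader set of number-like types
sel_numeric <- names(var_types)[
  var_types %in% c("numeric", "Date", "POSIXct", "logical", "integer")
]
```

Note that under these definitions every column picked by all_integer() is also picked by all_numeric().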
You can specify the levels for a variable by specifying them as the value of the argument. For example, factorise(*, x = c("a","b","c")) will turn the variable x into a factor with three levels a, b and c. Any values that don't match the set of levels will be turned into NAs. In particular, this means you should include the existing levels for variables that are already factors.
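The effect of supplying an explicit set of levels can be illustrated with base R's factor(), which the data frame method is described as calling. This is a sketch of the semantics only; it assumes the Xdf methods treat unmatched values the same way, as the text above states.

```r
# Values outside the supplied levels become NA
x <- c("a", "b", "d", "c", "a")
f <- factor(x, levels = c("a", "b", "c"))
levels(f)
is.na(f)   # only the "d" entry is NA

# For a variable that is already a factor, leaving out an existing
# level turns its values into NA as well
g  <- factor(c("lo", "hi", "lo"))
g2 <- factor(g, levels = "lo")
as.character(g2)
```

To extend a factor safely, pass the union of the old and new levels, for example `c(levels(g), "mid")`.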
For performance reasons, factors created by factorise are not sorted; instead, the ordering of their levels will be determined by the order in which they are encountered in the data.
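The difference between sorted levels and levels in order of appearance can be shown in base R. This is only an analogy: by default factor() sorts its levels, while passing `unique(x)` as the levels mimics the encounter order described above. How factorise orders levels on a real Xdf file may also depend on how the data is read.

```r
x <- c("pear", "apple", "pear", "banana")

# Base R default: levels are sorted
sorted_levels <- levels(factor(x))

# Order of first appearance, analogous to an unsorted factor
encounter_levels <- levels(factor(x, levels = unique(x)))
```

If a particular level order matters, for instance for a reference category in a model, specifying the levels explicitly is the safer choice.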
The method for RxXdfData objects is a shell around rxFactors, which is the standard RevoScaleR function for factor manipulation. For RxFileData objects, the method calls rxImport with an appropriately constructed colInfo argument.
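As a rough illustration of what "appropriately constructed" might mean, the sketch below builds the kind of named lists that rxFactors (via `factorInfo`) and rxImport (via `colInfo`) accept. The list shapes are an assumption about how the arguments could be assembled, not the package's actual code; the file names in the commented calls are hypothetical, and those calls need RevoScaleR to run.

```r
vars <- c("am", "vs")

# Assumed shape for rxFactors: one named entry per variable to convert
factor_info <- setNames(lapply(vars, function(v) list(varName = v)), vars)

# Assumed shape for rxImport: mark each column as a factor
col_info <- setNames(lapply(vars, function(v) list(type = "factor")), vars)

# With RevoScaleR available, the lists would be passed along these lines
# (hypothetical file names, left commented out):
# RevoScaleR::rxFactors(inData = "in.xdf", outFile = "out.xdf",
#                       factorInfo = factor_info, sortLevels = FALSE)
# RevoScaleR::rxImport(inData = "in.csv", outFile = "out.xdf",
#                      colInfo = col_info)
```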
The data frame method simply calls factor to convert the specified columns into factors.
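The data frame behaviour can be approximated in a few lines of base R. The function below, `factorise_df`, is a hypothetical stand-in written for illustration and is not the package's method; it takes column names as a character vector rather than unquoted names or selector functions.

```r
# Hypothetical helper: convert the named columns to factors, defaulting
# to all character columns when none are given
factorise_df <- function(df, vars = names(df)[vapply(df, is.character, logical(1))]) {
  df[vars] <- lapply(df[vars], factor)
  df
}

out <- factorise_df(mtcars, c("am", "vs"))

# Default selection: only the character column is converted
small <- data.frame(s = c("x", "y"), n = 1:2, stringsAsFactors = FALSE)
small_out <- factorise_df(small)
```

Because this sketch calls factor() with its defaults, the resulting levels are sorted.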
An object representing the returned data. This depends on the .outFile argument: if missing, it will be an xdf tbl object; if NULL, a data frame; and if a filename, an Xdf data source referencing a file saved to that location.
chol, qr, svd for the other meaning of factorise
mtx <- as_xdf(mtcars, overwrite=TRUE)
tbl1 <- factorise(mtx, am, vs)
tbl_types(tbl1)
tbl2 <- factorise(mtx, all_numeric())
tbl_types(tbl2)
# selector functions used by select(), rename() also work
tbl3 <- factorise(mtx, starts_with("m"))
tbl_types(tbl3)
# save to a persistent Xdf file
factorise(mtx, am, vs, .outFile="mtcars_factor.xdf")
# factorise() also works with data frames
tbl4 <- factorise(mtcars, cyl)
tbl_types(tbl4)