Make your own encoder to be used in a pipeline
X |
The data.frame/data.table to transform. |
Y |
Optional: The dependent variable to ignore in the transformation. |
fact |
Optional: The factor variable(s) to encode by - either positive integer(s) specifying the column number(s), or the name(s) of the column(s). If left empty, a heuristic is used to determine the factor variable(s), and a warning is written with the names of the variables converted. |
method |
Optional: A character string indicating which encoding method to use, one of the following:
* "mean"
* "median"
* "deviation"
* "lowrank"
* "spca"
* "mnl"
* "dummy"
* "difference"
* "helmert"
* "simple_effect"
* "repeated_effect"
If only a single method is specified, it is used to encode either all of the variables supplied through *fact*, or the variables which have been flagged as factors automatically. If multiple methods are specified, the number of methods must match the number of factor variables in *fact*, and the methods are applied to the factors in the order in which they were supplied. If there is a mismatch, an error is raised. If left empty, the appropriate method is selected on a case-by-case basis (and the selected methods are written out to the console). |
custom_encoding_assignment |
**experimental** A function which takes two arguments (**X** and **fact**), denoting the data and the factors, respectively, and assigns a valid encoding **method** to each factor in **fact**. |
... |
Not implemented. |
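The matching rule described for *method* and the shape of a **custom_encoding_assignment** function can be sketched in plain R. This is only an illustration under stated assumptions: `resolve_methods` and `assign_by_levels` are hypothetical names that are not part of the package, the return type of the assignment function (a character vector with one method per factor) is assumed, and the cutoff of 3 levels is arbitrary.

```r
# Method names as listed for the `method` argument.
valid_methods <- c("mean", "median", "deviation", "lowrank", "spca", "mnl",
                   "dummy", "difference", "helmert", "simple_effect",
                   "repeated_effect")

# Hypothetical helper (not part of the package): pair each factor with a method.
resolve_methods <- function(fact, method) {
  bad <- setdiff(method, valid_methods)
  if (length(bad) > 0) {
    stop("Unknown encoding method(s): ", paste(bad, collapse = ", "))
  }
  if (length(method) == 1) {
    # A single method is applied to every factor.
    method <- rep(method, length(fact))
  } else if (length(method) != length(fact)) {
    # Several methods must match the factors one-to-one.
    stop("Got ", length(method), " methods for ", length(fact), " factors.")
  }
  # Methods are matched to factors in the order supplied.
  stats::setNames(method, fact)
}

# Hypothetical custom assignment: takes X and fact, returns one method per
# factor. The cutoff of 3 levels is arbitrary and purely illustrative.
assign_by_levels <- function(X, fact) {
  n_levels <- vapply(fact, function(f) length(unique(X[[f]])), integer(1))
  ifelse(n_levels <= 3, "dummy", "mean")
}

X <- data.frame(
  colour = factor(c("red", "blue", "red", "green")),
  city   = factor(c("a", "b", "c", "d")),
  value  = c(1.5, 2.0, 0.5, 3.0)
)

# One method recycled over both factors.
single <- resolve_methods(c("colour", "city"), "dummy")

# Methods chosen per factor by the custom assignment function.
assigned <- resolve_methods(c("colour", "city"),
                            assign_by_levels(X, c("colour", "city")))

# Three methods for two factors: the error message is captured, not thrown.
mismatch <- tryCatch(
  resolve_methods(c("colour", "city"), c("mean", "dummy", "helmert")),
  error = function(e) conditionMessage(e)
)
```

Validating the pairing up front means a mismatch fails before any column is transformed, which is the behaviour the argument description implies.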
Automatically selects the appropriate method given the number of anticipated newly created variables, based on the results in Johannemann et al. (2019), 'Sufficient Representations for Categorical Variables', and a simple heuristic - where
A new data.table X which contains the new columns and optionally the old factor(s).
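To make the shape of such a return value concrete, here is a minimal base R sketch of a "dummy"-style encoding. It is an assumption-laden illustration rather than the package's implementation: it uses a plain data.frame instead of a data.table, `dummy_encode` and its `keep_factor` flag are hypothetical, and the package's own "dummy" method may name or construct its columns differently.

```r
X <- data.frame(
  colour = factor(c("red", "blue", "red", "green")),
  value  = c(1.5, 2.0, 0.5, 3.0)
)

# Hypothetical helper; the name and the keep_factor flag are assumptions.
dummy_encode <- function(X, fact, keep_factor = FALSE) {
  # Build indicator columns with base R treatment contrasts, then drop the
  # intercept so one column remains per level except the first.
  design  <- stats::model.matrix(stats::reformulate(fact), data = X)
  dummies <- design[, -1, drop = FALSE]
  out <- cbind(X, as.data.frame(dummies))
  # Optionally drop the original factor column.
  if (!keep_factor) out[[fact]] <- NULL
  out
}

encoded <- dummy_encode(X, "colour")                      # factor removed
kept    <- dummy_encode(X, "colour", keep_factor = TRUE)  # factor retained
```

Both results have the same number of rows as the input; they differ only in whether the original `colour` column is still present next to the new indicator columns.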