Description Usage Arguments Value Examples
View source: R/target_encoder.R
This function encodes categorical variables with average target values for each category.
1 2 3 4 5 6 7 8 | target_encoder(
X_train,
X_test = NULL,
y,
cat_columns,
prior = 0.5,
objective = "regression"
)
|
X_train |
A 'tibble' or 'data.frame' representing the training data set containing some categorical features/columns. |
X_test |
A 'tibble' or 'data.frame' representing the test set, containing some set of categorical features/columns. |
y |
A numeric vector or character vector representing the target variable. If the objective is "binary", then the vector should only contain two unique values. |
cat_columns |
A character vector containing the names of the categorical columns in the tibble that should be encoded. |
prior |
A number in [0, inf] that acts as pseudo counts when calculating the encodings. Useful for preventing encodings of 0 for when the training set does not have particular categories observed in the test set. A larger value gives less weight to what is observed in the training set. A value of 0 incorporates no prior information. The default value is 0.5. |
objective |
A string, either "regression" or "binary" specifying the problem. Default is regression. |
A list containing with processed training and test sets, in which the named categorical columns are replaced with their encodings.
1 2 3 4 5 6 | target_encoder(
X_train = mtcars,
y = mtcars$mpg,
cat_columns = c("gear", "carb"),
prior = 0.5,
objective = "regression")
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.