View source: R/target_encoders.R
Calculates out-of-fold mean features (also known as target encoding) for train and test data. This strategy is widely used to avoid overfitting or target leakage when creating features from the target variable. This method is experimental. If the results you get are unexpected, please report them in the GitHub issues.
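To make the out-of-fold idea concrete, here is a minimal base R sketch for the training side only. It is not this package's implementation: the helper name `oof_mean_sketch`, the random fold assignment, and the fallback to the global target mean for categories not seen in the other folds are all assumptions made for illustration.

```r
# Hypothetical helper for illustration; not the code behind kFoldMean().
oof_mean_sketch <- function(train_df, colname, target, n_fold = 5, seed = 42) {
  set.seed(seed)
  n <- nrow(train_df)
  categories <- as.character(train_df[[colname]])
  y <- train_df[[target]]

  # Randomly assign each row to one of n_fold folds.
  fold_id <- sample(rep(seq_len(n_fold), length.out = n))

  global_mean <- mean(y)
  encoded <- rep(NA_real_, n)

  for (k in seq_len(n_fold)) {
    in_fold <- fold_id == k
    if (!any(in_fold)) next

    # Per-category target means, computed only from rows outside fold k,
    # so a row's own target never contributes to its encoded value.
    fold_means <- tapply(y[!in_fold], categories[!in_fold], mean)

    # Look up each held-out row's category; unseen categories yield NA.
    idx <- match(categories[in_fold], names(fold_means))
    encoded[in_fold] <- as.numeric(fold_means)[idx]
  }

  # Assumption: categories unseen in the other folds get the global mean.
  encoded[is.na(encoded)] <- global_mean
  encoded
}

train <- data.frame(region = c('del','csk','rcb','del','csk','pune','guj','del'),
                    win = c(0,1,1,0,0,0,0,1))
oof <- oof_mean_sketch(train, colname = 'region', target = 'win', seed = 1220)
```

In this sketch, a category that occurs only once in the training data (such as 'rcb', 'pune', or 'guj' here) can never be estimated from the other folds, so it always receives the fallback value. How kFoldMean handles that case is not stated on this page.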
kFoldMean(train_df, test_df, colname, target, n_fold = 5, seed = 42)
train_df | train dataset
test_df | test dataset
colname | name of the categorical column
target | name of the target (dependent) variable, given as a string
n_fold | the number of folds to use for the k-fold computation, default = 5
seed | the seed value, to ensure reproducibility; it can be any positive value, default = 42
Train and test data tables, returned as the train and test elements of the result, with the out-of-fold mean value of the target for the given categorical variable.
train <- data.frame(region = c('del','csk','rcb','del','csk','pune','guj','del'),
                    win = c(0,1,1,0,0,0,0,1))
test <- data.frame(region = c('rcb','csk','rcb','del','guj','pune','csk','kol'))
train_result <- kFoldMean(train_df = train,
                          test_df = test,
                          colname = 'region',
                          target = 'win',
                          seed = 1220)$train
test_result <- kFoldMean(train_df = train,
                         test_df = test,
                         colname = 'region',
                         target = 'win',
                         seed = 1220)$test