f_train: Training xgboost models


View source: R/f_train.R

Description

Training xgboost models

Usage

f_train(
  df_train,
  target_name,
  var = NULL,
  nrounds = 32,
  max.depth = 4,
  eta = 1,
  min_child_weight = 50,
  early_stopping_rounds = 25,
  subsample = 0.9,
  colsample_bytree = 0.9,
  gamma = 1,
  ...
)

Arguments

df_train

Training data. Should be a data.table object.

target_name

Target variable. Should be a string naming a variable present in the training data.

var

Variables used to train the model. Should be a character vector. The default is NULL, meaning that all variables available from f_indicators are used.

nrounds

Maximum number of boosting iterations. Default is 32.

max.depth

Maximum depth of each tree, i.e. the number of successive splits allowed from the root to a leaf. Default is 4.

eta

Step size (learning rate) of each boosting step. A large eta may lead to unstable results. Default is 1.

min_child_weight

Minimum number of instances required in a child node. A split is not made if it would produce a leaf node with fewer than min_child_weight instances. Default is 50.

early_stopping_rounds

Stopping criterion. If performance has not improved after k (= early_stopping_rounds) iterations, training stops. Default is 25.

subsample

Subsample proportion of the training data used for each boosting iteration. Should be a number between 0 and 1. Default is 0.9.

colsample_bytree

Column subsample proportion for each tree, i.e. the fraction of predictors sampled when building each tree. Should be a number between 0 and 1. Default is 0.9.

gamma

Minimum loss reduction required to make a further partition on a leaf node of the tree. The larger gamma is, the more conservative the algorithm will be. Default is 1.

...

Additional parameters passed on to xgboost.

Value

An xgboost model object.
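
Examples

A minimal, hedged sketch of how f_train might be called. The simulated data.table, the column names y, x1 and x2, and the choice to forward objective to xgboost through ... are illustrative assumptions, not taken from the package documentation.

 library(data.table)
 library(kb.modelling)

 ## Simulated training data (illustrative only): two numeric predictors and a 0/1 target
 set.seed(1)
 df_train <- data.table(x1 = rnorm(500), x2 = rnorm(500))
 df_train[, y := as.integer(x1 + x2 + rnorm(500) > 0)]

 ## Train with explicit predictors; arguments not in the signature (here, objective)
 ## are assumed to be forwarded to xgboost via ...
 model <- f_train(
   df_train = df_train,
   target_name = "y",
   var = c("x1", "x2"),
   nrounds = 50,
   eta = 0.3,
   objective = "binary:logistic"
 )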


kristian-bak/kb.modelling documentation built on Dec. 21, 2021, 7:46 a.m.