train_model4: Train model 4
In markgene/yamatClassifier: Yet Another Methylation Array Toolkit: Classifier

Description

Similar to model 1, but with tentative features from Boruta.
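
For context on "tentative features": Boruta labels each feature as Confirmed, Tentative, or Rejected, and `getSelectedAttributes()` can return the confirmed features alone or together with the tentative ones. The sketch below shows the difference on `iris`, which is stand-in data chosen for illustration only. It assumes the Boruta package is installed and is not taken from the internals of `train_model4`.

```r
# Stand-in data (iris); runs only if the Boruta package is available.
if (requireNamespace("Boruta", quietly = TRUE)) {
  set.seed(56)
  bor <- Boruta::Boruta(x = iris[, 1:4], y = iris$Species)

  # Confirmed features only.
  confirmed <- Boruta::getSelectedAttributes(bor, withTentative = FALSE)

  # Confirmed plus tentative features: the larger set the description refers to.
  with_tentative <- Boruta::getSelectedAttributes(bor, withTentative = TRUE)

  # Counts of Tentative / Confirmed / Rejected decisions.
  print(table(bor$finalDecision))
}
```

The set returned with `withTentative = TRUE` always contains the confirmed set, so keeping tentative features can only widen the feature list passed to the classifier.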

Usage

train_model4(
  dat,
  response_name,
  outer_cv_folds = 5,
  inner_cv_folds = 5,
  random_state = 56,
  mtry = NULL,
  save_level = 3,
  save_prefix = "train_model4_",
  overwrite = FALSE,
  output = NULL,
  verbose = TRUE
)

Arguments

`dat`	a `data.frame` of input data.
`response_name`	column name of the response.
`outer_cv_folds`	number of outer cross-validation folds.
`inner_cv_folds`	number of inner cross-validation folds.
`random_state`	random seed.
`mtry`	a vector of candidate `mtry` values for parameter tuning.
`save_level`	if `save_level > 0`, save the outer training indices. If `save_level > 1`, also save the calibrated probabilities and selected features.
`save_prefix`	output file prefix.
`overwrite`	overwrite existing result files or not.
`output`	output directory.
`verbose`	a logical value (`TRUE` or `FALSE`).
`feature_selection`	feature selection method.

Details

Key steps:

The tuning loop tunes a single parameter, `mtry`.
Outer cross-validation splits the data set into M folds (`outer_cv_folds`); each fold serves in turn as the testing set while the remaining folds form the training set.
Inner cross-validation splits the training set of each outer fold into N folds (`inner_cv_folds`). Each inner fold performs feature selection with the Boruta algorithm, followed by random forest classification. Once all inner folds are done, a calibration model is trained: ridge-penalized multinomial logistic regression (MR), with lambda selected by `cv.glmnet`. The random forest and calibration models are then applied to the testing set of the outer CV.
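
The steps above can be sketched for a single outer fold as follows. This is a minimal illustration under stated assumptions, not the package's implementation: `make_folds`, `select_features`, `fit_rf`, and `run_outer_fold` are hypothetical helpers, `iris` is stand-in data, a single fixed `mtry` is used instead of a tuning loop, and the random forest is assumed to be refit on the full outer training set before testing. It requires the Boruta, randomForest, and glmnet packages.

```r
# Hypothetical helper: randomly assign n rows to k folds of near-equal size.
make_folds <- function(n, k) sample(rep(seq_len(k), length.out = n))

# Hypothetical helper: Boruta feature selection, keeping tentative features.
select_features <- function(x, y) {
  bor <- Boruta::Boruta(x = x, y = y)
  Boruta::getSelectedAttributes(bor, withTentative = TRUE)
}

# Hypothetical helper: random forest restricted to the selected features.
fit_rf <- function(x, y, feats, mtry) {
  randomForest::randomForest(x[, feats, drop = FALSE], y,
                             mtry = min(mtry, length(feats)))
}

# One outer fold: inner CV, then ridge calibration, then test-set prediction.
run_outer_fold <- function(x, y, test_rows, inner_k = 5, mtry = 2) {
  x_tr <- x[!test_rows, , drop = FALSE]
  y_tr <- y[!test_rows]
  inner <- make_folds(nrow(x_tr), inner_k)

  # Out-of-fold random forest scores for every outer-training sample.
  oof <- matrix(NA_real_, nrow(x_tr), nlevels(y),
                dimnames = list(NULL, levels(y)))
  for (j in seq_len(inner_k)) {
    held <- inner == j
    feats <- select_features(x_tr[!held, ], y_tr[!held])
    rf <- fit_rf(x_tr[!held, ], y_tr[!held], feats, mtry)
    oof[held, ] <- predict(rf, x_tr[held, feats, drop = FALSE], type = "prob")
  }

  # Ridge (alpha = 0) multinomial regression; cv.glmnet selects lambda.
  calib <- glmnet::cv.glmnet(oof, y_tr, family = "multinomial", alpha = 0)

  # Assumption: refit on the whole outer training set before testing.
  feats <- select_features(x_tr, y_tr)
  rf <- fit_rf(x_tr, y_tr, feats, mtry)
  raw <- unclass(predict(rf, x[test_rows, feats, drop = FALSE], type = "prob"))

  # Calibrated class probabilities for the outer testing set.
  predict(calib, newx = raw, s = "lambda.min", type = "response")[, , 1]
}

pkgs <- c("Boruta", "randomForest", "glmnet")
if (all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))) {
  set.seed(56)
  x <- iris[, 1:4]
  y <- iris$Species
  outer <- make_folds(nrow(x), 5)
  calibrated <- run_outer_fold(x, y, test_rows = outer == 1)
  # Fraction of test samples whose top calibrated class matches the label.
  print(mean(colnames(calibrated)[max.col(calibrated)] == y[outer == 1]))
}
```

Fitting the calibration model on out-of-fold scores, rather than on scores from a forest that has already seen the same samples, keeps the calibration inputs comparable to what the forest produces on unseen data. A tuning loop over `mtry` would repeat the inner loop once per candidate value.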