cross_validate_model5: Cross-validate model 5

View source: R/train_model5.R

cross_validate_model5    R Documentation

Cross-validate model 5

Description

Similar to model 1, but with tentative features from Boruta.
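The difference between confirmed and tentative Boruta features can be illustrated with a short sketch. It uses Boruta::getSelectedAttributes() and its withTentative argument on the built-in iris data, which is only a stand-in for the real input. Whether cross_validate_model5 makes exactly this call internally is an assumption, since the package source is not shown on this page.

```r
# Sketch only: iris is stand-in data, not the package's real input.
if (requireNamespace("Boruta", quietly = TRUE)) {
  set.seed(56)
  x <- iris[, 1:4]
  y <- iris$Species

  # Run the Boruta all-relevant feature selection once.
  fit <- Boruta::Boruta(x = x, y = y)

  # Confirmed features only.
  confirmed_only <- Boruta::getSelectedAttributes(fit, withTentative = FALSE)

  # Confirmed plus tentative features.
  with_tentative <- Boruta::getSelectedAttributes(fit, withTentative = TRUE)

  # Features that are kept only because tentative ones are included.
  print(setdiff(with_tentative, confirmed_only))
}
```

On a small, clean data set such as iris the two sets will often be identical; the distinction matters more for high-dimensional data where many features remain undecided.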

Usage

cross_validate_model5(
  dat,
  response_name,
  outer_cv_folds = 3,
  inner_cv_folds = 3,
  calibration_youden_index_threshold = 0.9,
  calibration_lambda_min_ratio = 1e-06,
  random_state = 56,
  mtry = NULL,
  save_level = 3,
  save_prefix = "train_model5_",
  overwrite = FALSE,
  output = NULL,
  verbose = TRUE
)

Arguments

dat

a data.frame of input data.

response_name

column name of the response.

outer_cv_folds

number of outer cross-validation folds. Defaults to 3.

inner_cv_folds

number of inner cross-validation folds. Defaults to 3.

calibration_youden_index_threshold

a float giving the Youden index threshold. Defaults to 0.9.

calibration_lambda_min_ratio

see lambda.min.ratio of glmnet. Defaults to 1e-6.

random_state

random seed.

mtry

a vector of candidate mtry values for parameter tuning.

save_level

if save_level > 0, save the outer train index. If save_level > 1, also save the calibrated probabilities and selected features.

save_prefix

output file prefix.

overwrite

whether to overwrite existing result files. Defaults to FALSE.

output

output directory.

verbose

a logical value (TRUE or FALSE). Defaults to TRUE.

feature_selection

feature selection method. This argument does not appear in the function signature shown under Usage.
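For readers unfamiliar with the quantity behind calibration_youden_index_threshold, the sketch below computes the Youden index (sensitivity + specificity - 1) for one class versus the rest at a probability cutoff. The helper youden_index() and the toy vectors are hypothetical and are not part of yamatClassifier; how the package compares the index with the 0.9 threshold is not described on this page.

```r
# Hypothetical helper: Youden's J for one class versus the rest.
# truth: logical vector, TRUE for samples of the class of interest.
# prob:  predicted probability of that class for each sample.
youden_index <- function(truth, prob, cutoff = 0.5) {
  stopifnot(is.logical(truth), length(truth) == length(prob))
  pred <- prob >= cutoff
  sensitivity <- sum(pred & truth) / sum(truth)
  specificity <- sum(!pred & !truth) / sum(!truth)
  sensitivity + specificity - 1
}

# Toy data (made up for illustration).
truth <- c(TRUE, TRUE, TRUE, TRUE, FALSE, FALSE, FALSE, FALSE)
prob  <- c(0.95, 0.80, 0.70, 0.30, 0.40, 0.20, 0.10, 0.05)

j <- youden_index(truth, prob, cutoff = 0.5)
print(j)          # sensitivity 3/4, specificity 4/4, so J = 0.75
print(j >= 0.9)   # FALSE: this toy classifier would not reach a 0.9 threshold
```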

Details

Key steps:

  1. The tuning loop tunes a single parameter, mtry.

  2. Outer cross-validation splits the data set into M folds (outer_cv_folds), each with a training set and a testing set.

  3. Inner cross-validation splits the training set of each outer fold into N folds (inner_cv_folds). Each inner fold runs feature selection with the Boruta algorithm, followed by random forest classification. When all inner folds are done, a calibration model is trained using ridge multinomial logistic regression (MR); its lambda is chosen with cv.glmnet. The random forest and calibration models are then applied to the testing set of the outer CV.
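The nested structure of the steps above can be sketched in a few lines. This is an illustration under stated assumptions, not the package's implementation: make_folds() is a hypothetical helper, iris stands in for the real data, the Boruta and random forest steps are left as comments, and the calibration example feeds raw measurements to cv.glmnet where the real workflow would use random forest scores. The use of s = "lambda.min" is also an assumption.

```r
# Hypothetical helper: randomly assign n rows to k roughly equal folds.
make_folds <- function(n, k, seed = 56) {
  set.seed(seed)
  sample(rep(seq_len(k), length.out = n))
}

dat <- iris                  # stand-in data (assumption)
response_name <- "Species"
outer_cv_folds <- 3          # M
inner_cv_folds <- 3          # N

outer_fold <- make_folds(nrow(dat), outer_cv_folds)
splits <- list()

for (i in seq_len(outer_cv_folds)) {
  outer_train <- which(outer_fold != i)
  outer_test  <- which(outer_fold == i)

  inner_fold <- make_folds(length(outer_train), inner_cv_folds, seed = 56 + i)
  for (j in seq_len(inner_cv_folds)) {
    inner_train <- outer_train[inner_fold != j]
    inner_valid <- outer_train[inner_fold == j]
    # Here: Boruta feature selection and a random forest fitted on
    # inner_train, then class probabilities predicted for inner_valid.
    splits[[length(splits) + 1]] <- list(
      outer = i, inner = j, train = inner_train, valid = inner_valid
    )
  }
  # Here: fit the ridge calibration model on the pooled inner-fold
  # probabilities, then apply random forest + calibration to outer_test.
}

# Ridge multinomial calibration with glmnet (alpha = 0 selects ridge).
if (requireNamespace("glmnet", quietly = TRUE)) {
  set.seed(56)
  scores <- as.matrix(dat[, 1:4])   # stand-in for random forest scores
  calib <- glmnet::cv.glmnet(
    scores, dat[[response_name]],
    family = "multinomial", alpha = 0, lambda.min.ratio = 1e-6
  )
  # One row per sample, one column per class.
  calibrated <- predict(calib, newx = scores, s = "lambda.min",
                        type = "response")[, , 1]
}
```

Keeping the inner folds strictly inside the outer training set means that the outer testing set is never seen during feature selection, model fitting, or calibration.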

Value

a list of cross-validation results for the given mtry values.


markgene/yamatClassifier documentation built on Oct. 14, 2024, 2:36 a.m.