run_all_enr_fit_mets: run_all_enr_fit_mets function
In clmacleod/highlandr: Random Useful Functions

run_all_enr_fit_mets

R Documentation

run_all_enr_fit_mets function

Description

Uses the cross-validation ENR functions to check, one metric after another, which model maximizes each of several performance metrics (DOCUMENTATION COMING; CURRENT DOCUMENTATION INCORRECT)

Usage

run_all_enr_fit_mets(
  dat,
  response_var,
  tune_type = "og",
  modname = "model",
  specs = TRUE,
  date_tf = TRUE,
  time_tf = FALSE,
  ties_measure = "mode",
  fit_mets = c("acc", "balacc", "ppv", "f1", "sens", "auroc", "npv", "spec", "logloss"),
  dir_name =
    "I:/Lagisetty SDR Misuse/5. Identifiable Data/E. Database/treatment arm creation/treatment arm creation/enr mods/",
  iter = 50,
  k = 10,
  num_alpha = 20,
  eq_wt = FALSE,
  lr_cutoff = seq(from = 0.05, to = 0.95, by = 0.05),
  ...
)
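
To make the two grid-valued defaults concrete, the following base R sketch expands them. The `lr_cutoff` expression is copied from the usage above. The alpha grid is only an assumption based on the `num_alpha` description (evenly spaced values from 0 to 1 in steps of 1/num_alpha) and may not match what the function does internally.

```r
# Cutoff grid: the default expression from the usage above.
lr_cutoff <- seq(from = 0.05, to = 0.95, by = 0.05)
length(lr_cutoff)   # 19 candidate cutoffs

# Alpha grid: ASSUMPTION based on the num_alpha description
# (evenly spaced from 0 to 1 in steps of 1 / num_alpha).
num_alpha <- 20
alpha_grid <- seq(from = 0, to = 1, by = 1 / num_alpha)
length(alpha_grid)  # 21 values, because both endpoints are included
```

Under that assumption the defaults imply 21 alphas and 19 cutoffs per metric, so a full run with `iter = 50` and `k = 10` can be expensive.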

Arguments

`dat`	data frame containing the data to be modeled
`response_var`	string identifying the name of the outcome variable
`tune_type`	string indicating the tuning method to use. current options are 'og' (default), which uses the 'en_kfold_model' function to tune alpha and cutoffs using k-fold cross validation, and 'grid_lim', which uses the 'en_kfold_model_grid_lim' function to simultaneously estimate the three parameters using a randomized expanded grid
`modname`	string of the base name of the model. default is 'model'
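`specs`	a vector of the first function to use (i.e. outside the parentheses) if fp = FALSE. this description gives the default as 'mean', but the usage above sets specs = TRUE, so treat this entry with caution. if supplying different functions be sure to quote them, e.g. "IQR"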
`date_tf`	boolean indicating if the date should be written to the output files. default is TRUE
`time_tf`	boolean indicating if the time should be written to the output files. default is FALSE
`ties_measure`	string indicating the method for breaking ties. default is 'mode', indicating that the model with the best performance across all fit metrics listed will win when model results are tied.
`fit_mets`	vector indicating all fit metrics to be used to evaluate model performance. options, as spelled in the default shown in the usage, are c("acc", "balacc", "ppv", "f1", "sens", "auroc", "npv", "spec", "logloss")
`dir_name`	string indicating the directory to which model results should be saved
`iter`	the number of iterations to use
`k`	the number of folds to use
`num_alpha`	an integer giving the number of alphas to consider. these are spread evenly across 0 to 1 in steps of 1/num_alpha. for example, if '5' is given then the alphas will be 0, .2, .4, .6, .8, 1
`eq_wt`	boolean indicating whether the 0/1 classes should be balanced with weights. you may want to use this if there is a bad class imbalance
`lr_cutoff`	vector of cutoff values to test/tune for optimization. the default shown in the usage is seq(from = 0.05, to = 0.95, by = 0.05). a single cutoff of 'c(.5)', which is to say 'equal distance from all classes', is what standard analyses typically use
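
As a rough illustration of how the threshold-based entries in `fit_mets` relate to `lr_cutoff`, the sketch below computes them from a confusion matrix at one cutoff and then sweeps a cutoff grid. The helper `fit_metrics_at_cutoff` and the toy data are hypothetical and not part of the package. The formulas are the standard definitions, which the package may or may not implement in the same way.

```r
# Hypothetical helper (not part of highlandr): threshold-based fit
# metrics for a binary 0/1 outcome at a single probability cutoff.
fit_metrics_at_cutoff <- function(truth, prob, cutoff = 0.5) {
  pred <- as.integer(prob >= cutoff)
  tp <- sum(pred == 1 & truth == 1)  # true positives
  tn <- sum(pred == 0 & truth == 0)  # true negatives
  fp <- sum(pred == 1 & truth == 0)  # false positives
  fn <- sum(pred == 0 & truth == 1)  # false negatives
  sens <- tp / (tp + fn)
  spec <- tn / (tn + fp)
  ppv  <- tp / (tp + fp)
  npv  <- tn / (tn + fn)
  c(acc    = (tp + tn) / length(truth),
    balacc = (sens + spec) / 2,
    ppv    = ppv,
    npv    = npv,
    sens   = sens,
    spec   = spec,
    f1     = 2 * ppv * sens / (ppv + sens))
}

# Toy data, made up for illustration only.
truth <- c(1, 1, 1, 0, 0, 0, 0, 1)
prob  <- c(0.9, 0.8, 0.4, 0.3, 0.6, 0.1, 0.2, 0.7)

# Metrics at the conventional 0.5 cutoff.
fit_metrics_at_cutoff(truth, prob, cutoff = 0.5)

# Sweep a cutoff grid and keep the cutoff with the best balanced accuracy.
cutoffs <- seq(from = 0.05, to = 0.95, by = 0.05)
balacc_by_cutoff <- sapply(
  cutoffs,
  function(ct) fit_metrics_at_cutoff(truth, prob, ct)[["balacc"]]
)
best_cutoff <- cutoffs[which.max(balacc_by_cutoff)]
```

Metrics such as `ppv` become `NaN` when a cutoff yields no predicted positives, so a sweep over other metrics needs to handle missing values. `auroc` and `logloss` are computed from the probabilities directly and are not covered by this sketch.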