run_all_enr_fit_mets: run_all_enr_fit_mets function

View source: R/enr functions.R

run_all_enr_fit_mets    R Documentation

run_all_enr_fit_mets function

Description

Uses the cross-validation ENR functions to check, one metric after another, which model maximizes each of several performance metrics. (DOCUMENTATION COMING; CURRENT DOCUMENTATION INCORRECT)

Usage

run_all_enr_fit_mets(
  dat,
  response_var,
  tune_type = "og",
  modname = "model",
  specs = TRUE,
  date_tf = TRUE,
  time_tf = FALSE,
  ties_measure = "mode",
  fit_mets = c("acc", "balacc", "ppv", "f1", "sens", "auroc", "npv", "spec", "logloss"),
  dir_name =
    "I:/Lagisetty SDR Misuse/5. Identifiable Data/E. Database/treatment arm creation/treatment arm creation/enr mods/",
  iter = 50,
  k = 10,
  num_alpha = 20,
  eq_wt = FALSE,
  lr_cutoff = seq(from = 0.05, to = 0.95, by = 0.05),
  ...
)

Arguments

dat

data frame containing the data to be modeled

response_var

string identifying the name of the outcome variable

tune_type

string indicating the tuning method to use. current options are 'og' (default), which uses the 'en_kfold_model' function to tune alpha and the cutoffs with k-fold cross validation, and 'grid_lim', which uses the 'en_kfold_model_grid_lim' function to estimate the three parameters simultaneously over a randomized expanded grid

modname

string of the base name of the model. default is 'model'

specs

a vector of the first function to use (i.e. outside the parentheses) if fp = FALSE. the default is described here as 'mean', although the usage above shows specs = TRUE. if supplying different functions, be sure to quote them, e.g. "IQR"

date_tf

boolean indicating if the date should be written to the output files. default is TRUE

time_tf

boolean indicating if the time should be written to the output files. default is FALSE

ties_measure

string indicating the method for breaking ties. default is 'mode', meaning that when model results are tied, the model with the best performance across all listed fit metrics wins.

fit_mets

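vector indicating all fit metrics to be used to evaluate model performance. the default, as shown in the usage, is c("acc", "balacc", "ppv", "f1", "sens", "auroc", "npv", "spec", "logloss"), i.e. accuracy, balanced accuracy, ppv, f1, sensitivity, auroc, npv, specificity, and log loss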
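As a rough illustration of what several of these metric names refer to, the sketch below computes them from a confusion matrix in base R. The data are made up, the formulas are the standard textbook definitions, and nothing here is taken from the package's internal code. "auroc" and "logloss" are left out because they need predicted probabilities rather than class labels.

```r
# Toy labels and predictions (made-up values, for illustration only)
truth <- c(1, 1, 1, 0, 0, 0, 0, 1, 0, 0)
pred  <- c(1, 0, 1, 0, 0, 1, 0, 1, 0, 0)

# Confusion-matrix counts
tp <- sum(pred == 1 & truth == 1)  # true positives
tn <- sum(pred == 0 & truth == 0)  # true negatives
fp <- sum(pred == 1 & truth == 0)  # false positives
fn <- sum(pred == 0 & truth == 1)  # false negatives

# Standard definitions, named after the fit_mets labels
sens   <- tp / (tp + fn)                    # sensitivity (recall)
spec   <- tn / (tn + fp)                    # specificity
ppv    <- tp / (tp + fp)                    # positive predictive value
npv    <- tn / (tn + fn)                    # negative predictive value
acc    <- (tp + tn) / length(truth)         # accuracy
balacc <- (sens + spec) / 2                 # balanced accuracy
f1     <- 2 * ppv * sens / (ppv + sens)     # F1 score

round(c(acc = acc, balacc = balacc, ppv = ppv, f1 = f1,
        sens = sens, npv = npv, spec = spec), 3)
```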

dir_name

string indicating the directory to which model results should be saved

iter

the number of iterations to use

k

the number of folds to use
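For readers unfamiliar with k-fold cross validation, here is a minimal base R sketch of how rows might be assigned to k folds. The variable names (n, fold_id) are hypothetical, and this is not necessarily how the package splits the data internally.

```r
set.seed(1)   # fixed seed so the split is reproducible
n <- 23       # hypothetical number of rows in dat
k <- 10       # number of folds, matching the default above

# Repeat the fold labels 1..k until there are n of them, then shuffle
fold_id <- sample(rep(seq_len(k), length.out = n))

# Fold sizes differ by at most one row
table(fold_id)

# Rows held out when fold 1 is the validation fold
which(fold_id == 1)
```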

num_alpha

an integer giving the number of alphas to consider. these are spread evenly across 0 to 1 in steps of 1/num_alpha. for example, if 5 is given, the alphas run from 0 to 1 in steps of 0.2 (i.e. 0, .2, .4, .6, .8, 1)
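The example grid in the description can be reproduced in base R as follows. This is only a sketch that matches the values listed above; the package may build its grid differently. In the usual elastic net parameterization (as in glmnet), alpha = 0 corresponds to ridge and alpha = 1 to the lasso.

```r
num_alpha <- 5

# Evenly spaced alphas from 0 to 1 in steps of 1 / num_alpha
alphas <- seq(0, 1, by = 1 / num_alpha)
alphas

# Both endpoints are included, so the grid has num_alpha + 1 values
length(alphas)
```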

eq_wt

boolean indicating whether the 0/1 classes should be balanced with weights. you may want to use this if there is a severe class imbalance
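One common way to balance two classes is to weight each observation by the inverse of its class frequency. The sketch below shows that scheme in base R on made-up data; it is an assumption about what "balanced with weights" could mean, not a description of the package's actual weighting.

```r
# Imbalanced toy outcome: 90 zeros and 10 ones (made-up data)
y <- c(rep(0, 90), rep(1, 10))

class_n <- table(y)  # observations per class

# Inverse-frequency weights: n / (number of classes * class size)
w <- as.numeric(length(y) / (length(class_n) * class_n[as.character(y)]))

# Each class now carries the same total weight
tapply(w, y, sum)
```

Weights of this form sum to the number of observations, so the overall scale of the loss is unchanged while the minority class counts for more per row.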

lr_cutoff

vector of cutoff values to test/tune for optimization. the default, as shown in the usage, is seq(from = 0.05, to = 0.95, by = 0.05). a single cutoff of c(.5), which is to say 'equal distance from all classes', is typical in standard analyses
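To make the role of the cutoff concrete, the following base R sketch scans the default cutoff grid and picks the value with the highest accuracy on a handful of made-up predicted probabilities. The names prob, truth, and acc_by_cutoff are hypothetical, and accuracy is used only as an example metric; the package's own tuning procedure is not shown here.

```r
# Default cutoff grid from the usage section
lr_cutoff <- seq(from = 0.05, to = 0.95, by = 0.05)

# Made-up predicted probabilities and true 0/1 labels
prob  <- c(0.12, 0.33, 0.62, 0.81, 0.48, 0.91)
truth <- c(0, 0, 1, 1, 1, 1)

# Classify as 1 when the probability reaches the cutoff, then score accuracy
acc_by_cutoff <- sapply(lr_cutoff, function(cut) {
  mean(as.integer(prob >= cut) == truth)
})

# which.max() returns the first cutoff that attains the best accuracy
best_cutoff <- lr_cutoff[which.max(acc_by_cutoff)]
best_cutoff
```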

Examples

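# dat and response_var have no defaults; my_data and "outcome" are placeholders
run_all_enr_fit_mets(dat = my_data, response_var = "outcome")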

clmacleod/highlandr documentation built on April 17, 2025, 3:30 a.m.