runCV: Run logit modeling


Description

Patients are first split into training and testing partitions. Samples with NA features are then removed. The training partition is further split for cross-validation such that no patient has events in both the validation and training folds. Both partitions are then prepared for logit fitting, and cross-validation is run.
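The patient-level splitting described above can be illustrated with a minimal base R sketch. This is not the package's implementation: the toy data frame, the column names (id, x1, label), the 25% test fraction, and the helper variables are all assumptions made for illustration.

```r
# Hypothetical toy data: one row per sample (event), three samples per patient.
set.seed(1)
toy <- data.frame(
  id    = rep(1:20, each = 3),   # patient identifier (assumed column name)
  x1    = rnorm(60),             # a single illustrative feature
  label = rbinom(60, 1, 0.4)     # binary label
)
toy$x1[c(5, 17)] <- NA           # two samples with a missing feature

# 1. Split patients (not rows) into training and testing partitions.
patients <- unique(toy$id)
testIds  <- sample(patients, size = round(0.25 * length(patients)))
train    <- toy[!toy$id %in% testIds, ]
test     <- toy[toy$id %in% testIds, ]

# 2. Remove samples with NA features.
train <- train[complete.cases(train), ]
test  <- test[complete.cases(test), ]

# 3. Assign each training patient to exactly one fold, so that all of a
#    patient's samples fall in the same fold.
folds    <- 5
trainIds <- unique(train$id)
foldOfId <- sample(rep(seq_len(folds), length.out = length(trainIds)))
names(foldOfId) <- trainIds
train$fold <- unname(foldOfId[as.character(train$id)])
```

Because folds are assigned per patient rather than per row, holding out any one fold for validation leaves no patient with events on both sides of the split.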

Usage

runCV(data, cvReps, formula, labelName, lasso = FALSE, llength = NULL,
  lmax = NULL, predictors = NULL, needToRemove = NULL,
  createModelMatrix = FALSE, metric = c("prAUC", "rocAUC"), seed, folds)

Arguments

data

dataframe; rows are samples, columns are features plus some metadata that is not meant for modeling and will be removed

formula

char or formula object

labelName

char, column name of binary label

lasso

logical, whether to use lasso regularization

llength

num, number of lambdas to consider up to lmax

lmax

num, maximum lambda to consider, cannot be NULL if lambda is NULL

predictors

char, names of columns in data that should be in logit fit data

needToRemove

char, names of columns in data that should not be in logit fit data

createModelMatrix

logical, whether to call model.matrix

metric

char, see aucs

seed

int, seed for split

folds

number of folds
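As a rough illustration of how llength and lmax could jointly define a set of candidate lambdas, the sketch below builds a log-spaced grid that starts at lmax and decreases. The log spacing, the minRatio value, and the example numbers are assumptions; this page does not state how the package constructs its grid.

```r
lmax     <- 0.1    # hypothetical maximum lambda
llength  <- 10     # hypothetical number of lambdas to consider
minRatio <- 1e-3   # assumed ratio of smallest to largest lambda

# Log-spaced, decreasing sequence of llength values, starting at lmax.
lambdas <- exp(seq(log(lmax), log(lmax * minRatio), length.out = llength))
```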

Value

list of logit fits, averaged cross-validation results (if cvReps > 1), and the test data partition (formatted like the training data and ready to be passed to a subsequently trained model)

See Also

prepLogitData, getCV, getPerformanceNames


novasmedley/gbmSpm documentation built on May 17, 2019, 10:39 a.m.