rNCV: Repeated, Nested Cross-Validation


View source: R/rNCV.R

Description

Performs repeated, nested cross-validation. Supports classification and regression. Note: only continuous variables are expected as predictors, and each category is assumed to contain a sufficient number of subjects.

Usage

rNCV(
  data,
  resp.var,
  ref.lv = NULL,
  nRep,
  nFolds.outer,
  methods,
  trControl,
  tuneLength,
  preProcess,
  metric,
  dir.path,
  file.root,
  stack.method = "wt.avg",
  weighted.by = NULL,
  stack.wt = NULL,
  control.stack = NULL,
  save.PredVal = FALSE
)

Arguments

data

The data frame containing the training set.

resp.var

Indicate the name of the column in the training set that contains the response variable.

ref.lv

Reference level for categorical variables. Defaults to NULL.

nRep

Number of times the nested cross-validation (nCV) is repeated.

nFolds.outer

Number of outer folds.
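To make the roles of nRep and nFolds.outer concrete, the following base R sketch draws a fresh set of outer folds for each repetition. It is an illustration of nested cross-validation in general, not the package's own code: make_outer_folds and its seed argument are hypothetical names introduced here.

```r
# Sketch only: how repeated outer folds could be drawn. This is NOT the
# package's code; make_outer_folds is a hypothetical helper.
make_outer_folds <- function(n, nRep, nFolds.outer, seed = 1) {
  set.seed(seed)
  lapply(seq_len(nRep), function(r) {
    # Shuffle balanced fold labels 1..nFolds.outer across the n rows
    fold_id <- sample(rep(seq_len(nFolds.outer), length.out = n))
    # One vector of held-out row indices per outer fold
    split(seq_len(n), fold_id)
  })
}

folds <- make_outer_folds(n = 20, nRep = 3, nFolds.outer = 5)
length(folds)        # number of repetitions
lengths(folds[[1]])  # held-out rows per outer fold in repetition 1
```

In a nested design, the rows held out by an outer fold are typically reserved for evaluation, while the remaining rows are used for model tuning with the inner resampling scheme (here presumably the one described by trControl).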

methods

Similar to the method argument in caret's train function, this argument is a list of strings specifying which classification or regression models to use. Possible values are found using names(getModelInfo()). See http://topepo.github.io/caret/train-models-by-tag.html. A list of functions can also be passed to supply custom models. See http://topepo.github.io/caret/using-your-own-model-in-train.html for details.

trControl

A list of values that define how this function acts. See trainControl and http://topepo.github.io/caret/using-your-own-model-in-train.html. (NOTE: If given, this argument must be named.)

tuneLength

An integer denoting the amount of granularity in the tuning parameter grid. By default, this argument is the number of levels for each tuning parameter that should be generated by train. If trainControl has the option search = "random", this is the maximum number of tuning parameter combinations that will be generated by the random search. (NOTE: If given, this argument must be named.)

preProcess

A string vector that defines a pre-processing of the predictor data. Current possibilities are "BoxCox", "YeoJohnson", "expoTrans", "center", "scale", "range", "knnImpute", "bagImpute", "medianImpute", "pca", "ica" and "spatialSign". The default is no pre-processing. See preProcess and trainControl on the procedures and how to adjust them. Pre-processing code is only designed to work when x is a simple matrix or data frame.
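As an illustration of what the "center" and "scale" options amount to, the base R sketch below estimates the column means and standard deviations on training rows and applies them to held-out rows. The data frames train_x and test_x are made up for this example; caret carries out the equivalent steps internally, so this code is not needed when calling rNCV.

```r
# Illustration with base R only: centering subtracts the column mean and
# scaling divides by the column standard deviation.
train_x <- data.frame(a = c(1, 2, 3, 4), b = c(10, 20, 30, 40))
test_x  <- data.frame(a = c(5, 6), b = c(50, 60))

mu    <- colMeans(train_x)        # estimated on the training rows only
sigma <- apply(train_x, 2, sd)

# The same training-set statistics are applied to both sets
train_scaled <- scale(train_x, center = mu, scale = sigma)
test_scaled  <- scale(test_x,  center = mu, scale = sigma)
```

Estimating the statistics on the training rows alone keeps information from the held-out rows out of the fitted transformation.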

metric

A string that specifies what summary metric will be used to select the optimal model. By default, possible values are "RMSE" and "Rsquared" for regression and "Accuracy" and "Kappa" for classification. If custom performance metrics are used (via the summaryFunction argument in trainControl), the value of metric should match one of the arguments. If it does not, a warning is issued and the first metric given by the summaryFunction is used. (NOTE: If given, this argument must be named.)
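A custom metric could be supplied along the following lines. The sketch follows caret's summaryFunction convention (a function of data, lev and model, where data has the columns obs and pred, returning a named numeric vector). The function name spearmanSummary and the metric name "Spearman" are hypothetical choices made for this example.

```r
# Hypothetical custom summary function: Spearman rank correlation between
# observed and predicted values. The element name ("Spearman") is the
# string that metric would then refer to.
spearmanSummary <- function(data, lev = NULL, model = NULL) {
  c(Spearman = cor(data$obs, data$pred, method = "spearman"))
}

# Stand-alone check on toy predictions
toy_pred <- data.frame(obs = c(1, 2, 3, 4), pred = c(1.1, 1.9, 3.2, 3.9))
spearmanSummary(toy_pred)

# With caret loaded, it would be wired in roughly as:
#   trControl = trainControl(method = "cv", summaryFunction = spearmanSummary)
#   metric    = "Spearman"
```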

dir.path

Directory where the CV data is stored.

file.root

Prefix for the CV filenames.

stack.method

???

weighted.by

???

stack.wt

???

control.stack

???

save.PredVal

Logical. Whether to save the output from the PredVal function. Defaults to FALSE.
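The arguments above might be assembled as in the following sketch. Everything specific in it is an assumption made for illustration: the toy data, the response column name "y", the choice of models, and the output location. The stacking arguments are left at the defaults shown under Usage, and the call itself is commented out because it requires the caret and caretStack packages and has not been verified here.

```r
# Sketch of a call, not run here. The toy data, the column name "y", the
# chosen models and the output location are all illustrative assumptions.
set.seed(1)
toy <- data.frame(x1 = rnorm(40), x2 = rnorm(40))
toy$y <- 2 * toy$x1 - toy$x2 + rnorm(40, sd = 0.1)

args <- list(
  data         = toy,
  resp.var     = "y",
  nRep         = 2,
  nFolds.outer = 5,
  methods      = list("lm", "glmnet"),
  tuneLength   = 3,
  preProcess   = c("center", "scale"),
  metric       = "RMSE",
  dir.path     = tempdir(),
  file.root    = "toy_rNCV"
)

# Requires the caret and caretStack packages:
# args$trControl <- caret::trainControl(method = "cv", number = 5)
# result <- do.call(caretStack::rNCV, args)
```

Passing every argument by name also satisfies the notes above that trControl, tuneLength and metric must be named.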


kforthman/caretStack documentation built on June 21, 2021, 8:38 a.m.