predict_two: Wrapper for executing the rNCV function, version 2.
In kforthman/caretStack: Stacks models from the caret package to create a more robust prediction


Performs repeated nested cross-validation on the input dataset.

predict_two(
  data,
  var_to_predict,
  targetType = c("binary", "categorical", "numerical"),
  predictor_var_file_list,
  rdata_prefix,
  outDir = "",
  rNCVdir = "rNCV",
  nFolds.outer = 5,
  nRep = 5,
  methods = c("svmRadial", "ranger", "glmnet"),
  metric = "RMSE",
  ncore = 1,
  cmp.grp = NA,
  ctrl.reg = NA
)

`data`	The dataset to use, in matrix form, including the predictors and the target.
`var_to_predict`	The column name of the target.
`targetType`	The type of the target: "binary", "categorical", or "numerical".
`predictor_var_file_list`	File name of a .csv file that lists the predictor variables. Can be a list of file names. Each file is expected to contain a list of variable names to include in the analysis as predictors.
`rdata_prefix`	Label to include in output file names.
`outDir`	Directory in which to save the output files, if somewhere other than the working directory.
`rNCVdir`	Name of the folder to which rNCV folds are automatically saved. For example, if rNCVdir is 'rNCV', rdata_prefix is 'my_ML', and outDir is 'Output', the rNCV files are saved to 'Output/rNCV/my_ML/'.
`nFolds.outer`	Number of outer folds.
`nRep`	Number of times nested cross-validation (nCV) is repeated.
`methods`	Similar to the `method` argument in caret's `train` function, this argument is a list of strings specifying which classification or regression models to use. Possible values are found using `names(getModelInfo())`. See http://topepo.github.io/caret/train-models-by-tag.html. A list of functions can also be passed for a custom model function. See http://topepo.github.io/caret/using-your-own-model-in-train.html for details.
`metric`	A string that specifies which summary metric is used to select the optimal model. By default, possible values are "RMSE" and "Rsquared" for regression and "Accuracy" and "Kappa" for classification. If custom performance metrics are used (via the `summaryFunction` argument in `trainControl`), the value of `metric` should match one of the arguments. If it does not, a warning is issued and the first metric given by the `summaryFunction` is used. (NOTE: If given, this argument must be named.)
`ncore`	Number of cores to use in parallel computing.
`ctrl.reg`	Custom trainControl settings, if desired. Otherwise, trainControl is set to the caretStack defaults.
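
To illustrate how `nFolds.outer` and `nRep` shape the resampling, the following base R sketch assigns rows to outer folds once per repetition. It is not the package's implementation: the helper `make_outer_folds` and its `seed` argument are hypothetical names introduced here, and the inner tuning loop is only indicated in a comment.

```r
# Illustrative only: assign each row to one of `nFolds.outer` outer folds,
# independently for each of `nRep` repetitions.
# `make_outer_folds` is a hypothetical helper, not part of caretStack.
make_outer_folds <- function(n, nFolds.outer = 5, nRep = 5, seed = 1) {
  set.seed(seed)
  lapply(seq_len(nRep), function(r) {
    # Balanced fold labels, reshuffled for every repetition
    sample(rep(seq_len(nFolds.outer), length.out = n))
  })
}

folds <- make_outer_folds(n = 103)
length(folds)        # one fold assignment per repetition
table(folds[[1]])    # fold sizes in the first repetition

# Rows held out as the test set for repetition 1, outer fold 1;
# an inner cross-validation for tuning would run on the remaining rows.
test_idx  <- which(folds[[1]] == 1)
train_idx <- which(folds[[1]] != 1)
```

With the defaults shown in the usage above (5 outer folds, 5 repetitions), a scheme like this yields 25 outer train/test splits.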

Target variable: Your target (dependent) variable can be categorical, binary, or numerical. If your target variable is categorical or binary, please ensure it is input as a factor. Only one target is allowed.

Predictor variables: Although the target can be binary or categorical, the predictors cannot. You can, however, convert binary and categorical predictors to numerical predictors. Binary predictors can be recoded as 0 and 1. Ordinal categorical predictors can be numerically ranked. Nominal categorical variables, like race, can be converted to numerical variables using one-hot encoding.
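
The three conversions described above can be sketched in base R as follows. The data frame and its column names (`smoker`, `education`, `site`) are made up for illustration, and the ordering of the education levels is an assumption of the example.

```r
# Hypothetical data: one binary, one ordinal, and one nominal predictor
df <- data.frame(
  smoker    = c("no", "yes", "no", "yes"),
  education = c("high school", "college", "graduate", "college"),
  site      = c("A", "B", "C", "A"),
  stringsAsFactors = FALSE
)

# Binary predictor -> 0/1
df$smoker <- as.integer(df$smoker == "yes")

# Ordinal predictor -> numeric rank, using an explicit level order
edu_levels <- c("high school", "college", "graduate")
df$education <- as.integer(factor(df$education, levels = edu_levels))

# Nominal predictor -> one-hot columns (one indicator column per level)
site_onehot <- model.matrix(~ site - 1, data = df)
df <- cbind(df[, c("smoker", "education")], site_onehot)
```

Removing the intercept with `- 1` in the formula makes `model.matrix` emit an indicator column for every level instead of dropping a reference level.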

Missing data: If the dataset contains predictors with missing data, the missing entries are imputed using KNN imputation. Any subject missing more than 30% of their predictor variables is removed from the analysis. Any case with missing data for the target variable is also removed.
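
The two removal rules above can be sketched in base R. This is an illustration under assumed data, not the package's own code: the data frame, the target name `y`, and the predictor names are hypothetical, and the imputation step is only noted in a comment.

```r
# Hypothetical data: `y` is the target, x1..x4 are the predictors
df <- data.frame(
  y  = c(1.2, NA, 0.7, 2.1),
  x1 = c(1, 2, NA, 4),
  x2 = c(5, 6, NA, 8),
  x3 = c(9, 10, 11, NA),
  x4 = c(13, 14, 15, 16)
)
predictors <- c("x1", "x2", "x3", "x4")

# Drop cases with a missing target
df <- df[!is.na(df$y), ]

# Drop cases missing more than 30% of their predictors
frac_missing <- rowMeans(is.na(df[, predictors]))
df <- df[frac_missing <= 0.30, ]

# Any predictor values still missing at this point are the ones
# that would be filled in by KNN imputation.
```

One way to perform KNN imputation with caret is `preProcess(..., method = "knnImpute")`, which also centers and scales the predictors; whether caretStack uses that call internally is not stated here.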

Saves the following files

Results file, [outDir]/[var_to_predict]_[rdata_prefix].results.RData, is a .RData file containing 5 objects:
- data is a list object that contains the response variable with the imputed predictors. Cases with no entry for the response variable are removed. This is the dataset that is passed to the function rNCV().
- res.rncv is the object returned from the function rNCV().
- output_label is the label used in the output file names.
- predictor_vars is a list of the names of the predictors. Predictors must be numeric.
- var_to_predict is the name of the target variable.
Summary file, [outDir]/[var_to_predict]_[rdata_prefix]_summary.csv, is a .csv file logging the performance summary.
Variable importance file, [outDir]/[var_to_predict]_[rdata_prefix]_VarImp.csv, is a .csv file logging variable importance.
rNCV files, saved for each repetition and fold as [outDir]/[rNCVdir]/[rdata_prefix]/[var_to_predict]_[rdata_prefix]_Rep_[x]_fold_[x].rda and [outDir]/[rNCVdir]/[rdata_prefix]/[var_to_predict]_[rdata_prefix]_Rep_[x]_fold_[x]-PredVal.rda.
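
As a rough guide to locating these files after a run, the naming patterns above can be assembled with `file.path` and `paste0`. The argument values below are example inputs, and the sketch assumes the patterns are applied literally as written.

```r
# Example argument values (assumptions for illustration)
outDir         <- "Output"
rNCVdir        <- "rNCV"
rdata_prefix   <- "my_ML"
var_to_predict <- "lme_slope_simple"

# Shared stem: [var_to_predict]_[rdata_prefix]
stem <- paste0(var_to_predict, "_", rdata_prefix)

results_file <- file.path(outDir, paste0(stem, ".results.RData"))
summary_file <- file.path(outDir, paste0(stem, "_summary.csv"))
varimp_file  <- file.path(outDir, paste0(stem, "_VarImp.csv"))

# rNCV file for repetition 1, fold 1
fold_file <- file.path(outDir, rNCVdir, rdata_prefix,
                       paste0(stem, "_Rep_1_fold_1.rda"))

# Load the results only if a previous run has produced them
if (file.exists(results_file)) {
  load(results_file)  # restores the objects listed for the results file
}
```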

# Load the prepared data and merge the 'Dep' and 'Anx' categories into 'Dep+Anx'
prepped_data <- read.csv('Data/prepped_hc_data.csv', stringsAsFactors = FALSE)
prepped_data[prepped_data$LC_Category == 'Dep', 'LC_Category'] <- 'Dep+Anx'
prepped_data[prepped_data$LC_Category == 'Anx', 'LC_Category'] <- 'Dep+Anx'
# Exclude the 'Eating+' category, then store the target as a factor
prepped_data <- prepped_data[which(prepped_data$LC_Category != 'Eating+'),]
prepped_data$LC_Category <- factor(prepped_data$LC_Category)

# Flag the merged group; 'Dep' and 'Anx' were recoded to 'Dep+Anx' above,
# so the comparison must use the merged label
prepped_data[prepped_data$LC_Category == 'Dep+Anx', 'Dep.Anx'] <- 1

ft_data <- read.csv('Data/FT_summary.csv', stringsAsFactors = FALSE)

# Left join: keep every row of prepped_data
this_data <- merge(prepped_data, ft_data, by = c("id", "visit"), all.x = TRUE)

predict_two(prepped_data, 'lme_slope_simple', c('Data/all_vars-clin_np.csv'), 'lme_slope_simple_vars-clin_np')
predict_two(prepped_data, 'lme_slope_simple', c('Data/all_vars-clinical.csv'), 'lme_slope_simple_vars-clinical')
predict_two(prepped_data, 'lme_slope_simple', c('Data/all_vars-np.csv'), 'lme_slope_simple_vars-np')

kforthman/caretStack documentation built on June 21, 2021, 8:38 a.m.
