Performs repeated nested cross-validation (rNCV) on the input dataset. It is intended for use with LIBR's T1000 data and expects data in that format. The input may include columns labeled 'id' and 'LC_Category'; these columns are removed before the rNCV is performed.
predict_one(
dset,
var_to_predict,
predictor_var_file_list,
rdata_prefix,
outDir = "",
nFolds.outer = 5,
methods = c("svmRadial", "ranger", "glmnet"),
metric = "RMSE"
)
dset
The dataset to use, in matrix form, including the predictors, the target, participant IDs, and LifeChart (LC) categories.
var_to_predict
The column name of the target.
predictor_var_file_list
List of filenames. Each file is expected to contain a list of T1000 variable names to include in the analysis as predictors.
rdata_prefix
Label to put in output file names.
nFolds.outer
Number of outer folds.
methods
Similarly to the
metric
A string that specifies what summary metric will
be used to select the optimal model. By default, possible values
are "RMSE" and "Rsquared" for regression and "Accuracy" and
"Kappa" for classification. If custom performance metrics are
used (via the
outDir
To save the output files somewhere other than the working directory, specify the folder here. Make sure the folder name ends with '/'.
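As an illustration of how several variable-list files could be combined into one predictor set, here is a minimal sketch. It assumes each file holds one variable name per line with no header; the actual file layout expected by predict_one() is not documented here, and the variable names below are hypothetical.

```r
# Two hypothetical variable-list files (assumption: one name per line, no header).
f1 <- tempfile(fileext = ".csv")
f2 <- tempfile(fileext = ".csv")
writeLines(c("age", "bmi", "phq9"), f1)
writeLines(c("phq9", "stroop_rt"), f2)
predictor_var_file_list <- c(f1, f2)

# Read every file, strip whitespace, drop empty lines, and keep each name once.
all_names <- unlist(lapply(predictor_var_file_list, readLines))
all_names <- trimws(all_names)
predictor_vars <- unique(all_names[nzchar(all_names)])

print(predictor_vars)
```

Names that appear in more than one file are kept only once, so overlapping lists do not duplicate a predictor.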
Missing data: If the dataset contains predictors with missing data, the missing entries are imputed using KNN imputation. Any subject missing more than 30% of their predictor variables is removed from the analysis. Any case with missing data for the target variable is removed. Only one target is allowed.
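The row-filtering rules described above can be sketched in base R as follows. This is not the function's actual implementation; the toy data and column names are hypothetical, and the KNN imputation step itself is not shown.

```r
# Hypothetical toy data: 'target' is the variable to predict, x1..x4 are predictors.
dset <- data.frame(
  id     = 1:5,
  target = c(0.2, NA, 1.1, 0.7, 0.5),
  x1     = c(1, 2, NA, 4, 5),
  x2     = c(NA, 2, NA, 4, 5),
  x3     = c(1, 2, 3, 4, 5),
  x4     = c(1, 2, 3, NA, 5)
)
predictors <- c("x1", "x2", "x3", "x4")

# Remove cases with no value for the target variable.
dset <- dset[!is.na(dset$target), , drop = FALSE]

# Fraction of predictors missing for each remaining subject.
frac_missing <- rowMeans(is.na(dset[, predictors, drop = FALSE]))

# Keep only subjects missing at most 30% of their predictors.
filtered <- dset[frac_missing <= 0.30, , drop = FALSE]

# Entries still missing at this point are the ones left for imputation.
n_to_impute <- sum(is.na(filtered[, predictors]))
```

In this toy example, subject 2 is dropped for a missing target and subject 3 for missing half of the predictors, leaving two missing entries to impute.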
Saves a file containing 5 objects:
data.rncv
is a list object that contains the response variable with the imputed predictors. Cases with no entry for the response variable are removed. This is the dataset that is passed to the function rNCV().
res.rncv
is the object returned from the function rNCV()
output_label
Label of output file name.
predictor_vars
A list of the names of the predictors.
var_to_predict
The name of the target variable.
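To inspect the saved objects afterwards, something like the following could be used. This sketch assumes the output is a standard R data file written with save(), which the rdata_prefix argument suggests but the text above does not confirm. The file here is a temporary stand-in with placeholder contents, not real predict_one() output.

```r
# Stand-in for a file written by predict_one(); the real file name is not shown here.
rdata_file <- tempfile(fileext = ".RData")
data.rncv      <- list()            # placeholder contents
res.rncv       <- list()            # placeholder contents
output_label   <- "example_label"   # hypothetical label
predictor_vars <- c("x1", "x2")     # hypothetical predictor names
var_to_predict <- "target"          # hypothetical target name
save(data.rncv, res.rncv, output_label, predictor_vars, var_to_predict,
     file = rdata_file)

# Load into a separate environment so nothing in the workspace is overwritten.
results <- new.env()
loaded <- load(rdata_file, envir = results)

# load() returns the names of the objects it restored.
print(sort(loaded))
print(results$var_to_predict)
```

Loading into a dedicated environment keeps results from several runs apart, since every output file contains objects with the same five names.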
# Load the prepared data and merge the 'Dep' and 'Anx' categories into 'Dep+Anx'
prepped_data <- read.csv('Data/prepped_hc_data.csv', stringsAsFactors = FALSE)
prepped_data[prepped_data$LC_Category == 'Dep', 'LC_Category'] <- 'Dep+Anx'
prepped_data[prepped_data$LC_Category == 'Anx', 'LC_Category'] <- 'Dep+Anx'
# Drop the 'Eating+' category and convert the remaining categories to a factor
prepped_data <- prepped_data[which(prepped_data$LC_Category != 'Eating+'),]
prepped_data$LC_Category <- factor(prepped_data$LC_Category)
# Flag the merged category; 'Dep' and 'Anx' no longer exist after the recoding above
prepped_data[prepped_data$LC_Category == 'Dep+Anx', 'Dep.Anx'] <- 1
# Merge in the FT summary data (the calls below still use prepped_data)
ft_data <- read.csv('Data/FT_summary.csv', stringsAsFactors = FALSE)
this_data <- merge(prepped_data, ft_data, by = c("id", "visit"), all.x = TRUE)
# Run rNCV for the same target with three different predictor sets
predict_one(prepped_data, 'lme_slope_simple', c('Data/all_vars-clin_np.csv'), 'lme_slope_simple_vars-clin_np')
predict_one(prepped_data, 'lme_slope_simple', c('Data/all_vars-clinical.csv'), 'lme_slope_simple_vars-clinical')
predict_one(prepped_data, 'lme_slope_simple', c('Data/all_vars-np.csv'), 'lme_slope_simple_vars-np')