
This function performs a cross-validation analysis of a feature selection algorithm based on net residual improvement (NeRI) to return a predictive model. It is composed of a NeRI-based feature selection followed by an update procedure, ending with a bootstrapping backwards feature elimination. The user can control how many train and blind test sets will be evaluated.

```
crossValidationFeatureSelection_Res(size = 10,
    fraction = 1.0,
    pvalue = 0.05,
    loops = 100,
    covariates = "1",
    Outcome,
    timeOutcome = "Time",
    variableList,
    data,
    maxTrainModelSize = 20,
    type = c("LM", "LOGIT", "COX"),
    testType = c("Binomial", "Wilcox", "tStudent", "Ftest"),
    startOffset = 0,
    elimination.bootstrap.steps = 100,
    trainFraction = 0.67,
    trainRepetition = 9,
    setIntersect = 1,
    unirank = NULL,
    print = TRUE,
    plots = TRUE,
    lambda = "lambda.1se",
    equivalent = FALSE,
    bswimsCycles = 10,
    usrFitFun = NULL,
    featureSize = 0)
```

`size` |
The number of candidate variables to be tested (the first |

`fraction` |
The fraction of data (sampled with replacement) to be used as train |

`pvalue` |
The maximum |

`loops` |
The number of bootstrap loops |

`covariates` |
A string of the type "1 + var1 + var2" that defines which variables will always be included in the models (as covariates) |

`Outcome` |
The name of the column in |

`timeOutcome` |
The name of the column in |

`variableList` |
A data frame with two columns. The first one must have the names of the candidate variables and the other one the description of such variables |

`data` |
A data frame where all variables are stored in different columns |

`maxTrainModelSize` |
Maximum number of terms that can be included in the model |

`type` |
Fit type: Logistic ("LOGIT"), linear ("LM"), or Cox proportional hazards ("COX") |

`testType` |
Type of non-parametric test to be evaluated by the |

`startOffset` |
Only terms whose position in the model is larger than the |

`elimination.bootstrap.steps` |
The number of bootstrap loops for the backwards elimination procedure |

`trainFraction` |
The fraction of data (sampled with replacement) to be used as train for the cross-validation procedure |

`setIntersect` |
The intercept of the model (to force a zero intercept, set this value to 0) |

`trainRepetition` |
The number of cross-validation folds (it should be at least equal to |

`unirank` |
A list with the results yielded by the |

`print` |
Logical. If |

`plots` |
Logical. If |

`lambda` |
The value passed to the s parameter when extracting coefficients from the glmnet cross-validation fit |

`equivalent` |
Logical. If set to TRUE, the cross-validation will compute the equivalent model |

`bswimsCycles` |
The maximum number of models to be returned by |

`usrFitFun` |
A user fitting function to be evaluated by the cross validation procedure |

`featureSize` |
The original number of features to be explored in the data frame. |
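
The arguments above can be prepared as in the following minimal sketch. The toy data, the column names (`age`, `x1`, `x2`, `x3`, `Outcome`), and the `variableList` column names (`Name`, `Description`) are illustrative assumptions, not requirements taken from the package. The call itself is left commented out so the snippet runs without FRESA.CAD installed.

```r
# Toy data: all column names here are illustrative assumptions.
set.seed(42)
n <- 120
toy <- data.frame(
  age = rnorm(n, mean = 50, sd = 10),
  x1 = rnorm(n),
  x2 = rnorm(n),
  x3 = rnorm(n)
)
toy$Outcome <- 0.05 * toy$age + 1.5 * toy$x1 - 0.8 * toy$x2 + rnorm(n)

# variableList: first column holds the candidate names,
# second column holds their descriptions.
candidates <- c("x1", "x2", "x3")
varList <- data.frame(
  Name = candidates,
  Description = c("First marker", "Second marker", "Noise marker"),
  stringsAsFactors = FALSE
)

# covariates: a string of the form "1 + var1 + var2"; here age is always kept.
covars <- "1 + age"

# Argument names follow the usage section; values are only an example.
cv_args <- list(
  size = length(candidates),
  covariates = covars,
  Outcome = "Outcome",
  variableList = varList,
  data = toy,
  type = "LM",
  testType = "Ftest",
  trainFraction = 0.67,
  trainRepetition = 9,
  plots = FALSE
)

# Every candidate must exist as a column of the data frame.
stopifnot(all(varList[[1]] %in% names(toy)))

# With FRESA.CAD installed, the call would look like:
# cv <- do.call(FRESA.CAD::crossValidationFeatureSelection_Res, cv_args)
```

Keeping the arguments in a list makes it easy to rerun the analysis with a different `type` or `trainRepetition` without retyping the whole call.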

This function produces a set of data and plots that can be used to inspect the degree of over-fitting or shrinkage of a model. It uses bootstrapped data, cross-validation data, and, if possible, retrain data.
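
To make the idea of inspecting over-fitting concrete, here is a rough base R sketch that compares train and blind test RMSE over repeated splits. It uses a plain `lm` fit and a simple random split without replacement, which is a simplification and not the NeRI-based procedure of this function; the data and variable names are made up for illustration.

```r
# Simulated data; names and coefficients are illustrative assumptions.
set.seed(1)
n <- 200
d <- data.frame(x1 = rnorm(n), x2 = rnorm(n))
d$y <- 2 * d$x1 - d$x2 + rnorm(n)

# Root-mean-square error between observed and predicted values.
rmse <- function(obs, pred) sqrt(mean((obs - pred)^2))

trainFraction <- 0.67   # share of rows used for training in each repetition
trainRepetition <- 9    # number of train/test repetitions

res <- t(sapply(seq_len(trainRepetition), function(i) {
  idx <- sample(n, size = floor(trainFraction * n))  # random train rows
  fit <- lm(y ~ x1 + x2, data = d[idx, ])
  test <- d[-idx, ]                                  # held-out (blind) rows
  c(trainRMSE = rmse(d$y[idx], fitted(fit)),
    testRMSE = rmse(test$y, predict(fit, newdata = test)))
}))

# One row per repetition; a test RMSE well above the train RMSE
# suggests over-fitting.
print(res)
print(colMeans(res))
```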

`formula.list` |
A list containing objects of class |

`Models.testPrediction` |
A data frame with the blind test set predictions made at each fold of the cross-validation (Full B:SWiMS, Median, Bagged, Forward, Backward Elimination), where the models used to generate such predictions ( |

`FullBSWiMS.testPrediction` |
A data frame similar to |

`BSWiMS` |
A list containing the values returned by |

`forwardSelection` |
A list containing the values returned by |

`updatedforwardModel` |
A list containing the values returned by |

`testRMSE` |
The global blind test root-mean-square error (RMSE) of the cross-validation procedure |

`testPearson` |
The global blind test Pearson |

`testSpearman` |
The global blind test Spearman |

`FulltestRMSE` |
The global blind test RMSE of the Full model |

`FullTestPearson` |
The global blind test Pearson |

`FullTestSpearman` |
The global blind test Spearman |

`trainRMSE` |
The train RMSE at each fold of the cross-validation procedure |

`trainPearson` |
The train Pearson |

`trainSpearman` |
The train Spearman |

`FullTrainRMSE` |
The train RMSE of the Full model at each fold of the cross-validation procedure |

`FullTrainPearson` |
The train Pearson |

`FullTrainSpearman` |
The train Spearman |

`testRMSEAtFold` |
The blind test RMSE at each fold of the cross-validation procedure |

`FullTestRMSEAtFold` |
The blind test RMSE of the Full model at each fold of the cross-validation procedure |

`Fullenet` |
An object of class |

`LASSO.testPredictions` |
A data frame similar to |

`LASSOVariables` |
A list with the elastic net Full model and the models found at each cross-validation fold |

`byFoldTestMS` |
A vector with the mean square error for each blind fold |

`byFoldTestSpearman` |
A vector with the Spearman correlation between prediction and outcome for each blind fold |

`byFoldTestPearson` |
A vector with the Pearson correlation between prediction and outcome for each blind fold |

`byFoldCstat` |
A vector with the C-index (Somers' Dxy rank correlation : |

`CVBlindPearson` |
A vector with the Pearson correlation between the outcome and prediction for each repeated experiment |

`CVBlindSpearman` |
A vector with the Spearman correlation between the outcome and prediction for each repeated experiment |

`CVBlindRMS` |
A vector with the RMS between the outcome and prediction for each repeated experiment |

`Models.trainPrediction` |
A data frame with the outcome and the train prediction of every model |

`FullBSWiMS.trainPrediction` |
A data frame with the outcome and the train prediction at each CV fold for the main model |

`LASSO.trainPredictions` |
A data frame with the outcome and the prediction of each enet lasso model |

`uniTrainMSS` |
A data frame with mean square of the train residuals from the univariate models of the model terms |

`uniTestMSS` |
A data frame with mean square of the test residuals of the univariate models of the model terms |

`BSWiMS.ensemble.prediction` |
The ensemble prediction by all models on the test data |

`AtOptFormulas.list` |
The list of formulas with "optimal" performance |

`ForwardFormulas.list` |
The list of formulas produced by the forward procedure |

`baggFormulas.list` |
The list of the bagged models |

`LassoFilterVarList` |
The list of variables used by LASSO fitting |
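
The global and by-fold metrics listed above can be illustrated with a short base R sketch. It works on a hypothetical predictions data frame with columns `Fold`, `Outcome`, and `Prediction`; these column names are assumptions for the example and may not match the layout of the data frames this function returns.

```r
# Hypothetical blind test predictions; the column names are assumptions.
set.seed(7)
pred <- data.frame(
  Fold = rep(1:3, each = 20),
  Outcome = rnorm(60)
)
pred$Prediction <- pred$Outcome + rnorm(60, sd = 0.5)

# Global metrics over all blind test predictions.
testRMSE <- sqrt(mean((pred$Outcome - pred$Prediction)^2))
testPearson <- cor(pred$Outcome, pred$Prediction, method = "pearson")
testSpearman <- cor(pred$Outcome, pred$Prediction, method = "spearman")

# The same metrics computed separately for each fold.
byFold <- split(pred, pred$Fold)
byFoldRMSE <- sapply(byFold, function(f) sqrt(mean((f$Outcome - f$Prediction)^2)))
byFoldPearson <- sapply(byFold, function(f) cor(f$Outcome, f$Prediction))
byFoldSpearman <- sapply(byFold, function(f) {
  cor(f$Outcome, f$Prediction, method = "spearman")
})

print(c(RMSE = testRMSE, Pearson = testPearson, Spearman = testSpearman))
print(rbind(byFoldRMSE, byFoldPearson, byFoldSpearman))
```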

Jose G. Tamez-Pena and Antonio Martinez-Torteya

```
crossValidationFeatureSelection_Bin,
improvedResiduals,
bootstrapVarElimination_Res
```
