runEnsembleModel(
population,
dataList,
modelList,
testSplit = "time",
testFraction = 0.2,
stackerUseCV = TRUE,
splitSeed = NULL,
nfold = 3,
saveDirectory = NULL,
saveEnsemble = F,
savePlpData = F,
savePlpResult = F,
savePlpPlots = F,
saveEvaluation = F,
analysisId = NULL,
verbosity = "INFO",
ensembleStrategy = "mean",
cores = NULL
)
|
population |
The population created using createStudyPopulation() that will be used to develop the model |
dataList |
A list of objects of type |
modelList |
A list of base models to combine in the final ensemble model. A base model can be any model implemented in this package. |
testSplit |
Either 'person' or 'time', specifying the type of evaluation used. 'time' finds the date after which testFraction of patients have their index date, then assigns patients with an index date before this date to the training set and patients with an index date after it to the test set. 'person' splits the data into test (testFraction of the data) and train (1-testFraction of the data) sets, stratified by the class label. |
testFraction |
The fraction of the data to be used as the test set in the patient split evaluation. |
stackerUseCV |
When stacking, you can either use the train CV predictions to train the stacker (TRUE) or hold out 20 percent of the data to train the stacker (FALSE) |
splitSeed |
The seed used to split the test/train set when using a person type testSplit |
nfold |
The number of folds used in the cross validation (default 3) |
saveDirectory |
The path to the directory where the results will be saved (if NULL uses working directory) |
saveEnsemble |
Binary indicating whether to save the ensemble |
savePlpData |
Binary indicating whether to save the plpData object (default is F) |
savePlpResult |
Binary indicating whether to save the object returned by runPlp (default is F) |
savePlpPlots |
Binary indicating whether to save the performance plots as pdf files (default is F) |
saveEvaluation |
Binary indicating whether to save the performance as csv files (default is F) |
analysisId |
The analysis ID |
verbosity |
Sets the level of the verbosity. If the log level is at or higher in priority than the logger threshold, a message will print. The levels are:
|
ensembleStrategy |
The strategy used to combine the outputs of the different models. One of 'mean', 'product', 'weighted' or 'stacked': 'mean' averages the probabilities from the different models; 'product' applies the product rule; 'weighted' takes a weighted average of the probabilities from the different models, using train AUC as weights; 'stacked' trains a logistic regression on the outputs of the different models. |
cores |
The number of cores to use when training the ensemble |
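The difference between the two testSplit options can be sketched in base R. This is only an illustration on simulated data, not the package's own implementation: the data frame, its column names (rowId, cohortStartDate, outcomeCount) and the way ties and rounding are handled are all assumptions.

```r
set.seed(1)

# Simulated population (hypothetical columns, made-up values).
n <- 1000
population <- data.frame(
  rowId = seq_len(n),
  cohortStartDate = as.Date("2015-01-01") + sample(0:1825, n, replace = TRUE),
  outcomeCount = rbinom(n, 1, 0.1)
)
testFraction <- 0.2

# 'time' split: find the date after which roughly testFraction of the
# patients have their index date; later patients form the test set.
sortedDates <- sort(population$cohortStartDate)
cutoffDate <- sortedDates[round((1 - testFraction) * n)]
timeTest <- population$cohortStartDate > cutoffDate

# 'person' split: sample testFraction of the patients at random,
# separately within each class label so the split is stratified.
personTest <- rep(FALSE, n)
for (label in unique(population$outcomeCount)) {
  idx <- which(population$outcomeCount == label)
  nTest <- round(testFraction * length(idx))
  personTest[idx[sample.int(length(idx), nTest)]] <- TRUE
}

# Share of patients assigned to the test set under each split.
mean(timeTest)
mean(personTest)
```

In this sketch, patients who share the cutoff date all go to the training set, so the 'time' test set can be slightly smaller than testFraction.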
This function applies a list of models and combines them into an ensemble model.
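The four ensembleStrategy options can be illustrated with a short base R sketch. It uses simulated outcomes, simulated base model predictions and made-up train AUC values; the exact formulas inside the package (for example, how the product rule or the AUC weights are normalised) may differ, so treat this as one plausible reading of the descriptions above.

```r
set.seed(42)

# Simulated outcome and predicted risks from three hypothetical base models.
n <- 200
outcome <- rbinom(n, 1, 0.3)
preds <- data.frame(
  model1 = plogis(-1.0 + 1.5 * outcome + rnorm(n)),
  model2 = plogis(-1.2 + 1.0 * outcome + rnorm(n)),
  model3 = plogis(-0.8 + 2.0 * outcome + rnorm(n))
)
predMatrix <- as.matrix(preds)

# Made-up train AUCs, one per base model (assumption for illustration).
trainAuc <- c(model1 = 0.70, model2 = 0.65, model3 = 0.80)

# 'mean': unweighted average of the model probabilities.
ensembleMean <- rowMeans(predMatrix)

# 'product': plain product of the model probabilities.
ensembleProduct <- apply(predMatrix, 1, prod)

# 'weighted': average weighted by train AUC, weights scaled to sum to 1.
weights <- trainAuc / sum(trainAuc)
ensembleWeighted <- as.vector(predMatrix %*% weights)

# 'stacked': logistic regression fitted on the base model predictions.
stacker <- glm(outcome ~ model1 + model2 + model3,
               family = binomial(), data = cbind(preds, outcome = outcome))
ensembleStacked <- as.vector(predict(stacker, type = "response"))
```

Here the stacker is fitted and applied on the same rows for brevity; in practice it would be trained on cross-validated or held-out predictions, as the stackerUseCV argument describes.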