repeatcv | R Documentation |
Performs repeated calls to a nestedcv
model to determine performance across
repeated runs of nested CV.
repeatcv(
  expr,
  n = 5,
  repeat_folds = NULL,
  keep = FALSE,
  extra = FALSE,
  progress = TRUE,
  rep_parallel = "mclapply",
  rep.cores = 1L
)
expr |
An expression containing a call to |
n |
Number of repeats |
repeat_folds |
Optional list containing fold indices to be applied to the outer CV folds. |
keep |
Logical whether to save repeated outer CV fitted models for variable importance, SHAP etc. Note this can make the resulting object very large. |
extra |
Logical whether additional performance metrics are gathered for
binary classification models. See |
progress |
Logical whether to show progress. |
rep_parallel |
Either "mclapply" or "future". This determines which parallel backend to use. |
rep.cores |
Integer specifying number of cores/threads to invoke.
Ignored if |
We recommend using this with the R pipe |> (see examples).
When comparing models, we recommend fixing the set of outer CV folds used in each repeat, so that every model is evaluated on identical splits. The function repeatfolds() can be used to create a fixed set of outer CV folds for each repeat.
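As an illustration of the point above, here is a minimal sketch that evaluates two models on the same fixed folds. It reuses the calls from the examples on this page; the choice of a ridge model (alphaSet = 0) as the second model is an assumption made purely for illustration.

```r
library(nestedcv)

data("iris")
y <- iris$Species
x <- iris[, 1:4]

# One fixed set of outer CV folds per repeat, shared by both models
set.seed(123, "L'Ecuyer-CMRG")
folds <- repeatfolds(y, repeats = 3, n_outer_folds = 4)

# Lasso (alpha = 1) evaluated on the fixed folds
res_lasso <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                           n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds)

# Ridge (alpha = 0) evaluated on exactly the same folds
res_ridge <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 0,
                           n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds)

summary(res_lasso)
summary(res_ridge)
```

Because both calls receive the same folds object, differences between the two summaries reflect the models rather than the random partitioning of the data.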
Parallelisation over repeats is performed using parallel::mclapply (not available on Windows) or future, depending on how rep_parallel is set.
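A hedged sketch of the future backend follows. The multisession plan and the worker count of 2 are assumptions chosen for illustration; any plan supported by the future package could be substituted.

```r
library(nestedcv)
library(future)

data("iris")
y <- iris$Species
x <- iris[, 1:4]

# The number of workers is set through the future plan.
# multisession and workers = 2 are illustrative choices.
plan(multisession, workers = 2)

res <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                     n_outer_folds = 4) |>
  repeatcv(3, rep_parallel = "future")

# Return to sequential processing once finished
plan(sequential)

res
```

Since multisession launches separate R sessions rather than forking, this route is also usable on Windows, where forking via mclapply is not available.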
Beware that cv.cores can still be set within calls to nestedcv models (i.e. nested parallelisation). This means that rep.cores x cv.cores processes/forks will be spawned, so be careful not to overload your CPU. In general, parallelisation of repeats using rep.cores is faster than parallelisation using cv.cores. rep.cores is ignored if you are using future. Set the number of workers for future using future::plan().
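The arithmetic behind this warning can be checked before launching a run. The following base R sketch is not part of nestedcv; the values of rep.cores and cv.cores are hypothetical settings chosen for illustration.

```r
# Hypothetical settings: 2 workers over repeats, 4 workers over outer folds
rep.cores <- 2L
cv.cores <- 4L

# Nested parallelisation spawns rep.cores x cv.cores processes in total
total_procs <- rep.cores * cv.cores

# detectCores() can return NA on some platforms, so guard against that
available <- parallel::detectCores()

if (!is.na(available) && total_procs > available) {
  message("Requested ", total_procs, " processes but only ", available,
          " cores were detected; consider lowering cv.cores.")
} else {
  message("Requested ", total_procs, " processes.")
}
```

With these settings 8 processes would be spawned, so on a 4-core machine it would usually be better to keep rep.cores and leave cv.cores at 1.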
List of S3 class 'repeatcv' containing:
call |
the model call |
result |
matrix of performance metrics |
output |
a matrix or dataframe containing the outer CV predictions from each repeat |
roc |
(binary classification models only) a ROC curve object based on
predictions across all repeats as returned in |
fits |
(if |
library(nestedcv)

data("iris")
dat <- iris
y <- dat$Species
x <- dat[, 1:4]

## repeated nested CV: 3 repeats run over 2 cores
res <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                     n_outer_folds = 4) |>
  repeatcv(3, rep.cores = 2)
res
summary(res)

## set up fixed fold indices
set.seed(123, "L'Ecuyer-CMRG")
folds <- repeatfolds(y, repeats = 3, n_outer_folds = 4)
res <- nestcv.glmnet(y, x, family = "multinomial", alphaSet = 1,
                     n_outer_folds = 4) |>
  repeatcv(3, repeat_folds = folds, rep.cores = 2)
res