View source: R/cvo_create_folds.R
Create indices of folds with blocking and stratification (cvo object)
Create a cross-validation object (cvo), which contains a list of indices for each fold of (repeated) k-fold cross-validation. Blocking and stratification options are available. See "Details" for more information.
data |
A data frame that contains the variables whose names are denoted
by arguments |
stratify_by |
A vector or a name of factor variable in |
block_by |
A vector or a name of variable in |
folds, k |
( |
times |
( |
seeds |
(
For more information about random number generation see
|
kind |
( Generator |
mode |
( |
returnTrain |
( |
predict |
( |
x |
A |
... |
(any)
|
Function cvo_create_folds randomly divides observations into folds that are used for (repeated) k-fold cross-validation. In these folds, observations are:
blocked by the values of variable block_by (i.e., observations with the same "ID" or other kind of blocking factor are treated as one unit (a block) and are always in the same fold);
stratified by the levels of factor variable stratify_by (the proportions of these grouped units of observations per group (level) are kept approximately constant across all folds).
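The combination of blocking and stratification described above can be illustrated in base R. The following is a minimal sketch under stated assumptions, not the code used by cvo_create_folds: it assumes that every block belongs to exactly one stratum, and the names toy, blocks, row_fold and folds are introduced here only for illustration.

```r
# Illustrative sketch only: NOT the manyROC implementation.
set.seed(1)

# Toy data: 20 blocks (IDs) with 2 rows each, 4 strata with 5 blocks each
toy <- data.frame(ID = rep(1:20, each = 2),
                  gr = gl(4, 10, labels = LETTERS[1:4]))
k <- 5

# One row per block; assumes each block belongs to exactly one stratum
blocks <- unique(toy[, c("ID", "gr")])

# Within each stratum, spread shuffled fold labels 1..k over its blocks
blocks$fold <- ave(seq_len(nrow(blocks)), blocks$gr,
                   FUN = function(i) sample(rep_len(seq_len(k), length(i))))

# Map block-level fold labels back to rows and collect row indices per fold
row_fold <- blocks$fold[match(toy$ID, blocks$ID)]
folds <- split(seq_len(nrow(toy)), row_fold)

# Rows with the same ID share a fold; each group is spread evenly over folds
table(toy$gr, row_fold)
```

Because fold labels are assigned to blocks rather than to rows, both rows of any ID always land in the same fold, and shuffling within each stratum keeps the group proportions equal across folds in this balanced toy case.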
(list) A list of folds. Each fold contains indices of observations. The structure of the output is similar to the one created by either function createFolds from caret or function makeResampleInstance in mlr.
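As a rough picture of such a structure, the sketch below builds a hand-written list of index vectors and derives the complementary training indices from it. The fold names and index values are hypothetical, and whether a real cvo object stores test or training indices is assumed to depend on returnTrain, whose description is not shown on this page.

```r
# Hypothetical fold list for 10 observations; names and values are illustrative
test_folds <- list(Fold1 = c(1L, 2L, 7L),
                   Fold2 = c(3L, 4L, 8L),
                   Fold3 = c(5L, 6L, 9L, 10L))
n_obs <- 10

# Training indices are the complement of each test fold
train_folds <- lapply(test_folds, function(idx) setdiff(seq_len(n_obs), idx))

# Every observation appears in exactly one test fold
stopifnot(identical(sort(unlist(test_folds, use.names = FALSE)),
                    seq_len(n_obs)))

str(train_folds)
```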
If folds is too big and cases of at least one group (i.e., level in stratify_by) are not included in at least one fold, an error is raised. In that case, a smaller value of folds is recommended.
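One way to anticipate this situation is to count the blocks available in each group before choosing the number of folds. This is a sketch of a pre-check in base R, assuming that a fold can only contain a group if it receives at least one whole block from it; the exact condition tested inside cvo_create_folds may differ.

```r
# Toy data: 4 groups, each with 5 blocks (IDs) of 2 rows
toy <- data.frame(ID = rep(1:20, each = 2),
                  gr = gl(4, 10, labels = LETTERS[1:4]))

# Number of distinct blocks (IDs) in each group
blocks_per_group <- tapply(toy$ID, toy$gr, function(id) length(unique(id)))

# Largest number of folds for which every fold can still
# receive at least one block from every group
max_k <- min(blocks_per_group)
max_k
```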
Vilmantas Gegzna
Function createFolds from package caret.
Function makeResampleInstance from package mlr.
Test if folds are blocked and stratified: cvo_test_bs.
library(manyROC)
set.seed(123456)
# Data
DataSet1 <- data.frame(ID   = rep(1:20, each = 2),
                       gr   = gl(4, 10, labels = LETTERS[1:4]),
                       .row = 1:40)
# Explore data
str(DataSet1)
table(DataSet1[, c("gr", "ID")])
summary(DataSet1)
# Explore functions
nFolds <- 5
# If variables of data frame are provided:
Folds1_a <- cvo_create_folds(data = DataSet1,
                             stratify_by = "gr", block_by = "ID",
                             k = nFolds, returnTrain = FALSE)
Folds1_a
str(Folds1_a)
cvo_test_bs(Folds1_a, "gr", "ID", DataSet1)
# If "free" variables are provided:
Folds1_b <- cvo_create_folds(stratify_by = DataSet1$gr,
                             block_by = DataSet1$ID,
                             k = nFolds,
                             returnTrain = FALSE)
# str(Folds1_b)
cvo_test_bs(Folds1_b, "gr", "ID", DataSet1)
# Not blocked but stratified
Folds1_c <- cvo_create_folds(stratify_by = DataSet1$gr,
                             k = nFolds,
                             returnTrain = FALSE)
# str(Folds1_c)
cvo_test_bs(Folds1_c, "gr", "ID", DataSet1)
# Blocked but not stratified
Folds1_d <- cvo_create_folds(block_by = DataSet1$ID,
                             k = nFolds,
                             returnTrain = FALSE)
# str(Folds1_d)
cvo_test_bs(Folds1_d, "gr", "ID", DataSet1)