This function creates stratified or unstratified folds from a label vector.
y
Type: numeric. The label vector (not a factor).
k
Type: integer. The number of folds to create. Causes issues if
type
Type: character. Whether the folds should be
seed
Type: integer. The seed for the random number generator. Defaults to
named
Type: boolean. Whether the folds should be named. Defaults to
Unlike Laurae::kfold, please do not use stratified for regression; use pseudo instead. I had complaints about weird fold generation when using stratification with regression labels: it simply does not work the way it was intended. Use stratified for classification stratification and pseudo for regression stratification.
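The internals of kfold are not shown on this page, so the following is only a base R sketch of the general idea, not the package's implementation. It assumes that classification stratification means balancing each class across folds, and that a regression label can be stratified by first binning it into quantile bins. The helper stratify_by_group and the choice of 10 bins are hypothetical.

```r
# Hypothetical helper (not part of the package): deal the rows of each group
# out to k folds so that every fold gets a near-equal share of every group.
stratify_by_group <- function(groups, k, seed = 0) {
  set.seed(seed)
  fold_id <- integer(length(groups))
  for (rows in split(seq_along(groups), groups)) {
    # Shuffle the rows of this group, then assign fold ids round-robin
    shuffled <- rows[sample.int(length(rows))]
    fold_id[shuffled] <- rep_len(seq_len(k), length(rows))
  }
  # Return a list of row numbers, one integer vector per fold
  split(seq_along(groups), fold_id)
}

# Classification: the class labels themselves are the groups
y_class <- c(rep(0, 250), rep(1, 250))
folds_class <- stratify_by_group(y_class, k = 5)
sapply(folds_class, function(idx) mean(y_class[idx]))

# Regression (assumption): bin the continuous label into quantile bins,
# then stratify on the bins instead of on the raw values
y_reg <- 1:5000
breaks <- unique(quantile(y_reg, probs = seq(0, 1, length.out = 11)))
bins <- cut(y_reg, breaks = breaks, include.lowest = TRUE, labels = FALSE)
folds_reg <- stratify_by_group(bins, k = 5)
sapply(folds_reg, function(idx) mean(y_reg[idx]))
```

Stratifying directly on a continuous label would treat nearly every value as its own class, which is one plausible reason the fold generation looks odd for regression labels; binning first gives each fold a similar spread of values.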
A list with one vector per fold, where each integer in a vector is a row number.
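As a rough illustration of how a return value of this shape can be consumed, the sketch below builds a small fold list by hand (it is not produced by kfold) and derives the remaining rows for each fold. It assumes each fold holds the held-out rows and that folds do not overlap; the names y, test_rows and train_rows are illustrative only.

```r
# Toy label vector and a hand-built fold list in the documented shape:
# a list of integer vectors, each integer being a row number.
y <- c(10, 20, 30, 40, 50, 60)
folds <- list(c(1L, 4L), c(2L, 5L), c(3L, 6L))

for (i in seq_along(folds)) {
  # Rows in the current fold are held out; all other rows are used for training
  test_rows  <- folds[[i]]
  train_rows <- setdiff(seq_along(y), test_rows)
  cat("Fold", i, "- test rows:", test_rows, "- train rows:", train_rows, "\n")
}

# Sanity check: taken together, the folds cover every row exactly once
all_rows <- sort(unlist(folds))
identical(all_rows, seq_along(y))
```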
# Reproducible Stratified folds
data <- 1:5000
folds1 <- kfold(y = data, k = 5, type = "pseudo", seed = 111)
folds2 <- kfold(y = data, k = 5, type = "pseudo", seed = 111)
identical(folds1, folds2)
# Treatments
data <- rep(1:50, each = 50)
str(kfold(y = data, k = 5, type = "treatment"))
# Stratified Classification
data <- c(rep(0, 250), rep(1, 250))
folds <- kfold(y = data, k = 5, type = "stratified")
for (i in seq_along(folds)) {
print(mean(data[folds[[i]]]))
}
# Stratified Regression
data <- 1:5000
folds <- kfold(y = data, k = 5, type = "pseudo")
for (i in seq_along(folds)) {
print(mean(data[folds[[i]]]))
}
# Stratified Multi-class Classification
data <- c(rep(0, 250), rep(1, 250), rep(2, 250))
folds <- kfold(y = data, k = 5, type = "stratified")
for (i in seq_along(folds)) {
print(mean(data[folds[[i]]]))
}
# Unstratified Regression
data <- 1:5000
folds <- kfold(y = data, k = 5, type = "random")
for (i in seq_along(folds)) {
print(mean(data[folds[[i]]]))
}
# Unstratified Multi-class Classification
data <- c(rep(0, 250), rep(1, 250), rep(2, 250))
folds <- kfold(y = data, k = 5, type = "random")
for (i in seq_along(folds)) {
print(mean(data[folds[[i]]]))
}