Group V-fold cross-validation creates splits of the data based on a grouping variable, where each value of that variable may be associated with more than one row. The function can create as many splits as there are unique values of the grouping variable, or it can create a smaller set of splits in which more than one value is left out at a time.
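The idea of holding out whole groups can be sketched in base R. This is only an illustration of the concept, not the package's implementation: `assign_group_folds` is a hypothetical helper, and the round-robin assignment of shuffled groups to folds is an assumption made for the example.

```r
# Illustrative only: a base-R sketch of group-wise fold assignment.
# `assign_group_folds` is a hypothetical helper, not part of any package.
assign_group_folds <- function(groups, v = NULL) {
  unique_groups <- unique(groups)
  # Mirror the documented default: one fold per unique group value.
  if (is.null(v)) v <- length(unique_groups)
  stopifnot(v >= 2, v <= length(unique_groups))
  # Shuffle the groups, then deal them into v folds as evenly as possible.
  shuffled <- sample(unique_groups)
  fold_of_group <- rep_len(seq_len(v), length(shuffled))
  # Every row inherits the fold of its group, so a group is never split.
  fold_of_group[match(groups, shuffled)]
}

set.seed(1)
id <- rep(1:6, times = c(3, 1, 4, 2, 5, 2))
folds_all <- assign_group_folds(id)           # one group held out per fold
folds_three <- assign_group_folds(id, v = 3)  # several groups held out per fold

# Cross-tabulate rows by group and fold; each group occupies a single column.
table(id, folds_three)
```

Because folds are assigned per group rather than per row, the folds can differ in row count when the groups differ in size.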
data | A data frame.
group | A single character value naming the column of the data that will be used to create the splits.
v | The number of partitions of the data set. If left as 'NULL', 'v' will be set to the number of unique values in the group.
... | Not currently used.
A tibble with classes 'group_vfold_cv', 'rset', 'tbl_df', 'tbl', and 'data.frame'. The results include a column for the data split objects and an identification variable.
library(rsample)
set.seed(3527)
test_data <- data.frame(id = sort(sample(1:20, size = 80, replace = TRUE)))
test_data$dat <- runif(nrow(test_data))
set.seed(5144)
split_by_id <- group_vfold_cv(test_data, group = "id")
# return the unique ids found in the assessment (held-out) set of a split
get_id_left_out <- function(x) {
  unique(assessment(x)$id)
}
library(purrr)
table(map_int(split_by_id$splits, get_id_left_out))
set.seed(5144)
split_by_some_id <- group_vfold_cv(test_data, group = "id", v = 7)
held_out <- map(split_by_some_id$splits, get_id_left_out)
table(unlist(held_out))
# number held out per resample:
map_int(held_out, length)
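As a further check, one might confirm that no `id` value appears in both the analysis and assessment sets of any split. The sketch below rebuilds the same `test_data` as above so that it runs on its own; the names `folds` and `shared_ids` are introduced here for illustration only.

```r
library(rsample)

set.seed(3527)
test_data <- data.frame(id = sort(sample(1:20, size = 80, replace = TRUE)))
test_data$dat <- runif(nrow(test_data))

set.seed(5144)
folds <- group_vfold_cv(test_data, group = "id", v = 7)

# For each split, count the ids present on both sides of the split.
shared_ids <- vapply(
  folds$splits,
  function(s) length(intersect(analysis(s)$id, assessment(s)$id)),
  integer(1)
)

# TRUE when every group is kept entirely on one side of every split.
all(shared_ids == 0L)
```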