One resample of Monte Carlo cross-validation takes a random sample (without replacement) of the original data set to be used for analysis. All other data points are added to the assessment set.
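The idea behind a single resample can be sketched in base R. This is only an illustration of the description above, not the package's internal code: the helper name `mc_split_sketch` is hypothetical, and rounding the analysis size down with `floor()` is an assumption.

```r
# Base-R sketch of one Monte Carlo resample (not the rsample internals).
# `mc_split_sketch` is a hypothetical helper used only for illustration.
mc_split_sketch <- function(data, prop = 0.75) {
  n <- nrow(data)
  # Draw the analysis rows at random, without replacement.
  # Assumption: the analysis size is rounded down.
  analysis_idx <- sample(n, size = floor(n * prop), replace = FALSE)
  # Every remaining row goes to the assessment set.
  assessment_idx <- setdiff(seq_len(n), analysis_idx)
  list(
    analysis   = data[analysis_idx, , drop = FALSE],
    assessment = data[assessment_idx, , drop = FALSE]
  )
}

set.seed(1)
one_split <- mc_split_sketch(mtcars, prop = 0.75)
nrow(one_split$analysis)    # 24 of the 32 rows
nrow(one_split$assessment)  # the remaining 8 rows
```

Repeating this draw with fresh random indices gives the set of resamples; because each draw is independent, a given row may land in the assessment set of several resamples or of none.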
data | A data frame.
prop | The proportion of data to be retained for modeling/analysis.
times | The number of times to repeat the sampling.
strata | A variable that is used to conduct stratified sampling to create the resamples.
... | Not currently used.
The 'strata' argument causes the random sampling to be conducted *within the stratification variable*. This can help ensure that the proportion of each stratum in the analysis data is close to its proportion in the original data set.
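Sampling within the stratification variable can be sketched in base R as follows. This is a rough illustration of the idea rather than the package's implementation; `stratified_idx_sketch` is a hypothetical helper, and taking the same rounded-down proportion from every stratum is an assumption.

```r
# Base-R sketch of stratified sampling (an illustration, not rsample's code).
# `stratified_idx_sketch` is a hypothetical helper.
stratified_idx_sketch <- function(strata, prop = 0.5) {
  # Group the row indices by stratum
  idx_by_stratum <- split(seq_along(strata), strata)
  # Sample the same proportion, without replacement, within each stratum
  sampled <- lapply(idx_by_stratum, function(idx) {
    idx[sample.int(length(idx), size = floor(length(idx) * prop))]
  })
  sort(unlist(sampled, use.names = FALSE))
}

iris2 <- iris[1:130, ]
set.seed(13)
analysis_idx <- stratified_idx_sketch(iris2$Species, prop = 0.5)

# Class proportions in the analysis rows versus the full data
prop.table(table(iris2$Species[analysis_idx]))
prop.table(table(iris2$Species))
```

Because each species is sampled separately, the analysis rows keep the 50/50/30 class balance of `iris2` (25, 25, and 15 rows here), which a single unstratified draw does not guarantee.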
A tibble with classes 'mc_cv', 'rset', 'tbl_df', 'tbl', and 'data.frame'. The results include a column for the data split objects and a column called 'id' that has a character string with the resample identifier.
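library(rsample)
library(purrr)

# Two resamples using the default analysis proportion
mc_cv(mtcars, times = 2)
# Two resamples that each keep half of the rows for analysis
mc_cv(mtcars, prop = .5, times = 2)

# Unbalanced subset of iris (fewer virginica rows)
iris2 <- iris[1:130, ]

# Without stratification: share of virginica in each analysis set
set.seed(13)
resample1 <- mc_cv(iris2, times = 3, prop = .5)
map_dbl(resample1$splits,
        function(x) {
          dat <- as.data.frame(x)$Species
          mean(dat == "virginica")
        })

# With stratification on Species: same summary for comparison
set.seed(13)
resample2 <- mc_cv(iris2, strata = "Species", times = 3, prop = .5)
map_dbl(resample2$splits,
        function(x) {
          dat <- as.data.frame(x)$Species
          mean(dat == "virginica")
        })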