library(tidymodels) library(tidyposterior) library(workflowsets) # library(knitr) options(digits = 4, width = 84) options(dplyr.print_min = 6, dplyr.print_max = 6) options(cli.width = 85) options(crayon.enabled = FALSE) # Inset the code and results with soe extra space options(prompt = " ", continue = " ") knitr::opts_chunk$set(comment = " ##", prompt = TRUE)
There are several ways to give resampling results to the perf_mod()
function. To illustrate, here are some example objects using 10-fold cross-validation for a simple two-class problem:
library(tidymodels) library(tidyposterior) library(workflowsets) data(two_class_dat, package = "modeldata") set.seed(100) folds <- vfold_cv(two_class_dat)
We can define two different models (for simplicity, with no tuning parameters).
logistic_reg_glm_spec <- logistic_reg() %>% set_engine('glm') mars_earth_spec <- mars(prod_degree = 1) %>% set_engine('earth') %>% set_mode('classification')
For tidymodels, the [tune::fit_resamples()] function can be used to estimate performance for each model/resample:
rs_ctrl <- control_resamples(save_workflow = TRUE) logistic_reg_glm_res <- logistic_reg_glm_spec %>% fit_resamples(Class ~ ., resamples = folds, control = rs_ctrl) mars_earth_res <- mars_earth_spec %>% fit_resamples(Class ~ ., resamples = folds, control = rs_ctrl)
From these, there are several ways to pass the results to perf_mod()
.
The most general approach is to have a data frame with the resampling labels (i.e., one or more id columns) as well as columns for each model that you would like to compare.
For the model results above, [tune::collect_metrics()] can be used along with some basic data manipulation steps:
logistic_roc <- collect_metrics(logistic_reg_glm_res, summarize = FALSE) %>% dplyr::filter(.metric == "roc_auc") %>% dplyr::select(id, logistic = .estimate) mars_roc <- collect_metrics(mars_earth_res, summarize = FALSE) %>% dplyr::filter(.metric == "roc_auc") %>% dplyr::select(id, mars = .estimate) resamples_df <- full_join(logistic_roc, mars_roc, by = "id") resamples_df
We can then give this directly to perf_mod()
:
set.seed(101) roc_model_via_df <- perf_mod(resamples_df, refresh = 0) tidy(roc_model_via_df) %>% summary()
Alternatively, the result columns can be merged back into the original rsample
object. The up-side to using this method is that perf_mod()
will know exactly which model formula to use for the Bayesian model:
resamples_rset <- full_join(folds, logistic_roc, by = "id") %>% full_join(mars_roc, by = "id") set.seed(101) roc_model_via_rset <- perf_mod(resamples_rset, refresh = 0) tidy(roc_model_via_rset) %>% summary()
Finally, for tidymodels, a workflow set object can be used. This is a collection of models/preprocessing combinations in one object. We can emulate a workflow set using the existing example results then pass that to perf_mod()
:
example_wset <- as_workflow_set(logistic = logistic_reg_glm_res, mars = mars_earth_res) set.seed(101) roc_model_via_wflowset <- perf_mod(example_wset, refresh = 0) tidy(roc_model_via_rset) %>% summary()
The caret
package can also be used. An equivalent set of models are created:
library(caret)
library(caret) set.seed(102) logistic_caret <- train(Class ~ ., data = two_class_dat, method = "glm", trControl = trainControl(method = "cv")) set.seed(102) mars_caret <- train(Class ~ ., data = two_class_dat, method = "gcvEarth", tuneGrid = data.frame(degree = 1), trControl = trainControl(method = "cv"))
Note that these two models use the same resamples as one another due to setting the seed prior to calling train()
. However, these are different from the tidymodels results used above (so the final results will be different).
caret
has a resamples()
function that can collect and collate the resamples. This can also be given to perf_mod()
:
caret_resamples <- resamples(list(logistic = logistic_caret, mars = mars_caret)) set.seed(101) roc_model_via_caret <- perf_mod(caret_resamples, refresh = 0) tidy(roc_model_via_caret) %>% summary()
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.