knitr::opts_chunk$set( collapse = TRUE, comment = "#>", eval = FALSE )
library(tabnet) library(tidymodels) library(modeldata)
In this vignette, we show how to create a TabNet model using the tidymodels interface.
We are going to use the lending_club
dataset available
in the modeldata
package.
First let's split our dataset into training and testing so we can later access performance of our model:
set.seed(123) data("lending_club", package = "modeldata") split <- initial_split(lending_club, strata = Class) train <- training(split) test <- testing(split)
We now define our pre-processing steps. Note that TabNet handles categorical variables, so we don't need to do any kind of transformation to them. Normalizing the numeric variables is a good idea though.
rec <- recipe(Class ~ ., train) %>% step_normalize(all_numeric())
Next, we define our model. We are going to train for 50 epochs with a batch size of 128. There are other hyperparameters but, we are going to use the defaults.
mod <- tabnet(epochs = 50) %>% set_engine("torch", verbose = TRUE) %>% set_mode("classification")
We also define our workflow
object:
wf <- workflow() %>% add_model(mod) %>% add_recipe(rec)
We can now define our cross-validation strategy:
folds <- vfold_cv(train, v = 5)
And finally, fit the model:
fit_rs <- wf %>% fit_resamples(folds)
After a few minutes we can get the results:
collect_metrics(fit_rs)
# A tibble: 3 × 6 .metric .estimator mean n std_err .config <chr> <chr> <dbl> <int> <dbl> <chr> 1 accuracy binary 0.945 5 0.000869 Preprocessor1_Model1 2 brier_class binary 0.0535 5 0.00122 Preprocessor1_Model1 3 roc_auc binary 0.611 5 0.0153 Preprocessor1_Model1
And finally, we can verify the results in our test set:
model <- wf %>% fit(train) model %>% augment( test) %>% roc_auc(Class, .pred_good, event_level = "second")
# A tibble: 1 x 3 .metric .estimator .estimate <chr> <chr> <dbl> 1 roc_auc binary 0.710
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.