Train and evaluate models with tidymodels

In tidymodels: Easily Install and Load the 'Tidymodels' Packages

knitr::opts_chunk$set(echo = TRUE, fig.width = 8, fig.height = 5)

This template offers an opinionated guide on how to structure a modeling analysis. Your individual modeling analysis may require you to add to, subtract from, or otherwise change this structure, but consider this a general framework to start from. If you want to learn more about using tidymodels, check out our Getting Started guide.

In this example analysis, let's fit a model to predict the sex of penguins from species and measurement information.

library(tidymodels)

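# load the copy of penguins that ships with modeldata (columns such as bill_depth_mm)
data(penguins, package = "modeldata")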
glimpse(penguins)

penguins <- na.omit(penguins)

Explore data

Exploratory data analysis (EDA) is an important part of the modeling process.

penguins %>%
  ggplot(aes(bill_depth_mm, bill_length_mm, color = sex, size = body_mass_g)) +
  geom_point(alpha = 0.5) +
  facet_wrap(~species) +
  theme_bw()
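A numeric summary can complement the plot. The following is a minimal sketch, not part of the original template: it counts penguins per species and sex and reports group means for two of the plotted measurements. The object name sex_summary is made up for illustration.

```r
library(tidymodels)

# Same data preparation as above
data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

# Count observations and average two measurements per species and sex
sex_summary <-
  penguins %>%
  group_by(species, sex) %>%
  summarize(
    n = n(),
    mean_bill_depth_mm = mean(bill_depth_mm),
    mean_body_mass_g = mean(body_mass_g),
    .groups = "drop"
  )

sex_summary
```

If the group means differ clearly by sex within each species, that suggests these measurements carry signal for the classification task.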

Build models

Let's consider how to spend our data budget:

create training and testing sets
create resampling folds from the training set

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)
penguin_test <- testing(penguin_split)

set.seed(234)
penguin_folds <- vfold_cv(penguin_train, strata = sex)
penguin_folds
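To see what stratifying on sex does, one option is to compare the class proportions in the training and testing sets. This is a sketch under the same seed and split as above; the object name split_balance is an illustrative assumption.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)
penguin_test <- testing(penguin_split)

# Share of each sex in the training and testing sets
split_balance <-
  bind_rows(
    train = count(penguin_train, sex),
    test = count(penguin_test, sex),
    .id = "set"
  ) %>%
  group_by(set) %>%
  mutate(prop = n / sum(n)) %>%
  ungroup()

split_balance
```

With stratification, the proportions in the two sets should be close to each other and to those in the full data.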

Let's create a model specification for each model we want to try:

glm_spec <-
  logistic_reg() %>%
  set_engine("glm")

ranger_spec <-
  rand_forest(trees = 1e3) %>%
  set_engine("ranger") %>%
  set_mode("classification")

To set up your modeling code, consider using the parsnip addin or the usemodels package.

Now let's build a model workflow combining each model specification with a data preprocessor:

penguin_formula <- sex ~ .

glm_wf    <- workflow(penguin_formula, glm_spec)
ranger_wf <- workflow(penguin_formula, ranger_spec)

If your feature engineering needs are more complex than provided by a formula like sex ~ ., use a recipe. Read more about feature engineering with recipes to learn how they work.
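As an illustration of the recipe route, here is a minimal sketch that swaps the formula for a recipe in the logistic regression workflow. The specific steps (normalizing numeric predictors, creating dummy variables) are assumptions chosen for demonstration, not preprocessing that this analysis requires, and the names penguin_rec, glm_rec_wf, and glm_rec_fit are hypothetical.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)

# A recipe declares the outcome, the predictors, and the preprocessing steps
penguin_rec <-
  recipe(sex ~ ., data = penguin_train) %>%
  step_normalize(all_numeric_predictors()) %>%  # center and scale numeric columns
  step_dummy(all_nominal_predictors())          # indicator columns for factors

glm_spec <-
  logistic_reg() %>%
  set_engine("glm")

# workflow() accepts a recipe in place of a formula
glm_rec_wf <- workflow(penguin_rec, glm_spec)

# Fitting estimates the recipe steps and the model on the training data
glm_rec_fit <- fit(glm_rec_wf, data = penguin_train)
glm_rec_fit
```

A workflow built this way can be passed to fit_resamples() and last_fit() in the same way as a formula-based workflow.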

Evaluate models

These models have no tuning parameters, so we can evaluate them as they are. To learn about tuning hyperparameters, see the tidymodels documentation on tuning.
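For a model that does have tuning parameters, a workflow might look like the sketch below. It is not part of this analysis: it marks mtry and min_n of the random forest for tuning, and the grid size, number of trees, number of folds, seed, and object names are arbitrary assumptions.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)

set.seed(234)
penguin_folds <- vfold_cv(penguin_train, v = 5, strata = sex)

# tune() marks a parameter to be chosen by resampling
ranger_tune_spec <-
  rand_forest(mtry = tune(), min_n = tune(), trees = 500) %>%
  set_engine("ranger") %>%
  set_mode("classification")

ranger_tune_wf <- workflow(sex ~ ., ranger_tune_spec)

# Evaluate 5 candidate parameter combinations on each fold
set.seed(345)
ranger_tuned <- tune_grid(ranger_tune_wf, resamples = penguin_folds, grid = 5)

show_best(ranger_tuned, metric = "roc_auc")

# Plug the best combination back into the workflow
best_params <- select_best(ranger_tuned, metric = "roc_auc")
final_ranger_wf <- finalize_workflow(ranger_tune_wf, best_params)
```

The finalized workflow has no remaining tune() placeholders, so it can be fit or evaluated like the untuned workflows used here.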

contrl_preds <- control_resamples(save_pred = TRUE)

glm_rs <- fit_resamples(
  glm_wf,
  resamples = penguin_folds,
  control = contrl_preds
)

ranger_rs <- fit_resamples(
  ranger_wf,
  resamples = penguin_folds,
  control = contrl_preds
)

How did these two models compare?

collect_metrics(glm_rs)
collect_metrics(ranger_rs)
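The two metric tables can also be stacked into one for a side-by-side comparison. This sketch repeats the setup so that it runs on its own; the object name model_metrics is an illustrative assumption.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)

set.seed(234)
penguin_folds <- vfold_cv(penguin_train, strata = sex)

glm_wf <- workflow(sex ~ ., logistic_reg() %>% set_engine("glm"))
ranger_wf <- workflow(
  sex ~ .,
  rand_forest(trees = 1e3) %>%
    set_engine("ranger") %>%
    set_mode("classification")
)

glm_rs <- fit_resamples(glm_wf, resamples = penguin_folds)
ranger_rs <- fit_resamples(ranger_wf, resamples = penguin_folds)

# One row per model and metric, sorted so the models sit next to each other
model_metrics <-
  bind_rows(
    glm = collect_metrics(glm_rs),
    ranger = collect_metrics(ranger_rs),
    .id = "mod"
  ) %>%
  select(mod, .metric, mean, std_err) %>%
  arrange(.metric, mod)

model_metrics
```

Comparing the difference in means against the standard errors gives a rough sense of whether one model is meaningfully better.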

We can visualize these results using an ROC curve (or a confusion matrix via conf_mat()):

bind_rows(
  collect_predictions(glm_rs) %>%
    mutate(mod = "glm"),
  collect_predictions(ranger_rs) %>%
    mutate(mod = "ranger")
) %>%
  group_by(mod) %>%
  roc_curve(sex, .pred_female) %>%
  autoplot()
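The confusion matrix mentioned above could be computed from the same saved predictions. The sketch below does this for the logistic regression only and pools the held-out predictions from all folds; the object name glm_conf is an illustrative assumption.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_train <- training(penguin_split)

set.seed(234)
penguin_folds <- vfold_cv(penguin_train, strata = sex)

glm_wf <- workflow(sex ~ ., logistic_reg() %>% set_engine("glm"))

glm_rs <- fit_resamples(
  glm_wf,
  resamples = penguin_folds,
  control = control_resamples(save_pred = TRUE)  # keep held-out predictions
)

# Cross-tabulate predicted class against the true class
glm_conf <-
  collect_predictions(glm_rs) %>%
  conf_mat(truth = sex, estimate = .pred_class)

glm_conf
autoplot(glm_conf, type = "heatmap")
```

Because each training row is held out exactly once in V-fold cross-validation, the cell counts sum to the number of rows in the training set.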

These models perform very similarly, so perhaps we would choose the simpler, linear model. The function last_fit() fits one final time on the training data and evaluates on the testing data. This is the first time we have used the testing data.

final_fitted <- last_fit(glm_wf, penguin_split)
collect_metrics(final_fitted)  ## metrics evaluated on the *testing* data

This object contains a fitted workflow that we can use for prediction.

final_wf <- extract_workflow(final_fitted)
predict(final_wf, penguin_test[55,])

You can save this fitted final_wf object to use later with new data, for example with readr::write_rds().
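A save-and-reload round trip might look like the following sketch. It assumes the readr package is installed (it is not attached by library(tidymodels)) and writes to a temporary file; the names model_path and reloaded_wf are hypothetical.

```r
library(tidymodels)

data(penguins, package = "modeldata")
penguins <- na.omit(penguins)

set.seed(123)
penguin_split <- initial_split(penguins, strata = sex)
penguin_test <- testing(penguin_split)

glm_wf <- workflow(sex ~ ., logistic_reg() %>% set_engine("glm"))

final_fitted <- last_fit(glm_wf, penguin_split)
final_wf <- extract_workflow(final_fitted)

# Write the fitted workflow to disk (a temporary path is used here)
model_path <- tempfile(fileext = ".rds")
readr::write_rds(final_wf, model_path)

# Later, possibly in a new R session with tidymodels loaded
reloaded_wf <- readr::read_rds(model_path)

# Class probabilities for a few rows standing in for new data
predict(reloaded_wf, penguin_test[1:5, ], type = "prob")
```

In practice you would replace the temporary path with a permanent location and pass genuinely new data to predict().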


tidymodels documentation built on May 29, 2024, 11:26 a.m.
