knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.path = "man/figures/README-"
)
library(yardstick)
library(dplyr)
options(width = 100, digits = 3)
yardstick is a package to estimate how well models are working using tidy data principles. See the package webpage for more information.
To install the package:
install.packages("yardstick")

# Development version:
# install.packages("pak")
pak::pak("tidymodels/yardstick")
For example, suppose you create a classification model and predict on a new data set. You might have data that looks like this:
library(yardstick)
library(dplyr)

head(two_class_example)
You can use a dplyr-like syntax to compute common performance characteristics of the model and get them back in a data frame:
metrics(two_class_example, truth, predicted)

# or

two_class_example %>%
  roc_auc(truth, Class1)
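If you want a specific set of metrics rather than the defaults returned by metrics(), one option is to bundle them with metric_set(). The sketch below is illustrative: the particular choice of metrics and the names class_metrics and res are assumptions for this example, not something the README prescribes.

```r
library(yardstick)

# Bundle several class metrics into a single callable metric function.
# The selection here (accuracy, kappa, sensitivity, specificity) is arbitrary.
class_metrics <- metric_set(accuracy, kap, sens, spec)

# two_class_example ships with yardstick; `truth` and `predicted` are factor columns.
res <- class_metrics(two_class_example, truth = truth, estimate = predicted)
res
```

The result has the same shape as the output of a single metric: one row per metric, with .metric, .estimator and .estimate columns.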
All classification metrics have at least one multiclass extension, with many of them having multiple ways to calculate multiclass metrics.
data("hpc_cv")
hpc_cv <- as_tibble(hpc_cv)
hpc_cv
# Macro averaged multiclass precision
precision(hpc_cv, obs, pred)

# Micro averaged multiclass precision
precision(hpc_cv, obs, pred, estimator = "micro")
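To make the difference between the two averaging schemes concrete, here is a hand computation in base R. It is a sketch of the general definitions of macro and micro averaging, not yardstick's internal code, and the toy vectors toy_obs and toy_pred are invented for illustration (they are not taken from hpc_cv).

```r
# Toy data, invented purely for illustration.
lv <- c("a", "b", "c")
toy_obs  <- factor(c("a", "a", "a", "a", "b", "b", "c", "c", "c", "c"), levels = lv)
toy_pred <- factor(c("a", "a", "a", "b", "b", "c", "c", "c", "a", "c"), levels = lv)

# Confusion matrix: rows are predicted classes, columns are observed classes.
cm <- table(toy_pred, toy_obs)

tp <- diag(cm)              # true positives per class
predicted_n <- rowSums(cm)  # true positives + false positives per class

# Macro: compute precision for each class, then take the unweighted mean.
macro_precision <- mean(tp / predicted_n)

# Micro: pool the counts over all classes, then compute a single precision.
micro_precision <- sum(tp) / sum(predicted_n)

macro_precision
micro_precision
```

Macro averaging gives every class equal weight regardless of its size, while micro averaging gives every observation equal weight, so the two can diverge when classes are imbalanced.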
If you have multiple resamples of a model, you can use a metric on a grouped data frame to calculate the metric across all resamples at once.
This calculates multiclass ROC AUC using the method described in Hand and Till (2001), and does it across all 10 resamples at once.
hpc_cv %>%
  group_by(Resample) %>%
  roc_auc(obs, VF:L)
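Because a grouped call returns one row per resample, the per-resample estimates can be summarized with ordinary dplyr verbs. The following is a minimal sketch using accuracy(); the names per_fold, fold_summary, mean_accuracy and sd_accuracy are made up for this example.

```r
library(yardstick)
library(dplyr)

data("hpc_cv")

# One accuracy estimate per resample (one row per value of Resample).
per_fold <- hpc_cv %>%
  group_by(Resample) %>%
  accuracy(obs, pred)

# Collapse the per-resample estimates into a mean and standard deviation.
fold_summary <- per_fold %>%
  summarise(
    mean_accuracy = mean(.estimate),
    sd_accuracy = sd(.estimate)
  )

fold_summary
```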
Curve-based methods such as roc_curve(), pr_curve() and gain_curve() all have ggplot2::autoplot() methods that allow for powerful and easy visualization.
#| fig-alt: "Faceted ROC curve. 1-specificity along the x-axis, sensitivity along the y-axis. Facets include the classes F, L, M, and VF. Each facet shows 10 lines colored to correspond to a resample. All the lines are quite overlapping. With VF having the tightest and highest values."
library(ggplot2)

hpc_cv %>%
  group_by(Resample) %>%
  roc_curve(obs, VF:L) %>%
  autoplot()
This project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
For questions and discussions about tidymodels packages, modeling, and machine learning, please post on RStudio Community.
If you think you have encountered a bug, please submit an issue.
Either way, learn how to create and share a reprex (a minimal, reproducible example), to clearly communicate about your code.
Check out further details on contributing guidelines for tidymodels packages and how to get help.