knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "man/figures/README-", fig.height = 6, out.width = "100%" )
The xrayspecs package provides model-agnostic interpretations of black-box models. The package is designed to integrate into the parsnip
package's unified modelling framework. Currently the package supports: permutation importance plots, partial dependence plots and individual conditional expectation plots.
You cannot currently install xrayspecs from CRAN. However, you can install the development version of xrayspecs from GitHub with:
# install.packages("devtools") devtools::install_github("mt-edwards/xrayspecs")
library(xrayspecs) library(tidyverse) library(tidymodels) library(ranger)
The bike
data set is used for this example.This data set contains features only of type double (<dbl>
) and type factor (<fct>
). This is required so that the plot_dependence()
function knows how to plot the feature predictions. For example, the predictions of double (continuous) features are displayed with line plots and the predictions of factor (categorical) features are displayed with bar plots.
data("bike")
The bike
data set set is split into a training set and a test set with the rsample package. The trainiing set is used to train the predictive models and the test set is used to test and interpret the predictive models.
split <- initial_split(bike, prop = 4 / 5, strata = "bikes_rented") bike_train <- training(split) bike_test <- testing(split)
A random forest is trained on the bike_train
data set using the parsnip
package. The parsnip
package provides a unified framework for fitting models in R
. The models that are available for training in parsnip
are listed here. The xrayspecs package is designed to integrate into the parsnip
package's unified framework.
rf <- rand_forest(mode = "regression") %>% set_engine("ranger") %>% fit(bikes_rented ~ ., data = bike_train)
predict(rf, bike_test) %>% bind_cols(bike_test) %>% mae(truth = bikes_rented, estimate = .pred)
To display a permutation importance plot of the random forest features all you need to do is pipe the rf
object into the plot_importance()
function along with the test data (bike_test
) the target (bike_rentals
) and a metric from the yardstick
package, e.g. mae
(Mean Absolute Error). A feature importance is equal to the absolute difference between the metric estimate when the feature is and is not permuted in the data. Note: metrics must be appropriate for the target variable.
rf %>% plot_importance(bike_test, bikes_rented, mae) + labs(title = "Permutation Importance Plot")
To display a partial dependence plot of a feature for the random forest model all you need to do is pipe the rf
object into the plot_dependence()
function along with the data (bike_test
) and the feature. Here the partial dependence plots of temperature
, humidity
and wind_speed
are displayed.
rf %>% plot_dependence(bike_test, temperature) + labs(title = "Partial Dependence Plot") + labs(x = "Temperature") rf %>% plot_dependence(bike_test, humidity) + labs(title = "Partial Dependence Plot") + labs(x = "Humidity") rf %>% plot_dependence(bike_test, wind_speed) + labs(title = "Partial Dependence Plot") + labs(x = "Wind Speed")
rf %>% plot_dependence(bike_test, season) + labs(title = "Partial Dependence Plot") + labs(x = "Season")
To display an individual conditional expectation plot of a feature for the random forest model all you need to do is pipe the rf
object into the plot_dependence()
function along with the data (bike_test
) and the feature with the optional argument examples
assigned to TRUE
. Individual conditional expectation plots are not available for features of type factor
rf %>% plot_dependence(bike_test, days_since_2011) + labs(title = "Partial Dependence Plot") + labs("Days Since 2011") rf %>% plot_dependence(bike_test, days_since_2011, examples = TRUE) + labs(title = "Individual Conditional Expectation Plot") + labs(x = "Days Since 2011") rf %>% plot_dependence(bike_test, days_since_2011, examples = TRUE, center = TRUE) + labs(title = "Centered Individual Conditional Expectation Plot") + labs(x = "Days Since 2011")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.