alluvial_model_response_parsnip: create model response plot for parsnip models

alluvial_model_response_parsnip    R Documentation

create model response plot for parsnip models

Description

Wraps alluvial_model_response and get_data_space into one call for parsnip models.

Usage

alluvial_model_response_parsnip(
  m,
  data_input,
  degree = 4,
  bins = 5,
  bin_labels = c("LL", "ML", "M", "MH", "HH"),
  col_vector_flow = c("#FF0065", "#009850", "#A56F2B", "#005EAA", "#710500", "#7B5380",
    "#9DD1D1"),
  method = "median",
  parallel = FALSE,
  params_bin_numeric_pred = list(bins = 5),
  pred_train = NULL,
  stratum_label_size = 3.5,
  force = F,
  resp_var = NULL,
  .f_imp = vip::vi_model,
  ...
)

Arguments

m

parsnip model or trained workflow

data_input

dataframe, input data

degree

integer, number of top important variables to select. Plotting more than 4 results in too many flows and the alluvial plot will not be very readable. Default: 4

bins

integer, number of bins for numeric variables, increasing this number might result in too many flows, Default: 5

bin_labels

labels for the bins from low to high, Default: c("LL", "ML", "M", "MH", "HH")

col_vector_flow

character vector, defines flow colours, Default: c("#FF0065", "#009850", "#A56F2B", "#005EAA", "#710500", "#7B5380", "#9DD1D1")

method

character vector, one of c('median', 'pdp')

median

sets variables that are not displayed to their median (numeric) or mode (categorical), use with regular predictions

pdp

partial dependence plot method. For each observation in the training data, the displayed variables are set to the indicated values. The predict function is called for each modified observation and the results are averaged.

Default: 'median'

parallel

logical, turn on parallel processing for pdp method. Default: FALSE

params_bin_numeric_pred

list, additional parameters passed to manip_bin_numerics, which is applied to the pred parameter. Default: list(bins = 5)

pred_train

numeric vector, base the automated binning of the pred vector on the distribution of the training predictions. This is useful if marginal histograms are added to the plot later. Default: NULL

stratum_label_size

numeric, size of the stratum labels, Default: 3.5

force

logical, force plotting of over 1500 flows, Default: FALSE

resp_var

character, name of the target variable. Sometimes it cannot be inferred from the model and needs to be passed explicitly. Default: NULL

.f_imp

vip function that calculates feature importance, Default: vip::vi_model

...

additional parameters passed to alluvial_wide
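The bins and bin_labels arguments can be pictured with a small base R sketch. It is an illustration only: it uses plain equal-width binning via cut() and is not the package's manip_bin_numerics implementation, which may transform the data before binning. The variable names are made up for the example.

```r
# Illustrative only: equal-width binning with base R,
# not the binning code used inside easyalluvial.
bin_labels <- c("LL", "ML", "M", "MH", "HH")

x <- mtcars$disp

# one label per bin, ordered from low to high
binned <- cut(
  x,
  breaks = length(bin_labels),
  labels = bin_labels,
  ordered_result = TRUE
)

# number of observations per bin
table(binned)
```

Each additional bin adds another stratum per displayed variable, which is why raising bins quickly increases the number of possible flows.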

Details

This model visualisation approach follows the "visualising the model in the dataspace" principle as described in Wickham H, Cook D, Hofmann H (2015) Visualizing statistical models: Removing the blindfold. Statistical Analysis and Data Mining 8(4) <doi:10.1002/sam.11271>.
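The difference between the 'median' and 'pdp' methods described under Arguments can be sketched with base R. This is a simplified illustration under stated assumptions: it uses lm() on mtcars with a single displayed variable (wt), and the object names are hypothetical. It does not reproduce the internals of get_data_space or get_pdp_predictions.

```r
# Illustrative sketch with base R and lm(); not easyalluvial internals.
fit <- lm(disp ~ wt + hp + cyl, data = mtcars)

# values at which the displayed variable `wt` is evaluated
wt_grid <- unname(quantile(mtcars$wt, probs = c(0.1, 0.5, 0.9)))

# "median": hold the non-displayed variables at their medians,
# then make one regular prediction per grid value
median_space <- data.frame(
  wt  = wt_grid,
  hp  = median(mtcars$hp),
  cyl = median(mtcars$cyl)
)
pred_median <- unname(predict(fit, newdata = median_space))

# "pdp": for each grid value, overwrite `wt` in every training row,
# predict for all modified rows and average the result
pred_pdp <- vapply(
  wt_grid,
  function(v) {
    modified <- mtcars
    modified$wt <- v
    mean(predict(fit, newdata = modified))
  },
  numeric(1)
)

data.frame(wt = wt_grid, pred_median, pred_pdp)
```

The 'pdp' variant calls predict once per grid value on the full training data, which is why it is the more expensive of the two and the one that benefits from parallel = TRUE.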

Value

ggplot2 object

Parallel Processing

We use the 'furrr' and 'future' packages to parallelize some of the computational steps for calculating the predictions. It is up to the user to register a compatible backend (see plan).
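A minimal sketch of registering a backend and restoring the previous one afterwards, assuming the 'future' package is installed. The choice of two workers is arbitrary and only for illustration.

```r
if (requireNamespace("future", quietly = TRUE)) {
  # plan() returns the previously active plan, so it can be restored later
  old_plan <- future::plan(future::multisession, workers = 2)

  # ... call alluvial_model_response_parsnip(..., method = "pdp",
  #     parallel = TRUE) here ...

  # restore the backend that was active before
  future::plan(old_plan)
}
```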

See Also

alluvial_wide, get_data_space, varImp, extractPrediction, get_pdp_predictions

Examples


if(check_pkg_installed("parsnip", raise_error = FALSE) &&
   check_pkg_installed("vip", raise_error = FALSE)) {
  df = mtcars2[, ! names(mtcars2) %in% 'ids' ]

  m = parsnip::rand_forest(mode = "regression") %>%
     parsnip::set_engine("randomForest") %>%
     parsnip::fit(disp ~ ., data = df)

  alluvial_model_response_parsnip(m, df, degree = 3)
}
## Not run: 
# workflow --------------------------------- 
# use a separate name for the unfitted spec so the fitted model `m` is kept
m_spec <- parsnip::rand_forest(mode = "regression") %>%
  parsnip::set_engine("randomForest")

rec_prep = recipes::recipe(disp ~ ., df) %>%
  recipes::prep()

wf <- workflows::workflow() %>%
  workflows::add_model(m_spec) %>%
  workflows::add_recipe(rec_prep) %>%
  parsnip::fit(df)

alluvial_model_response_parsnip(wf, df, degree = 3)

# partial dependence plotting method -----
future::plan("multisession")
alluvial_model_response_parsnip(m, df, degree = 3, method = 'pdp', parallel = TRUE)

## End(Not run)

easyalluvial documentation built on May 29, 2024, 5:32 a.m.