apply_lime: Apply Different Implementations of LIME
In goodekat/limeaid: Diagnose LIME Explanations


View source: R/main-apply_lime.R

Applies LIME with the specified tuning parameter options.

apply_lime(
  train,
  test,
  model,
  sim_method,
  nbins = 4,
  label,
  n_features,
  n_permutations = 5000,
  feature_select = "auto",
  dist_fun = "gower",
  kernel_width = NULL,
  gower_pow = 1,
  all_fs = FALSE,
  return_perms = FALSE,
  parallel = FALSE,
  seed = NULL
)

`train`	Data frame of training data features.
`test`	Data frame of testing data features.
`model`	Complex model to be explained.
`sim_method`	Vector of methods to use for creating the simulated data. Options are 'quantile_bins', 'equal_bins', 'kernel_density', and 'normal_approx'.
`nbins`	Vector of number of bins to use with bin based simulation methods.
`label`	Response category to use in the explanations. The current implementation accepts only one label.
`n_features`	Number of features to return in the explanations.
`n_permutations`	Number of permutations to use when simulating data for each explanation. Default is 5000.
`feature_select`	Feature selection method. Options are 'auto', 'none', 'forward_selection', 'highest_weights', 'lasso_path', and 'tree'.
`dist_fun`	Distance function to use when computing weights for the simulated data. Default is 'gower'. Otherwise, `stats::dist()` will be used.
`kernel_width`	Kernel width to use if `dist_fun` is not 'gower'.
`gower_pow`	Numeric vector of powers to use when computing the Gower distance. Note: if `gower_pow` contains more than one unique value, the simulated values for each observation in the test data are reused across powers, so that explanations can be compared across Gower powers with all other tuning parameters held fixed.
`all_fs`	Indicates whether all feature selection methods should be applied for an implementation of LIME, to see how the selected features vary within a LIME implementation. Note that the LIME results returned will correspond to the method specified in the `feature_select` option.
`return_perms`	Should the simulated dataset (permutations) be returned for all of the observations in the test dataset and all LIME implementations? Default is FALSE.
`parallel`	Indicates whether to apply LIME using parallel computation (with furrr) or sequentially (with purrr). Default is FALSE. Setting `parallel = TRUE` may reduce computation time for very large test datasets or many different sets of tuning parameters.
`seed`	Number to be used as a seed (if desired).
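
Because `sim_method`, `nbins`, and `gower_pow` accept vectors, a single call can cover several LIME implementations. The sketch below illustrates how such vectors might expand into individual tuning parameter sets. It assumes that each bin-based method is paired with every value of `nbins` and that the density-based methods ignore `nbins`; that pairing rule is an assumption for illustration, and `enumerate_implementations()` is a hypothetical helper, not a function in limeaid.

```r
# Hypothetical helper (not part of limeaid): list the implementations implied
# by vectors of sim_method and nbins, assuming only the bin-based methods
# are crossed with nbins.
enumerate_implementations <- function(sim_method, nbins = 4) {
  bin_methods <- c("quantile_bins", "equal_bins")
  binned <- intersect(sim_method, bin_methods)
  density <- setdiff(sim_method, bin_methods)

  # Pair every bin-based method with every requested number of bins
  binned_grid <- expand.grid(
    sim_method = binned,
    nbins = nbins,
    stringsAsFactors = FALSE
  )

  # Density-based methods have no bins, so nbins is recorded as NA
  density_grid <- data.frame(
    sim_method = density,
    nbins = rep(NA_integer_, length(density)),
    stringsAsFactors = FALSE
  )

  rbind(binned_grid, density_grid)
}

implementations <- enumerate_implementations(
  sim_method = c("quantile_bins", "kernel_density"),
  nbins = 2:3
)
print(implementations)
```

Under these assumptions, the two methods and two bin counts give three parameter sets: `quantile_bins` with 2 bins, `quantile_bins` with 3 bins, and `kernel_density`.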

# Load limeaid (provides apply_lime and the sine example data)
library(limeaid)

# Prepare training and testing data
x_train <- sine_data_train[c("x1", "x2", "x3")]
y_train <- factor(sine_data_train$y)
x_test <- sine_data_test[1:5, c("x1", "x2", "x3")]

# Fit a random forest model
rf <- randomForest::randomForest(x = x_train, y = y_train) 

# Run apply_lime
res <- apply_lime(train = x_train, 
                  test = x_test, 
                  model = rf,
                  label = "1",
                  n_features = 2,
                  sim_method = c('quantile_bins',
                                 'kernel_density'),
                  nbins = 2:3)