View source: R/explain_forecast.R
explain_forecast | R Documentation |
Computes dependence-aware Shapley values for observations in explain_idx from the specified model by using the method specified in approach to estimate the conditional expectation.
See Aas et al. (2021) for a thorough introduction to dependence-aware prediction explanation with Shapley values.
explain_forecast(
  model,
  y,
  xreg = NULL,
  train_idx = NULL,
  explain_idx,
  explain_y_lags,
  explain_xreg_lags = explain_y_lags,
  horizon,
  approach,
  phi0,
  max_n_coalitions = NULL,
  iterative = NULL,
  group_lags = TRUE,
  group = NULL,
  n_MC_samples = 1000,
  seed = NULL,
  predict_model = NULL,
  get_model_specs = NULL,
  verbose = "basic",
  extra_computation_args = list(),
  iterative_args = list(),
  output_args = list(),
  ...
)
model |
Model object.
Specifies the model whose predictions we want to explain.
Run |
y |
Matrix, data.frame/data.table or a numeric vector. Contains the endogenous variables used to estimate the (conditional) distributions needed to properly estimate the conditional expectations in the Shapley formula including the observations to be explained. |
xreg |
Matrix, data.frame/data.table or a numeric vector. Contains the exogenous variables used to estimate the (conditional) distributions needed to properly estimate the conditional expectations in the Shapley formula including the observations to be explained. As exogenous variables are used contemporaneously when producing a forecast, this item should contain nrow(y) + horizon rows. |
train_idx |
Numeric vector.
The row indices in y and xreg denoting points in time to use when estimating the conditional expectations in
the Shapley value formula.
If |
explain_idx |
Numeric vector. The row indices in y and xreg denoting points in time to explain. |
explain_y_lags |
Numeric vector.
Denotes the number of lags that should be used for each variable in |
explain_xreg_lags |
Numeric vector.
If |
horizon |
Numeric.
The forecast horizon to explain. Passed to the |
approach |
Character vector of length |
phi0 |
Numeric. The prediction value for unseen data, i.e. an estimate of the expected prediction without conditioning on any features. Typically we set this value equal to the mean of the response variable in our training data, but other choices such as the mean of the predictions in the training data are also reasonable. |
max_n_coalitions |
Integer.
The upper limit on the number of unique feature/group coalitions to use in the iterative procedure
(if |
iterative |
Logical or NULL.
If |
group_lags |
Logical.
If |
group |
List.
If |
n_MC_samples |
Positive integer.
For most approaches, it indicates the maximum number of samples to use in the Monte Carlo integration
of every conditional expectation.
For |
seed |
Positive integer.
Sets the seed before any code that involves randomness is run.
If |
predict_model |
Function.
The prediction function used when |
get_model_specs |
Function.
An optional function for checking model/data consistency when
If |
verbose |
String vector or NULL.
Specifies the verbosity (printout detail level) through one or more of the strings
|
extra_computation_args |
Named list.
Specifies extra arguments related to the computation of the Shapley values.
See |
iterative_args |
Named list.
Specifies the arguments for the iterative procedure.
See |
output_args |
Named list.
Specifies certain arguments related to the output of the function.
See |
... |
Arguments passed on to
|
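The shape requirements stated in the xreg and phi0 descriptions above can be checked before calling explain_forecast(). The following is a minimal sketch using base R only. The helper check_xreg_rows() is hypothetical and not part of shapr, and the choice of Wind as an exogenous variable and of the first 150 rows as the observed series is an assumption made purely for illustration.

```r
data("airquality")

# Hypothetical helper (not part of shapr): TRUE when xreg has exactly
# nrow(y) + horizon rows, as the xreg argument description asks for.
check_xreg_rows <- function(y, xreg, horizon) {
  NROW(xreg) == NROW(y) + horizon
}

horizon <- 3

# Assumption for illustration: y is observed for 150 time points, while the
# exogenous variable is also known for the 3 time points being forecast.
y <- airquality[1:150, "Temp", drop = FALSE]
xreg <- airquality[1:153, "Wind", drop = FALSE]

xreg_ok <- check_xreg_rows(y, xreg, horizon)

# phi0 as described above: the mean of the response, repeated once per
# forecast step so that there is one value for each step of the horizon.
phi0 <- rep(mean(y$Temp), horizon)
```

NROW() is used instead of nrow() so that the check also works when y or xreg is a plain numeric vector.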
This function explains a forecast of length horizon. The argument train_idx is analogous to x_train in explain(); however, it contains only the time indices at which the forecast should start for each training sample. In the same way, explain_idx defines the time index (indices) which will precede a forecast to be explained.
As any autoregressive forecast model will require a set of lags to make a forecast at an arbitrary point in time, explain_y_lags and explain_xreg_lags define how many lags are required to "refit" the model at any given time index. This allows the different approaches to work in the same way they do for time-invariant models.
See the forecasting section of the general usage for further details.
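To make the role of the time indices and lags concrete, the sketch below builds, for each time index, the window of lagged values that would act as features and the values that follow it. It uses base R only. The helper lag_window() is hypothetical and not how shapr is implemented internally, and it assumes that the value at the index itself counts as the first lag, since the index precedes the forecast.

```r
data("airquality")
temp <- airquality$Temp

# Hypothetical helper (not part of shapr). For each time index i it collects
#   lags:   y[i], y[i - 1], ..., y[i - n_lags + 1]  (assumed feature window)
#   future: y[i + 1], ..., y[i + horizon]           (NA beyond the data)
lag_window <- function(y, idx, n_lags, horizon) {
  stopifnot(min(idx) >= n_lags)
  lags <- do.call(rbind, lapply(idx, function(i) y[i - seq_len(n_lags) + 1]))
  colnames(lags) <- paste0("lag", seq_len(n_lags))
  future <- do.call(rbind, lapply(idx, function(i) y[i + seq_len(horizon)]))
  colnames(future) <- paste0("h", seq_len(horizon))
  list(idx = idx, lags = lags, future = future)
}

# Windows playing the role of the training samples (train_idx = 2:151)
train_windows <- lag_window(temp, idx = 2:151, n_lags = 2, horizon = 3)

# Windows for the forecasts to be explained (explain_idx = 152:153)
explain_windows <- lag_window(temp, idx = 152:153, n_lags = 2, horizon = 3)
```

With two lags, each time index yields a two-column feature row, so the 150 training indices give a 150 x 2 matrix. This is the sense in which a time index plus a lag count can stand in for a row of x_train.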
Object of class c("shapr", "list"). Contains the following items:
shapley_values_est
data.table with the estimated Shapley values, with the explained observations in the rows and
the features along the columns.
The column none is the prediction not devoted to any of the features (given by the argument phi0).
shapley_values_sd
data.table with the standard deviation of the Shapley values reflecting the uncertainty.
Note that this only reflects the coalition sampling part of the kernelSHAP procedure, and is therefore by
definition 0 when all coalitions are used.
Only present when extra_computation_args$compute_sd=TRUE, which is the default when iterative = TRUE.
internal
List with the different parameters, data, functions and other output used internally.
pred_explain
Numeric vector with the predictions for the explained observations.
MSEv
List with the values of the MSEv evaluation criterion for the approach. See the MSEv evaluation section in the general usage for details.
timing
List containing timing information for the different parts of the computation.
init_time and end_time give the time stamps for the start and end of the computation.
total_time_secs gives the total time in seconds for the complete execution of explain().
main_timing_secs gives the time in seconds for the main computations.
iter_timing_secs gives, for each iteration of the iterative estimation, the time spent on the different parts
of the iterative estimation routine.
Jon Lachmann, Martin Jullum
# Load example data
data("airquality")
data <- data.table::as.data.table(airquality)

# Fit an AR(2) model.
model_ar_temp <- ar(data$Temp, order = 2)

# Calculate the zero prediction values for a three step forecast.
p0_ar <- rep(mean(data$Temp), 3)

# Empirical approach, explaining forecasts starting at T = 152 and T = 153.
explain_forecast(
  model = model_ar_temp,
  y = data[, "Temp"],
  train_idx = 2:151,
  explain_idx = 152:153,
  explain_y_lags = 2,
  horizon = 3,
  approach = "empirical",
  phi0 = p0_ar,
  group_lags = FALSE
)