return_error: Compute forecast error


View source: R/return_error.R

Description

Compute forecast error metrics on the validation datasets or a new test dataset.

Usage

return_error(
  data_results,
  data_test = NULL,
  test_indices = NULL,
  aggregate = stats::median,
  metrics = c("mae", "mape", "mdape", "smape", "rmse", "rmsse"),
  models = NULL,
  horizons = NULL,
  windows = NULL,
  group_filter = NULL
)

Arguments

data_results

An object of class 'training_results' or 'forecast_results' returned by (a) calling predict() on a trained model or (b) combine_forecasts().

data_test

Required for forecast results only. If data_results is an object of class 'forecast_results', supply a data.frame against which the accuracy of the forecasts is assessed. data_test should contain the outcome/target columns and any grouping columns.

test_indices

Required if data_test is given or if 'rmsse' is among the requested metrics. Row indices or dates (class 'Date' or 'POSIXt') with length nrow(data_test).
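
As a rough illustration of the length requirement, the sketch below builds a hypothetical data_test and a matching test_indices vector of dates. The outcome values and dates are invented for illustration, and the commented-out call assumes that a 'forecast_results' object named data_forecasts already exists.

```r
# Hypothetical test set: 12 monthly observations of a single outcome column.
data_test <- data.frame(
  DriversKilled = c(120, 115, 130, 125, 110, 105, 118, 122, 128, 135, 140, 138)
)

# One date per row of data_test (class 'Date').
test_indices <- seq(as.Date("1984-01-01"), by = "1 month",
                    length.out = nrow(data_test))

# The indices must line up one-to-one with the rows of data_test.
stopifnot(length(test_indices) == nrow(data_test))

# Assumed usage with an existing 'forecast_results' object (not run here):
# data_error <- return_error(data_forecasts, data_test = data_test,
#                            test_indices = test_indices)
```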

aggregate

Default median. A function, passed without parentheses (e.g., stats::median rather than stats::median()), that aggregates historical prediction or forecast error across time series. All error metrics are first calculated at the level of the individual time series. aggregate is then used to combine error metrics across validation windows and horizons. Aggregations are returned at the group level if data_results contains groups.
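
To show what passing a function "without parentheses" means in practice, here is a minimal sketch using invented per-series error values. summarize_error() is a hypothetical stand-in for illustration only, not the package's internal code.

```r
# Hypothetical per-series MAE values for three time series.
series_mae <- c(2, 4, 30)

# Stand-in helper: applies whatever function is supplied as 'aggregate'.
summarize_error <- function(errors, aggregate = stats::median) {
  aggregate(errors)
}

# The default median is robust to the one poorly forecast series.
error_median <- summarize_error(series_mae)

# Passing base::mean (no parentheses) lets the large error pull the summary up.
error_mean <- summarize_error(series_mae, aggregate = base::mean)
```

With these values the median is 4 and the mean is 12, so the choice of aggregate can change the reported error noticeably when a few series are much harder to forecast than the rest.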

metrics

A character vector of common forecast error metrics. The default behavior is to return all metrics.

models

Optional. A character vector of user-defined model names supplied to train_model() to filter results.

horizons

Optional. A numeric vector to filter results by horizon.

windows

Optional. A numeric vector to filter results by validation window number.

group_filter

Optional. A string for filtering results for grouped time series (e.g., "group_col_1 == 'A'"). group_filter is passed to dplyr::filter() internally.
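
The sketch below illustrates how a filter string of this form selects rows. It uses base R evaluation as a stand-in so that it runs without extra packages; the documentation above states that the package itself uses dplyr::filter(). The errors data.frame and its values are hypothetical.

```r
# Hypothetical error table for a grouped dataset.
errors <- data.frame(
  group_col_1 = c("A", "A", "B"),
  horizon = c(1, 12, 1),
  mae = c(2.5, 4.0, 3.1)
)

group_filter <- "group_col_1 == 'A'"

# Parse the string into an expression and evaluate it against the columns.
keep <- eval(parse(text = group_filter), envir = errors)

# Keep only the rows for which the condition is TRUE.
errors_a <- errors[keep, , drop = FALSE]
```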

Value

An S3 object of class 'validation_error', 'forecast_error', or 'forecastML_error': a list of data.frames of error metrics for the validation or forecast dataset, depending on the class of data_results ('training_results', 'forecast_results', or 'forecastML' from combine_forecasts()).

A list containing:

Error Metrics
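
The metric names accepted by the metrics argument are standard abbreviations: mean absolute error (mae), mean absolute percentage error (mape), median absolute percentage error (mdape), symmetric mean absolute percentage error (smape), root mean squared error (rmse), and root mean squared scaled error (rmsse). The sketch below gives common textbook-style definitions for a single time series. These are assumptions for illustration: the package's own formulas may differ in details such as percentage scaling or the exact smape and rmsse denominators.

```r
# Textbook-style definitions; illustrative only, not the package's source.
# y = actual values, y_hat = forecasts, y_train = in-sample history.
mae   <- function(y, y_hat) mean(abs(y - y_hat))
mape  <- function(y, y_hat) mean(abs((y - y_hat) / y)) * 100
mdape <- function(y, y_hat) stats::median(abs((y - y_hat) / y)) * 100
smape <- function(y, y_hat) {
  mean(2 * abs(y - y_hat) / (abs(y) + abs(y_hat))) * 100
}
rmse  <- function(y, y_hat) sqrt(mean((y - y_hat)^2))

# RMSSE scales the mean squared forecast error by the mean squared
# one-step naive (previous value) error on the training history.
rmsse <- function(y, y_hat, y_train) {
  sqrt(mean((y - y_hat)^2) / mean(diff(y_train)^2))
}

# Small invented example.
y_train <- c(10, 12, 11, 13)
y       <- c(10, 20)
y_hat   <- c(12, 16)

metric_values <- c(
  mae   = mae(y, y_hat),
  mape  = mape(y, y_hat),
  mdape = mdape(y, y_hat),
  smape = smape(y, y_hat),
  rmse  = rmse(y, y_hat),
  rmsse = rmsse(y, y_hat, y_train)
)
```

Note that rmsse needs the training history in addition to the test values, which is consistent with test_indices being required when 'rmsse' is requested.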

Methods and related functions

The output of return_error() has the following generic S3 methods

Examples

# Sampled Seatbelts data from the R package datasets.
data("data_seatbelts", package = "forecastML")

# Example - Training data for 2 horizon-specific models w/ common lags per predictor.
horizons <- c(1, 12)
lookback <- 1:15

data_train <- create_lagged_df(data_seatbelts, type = "train", outcome_col = 1,
                               lookback = lookback, horizon = horizons)

# One custom validation window at the end of the dataset.
windows <- create_windows(data_train, window_start = 181, window_stop = 192)

# User-defined model - LASSO
# A user-defined wrapper function for model training that takes the following
# arguments: (1) a horizon-specific data.frame made with create_lagged_df(..., type = "train")
# (e.g., my_lagged_df$horizon_h) and, optionally, (2) any number of additional named arguments
# which are passed as '...' in train_model().
library(glmnet)
model_function <- function(data, my_outcome_col) {

  x <- data[, -(my_outcome_col), drop = FALSE]
  y <- data[, my_outcome_col, drop = FALSE]
  x <- as.matrix(x, ncol = ncol(x))
  y <- as.matrix(y, ncol = ncol(y))

  model <- glmnet::cv.glmnet(x, y, nfolds = 3)
  return(model)
}

# my_outcome_col = 1 is passed in ... but could have been defined in model_function().
model_results <- train_model(data_train, windows, model_name = "LASSO", model_function,
                             my_outcome_col = 1)

# User-defined prediction function - LASSO
# The predict() wrapper takes two positional arguments. First,
# the returned model from the user-defined modeling function (model_function() above).
# Second, a data.frame of predictors--identical to the datasets returned from
# create_lagged_df(..., type = "train"). The function can return a 1- or 3-column data.frame
# with either (a) point forecasts or (b) point forecasts plus lower and upper forecast
# bounds (column order and column names do not matter).
prediction_function <- function(model, data_features) {

  x <- as.matrix(data_features, ncol = ncol(data_features))

  data_pred <- data.frame("y_pred" = predict(model, x, s = "lambda.min"))
  return(data_pred)
}

# Predict on the validation datasets.
data_valid <- predict(model_results, prediction_function = list(prediction_function),
                      data = data_train)

# Forecast error metrics for validation datasets.
data_error <- return_error(data_valid)

forecastML documentation built on July 8, 2020, 7:27 p.m.