nbeats: General Interface for N-BEATS Time Series Models


View source: R/parsnip-nbeats.R

Description

nbeats() is a way to generate a specification of an N-BEATS model before fitting and allows the model to be created using different packages. Currently the only package is gluonts. There are 2 N-BEATS implementations: (1) Standard N-BEATS, and (2) Ensemble N-BEATS.

Usage

nbeats(
  mode = "regression",
  id,
  freq,
  prediction_length,
  lookback_length = NULL,
  loss_function = NULL,
  bagging_size = NULL,
  num_stacks = NULL,
  num_blocks = NULL,
  epochs = NULL,
  batch_size = NULL,
  num_batches_per_epoch = NULL,
  learn_rate = NULL,
  learn_rate_decay_factor = NULL,
  learn_rate_min = NULL,
  patience = NULL,
  clip_gradient = NULL,
  penalty = NULL
)

Arguments

mode

A single character string for the type of model. The only possible value for this model is "regression".

id

A quoted column name that tracks the GluonTS FieldName "item_id"

freq

A pandas timeseries frequency such as "5min" for 5-minutes or "D" for daily. Refer to Pandas Offset Aliases.

prediction_length

Numeric value indicating the length of the prediction horizon

lookback_length

Number of time units that condition the predictions. Also known as 'lookback period'. Default is 2 * prediction_length.

loss_function

The loss function (also known as metric) to use for training the network. Unlike other models in GluonTS this network does not use a distribution. One of the following: "sMAPE", "MASE" or "MAPE". The default value is "MAPE".

bagging_size

(Applicable to Ensemble N-BEATS). The number of models that share the parameter combination of 'context_length' and 'loss_function'. Each of these models gets a different random initialization. Default and recommended value: 10.

num_stacks

The number of stacks the network should contain. Default and recommended value for generic mode: 30. Recommended value for interpretable mode: 2.

num_blocks

The number of blocks per stack. A list of ints of length 1 or 'num_stacks'. Default and recommended value for generic mode: 1. Recommended value for interpretable mode: 3.

epochs

Number of epochs that the network will train (default: 5).

batch_size

Number of examples in each batch (default: 32).

num_batches_per_epoch

Number of batches at each epoch (default: 50).

learn_rate

Initial learning rate (default: 1e-3).

learn_rate_decay_factor

Factor (between 0 and 1) by which to decrease the learning rate (default: 0.5).

learn_rate_min

Lower bound for the learning rate (default: 5e-5).

patience

The patience to observe before reducing the learning rate, nonnegative integer (default: 10).

clip_gradient

Maximum value of gradient. The gradient is clipped if it is too large (default: 10).

penalty

The weight decay (or L2 regularization) coefficient. Modifies the objective by adding a penalty for having large weights (default: 1e-8).
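The interaction of learn_rate, learn_rate_decay_factor, and learn_rate_min can be illustrated with a small base R sketch. This is a simplified illustration, not the GluonTS trainer itself: decay_schedule() is a hypothetical helper, and it assumes the rate is multiplied by the decay factor at each reduction and never drops below the minimum. It ignores patience, which governs when a reduction is triggered.

```r
# Hypothetical helper (not part of modeltime.gluonts or GluonTS):
# lists the learning rate after each successive reduction.
decay_schedule <- function(learn_rate = 1e-3,
                           decay_factor = 0.5,
                           learn_rate_min = 5e-5,
                           n_reductions = 6) {
  rates <- numeric(n_reductions + 1)
  rates[1] <- learn_rate
  for (i in seq_len(n_reductions)) {
    # Multiply by the decay factor, but never go below the lower bound
    rates[i + 1] <- max(rates[i] * decay_factor, learn_rate_min)
  }
  rates
}

# Using the defaults documented above
decay_schedule()
```

With the documented defaults, the rate would reach the lower bound after five reductions and stay there.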

Details

These arguments are converted to their specific names at the time that the model is fit. Other options and arguments can be set using set_engine(). If left to their defaults here (see above), the values are taken from the underlying model functions. If parameters need to be modified, update() can be used in lieu of recreating the object from scratch.

The model can be created using the fit() function with the following engines: "gluonts_nbeats" and "gluonts_nbeats_ensemble".

Engine Details

The standardized parameter names in modeltime can be mapped to their original names in each engine:

| modeltime | NBEATSEstimator | NBEATSEnsembleEstimator |
| --- | --- | --- |
| id | ListDataset('item_id') | ListDataset('item_id') |
| freq | freq | freq |
| prediction_length | prediction_length | prediction_length |
| lookback_length | context_length (= 2 x prediction_length) | meta_context_length (= prediction_length x c(2,4)) |
| bagging_size | NA | meta_bagging_size (3) |
| loss_function | loss_function ('sMAPE') | meta_loss_function (list('sMAPE')) |
| num_stacks | num_stacks (30) | num_stacks (30) |
| num_blocks | num_blocks (list(1)) | num_blocks (list(1)) |
| epochs | epochs (5) | epochs (5) |
| batch_size | batch_size (32) | batch_size (32) |
| num_batches_per_epoch | num_batches_per_epoch (50) | num_batches_per_epoch (50) |
| learn_rate | learning_rate (0.001) | learning_rate (0.001) |
| learn_rate_decay_factor | learning_rate_decay_factor (0.5) | learning_rate_decay_factor (0.5) |
| learn_rate_min | minimum_learning_rate (5e-5) | minimum_learning_rate (5e-5) |
| patience | patience (10) | patience (10) |
| clip_gradient | clip_gradient (10) | clip_gradient (10) |
| penalty | weight_decay (1e-8) | weight_decay (1e-8) |

Other options can be set using set_engine().
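The name mapping in the table above can be written out as a plain lookup. The sketch below is only an illustration in base R: nbeats_arg_map and translate_args() are hypothetical names, not functions from modeltime.gluonts, and the map covers only the NBEATSEstimator arguments whose names differ from the standardized ones.

```r
# Standardized modeltime names -> NBEATSEstimator argument names,
# transcribed from the Engine Details table (renamed arguments only).
nbeats_arg_map <- c(
  lookback_length         = "context_length",
  learn_rate              = "learning_rate",
  learn_rate_decay_factor = "learning_rate_decay_factor",
  learn_rate_min          = "minimum_learning_rate",
  penalty                 = "weight_decay"
)

# Hypothetical helper: rename the elements of a list of arguments.
# Names that are not in the map (e.g. epochs) are left unchanged.
translate_args <- function(args, map = nbeats_arg_map) {
  known <- names(args) %in% names(map)
  names(args)[known] <- map[names(args)[known]]
  args
}

translated <- translate_args(list(learn_rate = 0.001, penalty = 1e-8, epochs = 5))
names(translated)
```

The actual translation happens inside the package at fit time; this sketch only shows which engine-side name each standardized argument corresponds to.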

Engine

gluonts_nbeats

The engine uses gluonts.model.n_beats.NBEATSEstimator(). Default values that have been changed to prevent long-running computations:

Required Parameters

The gluonts_nbeats implementation has several Required Parameters, which are user-defined.

1. ID Variable (Required):

An important difference from other parsnip models is that each time series (even a single time series) must be uniquely identified by an ID variable.

2. Frequency (Required):

The GluonTS models use a Pandas Timestamp Frequency freq to generate features internally. Examples: "5min" for 5-minute intervals or "D" for daily.

The Pandas Timestamps are quite flexible. Refer to Pandas Offset Aliases.

3. Prediction Length (Required):

Unlike other parsnip models, a prediction_length is required during the model specification and fitting process.

gluonts_nbeats_ensemble

The engine uses gluonts.model.n_beats.NBEATSEnsembleEstimator().

Number of Models Created

This model can perform very well, but it can be expensive (long-running) because of the number of models that are created. The number of models follows the formula:

length(lookback_length) x length(loss_function) x meta_bagging_size

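The default values that have been changed from the GluonTS implementation to prevent long-running computations (see the Engine Details table) are lookback_length = prediction_length x c(2,4) (2 values), loss_function = list('sMAPE') (1 value), and bagging_size = 3.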

The result is: 2 x 1 x 3 = 6 models. Each model will have 5 epochs by default.
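The formula can be checked with a few lines of base R. This is a minimal sketch: count_ensemble_models() is a hypothetical helper (not part of modeltime.gluonts), and the second configuration is an illustrative choice of larger settings rather than a documented default.

```r
# Hypothetical helper: number of models an ensemble would train,
# following length(lookback_length) x length(loss_function) x bagging_size.
count_ensemble_models <- function(lookback_length, loss_function, bagging_size) {
  length(lookback_length) * length(loss_function) * bagging_size
}

prediction_length <- 24

# Defaults as listed in the Engine Details table
n_default <- count_ensemble_models(
  lookback_length = prediction_length * c(2, 4),
  loss_function   = list("sMAPE"),
  bagging_size    = 3
)

# An illustrative larger configuration: six lookback lengths,
# three loss functions, and ten models per combination
n_large <- count_ensemble_models(
  lookback_length = prediction_length * 2:7,
  loss_function   = list("sMAPE", "MASE", "MAPE"),
  bagging_size    = 10
)

c(default = n_default, large = n_large)
```

The count grows multiplicatively, so widening any one of the three settings scales the total training time accordingly.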

Required Parameters

The gluonts_nbeats_ensemble implementation has several Required Parameters, which are user-defined.

1. ID Variable (Required):

An important difference from other parsnip models is that each time series (even a single time series) must be uniquely identified by an ID variable.

2. Frequency (Required):

The GluonTS models use a Pandas Timestamp Frequency freq to generate features internally. Examples: "5min" for 5-minute intervals or "D" for daily.

The Pandas Timestamps are quite flexible. Refer to Pandas Offset Aliases.

3. Prediction Length (Required):

Unlike other parsnip models, a prediction_length is required during the model specification and fitting process.
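A specification for the ensemble engine might look like the sketch below. It assumes modeltime.gluonts and parsnip are installed; the particular values (two lookback lengths, bagging_size = 1, a single epoch) are illustrative choices to keep a run short, not recommendations. With the single default loss function listed in the Engine Details table, the formula above would give 2 x 1 x 1 = 2 models.

```r
library(parsnip)
library(modeltime.gluonts)

# Ensemble specification: the three required parameters plus a
# deliberately small ensemble (illustrative values)
ensemble_spec <- nbeats(
    # User Defined (Required) Parameters
    id                    = "id",
    freq                  = "M",
    prediction_length     = 24,

    # Ensemble size: two lookback lengths, one model per combination
    lookback_length       = c(24, 48),
    bagging_size          = 1,

    # Hyper Parameters
    epochs                = 1,
    num_batches_per_epoch = 4
) %>%
    set_engine("gluonts_nbeats_ensemble")

ensemble_spec
```

Creating the specification does not train anything; the models are only built when fit() is called.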

Fit Details

The following features are REQUIRED to be available in the incoming data for the fitting process.

ID Variable

An ID feature must be included in the recipe or formula fitting process. This assists with cataloging the time series inside the GluonTS ListDataset. The column name must match the quoted feature name specified in nbeats(id = ...). For example, nbeats(id = "id") expects a column inside your data named "id".

Date and Date-Time Variable

It's a requirement to have a date or date-time variable as a predictor. The fit() interface accepts date and date-time features and handles them internally.
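Both requirements can be verified before calling fit(). The following is a minimal base R sketch: check_nbeats_data() is a hypothetical helper, not a function provided by modeltime.gluonts, and the example data frame contains made-up values.

```r
# Hypothetical pre-fit check: the data must contain the ID column named
# in nbeats(id = ...) and at least one date or date-time column.
check_nbeats_data <- function(data, id = "id") {
  if (!id %in% names(data)) {
    stop("Column '", id, "' not found in data.", call. = FALSE)
  }
  is_date <- vapply(
    data,
    function(col) inherits(col, c("Date", "POSIXct")),
    logical(1)
  )
  if (!any(is_date)) {
    stop("No date or date-time column found in data.", call. = FALSE)
  }
  invisible(TRUE)
}

# Made-up example data with an id, a date, and a value column
example_data <- data.frame(
  id    = factor("M750"),
  date  = as.Date("2015-01-01") + 0:2,
  value = c(1, 2, 3)
)

check_nbeats_data(example_data, id = "id")
```

The function returns TRUE invisibly when both columns are present and stops with an error message otherwise.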

See Also

fit.model_spec(), set_engine()

Examples

library(modeltime.gluonts)
library(tidymodels)
library(tidyverse)
library(timetk)


# ---- MODEL SPEC ----
# - Important: Make sure *required* parameters are provided
model_spec <- nbeats(

    # User Defined (Required) Parameters
    id                    = "id",
    freq                  = "M",
    prediction_length     = 24,

    # Hyper Parameters
    epochs                = 1,
    num_batches_per_epoch = 4
) %>%
    set_engine("gluonts_nbeats")

model_spec

# ---- TRAINING ----
# Important: Make sure the date and id features are included as regressors
#  and do NOT dummy the id feature.
model_fitted <- model_spec %>%
    fit(value ~ date + id, m750)

model_fitted

# ---- PREDICT ----
# - IMPORTANT: New Data must have id and date features
new_data <- tibble(
    id   = factor("M750"),
    date = as.Date("2015-07-01")
)

predict(model_fitted, new_data)
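To request a forecast over the full horizon rather than a single date, new data with one row per future time stamp can be built in base R. This is a sketch under assumptions: the start date continues from the single-date example above, and the data frame is only constructed here, not passed to a fitted model.

```r
# Build new data covering the full prediction horizon (24 months),
# starting from the date used in the single-date example
prediction_length <- 24

future_data <- data.frame(
  id   = factor("M750"),
  date = seq(as.Date("2015-07-01"), by = "month", length.out = prediction_length)
)

range(future_data$date)
```

The resulting data frame has the id and date columns that the prediction step requires and could be passed to predict() in place of new_data.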

modeltime.gluonts documentation built on Jan. 8, 2021, 2:23 a.m.