deep_ar | R Documentation |
deep_ar() is a way to generate a specification of a DeepAR model
before fitting and allows the model to be created using
different packages. Currently the only package is gluonts.
deep_ar(
  mode = "regression",
  id,
  freq,
  prediction_length,
  lookback_length = NULL,
  cell_type = NULL,
  num_layers = NULL,
  num_cells = NULL,
  dropout = NULL,
  epochs = NULL,
  batch_size = NULL,
  num_batches_per_epoch = NULL,
  learn_rate = NULL,
  learn_rate_decay_factor = NULL,
  learn_rate_min = NULL,
  patience = NULL,
  clip_gradient = NULL,
  penalty = NULL,
  scale = NULL
)
mode | A single character string for the type of model. The only possible value for this model is "regression". |
id | A quoted column name that tracks the GluonTS FieldName "item_id". |
freq | A Pandas timestamp frequency, such as "5min" for timestamps that are 5 minutes apart or "D" for daily timestamps. Refer to Pandas Offset Aliases. |
prediction_length | Numeric value indicating the length of the prediction horizon. |
lookback_length | Number of steps to unroll the RNN for before computing predictions (default: NULL, in which case context_length = prediction_length). |
cell_type | Type of recurrent cells to use (available: 'lstm' or 'gru'; default: 'lstm'). |
num_layers | Number of RNN layers (default: 2). |
num_cells | Number of RNN cells for each layer (default: 40). |
dropout | Dropout regularization parameter (default: 0.1). |
epochs | Number of epochs that the network will train (default: 5). |
batch_size | Number of examples in each batch (default: 32). |
num_batches_per_epoch | Number of batches at each epoch (default: 50). |
learn_rate | Initial learning rate (default: 1e-3). |
learn_rate_decay_factor | Factor (between 0 and 1) by which to decrease the learning rate (default: 0.5). |
learn_rate_min | Lower bound for the learning rate (default: 5e-5). |
patience | Number of epochs to wait before reducing the learning rate; a nonnegative integer (default: 10). |
clip_gradient | Maximum value of the gradient. The gradient is clipped if it is too large (default: 10). |
penalty | The weight decay (or L2 regularization) coefficient. Modifies the objective by adding a penalty for having large weights (default: 1e-8). |
scale | Scales numeric data by id group (default: FALSE). |
These arguments are converted to their specific names at the time that
the model is fit. Other options and arguments can be set using
set_engine(). If left to their defaults here (see above),
the values are taken from the underlying model functions.
If parameters need to be modified, update() can be used in lieu of recreating
the object from scratch.
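As a minimal sketch of the update() workflow described above, assuming modeltime.gluonts is installed and that update() accepts the same main arguments as deep_ar(), an existing specification can be adjusted without being rebuilt. The argument values are illustrative only.

```r
library(modeltime.gluonts)

# Initial specification with a deliberately short training run
spec <- deep_ar(
  id                = "id",
  freq              = "M",
  prediction_length = 24,
  epochs            = 1
)

# Change a single hyperparameter instead of recreating the spec
spec_longer <- update(spec, epochs = 10)
spec_longer
```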
The model can be fitted with the fit() function using the following engines:
GluonTS DeepAR: "gluonts_deepar" (the default)
PyTorch: "torch". Requires pytorch and pytorch-lightning. Install with install_gluonts(include_pytorch = TRUE).
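The engine choice above might be expressed as follows. This is a sketch that assumes the Python environment has already been set up; the install call is left commented out because it only needs to run once, and the hyperparameter values are placeholders.

```r
library(parsnip)            # set_engine() and the %>% pipe
library(modeltime.gluonts)  # deep_ar()

# One-time setup that also installs the PyTorch dependencies:
# install_gluonts(include_pytorch = TRUE)

# Same specification, but trained with the "torch" engine
spec_torch <- deep_ar(
  id                = "id",
  freq              = "M",
  prediction_length = 24,
  epochs            = 1
) %>%
  set_engine("torch")

spec_torch
```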
The standardized parameter names in modeltime can be mapped to their original
names in each engine:
modeltime | DeepAREstimator (GluonTS) | DeepAREstimator (Torch) |
id | NA | NA |
freq | freq | freq |
prediction_length | prediction_length | prediction_length |
lookback_length | context_length (= prediction_length) | context_length (= prediction_length) |
epochs | epochs (5) | max_epochs |
batch_size | batch_size (32) | batch_size (32) |
num_batches_per_epoch | num_batches_per_epoch (50) | Not Used |
learn_rate | learning_rate (0.001) | Not Used |
learn_rate_decay_factor | learning_rate_decay_factor (0.5) | Not Used |
learn_rate_min | minimum_learning_rate (5e-5) | Not Used |
patience | patience (10) | Not Used |
clip_gradient | clip_gradient (10) | Not Used |
penalty | weight_decay (1e-8) | Not Used |
cell_type | cell_type ('lstm') | Not Used |
num_layers | num_layers (2) | Not Used |
num_cells | num_cells (40) | num_cells (40) |
dropout | dropout_rate (0.1) | dropout_rate (0.1) |
scale | scale_by_id (FALSE) | scale_by_id (FALSE) |
Other options can be set using set_engine().
The "gluonts_deepar" engine uses gluonts.model.deepar.DeepAREstimator().
Default values that have been changed to prevent long-running computations:
epochs = 5: GluonTS uses 100 by default.
Required Parameters
The gluonts implementation has several required parameters,
which are user-defined.
1. ID Variable (Required):
An important difference from other parsnip models is that each time series (even a single time series) must be uniquely identified by an ID variable.
The ID feature must be of class character or factor.
This ID feature is provided as a quoted column name
during the model specification process (e.g. deep_ar(id = "ID"), assuming
you have a column in your data named "ID").
2. Frequency (Required):
The GluonTS models use a Pandas Timestamp Frequency freq to generate
features internally. Examples:
freq = "5min" for timestamps that are 5 minutes apart
freq = "D" for daily timestamps
The Pandas Timestamps are quite flexible. Refer to Pandas Offset Aliases.
3. Prediction Length (Required):
Unlike other parsnip models, a prediction_length is required
during the model specification and fitting process.
Other Parameters
Other parameters of gluonts.model.deepar.DeepAREstimator()
can be set using set_engine().
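A short sketch of passing an engine-specific option through set_engine() follows. The option used here, num_parallel_samples, is taken from the GluonTS DeepAREstimator signature; whether the installed GluonTS version accepts it, and the value chosen, are assumptions.

```r
library(parsnip)            # set_engine() and the %>% pipe
library(modeltime.gluonts)  # deep_ar()

spec_custom <- deep_ar(
  # Required parameters
  id                = "id",
  freq              = "M",
  prediction_length = 24,
  # Standardized hyperparameter
  epochs            = 1
) %>%
  set_engine(
    "gluonts_deepar",
    # Assumed engine-specific argument, forwarded to DeepAREstimator()
    num_parallel_samples = 200L
  )

spec_custom
```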
The "torch" engine uses gluonts.torch.model.deepar.DeepAREstimator().
Default values that have been changed to prevent long-running computations:
epochs = 5: Torch DeepAR uses 100 by default.
Important Engine Details
A special feature of the "torch" engine is the use of pytorch_lightning for training,
which differs from the implementation for gluonts.
We can access the pytorch_lightning.trainer.trainer.Trainer() function
via set_engine(). This allows us to set parameters for tasks like:
Setting up GPUs
Modifying the PyTorch Lightning Logging Checkpoints
To access the Trainer() function parameters, simply add
arguments to set_engine(). These are passed to
deepar_torch_fit_impl(), an intermediate function
that translates parameters for PyTorch Lightning.
For further details, refer to the documentation for the pytorch_lightning.trainer.trainer.Trainer()
function.
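The pass-through described above could look like the sketch below. The accelerator and devices arguments exist in recent PyTorch Lightning releases of Trainer(), but older releases use different names (for example gpus), so treat both the argument names and the values as assumptions to check against the installed version.

```r
library(parsnip)            # set_engine() and the %>% pipe
library(modeltime.gluonts)  # deep_ar()

spec_torch_gpu <- deep_ar(
  id                = "id",
  freq              = "M",
  prediction_length = 24,
  epochs            = 1
) %>%
  set_engine(
    "torch",
    # Assumed Trainer() arguments; names depend on the installed
    # pytorch-lightning version
    accelerator = "gpu",
    devices     = 1L
  )

spec_torch_gpu
```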
The following features are REQUIRED to be available in the incoming data for the fitting process.
Fit: fit(y ~ date + id, data): Includes a target feature that is a
function of a "date" and "id" feature. The ID feature must be pre-specified
in the model specification.
Predict: predict(model, new_data), where new_data contains both
a column named "date" and a column named "id".
ID Variable
An ID feature must be included in the recipe or formula fitting
process. This assists with cataloging the time series inside the GluonTS
ListDataset.
The column name must match the quoted feature name specified in the
model specification. For example, deep_ar(id = "id")
expects a column inside your data named "id".
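As an illustration of the ID requirement, the base R sketch below prepares an ID column of class factor from a numeric identifier. The data frame and its column names are hypothetical.

```r
# Hypothetical data with a numeric series identifier
sales <- data.frame(
  store = c(1, 1, 2, 2),
  date  = as.Date(c("2020-01-01", "2020-02-01", "2020-01-01", "2020-02-01")),
  value = c(10, 12, 7, 9)
)

# Create a factor column named "id" to match deep_ar(id = "id")
sales$id <- factor(paste0("store_", sales$store))

class(sales$id)   # "factor"
levels(sales$id)  # "store_1" "store_2"
```

For a single time series, a constant ID (for example, the same label on every row) serves the same purpose.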
Date and Date-Time Variable
It's a requirement to have a date or date-time variable as a predictor.
The fit() interface accepts date and date-time features and handles them internally.
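Because predict() needs new_data with both a date and an id column, one way to build a monthly future frame in base R is sketched below. The last observed date, the horizon, and the ID value are assumptions chosen for illustration.

```r
horizon   <- 24                      # assumed to equal prediction_length
last_date <- as.Date("2015-06-01")   # hypothetical last observed date

# seq() with by = "month" steps one calendar month at a time;
# the first element is the last observed date, so it is dropped
future_dates <- seq(last_date, by = "month", length.out = horizon + 1)[-1]

new_data <- data.frame(
  id   = factor("M750"),
  date = future_dates
)

head(new_data, 3)
```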
Salinas, David, Valentin Flunkert, and Jan Gasthaus. "DeepAR: Probabilistic forecasting with autoregressive recurrent networks." arXiv preprint arXiv:1704.04110 (2017).
fit.model_spec(), set_engine()
library(tidymodels)
library(tidyverse)
library(timetk)
library(modeltime)          # m750 example data
library(modeltime.gluonts)  # deep_ar()

# ---- MODEL SPEC ----
# - Important: Make sure *required* parameters are provided
model_spec <- deep_ar(

    # User Defined (Required) Parameters
    id                    = "id",
    freq                  = "M",
    prediction_length     = 24,

    # Hyper Parameters
    epochs                = 1,
    num_batches_per_epoch = 4
) %>%
    set_engine("gluonts_deepar")

model_spec

# ---- TRAINING ----
# Important: Make sure the date and id features are included as regressors
#  and do NOT dummy the id feature.
model_fitted <- model_spec %>%
    fit(value ~ date + id, m750)

model_fitted

# ---- PREDICT ----
# - IMPORTANT: New Data must have id and date features
new_data <- tibble(
    id   = factor("M750"),
    date = as.Date("2015-07-01")
)

predict(model_fitted, new_data)