initialize_params_grid: Initialize parameters to optimize over based on a fixed grid

View source: R/initialize_params.R

initialize_params_grid    R Documentation

Initialize parameters to optimize over based on a fixed grid

Description

Use this function to generate a parameter grid (matrix) that can be provided to the param_grid argument of tulip(). The model estimation then reduces to evaluating all provided combinations and choosing the best one. See Details for more. Note that there is nothing special about the matrix generated by this function—you can define a set of possible parameters in any way that suits you.

Usage

initialize_params_grid(
  n_alpha = 15,
  n_beta = 15,
  n_gamma = 15,
  use_damped = TRUE,
  beta_smaller_than_alpha = TRUE,
  gamma_smaller_than_one_minus_alpha = TRUE,
  use_logistic = TRUE,
  logistic_limit = 5,
  upper_limit = 0.5,
  lower_limit = 0
)

Arguments

n_alpha

Base number of alpha candidate values

n_beta

Base number of beta candidate values

n_gamma

Base number of gamma candidate values

use_damped

Logical (default TRUE); should damped trends be part of the parameter grid? This is implemented by subtracting a value from the one_minus_beta parameter such that beta + one_minus_beta < 1.

beta_smaller_than_alpha

Logical; should beta be constrained to always be smaller than or equal to alpha? See, for example, chapter 3.4.2 of Hyndman et al. (2002) referenced below.

gamma_smaller_than_one_minus_alpha

Logical; should gamma be constrained to always be smaller than or equal to (1-alpha)? See, for example, chapter 3.4.3 of Hyndman et al. (2002) referenced below.

use_logistic

Logical; should the logistic function be used to distribute the base candidate values? This is useful if values near the boundaries at 0 and 1 should be trialed more densely than values around 0.5. If FALSE, the base candidate values are distributed linearly between 0 and 1.

logistic_limit

Most extreme value passed to the logistic function; for a given logistic_limit, the largest value trialed besides 1 is 1 / (1 + exp(-1 * logistic_limit)). The default is 5, corresponding to a value of 0.9933.

upper_limit

The largest value that any alpha, beta, and gamma can take on in the returned grid; default is 0.5. Note that this is applied at the very end of the function and therefore is not the most efficient way of limiting the set of possible parameters.

lower_limit

The lowest value that any alpha, beta, and gamma can take on in the returned grid; default is 0. Do not deviate from this unless you know what you're doing as the impact on the possible set of models is large due to the beta parameter especially. Note that this is applied at the very end of the function and therefore is not the most efficient way of limiting the set of possible parameters.
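To illustrate what use_logistic and logistic_limit describe, here is a minimal sketch in base R. It assumes that the candidates are obtained by mapping an evenly spaced sequence between -logistic_limit and logistic_limit through the logistic function; the helper candidate_values() is hypothetical and the package's internal implementation may differ.

```r
# Hypothetical helper, not part of the package: spreads n candidate values
# over the unit interval, either via the logistic function or linearly.
candidate_values <- function(n, use_logistic = TRUE, logistic_limit = 5) {
  if (use_logistic) {
    # Evenly spaced inputs on [-logistic_limit, logistic_limit], mapped
    # through the logistic function; stats::plogis(x) is 1 / (1 + exp(-x)).
    stats::plogis(seq(-logistic_limit, logistic_limit, length.out = n))
  } else {
    # Linear spacing between 0 and 1
    seq(0, 1, length.out = n)
  }
}

# Logistic spacing puts more candidates close to 0 and 1
round(candidate_values(5), 4)

# Linear spacing for comparison
round(candidate_values(5, use_logistic = FALSE), 4)

# Largest value reachable with the default logistic_limit of 5
round(1 / (1 + exp(-5)), 4)  # 0.9933
```

With the default upper_limit of 0.5, candidates in the upper half of the interval would be removed again at the end, which is why the documentation above calls the limits an inefficient way to restrict the grid.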

Details

The optimization procedure in tulip() evaluates each combination of parameters provided via param_grid. While this is computationally costly, it is also computationally stable. By deliberately choosing which parameters are trialed, unstable parameter combinations can be avoided. The prior probability for many parameter combinations can be set to zero this way. If the set of parameters can be restricted substantially (for example, because one updates from a previous fit or based on a related time series), it also makes the optimization computationally cheap.
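The idea of excluding parameter combinations up front can be sketched in base R. The candidate values below are made up for illustration, and the filtering shown here is only an assumption about how the constraints described under Arguments could be applied, not the package's actual code.

```r
# Illustrative candidate values; the values the package trials will differ.
candidates <- c(0.05, 0.1, 0.2, 0.4)

# All combinations of alpha, beta, and gamma
grid <- expand.grid(alpha = candidates, beta = candidates, gamma = candidates)

# Drop combinations that violate beta <= alpha or gamma <= 1 - alpha
keep <- grid$beta <= grid$alpha & grid$gamma <= 1 - grid$alpha
grid <- grid[keep, ]

# Arrange the columns in the order described under Value
param_grid <- cbind(
  alpha = grid$alpha, one_minus_alpha = 1 - grid$alpha,
  beta = grid$beta, one_minus_beta = 1 - grid$beta,
  gamma = grid$gamma, one_minus_gamma = 1 - grid$gamma
)

# Number of combinations left out of the 64 original ones
nrow(param_grid)
```

Every row that is filtered out is a model that never needs to be evaluated, which is where the computational savings come from.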

The default grid returned does not include any 'alpha', 'beta', or 'gamma' with a value of 0.5 or higher. This reduces the set of possible models to ones that adapt relatively slowly to new observations, reducing the impact an outlier can have. Yet, a 'beta' of 0.5 can still lead to explosive jumps depending on the context. This default is opinionated; use the upper_limit argument if you want to deviate from it.

This function returns a grid that reuses the same n_alpha, n_beta, and n_gamma candidate values across all combinations. This means that the space of possible parameters is not covered very well, a known drawback of grid search compared to random search. Consider drawing the set of possible parameter combinations randomly from a space of allowed values instead to avoid this. See also Bergstra and Bengio (2012) referenced below.
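A random alternative to the fixed grid could look roughly like the following base R sketch. It does not use or describe initialize_params_random(); the number of draws, the seed, and the sampling scheme are assumptions chosen for illustration.

```r
set.seed(42)  # only to make the illustration reproducible

n_draws <- 200      # assumed number of parameter combinations
upper_limit <- 0.5  # mirrors the default upper_limit of the grid function

# Draw alpha first, then draw beta and gamma within their constraints
alpha <- runif(n_draws, min = 0, max = upper_limit)
beta <- runif(n_draws, min = 0, max = alpha)  # beta <= alpha
gamma <- runif(n_draws, min = 0, max = pmin(1 - alpha, upper_limit))

# Same column layout as described under Value, without damping
random_grid <- cbind(
  alpha = alpha, one_minus_alpha = 1 - alpha,
  beta = beta, one_minus_beta = 1 - beta,
  gamma = gamma, one_minus_gamma = 1 - gamma
)

head(random_grid, 3)
```

Each draw yields a distinct value per parameter, so 200 rows cover 200 different alphas instead of repeating a handful of them. Note that drawing beta conditionally on alpha is not uniform over the allowed region; it is simply the shortest way to respect the constraint.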

Value

A numeric matrix with six named columns: 'alpha', 'one_minus_alpha', 'beta', 'one_minus_beta', 'gamma', 'one_minus_gamma'. The alpha parameters belong to the model's level component, the beta parameters to the model's trend component, and the gamma parameters to the model's seasonality component. Each pair usually adds up to 1; however, damping effectively reduces the sum of beta and one_minus_beta to less than 1. As per the assertions on tulip()'s param_grid, each row must sum to a value between 0 and 3, the columns must be named and in order, and each individual value must be between 0 and 1.
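When building a parameter matrix by hand, the requirements above can be checked before calling tulip(). The function check_param_grid() below is a hypothetical helper written from this description; the assertions inside tulip() may be implemented differently.

```r
# Hypothetical check, derived from the description of the return value
check_param_grid <- function(param_grid, tol = 1e-8) {
  expected <- c("alpha", "one_minus_alpha", "beta", "one_minus_beta",
                "gamma", "one_minus_gamma")
  row_sums <- rowSums(param_grid)
  stopifnot(
    is.matrix(param_grid),
    is.numeric(param_grid),
    identical(colnames(param_grid), expected),  # named and in order
    all(param_grid >= 0 & param_grid <= 1),     # each value within [0, 1]
    # small tolerance guards against floating-point rounding in the sums
    all(row_sums >= 0 & row_sums <= 3 + tol)
  )
  invisible(TRUE)
}

# One undamped row (pairs sum to 1) and one damped row
# (beta + one_minus_beta < 1); the values are made up for illustration.
example_grid <- rbind(
  c(0.25, 0.75, 0.125, 0.875, 0.0625, 0.9375),
  c(0.25, 0.75, 0.125, 0.750, 0.0625, 0.9375)
)
colnames(example_grid) <- c("alpha", "one_minus_alpha", "beta",
                            "one_minus_beta", "gamma", "one_minus_gamma")

check_param_grid(example_grid)
rowSums(example_grid)
```

The damped row sums to less than 3 because only the beta pair falls short of 1, which matches the allowed range of row sums.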

References

Rob J. Hyndman, Anne B. Koehler, Ralph D. Snyder, and Simone Grose (2002). A State Space Framework for Automatic Forecasting using Exponential Smoothing Methods.

https://doi.org/10.1016/S0169-2070(01)00110-8

James Bergstra, Yoshua Bengio (2012). Random Search for Hyper-Parameter Optimization.

https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

See Also

tulip(), initialize_params_random(), initialize_params_naive()

Examples

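# Generate the default grid once and inspect its first and last rows
param_grid <- initialize_params_grid()
head(param_grid, 10)
tail(param_grid, 10)

# Convert the matrix to a data frame for plotting
param_grid <- data.frame(param_grid)

library(ggplot2)
# Jitter slightly so that overlapping parameter combinations stay visible
ggplot(param_grid, aes(x = alpha, y = beta, fill = gamma)) +
  geom_jitter(pch = 21, color = "white", width = 0.01, height = 0.01)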


timradtke/heuristika documentation built on April 24, 2023, 1:55 a.m.