initialize_params_grid: Initialize parameters to optimize over based on a fixed grid
In timradtke/heuristika: Robust Probabilistic Forecasts to Tinker With

initialize_params_grid

R Documentation

Initialize parameters to optimize over based on a fixed grid

Description

Use this function to generate a parameter grid (matrix) that can be provided to the param_grid argument of tulip(). The model estimation then reduces to evaluating all provided combinations and choosing the best one. See Details for more. Note that there is nothing special about the matrix generated by this function—you can define a set of possible parameters in any way that suits you.

Usage

initialize_params_grid(
  n_alpha = 15,
  n_beta = 15,
  n_gamma = 15,
  use_damped = TRUE,
  beta_smaller_than_alpha = TRUE,
  gamma_smaller_than_one_minus_alpha = TRUE,
  use_logistic = TRUE,
  logistic_limit = 5,
  upper_limit = 0.5,
  lower_limit = 0
)

Arguments

`n_alpha`	Base number of `alpha` candidate values
`n_beta`	Base number of `beta` candidate values
`n_gamma`	Base number of `gamma` candidate values
`use_damped`	Logical (default TRUE); should damped trends be part of the parameter grid? This is implemented by subtracting a value from the `one_minus_beta` parameter such that `beta + one_minus_beta < 1`.
`beta_smaller_than_alpha`	Logical; should `beta` be constrained to always be smaller than or equal to `alpha`? See, for example, chapter 3.4.2 of Hyndman et al. (2002) referenced below.
`gamma_smaller_than_one_minus_alpha`	Logical; should `gamma` be constrained to always be smaller than or equal to `(1-alpha)`? See, for example, chapter 3.4.3 of Hyndman et al. (2002) referenced below.
`use_logistic`	Logical; should the logistic function be used to distribute the base candidate values? This is useful if the boundaries at 0 and 1 should be more closely trialed than values around 0.5. If FALSE, the base candidate values are distributed linearly between 0 and 1.
`logistic_limit`	Most extreme value passed into the logistic function; the largest value trialed besides 1 is `1 / (1 + exp(-1 * logistic_limit))`. The default is 5, corresponding to a value of 0.9933.
`upper_limit`	The largest value that any `alpha`, `beta`, and `gamma` can take on in the returned grid; default is 0.5. Note that this is applied at the very end of the function and therefore is not the most efficient way of limiting the set of possible parameters.
`lower_limit`	The lowest value that any `alpha`, `beta`, and `gamma` can take on in the returned grid; default is 0. Do not deviate from this unless you know what you're doing as the impact on the possible set of models is large due to the `beta` parameter especially. Note that this is applied at the very end of the function and therefore is not the most efficient way of limiting the set of possible parameters.
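
The effect of `use_logistic` and `logistic_limit` on the spacing of the base candidate values can be illustrated with base R alone. This is a minimal sketch, not the function's actual implementation: it assumes that the base candidates are obtained by passing an evenly spaced sequence between `-logistic_limit` and `logistic_limit` through the logistic function. All variable names are hypothetical.

```r
# Hypothetical illustration; not the package's internal code.
n_candidates <- 15
logistic_limit <- 5

# Evenly spaced inputs on [-logistic_limit, logistic_limit] ...
logistic_inputs <- seq(-logistic_limit, logistic_limit, length.out = n_candidates)

# ... mapped into (0, 1) by the logistic function 1 / (1 + exp(-x)).
candidates_logistic <- stats::plogis(logistic_inputs)

# Linear alternative: evenly spaced values between 0 and 1.
candidates_linear <- seq(0, 1, length.out = n_candidates)

# The logistic candidates lie closer together near 0 and 1 than around 0.5;
# the linear candidates are equally spaced throughout.
round(candidates_logistic, 4)
round(candidates_linear, 4)

# Largest logistic candidate, equal to 1 / (1 + exp(-logistic_limit)).
max(candidates_logistic)
```

Under this assumption, a larger `logistic_limit` pushes the outermost candidates closer to 0 and 1, while a smaller one concentrates them around 0.5.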

Details

The optimization procedure in tulip() evaluates each combination of parameters provided via param_grid. While this is computationally costly, it is also computationally stable. By consciously choosing which parameters are trialed, unstable parameter combinations can be avoided. In effect, this sets the prior probability of the excluded parameter combinations to zero. If the set of parameters can be restricted substantially (for example, because one updates from a previous fit or bases the grid on a related time series), the optimization also becomes computationally cheap.

The default grid returned does not include any 'alpha', 'beta', or 'gamma' with a value of 0.5 or higher. This reduces the set of possible models to ones that adapt relatively slowly to new observations, reducing the impact an outlier can have. Yet, a 'beta' of 0.5 can still lead to explosive jumps depending on the context. This default is very opinionated, and you might want to deviate from it using the upper_limit argument.

This function returns a grid that reuses the same n_alpha, n_beta, and n_gamma base values across all combinations. This means that the space of possible parameters is not covered very well, which is a known drawback of grid search compared to random search. To avoid this, consider drawing the set of possible parameter combinations randomly from a space of allowed values instead. See also Bergstra and Bengio (2012) referenced below.
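
A randomly drawn grid of the kind suggested above could be sketched as follows. This is an illustration in base R only, not part of the package: the constraints imitate the defaults described under Arguments (`beta <= alpha`, `gamma <= 1 - alpha`, values capped at 0.5), damping is left out for brevity, and the names `n_draws` and `random_grid` are hypothetical.

```r
# Hypothetical sketch of a randomly drawn parameter grid; the constraints
# mirror the documented defaults but are applied here by hand.
set.seed(4739)
n_draws <- 500
upper_limit <- 0.5

# Draw alpha uniformly from the allowed range.
alpha <- runif(n_draws, min = 0, max = upper_limit)

# Draw beta and gamma from ranges bounded by the constraints, so that
# beta <= alpha and gamma <= min(1 - alpha, upper_limit) hold by construction.
beta <- runif(n_draws, min = 0, max = alpha)
gamma <- runif(n_draws, min = 0, max = pmin(1 - alpha, upper_limit))

# Assemble the six named columns in the order described under Value.
random_grid <- cbind(
  alpha = alpha,
  one_minus_alpha = 1 - alpha,
  beta = beta,
  one_minus_beta = 1 - beta,
  gamma = gamma,
  one_minus_gamma = 1 - gamma
)

head(random_grid, 5)
```

A matrix built this way has the same shape as the one returned by this function and could be passed to the param_grid argument of tulip() in its place.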

Value

A numeric matrix with six named columns: 'alpha', 'one_minus_alpha', 'beta', 'one_minus_beta', 'gamma', 'one_minus_gamma'. The alpha parameters belong to the model's level component, the beta parameters to the model's trend component, and the gamma parameters to the model's seasonality component. Each pair usually adds up to 1; however, damping effectively reduces the sum of beta and one_minus_beta to less than 1. As per assertions on tulip()'s param_grid, each row must sum up to a value between 0 and 3, the columns must be named and in order, and each individual value must be between 0 and 1.
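
The requirements listed above can be checked by hand before passing a custom matrix to tulip(). The following is a minimal sketch in base R; `check_param_grid()`, `expected_columns`, and `small_grid` are hypothetical names introduced for illustration, and tulip() performs its own assertions, which may differ in detail.

```r
# Hypothetical helper; tulip() performs its own assertions internally.
expected_columns <- c(
  "alpha", "one_minus_alpha", "beta", "one_minus_beta",
  "gamma", "one_minus_gamma"
)

check_param_grid <- function(param_grid, tolerance = 1e-8) {
  # The grid must be a numeric matrix.
  if (!is.matrix(param_grid) || !is.numeric(param_grid)) {
    return(FALSE)
  }
  row_sums <- rowSums(param_grid)
  # Columns named and in order, values in [0, 1], row sums in [0, 3]
  # (with a small tolerance for floating-point error in the sums).
  identical(colnames(param_grid), expected_columns) &&
    all(param_grid >= 0 & param_grid <= 1) &&
    all(row_sums >= 0 & row_sums <= 3 + tolerance)
}

# Two hand-written rows: the first undamped, the second with a damped trend
# (beta + one_minus_beta = 0.875 < 1).
small_grid <- matrix(
  c(0.25, 0.75, 0.125, 0.875, 0.25, 0.75,
    0.25, 0.75, 0.125, 0.750, 0.125, 0.875),
  nrow = 2, byrow = TRUE,
  dimnames = list(NULL, expected_columns)
)

check_param_grid(small_grid)
rowSums(small_grid)
```

The second row shows how damping lowers the row sum below 3 while every individual value stays between 0 and 1.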

References

Rob J. Hyndman, Anne B. Koehler, Ralph D. Snyder, and Simone Grose (2002). A State Space Framework for Automatic Forecasting using Exponential Smoothing Methods. https://doi.org/10.1016/S0169-2070(01)00110-8
James Bergstra and Yoshua Bengio (2012). Random Search for Hyper-Parameter Optimization. https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

Examples

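# Inspect the first and last rows of the default grid
head(initialize_params_grid(), 10)
tail(initialize_params_grid(), 10)

# Convert the matrix to a data frame for plotting
param_grid <- data.frame(initialize_params_grid())

# Plot the alpha/beta combinations, filled by gamma; jitter separates
# points that share the same alpha and beta
library(ggplot2)
ggplot(param_grid, aes(x = alpha, y = beta, fill = gamma)) +
  geom_jitter(pch = 21, color = "white", width = 0.01, height = 0.01)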

timradtke/heuristika documentation built on April 24, 2023, 1:55 a.m.