initialize_params_random: Initialize parameters to optimize over randomly

View source: R/initialize_params.R

initialize_params_random    R Documentation

Initialize parameters to optimize over randomly

Description

Use this function to generate a randomly sampled parameter grid (matrix) that can be provided to the param_grid argument of tulip(). The model estimation then reduces to evaluating all provided combinations and choosing the best one. See Details for more. Note that there is nothing special about the matrix generated by this function—you can define a set of possible parameters in any way that suits you.

Usage

initialize_params_random(
  n_damped = 1000,
  n = n_damped/2,
  n_no_trend = ceiling(n_damped^(2/3)),
  n_no_season = ceiling(n_damped^(2/3)),
  n_no_trend_no_season = ceiling(n_damped^(1/3)),
  alpha_lower = 0,
  alpha_upper = 1,
  beta_lower = 0,
  beta_upper = 1,
  gamma_lower = 0,
  gamma_upper = 1,
  beta_smaller_than_alpha = TRUE,
  gamma_smaller_than_one_minus_alpha = TRUE,
  oversample_lower = 0.05,
  oversample_upper = 0.05,
  seed = NULL
)

Arguments

n_damped

Number of parameter combinations to sample that include all three states (level, trend, seasonality) and dampening of the trend. Note: By default, n, n_no_trend, n_no_season, n_no_trend_no_season are functions of the value chosen for n_damped.

n

Number of parameter combinations to sample that include all three states (level, trend, seasonality), but no dampening.

n_no_trend

Number of parameter combinations to sample that set the trend parameters to 0, implying models that don't have a trend component; since only two dimensions are sampled (level and seasonality), it usually makes sense to use a smaller n_no_trend than n (unless you are not interested in models with trend).

n_no_season

Number of parameter combinations to sample that set the seasonality parameters to 0, implying models that don't have a seasonality component; since only two dimensions are sampled (level and trend), it usually makes sense to use a smaller n_no_season than n (unless you are not interested in models with seasonality).

n_no_trend_no_season

Number of parameter combinations to sample that set the trend and seasonality parameters to 0, implying models that have neither a trend nor a seasonality component; since only one dimension is sampled (level), it usually makes sense to use a smaller n_no_trend_no_season than n (unless you are not interested in models with trend and seasonality).

alpha_lower

A scalar value defining the lowest possible value for the alpha parameter. Can't be less than 0.

alpha_upper

A scalar value defining the largest possible value for the alpha parameter. The default is 1, but values larger than 1 are possible. Can't be less than alpha_lower. If alpha_lower is equal to alpha_upper, all samples for alpha will be equal to alpha_lower and alpha_upper exactly.

beta_lower

A scalar value defining the lowest possible value for the beta parameter. Can't be less than 0.

beta_upper

A scalar value defining the largest possible value for the beta parameter. The default is 1, but values larger than 1 are possible. Can't be less than beta_lower. If beta_lower is equal to beta_upper, all samples for beta will be equal to beta_lower and beta_upper exactly.

gamma_lower

A scalar value defining the lowest possible value for the gamma parameter. Can't be less than 0.

gamma_upper

A scalar value defining the largest possible value for the gamma parameter. The default is 1, but values larger than 1 are possible. Can't be less than gamma_lower. If gamma_lower is equal to gamma_upper, all samples for gamma will be equal to gamma_lower and gamma_upper exactly.

beta_smaller_than_alpha

If TRUE (default), sampling of beta is conditional on the value of the sampled alpha, using pmin(alpha, beta_upper) as the upper limit for beta.

gamma_smaller_than_one_minus_alpha

If TRUE (default), sampling of gamma is conditional on the value of the sampled alpha, using pmin(1 - alpha, gamma_upper) as the upper limit for gamma.

oversample_lower

Can be used to increase the chances that the lowest allowed value is sampled for the parameters alpha, beta, and gamma. This can be useful to find parameter combinations in which a component is not updated at all and thus stays at a constant average across all observations. For example, an ETS model with only a level component and alpha = 0 would be equivalent to the mean forecast.

oversample_upper

Can be used to increase the chances that the largest allowed value is sampled for the parameters alpha and gamma. This can be useful to find parameter combinations in which a component adjusts almost entirely to the latest observation. This turns the level component into behavior similar to a random walk forecast (if alpha_upper = 1), and the seasonal component into behavior similar to a seasonal random walk forecast (if gamma_upper = 1).

seed

Since the parameter grid is sampled randomly, you can set a seed (local to the function) for reproducibility.
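
The conditional upper limits described for beta_smaller_than_alpha and gamma_smaller_than_one_minus_alpha can be illustrated with base R. The following is a minimal sketch under the assumption of plain uniform sampling; it is not the package's implementation, it ignores oversampling, and all variable names are chosen for illustration only.

```r
set.seed(388)  # arbitrary seed, only to make this sketch reproducible

n_samples <- 5
alpha_lower <- 0; alpha_upper <- 1
beta_lower <- 0;  beta_upper <- 1
gamma_lower <- 0; gamma_upper <- 1

# Sample alpha first; beta and gamma depend on it
alpha <- runif(n_samples, min = alpha_lower, max = alpha_upper)

# beta_smaller_than_alpha = TRUE: cap beta at the sampled alpha
beta <- runif(n_samples, min = beta_lower, max = pmin(alpha, beta_upper))

# gamma_smaller_than_one_minus_alpha = TRUE: cap gamma at 1 - alpha
gamma <- runif(n_samples, min = gamma_lower, max = pmin(1 - alpha, gamma_upper))

cbind(alpha, beta, gamma)
```

Because runif() is vectorized over its max argument, each beta and gamma draw gets its own upper limit derived from the alpha in the same row.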

Details

The optimization procedure in tulip() evaluates each combination of parameters provided via param_grid. While this is computationally costly, it is also computationally stable. By consciously choosing the parameters that are trialled, unstable parameter combinations can be avoided. The prior probability for many parameter combinations can be set to zero this way. If the set of parameters can be narrowed down considerably (for example, because one updates from a previous fit or based on a related time series), it also makes the optimization computationally cheap.
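
The idea of evaluating every provided combination and keeping the best one can be sketched in base R. The example below is an assumption-laden illustration: ses_loss() is a hypothetical stand-in objective (the one-step-ahead squared error of simple exponential smoothing), not the objective used by tulip(), and the data and candidate values are made up.

```r
# Hypothetical stand-in for a model fit: mean squared one-step-ahead error
# of simple exponential smoothing (level only). NOT tulip()'s objective.
ses_loss <- function(y, alpha) {
  level <- y[1]
  sq_err <- 0
  for (t in 2:length(y)) {
    sq_err <- sq_err + (y[t] - level)^2
    level <- alpha * y[t] + (1 - alpha) * level
  }
  sq_err / (length(y) - 1)
}

y <- c(10, 12, 11, 13, 12, 14, 13, 15)  # made-up series

# A tiny candidate set; any matrix of candidate values would do
candidates <- cbind(alpha = c(0, 0.25, 0.5, 0.75, 1))

# Evaluate every candidate row and keep the one with the smallest loss
losses <- vapply(
  seq_len(nrow(candidates)),
  function(i) ses_loss(y, candidates[i, "alpha"]),
  numeric(1)
)
best <- candidates[which.min(losses), , drop = FALSE]
```

No gradient or iterative solver is involved, so the search cannot diverge; its cost grows linearly with the number of rows evaluated.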

In contrast to initialize_params_grid(), this function draws random combinations of alpha, beta, and gamma from an allowed space of values. This can allow for better overall optimization of the model, as the overall space of possible parameters is covered better. See also Bergstra and Bengio (2012) referenced below for a comparison of grid search and random search.
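
One way to see why random sampling can cover the space better is to count how many distinct values each parameter takes under a fixed budget. The sketch below uses base R only and made-up settings (a budget of 9 combinations over alpha and gamma); it does not call initialize_params_grid() or initialize_params_random().

```r
set.seed(1)  # arbitrary seed, only to make this sketch reproducible

budget <- 9

# Grid search: 3 x 3 combinations of alpha and gamma
grid <- expand.grid(alpha = c(0, 0.5, 1), gamma = c(0, 0.5, 1))

# Random search: the same number of combinations, drawn uniformly
random <- data.frame(alpha = runif(budget), gamma = runif(budget))

# Number of distinct alpha values tried under each approach
n_alpha_grid <- length(unique(grid$alpha))
n_alpha_random <- length(unique(random$alpha))
```

With the same nine evaluations, the grid tries only three distinct values of alpha, while the random draws try nine. This matters when one parameter influences the fit much more than the others.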

Depending on the chosen function arguments, the function is likely to generate some duplicate parameter combinations (for example, when oversample_upper or oversample_lower are non-zero). These will be dropped before the final matrix is returned. This means, however, that the function is not guaranteed to return n + n_damped + n_no_trend + n_no_season + n_no_trend_no_season parameter combinations. It might return fewer than that.
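
The upper bound on the number of returned rows can be computed by hand from the default formulas shown in Usage. The sketch below does so for n_damped = 46 and then shows one possible way of dropping duplicate rows from a toy matrix with unique(); whether the package uses unique() internally is not stated here, and the toy matrix is made up.

```r
# Default sample sizes as given in the Usage section, for n_damped = 46
n_damped <- 46
n <- n_damped / 2
n_no_trend <- ceiling(n_damped^(2/3))
n_no_season <- ceiling(n_damped^(2/3))
n_no_trend_no_season <- ceiling(n_damped^(1/3))

# Upper bound on the number of rows; duplicates can only reduce it
max_rows <- n + n_damped + n_no_trend + n_no_season + n_no_trend_no_season

# Illustration only: dropping duplicate rows from a toy matrix
toy <- rbind(
  c(alpha = 0, gamma = 0),
  c(alpha = 0, gamma = 0),
  c(alpha = 1, gamma = 0)
)
toy_unique <- unique(toy)
```

Here max_rows is 46 + 23 + 13 + 13 + 4 = 99, and toy_unique keeps two of the three toy rows.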

One can also combine a fixed set of parameters with randomly drawn parameters, for example to always evaluate parameter combinations known to provide good results for other time series, to also evaluate parameters that were found at a previous training on the same time series, or to include a set of benchmark models via initialize_params_naive(). See also the examples below.
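
As a small illustration of carrying over parameters from an earlier fit, a hand-written row can be stacked on top of other rows with rbind(). The values below are made up, previous_fit is a hypothetical name, and random_rows is a base R stand-in for the output of initialize_params_random() so that the sketch runs without the package.

```r
cols <- c("alpha", "one_minus_alpha", "beta", "one_minus_beta",
          "gamma", "one_minus_gamma")

# Hypothetical parameters, e.g. carried over from an earlier fit
previous_fit <- matrix(
  c(0.3, 0.7, 0.05, 0.95, 0.1, 0.9),
  nrow = 1,
  dimnames = list(NULL, cols)
)

# Stand-in for randomly drawn rows (undamped, so each pair sums to 1)
set.seed(1)  # arbitrary seed for this sketch
alpha <- runif(3)
beta <- runif(3)
gamma <- runif(3)
random_rows <- cbind(alpha, 1 - alpha, beta, 1 - beta, gamma, 1 - gamma)
colnames(random_rows) <- cols

# Fixed rows first, random rows after; column order must match
param_grid_combined <- rbind(previous_fit, random_rows)
```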

Value

A numeric matrix with six named columns: 'alpha', 'one_minus_alpha', 'beta', 'one_minus_beta', 'gamma', 'one_minus_gamma'. The alpha parameters belong to the model's level component, the beta parameters to the model's trend component, and the gamma parameters to the model's seasonality component. Each pair usually adds up to 1; however, dampening effectively reduces the sum of beta and one_minus_beta to less than 1. As per assertions on tulip()'s param_grid, each row must sum up to a value between 0 and 3, the columns must be named and in order, and each individual value must be between 0 and 1.
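
When a parameter matrix is defined by hand, the properties listed above can be checked before passing it on. The helper below, check_param_grid(), is hypothetical and not part of the package; it only mirrors the conditions stated in this section, and the example rows are made up.

```r
# Hypothetical helper, not part of the package: checks the properties
# listed above for a candidate parameter matrix.
check_param_grid <- function(param_grid) {
  expected <- c("alpha", "one_minus_alpha", "beta", "one_minus_beta",
                "gamma", "one_minus_gamma")
  is.matrix(param_grid) &&
    is.numeric(param_grid) &&
    identical(colnames(param_grid), expected) &&
    all(param_grid >= 0 & param_grid <= 1) &&
    all(rowSums(param_grid) >= 0 & rowSums(param_grid) <= 3)
}

# A damped-trend row: beta + one_minus_beta sums to less than 1
ok <- matrix(
  c(0.3, 0.7, 0.05, 0.90, 0.1, 0.9),
  nrow = 1,
  dimnames = list(NULL, c("alpha", "one_minus_alpha", "beta",
                          "one_minus_beta", "gamma", "one_minus_gamma"))
)

# The same row with an out-of-range alpha
bad <- ok
bad[1, "alpha"] <- 1.2

check_param_grid(ok)   # TRUE
check_param_grid(bad)  # FALSE
```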

References

Rob J. Hyndman, Anne B. Koehler, Ralph D. Snyder, and Simone Grose (2002). A State Space Framework for Automatic Forecasting using Exponential Smoothing Methods.

https://doi.org/10.1016/S0169-2070(01)00110-8

James Bergstra and Yoshua Bengio (2012). Random Search for Hyper-Parameter Optimization.

https://www.jmlr.org/papers/volume13/bergstra12a/bergstra12a.pdf

See Also

tulip(), initialize_params_grid(), initialize_params_naive()

Examples

library(ggplot2)

param_grid_small <- initialize_params_random(
  n_damped = 46,
  seed = 388
)

nrow(param_grid_small)

summary(param_grid_small[, "alpha"])
summary(param_grid_small[, "beta"])
summary(param_grid_small[, "one_minus_beta"])
summary(param_grid_small[, "gamma"])

ggplot(as.data.frame(param_grid_small),
       aes(x = alpha, y = gamma, fill = one_minus_beta)) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 1, slope = -1, linetype = 3) +
  geom_point(pch = 21, color = "white")

# Nothing prevents you from combining a set of randomly drawn parameter
# combinations with a fixed set of parameters; for example, you can always
# evaluate parameters that correspond to the Random Walk, Seasonal Random
# Walk, or Mean model:

param_grid_w_naive <- rbind(
  initialize_params_naive(),
  param_grid_small
)

head(param_grid_w_naive)

# note the new dots in the corners at (0, 0) and (0, 1)
ggplot(as.data.frame(param_grid_w_naive),
       aes(x = alpha, y = gamma, fill = one_minus_beta)) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 1, slope = -1, linetype = 3) +
  geom_point(pch = 21, color = "white")

# More samples cover the possible parameter space better
param_grid <- initialize_params_random(
  n_damped = 1000,
  seed = 388
)

nrow(param_grid)

summary(param_grid[, "alpha"])
summary(param_grid[, "beta"])
summary(param_grid[, "one_minus_beta"])
summary(param_grid[, "gamma"])

ggplot(as.data.frame(param_grid),
       aes(x = alpha, y = gamma, fill = one_minus_beta)) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 1, slope = -1, linetype = 3) +
  geom_point(pch = 21, color = "white")

# by default, we oversample the borders; this can be turned off to not
# sample 0- and 1-valued parameters as often
param_grid_no_border_sampling <- initialize_params_random(
  n_damped = 1000,
  seed = 388,
  oversample_lower = 0,
  oversample_upper = 0
)

summary(param_grid_no_border_sampling[, "alpha"])
summary(param_grid_no_border_sampling[, "beta"])
summary(param_grid_no_border_sampling[, "one_minus_beta"])
summary(param_grid_no_border_sampling[, "gamma"])

ggplot(as.data.frame(param_grid_no_border_sampling),
       aes(x = alpha, y = gamma, fill = one_minus_beta)) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 1, slope = -1, linetype = 3) +
  geom_point(pch = 21, color = "white")

# The parameter space can be limited; sampling remains uniform
param_grid_restricted <- initialize_params_random(
  n_damped = 1000,
  seed = 388,
  alpha_upper = 0.5,
  beta_upper = 0.05,
  gamma_upper = 0.5,
  oversample_lower = 0.05,
  oversample_upper = 0
)

nrow(param_grid_restricted)

summary(param_grid_restricted[, "alpha"])
summary(param_grid_restricted[, "beta"])
summary(param_grid_restricted[, "one_minus_beta"])
summary(param_grid_restricted[, "gamma"])

ggplot(as.data.frame(param_grid_restricted),
       aes(x = alpha, y = gamma, fill = one_minus_beta)) +
  coord_cartesian(xlim = c(0,1), ylim = c(0,1)) +
  geom_abline(intercept = 1, slope = -1, linetype = 3) +
  geom_point(pch = 21, color = "white")


timradtke/heuristika documentation built on April 24, 2023, 1:55 a.m.