proteus_random_search: proteus_random_search

proteus_random_search    R Documentation

proteus_random_search

Description

proteus_random_search fine-tunes proteus models by random search over the hyper-parameter space (predefined ranges or custom values).

Usage

proteus_random_search(
  n_samp,
  data,
  target,
  future,
  past = NULL,
  ci = 0.8,
  smoother = FALSE,
  t_embed = NULL,
  activ = NULL,
  nodes = NULL,
  distr = NULL,
  optim = NULL,
  loss_metric = "crps",
  epochs = 30,
  lr = NULL,
  patience = 10,
  latent_sample = 100,
  verbose = TRUE,
  stride = NULL,
  dates = NULL,
  rolling_blocks = FALSE,
  n_blocks = 4,
  block_minset = 10,
  error_scale = "naive",
  error_benchmark = "naive",
  batch_size = 30,
  min_default = 1,
  seed = 42,
  future_plan = "future::multisession",
  omit = FALSE,
  keep = FALSE
)
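
A minimal call might look like the sketch below. The data frame df, its column names, and the chosen argument values are illustrative assumptions and do not come from the package documentation; only the argument names are taken from the usage above. The call itself is guarded so that it runs only in an interactive session where proteus is installed, since training can take a while.

```r
# Hypothetical toy data: two numeric series (names are illustrative only).
set.seed(42)
df <- data.frame(
  x = cumsum(rnorm(200)),
  y = cumsum(rnorm(200))
)

# Arguments named as in the usage above; values are arbitrary small choices.
args <- list(
  n_samp = 3,                          # number of random configurations
  data = df,
  target = c("x", "y"),                # series to analyze jointly
  future = 10,                         # time steps to predict
  epochs = 5,
  future_plan = "future::sequential",  # avoid spawning parallel workers
  seed = 42
)

# Run only interactively and only if the package is available.
if (interactive() && requireNamespace("proteus", quietly = TRUE)) {
  res <- do.call(proteus::proteus_random_search, args)
  print(names(res))
}
```

Leaving past, t_embed, activ, nodes, distr, optim, lr and stride at NULL lets the function search their documented default ranges.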

Arguments

n_samp

Positive integer. Number of models to be randomly generated sampling the hyper-parameter space.

data

A data frame with time features in columns and, optionally, a date column.

target

Vector of strings. Names of the time features to be jointly analyzed.

future

Positive integer. Number of future time steps to be predicted.

past

Positive integer. Length of past sequences. Default: NULL (search range future:2*future).

ci

Positive numeric. Confidence interval. Default: 0.8.

smoother

Logical. Perform optimal smoothing using standard loess for each time feature. Default: FALSE.

t_embed

Positive integer. Number of embeddings for the temporal dimension. Minimum value is 2. Default: NULL (search range 2:30).

activ

String. Activation function to be used by the forward network. Implemented functions are: "linear", "mish", "swish", "leaky_relu", "celu", "elu", "gelu", "selu", "bent", "softmax", "softmin", "softsign", "softplus", "sigmoid", "tanh". Default: NULL (full-option search).

nodes

Positive integer. Nodes for the forward neural net. Default: NULL (search range 2:1024).

distr

String. Distribution to be used by variational model. Implemented distributions are: "normal", "genbeta", "gev", "gpd", "genray", "cauchy", "exp", "logis", "chisq", "gumbel", "laplace", "lognorm", "skewed". Default: NULL (full-option search).

optim

String. Optimization method. Implemented methods are: "adadelta", "adagrad", "rmsprop", "rprop", "sgd", "asgd", "adam". Default: NULL (full-option search).

loss_metric

String. Loss function for the variational model. Three options: "elbo", "crps", "score". Default: "crps".

epochs

Positive integer. Number of training epochs. Default: 30.

lr

Positive numeric. Learning rate. Default: NULL (search range 0.001:0.1).

patience

Positive integer. Waiting time (in epochs) before evaluating the overfit performance. Default: 10.

latent_sample

Positive integer. Number of samples to draw from the latent variables. Default: 100.

verbose

Logical. Default: TRUE.

stride

Positive integer. Number of shifting positions for sequence generation. Default: NULL (search range 1:3).

dates

String. Label of feature where dates are located. Default: NULL (progressive numbering).

rolling_blocks

Logical. Option for incremental or rolling window. Default: FALSE.

n_blocks

Positive integer. Number of distinct blocks for back-testing. Default: 4.

block_minset

Positive integer. Minimum number of sequences to create a block. Default: 10.

error_scale

String. Scale for the scaled error metrics (for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive".

error_benchmark

String. Benchmark for the relative error metrics (for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive".

batch_size

Positive integer. Batch size for training. Default: 30.

min_default

Positive numeric. Minimum differentiation iteration. Default: 1.

seed

Random seed. Default: 42.

future_plan

String. How to resolve the future parallelization. Options are: "future::sequential", "future::multisession", "future::multicore". For more information, see the documentation of the future package. Default: "future::multisession".

omit

Logical. Set to TRUE to remove missing values; otherwise all gaps, both in dates and values, are filled with a Kalman filter. Default: FALSE.

keep

Logical. Set to TRUE to keep all the explored models. Default: FALSE.
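
The search ranges listed for the NULL defaults can be pictured with the base R sketch below, which draws n_samp random configurations from them. The helper sample_configs is hypothetical, and uniform sampling with replacement is an assumption: the documentation does not state how proteus samples the space internally.

```r
# Sketch: draw n_samp random configurations from the documented default ranges.
# Uniform sampling is an assumption; the package may sample differently.
sample_configs <- function(n_samp, future, seed = 42) {
  set.seed(seed)
  activ_opts <- c("linear", "mish", "swish", "leaky_relu", "celu", "elu", "gelu",
                  "selu", "bent", "softmax", "softmin", "softsign", "softplus",
                  "sigmoid", "tanh")
  distr_opts <- c("normal", "genbeta", "gev", "gpd", "genray", "cauchy", "exp",
                  "logis", "chisq", "gumbel", "laplace", "lognorm", "skewed")
  optim_opts <- c("adadelta", "adagrad", "rmsprop", "rprop", "sgd", "asgd", "adam")
  data.frame(
    past    = sample(future:(2 * future), n_samp, replace = TRUE),
    t_embed = sample(2:30, n_samp, replace = TRUE),
    activ   = sample(activ_opts, n_samp, replace = TRUE),
    nodes   = sample(2:1024, n_samp, replace = TRUE),
    distr   = sample(distr_opts, n_samp, replace = TRUE),
    optim   = sample(optim_opts, n_samp, replace = TRUE),
    lr      = runif(n_samp, min = 0.001, max = 0.1),
    stride  = sample(1:3, n_samp, replace = TRUE),
    stringsAsFactors = FALSE
  )
}

# Five candidate configurations for a 10-step forecast horizon.
configs <- sample_configs(n_samp = 5, future = 10)
print(configs)
```

Passing an explicit value for any of these arguments fixes it to that custom value, so only the remaining NULL arguments are searched.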

Value

This function returns a list including:

  • random_search: summary of the sampled hyper-parameters and average error metrics.

  • best: best model according to overall ranking on all average error metrics (for negative metrics, absolute value is considered).

  • all_models: list with all generated models (if keep is set to TRUE).

  • time_log: computation time.
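
The selection rule for best can be illustrated with a small base R sketch: each metric is ranked on its absolute value and the ranks are combined. The table summary_tbl, its metric names, and its values are made up for illustration, and summing the per-metric ranks is an assumption about how the overall ranking is aggregated.

```r
# Hypothetical summary table: one row per sampled model, columns are
# average error metrics (names and values are made up for illustration).
summary_tbl <- data.frame(
  model = c("m1", "m2", "m3"),
  me    = c(-0.50, 0.10, 0.30),
  mae   = c(0.60, 0.40, 0.50),
  rmse  = c(0.80, 0.55, 0.70)
)

metric_cols <- setdiff(names(summary_tbl), "model")

# Rank each metric on its absolute value (smaller is better), then sum ranks.
ranks <- sapply(summary_tbl[metric_cols], function(m) rank(abs(m)))
overall <- rowSums(ranks)

# The model with the lowest combined rank is taken as the best one.
best_model <- summary_tbl$model[which.min(overall)]
print(best_model)
```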

Author(s)

Giancarlo Vercellino giancarlo.vercellino@gmail.com

References

https://rpubs.com/giancarlo_vercellino/proteus


proteus documentation built on Oct. 22, 2023, 1:15 a.m.