proteus_random_search: proteus_random_search
In proteus: Multiform Seq2Seq Model for Time-Feature Analysis

View source: R/main2.R

Description

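proteus_random_search fine-tunes proteus models by random search over the hyper-parameter space. The search uses the predefined ranges for any hyper-parameter left at NULL and the supplied values for any hyper-parameter set explicitly.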

Usage

proteus_random_search(
  n_samp,
  data,
  target,
  future,
  past = NULL,
  ci = 0.8,
  smoother = FALSE,
  t_embed = NULL,
  activ = NULL,
  nodes = NULL,
  distr = NULL,
  optim = NULL,
  epochs = 30,
  lr = NULL,
  patience = 10,
  latent_sample = 100,
  verbose = TRUE,
  stride = NULL,
  dates = NULL,
  rolling_blocks = FALSE,
  n_blocks = 4,
  block_minset = 10,
  error_scale = "naive",
  error_benchmark = "naive",
  batch_size = 30,
  min_default = 1,
  seed = 42,
  future_plan = "future::multisession",
  omit = FALSE,
  keep = FALSE
)
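
As a rough illustration of the signature above, the sketch below builds a small synthetic data frame and assembles the arguments for a three-model search. The data, the column names `x` and `y`, the chosen argument values, and the `run_search` switch are all assumptions made for this example. The call itself is guarded because it needs the proteus package and its backend installed, and training may take a while; whether these particular settings suit a series of this length has not been checked.

```r
# Synthetic two-feature series; names and values are illustrative only.
set.seed(42)
n <- 200
toy <- data.frame(
  x = cumsum(rnorm(n)),
  y = sin(seq_len(n) / 10) + rnorm(n, sd = 0.1)
)

# Arguments not listed here keep the defaults shown in the signature;
# those defaulting to NULL are sampled by the random search.
search_args <- list(
  n_samp = 3,
  data = toy,
  target = c("x", "y"),
  future = 10,
  epochs = 5,
  n_blocks = 2,
  future_plan = "future::sequential",
  seed = 42
)

# Hypothetical switch: set to TRUE to actually train the models.
run_search <- FALSE
if (run_search && requireNamespace("proteus", quietly = TRUE)) {
  result <- do.call(proteus::proteus_random_search, search_args)
  print(result$random_search)
}
```

Passing the arguments through `do.call` keeps the configuration inspectable before any training starts; a direct call with named arguments is equivalent.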

Arguments

`n_samp`	Positive integer. Number of models to be randomly generated sampling the hyper-parameter space.
`data`	A data frame with time features on columns and possibly a date column (not mandatory).
`target`	Vector of strings. Names of the time features to be jointly analyzed.
`future`	Positive integer. The future dimension with number of time-steps to be predicted.
`past`	Positive integer. Length of past sequences. Default: NULL (search range future:2*future).
`ci`	Positive numeric. Confidence level of the interval (between 0 and 1). Default: 0.8.
`smoother`	Logical. Perform optimal smoothing using standard loess for each time feature. Default: FALSE.
`t_embed`	Positive integer. Number of embeddings for the temporal dimension. Minimum value is 2. Default: NULL (search range 2:30).
`activ`	String. Activation function to be used by the forward network. Implemented functions are: "linear", "mish", "swish", "leaky_relu", "celu", "elu", "gelu", "selu", "bent", "softmax", "softmin", "softsign", "softplus", "sigmoid", "tanh". Default: NULL (full-option search).
`nodes`	Positive integer. Nodes for the forward neural net. Default: NULL (search range 2:1024).
`distr`	String. Distribution to be used by variational model. Implemented distributions are: "normal", "cauchy", "gumbel", "laplace", "rayleigh". Default: NULL (full-option search).
`optim`	String. Optimization method. Implemented methods are: "adadelta", "adagrad", "rmsprop", "rprop", "sgd", "asgd", "adam". Default: NULL (full-option search).
`epochs`	Positive integer. Number of training epochs. Default: 30.
`lr`	Positive numeric. Learning rate. Default: NULL (search range 0.001 to 0.1).
`patience`	Positive integer. Waiting time (in epochs) before evaluating the overfit performance. Default: 10.
`latent_sample`	Positive integer. Number of samples to draw from the latent variables. Default: 100.
`verbose`	Logical. Default: TRUE.
`stride`	Positive integer. Number of shifting positions for sequence generation. Default: NULL (search range 1:3).
`dates`	String. Label of feature where dates are located. Default: NULL (progressive numbering).
`rolling_blocks`	Logical. Option for incremental or rolling window. Default: FALSE.
`n_blocks`	Positive integer. Number of distinct blocks for back-testing. Default: 4.
`block_minset`	Positive integer. Minimum number of sequences needed to create a block. Default: 10.
`error_scale`	String. Scale for the scaled error metrics (for continuous variables). Two options: "naive" (average of naive one-step absolute error for the historical series) or "deviation" (standard error of the historical series). Default: "naive".
`error_benchmark`	String. Benchmark for the relative error metrics (for continuous variables). Two options: "naive" (sequential extension of last value) or "average" (mean value of true sequence). Default: "naive".
`batch_size`	Positive integer. Size of the training batches. Default: 30.
`min_default`	Positive numeric. Minimum differentiation iteration. Default: 1.
`seed`	Random seed. Default: 42.
`future_plan`	String. How to resolve the future parallelization. Options are: "future::sequential", "future::multisession", "future::multicore". For more information, see the documentation of the future package. Default: "future::multisession".
`omit`	Logical. Set to TRUE to remove missing values; otherwise all gaps, both in dates and in values, are filled using a Kalman filter. Default: FALSE.
`keep`	Logical. Flag to TRUE to keep all the explored models. Default: FALSE.
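
To make the search ranges listed above concrete, the following base R sketch draws random configurations from them. It is an illustration only, not the package's internal sampler: the helper name `sample_config` is hypothetical, and uniform sampling (in particular for `lr`) is an assumption.

```r
# Illustrative only: draw one configuration from the documented default ranges.
sample_config <- function(future) {
  list(
    past    = sample(future:(2 * future), 1),
    t_embed = sample(2:30, 1),
    activ   = sample(c("linear", "mish", "swish", "leaky_relu", "celu",
                       "elu", "gelu", "selu", "bent", "softmax", "softmin",
                       "softsign", "softplus", "sigmoid", "tanh"), 1),
    nodes   = sample(2:1024, 1),
    distr   = sample(c("normal", "cauchy", "gumbel", "laplace", "rayleigh"), 1),
    optim   = sample(c("adadelta", "adagrad", "rmsprop", "rprop",
                       "sgd", "asgd", "adam"), 1),
    lr      = runif(1, min = 0.001, max = 0.1),  # assumed uniform
    stride  = sample(1:3, 1)
  )
}

set.seed(42)
configs <- lapply(seq_len(3), function(i) sample_config(future = 10))

# One row per sampled model, one column per hyper-parameter.
config_table <- do.call(rbind, lapply(configs, as.data.frame))
print(config_table)
```

Fixing an argument in the real call (for example `optim = "adam"`) corresponds to removing that entry from the sampled set, so every generated model shares the same value.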

Value

This function returns a list including:

random_search: summary of the sampled hyper-parameters and average error metrics.
best: best model according to overall ranking on all average error metrics (for negative metrics, absolute value is considered).
all_models: list with all generated models (returned only if keep is set to TRUE).
time_log: computation time.
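
The selection rule described for best can be sketched in base R as follows. The metric names and values are made up, and averaging the per-metric ranks is an assumption about how the "overall ranking" is aggregated; the only detail taken from the description above is that metrics are compared on their absolute value.

```r
# Hypothetical summary table: one row per sampled model, one column per metric.
metrics <- data.frame(
  model = 1:4,
  me    = c(-0.50, 0.10, -0.05, 0.30),  # signed metric, may be negative
  mae   = c(1.20, 0.80, 0.90, 1.50),
  rmse  = c(1.60, 1.10, 1.00, 1.90)
)

# Rank every metric on its absolute value (smaller is better).
metric_cols <- setdiff(names(metrics), "model")
ranks <- sapply(metrics[metric_cols], function(m) rank(abs(m)))

# Assumed aggregation: average the ranks across metrics, lowest score wins.
metrics$overall <- rowMeans(ranks)
best_model <- metrics$model[which.min(metrics$overall)]
print(metrics)
print(best_model)
```

Taking absolute values before ranking prevents a strongly negative signed error from being treated as better than a small positive one.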