ExtraOpt: Cross-Entropy-based Hybrid Optimization

Description Usage Arguments Value Examples

Description

This function optimizes over any input type: continuous, ordinal, or discrete/categorical. Simplex-constrained optimization is not yet implemented (multivariate constraints which are not univariate constraints are not supported yet). The optimizer tries to preserve the discrete distribution, and as such can be used to reduce the dimensionality of a supervised machine learning model (feature selection) while optimizing performance. To get an overview of how to structure the three functions you must supply, check .ExtraOpt_trainer, .ExtraOpt_estimate, and .ExtraOpt_prob. For plotting, check .ExtraOpt_plot for an example.
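The three user-supplied functions follow simple contracts. Below is a minimal, hypothetical sketch of their expected shapes (my_trainer, my_estimator, and my_prob are illustrative names only; the bundled .ExtraOpt_trainer, .ExtraOpt_estimate, and .ExtraOpt_prob are the reference implementations):

```r
# Hypothetical skeletons illustrating the expected contracts.

# f_train: receives the candidate variables (plus anything passed
# through ...) and returns a single numeric loss, or errorCode on failure.
my_trainer <- function(...) {
  0.5  # placeholder: train a model here and return its loss
}

# f_est: fits a model estimating the loss from the tried priors; returns
# a list with Model (used by f_prob) and Error (the estimator's own loss).
my_estimator <- function(priors) {
  fit <- lm(Loss ~ ., data = as.data.frame(priors))  # assumes a Loss column
  list(Model = fit, Error = summary(fit)$sigma)
}

# f_prob: predicts the estimated loss of a new prior vector.
my_prob <- function(model, prior) {
  predict(model, newdata = as.data.frame(t(prior)))
}
```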

Usage

ExtraOpt(f_train = .ExtraOpt_trainer, ..., f_est = .ExtraOpt_estimate,
  f_prob = .ExtraOpt_prob, preInit = NULL, Ninit = 50L, Nmax = 200,
  Nimprove = 10, elites = 0.9, max_elites = 150, tested_elites = 5,
  elites_converge = 10, CEmax = 200, CEiter = 20, CEelite = 0.1,
  CEimprove = 3, CEexploration_cont = 2, CEexploration_disc = c(2, 5),
  CEexploration_decay = 0.98, maximize = TRUE, best = NULL,
  cMean = NULL, cSD = NULL, cOrdinal = NULL, cMin = NULL, cMax = NULL,
  cThr = 0.001, dProb = NULL, dThr = 0.999, priorsC = NULL,
  priorsD = NULL, errorCode = -9999, autoExpVar = FALSE,
  autoExpFile = NULL, verbose = 1, plot = NULL, debug = FALSE)

Arguments

f_train

Type: function. The training function, which returns the loss at the end. All arguments provided to ExtraOpt in ... are provided to f_train. Defaults to .ExtraOpt_trainer, which is a sample xgboost trainer.

...

Type: any. Arguments to pass to f_train.

f_est

Type: function. The supervised machine learning estimator function for the variables to optimize. It must return a list with Model as the model to use for f_prob, and Error as the loss of the estimator model. Defaults to .ExtraOpt_estimate, which is a sample xgboost variable estimator.

f_prob

Type: function. The predictor function for the supervised machine learning function. It takes the model from f_est and a prior vector as inputs, and returns the predicted loss from f_est. Defaults to .ExtraOpt_prob, which is a sample xgboost estimator prediction.

preInit

Type: boolean. Whether a prior list is already computed to be used instead of the initialization. Set Ninit accordingly if you use a pre-initialized priors matrix. Defaults to NULL.

Ninit

Type: integer. The number of initialization iterations. It is best to use at least twice as many initializations as variables: for instance, 50 features call for Ninit = 100, even if this does not guarantee the best result. Defaults to 50L.

Nmax

Type: integer. The maximum number of iterations allotted to optimize the variables provided against the loss. Once this amount of iterations is reached (excluding error code iterations), the function stops. Defaults to 200.

Nimprove

Type: integer. The maximum number of iterations allotted to optimize without improvement. Defaults to 10.

elites

Type: numeric. The percentage of iteration samples retained in the parameter estimator. The larger elites is, the lower the risk of getting stuck at a local optimum; conversely, a very low elite amount gets stuck at a local optimum quickly and can overfit. After the initialization, a minimum of 5 sampled elites is mandatory: for instance, if Ninit = 100, then elites >= 0.05 is required. It should not be higher than 1. If the sampling results in a non-integer number of elites, it is rounded up; if it results in fewer than 5 elites, it is raised back to 5. Defaults to 0.90.

max_elites

Type: integer. The maximum allowed number of elite samples. Setting this value low increases the convergence speed, at the expense of exploration. It is not recommended to increase it over 5000, as it severely slows down the next prior optimization. When elites have the same loss, the elite which was computed earliest takes precedence over all other identical-loss elites (even if their parameters differ). Defaults to 150.

tested_elites

Type: integer. The number of elites tested at the same time when trying to find new values. A high value increases the space exploration at the expense of convergence speed. The minimum of 1 gives small steps but fast convergence, assuming the initialization was good enough. Defaults to 5.

elites_converge

Type: integer. The number of elites to use to assess convergence via cThr and dThr. The larger the elites_converge, the tighter the convergence requirements. It cannot be higher than the number of tested_elites. Defaults to 10.

CEmax

Type: integer. The maximum allotted swarm size for Cross-Entropy optimization of variables post-initialization. The higher it is, the more accurate the potential convergence, but it potentially shrinks the exploration space and heavily increases the computation time. Defaults to 200.

CEiter

Type: integer. The maximum allotted iterations for Cross-Entropy optimization of variables post-initialization. The higher it is, the more accurate the potential convergence, but it potentially shrinks the exploration space and heavily increases the computation time. Defaults to 20.

CEelite

Type: numeric. The elite fraction allotted for Cross-Entropy optimization of variables post-initialization. The lower it is, the more accurate the potential convergence, but it potentially shrinks the exploration space and heavily increases the computation time. CEmax * CEelite defines the Cross-Entropy elite population, which preferably should equal 10 * the number of variables for stable parameter updates. Defaults to 0.1.
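For instance, with the defaults, the elite population is CEmax * CEelite = 200 * 0.1 = 20, which by the rule of thumb above suits roughly 2 variables. A hypothetical sizing helper (ce_elite_for is not part of the package):

```r
# Hypothetical helper: pick CEelite so the Cross-Entropy elite population
# (CEmax * CEelite) is about 10 * the number of variables, capped at 1.
ce_elite_for <- function(n_vars, CEmax = 200) {
  min(1, (10 * n_vars) / CEmax)
}
ce_elite_for(2)    # 0.1, matching the defaults
ce_elite_for(10)   # 0.5
```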

CEimprove

Type: integer. The maximum number of iterations without improvement allotted for Cross-Entropy optimization of variables post-initialization. The higher it is, the more accurate the potential convergence, but it potentially shrinks the exploration space and heavily increases the computation time. Defaults to 3.

CEexploration_cont

Type: numeric. The multiplication factor of noise for numeric data. Higher values increase the exploration space. Setting it to 0 forces a full convergence mode instead of exploring data. Must be greater than or equal to 0. Defaults to 2.

CEexploration_disc

Type: vector of two numerics. Respectively the inverse factor of the noise generator, and the multiplier of the noise for discrete data. Setting either of them to 0 nullifies the effect of noise, thus forcing a full convergence mode instead of exploring data. Defaults to c(2, 5).

CEexploration_decay

Type: numeric. The decay factor of the noise for continuous and discrete data. Lower values mean faster decay (the noise at the N-th batch is scaled via exp((N - 1) * (1 - CEexploration_decay))). Must be between 0 (near-instant decay) and 1 (no decay). Defaults to 0.98.

maximize

Type: boolean. Whether to maximize (TRUE) or minimize (FALSE) the loss. Defaults to TRUE.

best

Type: numeric. The loss value which, once reached, interrupts the optimizer early. Defaults to NULL.

cMean

Type: numeric vector. The mean of continuous variables to feed to f_train.

cSD

Type: numeric vector. The standard deviation of continuous variables to feed to f_train.

cOrdinal

Type: boolean vector. Whether each continuous variable is ordinal or not.

cMin

Type: numeric vector. The minimum of each continuous variable.

cMax

Type: numeric vector. The maximum of each continuous variable.

cThr

Type: numeric. The threshold under which the continuous variables are considered to have converged: once the maximum standard deviation of the elites' continuous variables falls below cThr, convergence is assumed. Once converged, the algorithm has only one try to generate a higher threshold while optimizing; if it fails, convergence interrupts the optimization. This also applies to the internal Cross-Entropy optimization. Defaults to 0.001, which means the continuous variables are considered converged once their maximum standard deviation is no longer above 0.001.

dProb

Type: list of numeric vectors. A list containing, for each discrete variable, a vector giving the probability of each level (the i-th element is the probability of the (i-1)-th level).
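For example, the priors for two discrete variables, the first with two levels and the second with three (the names v1 and v2 are purely illustrative, matching the example at the bottom of this page):

```r
# Element i of each vector is the prior probability of level i - 1.
# Each vector must sum to 1.
dProb <- list(v1 = c(0.8, 0.2),
              v2 = c(0.7, 0.1, 0.2))
sapply(dProb, sum)  # v1 = 1, v2 = 1
```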

dThr

Type: numeric. The threshold at which the discrete variables are considered to have converged: if the probability of the worst occurring discrete value among the elites is higher than or equal to dThr, convergence is assumed. Once converged, the algorithm has only one try to generate a higher threshold while optimizing; if it fails, convergence interrupts the optimization. This also applies to the internal Cross-Entropy optimization, but as 1 - dThr. Defaults to 0.999, which means the discrete variables are considered converged once each discrete variable's dominant level has a probability of at least 0.999.

priorsC

Type: matrix. The matrix of continuous priors. Even when provided, cMean and cSD must still be filled.

priorsD

Type: matrix. The matrix of discrete priors. Even when provided, dProb must still be filled.

errorCode

Type: numeric. When f_train is ill-conditioned or has an "error", you can use an error code to replace the loss by a dummy value which is parsed afterwards for removal. You must adapt it to your own error code: for instance, f_train should return the errorCode value when no features are selected for training a supervised model. Iterations with error codes are removed from the priors. Defaults to -9999.
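A sketch of how a custom f_train could report the error code (my_trainer and its features argument are hypothetical; only the returned value matters to ExtraOpt):

```r
# Hypothetical f_train: returns the error code when no feature is
# selected (an illegal prior), otherwise a loss for the selection.
my_trainer <- function(features, errorCode = -9999, ...) {
  if (sum(features) == 0) {
    return(errorCode)  # nothing to train on: flag this iteration for removal
  }
  mean(features)  # placeholder loss; a real trainer fits a model here
}
my_trainer(c(0, 0, 0))  # -9999
```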

autoExpVar

Type: boolean. Whether the local priors must be exported to the global environment. This is extremely useful for debugging, but also to recover the priorsC and priorsD matrices when ExtraOpt, f_train, f_est, or f_prob fails without a possible recovery: you can then feed the priors back and re-run without restarting the algorithm from scratch. The saved variable in the global environment is called "temporary_Laurae". Defaults to FALSE.

autoExpFile

Type: character. Whether the local priors must be exported as RDS files. This is extremely useful for debugging, but also to recover the priorsC and priorsD matrices when ExtraOpt, f_train, f_est, or f_prob fails without a possible recovery: you can then feed the priors back and re-run without restarting the algorithm from scratch. Defaults to NULL.

verbose

Type: integer. Should ExtraOpt become chatty and report a lot? A value of 0 defines silent, while 1 chats a little bit (and 2 chats a lot). 3 is so chatty it will flood severely. Defaults to 1.

plot

Type: function. Whether to call a function to plot data or not. Your plotting function should take as its first argument priors, a matrix whose first column is the Loss, followed by the continuous variables, and ending with the discrete variables. Continuous variable column names start with "C" and discrete ones with "D". Defaults to NULL.
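A minimal sketch of such a plotting function, assuming only the column layout described above (my_plot is an illustrative name; see .ExtraOpt_plot for the bundled example):

```r
# Hypothetical plotting callback: draws the loss across iterations.
my_plot <- function(priors) {
  plot(priors[, 1], type = "l",
       xlab = "Iteration", ylab = "Loss",
       main = "ExtraOpt loss per iteration")
}
```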

debug

Type: boolean. Whether an interactive console should be used to run line by line for debugging purposes. Defaults to FALSE.

Value

A list with best for the best value found, variables for the variable values (split into a continuous list and a discrete list), priors for the list of iterations and their values, elite_priors for the last elites used, new_priors for the last iterations issued from the elites, iterations for the number of iterations, and thresh_stats for the threshold statistics over batches.
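The shape of the returned list might look as follows (mock values for illustration only, not from a real run):

```r
# Illustrative structure of ExtraOpt's return value (mock values).
result <- list(
  best = 0.95,                                      # best loss found
  variables = list(continuous = c(2.1, 3.9, 6.0),   # continuous values
                   discrete = c(0, 2)),             # discrete values
  iterations = 137                                  # iterations performed
)
result$best  # 0.95
```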

Examples

## Not run: 
# Example of params:
# - 50 random initializations
# - 200 maximum tries
# - 3 continuous variables in [0, 10]
#   - with 2 continuous and 1 ordinal
#   - with respective means (2, 4, 6)
#   - and standard deviations (1, 2, 3)
# - and 2 discrete features
# - with respective prior probabilities {(0.8, 0.2), (0.7, 0.1, 0.2)}
# - and a loss error code (illegal priors) of -9999

ExtraOpt(Ninit = 50,
         nthreads = 1,
         eta = 0.1,
         early_stop = 10,
         X_train,
         X_test,
         Y_train,
         Y_test,
         Nmax = 200,
         cMean = c(2, 4, 6),
         cSD = c(1, 2, 3),
         cOrdinal = c(FALSE, FALSE, TRUE),
         cMin = c(0, 0, 0),
         cMax = c(10, 10, 10),
         dProb = list(v1 = c(0.8, 0.2), v2 = c(0.7, 0.1, 0.2)),
         priorsC = NULL,
         priorsD = NULL,
         autoExpVar = FALSE,
         errorCode = -9999)

## End(Not run)

Laurae2/Laurae documentation built on May 8, 2019, 7:59 p.m.