This is a helper function that generates a default surrogate, based on properties of the objective function and the selected infill criterion.
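A call might look like the following minimal sketch. It assumes that mlrMBO is installed together with the packages the Kriging learner needs (DiceKriging, and rgenoud for fitting). The sphere test function and the expected improvement criterion are illustrative choices, not something this page prescribes.

```r
library(mlrMBO)  # attaches mlr, smoof and ParamHelpers as dependencies

# Deterministic, purely numeric test objective from smoof (illustrative choice).
fun <- makeSphereFunction(dimensions = 2L)

# Control object with expected improvement as the infill criterion.
ctrl <- makeMBOControl()
ctrl <- setMBOControlInfill(ctrl, crit = makeMBOInfillCritEI())

# Build the default surrogate for this objective and infill criterion.
lrn <- makeMBOLearner(ctrl, fun)

print(lrn)                # learner summary
print(getHyperPars(lrn))  # hyperparameters chosen by the helper
```

Printing the learner and its hyperparameters is a quick way to check which of the defaults described here were applied.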
For numeric-only (including integers) parameter spaces without any dependencies:
A Kriging model “regr.km” with kernel “matern3_2” is created.
If the objective function is deterministic, we add a small nugget effect (10^-8 * Var(y), where y is the vector of observed outcomes in the current design) to increase numerical stability and, hopefully, prevent crashes of DiceKriging.
If the objective function is noisy, the nugget effect is estimated with nugget.estim = TRUE (although you can override this setting).
In addition, jitter is set to TRUE to circumvent a problem with DiceKriging where input values already seen in training reproduce the exact training output.
For further information, check the $note slot of the created learner.
Instead of the default "BFGS" optimization method, we use rgenoud, a hybrid algorithm that combines global search based on genetic algorithms with local search based on gradients.
This may improve the model fit and less frequently produces a constant surrogate model.
You can also override this setting.
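The stabilizing nugget for the deterministic case can be illustrated in base R. This is only a sketch of the arithmetic, not DiceKriging's implementation: the design values, the range parameter theta and the helper name matern3_2 are made up for the example.

```r
# Inputs and observed outcomes of the current design (made-up values).
x <- c(0.0, 0.5, 0.5, 1.0)   # note the duplicated input 0.5
y <- c(1.2, 0.7, 0.7, 2.1)

# Stabilizing nugget as described above: 10^-8 times the variance of y.
nugget <- 1e-8 * var(y)

# Matern 3/2 correlation with a hypothetical range parameter theta.
matern3_2 <- function(d, theta = 1) {
  s <- sqrt(3) * abs(d) / theta
  (1 + s) * exp(-s)
}

# The duplicated input gives K two identical rows, so K is singular.
K <- matern3_2(outer(x, x, "-"))

# Adding the nugget to the diagonal makes the matrix positive definite,
# so a Cholesky factorization goes through.
K_stable <- K + diag(nugget, length(x))
R <- chol(K_stable)
```

The nugget is tiny relative to the variance of y, so it barely changes the fitted model while keeping the linear algebra well posed.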
For mixed numeric-categorical parameter spaces, or spaces with conditional parameters:
A random regression forest “regr.randomForest” with 500 trees is created.
The standard error of a prediction (if required by the infill criterion) is estimated by computing the jackknife-after-bootstrap.
This is the se.method = "jackknife" option of the “regr.randomForest” learner; see the learner's documentation for further information and alternatives.
If, in addition, dependencies are present in the parameter space, inactive conditional parameters are represented by missing NA values in the training design data.frame.
We simply handle those with an imputation method, added to the random forest:
If a numeric value is inactive, i.e., missing, it is imputed by 2 times the maximum of the observed values.
If a categorical value is inactive, i.e., missing, it is imputed by a special class label.
Both of these techniques make sense for tree-based methods and are usually hard to beat; see Ding and Simonoff (2010).
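The two imputation rules can be sketched in base R as follows. The helper impute_inactive, the example design and the label "missing" are hypothetical. The sketch follows the description above literally, and the imputation wrapper that is actually attached to the random forest may compute its values differently.

```r
# Hypothetical design with inactive (NA) values in a numeric and a categorical column.
design <- data.frame(
  x1 = c(0.5, 2.0, NA, 1.5),
  x2 = factor(c("a", NA, "b", "a"))
)

# Placeholder class label; an assumption made for this sketch.
miss_label <- "missing"

impute_inactive <- function(df, label) {
  for (col in names(df)) {
    v <- df[[col]]
    if (is.numeric(v)) {
      # Numeric: replace NA by 2 times the maximum of the observed values.
      v[is.na(v)] <- 2 * max(v, na.rm = TRUE)
    } else if (is.factor(v)) {
      # Categorical: add the special label as a level, then assign it to NA entries.
      levels(v) <- c(levels(v), label)
      v[is.na(v)] <- label
    }
    df[[col]] <- v
  }
  df
}

imputed <- impute_inactive(design, miss_label)
print(imputed)
```

Pushing inactive numeric values beyond the observed range, and giving inactive categorical values their own level, lets a tree separate "inactive" from all active values with a single split.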
Ding, Yufeng, and Jeffrey S. Simonoff. An investigation of missing data methods for classification trees applied to binary response data. Journal of Machine Learning Research 11.Jan (2010): 131-170.