View source: R/gesearch_control.R
| gesearch_control | R Documentation |
Creates a control object specifying algorithm parameters for
gesearch.
gesearch_control(
retain_by = c("probability", "proportion"),
percentile_type = 7L,
tune = FALSE,
number = 10L,
p = 0.75,
stagnation_limit = 5L,
allow_parallel = TRUE,
blas_threads = 1L
)
retain_by |
A character string specifying how training observations are selected at each iteration: "probability" (default) or "proportion". See Details.
|
percentile_type |
An integer between 1 and 9 specifying the quantile
algorithm (the type argument of quantile) used when retain_by = "probability". Default is 7. |
tune |
A logical indicating whether to tune regression parameters
(e.g., number of PLS components) via cross-validation at each iteration.
Increases computation time substantially. Default is FALSE. |
number |
An integer specifying the number of groups for leave-group-out
cross-validation. Default is 10. This is used for validating the final
models built from the samples found. When tune = TRUE, it also sets the number of groups used for tuning at each iteration. |
p |
A numeric value in (0, 1) specifying the proportion of observations
per group in leave-group-out cross-validation. Default is 0.75. When
tune = TRUE, this proportion also applies to the cross-validation used for tuning. |
stagnation_limit |
An integer specifying the maximum number of consecutive iterations with no change in gene pool size before early termination. Prevents infinite loops when the target size cannot be reached. Default is 5. |
allow_parallel |
A logical indicating whether to enable parallel
processing for internal resampling and calibration. The parallel backend
must be registered by the user. Default is TRUE. |
blas_threads |
An integer specifying the number of BLAS threads to use during computation. Default is 1, which avoids multi-threaded OpenBLAS overhead on Linux. Requires RhpcBLASctl. See Details. |
When retain_by = "probability" (default), observations with errors
below a percentile threshold are retained. The percentile is computed using
quantile with probs set to the retain
value from gesearch. This approach is more robust when outlier
observations have extreme error values.
When retain_by = "proportion", a fixed fraction of observations
(specified by the retain argument in gesearch) with the
lowest associated errors are kept at each iteration.
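The two retention rules can be illustrated with a minimal base R sketch. It is not the package's internal code: the errors vector is made-up data, retain stands in for the retain argument of gesearch, and the handling of ties and rounding is an assumption.

```r
# Hypothetical per-observation errors; the last value is an extreme outlier.
errors <- c(0.10, 0.12, 0.15, 0.20, 0.22, 0.25, 0.30, 0.35, 0.40, 25.0)
retain <- 0.8           # stands in for the `retain` argument of gesearch
percentile_type <- 7L   # same default as in gesearch_control()

# retain_by = "probability": keep observations whose error does not exceed
# the percentile threshold (treating ties as retained is an assumption).
threshold <- quantile(errors, probs = retain, type = percentile_type,
                      names = FALSE)
keep_probability <- which(errors <= threshold)

# retain_by = "proportion": keep a fixed fraction of the observations with
# the lowest errors (rounding the count is an assumption).
n_keep <- round(retain * length(errors))
keep_proportion <- order(errors)[seq_len(n_keep)]
```

In both cases the outlying observation is dropped. The difference is that the first rule compares errors against a computed threshold, while the second always keeps a fixed count.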
When tune = TRUE, leave-group-out cross-validation is used to select
optimal regression parameters at each iteration. The number argument
controls how many CV groups are formed, and p controls the proportion
of observations in each group.
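To make the roles of number and p concrete, the sketch below builds leave-group-out resampling indices in base R. It only illustrates the general technique, not how gesearch forms its groups internally. n_obs is a hypothetical sample size, and the sketch assumes that the observations drawn into each group are used for fitting while the remainder is held out.

```r
set.seed(1)    # for reproducible sampling
n_obs <- 40L   # hypothetical number of observations
number <- 5L   # number of CV groups
p <- 0.8       # proportion of observations per group

# For each group, draw p * n_obs observations without replacement.
# Assumption: these are used for fitting and the rest are held out.
groups <- lapply(seq_len(number), function(i) {
  in_group <- sort(sample.int(n_obs, size = round(p * n_obs)))
  list(
    calibration = in_group,
    validation = setdiff(seq_len(n_obs), in_group)
  )
})

# Sizes of the calibration and validation sets in each group
sapply(groups, function(g) lengths(g))
```

Unlike k-fold cross-validation, the groups are drawn independently, so an observation may be held out in several groups or in none.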
On Linux systems with multi-threaded OpenBLAS, the default thread count
can cause significant overhead for algorithms that perform many small
matrix operations (like the iterative PLS fits in gesearch).
Setting blas_threads = 1 (the default) eliminates this overhead.
This setting requires the RhpcBLASctl package. If not installed,
the parameter is ignored and a message is displayed. The original thread
count is restored when gesearch completes.
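The set-and-restore behaviour described above can be approximated in user code as follows. This is a sketch and not the implementation used by gesearch. with_blas_threads is a hypothetical helper name, and it relies only on blas_get_num_procs() and blas_set_num_threads() from RhpcBLASctl.

```r
# Hypothetical helper: evaluate `expr` with a given number of BLAS threads
# and restore the previous setting afterwards.
with_blas_threads <- function(threads, expr) {
  if (!requireNamespace("RhpcBLASctl", quietly = TRUE)) {
    # Mirrors the documented behaviour: ignore the setting and tell the user.
    message("RhpcBLASctl is not installed; ignoring the thread setting.")
    return(force(expr))
  }
  original <- RhpcBLASctl::blas_get_num_procs()
  # Restore the original thread count even if `expr` fails.
  on.exit(RhpcBLASctl::blas_set_num_threads(original), add = TRUE)
  RhpcBLASctl::blas_set_num_threads(threads)
  force(expr)  # `expr` is evaluated lazily, i.e. after the thread count is set
}

# Usage: a small matrix product computed with a single BLAS thread
x <- matrix(rnorm(100), nrow = 10)
xtx <- with_blas_threads(1L, crossprod(x))
```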
A list of class "gesearch_control" containing the specified
parameters.
Lobsey, C.R., Viscarra Rossel, R.A., Roudier, P., Hedley, C.B. 2017. rs-local data-mines information from spectral libraries to improve local calibrations. European Journal of Soil Science 68:840-852.
gesearch, quantile
# Default parameters (probability-based retention)
gesearch_control()
# Proportion-based retention
gesearch_control(retain_by = "proportion")
# Enable parameter tuning with custom CV settings
gesearch_control(tune = TRUE, number = 5, p = 0.8)