sampleCore: R Documentation
Sample a core collection from the given data.
sampleCore(
  data,
  obj,
  size = 0.2,
  always.selected = integer(0),
  never.selected = integer(0),
  mode = c("default", "fast"),
  normalize = TRUE,
  time = NA,
  impr.time = NA,
  steps = NA,
  impr.steps = NA,
  indices = FALSE,
  verbose = FALSE
)
data |
Core Hunter data ( |
obj |
Objective or list of objectives ( |
size |
Desired core subset size (numeric). If larger than one the value is used as the absolute core size after rounding. Else it is used as the sampling rate and multiplied with the dataset size to determine the size of the core. The default sampling rate is 0.2. |
always.selected |
vector with indices (integer) or ids (character) of items that should always be selected in the core collection |
never.selected |
vector with indices (integer) or ids (character) of items that should never be selected in the core collection |
mode |
Execution mode ( |
normalize |
If Normalization requires an independent preliminary search per objective (fast stochastic
hill-climber, executed in parallel for all objectives). The same stop conditions, as
specified for the main search, are also applied to each normalization search. In
Normalization ranges can also be precomputed (see |
time |
Absolute runtime limit in seconds. Not used by default (NA). |
impr.time |
Maximum time without improvement in seconds. If no explicit
stop conditions are specified, the maximum time without improvement defaults
to ten or two seconds when executing Core Hunter in default or fast mode, respectively. |
steps |
Maximum number of search steps. Not used by default (NA). |
impr.steps |
Maximum number of steps without improvement. Not used by
default (NA). |
indices |
If |
verbose |
If |
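The rule described for the size argument can be sketched in plain R. The helper below is hypothetical and not part of the corehunter package; it only restates the documented rule, and it assumes that the product of sampling rate and dataset size is rounded to the nearest integer, which the text above does not state explicitly.

```r
# Hypothetical helper (NOT a corehunter function): turn `size` into a number
# of items for a dataset with `n` items, following the rule in the argument
# description above.
resolve_core_size <- function(size, n) {
  stopifnot(is.numeric(size), length(size) == 1, size > 0, n >= 1)
  if (size > 1) {
    # Larger than one: absolute core size after rounding.
    round(size)
  } else {
    # Otherwise: sampling rate multiplied with the dataset size.
    # Rounding the product is an assumption made for this sketch.
    round(size * n)
  }
}

resolve_core_size(0.2, 200)  # default sampling rate
resolve_core_size(25, 200)   # absolute size
resolve_core_size(0.1, 200)  # relative size
```

For a dataset of 200 items these calls give 40, 25 and 20, respectively.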
Because Core Hunter uses stochastic algorithms, repeated runs may produce different
results. To eliminate randomness, you may set a random number generation seed using
set.seed prior to executing Core Hunter. In addition, when reproducible
results are desired, it is advised to use step-based stop conditions instead of the
(default) time-based criteria, because runtimes may be affected by external factors,
and, therefore, a different number of steps may have been performed in repeated runs
when using time-based stop conditions.
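The advice above might be applied as in the following sketch, which fixes the seed before each run and uses a step-based stop condition. The seed value 42 and the step limit of 300 are arbitrary choices for illustration, and the corehunter package is assumed to be installed.

```r
library(corehunter)

data <- exampleData()
obj <- objective("EN", "MR")

# Set the same seed before each run and stop after a fixed number of steps,
# so that runtime fluctuations do not change how many steps are performed.
set.seed(42)
core1 <- sampleCore(data, obj, steps = 300)
set.seed(42)
core2 <- sampleCore(data, obj, steps = 300)

# Compare the selections of the two runs.
identical(core1$sel, core2$sel)
```

According to the note above, the comparison is expected to report identical selections; with a time-based stop condition the two runs could differ.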
Core subset (chcore). It has an element sel,
which is a character or numeric vector containing the sorted ids or indices,
respectively, of the selected individuals (see argument indices).
In addition, the result has one or more elements that indicate the value
of each objective function that was included in the optimization.
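As a brief sketch of how the returned object might be inspected, the code below samples one core with ids and one with indices and lists the element names of the result. The step limit is an arbitrary choice, the corehunter package is assumed to be installed, and the names of the objective value elements are not spelled out here because they depend on the objectives used.

```r
library(corehunter)

data <- exampleData()
obj <- objective("EN", "MR")

# Default: sel holds the sorted ids of the selected individuals.
core_ids <- sampleCore(data, obj, steps = 300)
core_ids$sel

# With indices = TRUE: sel holds sorted indices instead of ids.
core_idx <- sampleCore(data, obj, steps = 300, indices = TRUE)
core_idx$sel

# List all elements of the result, including those that carry the
# objective function values.
names(core_ids)
```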
coreHunterData, objective, getNormalizationRanges
data <- exampleData()
# default size, maximize entry-to-nearest-entry Modified Rogers distance
obj <- objective("EN", "MR")
core <- sampleCore(data, obj)
# fast mode
core <- sampleCore(data, obj, mode = "f")
# absolute size
core <- sampleCore(data, obj, size = 25)
# relative size
core <- sampleCore(data, obj, size = 0.1)
# other objective: minimize accession-to-nearest-entry precomputed distance
core <- sampleCore(data, obj = objective(type = "AN", measure = "PD"))
# multiple objectives (equal weight)
core <- sampleCore(data, obj = list(
  objective("EN", "PD"),
  objective("AN", "GD")
))
# multiple objectives (custom weight)
core <- sampleCore(data, obj = list(
  objective("EN", "PD", weight = 0.3),
  objective("AN", "GD", weight = 0.7)
))
# custom stop conditions
core <- sampleCore(data, obj, time = 5, impr.time = 2)
core <- sampleCore(data, obj, steps = 300)
# print progress messages
core <- sampleCore(data, obj, verbose = TRUE)