#' @title Selector-Combination that Selects According to Two Selectors
#'
#' @include Selector.R
#'
#' @name dict_selectors_maybe
#'
#' @description
#' [`Selector`] that wraps two other [`Selector`]s given during construction and uses both for selection proportionally.
#' Each of the resulting `n_select` individuals is chosen either from `$selector`, or from `$selector_not`.
#'
#' This makes it possible to implement selection methods such as random interleaving, where only a fraction of `p`
#' individuals were selected by a criterion, while the others are taken randomly.
#'
#' @section Algorithm:
#' To perform selection, `n_selector_in` rows of `values` are given to `$selector`, and the remaining
#' `nrow(values) - n_selector_in` rows are given to `$selector_not`. Both selectors are used to generate
#' a subset of selected individuals: `$selector` generates `n_selector_out` individuals, and `$selector_not` generates
#' `n_select - n_selector_out` individuals.
#'
#' `n_selector_in` is either set to
#' `round(nrow(values) * p_in)` when `proportion_in` is `"exact"`, or to `rbinom(1, nrow(values), p_in)` when `proportion_in` is `"random"`.
#'
#' `n_selector_out` is set to `round(n_select * p_out)` when `proportion_out` is `"exact"`, or to `rbinom(1, n_select, p_out)` when `proportion_out` is `"random"`.
#'
#' When `odds_correction` is `TRUE`, then `p_out` is adjusted depending on the used `n_selector_in` value before being applied. Let `odds(p) = p/(1-p)`.
#' Then the effective `p_out` is set such that `odds(effective p_out) = odds(p_out) * n_selector_in / (nrow(values) - n_selector_in) / odds(p_in)`.
#' This corrects for the discrepancy between the chosen `p_in` and the effective proportion of `n_selector_in / nrow(values)` caused either by rounding
#' errors or when `proportion_in` is `"random"`.
#'
#' When `p_in` is exactly 1 or exactly 0, and `p_out` is not equal to `p_in`, then an error is given.
#'
#' If `nrow(values)` is 1, then this individuum is returned and `$selector` / `$selector_not` are not called.
#'
#' If `try_unique` is `TRUE`, then `n_selector_out` is set to at most `n_selector_in` and at least `n_select - nrow(values) + n_selector_in`,
#' and an error is generated when `nrow(values)` is less than `n_select`.
#'
#' If `try_unique` is `FALSE` and `odds_correction` is `TRUE` and `n_selector_in` is either 0 or `nrow(values)`, then `$p_out` is set to either 0 or 1, respectively.
#'
#' If `try_unique` is `FALSE` and `odds_correction` is `FALSE` and `n_selector_in` is either 0 or `nrow(values)`, and `n_selector_out` is not equal
#' to 0 or `n_select`, respectively, then
#' `n_selector_in` is increased / decreased by 1 to give `$selector_not` / `$selector` at least one individuum to choose from. While this behaviour
#' may seem pathological, it is to ensure continuity with sampled values of `n_selector_in` that are close to 0 or `n_select`.
#'
#' If `n_selector_out` is `n_select` or 0, or if `n_selector_in` is `nrows(values) - 1` or 1,
#' then only `$selector` / `$selector_not` is executed, respectively; possibly with a subset
#' of `values` if `n_selector_in` differs from `nrow(values)` / 0.
#'
#' @section Configuration Parameters:
#' This operator has the configuration parameters of the [`Selector`]s that it wraps: The configuration parameters of the operator given to the `selector` construction argument
#' are prefixed with `"maybe."`, the configuration parameters of the operator given to the `selector_not` construction argument are prefixed with `"maybe_not."`.
#'
#' Additional configuration parameters:
#' * `p_in` :: `numeric(1)` \cr
#' Probability per individual (when `random_choise` is `TRUE`), or fraction of individuals (when `random_choice` is `FALSE`),
#' that are given to `$selector` instead of `$selector_not`. This may be overriden when `try_unique` is `TRUE`, in which
#' case at least as many rows are given to `$selector` and `$selector_not` as they are generating output values respectively.
#' When this is exactly 1 or exactly 0, then `p_out` must be equal to `p_in`.
#' Must be set by the user.
#' * `p_out` :: `numeric(1)` \cr
#' Probability per output value (when `random_choise` is `TRUE`), or fraction of output values (when `random_choice` is `FALSE`),
#' that are generated by `$selector` instead of `$selector_not`. When this values is not given, it defaults to `p_in`.
#' * `shuffle_input` :: `logical(1)` \cr
#' Whether to distribute input values randomly to `$selector` / `$selector_not`. If `FALSE`, then the first part of `values`
#' is given to `$selector`. This only randomizes *which* lines of `values` are given to `$selector` / `$selector_not`, but it
#' does not necessarily reorder the lines of values given to each. In particular, if `p_out` is 0 or 1, then no shuffling takes place. Initialized to `TRUE`.
#' * `proportion_in` :: `character(1)` \cr
#' When set to `"random"`, sample the number of individuals given to `$selector` according to `rbinom(1, nrow(values), p_in)`.
#' When set to `"exact"`, give `$selector` `round(nrow(values) * p_in)` individuals. Initialized to `"exact"`.
#' * `proportion_out` :: `character(1)` \cr
#' When set to `"random"`, sample the number of individuals generated by `$selector` according to `rbinom(1, n_select, p_out)`.
#' When set to `"exact"`, have `$selector` generate `round(n_select * p_out)` individuals.
#' * `odds_correction` :: `logical(1)`\cr
#' When set, the effectively used value of `p_out` is set to
#' `1 / (1 + ((nrow(values) - n_selector_in) * p_in * (1 - p_out)) / (n_selector_in * p_out * (1 - p_in)))`, see the **Algorithm** section.
#' Initialized to `FALSE`.
#' * `try_unique` :: `logical(1)`\cr
#' Whether to give at least as many rows of `values` to each of `$selector` and `$selector_not` as they are generating output
#' values. This should be set to `TRUE` whenever `SelectorMaybe` is used to select unique values, and can be set to
#' `FALSE` when selecting values multiple times is acceptable. When this is `TRUE`, then having `n_select > nrow(values)`
#' generates an error. Initialized to `TRUE`.
#'
#' @templateVar id maybe
#' @templateVar additional , \<selector\> \[, \<selector_not\>\]
#' @template autoinfo_prepare_ftr
#'
#' @section Supported Operand Types:
#'
#' Supported [`Domain`][paradox::Domain] classes are the set intersection of supported classes of `selector` and `selector_not`.
#'
#' @template autoinfo_dict
#'
#' @family selectors
#' @family selector wrappers
#' @export
SelectorMaybe = R6Class("SelectorMaybe",
inherit = Selector,
public = list(
#' @description
#' Initialize the `SelectorMaybe` object.
#' @param selector ([`Selector`])\cr
#' [`Selector`] to wrap. This operator gets run with probability / fraction `p_in` (Configuration parameter).\cr
#' The constructed object gets a *clone* of this argument.
#' The `$selector` field will reflect this value.
#' @param selector_not ([`Selector`])\cr
#' Another [`Selector`] to wrap. This operator runs when `selector` is not chosen. By
#' default, this is [`SelectorRandom`], i.e. selecting randomly.\cr
#' The constructed object gets a *clone* of this argument.
#' The `$selector_not` field will reflect this value.
initialize = function(selector, selector_not = SelectorRandom$new()) {
private$.wrapped = assert_r6(selector, "Selector")$clone(deep = TRUE)
private$.wrapped_not = assert_r6(selector_not, "Selector")$clone(deep = TRUE)
if (!paradox_s3) {
private$.wrapped$param_set$set_id = "maybe"
private$.wrapped_not$param_set$set_id = "maybe_not"
}
private$.maybe_param_set = ps(
p_in = p_dbl(0, 1, tags = "required"),
p_out = p_dbl(0, 1),
shuffle_input = p_lgl(tags = "required"),
proportion_in = p_fct(c("random", "exact"), tags = "required"),
proportion_out = p_fct(c("random", "exact"), tags = "required"),
odds_correction = p_lgl(tags = "required"),
try_unique = p_lgl(tags = "required")
)
private$.maybe_param_set$values = list(shuffle_input = TRUE,
proportion_in = "exact", proportion_out = "exact", try_unique = TRUE)
super$initialize(param_classes = intersect(selector$param_classes, selector_not$param_classes),
param_set = alist(private$.maybe_param_set, maybe = private$.wrapped$param_set, maybe_not = private$.wrapped_not$param_set),
supported = intersect(selector$supported, selector_not$supported),
packages = c("stats", selector$packages, selector_not$packages), dict_entry = "maybe",
own_param_set = quote(private$.maybe_param_set))
},
#' @description
#' See [`MiesOperator`] method. Primes both this operator, as well as the wrapped operators
#' given to `selector` and `selector_not` during construction.
#' @param param_set ([`ParamSet`][paradox::ParamSet])\cr
#' Passed to [`MiesOperator`]`$prime()`.
#' @return [invisible] `self`.
prime = function(param_set) {
private$.wrapped$prime(param_set)
private$.wrapped_not$prime(param_set)
super$prime(param_set)
invisible(self)
}
),
active = list(
#' @field selector ([`Selector`])\cr
#' [`Selector`] being wrapped. This operator gets run with probability / proportion `p_in` and generates
#' output with probability / proportion `p_out` (configuration parameters).
selector = function(val) {
if (!missing(val)) stop("mutator is read-only.")
private$.wrapped
},
#' @field selector_not ([`Selector`])\cr
#' Alternative [`Selector`] being wrapped. This operator gets run with probability / proportion `1 - p_in`
#' and generates output with probability / proportion `1 - p_out` (configuration parameters).
selector_not = function(val) {
if (!missing(val)) stop("mutator_not is read-only.")
private$.wrapped_not
}
),
private = list(
.select = function(values, fitnesses, n_select, group_size) {
params = private$.maybe_param_set$get_values()
params$p_out = params$p_out %??% params$p_in
###########
## checks
if (params$p_in %in% c(0, 1) && params$p_out != params$p_in) {
stopf("When p_in is %s, then p_out must also be %s, but it is %s instead.", params$p_in, params$p_in, params$p_out)
}
if (params$try_unique && nrow(values) < n_select) {
stopf("try_unique is TRUE, but nrow(values) (which is %s) is less than n_select (which is %s).", nrow(values), n_select)
}
###########
## default returns
if (nrow(values) == 1) return(rep(1L, n_select))
###########
## determine n_selector_in
n_selector_in = switch(params$proportion_in,
random = stats::rbinom(1, nrow(values), params$p_in),
excact = round(nrow(values) * params$p_in)
)
n_selector_not_in = nrow(values) - n_selector_in
###########
## odds_correction
if (params$odds_correction) {
if (params$p_in == params$p_out || n_selector_in == 0 || n_selector_not_in == 0) {
# catch mostly the p_in == 0 or 1 cases
# If p_out were not equal to p_in in these cases we would throw an error above.
# If they *are* equal, we would get a 0 / 0 division below.
params$p_out = n_selector_in / nrow(values)
} else {
params$p_out = 1 /
(1 +
(n_selector_not_in * params$p_in * (1 - params$p_out)) /
(n_selector_in * params$p_out * (1 - params$p_in))
)
}
}
###########
## determine n_selector_out
n_selector_out = switch(params$proportion_in,
random = stats::rbinom(1, nrow(values), params$p_out),
excact = round(nrow(values) * params$p_out)
)
###########
## handle try_unique
if (params$try_unique) {
# make sure:
# 1) n_selector_out >= n_selector_in
# 2) n_select - n_selector_out >= nrow(values) - n_selector_in
# <=> n_selector_out <= n_select - nrow(values) + n_selector_in
n_selector_out = min(max(n_selector_out, n_select - nrow(values) + n_selector_in), n_selector_in)
} else if (n_selector_out %nin% c(0, n_select)) { # can only happen when !odds_correction, sine otherwise p_out is 0 or 1 by now
n_selector_in = min(max(n_selector_in, 1), nrow(values) - 1)
}
###########
## determine group_size
group_size = vectorize_group_size(group_size, n_select)
if (params$proportion_out == "random") {
# random: distribute inputs among groups
# bitmap to represent all output positions, and whether they are given to selector 'maybe'
# [<group 1>|-group 2 >|-grp3>]
# [_____________________________] -- bitmap
group_bitmap = logical(n_select)
# distribution n_selector_out positions on that bitmap
# [<group 1>|-group 2 >|-grp3>]
# [______#_______#___#__________]
group_bitmap[sample.int(n_select, n_selector_out, replace = FALSE)] = TRUE
# for each position, compute how many go to 'maybe' selector at that position or before
# [<group 1>|-group 2 >|-grp3>]
# [______#_______#___#__________]
# [00000011111111222233333333333] -- take_le
take_le = cumsum(group_bitmap)
# index into this vector through cumsum of group size
# [<group 1>|-group 2 >|-grp3>]
# [______#_______#___#__________]
# [00000011111111222233333333333]
# ^ ^ ^
group_index = cumsum(group_size)
# this many are in the given group, or below
# [ 1 3 3]
group_le = take_le[group_index]
# differentiate to get items in each group, starting with 0
# [ 1 2 0 ]
group_size_out = diff(c(0, group_le))
} else {
# exact: distribute as equally as possible
group_size_out = floor(group_size * n_selector_out / n_select)
remainder = n_selector_out - sum(group_size_out)
adding = sample.int(length(group_size), remainder, replace = FALSE)
group_size_out[adding] = group_size_out[adding] + 1
group_bitmap = unlist(Map(function(gs, gs_out) {
bitmap = logical(gs)
bitmap[sample.int(gs, gs_out, replace = FALSE)] = TRUE
bitmap
}, group_size, group_size_out))
}
group_size_not_out = group_size - group_size_out
selector_in_positions = seq_len(n_selector_in)
selector_not_in_positions = seq_len(nrow(values) - n_selector_in) + n_selector_in
if (params$shuffle_input) {
# sample a logical vector, so we can also negate
s_in = sample.int(nrow(values)) <= n_selector_in
selector_in_positions = s_in[selector_in_positions]
selector_not_in_positions = s_in[selector_not_in_positions]
}
if (n_selector_out == 0) {
selected_out = integer(0)
} else if (n_selector_in == 1) {
selected_out = rep(selector_in_positions, n_selector_out)
} else {
selected_out = selector_in_positions[private$.wrapped$operate(
values[selector_in_positions],
fitnesses[selector_in_positions, , drop = FALSE],
n_selector_out,
group_size_out[group_size_out != 0]
)]
}
if (n_selector_out == n_select) {
selected_not_out = integer(0)
} else if (n_selector_in == nrow(values) - 1) {
selected_not_out = rep(selector_not_in_positions, n_select - n_selector_out)
} else {
selected_not_out = selector_not_in_positions[private$.wrapped$operate(
values[selector_not_in_positions],
fitnesses[selector_not_in_positions, , drop = FALSE],
n_select - n_selector_out,
group_size_not_out[group_size_not_out != 0]
)]
}
output = integer(n_select)
output[group_bitmap] = selected_out
output[!group_bitmap] = selected_not_out
output
},
.wrapped = NULL,
.wrapped_not = NULL,
.maybe_param_set = NULL
)
)
dict_selectors$add("maybe", SelectorMaybe)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.