| func_beta | R Documentation |
P_{t}(a) = \frac{\exp(\beta \cdot (Q_t(a) - \max_{a' \in \mathcal{A}} Q_t(a')))}{\sum_{a' \in \mathcal{A}} \exp(\beta \cdot (Q_t(a') - \max_{a'_{i} \in \mathcal{A}} Q_t(a'_{i})))}
P_{t}(a) = (1 - lapse \cdot N_{shown}) \cdot P_{t}(a) + lapse
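The two formulas above can be sketched in a few lines of R. This is a minimal illustration, not the package's implementation: `softmax_lapse` is a hypothetical helper, and it assumes that options not shown in the trial are coded as `NA` in the Q-value vector.

```r
# Hypothetical helper (not part of the package): softmax over the shown
# options, followed by the lapse adjustment from the second formula.
softmax_lapse <- function(q, beta, lapse) {
  n_shown <- sum(!is.na(q))  # options coded NA are treated as not shown
  # Subtract the largest Q-value before exponentiating so exp() cannot overflow
  z <- exp(beta * (q - max(q, na.rm = TRUE)))
  p <- z / sum(z, na.rm = TRUE)
  # Mix with a uniform lapse term; the shown options still sum to 1
  (1 - lapse * n_shown) * p + lapse
}

q <- c(0.2, 0.8, NA, 0.5)  # third option is not shown
p <- softmax_lapse(q, beta = 3, lapse = 0.01)

# Very large Q-values stay finite because of the max subtraction
p_big <- softmax_lapse(c(1000, 1001), beta = 1, lapse = 0)
```

Because the lapse term is added once per shown option, the factor `(1 - lapse * n_shown)` is what keeps the total probability at 1.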
func_beta(shown, qvalue, explor, rownum, params, hidden, system, ...)
shown |
Which options are shown in this trial. |
qvalue |
The expected Q-values of each behavior, as produced by each system, updated up to this trial. |
explor |
Whether the agent made a random choice (exploration) in this trial. |
rownum |
The trial number. |
params |
Parameters used by the model's internal functions, see params |
hidden |
All hidden variables within the MDP process belong here. |
system |
Whether a single system or multiple systems are involved when the agent makes a decision; see system |
... |
It currently contains the following information; additional information may be added in future package versions.
|
A NumericVector containing the probability of choosing each
option.
A List
output [NumericVector]
A numeric vector representing the probability of selecting each option.
The inverse temperature parameter beta in the softmax
function primarily controls these probabilities. Larger values
of beta make choices more sensitive to differences in
Q-values, while smaller values make choices closer to random.
In addition, the lapse parameter prevents the probability
of any option from reaching zero, thereby avoiding
logP = -Inf.
hidden [CharacterVector]
User-defined internal variables generated by this function. These represent intermediate (latent) states produced during computation, which can be read or modified by other functions in the MDP process.
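The effect of beta and lapse described above can be checked numerically. The snippet below is a rough sketch under stated assumptions: `softmax_sketch` is a hypothetical helper, and the Q-values, beta values, and lapse rate are made up for illustration.

```r
# Hypothetical helper (not part of the package): numerically stable softmax
softmax_sketch <- function(q, beta) {
  z <- exp(beta * (q - max(q)))
  z / sum(z)
}

q <- c(0.2, 0.5, 0.8)
p_low  <- softmax_sketch(q, beta = 0)   # beta = 0 ignores Q-values entirely
p_high <- softmax_sketch(q, beta = 50)  # large beta concentrates on the best option

# With a large gap in Q-values the losing option underflows to probability 0,
# so its log-probability is -Inf unless a lapse term is mixed in.
q_far  <- c(0, 100)
p_far  <- softmax_sketch(q_far, beta = 10)
lapse  <- 0.01
log_no_lapse <- log(p_far)
log_lapse    <- log((1 - lapse * length(q_far)) * p_far + lapse)
```

In this sketch `log_no_lapse[1]` is `-Inf`, while every entry of `log_lapse` is finite, which is the property the lapse parameter is meant to guarantee.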
func_beta <- function(
    shown,
    qvalue,
    explor,
    system,
    rownum,
    params,
    hidden,
    ...
){
  list2env(list(...), envir = environment())

  # If you need extra information(...)
  # Column names may be lost(C++), indexes are recommended
  # e.g.
  # Trial  <- idinfo[3]
  # Frame  <- exinfo[1]
  # Action <- behave[1]

  beta     <- params[["beta"]]
  lapse    <- params[["lapse"]]
  weight   <- params[["weight"]]
  capacity <- params[["capacity"]]

  index     <- which(!is.na(qvalue[[1]]))
  n_shown   <- length(index)
  n_system  <- length(qvalue)
  n_options <- length(qvalue[[1]])

  # Assign weights to different systems
  if (length(weight) == 1L) {weight <- c(weight, 1 - weight)}
  weight <- weight / sum(weight)
  if (n_system == 1) {weight <- weight[1]}

  # Compute the probabilities estimated by different systems
  prob_mat <- matrix(0, nrow = n_options, ncol = n_system)

  if (explor == 1) {
    prob_mat[index, ] <- 1 / n_shown
    prob_mat[prob_mat == 0] <- NA
  } else {
    for (s in seq_len(n_system)) {
      sub_qvalue <- qvalue[[s]]
      exp_stable <- exp(beta * (sub_qvalue - max(sub_qvalue, na.rm = TRUE)))
      prob_mat[, s] <- exp_stable / sum(exp_stable, na.rm = TRUE)
    }
  }

  # Weighted average
  prob <- as.vector(prob_mat %*% weight)

  # lapse
  prob <- (1 - lapse * n_shown) * prob + lapse

  return(list(output = prob, hidden = hidden))
}
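The weighting step in the example above can be looked at in isolation. The following sketch mirrors that step with made-up numbers; the weight of 0.7 and the two per-system probability columns are assumptions chosen only for illustration.

```r
# Illustrative values only: a single weight is expanded to (w, 1 - w)
weight <- 0.7
if (length(weight) == 1L) weight <- c(weight, 1 - weight)
weight <- weight / sum(weight)  # normalize so the weights sum to 1

# One column of choice probabilities per system (rows are options)
prob_mat <- cbind(
  c(0.6, 0.3, 0.1),  # hypothetical system 1
  c(0.2, 0.2, 0.6)   # hypothetical system 2
)

# Weighted average across systems, one probability per option
prob <- as.vector(prob_mat %*% weight)
```

Since each column sums to 1 and the weights sum to 1, the mixed vector `prob` is again a probability distribution over the options.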