| func_alpha | R Documentation |
Q_{new} = Q_{old} + \alpha \cdot (R - Q_{old})
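As a worked instance of the update rule above, the following minimal R sketch applies it once with a small and a large learning rate. The helper name `td_update` and all numeric values are illustrative assumptions and are not part of the package.

```r
# Hypothetical helper (not part of the package): one step of
# Q_new = Q_old + alpha * (R - Q_old)
td_update <- function(q_old, r, alpha) {
  q_old + alpha * (r - q_old)
}

q_old <- 0.5  # illustrative prior estimate
r <- 1        # illustrative reward

# A small alpha moves the estimate only slightly toward the reward,
# a large alpha moves it most of the way.
q_small <- td_update(q_old, r, alpha = 0.1)
q_large <- td_update(q_old, r, alpha = 0.9)
print(c(conservative = q_small, aggressive = q_large))
```

With the same prediction error (here 0.5), the learning rate alone decides how far the estimate moves.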
func_alpha(
  shown,
  is.fp,
  qvalue,
  reward,
  utility,
  system,
  rownum,
  params,
  hidden,
  ...
)
| shown | Which options are shown in this trial. |
| is.fp | Whether this is the first time this option has been picked. |
| qvalue | The expected Q-values of the different behaviors, as produced by the different systems, updated up to this trial. |
| reward | The feedback received by the agent from the environment at trial (t) following the execution of action (a). |
| utility | The subjective value (internal representation) assigned by the agent to the objective reward. |
| system | Whether a single system or multiple systems are at work when the agent makes a decision; see system. |
| rownum | The trial number. |
| params | Parameters used by the model's internal functions; see params. |
| hidden | All hidden variables within the MDP process belong here. |
| ... | It currently contains the following information; additional information may be added in future package versions. |
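The way extra information reaches the function through `...` can be sketched as follows. This is a minimal illustration of the `list2env(list(...), envir = environment())` pattern, assuming a hypothetical function `demo_dots`; the names `idinfo` and `exinfo` are taken from the comments in the function body, but their contents here are invented for the example.

```r
# Hypothetical demo function (not part of the package) showing how named
# arguments passed through ... become local variables.
demo_dots <- function(qvalue, ...) {
  # Every named argument in ... is assigned into this call's environment.
  list2env(list(...), envir = environment())
  # Index by position, since element names may be lost upstream.
  trial <- idinfo[3]
  frame <- exinfo[1]
  list(trial = trial, frame = frame)
}

# Illustrative inputs; the real contents of idinfo and exinfo are assumptions.
res <- demo_dots(
  qvalue = 0,
  idinfo = c("S01", "Block1", "7"),
  exinfo = c("Gain")
)
print(res)
```

Positional indexing keeps the lookup working even when the vectors arrive without names.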
A List
output [NumericVector]
A numeric value representing the updated Q-value after learning.
This function specifies how prediction error (PE) is incorporated into value updating, using a learning rate that determines whether updates are more conservative or more aggressive in response to PE.
hidden [CharacterVector]
User-defined internal variables generated by this function. These represent intermediate (latent) states produced during computation, which can be read or modified by other functions in the MDP process.
func_alpha <- function(
    shown,
    is.fp,
    qvalue,
    reward,
    utility,
    params,
    rownum,
    system,
    hidden,
    ...
){
  # Expose any extra arguments passed through ... as local variables
  list2env(list(...), envir = environment())
  # If you need extra information(...)
  # Column names may be lost(C++), indexes are recommended
  # e.g.
  # Trial <- idinfo[3]
  # Frame <- exinfo[1]
  # Action <- behave[1]

  Q0 <- params[["Q0"]]
  alpha <- params[["alpha"]]
  alphaN <- params[["alphaN"]]
  alphaP <- params[["alphaP"]]

  # No initial value (Q0 is NaN) and first pick of this option:
  # take the utility directly as the value.
  # Uses the documented `is.fp` argument; the original referenced an
  # undefined variable `first`.
  if (is.nan(Q0) && is.fp) {
    update <- utility
    hidden[1] <- "first"
    return(list(output = update, hidden = hidden))
  }

  # Determine the model currently in use based on which parameters are free.
  if (
    system == "RL" && !(is.null(alpha)) && is.null(alphaN) && is.null(alphaP)
  ) {
    model <- "TD"
  } else if (
    system == "RL" && is.null(alpha) && !(is.null(alphaN)) && !(is.null(alphaP))
  ) {
    model <- "RSTD"
  } else if (
    system == "WM"
  ) {
    model <- "WM"
  } else {
    stop("Unknown Model! Please modify your learning rate function")
  }

  # TD: a single learning rate
  if (model == "TD") {
    update <- qvalue + alpha * (utility - qvalue)
  # RSTD: separate learning rates for negative and non-negative prediction errors
  } else if (model == "RSTD" && utility < qvalue) {
    update <- qvalue + alphaN * (utility - qvalue)
  } else if (model == "RSTD" && utility >= qvalue) {
    update <- qvalue + alphaP * (utility - qvalue)
  # WM: store the reward directly
  } else if (model == "WM") {
    update <- reward
  }

  return(list(output = update, hidden = hidden))
}
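To make the difference between the TD and RSTD branches concrete, the sketch below runs both update rules over the same short sequence of utilities. The helper `rstd_update`, the learning rates, and the utility sequence are all hypothetical and chosen only for illustration; this is not the package's own simulation code.

```r
# Hypothetical helper (not part of the package): risk-sensitive update that
# picks the learning rate by the sign of the prediction error, mirroring
# the RSTD branch (alphaN when utility < qvalue, alphaP otherwise).
rstd_update <- function(q, u, alpha_n, alpha_p) {
  pe <- u - q
  rate <- if (pe < 0) alpha_n else alpha_p
  q + rate * pe
}

utilities <- c(1, 0, 1, 0)  # illustrative sequence of subjective values
q_td <- 0.5
q_rstd <- 0.5
for (u in utilities) {
  # TD: one learning rate for every prediction error
  q_td <- q_td + 0.3 * (u - q_td)
  # RSTD: stronger learning from negative than from positive errors
  q_rstd <- rstd_update(q_rstd, u, alpha_n = 0.5, alpha_p = 0.1)
}
print(c(TD = q_td, RSTD = q_rstd))
```

Because `alpha_n` is larger than `alpha_p` in this example, the RSTD estimate ends up lower than the TD estimate after the same outcomes.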