run_m | R Documentation
This function is designed to construct and customize reinforcement learning models.
Items for model construction:
Data Input and Specification: You must provide the raw dataset for analysis. Crucially, you need to tell the run_m function the corresponding column names within your dataset (see the example datasets Mason_2024_G1 and Mason_2024_G2). Because the task is a game, it is critical that your dataset includes the rewards of both the human-chosen option and the unchosen option.
Customizable RL Models: This function allows you to define and adjust the number of free parameters to create various reinforcement learning models.
Value Function:
Learning Rate: By adjusting the number of eta parameters, you can construct basic reinforcement learning models such as Temporal Difference (TD) and Risk-Sensitive Temporal Difference (RSTD). You can also adjust func_eta directly to define your own custom learning rate function (a conceptual sketch of these update rules follows this list).
Utility Function: You can directly adjust the form of func_gamma to incorporate the principles of Kahneman's Prospect Theory. Currently, the built-in func_gamma only takes the form of a power function, consistent with Stevens' Power Law.
Exploration–Exploitation Trade-off:
Initial Values: This involves setting the initial expected value for each option when it hasn't been chosen yet. A higher initial value encourages exploration.
Epsilon: Adjusting the threshold, epsilon, and lambda parameters can produce exploration strategies such as epsilon-first, epsilon-greedy, or epsilon-decreasing.
Upper-Confidence-Bound: The pi parameter controls the degree of exploration by scaling the uncertainty bonus given to less-explored options.
Soft-Max: The inverse temperature parameter tau controls the agent's sensitivity to value differences. A higher tau places greater emphasis on value differences, leading to more exploitation; a smaller tau indicates a greater tendency toward exploration (see the exploration sketch below).
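A minimal conceptual sketch in R of the value-function pieces described above (TD and RSTD learning rules and a power-law utility). These are illustrative stand-ins, not the package's internal func_eta or func_gamma; all function and argument names here are assumptions.

# Conceptual sketch only; not the package's internal implementation.
# TD update: a single learning rate eta for every prediction error.
update_td <- function(V, R, eta) {
  V + eta * (R - V)
}
# RSTD update: separate learning rates for positive and negative
# prediction errors (which eta maps to which sign is an assumption).
update_rstd <- function(V, R, eta_neg, eta_pos) {
  pe <- R - V
  if (pe >= 0) V + eta_pos * pe else V + eta_neg * pe
}
# Power-law utility in the spirit of Stevens' Power Law; the sign/abs
# handling of losses is an assumption.
utility_power <- function(R, gamma) {
  sign(R) * abs(R)^gamma
}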
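Likewise, a sketch of the exploration mechanisms: an epsilon-greedy choice rule and a soft-max choice probability with inverse temperature tau. Again, these are illustrative, not the package's func_epsilon or func_tau.

# Epsilon-greedy: with probability epsilon choose at random,
# otherwise choose the option with the higher expected value.
choose_egreedy <- function(V_L, V_R, epsilon) {
  if (runif(1) < epsilon) {
    sample(c("L", "R"), 1)
  } else if (V_L >= V_R) {
    "L"
  } else {
    "R"
  }
}
# Soft-max: probability of choosing the left option; a higher tau
# (inverse temperature) means stronger exploitation of value differences.
p_left_softmax <- function(V_L, V_R, tau) {
  1 / (1 + exp(-tau * (V_L - V_R)))
}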
Objective Function Format for Optimization: Once your model is defined with run_m, it must be structured as an objective function that accepts params as input and returns a loss value (typically logL). This format ensures compatibility with the optimization algorithm packages, which use it to estimate the optimal parameters. For examples of the standard objective function format, see TD, RSTD, and Utility.
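Below is a minimal sketch of this objective-function format, assuming the example dataset Mason_2024_G2 and subject 18 from the example section; the element that stores the loss in the returned binaryRL object is an assumption here, so treat TD, RSTD, and Utility as the canonical templates.

# Sketch of the objective-function format; see binaryRL::TD for the
# canonical template shipped with the package.
obj_td <- function(params) {
  res <- binaryRL::run_m(
    mode = "fit",            # assumed to be the mode used for fitting
    data = binaryRL::Mason_2024_G2,
    id = 18,                 # illustrative subject ID
    eta = params[1],         # single learning rate -> TD-style model
    tau = params[2],         # inverse temperature for the soft-max
    n_params = 2,
    n_trials = 360
  )
  # Assumption: the loss (logL) is stored in an element of the returned
  # binaryRL object; replace this accessor with the one used in the
  # package's own TD template.
  res$logL
}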
For more information, please refer to the homepage of this package: https://yuki-961004.github.io/binaryRL/
run_m(
name = NA,
mode = c("simulate", "fit", "replay"),
policy = c("on", "off"),
data,
id,
n_params,
n_trials,
gamma = 1,
eta,
initial_value = NA_real_,
threshold = 1,
epsilon = NA,
lambda = NA,
pi = NA,
tau = NA,
lapse = 0.02,
alpha = NA,
beta = NA,
priors = NULL,
util_func = func_gamma,
rate_func = func_eta,
expl_func = func_epsilon,
bias_func = func_pi,
prob_func = func_tau,
loss_func = func_logl,
sub = "Subject",
time_line = c("Block", "Trial"),
L_choice = "L_choice",
R_choice = "R_choice",
L_reward = "L_reward",
R_reward = "R_reward",
sub_choose = "Sub_Choose",
rob_choose = "Rob_Choose",
raw_cols = NULL,
var1 = NA_character_,
var2 = NA_character_,
seed = 123,
digits_1 = 2,
digits_2 = 5,
engine = "cpp"
)
name |
[string] The name of your RL model |
mode |
[string] This parameter controls the function's operational mode. It has three possible values: "simulate", "fit", and "replay", each typically associated with a specific higher-level function of the package.
In most cases, you won't need to modify this parameter directly, as suitable default values are set for different contexts. |
policy |
[string] Specifies the learning policy to be used. This determines how the model updates action values based on observed or simulated choices. It can be either "on" or "off". |
data |
[data.frame] This data should include the following mandatory columns (default column names, which can be changed via the arguments below): "Subject", "Block", "Trial", "L_choice", "R_choice", "L_reward", "R_reward", "Sub_Choose". |
id |
[string] Which subject is going to be analyzed. The value should correspond to an entry in the "sub" column, which must contain the subject IDs. e.g. id = 18 |
n_params |
[integer] The number of free parameters in your model. |
n_trials |
[integer] The total number of trials in your experiment. |
gamma |
[NumericVector] Note: This should not be confused with the discount rate parameter (also named gamma) found in Temporal Difference (TD) models; the Rescorla-Wagner model does not include a discount rate. Here, gamma is the parameter of the utility function (func_gamma), which by default is a power function consistent with Stevens' Power Law.
default: gamma = 1 |
eta |
[NumericVector] Parameters used in the Learning Rate Function (func_eta). The structure of eta determines the basic model:
TD: a single learning rate, e.g. eta = 0.5. RSTD: two learning rates, one for better-than-expected and one for worse-than-expected outcomes, e.g. eta = c(0.3, 0.7). |
initial_value |
[double] Subject's initial expected value for each stimulus's reward. If this value is not set, the package's default initialization is used.
default: NA_real_ |
threshold |
[integer] Controls the initial exploration phase in the epsilon-first strategy. This is the number of early trials in which the subject makes purely random choices, as they haven't yet learned the options' values. For example, threshold = 20 means the first 20 trials are chosen at random.
default: threshold = 1 |
epsilon |
[NumericVector] A parameter used in the epsilon-greedy exploration strategy. It defines the probability of making a completely random choice, as opposed to choosing based on the relative values of the left and right options. For example, if epsilon = 0.1, the subject makes a completely random choice on 10 percent of trials.
default: epsilon = NA |
lambda |
[NumericVector] A numeric value that controls the decay rate of the exploration probability in the epsilon-decreasing strategy. A higher lambda leads to a faster decay of the exploration probability across trials.
default: lambda = NA |
pi |
[NumericVector] Parameter used in the Upper-Confidence-Bound (UCB) action selection formula. It scales the uncertainty bonus given to less-explored options, thereby controlling the degree of exploration.
default: pi = NA |
tau |
[NumericVector] Parameters used in the Soft-Max Function (func_tau), i.e., the inverse temperature controlling sensitivity to value differences.
default: tau = NA |
lapse |
[double] A numeric value between 0 and 1, representing the lapse rate. You can interpret this parameter as the probability of the agent "slipping" or making a random choice, irrespective of the learned action values. This accounts for moments of inattention or motor errors. In this sense, it represents the minimum probability with which any given option will be selected. It is a free parameter that acknowledges that individuals do not always make decisions with full concentration throughout an experiment. From a modeling perspective, the lapse rate is crucial for preventing the log-likelihood calculation from returning -Inf when the model assigns zero probability to an observed choice.
default: lapse = 0.02. This ensures each option has a minimum selection probability of 1 percent in TAFC tasks. |
alpha |
[NumericVector] Extra parameters that may be used in user-defined functions. |
beta |
[NumericVector] Extra parameters that may be used in user-defined functions. |
priors |
[list] A list specifying the prior distributions for the model parameters. This argument is mandatory when using estimation methods that require prior distributions.
default: priors = NULL |
util_func |
[Function] Utility Function, see func_gamma. |
rate_func |
[Function] Learning Rate Function, see func_eta. |
expl_func |
[Function] Exploration Strategy Function, see func_epsilon. |
bias_func |
[Function] Upper-Confidence-Bound Function, see func_pi. |
prob_func |
[Function] Soft-Max Function, see func_tau. |
loss_func |
[Function] Loss Function, see func_logl. |
sub |
[string] Column name of subject ID. default: "Subject" |
time_line |
[CharacterVector] A vector specifying the names of the columns that define the sequence of the experiment. This argument defines how the experiment is structured, such as whether it is organized into blocks ("Block") with breaks in between and multiple trials within each block. default: c("Block", "Trial") |
L_choice |
[string] Column name of left choice. default: "L_choice" |
R_choice |
[string] Column name of right choice. default: "R_choice" |
L_reward |
[string] Column name of the reward of the left choice. default: "L_reward" |
R_reward |
[string] Column name of the reward of the right choice. default: "R_reward" |
sub_choose |
[string] Column name of choices made by the subject. default: "Sub_Choose" |
rob_choose |
[string] Column name of choices made by the model, which you can ignore. default: "Rob_Choose" |
raw_cols |
[CharacterVector] Defaults to NULL. |
var1 |
[string] Column name of extra variable 1. If your model uses more than just reward and expected value and you need other information, such as whether the choice frame is Gain or Loss, you can pass the "Frame" column as var1 into the model. default: NA_character_ |
var2 |
[string] Column name of extra variable 2. If one additional variable, var1, does not meet your needs, you can add another additional variable, var2, to your model. default: NA_character_ |
seed |
[integer] Random seed. This ensures that the results are reproducible and remain the same each time the function is run. default: 123 |
digits_1 |
[integer] The number of decimal places to retain for columns related to the value function. default: 2 |
digits_2 |
[integer] The number of decimal places to retain for columns related to the selection function. default: 5 |
engine |
[string] The computational engine used to run the model. default: "cpp" |
A list of class binaryRL
containing the results of the model fitting.
data <- binaryRL::Mason_2024_G2
binaryRL.res <- binaryRL::run_m(
mode = "replay",
data = data,
id = 18,
eta = c(0.321, 0.765),
tau = 0.5,
n_params = 3,
n_trials = 360
)
summary(binaryRL.res)
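As a hedged variation (parameter values are purely illustrative), the same replay with a single eta and n_params = 2 corresponds to a basic TD-style model rather than the two-learning-rate RSTD-style model above:

# Illustrative only: a single learning rate gives a TD-style model
# (two free parameters: eta and tau).
binaryRL.td <- binaryRL::run_m(
  mode = "replay",
  data = data,
  id = 18,
  eta = 0.5,   # illustrative value
  tau = 0.5,   # illustrative value
  n_params = 2,
  n_trials = 360
)
summary(binaryRL.td)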