func_epsilon: Function: Exploration or Exploitation
In multiRL: Reinforcement Learning Tools for Multi-Armed Bandit

View source: R/func_epsilon.R

Description

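Decides on each trial whether the agent explores (x = 1) or exploits (x = 0), where i is the trial number. Three strategies are supported.

\epsilon-first: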

P(x) = \begin{cases} \mathbb{1}(i \le \text{threshold}), & x=1 \\ \mathbb{1}(i > \text{threshold}), & x=0 \end{cases}

\epsilon-greedy:

P(x) = \begin{cases} \epsilon, & x=1 \\ 1-\epsilon, & x=0 \end{cases}

\epsilon-decreasing:

P(x) = \begin{cases} \frac{1}{1+\epsilon \cdot i}, & x=1 \\ \frac{\epsilon \cdot i}{1+\epsilon \cdot i}, & x=0 \end{cases}
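As a rough illustration, the three schedules above can be written as one function that returns the probability of exploring (x = 1) on trial i. The helper name `explore_prob` and its argument names are hypothetical and not part of multiRL. The sketch follows the formulas only; it leaves out the forced exploratory first trial that the \epsilon-greedy branch of the function body implies through threshold = 1.

```r
# Hypothetical helper (not part of multiRL): probability of exploring on trial i.
explore_prob <- function(model, i, epsilon = NA_real_, threshold = NA_real_) {
  switch(
    model,
    first      = as.numeric(i <= threshold),  # explore on early trials only
    greedy     = rep(epsilon, length(i)),     # constant exploration rate
    decreasing = 1 / (1 + epsilon * i),       # rate decays with the trial number
    stop("unknown model: ", model)
  )
}

trials <- 1:5
explore_prob("first", trials, threshold = 2)
explore_prob("greedy", trials, epsilon = 0.1)
explore_prob("decreasing", trials, epsilon = 0.25)
```

Under \epsilon-decreasing, a larger \epsilon makes exploration fade faster, which is the opposite of its role under \epsilon-greedy, where a larger \epsilon means more exploration.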

Usage

func_epsilon(shown, rownum, params, hidden, ...)

Arguments

`shown`	The options shown in this trial.
`rownum`	The trial number.
`params`	Parameters used by the model's internal functions, see params
`hidden`	All hidden variables within the MDP process belong here.
`...`	Additional information. It currently contains the following; more may be added in future package versions.
	idinfo: subid, block, trial
	exinfo: information whose column names are specified by the user, for example Frame, RT, NetWorth, ...
	behave: includes the following:
		action: the behavior performed by the human in the given trial.
		latent: the object updated by the agent in the given trial.
		simulation: the actual behavior performed by the agent.
		position: the position of the stimulus on the screen.
	cue and rsp: cues and responses within latent learning rules, see behrule
	state: the stimuli shown in the current trial (split into components by underscores) and the rewards associated with them.
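The sketch below shows one way the extra information in `...` might be unpacked and read by position, mirroring the `list2env()` call and the index-based comments in the function body. The function `peek_dots` is hypothetical, and the input vectors are made up for illustration; the actual types and contents that multiRL passes are not specified here.

```r
# Hypothetical demo function (not part of multiRL) showing how `...` is unpacked.
peek_dots <- function(shown, rownum, params, hidden, ...) {
  # Copy every named element of `...` into this function's environment.
  list2env(list(...), envir = environment())
  # Read by position, since element names may have been dropped upstream.
  list(trial = idinfo[3], frame = exinfo[1], action = behave[1])
}

# Made-up inputs, for illustration only.
res <- peek_dots(
  shown = c("A", "B"), rownum = 1, params = list(), hidden = character(0),
  idinfo = c("s01", "1", "7"),
  exinfo = c("Gain"),
  behave = c("A", "A", "B", "left")
)
res$trial
res$action
```

Positional access trades readability for robustness: it keeps working if names are lost, but it silently breaks if the order of elements changes.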

Value

A list with two elements:

output [int]

Either 1 (explore) or 0 (exploit) on the current trial.

hidden [CharacterVector]

User-defined internal variables generated by this function. These represent intermediate (latent) states produced during computation, which can be read or modified by other functions in the MDP process.
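A minimal sketch of how the returned list might be consumed is given below. The mock value only imitates the documented shape; its contents, the element name `note`, and the helper `label_choice` are hypothetical. The mapping of 1 to exploration is read off the function body, where the drawn value is compared against the exploration probability.

```r
# Mock return value with the documented shape (contents are made up).
res <- list(output = 1L, hidden = c(note = "example"))

# Hypothetical helper (not part of multiRL): label the decision.
label_choice <- function(res) {
  if (res$output == 1L) "explore" else "exploit"
}

label_choice(res)
```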

Body

func_epsilon <- function(
    shown,
    rownum,
    params,
    hidden,
    ...
){

  list2env(list(...), envir = environment())
  
  # If you need the extra information passed through `...`:
  # column names may be lost (C++), so access by index is recommended
  # e.g.
  # Trial  <- idinfo[3]
  # Frame  <- exinfo[1]
  # Action <- behave[1]
  
  epsilon   <-  params[["epsilon"]]
  threshold <-  params[["threshold"]]
  
  # Determine the model currently in use based on which parameters are free.
  if (is.na(epsilon) && threshold > 0) {
    model <- "first"
  } else if (!(is.na(epsilon)) && threshold == 0) {
    model <- "decreasing"
  } else if (!(is.na(epsilon)) && threshold == 1) {
    model <- "greedy"
  } else {
    stop("Unknown model! Please modify your epsilon function")
  }
  
  set.seed(rownum)
  # Epsilon-First: 
  if (rownum <= threshold) {
    try <- 1
  } else if (rownum > threshold && model == "first") {
    try <- 0
    # Epsilon-Greedy:
  } else if (rownum > threshold && model == "greedy"){
    try <- as.integer(stats::runif(1) < epsilon)
    # Epsilon-Decreasing: 
  } else if (rownum > threshold && model == "decreasing") {
    prob_explore <- 1 / (1 + epsilon * rownum)
    try <- as.integer(stats::runif(1) < prob_explore)
  }
  
  return(list(output = try, hidden = hidden)) 
}
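The branching at the top of the function body can be read as a lookup from the (epsilon, threshold) pair to a strategy. The sketch below restates that rule as a standalone helper and also illustrates the effect of calling `set.seed(rownum)` before drawing. The names `detect_model` and `draw` are hypothetical and not part of multiRL; unlike the original, the helper returns `NA` instead of stopping on an unrecognised combination.

```r
# Hypothetical helper (not part of multiRL) restating the selection rule above.
detect_model <- function(params) {
  epsilon   <- params[["epsilon"]]
  threshold <- params[["threshold"]]
  if (is.na(epsilon) && threshold > 0) {
    "first"        # epsilon unused; threshold sets the exploration phase
  } else if (!is.na(epsilon) && threshold == 0) {
    "decreasing"
  } else if (!is.na(epsilon) && threshold == 1) {
    "greedy"
  } else {
    NA_character_  # the original function calls stop() here
  }
}

detect_model(list(epsilon = NA,  threshold = 20))
detect_model(list(epsilon = 0.1, threshold = 1))
detect_model(list(epsilon = 0.1, threshold = 0))
detect_model(list(epsilon = 0.1, threshold = 5))

# Seeding with the trial number makes the random draw for a given trial repeatable.
draw <- function(rownum) {
  set.seed(rownum)
  stats::runif(1)
}
identical(draw(3), draw(3))
```

Seeding by trial number makes each trial's draw reproducible across calls, but it also means that two runs with the same parameters produce identical exploration sequences. Calling `set.seed()` inside the function also resets the global random number stream for any code that runs afterwards.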

multiRL documentation built on June 9, 2026, 5:09 p.m.