func_delta: Function: Bias
In multiRL: Reinforcement Learning Tools for Multi-Armed Bandit

Description

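Computes the selection bias for each option. The default is an Upper Confidence Bound (UCB) term:

\text{Bias} = \delta \cdot \sqrt{\frac{\log(N + e)}{N + 10^{-10}}}

where N is `count`, \delta is `params[["delta"]]`, and the 10^{-10} term avoids division by zero when N = 0. The function body also adds `params[["sticky"]]` once for each stickiness indicator that applies to an option (same latent, same simulated action, same position).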
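
As a rough illustration of the UCB term above, the following sketch evaluates it for a few selection counts. The helper `ucb_bias`, the counts, and `delta = 0.1` are hypothetical and are not part of multiRL.

```r
# Hypothetical helper: reproduces only the UCB term of the formula above.
ucb_bias <- function(count, delta) {
  delta * sqrt(log(count + exp(1)) / (count + 1e-10))
}

counts <- c(0, 1, 5, 50)               # made-up selection counts, one per option
bias   <- ucb_bias(counts, delta = 0.1)

# The bias shrinks as an option is chosen more often.
print(signif(bias, 3))
```

An option that has never been chosen (count 0) gets a very large bias, roughly `delta * 1e5`, because the denominator is then only the `1e-10` guard.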

Usage

func_delta(shown, count, rownum, params, hidden, ...)

Arguments

`shown`	Which options are shown in this trial.
`count`	How many times this action has been executed.
`rownum`	The trial number.
`params`	Parameters used by the model's internal functions, see params
`hidden`	All hidden variables within the MDP process belong here.
`...`	Additional information. It currently contains the following; more may be added in future package versions.
  - idinfo: subid, block, trial
  - exinfo: information whose column names are specified by the user, such as Frame, RT, NetWorth, ...
  - behave: includes the following:
      - action: the behavior performed by the human in the given trial.
      - latent: the object updated by the agent in the given trial.
      - simulation: the actual behavior performed by the agent.
      - position: the position of the stimulus on the screen.
  - cue and rsp: cues and responses within latent learning rules, see behrule
  - state: the stimuli shown in the current trial, split into components by underscores, and the rewards associated with them.
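
The function body unpacks `...` with `list2env()`, so each named element becomes a local variable. The sketch below shows that mechanism in isolation. `demo_dots` and all input values are hypothetical, and passing `idinfo` and `behave` as plain vectors is an assumption about their shape.

```r
# Hypothetical function: receives extra information through `...` and
# unpacks it the same way as the body of func_delta.
demo_dots <- function(...) {
  # Each named argument in `...` becomes a local variable.
  list2env(list(...), envir = environment())
  # Index by position, since names may be lost when called from C++.
  list(trial = idinfo[3], action = behave[1])
}

res <- demo_dots(
  idinfo = c(1, 2, 7),             # made-up subid, block, trial
  behave = c("A", "A", "B", "1")   # made-up action, latent, simulation, position
)
print(res)
```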

Value

A list with two elements:

output [NumericVector]

A numeric vector representing the bias associated with each option. By default, it follows an Upper Confidence Bound (UCB) scheme, where options selected less frequently receive larger bias values.

Alternative biasing strategies are also supported, such as stickiness to the previously chosen option, the last chosen position, or the most recently updated template.
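
As a minimal sketch of one of these stickiness terms, the following mirrors how the function body marks the previously chosen position. All values are made up, and storing the position as text is an assumption based on the `as.numeric()` conversion in the body.

```r
shown    <- c(1, 2, 3)   # made-up options shown in this trial
position <- "2"          # made-up previous position, as read from behave[4]
sticky   <- 0.5          # made-up value for params[["sticky"]]

# No previous position (NA) means no stickiness for any option.
if (is.na(position)) {
  last_position <- shown * 0
} else {
  last_position <- as.numeric(shown == as.numeric(position))
}

sticky_bias <- sticky * last_position
print(sticky_bias)
```

Only the option matching the previous position receives the `sticky` bonus; the others receive zero.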

The bias only affects the probability of selecting an option, and does not influence value updating.
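
This page does not show how multiRL turns values and biases into choice probabilities. The sketch below assumes a softmax choice rule and a simple delta-rule update, purely to illustrate the separation described above. The names `softmax`, `value`, `alpha`, and `reward`, and all numbers, are hypothetical.

```r
# Assumption: softmax choice rule (not confirmed to be what multiRL uses).
softmax <- function(x) {
  z <- exp(x - max(x))   # subtract the max for numerical stability
  z / sum(z)
}

value <- c(0.2, 0.5)     # made-up learned values for two options
bias  <- c(0.3, 0.0)     # made-up bias, e.g. as returned in `output`

p_no_bias <- softmax(value)          # choice probabilities without bias
p_bias    <- softmax(value + bias)   # bias enters only at choice time

# Assumption: delta-rule update; it uses the reward and the stored
# value only, so the bias never changes what is learned.
alpha  <- 0.1
reward <- 1
chosen <- 1
value[chosen] <- value[chosen] + alpha * (reward - value[chosen])
```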

hidden [CharacterVector]

User-defined internal variables generated by this function. These represent intermediate (latent) states produced during computation, which can be read or modified by other functions in the MDP process.

Body

func_delta <- function(
    shown,
    count,
    rownum,
    params,
    hidden,
    ...
){
  
  list2env(list(...), envir = environment())
  
  # Extra information is available through `...` (unpacked above).
  # Column names may be lost when called from C++, so index by position.
  # e.g.
  # Trial  <- idinfo[3]
  # Frame  <- exinfo[1]
  # Action <- behave[1]
  
  # Sticky to the same latent
  latent <- behave[2]
  if (is.na(latent)) {
    last_latent <- shown * 0
  } else {
    last_latent <- as.numeric(!is.na(shown)) * as.numeric(cue %in% latent)
  }
  # Sticky to the same action(simulation)
  simulation <- behave[3]
  if (is.na(simulation)) {
    last_simulation <- shown * 0
  } else {
    last_simulation <- as.numeric(
      rowSums(state[shown, , drop = FALSE] == simulation) > 0
    )
  }
  # Sticky to the same position
  position <- behave[4]
  if (is.na(position)) {
    last_position <- shown * 0
  } else {
    last_position <- as.numeric(shown == as.numeric(position))
  }
  
  delta     <- params[["delta"]]
  sticky    <- params[["sticky"]]
  
  # Upper-Confidence-Bound
  bias <- delta * sqrt(log(count + exp(1)) / (count + 1e-10)) + 
    # Sticky to the same latent
    sticky * last_latent +
    # Sticky to the same action(simulation)
    sticky * last_simulation +
    # Sticky to the same position
    sticky * last_position 
  
  return(list(output = bias, hidden = hidden)) 
}

multiRL documentation built on June 9, 2026, 5:09 p.m.