Apply soft maximum (softmax) choice rule for binary predictions
cr_softmax(x, tau)
x | A numeric vector or matrix with probabilistic predictions for actions. If
tau | A positive number (the temperature parameter) that controls how random action selection is. Large values make the actions nearly equiprobable; values close to zero produce nearly deterministic choices that approach arg max (greedy) selection.
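For orientation, the role of tau can be written out under the assumption that the function follows the standard softmax (Gibbs/Boltzmann) form described in Sutton & Barto. This page does not state the formula itself, and the symbols x_i (the prediction for action i) and n (the number of actions) are introduced here purely for illustration:

```latex
P(a_i) = \frac{\exp(x_i / \tau)}{\sum_{j=1}^{n} \exp(x_j / \tau)}
```

Under this assumed form, every P(a_i) approaches 1/n as tau grows large, and the probability concentrates on the action with the largest x_i as tau approaches zero, which matches the behavior described for tau.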
A matrix holding the probability of selecting each action, with one action per column
Sutton, R. S., & Barto, A. G. (1998). Reinforcement learning: An introduction. Cambridge, MA: MIT Press
# No examples
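Because the page ships no example, the following is a minimal sketch of how a softmax choice rule with a temperature parameter behaves. It does not call cr_softmax and is not the package's implementation: softmax_sketch is a hypothetical helper that assumes the standard form exp(x / tau) / sum(exp(x / tau)), applied to each row of a matrix with one column per action.

```r
# Hypothetical helper, NOT the package's cr_softmax implementation.
# Assumption: standard softmax, exp(x / tau) / sum(exp(x / tau)), per row.
softmax_sketch <- function(x, tau) {
  stopifnot(is.matrix(x), is.numeric(x),
            is.numeric(tau), length(tau) == 1, tau > 0)
  z <- x / tau
  z <- z - apply(z, 1, max)  # subtract each row's maximum for numerical stability
  e <- exp(z)
  e / rowSums(e)             # normalise so that each row sums to 1
}

# One trial (row) with two actions (columns) predicted at 0.8 and 0.2
x <- matrix(c(0.8, 0.2), nrow = 1)

p_hot  <- softmax_sketch(x, tau = 100)   # large tau: both actions close to 0.5
p_cold <- softmax_sketch(x, tau = 0.01)  # small tau: almost all mass on action 1
print(p_hot)
print(p_cold)
```

Whether cr_softmax treats vector input, missing values, or binary predictions in the same way is not stated on this page, so the sketch should be read only as an illustration of the temperature effect.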