R/document_estimate.R

#' @title Estimate Methods
#' @name estimate
#' @description
#'
#'  The method used for parameter estimation, including \code{"MLE"}
#'    (Maximum Likelihood Estimation), \code{"MAP"} (Maximum A Posteriori),
#'    \code{"ABC"} (Approximate Bayesian Computation), and \code{"RNN"}
#'    (Recurrent Neural Network).
#'
#' @section Class:
#' \code{estimate [Character]}
#'
#' @section 1. Likelihood Based Inference (LBI):
#'  This estimation approach is adopted when latent rules are absent and human
#'    behavior aligns with the value update objective. In other words, it is the
#'    estimation method employed when the log-likelihood can be calculated.
#'
#' \subsection{1.1 Maximum Likelihood Estimation (MLE)}{
#'  Log-likelihood reflects the similarity between the human's observed choice
#'    and the model's prediction. The free parameters (e.g., learning rate)
#'    govern the entire Markov Decision Process, thereby controlling the
#'    returning log-likelihood value. Maximum Likelihood Estimation (MLE) then
#'    involves finding the set of free parameters that maximizes the sum of the
#'    log-likelihoods across all trials.
#'
#'  The search for these optimal parameters can be accomplished using various
#'    algorithms (e.g. GenSA, GA, NLOPT, ...). For details, please refer to the 
#'    documentation for \link[multiRL]{algorithm}.
#'
#' \enumerate{
#'    \item The Markov Decision Process (MDP) continuously updates the expected
#'          value of each action.
#'    \item These expected values are transformed into action probabilities using
#'          the soft-max function.
#'    \item The log-probability of each action is calculated.
#'    \item The trial log-likelihood is the product of the (one-hot encoded)
#'          human action and the log-probabilities estimated by the model;
#'          summing these values across trials yields the total log-likelihood.
#' }
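#'
#'  The per-trial computation can be sketched in plain R as follows (the
#'    variable names and values here are invented for illustration and are
#'    not the package's internal code):
#'
#' \preformatted{ # expected values of two actions and an inverse temperature
#'  Q <- c(0.2, 0.8)
#'  tau <- 3
#'  # soft-max: transform expected values into action probabilities
#'  p <- exp(tau * Q) / sum(exp(tau * Q))
#'  # one-hot vector of the human's observed choice (action 2 chosen)
#'  a <- c(0, 1)
#'  # trial log-likelihood: product of choice vector and log-probabilities
#'  ll_trial <- sum(a * log(p))
#'  # MLE maximizes the sum of ll_trial across all trials
#' }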
#' }
#'
#' \subsection{1.2 Maximum A Posteriori (MAP)}{
#'
#'  Maximum A Posteriori (MAP) is an extension of Maximum Likelihood Estimation
#'    (MLE). In addition to optimizing parameters for each individual subject
#'    based on the likelihood, Maximum A Posteriori incorporates information
#'    about the population distribution of the parameters.
#'
#'  The search for these optimal parameters can be performed using the same 
#'    algorithms as those employed in MLE. For details, please refer to the 
#'    documentation for \link[multiRL]{algorithm}.
#'
#' \enumerate{
#'    \item Perform an initial Maximum Likelihood Estimation (MLE) to find the
#'          best-fitting parameters for each individual subject.
#'    \item Use these best-fitting parameters to estimate the Probability
#'          Density Function of the population-level parameter distribution.
#'          (The Expectation-Maximization with Maximum A Posteriori estimation
#'          (EM-MAP) framework is inspired by the
#'          \href{https://github.com/sjgershm/mfit}{\code{sjgershm/mfit}}
#'          repository. However, unlike \code{mfit}, which typically assumes
#'          a normal distribution for the posterior, here the posterior
#'          density is derived from the specific prior distribution. For
#'          example, if the prior follows an exponential distribution, the
#'          estimation remains within the exponential family rather than
#'          being forced into a normal distribution.)
#'    \item Perform Maximum Likelihood Estimation (MLE) again for each subject.
#'          However, instead of returning the log-likelihood, the returned
#'          value is the log-posterior. In other words, this step considers
#'          the probability of the best-fitting parameter occurring within its
#'          derived population distribution. This penalization helps avoid
#'          finding extreme parameter estimates.
#'    \item The above steps are repeated until the log-posterior converges.
#' }
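#'
#'  The penalized objective in step 3 can be sketched as follows (the prior
#'    family, the parameter name \code{eta}, and all values are invented for
#'    illustration):
#'
#' \preformatted{ # log-likelihood of one subject's best-fitting parameters
#'  log_lik <- -120.5
#'  # population-level prior from step 2, here an exponential density
#'  eta <- 0.35
#'  log_prior <- dexp(eta, rate = 4, log = TRUE)
#'  # step 3 maximizes the log-posterior instead of the log-likelihood
#'  log_post <- log_lik + log_prior
#' }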
#' }
#'
#' @section 2. Simulation Based Inference (SBI):
#'    Simulation-Based Inference (SBI) can be employed when calculating the
#'    log-likelihood is impossible or computationally intractable. It
#'    generally seeks a direct relationship between the behavioral data and
#'    the parameters, without compressing the behavioral data into a single
#'    value (the log-likelihood).
#'
#' \subsection{2.1 Approximate Bayesian Computation (ABC)}{
#'
#'  The Approximate Bayesian Computation (ABC) model is trained by finding a
#'    mapping between the summary statistics and the free parameters. Once the
#'    model is trained, given a new set of summary statistics, the model can
#'    instantly determine the corresponding input parameters.
#'
#'  An excessive number of options or blocks in an experiment often leads to an 
#'    information overload in summary statistics, resulting in the curse of 
#'    dimensionality. In such cases, dimensionality reduction techniques like 
#'    PCA or PLS are required. For details, please refer to the documentation 
#'    for \link[multiRL]{reduction}.
#'
#' \enumerate{
#'    \item Generate a large amount of simulated data using randomly sampled
#'          input parameters.
#'    \item Compress the simulated data into summary statistics, for instance
#'          by calculating the proportion of times each action was executed
#'          within different blocks.
#'    \item Establish the mapping between these summary statistics and the
#'          input parameters, which constitutes training the Approximate
#'          Bayesian Computation (ABC) model.
#'    \item Given a new set of summary statistics, the trained model outputs
#'          the input parameters most likely to have generated those statistics.
#' }
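#'
#'  A toy version of this workflow, using a linear model as the mapping (the
#'    parameter \code{eta}, the summary statistic, and the regression model
#'    are all invented for illustration):
#'
#' \preformatted{ # steps 1-2: simulate and compress into a summary statistic
#'  eta  <- runif(1000)                    # randomly sampled parameter
#'  stat <- eta + rnorm(1000, sd = 0.05)   # e.g. a choice proportion
#'  # step 3: train the mapping from summary statistic to parameter
#'  fit <- lm(eta ~ stat)
#'  # step 4: recover the parameter behind a new summary statistic
#'  predict(fit, newdata = data.frame(stat = 0.6))
#' }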
#' }
#'
#' \subsection{2.2 Recurrent Neural Network (RNN)}{
#'
#'  The Recurrent Neural Network (RNN) directly seeks a mapping between the
#'    simulated dataset itself and the input free parameters. When provided
#'    with new behavioral data, the trained model can estimate the input
#'    parameters most likely to have generated that specific dataset.
#'
#'  For the recurrent layer, users can choose between \code{GRU} and \code{LSTM}. 
#'    Subsequently, the loss function can be selected from a variety of options
#'    (e.g. MSE, MAE, NLL, ...). For details, please refer to the documentation 
#'    for \link[multiRL]{layer}.
#'
#' \itemize{
#'    \item The Recurrent Neural Network (RNN) component included in
#'          \code{multiRL} is merely a shell for TensorFlow. Consequently,
#'          users who intend to use \code{estimate = "RNN"} must first install
#'          TensorFlow.
#' }
#'
#'  The Recurrent Neural Network (RNN) model is trained using only \code{state}
#'    and \code{action} data as the raw dataset by default. In other words,
#'    the developer assumes that the only necessary input information for the
#'    Recurrent Neural Network (RNN) comprises the trial-by-trial object
#'    presentation (the state) and the agent's resultant action. This
#'    constraint is adopted because excessive input information may not only
#'    interfere with model training but also lead to unnecessary time
#'    consumption.
#'
#' \enumerate{
#'    \item The raw simulated data is limited to the state (object information
#'    presented on each trial) and the action chosen by the agent in response
#'    to that state.
#'    \item After the simulated data is generated, it is partitioned into a
#'    training set and a validation set, and the RNN training commences.
#'    \item The iteration stops when both the training and validation sets
#'    converge. If the loss (e.g. MSE) of the validation set is high
#'    while the loss of the training set is low, this indicates overfitting,
#'    suggesting that the Recurrent Neural Network (RNN) model may lack
#'    generalization ability.
#'    \item Given a new dataset, the trained model infers the input parameters
#'    that are most likely to have generated that dataset.
#' }
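#'
#'  The default raw dataset in step 1 therefore reduces to two aligned
#'    trial-by-trial sequences per simulated subject, and step 2 is a plain
#'    hold-out split (variable names invented for illustration):
#'
#' \preformatted{ # trial-by-trial input for one simulated subject
#'  state  <- c(1, 2, 1, 2)    # which object set was presented
#'  action <- c(1, 1, 2, 1)    # which action the agent chose
#'  # step 2: hold out part of the simulated subjects for validation
#'  n_subjects <- 1000
#'  train_idx  <- sample(n_subjects, size = 0.8 * n_subjects)
#'  # the held-out subjects are used to detect overfitting during training
#' }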
#' }
#'
#' @section Example:
#' \preformatted{ # supported estimate methods
#'  # Maximum Likelihood Estimation
#'  estimate = "MLE"
#'  # Maximum A Posteriori
#'  estimate = "MAP"
#'  # Approximate Bayesian Computation
#'  estimate = "ABC"
#'  # Recurrent Neural Network
#'  estimate = "RNN"
#' }
#'
NULL
