# R/process_recovery_data.R
# Defines the function recovery_data(), documented below.

#' Process: Recovering Fake Data
#'
#' @description
#' This function processes the synthetic datasets generated by
#'  \code{simulate_list()}. It fits the model specified by \code{fit_model}
#'  to each of these simulated datasets; in essence, it iteratively calls
#'  \code{optimize_para()} on each generated object.
#'
#' The fitting procedure is analogous to that performed by \code{fit_p},
#'  and it likewise leverages parallel computation across datasets to
#'  substantially accelerate parameter estimation.
#'  
#' @param list [list] 
#' 
#' A list generated by function \code{simulate_list()}
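#'
#' Each element is assumed (as in the code of this function) to carry the
#'  simulated trials in \code{$data} and the true generating parameters in
#'  \code{$input}:
#'
#' \preformatted{
#' sim_i <- list[[1]]
#' sim_i$data   # simulated trial data, passed on to optimize_para()
#' sim_i$input  # true (generating) parameter values
#' }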
#' 
#' @param id [vector]
#' 
#' Specifies which subject's data to use. In parameter and model recovery
#'  analyses, the specific subject ID is often irrelevant: although the
#'  experimental trial order might vary randomly across subjects, the
#'  sequence of reward feedback is typically pseudo-random.
#'
#' When \code{id = NULL}, the program automatically detects the subject IDs
#'  present in the dataset, randomly selects one subject as a sample, and
#'  performs the parameter and model recovery procedures on that subject's
#'  data.
#'  
#' default: \code{id = 1}
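#'
#' The code also accepts a vector of IDs, one per simulated dataset; a
#'  single ID is recycled across all datasets. For example:
#'
#' \preformatted{
#' id = 1           # the same subject's trial sequence for every dataset
#' id = c(1, 2, 3)  # one subject ID per simulated dataset
#' }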
#' 
#' @param n_trials [integer] 
#' 
#' The total number of trials in your experiment.
#' 
#' @param n_params [integer] 
#' 
#' The number of free parameters in your model. 
#' 
#' @param funcs [character]
#' 
#' A character vector containing the names of all user-defined functions
#'  required for the computation. When parallel computation is enabled
#'  (i.e., `nc > 1`), user-defined models and their custom functions might
#'  not be automatically accessible within the parallel environment.
#'
#' Therefore, if you have created your own reinforcement learning model
#'  that modifies any of the package's five default functions 
#'  (default functions: 
#'     \code{util_func = \link[binaryRL]{func_gamma}}, 
#'     \code{rate_func = \link[binaryRL]{func_eta}}, 
#'     \code{expl_func = \link[binaryRL]{func_epsilon}}, 
#'     \code{bias_func = \link[binaryRL]{func_pi}}, 
#'     \code{prob_func = \link[binaryRL]{func_tau}}
#'  ), 
#'  you must explicitly provide the names of your custom functions as a 
#'  character vector here, for example as in the sketch below.
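#'
#' For example, with two hypothetical custom helpers \code{my_util()} and
#'  \code{my_rate()} replacing \code{func_gamma} and \code{func_eta}:
#'
#' \preformatted{
#' funcs = c("my_util", "my_rate")
#' }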
#' 
#' @param policy [character]
#' 
#' Specifies the learning policy to be used.
#' This determines how the model updates action values based on observed or
#'   simulated choices. It can be either \code{"off"} or \code{"on"}.
#'   
#'  \itemize{
#'   \item {
#'    \strong{Off-Policy (Q-learning): }
#'    This is the most common approach for modeling
#'     reinforcement learning in Two-Alternative Forced Choice (TAFC) tasks.
#'     In this mode, the model's goal is to learn the underlying value of
#'     each option by observing the human participant's behavior. It achieves
#'     this by consistently updating the value of the option that the
#'     human actually chose. The focus is on understanding the value 
#'     representation that likely drove the participant's decisions.
#'   }
#'   \item {
#'    \strong{On-Policy (SARSA): }
#'    In this mode, the target policy and the behavior policy are identical. 
#'     The model first computes the selection probability for each option based 
#'     on their current values. Critically, it then uses these probabilities to 
#'     sample its own action. The value update is then performed on the action 
#'     that the model itself selected. This approach focuses more on directly 
#'     mimicking the stochastic choice patterns of the agent, rather than just 
#'     learning the underlying values from a fixed sequence of actions.
#'   }
#'  }
#'  
#' default: \code{policy = "off"}
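#'
#' As a generic illustration of the on-policy sampling step described above
#'  (a minimal sketch, not the package's exact implementation):
#'
#' \preformatted{
#' q   <- c(0.2, 0.5)                       # current action values
#' tau <- 1                                 # inverse temperature
#' p   <- exp(tau * q) / sum(exp(tau * q))  # softmax probabilities
#' a   <- sample(1:2, size = 1, prob = p)   # model samples its own action
#' }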
#' 
#' @param model_name [character] 
#' 
#' The name of your model.
#'
#' @param fit_model [function] 
#' 
#' The model function to be fitted to each simulated dataset; it is passed
#'  to \code{optimize_para()} as the objective function
#'  (e.g., \code{binaryRL::RSTD}).
#'
#' @param lower [vector] 
#' 
#' Lower bounds of free parameters
#' 
#' @param upper [vector] 
#' 
#' Upper bounds of free parameters
#'
#' @param initial_params [numeric]
#' 
#' Initial values for the free parameters that the optimization algorithm will
#'  search from. These are primarily relevant when using algorithms that require
#'  an explicit starting point, such as \code{L-BFGS-B}. If not specified,
#'  the function will automatically generate initial values close to zero.
#'  
#'  default: \code{initial_params = NA}.
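#'
#' For example, for a hypothetical three-parameter model:
#'
#' \preformatted{
#' initial_params = c(0.5, 0.5, 1)
#' }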
#'
#' @param initial_size [integer]
#' 
#' This parameter corresponds to the population size in genetic 
#'  algorithms (\code{GA}). It specifies the number of initial candidate
#'  solutions that the algorithm starts with for its evolutionary search.
#'  This parameter is only required for optimization algorithms that operate on
#'  a population, such as `GA` or `DEoptim`. 
#'  
#'  default: \code{initial_size = 50}.
#'  
#' @param iteration [integer] 
#' 
#' The number of iterations the optimization algorithm will perform
#'  when searching for the best-fitting parameters during the fitting
#'  phase. A higher number of iterations may increase the likelihood of 
#'  finding a global optimum but also increases computation time.
#'  
#'  default: \code{iteration = 10}
#'  
#' @param seed [integer] 
#' 
#' Random seed. This ensures that the results are 
#'  reproducible and remain the same each time the function is run. 
#'  
#'  default: \code{seed = 123}
#'  
#' @param nc [integer]
#' 
#' Number of cores to use for parallel processing. Since fitting
#' optimal parameters for each subject is an independent task,
#' parallel computation can significantly speed up the fitting process:
#' \itemize{
#'   \item \strong{`nc = 1`}: The fitting proceeds sequentially.
#'   Parameters for one subject are fitted completely before moving
#'   to the next subject.
#'   \item \strong{`nc > 1`}: The fitting is performed in parallel
#'   across subjects. For example, if `nc = 4`, the algorithm will
#'   simultaneously fit data for four subjects. Once these are complete,
#'   it will proceed to fit the next batch of subjects (e.g., subjects
#'   5-8), and so on, until all subjects are processed.
#' }
#' 
#'  default: \code{nc = 1}
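#'
#' A common, conservative choice is to leave one core free:
#'
#' \preformatted{
#' nc = parallel::detectCores() - 1
#' }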
#' 
#' @param algorithm [character] 
#' 
#' Choose an algorithm package from
#'  \code{L-BFGS-B}, \code{GenSA}, \code{GA}, \code{DEoptim}, \code{PSO},
#'  \code{Bayesian}, \code{CMA-ES}.
#'  
#' In addition, any algorithm from the \code{nloptr} package is also
#'  supported. If your chosen \code{nloptr} algorithm requires a local search,
#'  you need to input a character vector. The first element represents
#'  the algorithm used for global search, and the second element represents
#'  the algorithm used for local search.
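#'
#' For example, a hypothetical \code{nloptr} pairing of a global search with
#'  a local refinement:
#'
#' \preformatted{
#' algorithm = c("NLOPT_GN_MLSL", "NLOPT_LN_BOBYQA")
#' }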
#'  
#' @return A data frame with one row per simulated dataset, containing the
#'  fit statistics (ACC, LogL, LogPr, LogPo, AIC, BIC), the true input
#'  parameters, and the recovered output parameters.
#' @examples
#' \dontrun{
#' binaryRL.res <- binaryRL::optimize_para(
#'   data = Mason_2024_G2,
#'   id = 1,
#'   n_params = 3,
#'   n_trials = 360,
#'   obj_func = binaryRL::RSTD,
#'   lower = c(0, 0, 0),
#'   upper = c(1, 1, 10),
#'   iteration = 100,
#'   algorithm = "L-BFGS-B"
#' )
#'
#' summary(binaryRL.res)
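#'
#' # A hypothetical sketch of calling recovery_data() itself, assuming
#' # `simulations` is the output of binaryRL::simulate_list():
#' recovery <- binaryRL::recovery_data(
#'   list = simulations,
#'   id = 1,
#'   n_trials = 360,
#'   n_params = 3,
#'   policy = "off",
#'   model_name = "RSTD",
#'   fit_model = binaryRL::RSTD,
#'   lower = c(0, 0, 0),
#'   upper = c(1, 1, 10),
#'   iteration = 100,
#'   nc = 1,
#'   algorithm = "L-BFGS-B"
#' )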
#' }
recovery_data <- function(
    list,
    id = 1,
    n_trials,
    n_params, 
    
    funcs = NULL,
    policy = "off",
    model_name,
    fit_model,
    lower,
    upper,
    
    initial_params = NA,
    initial_size = 50,
    
    iteration = 10,
    seed = 123,
    nc = 1,
    algorithm
){
  # Create an empty data frame to store the results
  recovery <- data.frame(
    fit_model = rep(model_name, length(list)),
    ACC = NA,
    LogL = NA,
    LogPr = NA,
    LogPo = NA,
    AIC = NA,
    BIC = NA
  )
  # Check whether the same subject's trials are used for every dataset;
  # a single ID is recycled across all simulated datasets
  if (length(id) == 1) {
    id <- rep(id, length(list))
  }
  
  # Add columns for the input (true) parameters
  n_input_params <- length(list[[1]]$input)
  
  for (i in 1:n_input_params) {
    recovery[, i + 7] <- NA
    names(recovery)[i + 7] <- paste0("input_param_", i)
  }
  # Add columns for the output (recovered) parameters
  n_output_params <- length(lower)
  
  for (i in 1:n_output_params) {
    recovery[, i + 7 + n_input_params] <- NA
    names(recovery)[i + 7 + n_input_params] <- paste0("output_param_", i)
  }
  
  # Select a parallel backend appropriate for the operating system
  sys <- Sys.info()[["sysname"]]
  
  if (nc == 1) {
    future::plan(future::sequential)
  } 
  # Windows / macOS: forked processes are unavailable or unsafe,
  # so use separate background R sessions
  else if (sys %in% c("Windows", "Darwin")) {
    future::plan(future::multisession, workers = nc)
  } 
  # Linux: forked processes are supported
  else if (sys == "Linux") {
    future::plan(future::multicore, workers = nc)
  } 
  # Any other OS: fall back to background R sessions
  else {
    future::plan(future::multisession, workers = nc)
  }
  
  doFuture::registerDoFuture()

  # One progress-bar step per simulated dataset
  n_iterations <- length(list) 
  
  progressr::handlers(progressr::handler_txtprogressbar)
  
  progressr::with_progress({
    
    p <- progressr::progressor(steps = n_iterations)
    
    doRNG::registerDoRNG(seed = seed)
    
    # Initialize (define) the foreach iterator i
    i <- NA
    
    # Suppress messages emitted when each worker loads its packages
    suppressMessages({
      model_result <- foreach::foreach(
        i = 1:n_iterations, .combine = rbind,
        .packages = "binaryRL", 
        .export = funcs
      ) %dorng% {
        
        data_i <- list[[i]][["data"]]
        
        binaryRL.res <- binaryRL::optimize_para(
          policy = policy,
          estimate = "MLE",
          
          data = data_i,
          id = id[i],
          n_trials = n_trials,
          n_params = n_params,
          
          obj_func = fit_model,
          lower = lower,
          upper = upper,
          priors = NULL,

          initial_params = initial_params,
          initial_size = initial_size,
          
          iteration = iteration,
          seed = seed,
          algorithm = algorithm
        )
        
        row_i <- data.frame(matrix(NA, nrow = 1, ncol = 7 + n_input_params + n_output_params))
        row_i[1, 1] <- model_name
        row_i[1, 2] <- binaryRL.res$acc
        row_i[1, 3] <- binaryRL.res$ll
        row_i[1, 4] <- binaryRL.res$lpr
        row_i[1, 5] <- binaryRL.res$lpo
        row_i[1, 6] <- binaryRL.res$aic
        row_i[1, 7] <- binaryRL.res$bic
        
        for (j in 1:n_input_params) {
          row_i[1, 7 + j] <- list[[i]]$input[j]
        }
        for (j in 1:n_output_params) {
          row_i[1, 7 + n_input_params + j] <- binaryRL.res$output[j]
        }
        
        # Update the progress bar
        p() 
        
        return(row_i)
      }
    })
  })
  
  # Inherit the column names from recovery
  colnames(model_result) <- colnames(recovery)
  
  recovery <- model_result

  # Release the registered parallel workers
  future::plan(future::sequential)

  return(recovery)
}
