Steps: - Define some variables - Load the synthetic data -
Set state for random number generation. This ensures that results can be reproduced.
set.seed(19821101 + 4 * (params$i_dataset -1) + params$n_repetition - 1)
Make magrittr
's pipe accessible from notebook. Pipes enable clear expression of a sequence of multiple operations.
"%>%" <- magrittr::`%>%`
Identify the project directory. All paths in the code are relative to it.
project_dir <- rprojroot::find_root(rprojroot::has_file("DESCRIPTION"))
Define the name of the notebook (notebook_name
). It is used to save notebook output to a notebook-specific directory.
notebook_name <- "parameter_and_model_recovery"
Define some output directories and verify their existence. Optimization statistics obtained in this notebook will be written to a notebook-specific directory inside optimizations_dir
. If directories do not exist, they are created.
optimizations_dir <- file.path(project_dir,"data","optimizations") synthetic_data_dir <- file.path(project_dir,"data","simulations", "simulate_synthetic_data_for_parameter_and_model_recovery") itchmodel::verify_output_dirs(base_dirs = list(optimizations_dir), notebook_name = notebook_name)
The name of the model (model_name
). Currently, only drift diffusion models (DDM) are implemented.
model_name <- "DDM"
Synthetic data were generated with four model parameterizations: the defer/speedup intertemporal choice data were generated with the defer_speedup_time_scaling
and defer_speedup_value_scaling
parameterizations; the date/delay intertemporal choice data were generated with the date_delay_time_scaling
and date_delay_value_scaling
parameterizations. We will fit each data with the parameterization that generated the synthetic data set (generating_parameterization
) and a "counter-parameterization".
# parameterization is specified in the YAML header or in the params argument of rmarkdown::render (parameterization <- params$parameterization)
generating_parameterization <- parameterization if (parameterization == "defer_speedup_time_scaling") { counter_parameterization <- "defer_speedup_value_scaling" } else if (parameterization == "defer_speedup_value_scaling") { counter_parameterization <- "defer_speedup_time_scaling" } else if (parameterization == "date_delay_time_scaling") { counter_parameterization <- "date_delay_value_scaling" } else if (parameterization == "date_delay_value_scaling") { counter_parameterization <- "date_delay_time_scaling" }
Synthetic data were generated with four levels of repetition: each trial (i.e. a factorial combination of frame
, m_s
, m_l
, t_s
, and t_l
) was repeated 2, 3, 4, or 5 times in the experiment (n_repetition
). Here, we fit the dataset containing n_repetition
repetitions.
# n_repetition is specified in the YAML header or in the params argument of rmarkdown::render (n_repetition <- params$n_repetition)
A total of 100 synthetic data sets were generated for each combination of model, parameterization, and number of repetitions. Here, we fit the dataset with index i_dataset
.
# i_dataset is specified in the YAML header or in the params argument of rmarkdown::render (i_dataset <- params$i_dataset)
The maximum number of iterations (max_iter
) of the optimization algorithm before it should terminate.
max_iter <- 1000
The relative tolerance (rel_tol
) is the tolerance for the stopping criterion of the optimization algorithm.
rel_tol <- 1e-6
The number of populations per free parameter defined the number of candidate solutions in the randomly distributed initial population. Higher numbers increase the likelihood of converging to a global optimum.
n_pop_per_free_param <- 20
Define file name of synthetic data file
fmt_data <- "expt_stimuli_observations_model_%s_parameterization_%s_ix_%d_nrep_%d.csv" (synthetic_data_file <- sprintf(fmt = fmt_data, model_name, generating_parameterization, i_dataset, n_repetition) )
Load synthetic data file
(synthetic_data <- readr::read_csv(file = file.path(synthetic_data_dir, synthetic_data_file), col_types = readr::cols() ))
Count the number of data points
(n_data_points <- nrow(synthetic_data))
Put synthetic data in correct format for model fitting
synthetic_data_grpd <- synthetic_data %>% dplyr::group_by(frame) stimuli <- synthetic_data_grpd %>% dplyr::select(frame, m_s, t_s, m_l, t_l, m_ss_type) %>% tidyr::nest() observations <- synthetic_data_grpd %>% dplyr::select(frame, rt, response) %>% tidyr::nest() (synthetic_data_formatted <- dplyr::left_join(x = stimuli, y = observations, by = c("frame") ) %>% dplyr::rename(stimuli = data.x, observations = data.y))
Define the name of the file containing data generating model parameter values
fmt_params <- "model_parameters_model_%s_parameterization_%s_ix_%d.csv" (synthetic_params_file <- sprintf(fmt = fmt_params, model_name, generating_parameterization, i_dataset))
Load parameter file
(generating_params <- readr::read_csv(file = file.path(synthetic_data_dir, synthetic_params_file), col_types = readr::cols() ))
Parameter definitions:
alpha: sensitivity parameter of the function transforming objective into subjective value
mu: scaling parameter of the function transforming objective into subjective value
beta: sensitivity parameter of the function transforming objective into subjective time
kappa: scaling parameter of the function transforming objective into subjective time
w: parameter that weighs and scale the differences in subjective value and time, before comparison
a: threshold separation of the diffusion process
t0: non-decision time
Indices following the parameter name indicate that the parameter varies across conditions.
The dataset will be fit with the generating_parameterization
: r generating_parameterization
.
Define parameter bounds
lowers_gen <- itchmodel::get_par_bounds(model = model_name, parameterization = generating_parameterization, bound = 'lower') uppers_gen <- itchmodel::get_par_bounds(model = model_name, parameterization = generating_parameterization, bound = 'upper')
The number of free parameters (n_free_params
) can be determined by counting the number of parameters for which the lower and upper bounds are identical.
n_free_params_gen <- sum(!(lowers_gen == uppers_gen))
optim_out_gen_model <- DEoptimR::JDEoptim(fn = itchmodel::get_log_likelihood, # optimization function lower = lowers_gen, upper = uppers_gen, constr = itchmodel::get_nonlinear_constraints, maxiter = max_iter, tol = rel_tol, NP = n_pop_per_free_param * n_free_params_gen, trace = TRUE, triter = 25, # Print every 25th iteration # Additional arguments passed to fn and constr: data = synthetic_data_formatted, model = model_name, parameterization = generating_parameterization )
(best_fit_gen_model <- itchmodel::params_to_tibble(optim_out_gen_model$par, model = "DDM", parameterization = generating_parameterization) )
Print optimization procedure stats
(optim_stats_gen_model <- itchmodel::get_fit_stats(optim_output = optim_out_gen_model, model = model_name, parameterization = generating_parameterization, n_data_points = n_data_points) )
Plot the observations and model predictions, using best-fitting parameters
(plt_generating_parameterization <- itchmodel::plot_model_fit_to_data(df = synthetic_data_formatted, params = optim_out_gen_model$par, model = model_name, parameterization = generating_parameterization ))
Points represent observed probabilities of choosing large-but-later option. Lines represent best-fitting model predictions.
The dataset will be fit with the counter_parameterization
: r counter_parameterization
.
Parameter lower bounds (lowers
) and upper bounds (uppers
)
lowers_counter <- itchmodel::get_par_bounds(model = model_name, parameterization = counter_parameterization, bound = 'lower') uppers_counter <- itchmodel::get_par_bounds(model = model_name, parameterization = counter_parameterization, bound = 'upper')
The number of free parameters (n_free_params
) can be determined by counting the number of parameters for which the lower and upper bounds are identical.
n_free_params_counter <- sum(!(lowers_counter == uppers_counter))
optim_out_counter_model <- DEoptimR::JDEoptim(fn = itchmodel::get_log_likelihood, # optimization function lower = lowers_counter, upper = uppers_counter, constr = itchmodel::get_nonlinear_constraints, maxiter = max_iter, tol = rel_tol, NP = n_pop_per_free_param * n_free_params_counter, trace = TRUE, triter = 25, # Print every 25th iteration # Additional arguments passed to fn and constr: data = synthetic_data_formatted, model = model_name, parameterization = counter_parameterization )
(best_fit_counter_model <- itchmodel::params_to_tibble(optim_out_counter_model$par, model = "DDM", parameterization = counter_parameterization) )
Print optimization procedure stats
(optim_stats_counter_model <- itchmodel::get_fit_stats(optim_output = optim_out_counter_model, model = model_name, parameterization = counter_parameterization, n_data_points = n_data_points) )
Plot the observations and model predictions, using best-fitting parameters
(plt_counter_parameterization <- itchmodel::plot_model_fit_to_data(df = synthetic_data_formatted, params = optim_out_counter_model$par, model = model_name, parameterization = counter_parameterization ))
Points represent observed probabilities of choosing large-but-later option. Lines represent best-fitting model predictions.
Optimization function value. Lower values represent better fits
tibble::tibble(model = c("generating", "counter")) %>% dplyr::bind_cols(dplyr::bind_rows(optim_stats_gen_model, optim_stats_counter_model))
Best-fitting parameters, relative to true generating parameters
tibble::tibble(parameter_set = c("true", "best_fit_generating", "best_fit_counter")) %>% dplyr::bind_cols(dplyr::bind_rows(generating_params, best_fit_gen_model, best_fit_counter_model))
Best-fitting parameters of generating parameterization, relative to bounds
lowers_gen_tib <- itchmodel::params_to_tibble(lowers_gen, model = model_name, parameterization = generating_parameterization) uppers_gen_tib <- itchmodel::params_to_tibble(uppers_gen, model = model_name, parameterization = generating_parameterization) tibble::tibble(parameter_set = c("lower", "best_fit_generating", "upper", "best_fit_minus_lower", "upper_minus_best_fit") ) %>% dplyr::bind_cols(., dplyr::bind_rows(lowers_gen_tib, best_fit_gen_model, uppers_gen_tib, lowers_gen_tib - best_fit_gen_model, uppers_gen_tib - best_fit_gen_model ) )
Best-fitting parameters of counter parameterization, relative to bounds
lowers_counter_tib <- itchmodel::params_to_tibble(lowers_counter, model = model_name, parameterization = counter_parameterization) uppers_counter_tib <- itchmodel::params_to_tibble(uppers_counter, model = model_name, parameterization = counter_parameterization) tibble::tibble(parameter_set = c("lower", "best_fit_counter", "upper", "best_fit_minus_lower", "upper_minus_best_fit") ) %>% dplyr::bind_cols(., dplyr::bind_rows(lowers_counter_tib, best_fit_counter_model, uppers_counter_tib, lowers_counter_tib - best_fit_counter_model, uppers_counter_tib - best_fit_counter_model ) )
Write best-fitting parameters of generating parameterization to disk
fmt_bf_params_generating_file <- "best_fitting_params_generating_model_%s_parameterization_%s_ix_%d_nrep_%d.csv" (bf_params_generating_file <- sprintf(fmt = fmt_bf_params_generating_file, model_name, generating_parameterization, i_dataset, n_repetition) ) readr::write_csv(x = best_fit_gen_model, path = file.path(optimizations_dir, notebook_name, bf_params_generating_file), append = TRUE, # enables detection of problems: each file should have only one line, col_names = TRUE )
Write best-fitting parameters of counter parameterization to disk
fmt_bf_params_counter_file <- "best_fitting_params_counter_model_%s_parameterization_%s_ix_%d_nrep_%d.csv" (bf_params_counter_file <- sprintf(fmt = fmt_bf_params_counter_file, model_name, counter_parameterization, i_dataset, n_repetition) ) readr::write_csv(x = best_fit_counter_model, path = file.path(optimizations_dir, notebook_name, bf_params_counter_file), append = TRUE, # enables detection of problems: each file should have only one line, col_names = TRUE )
Write optimization statistics of generating parameterization to disk
fmt_optim_stats_generating_file <- "optimization_statistics_generating_model_%s_parameterization_%s_ix_%d_nrep_%d.csv" (optim_stats_generating_file <- sprintf(fmt = fmt_optim_stats_generating_file, model_name, generating_parameterization, i_dataset, n_repetition) ) readr::write_csv(x = optim_stats_gen_model, path = file.path(optimizations_dir, notebook_name, optim_stats_generating_file), append = TRUE, # enables detection of problems: each file should have only one line, col_names = TRUE )
Write optimization statistics of counter parameterization to disk
fmt_optim_stats_counter_file <- "optimization_statistics_counter_model_%s_parameterization_%s_ix_%d_nrep_%d.csv" (optim_stats_counter_file <- sprintf(fmt = fmt_optim_stats_counter_file, model_name, generating_parameterization, i_dataset, n_repetition) ) readr::write_csv(x = optim_stats_counter_model, path = file.path(optimizations_dir, notebook_name, optim_stats_counter_file), append = TRUE, # enables detection of problems: each file should have only one line, col_names = TRUE )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.