cross_val_tmax: Cross-validation for t_max estimation

cross_val_tmax

R Documentation

Cross-validation for t_max estimation

Description

Denison et al. (2020) report large between-subject variance in the optimal t_max parameter. This function can therefore be used to recover the optimal parameter for individual subjects using cross-validation. The procedure follows Denison et al. (2020): 1/n of the trials are held out and the model is fitted on the remaining trials. The error between the average of the held-out trials and the model prediction is taken as the cross-validation error. The held-out set cycles through the entire data set, resulting in n repetitions and n cross-validation errors. The average cross-validation error for each candidate t_max is then reported. Here n denotes the number of folds, not the `n` argument listed below.
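The hold-out cycle described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the toy data, the stand-in `fit_and_predict()` (which simply averages the training trials) and the use of mean squared error as the error measure are all hypothetical.

```r
# Toy data: 20 trials x 50 samples, one row per trial (hypothetical layout).
set.seed(1)
n_trials <- 20
n_samples <- 50
signal <- sin(seq(0, pi, length.out = n_samples))
trials <- matrix(rep(signal, each = n_trials), nrow = n_trials) +
  matrix(rnorm(n_trials * n_samples, sd = 0.1), nrow = n_trials)

# Assign every trial to exactly one of n_folds folds.
n_folds <- 5
fold_of_trial <- rep(seq_len(n_folds), length.out = n_trials)

# Stand-in for the real model fit (assumption): the prediction is the
# mean time course of the training trials.
fit_and_predict <- function(train) colMeans(train)

cv_errors <- vapply(seq_len(n_folds), function(k) {
  held_out <- trials[fold_of_trial == k, , drop = FALSE]
  train <- trials[fold_of_trial != k, , drop = FALSE]
  prediction <- fit_and_predict(train)
  # Error between the held-out average and the prediction.
  # Mean squared error is an assumption made for this sketch.
  mean((colMeans(held_out) - prediction)^2)
}, numeric(1))

# One error per fold, averaged into a single score for this candidate.
mean_cv_error <- mean(cv_errors)
```

Repeating this loop for every candidate t_max and comparing the averaged errors would mirror the selection step that the function reports.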

Usage

cross_val_tmax(
  cand_tmax,
  folds,
  pulse_spacing,
  trial_data,
  factor_id = "subject",
  model = "WIER_SHARED",
  n = 10.1,
  f = 1/(10^24),
  drop_last = 500,
  maxiter_inner = 10000,
  maxiter_outer = 25,
  convergence_tol = 1e-08,
  start_lambda = 0.1,
  should_accum_H = F,
  init_cf = NULL,
  expand_by = 800,
  sample_length = 20,
  time_id = "time",
  pupil_id = "pupil",
  should_plot = T
)

Arguments

`cand_tmax`	vector with all t_max values to be considered
`folds`	list of vectors; each vector corresponds to one fold and contains the trial values to be held out in that fold
`pulse_spacing`	Model pulses every 'pulse_spacing' samples. Setting this to 1 ensures 1 pulse every sample
`trial_data`	trial-level data with a time and pupil column. Also needs a factor column
`factor_id`	Name of the factor column. Model will estimate demand trajectory for each level of this factor
`model`	Model template.
`n`	Choice for parameter defined by Hoeks & Levelt (number of layers)
`f`	Choice for parameter defined by Wierda et al. (scaling factor), can also be a vector with values for each t_max candidate
`drop_last`	Drop pulses that would happen in the last drop_last ms
`maxiter_inner`	Maximum steps taken by inner optimizer
`maxiter_outer`	Maximum steps taken by outer optimizer
`convergence_tol`	Convergence check to terminate early
`start_lambda`	Initial lambda value. Must be > 0 if a penalty should be used. Setting this to 0 together with maxiter_outer = 1 leads to estimation of an unpenalized additive model, i.e., recovers the traditional NNLS estimate used by Wierda et al. (2012) and Denison et al. (2020).
`should_accum_H`	Whether the Hessian should be approximated using the BFGS rule. If not, the least-squares Hessian matrix is used. With the BFGS rule, models ended up being much smoother in our simulations, so this should be set to true if under-smoothing is observed. However, the BFGS update is much more costly and takes much more time.
`init_cf`	NULL or vector with initial coefficient estimate
`expand_by`	Time in ms by which to extend the time series into the past, so that pulses that happened before the recorded time window can still be approximated. See the artificial_data_analysis vignette for details.
`sample_length`	Duration in ms of a single sample. If the pupil dilation time course was down-sampled to 50 Hz, set this to 20
`time_id`	Name of time column in trial_data
`pupil_id`	Name of pupil column in trial_data
`should_plot`	Whether or not fit plots should be generated as well.
`t_max`	Choice for parameter defined by Hoeks & Levelt (response maximum in ms)
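
As a rough illustration of the `folds` and `sample_length` arguments, the sketch below builds a list of held-out trial vectors from a toy data frame. The column name `trial`, the number of folds and the random assignment are assumptions made for this example; the documentation above only requires that each list element contains the trial values held out in that fold.

```r
# Hypothetical trial-level data. The column name "trial" is an assumption;
# "subject", "time" and "pupil" match the documented defaults.
trial_data <- data.frame(
  subject = factor("s1"),
  trial = rep(1:12, each = 3),
  time = rep(c(0, 20, 40), times = 12),
  pupil = 0
)

n_folds <- 4
trial_ids <- unique(trial_data$trial)

# Shuffle the trial identifiers, then deal them out to the folds in turn.
set.seed(42)
shuffled <- sample(trial_ids)
fold_index <- rep(seq_len(n_folds), length.out = length(shuffled))

# split() returns a list with one vector of trial values per fold;
# unname() drops the fold labels so a plain list of vectors remains.
folds <- unname(split(shuffled, fold_index))

# sample_length in ms follows from the sampling rate: 1000 ms / 50 Hz = 20 ms.
sample_length <- 1000 / 50
```

A list built this way could then be supplied as `folds`, with every trial held out exactly once across the folds.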

Details

Note that different forms of cross-validation are possible depending on the experimental design and one's assumptions. It is possible to optimize t_max for each subject for each condition individually (then only data from one subject and one condition should be passed to the function) or across conditions (then data from all conditions should be passed to the function). Based on the findings by Denison et al. (2020), the latter is likely sufficient and more appropriate.
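
The two designs mentioned above differ only in which rows are passed as `trial_data`. A minimal sketch of that subsetting step, assuming a hypothetical `condition` column alongside the default `subject` factor column:

```r
# Hypothetical data: 2 subjects x 2 conditions x 4 trials x 3 samples.
# "condition" is an assumed column name; "subject" matches the default factor_id.
trial_data <- expand.grid(
  time = c(0, 20, 40),
  trial = 1:4,
  condition = c("easy", "hard"),
  subject = c("s1", "s2"),
  stringsAsFactors = TRUE
)
trial_data$pupil <- 0

# Across conditions: keep all trials of one subject.
one_subject <- droplevels(trial_data[trial_data$subject == "s1", ])

# Per subject and per condition: keep one subject within one condition.
one_cell <- droplevels(
  trial_data[trial_data$subject == "s1" & trial_data$condition == "hard", ]
)
```

Either subset would then take the place of `trial_data` in the call, with the fold vectors built from the trials that remain in that subset.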

JoKra1/papss documentation built on June 15, 2022, 8:57 a.m.