Intro

This vignette illustrates the process of fitting (attentional) drift diffusion models to experimental choice and reaction time data (and fixation data). We are going to walk through a model fit that is done by unique experiment condition using only the functionality provided by the addmtoolbox package in R.

As example data we are goint to use two data frames that come supplied by the package:

  1. addm_data_choice
  2. addm_data_eye

Load Environment

Before starting lets load the necessary environment for working with the addmtoolbox.

library("addmtoolbox")

The package is self-contained in that it should load all dependencies correctly, and you are good to go now.


Data Preparation

The Data Frames

Lets first have a quick look at the content of the two main data frames that are supplied with the package.

Choice Data

# Lets look at the choice data 
head(addm_data_choice)
pander::pandoc.table(head(addm_data_choice))

We can see that addm_data_choice has five columns.

  1. id: which provides a unique key per observed trial
  2. v1: providing the valuation of the left item on the screen
  3. v2: providing the valuation of the right item on the screen
  4. decision: providing the decision taken in a specific trial (1 = left, 2 = right)
  5. rt: the reaction time in a given trial

Fixation Data

  # Lets look at the choice data 
  head(addm_data_eye)
  pander::pandoc.table(head(addm_data_eye))

We can see that addm_data_eye has four columns.

  1. fixloc: the fixation location of a given fixation (coded as referring to item attended)
  2. fixdur: the duration of a given fixation
  3. fixnr: the within trial number of a fixation
  4. id: a unique trial identifier that importantly matches the id of addm_data_choice

Pre Processing

The two data.frames supplied in the addmtoolbox package serve as an illustration of the structure that is expected for the package to be conveniently usable.

Once our data is in shape we can continue to use the addm_preprocess() function, like so:

 my.dat = addm_preprocess(choice.dat = addm_data_choice,
                        eye.dat = addm_data_eye,
                        timestep = 10,
                        rtbinsize = 100)  

You can look at detailed descriptions of the input parameters directly in the package documentation. Click on the package name in your RStudio package window or type ?addmtoolbox.

Nevertheless, lets not some important details here:

When fitting by condition, the model will assume that you are providing a fixation model yourself, and therefore does not need a separate eyetracking dataframe. Therefore we have no element for eyetracking data in the returned list.

When fitting by trial, the model needs empirical eyetracking data by trial, however a data.frame providing information about unique conditions is obsolete. Therefore we have a return element for eyetracking data, but none for conditions.

Lets look at the output that we have generated with out call to addm_preprocess():

 my.dat

Note, that now there is a condition_id variable. A simple rule is applied to form the condition_id variable. The values that appear in every trial display are simply concatenated as a string and separated by "_".

Important

addm_preprocess() expects you to supply an id-variable that is a unique identifier of each trial for each subject (that means that trial 1 for subject 1 has a different key than trial 1 for subject 2). Note, that row.numbers are not sufficient, because consistency is demanded between trial-identifiers of the choice data.frame and trial-identifiers of the eye data.frame.

When you choose to fit by condition, it is important to note that addm_preprocess() is actually changing the trial-identifiers for you into condition identifiers. STILL, addm_preprocess() expects you to supply the data with identifiers by trial first.


Fitting the model (by condition)

The addm_fit_grid() function

Now that our data is prepared for usage with the main functions of the addmtoolbox package, we can call the addm_fit_grid() function. This function has quite some parameters, and you find detailed information what each one stands for in the function/pckage documentation. Lets look at the default setting of all parameters below to get an overview.

addm_fit_grid = function(data = list(choice.dat = NULL, eye.dat = NULL, conditions.dat = NULL),
                         drifts = seq(0.0,0.002,0.0005),
                         thetas = seq(0.0,1.0,0.2),
                         sds = seq(0.05,0.15,0.025),
                         non.decision.times = 0,
                         timestep = 10,
                         nr.reps = 2000,
                         model.type = 'standard',
                         fit.type = 'condition',
                         fixation.model = 'fixedpath',
                         allow.fine.grid = 0,
                         log.file = "defaultlog.txt",
                         parallel = 1)

To highlight a few of the non-obvious options:

  1. model.type which defaults to "standard", lets you choose "memnoise" as an alternative. This is a version of the model which, given bigger set sizes than 2, lets you scale down/up noise and evidence accumulation separately for seen and unseen items.

  2. fixation.model is important when fitting by condition. I supply two simple fixation models with the package which are specific for the two item version for now. addm2_fixation_model_fixedpath() a model that spits out exactly the same fixation pathway everytime it is called (fixation duration of 400ms), which you utilize when entering "fixedpath" (the default). addm2_fixation_model_random() a model that randomly changes the starting sequence (either locations 1 then 2 or location 2 then 1), which you utilize when entering "random". Most importantly, fixation.model can be initialized as "user". This allows you to supply your own fixation model, which will be called to fill in a fixation location and a fixation duration vector at the beginning of each simulation run. You will have to supply your fixation model as a function with the name user_fixation_model() so that the package knows where to find it when running.

  3. parallel is initialized as 1, which means that the function runs on all cores that your local system has. Supply 0 to use only one core.

  4. allow.fine.grid tells the grid search procedure whether the maximum likelihood found with during the coarse grid search (using all parameter combinations that can be formed from the possible parameter values you supplied) should be taken as the centerpoint of a new, finer grid search in the near neighboorhood of these parameter combination. 0 if you don't want this behavior.

  5. log.file is where the outcome list will be stored as a file. If you don't supply a name, it will still store it under "defualtlog.txt", so you will not loose your data. This file however, will be overwritten when you start a new grid search.

Let's do it once

Lets take our prepared data.frames and fit the model once ourselves. I chose defaults so that reasonable outcomes will be returned for the example data-set. We will be fine in calling the function with default values for all but the conditions_dat and choice_dat variables, where we fit in the corresponding data.tables we retrieved from addm_preprocess() earlier. Note that for your own datasets you likely want to adjust the search-grid etc..

my.log.liks = addm_fit_grid(my.dat)

In this vignette I am not going to run the code, but instead show you the resulting data.table that stores the loglikelihood values and corresponding model-parameters from a precomputed model-fit.

head(addm_data_loglik_condition)

You can see that all model parameters are stored along with a variable coarse which can be ignored for now. This variable is needed when plotting the loglikelihoods, separate for the coarse and fine grid search. In the defaults we are not using a fine grid, so we will not bother with this for now.


Plotting Model Fit Results

The package included as convenience function, that visualizes the log likelihood data.table, so that the behavior of the loglikelihood across the parameter space can be analized. Note in the output below that the function automatically prints out the best parameter combination and corresponding likelihood. We can use this function like so:

  # Note that in your case you will use 'my.log.liks' here !
  out = addm_plot_loglik(addm_data_loglik_condition)  
  out$plot.coarse.grid




Detailed model output

After fitting the model, the next natural step is to take the best parameter-combination and simulate the model with detailed monitor output, to plot various predictions of the model against real data. The addmtoolbox can do this with the addm_by_condition() function. You need to change the parameter output.type from its default "fit" to "full", which tells the function to provide detailed output, instead of the barebones output needed to simply generate a log likelihood. Lets do it once:

  # Generate vector with optimal parameters (as seen above)
  # Order: sd,theta,drift,non_decision_time
  our.best.parameters = c(0.0015,1,0.07,0)

  # Now call addm_by_condition() function
  detailed.output = addm_run_by_condition(conditions.dat = my.dat$conditions.dat, 
                                          choice.dat = my.dat$choice.dat, 
                                          output.type = 'full', 
                                          nr.reps = 250, 
                                          model.parameters = our.best.parameters)


Lets look at the output we generated:

  head(detailed.output)

All variables are quite self-explanatory. Note that in this output the "id" variable refers to conditions instead of trials as expected when you first supply your experimental data. This is no issue when you chose to fit your data by condition, since then you already have your choice.data with an id variable that is referencing conditions. If you did your fits by trial however, you should re-run addm_preprocess() and use fit.type = 'condition to get conditions data and your choice data with id variable reference by conditions. Use this then as a supply to addm_by_condition() as shown above.

Why by condition?

For generating fake choice data, you need two components.

  1. Unique trial conditions
  2. A fixation model

Lets assume you would want to fit every unique trial (note the difference between unique trial and unique trial condition)....

What would you learn in addition? The model makes exactly the same predictions about a specific trial condition, whether it was encountered the first or second time or whether it was encountered in the context of the experimental session of subject one or two. There is no need to run the model over each trial to generate a simulated data set. Even if you actually fit the model by trial.



Plotting Results

You can use the addm_plot_family() function to get some basic psychometric plots, showing the model predictions and you original data. The function returns a list with plot-objects, which you can plot or, given some knowledge of the utilities that the ggplot2 package offers, you can freely edit. You use the function like this:

  addm.plot = addm2_plot_family(choice.dat = my.dat$choice.dat,
                                addm.output = detailed.output)
  addm.plot[[1]]
  addm.plot[[2]]

The functionality of this function will increase over time. From the user perspective the only difference however will be that you will receive back a longer list of plot-objects.



Fitting the Model (by trial)

The procedure to fit the model by trial is exactly the same. When calling addm_fit_grid() however, you will simply change the fit.type parameter to "trial". All other code will follow precisely the same recipe as when fitting by condition.

Note, that the model-fit result will likely be different (guaranteed when working with the example data, and the example fixation models). This is due to the impact of the fixation models.



DONE

That's it. You did your first model-fit with the addmtoolbox package, plotted the likelihood functions, generated detailed output and checked some basic psychometrics of your model against real behavioral data. Be sure to update the addmtoolbox package regularly, since it is in a very early development stage and will improve rather rapidly over the next few months. Please provide feedback here, such as bugreports or suggestions for useful additions to the functionality of the package.



AlexanderFengler/addmtoolbox documentation built on May 5, 2019, 4:53 a.m.