knitr::opts_chunk$set( collapse = TRUE, comment = "#>" )
This R Markdown file explains the pupil pre-processing functions contained in the gazeR package. The example dataset comes from a lexical decision task in which individuals judged the lexicality of printed and cursive stimuli. Cursive stimuli are ambiguous and non-segmented, which makes them hard to recognize. We therefore predicted that cursive words would be harder to recognize than printed words. Good thing we have this new package to help us test this hypothesis!
Please see our pre-print for a more in-depth walkthrough of the functions below.
remotes::install_github("dmirman/gazer")
library(gazer)
library(data.table)
library(tidyverse)
library(knitr)
Before using this package, a few preparatory steps are required. For gazeR to work, your data must contain the following columns.
col_names <- c("subject", "trial", "blink", "pupil", "x", "y", "time", "message")
knitr::kable(col_names, col.names = "Names")
You should also include other columns of interest.
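Before going further, it can help to confirm that your data actually contain these columns. The snippet below is a minimal base R sketch; `missing_columns` and the `toy` data frame are hypothetical names used only for illustration and are not part of gazeR.

```r
# Hypothetical helper (not part of gazeR): list required columns that are absent.
required_cols <- c("subject", "trial", "blink", "pupil", "x", "y", "time", "message")

missing_columns <- function(data, required = required_cols) {
  setdiff(required, names(data))
}

# Toy data frame that lacks the "message" column
toy <- data.frame(subject = 1, trial = 1, blink = 0, pupil = 1000,
                  x = 512, y = 384, time = 0)

missing_columns(toy)
```

An empty result means every required column is present.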
Column names can be changed to match gazeR conventions with make_gazer.
gazer_pupil <- make_gazer(data, subject="subject", trial="trial", time="time", x="x", y="y", pupil=NULL)
We need to read in the sample file, which contains data from 3 participants. Because pupil files can be very large, I limited the file to 3 participants to reduce processing time.
pupil_path <- system.file("extdata", "pupil_sample_files_edf.xls", package = "gazer")
pupil_sub1 <- fread(pupil_path)
In reality, you will have files from many subjects. The function merge_gazer_files will take all the pupil files in a folder path and merge them together.
pupil_files<-merge_gazer_files(file_list, filetype="edf")
If you are also interested in analyzing behavioral data (RTs and accuracy), the behave_pupil function will cull the important behavioral data from the Sample Report. behave_pupil returns a data frame without errors when omiterrors=TRUE, or a data frame that includes errors (for accuracy/error analysis) when omiterrors=FALSE. The columns relevant for your experiment need to be specified in the behave_colnames argument. This function does not eliminate outliers; you must use your preferred method for that. I recommend the very good trimr package by Jim Grange.
behave_data<-behave_pupil(pupil_sub1, omiterrors = FALSE, behave_colnames = c("subject","script","alteration", "trial", "target","accuracy","rt", "block", "cb"))
If you are exporting files from SR Research's Data Viewer, there is an option to extend blinks within Data Viewer. It is generally recommended that you extend blinks 100 ms before the blink and 100 ms after the blink. If you have not done this before exporting into R, you can use the extend_blinks function. The fillback argument extends blinks back in time and the fillforward argument extends blinks forward in time. For this experiment, the tracker sampled at 250 Hz (once every 4 ms), and we extend the blinks forward and backward 100 ms in time.
pup_extend <- pupil_sub1 %>%
  group_by(subject, trial) %>%
  mutate(extendpupil = extend_blinks(pupil, fillback = 100, fillforward = 100, hz = 250)) %>%
  relocate(extendpupil, .after = trial)
pup_extend
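To make the idea of blink extension concrete, here is a minimal base R sketch. It is not gazeR's implementation: `pad_missing` is a hypothetical helper, and it assumes blink samples are already coded as NA in the pupil vector. At 250 Hz one sample spans 4 ms, so a 100 ms window corresponds to 25 samples on each side.

```r
# Hypothetical helper (not gazeR's extend_blinks): set to NA every sample
# that lies within pad_ms of an already-missing sample.
pad_missing <- function(pupil, pad_ms = 100, hz = 250) {
  pad <- round(pad_ms / (1000 / hz))  # number of samples to pad on each side
  idx <- which(is.na(pupil))          # positions of the original missing samples
  for (i in idx) {
    lo <- max(1, i - pad)
    hi <- min(length(pupil), i + pad)
    pupil[lo:hi] <- NA
  }
  pupil
}

# One missing sample in the middle of 61 samples
x <- c(rep(1000, 30), NA, rep(1000, 30))
padded <- pad_missing(x, pad_ms = 100, hz = 250)
sum(is.na(padded))  # 1 original sample + 25 before + 25 after
```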
Pupil data can be extremely noisy! There are many ways to smooth pupil data. In gazeR, you can use an n-point moving average or a Hanning filter. To smooth the data using an n-point moving average, set the filter argument to "moving". You also need to specify the column that contains the pupil values and how many samples you want to average over. In this example, we use a 5-point moving average (n = 5).
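As a rough illustration of what an n-point moving average does, the sketch below uses base R's `stats::filter` with equal weights and a centered window. This is an assumption about the general technique, not a description of how gazeR computes it internally; `moving_average` is a hypothetical name.

```r
# Hypothetical helper: centered n-point moving average with equal weights.
# stats::filter is called explicitly because dplyr masks filter().
moving_average <- function(x, n = 5) {
  as.numeric(stats::filter(x, rep(1 / n, n), sides = 2))
}

pupil <- c(10, 12, 11, 13, 15, 14, 16)
smoothed <- moving_average(pupil, n = 5)
smoothed
```

With a centered 5-point window, the first two and last two samples have no complete window and come back as NA.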
Missing data stemming from blinks or failure of the eye tracker need to be interpolated. smooth_interpolate_pupil searches the data and reconstructs the pupil size for each trial from the relevant samples using either linear interpolation [@Bradley2008; @Cohen2015; @Siegle2003; @Steinhauer2000] or cubic-spline interpolation [@Mathot2018]. If the extendblinks argument is FALSE, samples with blinks are set to NA and are then interpolated linearly or with a cubic spline, depending on the type argument. The function returns a tibble with a column called pup_interp, which contains the interpolated values. Importantly, if you had Data Viewer extend blinks, extendblinks must be set to FALSE, because SR only extends the blink column and does not set blinks to NA in the sample report. If you used our extend_blinks function, extendblinks should be set to TRUE. For this example, we extended blinks with extend_blinks, so we set extendblinks to TRUE and use linear interpolation.
smooth_interp3 <- smooth_interpolate_pupil(pup_extend, pupil="pupil", extendpupil="extendpupil", extendblinks=TRUE, step.first="interp",filter="moving", maxgap=Inf, type="linear", hz=250, n=5)
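The following base R sketch shows linear interpolation over missing samples using `stats::approx`. It illustrates the general technique under simple assumptions (a single trial, time in ms) and is not gazeR's code; `interpolate_linear` is a hypothetical name.

```r
# Hypothetical helper: linearly interpolate NA samples from the valid
# samples on either side. rule = 1 leaves leading/trailing gaps as NA.
interpolate_linear <- function(pupil, time) {
  ok <- !is.na(pupil)
  stats::approx(time[ok], pupil[ok], xout = time, rule = 1)$y
}

time  <- seq(0, 24, by = 4)                          # 7 samples at 250 Hz
pupil <- c(1000, 1010, NA, NA, 1040, 1050, NA)
filled <- interpolate_linear(pupil, time)
filled
```

The two interior gaps are filled from their neighbors, while the trailing gap stays NA because there is no later sample to interpolate toward.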
To control for variability arising from non-task-related processing, baseline correction is commonly done. There are several different types of baseline correction. Mathot et al. (2018) recommend a subtractive baseline correction based on the median. The baseline_correction_pupil_msg function finds the median pupil size during a specified baseline period for each trial. In this example, we use messages to calculate our baseline, taking the 100 ms before the target event.
baseline_pupil <- baseline_correction_pupil_msg(smooth_interp3, pupil_colname = 'pup_interp', baseline_dur = 100, event = "target", baseline_method = "sub")
head(baseline_pupil)
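A subtractive, median-based baseline correction can be sketched in base R as follows. This is an illustration for a single trial, not gazeR's implementation; `subtract_baseline` and its `target_onset` argument are hypothetical, and the onset time is passed in directly rather than looked up from a message column.

```r
# Hypothetical helper: subtract the median pupil size in the window
# [target_onset - baseline_dur, target_onset) from every sample in the trial.
subtract_baseline <- function(pupil, time, target_onset, baseline_dur = 100) {
  in_window <- time >= (target_onset - baseline_dur) & time < target_onset
  baseline <- median(pupil[in_window], na.rm = TRUE)
  pupil - baseline
}

time  <- seq(0, 396, by = 4)                  # 100 samples at 250 Hz
pupil <- c(rep(1000, 50), rep(1100, 50))      # pupil grows after target onset at 200 ms
corrected <- subtract_baseline(pupil, time, target_onset = 200)
range(corrected)
```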
Subjects and trials with a lot of missing data (due to blinks, for example) should be removed. count_missing_pupil removes subjects and trials whose proportion of missing data exceeds a specified threshold (we have it set at .2, but users can change this to whatever value they would like). The percentages of subjects and trials thrown out are returned for reporting.
pup_missing<-count_missing_pupil(baseline_pupil, pupil="pupil", missingthresh = .2)
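The trial-level part of this step can be sketched in base R as shown below. This is a simplified illustration, not gazeR's code: it computes the proportion of NA samples per subject and trial and keeps trials at or below the threshold. The `toy` data and the `prop_missing` column are hypothetical.

```r
# Toy data: trial 1 has 2 of 5 samples missing (40%), trial 2 has none.
toy <- data.frame(
  subject = rep("s1", 10),
  trial   = rep(c(1, 2), each = 5),
  pupil   = c(1000, NA, NA, 1010, 1020,
              1000, 1005, 1010, 1015, 1020)
)

# Proportion of missing samples within each subject-by-trial cell
toy$prop_missing <- ave(as.numeric(is.na(toy$pupil)),
                        toy$subject, toy$trial, FUN = mean)

# Keep only trials whose missing proportion does not exceed the threshold
missingthresh <- 0.2
kept <- toy[toy$prop_missing <= missingthresh, ]
unique(kept$trial)
```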
Artifacts can arise from quick changes in pupil size (Kret & Sjak-Shie, in press). The speed_pupil function calculates the normalized dilation speed: for each sample, the absolute change in pupil size relative to the preceding and the succeeding sample is divided by the corresponding temporal separation, and the larger of the two values is kept. To detect outliers, calc_mad computes the median absolute deviation of the dilation speed, multiplies it by a constant, and adds the result to the median dilation speed. Samples with speeds above this threshold are then removed.
max_pup <- pup_missing %>%
  dplyr::group_by(subject, trial) %>%
  dplyr::mutate(speed = speed_pupil(pup_interp, time)) %>%
  dplyr::mutate(MAD = calc_mad(speed)) %>%
  dplyr::filter(speed < MAD)
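The logic described above can be sketched in base R as follows. This is an illustration of the dilation-speed and MAD-threshold idea, not the code behind speed_pupil or calc_mad; `dilation_speed`, `mad_threshold`, and the constant of 16 are assumptions chosen for the example.

```r
# Hypothetical helper: for each sample, the larger of the absolute rate of
# change to the previous sample and to the next sample.
dilation_speed <- function(pupil, time) {
  rate <- abs(diff(pupil) / diff(time))
  pmax(c(NA, rate), c(rate, NA), na.rm = TRUE)
}

# Hypothetical helper: median speed plus a constant times the (unscaled)
# median absolute deviation. The constant here is arbitrary.
mad_threshold <- function(speed, constant = 16) {
  med <- median(speed, na.rm = TRUE)
  med + constant * median(abs(speed - med), na.rm = TRUE)
}

time  <- seq(0, 36, by = 4)
pupil <- c(1000, 1002, 1006, 1008, 1200, 1012, 1016, 1018, 1022, 1024)  # spike at sample 5
speed <- dilation_speed(pupil, time)
threshold <- mad_threshold(speed)
which(speed > threshold)  # samples flagged as artifacts
```

Note that the samples immediately before and after the spike are flagged too, because each sample's speed takes its neighbors into account.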
In most psychological experiments, each trial includes several events. We can use this information to set the trial onset to zero with the onset_pupil function. This function requires three arguments: the time column, the sample message column, and the event of interest ("target" in our example). This allows us to start the trial at the onset of the target.
baseline_pupil_onset <- baseline_pupil %>%
  dplyr::group_by(subject, trial) %>%
  dplyr::mutate(time_zero = onset_pupil(time, message, event = c("target"))) %>%
  ungroup() %>%
  relocate(time_zero, time, baseline, .after = "trial")
baseline_pupil_onset
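Conceptually, aligning a trial to an event means subtracting the timestamp of the sample that carries the event message. The base R sketch below illustrates this for one trial; `zero_at_event` is a hypothetical helper, not gazeR's onset_pupil, and it assumes the message appears on at least one sample.

```r
# Hypothetical helper: re-express time relative to the first sample whose
# message equals the event of interest.
zero_at_event <- function(time, message, event = "target") {
  onset <- time[which(message == event)[1]]
  time - onset
}

time    <- seq(1000, 1020, by = 4)
message <- c(NA, "fixation", NA, "target", NA, NA)
time_zero <- zero_at_event(time, message)
time_zero
```

Samples before the target receive negative times, and the target sample itself is at zero.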
Next, place the data into time bins (users can specify the bin size to use). For this example, we will use 200 ms time bins. The aggvars argument requires a vector of column names to aggregate on. If you are using pupil data, type should be set to "pupil".
timebins1 <- downsample_gaze(baseline_pupil_onset, bin.length = 200, timevar = "time_zero", aggvars = c("subject", "script", "timebins"), type = "pupil")
timebins1
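Downsampling into time bins can be sketched in base R as shown below. This is an assumption about the general approach (assign each sample to the bin that starts at the nearest lower multiple of the bin length, then average), not a description of downsample_gaze itself; `bin_pupil` and the `toy` data are hypothetical.

```r
# Hypothetical helper: label each sample with the start of its time bin,
# then average pupil size within subject and bin.
# aggregate() with a formula drops rows where pupil is NA.
bin_pupil <- function(data, bin_length = 200) {
  data$timebins <- floor(data$time_zero / bin_length) * bin_length
  aggregate(pupil ~ subject + timebins, data = data, FUN = mean)
}

toy <- data.frame(
  subject   = "s1",
  time_zero = seq(0, 396, by = 4),            # 100 samples at 250 Hz
  pupil     = c(rep(10, 50), rep(20, 50))
)
binned <- bin_pupil(toy, bin_length = 200)
binned
```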