library(lookr)
In this doc, I discuss what makes up our eye-tracking data and the main levels of abstraction that we use to contain the data.
Here are the files that make up a single subject's eye-tracking data. This participant received two blocks of the coarticulation experiment.
data_files <- list.files("data/Coartic_WFFArea_2a/001P00XS1/")
data_files
Each experimental block produces a .txt file and a .gazedata file. These two files contain all the pertinent experimental data for an eye-tracking block. During an experiment, Eprime also produces an .edat file, but that file is unusable outside of Eprime, so we ignore it.
The .txt and .gazedata files for a block should have the same basename (i.e., the same filename except for the file extension). When we truncate the file extensions, we see that we only have two unique filenames, one for each block of data.
unique(tools::file_path_sans_ext(data_files))
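The same idea can be turned into a quick sanity check that every block has both of its files. This is only a sketch in base R: the `files` vector is a made-up listing, and `unpaired` is a name I introduce here, not something lookr provides.

```r
# Hypothetical file listing for one subject folder
files <- c("Coartic_Block1_001P00XS1.gazedata", "Coartic_Block1_001P00XS1.txt",
           "Coartic_Block2_001P00XS1.gazedata", "Coartic_Block2_001P00XS1.txt")

# Split the listing by extension, then strip the extensions
gaze_names <- tools::file_path_sans_ext(files[tools::file_ext(files) == "gazedata"])
stim_names <- tools::file_path_sans_ext(files[tools::file_ext(files) == "txt"])

# Basenames that lack a partner file of the other type
unpaired <- c(setdiff(gaze_names, stim_names), setdiff(stim_names, gaze_names))
unpaired
```

An empty result means every .gazedata file has a matching .txt file.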
The .txt file for a block is the Stimdata file. It contains information about each experimental trial (like stimulus presentation or event timing). Eprime generates this file, and it's not pretty. Here is a single trial, as recorded in the file.
*** LogFrame Start ***
TrialList: 1
Procedure: TrialProcedure
ImageL: BoardBook1
ImageR: StuffedDog1
Carrier: Where
Target: ImageL
Pitch: hi
AudioStim: Whe_hi_the_V_Book_neut
Attention: AN_LookAtThat
AudioDur: 2190
AttentionDur: 1500
WordGroup: dog-book
StimType: neutral
TargetWord: book
Running: TrialList
TrialList.Cycle: 1
TrialList.Sample: 1
Image2sec.OnsetTime: 37347
Image2sec.StartTime: 37324
Fixation.OnsetDelay: 39
Fixation.OnsetTime: 38902
Fixation.StartTime: 38893
Target.OnsetDelay: 0
Target.OnsetTime: 39353
Target.StartTime: 39289
Wait1SecFirst.OnsetDelay: 0
Wait1SecFirst.OnsetTime: 41543
Wait1SecFirst.StartTime: 41543
Attention.OnsetDelay: 3
Attention.OnsetTime: 42546
Attention.StartTime: 42343
*** LogFrame End ***
The Stimdata() function extracts and massages the pertinent experimental information from the stimdata file into a dataframe. The massaging process is complex and kind of a pain, but it handles the many iterations of our eye-tracking experiments: It determines which task is being performed and with which stimulus presentation protocol, and it adjusts the timing attributes accordingly. If we design and implement a new experiment or a new version of an existing experiment, I usually edit the backend of the Stimdata() function to account for the new experiment.
Anyway, in the output of Stimdata(), each row of the dataframe represents the attributes of an experimental trial.
stim_path <- "data/Coartic_WFFArea_2a/001P00XS1/Coartic_Block1_001P00XS1.txt"
stimdata <- Stimdata(stim_path)
str(stimdata)
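To give a feel for the first step of that massaging, here is a rough sketch of turning one LogFrame into a one-row dataframe. It is not how Stimdata() is implemented; it assumes the frame has already been read into a character vector with one "Key: Value" pair per element, and the names `frame`, `fields` and `trial_row` are mine.

```r
# A few lines from one LogFrame, as they might appear in the .txt file
frame <- c("*** LogFrame Start ***",
           "Procedure: TrialProcedure",
           "ImageL: BoardBook1",
           "TargetWord: book",
           "Target.OnsetTime: 39353",
           "*** LogFrame End ***")

# Keep only the "Key: Value" lines
fields <- grep(": ", frame, value = TRUE, fixed = TRUE)

# Split each line at the first ": " into a key and a value
keys <- sub(": .*$", "", fields)
values <- sub("^[^:]*: ", "", fields)

# One-row dataframe with one column per key
trial_row <- as.data.frame(as.list(setNames(values, keys)),
                           stringsAsFactors = FALSE)

# Everything is read as text, so timing fields need converting
trial_row$Target.OnsetTime <- as.numeric(trial_row$Target.OnsetTime)
str(trial_row)
```

The real function does much more than this, such as working out the task and protocol and adjusting the timing attributes.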
The .gazedata file contains tab-delimited Gazedata from the eye-tracker for the entire block.
gaze_path <- "data/Coartic_WFFArea_2a/001P00XS1/Coartic_Block1_001P00XS1.gazedata"
raw_gaze <- read.delim(gaze_path, na.strings = c("-1.#INF", "1.#INF"),
                       stringsAsFactors = FALSE)
str(raw_gaze)
We don't need all these columns (which are documented in another doc, btw), so Gazedata() keeps just the ones we care about. The function also computes the monocular averages for each gaze-data variable, combining available data from the left and right eyes.
gazedata <- Gazedata(gaze_path)
str(gazedata)
Sometimes only one of the eyes is tracked, so we use data from the available eye to compute the monocular average, as shown in the example below. Some researchers advise against this kind of interpolation [citation TODO].
xmean_from_right <- subset(gazedata[c("XLeft", "XRight", "XMean")],
                           is.na(XLeft) & !is.na(XRight))
head(xmean_from_right)
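The fallback rule can be sketched in a couple of lines of base R. This is my own illustration on made-up values, assuming untracked samples are coded as NA; Gazedata() may compute the average differently.

```r
# Hypothetical gaze samples; NA marks a sample where that eye was not tracked
gaze <- data.frame(XLeft  = c(0.20, NA, 0.40, NA),
                   XRight = c(0.30, 0.50, NA, NA))

# Average whichever eyes are available on each row
gaze$XMean <- rowMeans(gaze[c("XLeft", "XRight")], na.rm = TRUE)

# rowMeans() gives NaN when both eyes are missing; recode those as NA
gaze$XMean[is.nan(gaze$XMean)] <- NA
gaze
```

Rows with both eyes get the ordinary mean, rows with one eye just copy that eye, and rows with neither stay missing.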
Now that we have the stimdata for each trial and the gazedata from the whole block, we can combine these two together using Block(). This function slices up the gazedata, creating a dataframe for each trial. The stimulus properties for each trial are attached to the trial as attributes. The gazedata dataframe and attached stimdata make up a Trial object. The code below shows the structure of a single trial.
block1 <- Block(gazedata, stimdata)
trial <- block1[[1]]
str(trial)
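Here is a toy version of the slice-and-attach idea, to make the structure concrete. It assumes the gaze samples carry a `TrialNo` column that links them to the stimdata rows, which is a guess for illustration; the toy dataframes and the loop are not lookr's code.

```r
# Hypothetical block-level gaze samples and trial-level stimulus info
gaze <- data.frame(TrialNo = c(1, 1, 2, 2, 2),
                   XMean = c(0.1, 0.2, 0.8, 0.7, 0.9))
stim <- data.frame(TrialNo = c(1, 2),
                   TargetWord = c("book", "dog"),
                   stringsAsFactors = FALSE)

# Slice the gaze samples into one dataframe per trial
trials <- split(gaze, gaze$TrialNo)

# Attach each trial's stimulus properties as attributes
for (i in seq_along(trials)) {
  stim_row <- stim[stim$TrialNo == as.numeric(names(trials)[i]), ]
  attr(trials[[i]], "TargetWord") <- stim_row$TargetWord
}

attr(trials[["2"]], "TargetWord")
```

Keeping the stimulus properties as attributes means each trial's dataframe holds only gaze samples, with the trial-level facts stored once per trial rather than repeated on every row.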
Block() also accepts a character argument giving the basename of a block, that is, the path of the gazedata file minus the .gazedata extension. This is the second rung in our ladder of convenient abstractions, as Blocks abstract away from Stimdata and Gazedata.
gaze_path2 <- "data/Coartic_WFFArea_2a/001P00XS1/Coartic_Block2_001P00XS1.gazedata"
(block_basename <- tools::file_path_sans_ext(gaze_path2))
# Load stimdata and gazedata and merge in one step
block2 <- Block(block_basename)
When the stimdata and gazedata are combined, six new columns are also produced. The columns all end in ToTarget and they describe the screen location of the gaze in terms of proximity to the target image. That is, plain-old XMean describes the location of the gaze such that 0 is the left side of the screen and 1 is the right side. In the trial above, the target word is on the left side of the screen, so small XMean values are closer to the left side of the screen and hence closer to the target. In XMeanToTarget, the XMean values are flipped so that greater values are closer to the target image. Essentially, the target image becomes the right image for all trials. This kind of normalization is useful if we want to look at the gaze-location with respect to the target image over several trials.
# the added columns
grep("ToTarget", names(trial), value = TRUE)
library(ggplot2)
# default plot
qplot(data = trial, x = Time, y = XMean) +
  labs(title = "Raw XMean value")
qplot(data = trial, x = Time, y = XMeanToTarget) +
  labs(title = "XMean flipped towards target")
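On a 0-to-1 scale, the flip amounts to mirroring the x-coordinate whenever the target is on the left. The snippet below is a sketch of that arithmetic on made-up values; I am assuming the mirror is `1 - XMean`, and the variable names are mine rather than lookr's.

```r
# Hypothetical trial: gaze drifts from the right side to the left side
x_mean <- c(0.9, 0.7, 0.4, 0.2)
target_image <- "ImageL"  # target on the left, as in the trial above

# Mirror the x-coordinate when the target is on the left,
# so that larger values always mean "closer to the target"
x_mean_to_target <- if (target_image == "ImageL") 1 - x_mean else x_mean
x_mean_to_target
```

A gaze that moves leftward toward a left-side target now shows up as increasing values, the same as a rightward look to a right-side target.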
%@%
A Block is a list of Trial objects. We can access the attributes of multiple trials using the %@% function.
block1 %@% "TargetWord"
block1 %@% "TargetImage"
block1 %@% "TargetOnset"
The attribute-infix function %@% can also be used on single trials to get and set their attribute values.
trial %@% "TargetWord"
trial %@% "TargetImage"
# Setting
trial %@% "SpecialNewAttribute"
trial %@% "SpecialNewAttribute" <- "Hello!"
trial %@% "SpecialNewAttribute"
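If you are curious how an infix getter and setter like this can be built, here is a minimal sketch on top of base R's attr(). I use the made-up name `%at%` so it does not clash with the real operator; lookr's own definition may well differ, for instance in how it handles lists of trials.

```r
# Hypothetical stand-ins for the getter and setter; lookr's %@% may differ
`%at%` <- function(x, which) attr(x, which, exact = TRUE)
`%at%<-` <- function(x, which, value) {
  attr(x, which) <- value
  x
}

toy_trial <- data.frame(Time = c(0, 17, 33))

# Setting: R rewrites this as toy_trial <- `%at%<-`(toy_trial, "TargetWord", "book")
toy_trial %at% "TargetWord" <- "book"

# Getting
toy_trial %at% "TargetWord"
toy_trial %at% "Missing"  # unset attributes come back as NULL in this sketch
```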
Here's how one can use %@% to manually adjust the timing of a Trial so that the TargetOnset occurs at 0 ms. You should never have to do this manually, because the AlignTrials function does this for you.
trial$Time <- trial$Time - (trial %@% "TargetOnset")
trial %@% "TargetOnset" <- 0
qplot(data = trial, x = Time, y = XMeanToTarget, xlim = c(-800, 1500)) +
  labs(title = "Looking to target with adjusted time values\n(Don't ever do this manually!)")
We organize our data by task and then by subject. Put another way, block files are nested within subject folders within task folders, as shown in the mock file hierarchy below.
/data/
|-- Task1
|   |-- Subject1
|   |   |-- Task1_Subject1_block1.gazedata
|   |   |-- Task1_Subject1_block1.txt
|   |   |-- Task1_Subject1_block2.gazedata
|   |   |-- Task1_Subject1_block2.txt
|-- Task2
|   |-- Subject2
|   |   |-- Task2_Subject2_block1.gazedata
|   |   |-- Task2_Subject2_block1.txt
|   |   |-- Task2_Subject2_block2.gazedata
|   |   |-- Task2_Subject2_block2.txt
|   |-- Subject3
|   |   |-- Task2_Subject3_block2.gazedata
|   |   |-- Task2_Subject3_block2.txt
Our next level of abstraction is the Session, which contains all the blocks in a subject directory.
session <- Session("data/Coartic_WFFArea_2a/001P00XS1/")
A Session is just a list of Trial objects; session[[1]] is the first Trial object in the list. When blocks are combined to form the session, the trials are renumbered. We can recover the original block number using the Basename or Block attributes.
# Trial numbering of the separate blocks
block_numbering <- c(block1 %@% "TrialNo", block2 %@% "TrialNo")
data.frame(Basename = session %@% "Basename",
           BlockNo = session %@% "Block",
           OrigTrialNo = block_numbering,
           TrialNo = session %@% "TrialNo")
The highest level of abstraction is the Task. It contains all the blocks for all the subjects in a task directory. Just like a Session or a Block, it is just a list of Trial objects.
coartic <- Task("data/Coartic_WFFArea_2a/")
length(coartic)
Suppose we want to load data from 50 subjects. An idiomatic R approach might be something like lapply(subject_paths, Session), where we apply the Session function to each element in a vector of subject directories. But wait, there's a problem with the data for the 47th subject! After spending a couple minutes loading the data for 46 subjects, the whole thing crashes and we lose everything. Barf! This problem happens occasionally, especially for our eye-tracking tasks involving toddlers who fuss out from time to time.
Task() fails gracefully when it encounters a bad block. Here I have simulated some bad data by truncating the data-files for a block midway through the experiment, which happens when we abort an experiment.
(blocks_to_load <- list.files("data/RWL_WFFArea_Long/", recursive = TRUE,
                              pattern = "gazedata"))
task <- Task("data/RWL_WFFArea_Long/")
# 3 of 4 blocks loaded
unique(task %@% "Basename")
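The general pattern behind this kind of graceful failure is to wrap each load in tryCatch() so that one bad input is skipped instead of taking down the whole lapply(). The sketch below shows the pattern with a fake loader; `load_one` and the paths are hypothetical stand-ins, and I am not claiming Task() is written this way.

```r
# Hypothetical loader that fails on one input, standing in for a real loader
load_one <- function(path) {
  if (grepl("bad", path)) stop("truncated block: ", path)
  list(path = path)
}

paths <- c("subject01", "subject02_bad", "subject03")

# Wrap each call so that an error yields NULL (plus a note) instead of aborting
results <- lapply(paths, function(p) {
  tryCatch(load_one(p), error = function(e) {
    message("Skipping ", p, ": ", conditionMessage(e))
    NULL
  })
})

# Drop the failures and keep whatever loaded
loaded <- Filter(Negate(is.null), results)
length(loaded)
```

The two good inputs survive, and the message records which one was skipped and why.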
Sys.time()
sessionInfo()