knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  fig.align = "center",
  fig.width = 6,
  fig.height = 4
)
library(papss)
library(mgcv)

set.seed(2022) # For replicability.
papss
According to Hoeks & Levelt (1993), the pupil dilation change resulting from an increase in 'cognitive demand' is best modeled using a Gamma-Erlang function, with shape parameter $n$ and peak latency $t_{max}$:

$h(t) = t^{n} \cdot e^{-n \cdot t / t_{max}}$
In their original work they focused on isolated 'demand spikes' linked to clearly identifiable events in their experiment. Below we visualize their 'pupil-response function'. Note that the t_max parameter is set to 930, as proposed by Hoeks & Levelt (1993), so that the pupil response peaks around 930 ms after the spike occurred.
# First we create a time variable.
time <- seq(0, 2500, by = 20) # Time at 50 Hz

# Index vector for possible pulse locations, pointing to each sample
pulse_locations <- seq(1, length(time), by = 1)

# Now we can calculate the effect on the pupil of an isolated spike
# that happened during the 5th sample, using the default parameters
# estimated by Hoeks & Levelt.
# Note that the 'f' parameter was added later by Wierda et al. (2012) to
# scale the response of the pupil.
pupil_dilation <- h_basis(5, time, 0, time, pulse_locations,
                          rep(1, length.out = length(time)),
                          n = 10.1, t_max = 930, f = 1/(10^24))

# Which we can now visualize...
plot(time, pupil_dilation, type = "l", lwd = 3,
     xlab = "time", ylab = "Pupil dilation")

# Together with the spike...
abline(v = time[5], lty = 2, lwd = 3, col = "red")
legend("topright", c("Pupil response", "Spike in demand"),
       lty = c(1, 2), lwd = 3, col = c("black", "red"))
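The peak location claimed above can be verified directly from the Gamma-Erlang formula: differentiating $t^{n} \cdot e^{-n \cdot t / t_{max}}$ with respect to $t$ shows that the maximum lies exactly at $t = t_{max}$. A minimal stand-alone sketch, evaluating the formula without h_basis:

# Evaluate the response function directly (stand-alone sketch, not papss).
t <- seq(0, 2500, by = 1)
n <- 10.1
t_max <- 930
h <- t^n * exp(-n * t / t_max)

# The maximum lies at t_max = 930 ms, matching the peak in the plot above.
t[which.max(h)]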
You can check the documentation of the h_basis function if you want to know more (or read their paper!):
help(h_basis)
Based on their analysis, Hoeks & Levelt (1993) argued that an observed pupil dilation time-course is nothing more than the result of multiple weighted spikes, each having an additive effect on the dilation of the pupil:

$pupil(t) = \sum_{i} \beta_i \cdot h_i(t)$
# Now let's add some more spikes - and vary their weight!

# This variable will hold the sum of the individual responses - i.e., what we could observe!
pupil <- rep(0, length.out = length(time))

# We now refer to the resulting variables as h_{i}
H <- matrix(nrow = length(time), ncol = 4)

h1 <- h_basis(5, time, 0, time, pulse_locations,
              rep(1, length.out = length(time)),
              n = 10.1, t_max = 930, f = 1/(10^24))
H[,1] <- h1

# Don't forget to add the weighted contribution of this spike to the overall time-course
pupil <- pupil + 0.3 * h1

# Visualization of the individual contribution
plot(time, 0.3 * h1, type = "l", lwd = 3, xlab = "time",
     ylab = "Pupil dilation", ylim = c(0, 70), lty = 2)

# Together with the spike...
abline(v = time[5], lty = 2, lwd = 3, col = "black")

# Sample some weights
betas <- runif(3)

# Now add more spikes!
iw <- 1
for (si in c(10, 15, 20)) {
  hi <- h_basis((si*3), time, 0, time, pulse_locations,
                rep(1, length.out = length(time)),
                n = 10.1, t_max = 930, f = 1/(10^24))

  pupil <- pupil + betas[iw] * hi
  H[,(iw+1)] <- hi

  lines(time, betas[iw] * hi, lwd = 3, lty = 2, col = si)
  abline(v = time[(si*3)], lty = 2, lwd = 3, col = si)
  iw <- iw + 1
}

# And the resulting pupil dilation time-course we could actually observe!
lines(time, pupil, lwd = 3)

# Merge weights for later
betas <- c(0.3, betas)
Our package papss tries to find the set of weights that most likely generated an observed pupil dilation time-course. However, papss is not limited to the assumption made by Hoeks & Levelt that there are only a couple of isolated spikes. Rather, it attempts to recover continuous changes in the underlying demand trajectory. The idea that demand changes continuously traces back to the work by Wierda and colleagues (2012), who showed that they could track more continuous-like changes in demand with "spikes" assumed to happen every 100 ms. papss goes one step further and models truly continuous changes in demand, with possible "spikes" at every measured sample. This document provides instructions on how papss can be used for an analysis of the pupil dilation time-course.
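To see why weight recovery is feasible at all, note that the observed time-course is a linear combination of known basis functions, so with noiseless data the weights follow from ordinary least squares. The sketch below recovers the four weights from the simulated trace above; this is an illustration only, since papss itself solves a penalized non-negative least-squares problem (see the references at the end of this document).

# Illustration: with a noiseless trace and known basis matrix H,
# the weights follow from the normal equations. papss instead relies on
# penalized non-negative least squares, since demand cannot be negative.
recovered <- as.vector(solve(t(H) %*% H, t(H) %*% pupil))

# Compare recovered weights to the true ones.
cbind(true = betas, recovered = recovered)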
Of course we first need data. papss imposes some requirements on the layout (a minimal example follows this list):

- papss expects pupil dilation data in a pupil column. If your column has a different name, pass it to the pupil_id argument when calling papss::pupil_solve().
- papss expects the time variable in a time column. If your column has a different name, pass it to the time_id argument when calling papss::pupil_solve().
- papss by default expects a subject factor, but it is also possible to calculate demand trajectories for each level of any other categorical predictor (e.g., conditions in an experiment). In that case the name of the factor column needs to be passed to papss::pupil_solve() via the factor_id argument.
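For concreteness, a minimal toy data set (with hypothetical values) that satisfies these requirements could look like this:

# Hypothetical toy data in the expected layout: one row per sample,
# with 'subject', 'time', and 'pupil' columns.
toy_dat <- data.frame(subject = factor(rep(1:2, each = 3)),
                      time = rep(c(0, 20, 40), times = 2),
                      pupil = c(0.0, 0.4, 1.1, 0.1, 0.5, 0.9))
head(toy_dat)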
# Load simulated data
dat <- readRDS("../data/sim_dat.rds")
head(dat)
The code below forms simple averages over time and subject, because the method by Hoeks & Levelt, its extensions (e.g., Wierda et al., 2012; Denison et al., 2020), and papss all require aggregated data.
Important: the data has to be aggregated separately for each level of the factor that is passed via the factor_id argument (here: subject).
In some experimental setups, aggregation of the data is not desirable, for example with manipulations of a continuous predictor variable, or with between-participant differences in the experimental stimuli that were presented. In such cases, model-based aggregates may be a better solution: an additive model of the pupil that takes the effects of these predictors on the size of the pupil into account (for example using 'mgcv'; see Wood, 2017) can be estimated first, and predictions from that model can then serve as the aggregates.
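A minimal sketch of such a model-based aggregation, assuming hypothetical trial-level data trial_dat with a continuous nuisance predictor x (both names are illustrative and not part of papss):

# Hypothetical sketch of model-based aggregation with mgcv:
# 'trial_dat' and the continuous predictor 'x' are illustrative names.
library(mgcv)
m <- gam(pupil ~ subject + s(time, by = subject) + s(x),
         data = trial_dat)

# Predicting at a fixed value of x (here its mean) yields per-subject
# trajectories adjusted for the effect of x.
newd <- expand.grid(time = unique(trial_dat$time),
                    subject = levels(trial_dat$subject),
                    x = mean(trial_dat$x))
newd$pupil <- predict(m, newdata = newd)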
# No NAs in simulated data :) so no check necessary here.
aggr_dat <- aggregate(list("pupil" = dat$pupil),
                      by = list("subject" = dat$subject,
                                "time" = dat$time),
                      FUN = mean)

# Sort according to subject here!
aggr_dat <- aggr_dat[order(aggr_dat$subject),]

# Plot averages.
plot(aggr_dat$time[aggr_dat$subject == 1],
     aggr_dat$pupil[aggr_dat$subject == 1],
     type = "l", ylim = c(min(aggr_dat$pupil), max(aggr_dat$pupil)),
     xlab = "Time", ylab = "Pupil dilation",
     main = "Average pupil trajectories for all subjects", lwd = 3)

for (si in 2:length(unique(aggr_dat$subject))) {
  sub_aggr_dat <- aggr_dat[aggr_dat$subject == si,]
  lines(sub_aggr_dat$time, sub_aggr_dat$pupil, col = si, lwd = 3)
}
Demand recovery is mainly handled via the papss::pupil_solve() function, which automatically calculates possible demand-spike locations. Note that by default papss also attempts to correct for pulses that happened before the recording window! In the simulation here, demand only starts to increase within the time-window of interest. With real data, however, demand may already fluctuate before stimulus onset, due to anticipation, fatigue, or other causes. Because of the lagged pupil response, pulses outside of the time-window (here, pulses before time-point 0) will then still influence the pupil response within the time-window of interest.
Below we summarize the most important function arguments that usually need to be selected/adjusted for different experiments:
- pulse_spacing: Controls the spacing (in samples) between spikes. This should usually be set to 1 (i.e., a "spike" every sample; a 2 would mean a "spike" every two samples). Note that for the spacing used by Wierda et al. (2012), the pupil dilation time-course should be down-sampled to 50 Hz (20-ms samples) and pulse_spacing set to 5, resulting in a pulse every 100 ms (see the sketch after this list).
- data: Aggregated data that adheres to the layout restrictions outlined earlier!
- factor_id: Name of the factor column that distinguishes between the different aggregated pupil dilation time-courses. I.e., for each level of this factor the aggregated data should contain one pupil-dilation time-course.
- expand_by: Controls the time (in ms; defaults to 500 ms) before the time-window of interest in which pulses may originate that influence the pupil response.
- sample_length: The duration of a single sample (in ms). I.e., if the pupil-dilation time-course is down-sampled to 50 Hz, this argument needs to be set to 20 (the default).
- drop_last: Controls the dropping of pulses towards the end of the time-window (in ms; e.g., drop_last = 300 drops pulses from the last 300 ms), as these are often over-estimated (as discussed in Wierda et al., 2012). Increasing the number of dropped pulses obviously frees more basis functions that can be dedicated to accounting for baseline fluctuations, so in practice researchers will usually have to find a good balance between this argument and expand_by. papss will prompt a warning and cancel estimation in case the problem becomes unidentifiable!
- f: Scaling parameter of the pupil response function, introduced by Wierda et al. (2012). This value is generally not known, and the choice depends largely on the selected pre-processing pipeline and the eye-tracker used for recording. For example, Wierda et al. (2012) used normalized pupil dilation time-courses; in that case a value of (1/(10^(27))) worked quite well. The simulation here generates data that is more in line with data baseline-corrected by means of subtraction; we achieved good results with a value of (1/(10^(24))) in that case. Note that this also depends on the value selected for t_max!
- t_max: The peak-latency parameter of the pupil response function (Hoeks & Levelt, 1993). Denison et al. (2020) showed that the default value of 930 ms is not always a good choice! (See Advanced topics at the end of this document.)
- maxiter_outer: Maximum number of iterations for the outer estimator (spike penalization). Should be increased if papss does not converge (and these issues do not depend on t_max). To check convergence, inspect the convergence attribute returned by papss::pupil_solve(); a negative number means the algorithm did not converge.
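For example, to mimic the Wierda et al. (2012) spacing mentioned above, the two sampling-related arguments would be combined like this (a sketch; all other arguments keep their defaults):

# Sketch: at 50 Hz (sample_length = 20 ms), pulse_spacing = 5
# places one possible pulse every 100 ms, as in Wierda et al. (2012).
solvedWierda <- papss::pupil_solve(pulse_spacing = 5,
                                   sample_length = 20,
                                   data = aggr_dat,
                                   f = (1/(10^(24))))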
There are more arguments that alter how papss attempts to recover demand trajectories, but these are the most important ones! The help function for papss::pupil_solve() provides more information on the remaining arguments.
help(pupil_solve)
With choices made for all those parameters, we can now attempt to recover the subjects' demand trajectories! We set the maxiter_outer argument to 10 to speed up estimation for this example. In practice, good convergence requires between 15 and 30 iterations of the outer optimizer.
solvedPupil <- readRDS("../data/solvedPupil.rds")
# Solve pupil spline
solvedPupil <- papss::pupil_solve(pulse_spacing = 1,
                                  data = aggr_dat,
                                  maxiter_inner = 10000,
                                  maxiter_outer = 10,
                                  expand_by = 0,
                                  drop_last = 300, # Drop pulses from last 300 ms
                                  f = (1/(10^(24))))

recovered_coef <- solvedPupil$coef
model_mat <- solvedPupil$modelmat
Finally, we can take a look at the recovered demand trajectories and predicted pupil time-courses (plotted in red against the observed true ones). To extract demand curves, the extract_demand_for_fact() function can be used!
The most important arguments for the function are:

- aggr_dat: Aggregated data passed to the solver
- recovered_coef: The coefficients returned by pupil_solve()
- pulse_locations: The index vector pointing at pulse locations, returned by pupil_solve()
- real_locations: The vector with the time-points at which pulses are modelled, returned by pupil_solve()
- pulses_in_time: Boolean vector indicating which pulses were within the time-window, returned by pupil_solve()
- expanded_time: The expanded time series returned by papss::pupil_solve()
- expanded_by: Expansion time in ms divided by sample length in ms (i.e., 500/20 for pupil_solve(expand_by=500, sample_length=20))
- factor_id: Name of the factor column in aggr_dat
- t: Choice for the t_max parameter
- f: Choice for the f parameter
You can again check help(extract_demand_for_fact) to learn more about the remaining arguments as well as default values.
# We are in the fortunate situation that we have the true demand
add_sim <- readRDS("../data/sim_dat_gen_model.rds")
# Returns data-frame with demand over time per subject
demand_extract <- extract_demand_for_fact(aggr_dat,
                                          solvedPupil$coef,
                                          solvedPupil$pulseLocations,
                                          solvedPupil$realLocations,
                                          solvedPupil$pulsesInTime,
                                          solvedPupil$expandedTime,
                                          0/20, # expand_by/sample_length arguments for pupil_solve()
                                          factor_id = "subject")

# Get demand dat for each subject and plot
par(mfrow = c(1,2))
for (sub in unique(aggr_dat$subject)) {
  sub_dat <- demand_extract[demand_extract$factor == sub,]

  # Since this is simulated data we can compare the solutions to the simulated truth
  plot(sub_dat$time, add_sim$sub_demand[,as.numeric(sub)],
       type = "l", xlab = "Time", ylab = "Demand", main = sub,
       ylim = c(0, 1.0), lwd = 3)

  # Now plot recovered estimate
  lines(sub_dat$time, sub_dat$demand, lty = 2, col = "red", lwd = 3)
}
legend("topleft", c("True Demand", "Estimated Demand"),
       lty = c(1,2), col = c("black","red"))
par(mfrow = c(1,1))
papss essentially estimates an additive model of pupil dilation:
$pupil_i = pupilSpline_i + \epsilon_i$
The $pupilSpline$ corresponds to the sum of weighted pupil response functions! Thus, we also expect that the $\epsilon_i$ are normally distributed:
$\epsilon_i \sim N(0, \sigma_e)$
This becomes important when we want to get an estimate of parameter confidence using bootstrapping (Efron & Tibshirani, 1993). papss supports only residual resampling and, as discussed by Efron & Tibshirani, this requires evaluating the residuals carefully to ensure that they match what we would expect according to the model:
plot(solvedPupil$fitted, solvedPupil$resid,
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

acf(solvedPupil$resid, main = "Auto-correlation in residuals")
The residuals should show approximately constant variance and low correlation across different lags (Efron & Tibshirani, 1993). Only then can the confidence estimates recovered below be trusted.
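Conceptually, a single residual-bootstrap replicate resamples the residuals with replacement and adds them back onto the fitted values; papss::bootstrap_papss_standard_error(), used below, repeats this N times and re-estimates the model each time. A sketch of one replicate:

# One conceptual residual-bootstrap replicate (illustration only).
boot_pupil <- solvedPupil$fitted +
  sample(solvedPupil$resid, length(solvedPupil$resid), replace = TRUE)

# Re-fitting the model to 'boot_pupil' would yield one bootstrap
# estimate of the coefficients.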
bootstrap <- readRDS("../data/bootstrap.rds")
# Perform bootstrap estimation.
bootstrap <- papss::bootstrap_papss_standard_error(solvedPupil$coef,
                                                   aggr_dat$pupil,
                                                   solvedPupil$setup,
                                                   N = 15, # This should be higher in practice!
                                                   maxiter_inner = 100000,
                                                   maxiter_outer = 10,
                                                   f = (1/(10^(24))))
We can now visualize the individual demand trajectories with an estimate of the uncertainty surrounding them.
se <- bootstrap$standardError

# Add se plot to demand dat extracted earlier
par(mfrow = c(1,2))
si <- 1
for (sub in unique(aggr_dat$subject)) {
  sub_dat <- demand_extract[demand_extract$factor == sub,]

  # Get standard error for ALL coef for this subject
  sub_se <- se[((si - 1) * length(solvedPupil$pulseLocations) + 1):
               (si * length(solvedPupil$pulseLocations))]

  # Only consider those in the time-window of interest, so drop those added
  # because of expand_by!
  sub_se <- sub_se[solvedPupil$pulsesInTime]

  # Embed those in the time-variable for the plot
  sub_se_e <- rep(0, times = length(sub_dat$time))
  sub_se_e[sub_dat$time %in% solvedPupil$realLocations[solvedPupil$pulsesInTime]] <- sub_se

  plot(sub_dat$time, sub_dat$demand, type = "l",
       xlab = "Time", ylab = "Demand", main = sub,
       ylim = c(0, max(demand_extract$demand)), lwd = 3)

  lines(sub_dat$time, sub_dat$demand + sub_se_e, lty = 2)
  lines(sub_dat$time, sub_dat$demand - sub_se_e, lty = 2)

  si <- si + 1
}
par(mfrow = c(1,1))
So far we simply used the parameters for the pupil response function provided by Hoeks & Levelt (1993): t_max=930 and n=10.1. In practice, these values are good starting points, but they are unlikely to be perfect! Denison et al. (2020), for example, reported that the former varied a lot between different subjects. They recommend that t_max should generally be estimated.
papss includes a cross-validation function, based on the procedure recommended by the authors, to achieve this:
cross_val_errs <- readRDS("../data/cross_val_errs.rds")
# Method expects numeric trial variable for easy splitting.
dat$num_trial <- as.numeric(dat$trial)

# Randomize trial order
unq_trials <- unique(dat$num_trial)
unq_trials <- sample(unq_trials, length(unq_trials))

# Split data into n=4 folds
n_trials <- length(unq_trials)
fold_order <- cut(1:n_trials, 4, labels = F)

# Collect the trials belonging to each fold
folds <- list()
for (f in unique(fold_order)) {
  folds[[f]] <- unq_trials[fold_order == f]
}
# Perform cross-validation to find tmax!
cross_val_errs <- papss::cross_val_tmax(cand_tmax = c(1130,1030,930,830,730), # Candidate t_max values to be checked
                                        folds = folds,
                                        pulse_spacing = 1,
                                        trial_data = dat,
                                        maxiter_inner = 10000,
                                        maxiter_outer = 10,
                                        model = "WIER_SHARED",
                                        start_lambda = 1.0,
                                        expand_by = 0,
                                        drop_last = 300,
                                        f = (1/(10^(24))),
                                        should_plot = F)
plot(c(1130,1030,930,830,730), cross_val_errs,
     ylim = c(min(cross_val_errs) - 0.01*min(cross_val_errs),
              max(cross_val_errs) + 0.01*max(cross_val_errs)),
     ylab = "Cross-validation error", xlab = "t_max candidates")
lines(c(1130,1030,930,830,730), cross_val_errs)
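The candidate with the smallest cross-validation error is then selected; the same selection logic is applied per subject in the multi-subject example further below:

# Select the t_max candidate with the lowest cross-validation error.
tmax_candidates <- c(1130, 1030, 930, 830, 730)
best_tmax <- tmax_candidates[which.min(cross_val_errs)]
best_tmax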
In this simulation all subjects were very similar in how they experienced demand. This is not reflective of real data. Usually there is data from multiple experimental conditions, and subjects' pupil dilation time-courses will differ a lot (see for example Denison et al., 2020). In that case, fitting a shared model across subjects is often not appropriate, because the demand solutions for different subjects will likely differ in smoothness as well.
In these situations it is better to first estimate the t_max parameter for each subject. This can be based on the data from all experimental conditions, following Denison et al. (2020). Then the best t_max model for each subject should be fitted, again using the data from all experimental conditions!
This setup is outlined in the code example below:
# First build condition dat
condDat <- NULL
n <- 2
for (ci in 1:2) {
  cDat <- additive_pupil_sim(n_sub = n,
                             slope_dev = 1.5,
                             sub_dev = 0.15,
                             trial_dev = 0.25,
                             residual_dev = 15.0,
                             should_plot = T,
                             seed = 2022 + ci)$data
  cDat$condition <- ci
  condDat <- rbind(condDat, cDat)
}
condDat$condition <- as.factor(condDat$condition)

# Now find the best t_max parameters for all subjects
best_t_max_all <- c()
tmax_candidates <- c(1130, 1030, 930, 830, 730)

for (si in 1:n) {
  # Get subject data
  sub_dat <- condDat[condDat$subject == si,]

  # Perform cross-validation for this subject!
  # Start with making continuous trial variable again
  sub_dat$num_trial <- as.numeric(sub_dat$trial)

  # Randomize trial order
  unq_trials <- unique(sub_dat$num_trial)
  unq_trials <- sample(unq_trials, length(unq_trials))

  # Split data into n=4 folds
  n_trials <- length(unq_trials)
  fold_order <- cut(1:n_trials, 4, labels = F)

  folds <- list()
  for (f in unique(fold_order)) {
    folds[[f]] <- unq_trials[fold_order == f]
  }

  # Perform cross-validation to find tmax!
  cross_val_errs <- papss::cross_val_tmax(cand_tmax = tmax_candidates,
                                          folds = folds,
                                          pulse_spacing = 1,
                                          trial_data = sub_dat,
                                          factor_id = "condition", # Estimate demand per condition for this subj.!
                                          maxiter_inner = 100000,
                                          maxiter_outer = 10,
                                          model = "WIER_SHARED",
                                          start_lambda = 0.1,
                                          expand_by = 0,
                                          drop_last = 300,
                                          f = (1/(10^(24))))

  # Get best t_max for this subject!
  min_err <- min(cross_val_errs)
  best_tmax <- tmax_candidates[cross_val_errs == min_err]
  cat("Best tmax: ", best_tmax, "\n")
  best_t_max_all <- c(best_t_max_all, best_tmax)
}

# Now fit the best model for each subject
demand_diff_dat <- NULL
for (si in 1:n) {
  # Get subject data again
  sub_dat <- condDat[condDat$subject == si,]

  # Get subject's t_max
  subj_t_max <- best_t_max_all[si]

  # Now create by-condition averages!
  aggr_cond_dat_subj <- aggregate(list("pupil" = sub_dat$pupil),
                                  by = list("time" = sub_dat$time,
                                            "condition" = sub_dat$condition),
                                  FUN = mean)
  aggr_cond_dat_subj <- aggr_cond_dat_subj[order(aggr_cond_dat_subj$condition),]

  # Solve the model for this subject
  solvedPupil <- papss::pupil_solve(pulse_spacing = 1,
                                    data = aggr_cond_dat_subj,
                                    factor_id = "condition", # Again, estimate demand per condition.
                                    maxiter_inner = 100000,
                                    maxiter_outer = 10,
                                    model = "WIER_SHARED",
                                    start_lambda = 0.1,
                                    expand_by = 0,
                                    drop_last = 300,
                                    f = (1/(10^(24))),
                                    t_max = subj_t_max)

  # Get the demand trajectories for all conditions per subject
  demand_dat_sub <- papss::extract_demand_for_fact(aggr_cond_dat_subj,
                                                   solvedPupil$coef,
                                                   solvedPupil$pulseLocations,
                                                   solvedPupil$realLocations,
                                                   solvedPupil$pulsesInTime,
                                                   solvedPupil$expandedTime,
                                                   (0/20),
                                                   factor_id = "condition")

  # These can then be investigated further and possibly even be aggregated
  # to check whether there is a shared trend across subjects - but see remark
  # below!
}
Important: If there is a lot of between-subject variation in the demand trajectories, aggregating subjects' demand trajectories will likely not reveal a meaningful shared trend. In that case, pupil_solve() is better called on condition averages instead of subject averages (as recommended by Hoeks & Levelt, 1993). Subjects' demand trajectories should still be estimated to get an estimate of between-subject differences.
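A sketch of this condition-average setup, reusing condDat from the example above (the arguments mirror the per-subject calls; solvedCond is a hypothetical name):

# Sketch: aggregate over subjects within each condition, then estimate
# one demand trajectory per condition via factor_id = "condition".
aggr_cond_dat <- aggregate(list("pupil" = condDat$pupil),
                           by = list("time" = condDat$time,
                                     "condition" = condDat$condition),
                           FUN = mean)
aggr_cond_dat <- aggr_cond_dat[order(aggr_cond_dat$condition),]

solvedCond <- papss::pupil_solve(pulse_spacing = 1,
                                 data = aggr_cond_dat,
                                 factor_id = "condition",
                                 maxiter_inner = 100000,
                                 maxiter_outer = 10,
                                 model = "WIER_SHARED",
                                 start_lambda = 0.1,
                                 expand_by = 0,
                                 drop_last = 300,
                                 f = (1/(10^(24))))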
papss offers multiple model setups, not just the one inspired by Wierda et al.'s setup (i.e., with extra slope terms). Denison et al. (2020), for example, included intercept terms for each subject, and setting model="DEN_SHARED" will similarly estimate intercept terms instead of slope terms.
See:
help(WIER_SHARED_NNLS_model_setup)
help(WIER_IND_NNLS_model_setup)
help(DEN_SHARED_NNLS_model_setup)
help(WIER_DEN_SHARED_NNLS_model_setup)
If you want to play around with the simulation code that generated the data here, you can explore the following function:
# Used to generate data
n <- 15
sim_obj <- additive_pupil_sim(n_sub = n,
                              slope_dev = 1.5,
                              sub_dev = 0.15,
                              trial_dev = 0.25,
                              residual_dev = 15.0,
                              should_plot = T)
NNLS optimizer used here:
Ang, A. (2020a). Accelerated gradient descent for large-scale optimization: On Gradient descent solving Quadratic problems—Guest lecture of MARO 201—Advanced Optimization. https://angms.science/doc/teaching/GDLS.pdf
Ang, A. (2020b). Nonnegative Least Squares—PGD, accelerated PGD and with restarts. https://angms.science/doc/NMF/nnls_pgd.pdf
Pupil dilation modelling:
Denison, R. N., Parker, J. A., & Carrasco, M. (2020). Modeling pupil responses to rapid sequential events. Behavior Research Methods, 52(5), 1991–2007. https://doi.org/10.3758/s13428-020-01368-6
Fink, L., Simola, J., Tavano, A., Lange, E. B., Wallot, S., & Laeng, B. (2021). From pre-processing to advanced dynamic modeling of pupil data. PsyArXiv. https://doi.org/10.31234/osf.io/wqvue
Hoeks, B., & Levelt, W. (1993). Pupillary dilation as a measure of attention: A quantitative system analysis. Behavior Research Methods, Instruments, & Computers, 25, 16–26.
Wierda, S. M., van Rijn, H., Taatgen, N. A., & Martens, S. (2012). Pupil dilation deconvolution reveals the dynamics of attention at high temporal resolution. Proceedings of the National Academy of Sciences of the United States of America, 109(22), 8456–8460. https://doi.org/10.1073/pnas.1201858109
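Bootstrapping:

Efron, B., & Tibshirani, R. J. (1993). An Introduction to the Bootstrap. Chapman and Hall/CRC.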
NNLS and acceleration in general:
Slawski, M., & Hein, M. (2014). Non-negative least squares for high-dimensional linear models: Consistency and sparse recovery without regularization. ArXiv:1205.0953 [Math, Stat]. http://arxiv.org/abs/1205.0953
Sutskever, I., Martens, J., Dahl, G., & Hinton, G. (2013). On the importance of initialization and momentum in deep learning. Proceedings of the 30th International Conference on Machine Learning, 1139–1147. https://proceedings.mlr.press/v28/sutskever13.html
GAMMs in general and the specific spike penalization method used here:
Wood, S. N., & Fasiolo, M. (2017). A generalized Fellner-Schall method for smoothing parameter optimization with application to Tweedie location, scale and shape models. Biometrics, 73(4), 1071–1081. https://doi.org/10.1111/biom.12666
Wood, S. N. (2017). Generalized Additive Models: An Introduction with R, Second Edition (2nd ed.). Chapman and Hall/CRC.