library(RePsychLing) library(knitr) library(rstan) library(parallel) library(xtable) opts_chunk$set(comment=NA) options(width=92, show.signif.stars = FALSE)
The data from Kronmüller and Barr (2007) is available as the kb07
data frame in the
RePscyhLing package for R
str(kb07)
As is common with factorial designs the experimental factors and their interactions are given
short names: a single character (S
,P
,C
) for the main effects, two-character names for the two-factor
interactions and a three-character name for the three-factor interaction.
The data are from 56 subjects who responded to 32 iterms. Specifically, subjects had to select one of several objects presented on a monitor with a cursor. The manipulations involved
P
) S
) aC
) All factors were expressed using a ±1
encoding which ensures that the scale of the interactions is the same as the
scale of the main effects and of the intercept term. The columns of the full-factorial model matrix, X
X <- unname(model.matrix(~ 1+S+P+C+SP+SC+PC+SPC, kb07)) attr(X,"assign") <- NULL str(X)
have the same length and are nearly orthogonal.
crossprod(X) # X'X
The response, RTtrunc
, is the reaction time after truncating some unusually large values.
Stan is a system for creating Markov chain Monte Carlo (MCMC) samplers for statistical models.
In the model specification the data, including dimensions of arrays, are described first.
standat <- ' data { int<lower=0> N; // num observations int<lower=1> K; // length of fixed-effects vector int<lower=0> M; // num subjects int<lower=1> J; // length of subj vector-valued random effects int<lower=0> L; // num items int<lower=1> I; // length of item vector-values random effects int<lower=1,upper=M> subj[N]; // subject indicator int<lower=1,upper=L> item[N]; // item indicator row_vector[K] X[N]; // model matrix for fixed-effects parameters row_vector[J] Zs[N]; // generator model matrix for subj random effects row_vector[I] Zi[N]; // generator model matrix for item random effects vector[N] y; // response vector (reaction time) } '
The response vector, y
, is expressed as a vector of N
elements.
The integer vectors subj
and item
are the indicators of subject and of item for each response.
The fixed-effects model matrix, X
, the subject-specific random effects model matrix, Zs
,
and the item-specific random effects model matrix, Zi
, are stored as vectors of N
row_vector
's.
The reason for storing the model matrices like this is to provide fast access to individual rows when
formulating the model.
For generality the number of columns in these matrices is specified separately as K
, J
and I
.
For the maximal model these will all be eight. The numbers of subjects and items are M
and L
, respectively.
In a Bayesian formulation all the random variables are considered model parameters; there is no distinction
between the random effects and the fixed-effects coefficients. The covariance matrices for the within-subject
and within-item random effects are specified according to the Cholesky factor of the correlation matrix,
a cholesky_factor_corr
type and the standard deviations, which are called taus
and taui
, respectively.
The spherical random effects, often written as $u$, are called us
and ui
respectively.
These are matrices that are stored as vectors of vectors.
stanpars <- ' parameters { cholesky_factor_corr[J] Ls; // Cholesky factor of subj r.e. correlations cholesky_factor_corr[I] Li; // Cholesky factor of item r.e. correlations vector<lower=0>[J] taus; // standard deviations of unconditional subj r.e. dist vector<lower=0>[I] taui; // standard deviations of unconditional item r.e. dist vector[J] us[M]; // spherical subj random effects vector[I] ui[L]; // spherical item random effects vector[K] beta; // fixed-effects real<lower=0> sigma; // standard deviation of response given random effects } '
The correlation matrices formed from the Cholesky factors are considered transformed parameters
stantrans <- ' transformed parameters { matrix[J,J] corrs; matrix[I,I] corri; corrs <- tcrossprod(Ls); // for monitoring subj correlations corri <- tcrossprod(Li); // for monitoring item correlations } '
Finally, the model is expressed in terms of square matrices Lambdas
and Lambdai
that are
the Cholesky factors of the covariance matrices for the subject-specific and item-specific
random effects, respectively.
The prior distributions for the parameters (taus
, taui
, Ls
, Li
, us
, and ui
) are
also specified at this point.
No prior distribution is provided for beta
, implying that this vector has a "flat" or "locally uniform"
improper prior distribution.
It could be given, say, a diffuse multivariate Gaussian distribution, but doing so has no effect on inferences.
stanmod <- ' model { matrix[J,J] Lambdas; vector[J] bs[M]; matrix[I,I] Lambdai; vector[I] bi[L]; taus ~ cauchy(0,2.5); taui ~ cauchy(0,2.5); Ls ~ lkj_corr_cholesky(2); Li ~ lkj_corr_cholesky(2); Lambdas <- diag_pre_multiply(taus,Ls); Lambdai <- diag_pre_multiply(taui,Li); for (m in 1:M) { us[m] ~ normal(0,1); bs[m] <- Lambdas * us[m]; } for (l in 1:L) { ui[l] ~ normal(0,1); bi[l] <- Lambdai * ui[l]; } for (n in 1:N) y[n] ~ normal(X[n] * beta + Zs[n] * bs[subj[n]] + Zi[n] * bi[item[n]], sigma); } '
In the last loop the linear predictor for y[n]
is evaluated as the sum of three products of
row_vector
s and vector
s. A row_vector
multiplied by a vector
is a scalar, which is the reason
for storing X
, Zs
and Zi
as vectors of row_vector
s.
These pieces are concatenated to form the model
model <- paste(standat, stanpars, stantrans, stanmod)
The model is compiled via a call to stan
that includes the model and the data.
The data must be specified as a list or as an environment.
For the maximal model, X
, Zs
and Zi
are the model matrix for the full factorial
maxdat <- within(list(), { N <- nrow(X) K <- J <- I <- ncol(X) M <- length(levels(kb07$subj)) L <- length(levels(kb07$item)) X <- Zs <- Zi <- unname(X) y <- kb07$RTtrunc subj <- as.integer(kb07$subj) item <- as.integer(kb07$item) } ) str(maxdat)
Initially we set the number of chains to zero to check that the model compiles properly
maxmodel <- stan(model_name="maxmodel", model_code=model, data=maxdat, chains=0)
We use mclapply
from the parallel
package to generate the chains in parallel.
system.time(KB07_stan <- sflist2stanfit( mclapply(1:4, mc.cores = 4, # adjust number of cores to suit function(i) stan(fit = maxmodel, data = maxdat, iter=2000, chains = 1, chain_id = i, refresh = -1)) ) )
We see that the elapsed time is considerably less than the user time. This is because four processes are running in parallel.
A close examination of the timing shows that much more time is spent in the "warmup" phase than in actually generating samples. Stan defaults to using a form of Hamiltonian Monte Carlo (HMC) sampling called a "No U-Turn Sampler" (NUTS) and it is tuning these samplers that is taking most of the time.
KB07_results<- summary(KB07_stan, pars=c("beta", "sigma", "taus","taui", "corrs","corri"), probs = c(0.025, 0.975), digits_summary = 3) rownames(KB07_results$summary)
Note that all of the correlation matrix elements are monitored even though
?stan the diagonal elements are, by definition, unity and the matrix must be symmetric. An arduous extraction provides the table
print(xtable(KB07_results$summary[c(1:25,27:33,36:41,45:49,55:57,62:64,81,91:97,100:105),c(1,4,5)]), type="html")
Note that most of the correlations, especially those for the item-specific random effects, have a mean close to zero and upper and lower limits that are more-or-less symmetric. This is strong evidence that these could be zero.
Because of the way the priors are defined the taus
and taui
values cannot become zero. However, many of these values are very close to zero. The only standard deviations that are substantion are the by-subject intercept and the by-item intercept and coefficient for P
.
The final, reduced model has a single column (the intercept) in Zs
and two columns (intercept and main-effect for P
) in Zi
.
finaldat <- within(list(), { N <- nrow(X) K <- ncol(X) J <- 1L I <- 2L M <- length(levels(kb07$subj)) L <- length(levels(kb07$item)) X <- X Zs <- X[, 1, drop=FALSE] Zi <- X[, 1:2] y <- kb07$RTtrunc subj <- as.integer(kb07$subj) item <- as.integer(kb07$item) } ) str(finaldat)
system.time(KB07_finalstan <- sflist2stanfit( mclapply(1:4, mc.cores = 4, # adjust number of cores to suit function(i) stan(fit = maxmodel, data = finaldat, iter=2000, chains = 1, chain_id = i, refresh = -1)) ) )
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.