library(lme4)
library(RePsychLing)
library(knitr)
opts_chunk$set(comment=NA)
options(width=92,show.signif.stars = FALSE)

We apply the iterative reduction of LMM complexity to truncated response times of a 2x2x2 factorial psycholinguistic experiment (@Kronmuller:Barr:2007, Exp. 2; reanalyzed with an LMM in @Barr:Levy:Scheepers:Tily:13). The data are from 56 subjects who responded to 32 items. Specifically, subjects had to select one of several objects presented on a monitor with a cursor. The manipulations involved (1) auditory instructions that maintained or broke a precedent of reference for the objects established over prior trials, (2) with the instruction being presented by the speaker who established the precedent (i.e., an old speaker) or a new speaker, and (3) whether the task had to be performed without or with a cognitive load consisting of six random digits. All factors were varied within subjects and within items. There were main effects of Load, Speaker, and Precedent; none of the interactions were significant. Although the standard errors of the fixed-effect coefficients varied slightly across models, our reanalyses afforded the same statistical inferences about the experimental manipulations as the original article, irrespective of the LMM specification. The purpose of the analysis is to illustrate an assessment of model complexity with respect to variance components and correlation parameters, neither of which was the focus of the original publication.

Data from @Kronmuller:Barr:2007

The data are available as kb07 in the RePsychLing package.

str(kb07)

Maximal linear mixed model (maxLMM)

Barr et al. (2013, supplement) analyzed Kronmüller and Barr (2007, Exp. 2) with the maxLMM comprising 16 variance components (eight each for the random factors subj and item; see the footnote below the output). This model takes a long time to fit with lmer because there are so many parameters and the likelihood surface is very flat. The lmm function from the MixedModels package for Julia is much faster at fitting this particular model, providing the results reproduced below.

m0 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1+S+P+C+SP+SC+PC+SPC|subj) +
           (1+S+P+C+SP+SC+PC+SPC|item), kb07, REML=FALSE, start=thcvg$kb07$m0,
           control=lmerControl(optimizer="Nelder_Mead",optCtrl=list(maxfun=1L),
                              check.conv.grad="ignore",check.conv.hess="ignore"))
print(summary(m0),corr=FALSE)

This fit converges and produces what look like reasonable parameter estimates (i.e., no variance components with estimates close to zero; no correlation parameters with values close to $\pm1$).

We started this model fit at the converged parameter estimates to save time. Starting from the usual initial values, the model fit takes nearly 40,000 iterations of the nonlinear optimizer to converge, and the parameter values it reaches, although they look reasonable, correspond to a local optimum. We use a better set of parameter values here.

The slow convergence is due to a total of 2 x 36 = 72 parameters in the optimization; these parameters are all in the relative covariance factors, each of the two 8 x 8 factors contributing 8 * 9 / 2 = 36 parameters. The more easily estimated eight fixed-effects coefficients and the residual variance have been "profiled out" of the optimization.
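
These counts can be checked directly on the fitted model (a quick sketch using lme4's getME() extractor; m0 is the maxLMM fit above):

# Covariance parameters optimized over ("theta") and profiled-out fixed effects ("beta").
length(getME(m0, "theta"))  # 72 relative covariance parameters (2 x 36)
length(getME(m0, "beta"))   # 8 fixed-effects coefficients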

Footnote: The model formula reported in the supplement of Barr et al. (2013) specified only five variance components for the random factor item. However, lmer() automatically includes all lower-order terms of interactions specified for random-factor terms, resulting in the maxLMM for this experimental design.

Evaluation of singular value decomposition (svd) for maxLMM

Considering that there are only 56 subjects and 32 items, it is quite optimistic to expect to estimate 36 highly nonlinear covariance parameters for subj and another 36 for item.

summary(rePCA(m0))

The directions are the principal components of this covariance matrix. We see that seven of the singular values are zero; that is, there is no variability in seven directions. Overall, the svd analysis of this model returns only eight principal components with variances larger than one percent. Thus, the maxLMM is clearly too complex.
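
The quantities reported by rePCA() are closely related to the relative covariance factors of the model; as a rough sketch (not part of the original analysis, and up to scaling and ordering), essentially the same information can be obtained from the singular values of the factors returned by getME():

# Singular values of the relative covariance factor for each grouping factor;
# values at or near zero indicate directions with essentially no variability.
sapply(getME(m0, "Tlist"), function(lambda) svd(lambda)$d)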

Zero-correlation-parameter linear mixed model (zcpLMM)

As a first step of model reduction, we propose to start with a model including all 16 variance components but no correlation parameters. Note that we go through these motions here to be consistent with the recommended strategy; the large number of components with zero or near-zero variance in the maxLMM already strongly suggests the need to reduce the number of variance components, as done in the next step. For this zcpLMM, we extract the vector-valued variables from the model matrix, without the intercept column that is provided by the R formula. Then we use the new double-bar syntax for lmer() to force the correlation parameters to zero.
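
The double-bar notation is shorthand for placing each term in its own random-effects block, which is what removes the correlation parameters; it behaves as intended only when the variables are numeric covariates, which is why they are taken from the model matrix. A small illustration with a shortened formula (using lme4's expandDoubleVerts() utility; shown only to make the expansion explicit):

# (1 + S + P || subj) expands into independent terms, one per variance component.
expandDoubleVerts(~ 1 + S + P + (1 + S + P || subj))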

m1 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1+S+P+C+SP+SC+PC+SPC||subj) +
             (1+S+P+C+SP+SC+PC+SPC||item), kb07, REML=FALSE)
print(summary(m1),corr=FALSE)

anova(m1, m0)  

Nominally, the zcpLMM fits significantly worse than the maxLMM, but note that the $\chi^2$ statistic for the LRT (85) is smaller than twice its degrees of freedom (2 x 56 = 112). Also, the degrees of freedom are somewhat of an underestimate. In our judgement, the zcpLMM could be preferred to the maxLMM.
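
The "twice the degrees of freedom" rule of thumb corresponds to a comparison by AIC: the difference in AIC between the two models equals $\chi^2 - 2\,df$, so a statistic below twice its degrees of freedom means the simpler model has the lower AIC. A quick check (an illustration only; the same AIC values appear in the anova() output above):

# Difference in AIC between the zcpLMM and the maxLMM; a negative value favors m1.
AIC(m1) - AIC(m0)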

Principal components analysis for zcpLMM

summary(rePCA(m1))

The PCA of the zcpLMM returns 12 of 16 components with variances different from zero. Thus, using this result as a guide, the zcpLMM is still too complex. Inspection of the zcpLMM variance components (see the summary of m1 above) suggests a further reduction of model complexity with drop1-LRT tests, starting with the smallest variance components.
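
To see which variance components are the smallest, the estimated standard deviations of m1 can be listed in increasing order (a convenience sketch; the labels depend on how lmer() names the expanded double-bar terms):

# Random-effect standard deviations of the zcpLMM, sorted from smallest to largest.
sort(unlist(lapply(VarCorr(m1), attr, "stddev")))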

Dropping non-significant variance components

A second step of model reduction is to remove variance components that are not significant according to a likelihood ratio test (LRT). Starting with the smallest variance component (or a set of them), this step can be repeated until a significant change in goodness of fit is indicated. For the present case, the variance components for SC and SPC for subj and for S and SP for item are estimated as zero. We refit the LMM without these variance components.

m2 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1+S+P+C+SP+PC||subj) +
             (1+P+C+SC+PC+SPC||item), kb07, REML=FALSE)
anova(m2, m1)  # not significant: prefer m2 over m1

Obviously, these four variance components are not supported by the information in the data. So we drop the next four smallest variance components: S and P for subj and SC and SPC for item.

m3 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1+C+SP+PC||subj) +  (1+P+C+PC||item),kb07,REML=FALSE)
anova(m3,  m2)  # not significant: prefer m3 over m2

There is no significant drop in goodness of fit. Therefore, we continue by dropping C, SP, and PC for subj and C and PC for item.

m4 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1|subj) + (1|item) + (0+P|item),kb07,REML=FALSE)
anova(m4, m3)  # not significant: prefer m4 over m3
anova(m4, m1)  # not significant: prefer m4 over m1 (no accumulation)

As a final test, we refit the LMM without the P variance component for item.

m5 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1|subj) + (1|item), data=kb07, REML=FALSE)
anova(m5, m4)  # significant: prefer m4 over m5

This time the LRT is significant. Therefore, we stay with LMM m4 and test correlation parameters for this model.

Extending the reduced LMM with a correlation parameter

m6 <- lmer(RTtrunc ~ 1+S+P+C+SP+SC+PC+SPC + (1|subj) + (1+P|item), kb07, REML=FALSE)
print(summary(m6), corr=FALSE)

anova(m4, m6)  # significant: prefer m6 over m4
anova(m6, m0)  # not significant: prefer m6 over m0 (no accumulation)

There is evidence for a reliable negative item-related correlation parameter between the mean response time and the precedent effect, over and above the reliable differences between items in the precedent effect established in the previous step. Finally, there is no significant difference between LMM m6 and the maxLMM m0. The final number of reliable dimensions is actually smaller than suggested by the PCA of the maxLMM m0.
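
The estimated correlation can be read off the summary above or extracted directly from the fitted model (a small sketch using lme4's VarCorr()):

# Item-level variance-covariance matrix of m6; its "correlation" attribute holds
# the estimated correlation between the intercept and the P (precedent) effect.
VarCorr(m6)$item
attr(VarCorr(m6)$item, "correlation")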

Profiling the parameters

Confidence intervals for all parameters can be obtained by profiling the objective function of the fitted model.
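
A sketch of how this could be done for the final model m6 with lme4's profile() and confint() methods (profiling the variance parameters may take a few minutes):

# Profile the deviance of m6 and compute 95% confidence intervals for the
# standard deviations, the correlation parameter, and the fixed-effects coefficients.
pr6 <- profile(m6)
confint(pr6)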

Summary

In our opinion, m6 is the optimal LMM for the data of this experiment. The general strategy of (1) starting with the maxLMM, (2) followed by the zcpLMM, (3) followed by iteratively dropping variance components until there is a significant decrease in goodness of model fit, (4) followed by inclusion of correlation parameters for the remaining variance components, and (5) using svd all along the way to check the principal dimensionality of the data for the respective intermediate models, worked quite well again. Indeed, we also reanalyzed two additional experiments reported in the supplement of Barr et al. (2013). As documented in the RePsychLing package accompanying the present article, in each case the maxLMM was too complex for the information provided by the experimental data. In each case, the data supported only a very sparse random-effects structure beyond varying intercepts for subjects and items. Fortunately and interestingly, none of the analyses changed the statistical inference about fixed effects in these experiments. Obviously, this cannot be ruled out in general. If authors adhere to a strict criterion for significance, such as p < .05 suitably adjusted for multiple comparisons, there is always a chance that a t-value will fall above or below the criterion across different versions of an LMM.

Given the degree of deconstruction (i.e., model simplification) reported for these models, one may wonder whether it might be more efficient to iteratively increase rather than decrease LMM complexity, that is, to start with a minimal linear mixed model (minLMM) varying only the intercepts for the subject and item factors, and to add variance components and correlation parameters to such a model. We will turn to this strategy in the next section.

Versions of packages used

sessionInfo()

References


