fitLMER.fnc: Back-fit fixed effects and forward-fit random effects of an...
In LMERConvenienceFunctions: Model Selection and Post-Hoc Analysis for (G)LMER Models

Description Usage Arguments Details Value Warnings Note Author(s) References See Also Examples

The function follows these steps: (1) If llrt is set to TRUE, set REML to FALSE (unless specified otherwise); (2) back-fit initial model either on F- (by default) or on t/z-values; (3) forward-fit random effects; (4) re-back-fit fixed effects; (5) if llrt is set to TRUE, set REML to TRUE (unless specified otherwise). Note that, this function CAN be used with generalized linear mixed-effects models (glmers).

fitLMER.fnc(model, item = FALSE, backfit.on = c("F",
"t"), method = c("F", "t", "z", "llrt", "AIC", "BIC", "relLik.AIC", 
"relLik.BIC"), threshold = NULL, t.threshold = NULL, 
ran.effects = list(ran.intercepts = as.character(), 
slopes = as.character(), corr = as.character(), 
by.vars = as.character()), alpha = NULL, alphaitem = NULL, 
if.warn.not.add = TRUE, prune.ranefs = TRUE, p.value = "upper", 
set.REML.FALSE = TRUE, keep.single.factors = FALSE, 
reset.REML.TRUE = TRUE, log.file.name = NULL)

`model`	A `mer` object (fitted by function `lmer`). This function can be used with generalized linear mixed-effects models (`glmer`s) if argument `backfit.on` is set to "t", but not if it is set to "F".
`item`	Whether or not to evaluate the addition of by-item random intercepts to the model, evaluated by way of log-likelihood ratio test. Either `FALSE` (the default, does not evaluate this addition) or the column name (quoted) of the item identifier (e.g., `"Item"`, `"Word"`).
`backfit.on`	Either "F" (default) or "t". Refers to the statistic which will be used to determine which term to test and potentially remove from the model. If you are backfitting a generalized linear mixed-effects model (`glmer`), make sure to set `backfit.on` to "t"; the algorithm efectively backfits on "z".
`method`	Backfitting method. One of "F" (p-value), "t" (t statistic), "z" (z statistic), "llrt", "AIC", "BIC", "relLik.AIC", or "relLik.BIC" (the latter two are based on relative likelihood, see function `relLik`). Defaults to "t". You can find information regarding differences between AIC and BIC from `http://methodology.psu.edu/eresources/ask/sp07`.
`threshold`	Method-specific threshold for parameter selection. It refers to alpha in the case of "F" and "llrt", to the t/z-value in case of "t" or "z", to the minimum reduction in likelihood in the case of "AIC" and "BIC", or to the minimum difference in probability in the case of "relLik.AIC" and "relLik.BIC". Defaults `NULL`, which means `0.05` for "F" and "llrt", 2 for "t", 5 for "AIC" and "BIC", and 4 for "relLik.AIC" and "relLik.BIC".
`t.threshold`	Defaults to `NULL`. If the `method = "t"` or `method = "z"`, it is the t/z-value below which a model term is dropped (if `t.threshold = NULL`, it will be set to 2). Otherwise it is the threshold for t/z-value below which a test (see `method`) is performed between a model with the term under consideration and a simpler model without it (if `t.threshold = NULL`, it is set to `Inf`, which means that all terms are tested.
`ran.effects`	Can be either a vector or a list. In the former case, the random effects to be evaluated are provided. For example `c("(1 + Frequency \| Subject)", "(0 + Length \| Subject)", "(1 + NSynSet \| Subject)")`. In the latter case, the list can be composed of (i) a vector of random intercepts to be evaluated (`ran.intercepts`), (ii) a vector of random slopes to be evaluated (`slopes`), (iii) a vector specifying, for each element of `slopes`, whether the correlation between the slope and by-variables specified in `by.vars` should be added (`corr`), and (iv) a vector of “by” variables for the random slopes (`by.vars`). Values that can be supplied to the `corr` argument are `1` (add correlation), `0` (do not add correlation), and `NA` (for when the "slope" is a factor variable). Note that if a term in `slopes` is a factor variable, the `corr` value tied to it will be automatically set to `NA`. Also note that if no values are supplied to `corr`, a vector of 0 as long as the `slopes` vector will be automatically supplied. For example `list(ran.intercepts = "Word", slopes = c("Frequency", "Length", "NSynSet","Class"), corr = c(0, 0, 1, NA), by.vars = "Subject")`. Another example is `list(slopes = c("Trial", "Class"), by.vars = "Subject")`, where the `corr` argument will be equal to `c(0, NA)`.
`alpha`	If the method is `F`, it is the p-value (from `pamer.fnc`) above which a model term is dropped. In this case, it defaults to the value passed to argument `threshold`, i.e., 0.05. Otherwise it is the p-value threshold above which a test (see `method`) is performed between a model with the term under consideration and a simpler model without it (in this case, defaults to `0`, i.e. all terms will be tested).
`alphaitem`	Alpha value for the evaluation of by-item random intercepts. Defaults to `0.05` or to the specified threshold.
`if.warn.not.add`	Logical. If a warning is issued after fitting a model with a new random effect (e.g., `false convergence` or the like), should the random effect nevertheless be evaluated? Defaults to `TRUE`, meaning that if such a warning is issued, the random effect will not be added to the random effects structure of the model. If set to `FALSE`, the random effect will be evaluated for inclusion as any other random effects would be via log likelihood ratio testing even if a warning is issued.
`prune.ranefs`	Logical. Whether to remove any random effect for which its variable is not also present in the fixed effects structure (with the exception of the grouping variables such as `"Subjects"` and `"Items"`). Defaults to `TRUE`. For example, if the random effects structure contains the terms `Condition + ROI + Group`, and the random effects structure contains the terms `(1 \| Subject) + (0 + TrialNum \| Subject)`, the ranedom effect `(0 + TrialNum \| Subject)` will be pruned from the model given that it is not in the model's fixed effects structure.
`p.value`	Whether to use upper-bound (“upper”; the default) or lower-bound (“lower”) p-values when back-fitting with method "F".
`set.REML.FALSE`	Logical. Whether or not to set `REML` to `FALSE`. Defaults to `FALSE`.
`reset.REML.TRUE`	Logical. Whether or not to re-set the back-fitted model to `REML = TRUE`.
`keep.single.factors`	Logical. Whether or not main effects are kept (not subjected to testing and reduction). Defaults to `FALSE`.
`log.file.name`	Should the back-fitting log be saved? Defaults to `NULL`, which means that a log file is saved in a temporary folder (platform dependent) as `file.path(tempdir(), paste("fitLMER_log_", gsub(":", "-", gsub(" ", "_", date())), ".txt", sep = ""))`. The path and file name of the log can be changed to whatever the use wishes. Set to `FALSE` to disable.

The process has three stages. In the first stage, either bfFixefLMER_F.fnc or bfFixefLMER_t.fnc is called (depending on the user's choice) and the fixed effects are back-fitted accordingly. In the second stage, ffRanefLMER.fnc is called and random effects are forward-fitted. In the third stage, the fixed effects are back-fitted again. This is done because the inclusion of certain random effects sometimes renders certain fixed effects non-significant. This process was used in Tremblay and Tucker (2011) and in Newman, Tremblay, Nichols, Neville, and Ullman (2012).

If, for example, you have many analyses to run and a cluster is available, write a bash script that will create (1) .R files that will relevel the conditions and update the model, and (2) an associated .sh job submission script to submit the .R files. For example, let's consider two ERP analyses all in a time window ranging from 100 to 250 ms. Two three-way interactions were considered: Position (factor; 1 to 6) X Length of the second word of a four-word sequence (e.g., in the middle of) X Working Memory Capacity score (continuous, from 0 to 100) and Trial (continuous; 1 to 432) X Length X Working Memory Capacity. Analyses were performed at electrodes Fp1 Fp2 AF3 AF4 F7 F3 Fz F4 F8 FC5 FC1 FC2 FC6 T7 C3 Cz C4 T8 CP5 CP1 CP2 CP6. See Tremblay and Newman (In preparation) for more details. The analysis script named Fp1-CP6_100250.sh we used on the ACEnet cluster is as follows:

      electrodes=(Fp1 Fp2 AF3 AF4 F7 F3 Fz F4 F8 FC5 FC1 FC2 FC6 T7 C3 Cz C4 T8 CP5 CP1 CP2 CP6)
      for e in ${electrodes[*]}; do 
            export E=$e;
            # create .R script to load data, perform necessary manipulations
            # and perform the analysis using fitLMER.fnc
            echo 'e<-Sys.getenv("E")' > $e".R"
            echo 'load("../data/eeg600_trim_v2.rda")' >> $e".R"
            echo 'dat0<-dat' >> $e".R"
            echo 'rm(dat);gc(T,T)' >> $e".R"
            echo 'dat <- dat0[dat0$Time >= 100 & dat0$Time <= 250, , drop = TRUE]' >> $e".R"
            echo 'dat <- dat[dat$Electrode == e, , drop = TRUE]' >> $e".R"
            echo 'subj<-sort(unique(dat$Subject))' >> $e".R"
            echo 'for(i in subj){' >> $e".R"
            echo 'tmp<-dat[dat$Subject==i,,drop=TRUE]' >> $e".R"
            echo 'tmp$newfact<-paste(tmp$Block,tmp$Position,sep="_")' >> $e".R"
            echo 'newvec<-vector("numeric")' >> $e".R"
            echo 'for(j in 1:length(unique(tmp$newfact))){' >> $e".R"
            echo 'newvec<-c(newvec,rep(j,nrow(tmp[tmp$newfact==unique(tmp$newfact)[j],])))' >> $e".R"
            echo '}' >> $e".R"
            echo 'tmp$Trial<-newvec' >> $e".R"
            echo 'if(grep(i,subj)[1]==1){' >> $e".R"
            echo 'newdat<-tmp' >> $e".R"
            echo '}else{' >> $e".R"
            echo 'newdat<-rbind(newdat,tmp)' >> $e".R"
            echo '}' >> $e".R"
            echo '}' >> $e".R"
            echo 'dat<-newdat' >> $e".R"
            echo 'dat$Position<-as.factor(dat$Position)' >> $e".R"
            echo 'm7 <- lmer(Amplitude ~ (Position + Trial)*(LengthBc * WMCc) + ' >> $e".R"
            echo '(1 | Subject), data = dat)' >> $e".R"
            echo 'm7b<-fitLMER.fnc(m7,item="Item",ran.effects=c("(0+Trial|Subject)",' >> $e".R"
            echo '"(0+LengthBc|Subject)","(0+Trial|Item)","(0+WMCc|Item)",' >> $e".R"
            echo '"(Position|Subject)"))' >> $e".R"
            echo 'smry<-pamer.fnc(m7b)' >> $e".R"
            echo 'save(m7b,file=file.path("..","models",paste("m7b_",e,"_100250.rda",sep="")))' >> $e".R"
            echo 'save(smry,file=file.path("..","summaries",paste("smry_m7b_",e,"_100250.rda",sep="")))' >> $e".R"
  
            ### create the job submission script for the .R file created above
            echo '#$ -S /bin/bash' > "job."$e".sh"
            echo '#$ -cwd' >> "job."$e".sh"
            echo '#$ -j y' >> "job."$e".sh"
            echo '#$ -l h_rt=48:00:00' >> "job."$e".sh"
            echo '#$ -l h_vmem=8G' >> "job."$e".sh"
            echo '#$ -R y' >> "job."$e".sh"
            echo '#$ -N '$e >> "job."$e".sh"
            echo 'R -q -f '$e'.R' >> "job."$e".sh"
            
            ### submit the job
            qsub  "job."$e".sh"
      done;

and then type in the console

1	. Fp1-CP6_100250.sh

On the ACEnet cluster, this results in 22 independent analyses, simultaneously using a total of 22 cores and 176 GB of RAM. This analysis completes in about 30 minutes to 1 hour.

A mer object with back-fitted fixed effects and forward-fitted random effects, as well as a log of the process, which is printed on screen and, optionally, printed in a log file.

Upper-bound p-values can be anti-conservative, while lower-bound p-values can be conservative. See function pamer.fnc.

The removal of a random effect from the random effects structure if the variables that compose it are not also in the fixed effects structure has been turned off in this version.

Antoine Tremblay, Statistics Canada, trea26@gmail.com

Baayen, R.H., Davidson, D.J. and Bates, D.M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390–412.

Newman, A.J., Tremblay, A., Nichols, E.S., Neville, H.J., and Ullman, M.T. (2012). The Influence of Language Proficiency on Lexical Semantic Processing in Native and Late Learners of English. Journal of Cognitive Neuroscience, 25, 1205–1223.

Pinheiro, J.C. and Bates, D.M. (2000). Mixed Effects Models in S and S-Plus. New York: Springer.

Tremblay, A. and Tucker B. V. (2011). The Effects of N-gram Probabilistic Measures on the Processing and Production of Four-word Sequences. The Mental Lexicon, 6(2), 302–324.

bfFixefLMER_F.fnc; bfFixefLMER_t.fnc; ffRanefLMER.fnc; mcposthoc.fnc; pamer.fnc; mcp.fnc; relLik; romr.fnc; perSubjectTrim.fnc.

1	# see example LMERConvenienceFunctions help page.

Warning messages:
1: In rgl.init(initValue, onlyNULL) : RGL: unable to open X11 display
2: 'rgl_init' failed, running with rgl.useNULL = TRUE 
3: .onUnload failed in unloadNamespace() for 'rgl', details:
  call: fun(...)
  error: object 'rgl_quit' not found