fitLMER.fnc: Back-fit fixed effects and forward-fit random effects of an...

Description Usage Arguments Details Value Warnings Note Author(s) References See Also Examples

View source: R/fitLMER.fnc.R

Description

The function follows these steps: (1) If llrt is set to TRUE, set REML to FALSE (unless specified otherwise); (2) back-fit initial model either on F- (by default) or on t/z-values; (3) forward-fit random effects; (4) re-back-fit fixed effects; (5) if llrt is set to TRUE, set REML to TRUE (unless specified otherwise). Note that, this function CAN be used with generalized linear mixed-effects models (glmers).

Usage

1
2
3
4
5
6
7
8
9
fitLMER.fnc(model, item = FALSE, backfit.on = c("F",
"t"), method = c("F", "t", "z", "llrt", "AIC", "BIC", "relLik.AIC", 
"relLik.BIC"), threshold = NULL, t.threshold = NULL, 
ran.effects = list(ran.intercepts = as.character(), 
slopes = as.character(), corr = as.character(), 
by.vars = as.character()), alpha = NULL, alphaitem = NULL, 
if.warn.not.add = TRUE, prune.ranefs = TRUE, p.value = "upper", 
set.REML.FALSE = TRUE, keep.single.factors = FALSE, 
reset.REML.TRUE = TRUE, log.file.name = NULL)

Arguments

model

A mer object (fitted by function lmer). This function can be used with generalized linear mixed-effects models (glmers) if argument backfit.on is set to "t", but not if it is set to "F".

item

Whether or not to evaluate the addition of by-item random intercepts to the model, evaluated by way of log-likelihood ratio test. Either FALSE (the default, does not evaluate this addition) or the column name (quoted) of the item identifier (e.g., "Item", "Word").

backfit.on

Either "F" (default) or "t". Refers to the statistic which will be used to determine which term to test and potentially remove from the model. If you are backfitting a generalized linear mixed-effects model (glmer), make sure to set backfit.on to "t"; the algorithm efectively backfits on "z".

method

Backfitting method. One of "F" (p-value), "t" (t statistic), "z" (z statistic), "llrt", "AIC", "BIC", "relLik.AIC", or "relLik.BIC" (the latter two are based on relative likelihood, see function relLik). Defaults to "t". You can find information regarding differences between AIC and BIC from http://methodology.psu.edu/eresources/ask/sp07.

threshold

Method-specific threshold for parameter selection. It refers to alpha in the case of "F" and "llrt", to the t/z-value in case of "t" or "z", to the minimum reduction in likelihood in the case of "AIC" and "BIC", or to the minimum difference in probability in the case of "relLik.AIC" and "relLik.BIC". Defaults NULL, which means 0.05 for "F" and "llrt", 2 for "t", 5 for "AIC" and "BIC", and 4 for "relLik.AIC" and "relLik.BIC".

t.threshold

Defaults to NULL. If the method = "t" or method = "z", it is the t/z-value below which a model term is dropped (if t.threshold = NULL, it will be set to 2). Otherwise it is the threshold for t/z-value below which a test (see method) is performed between a model with the term under consideration and a simpler model without it (if t.threshold = NULL, it is set to Inf, which means that all terms are tested.

ran.effects

Can be either a vector or a list. In the former case, the random effects to be evaluated are provided. For example c("(1 + Frequency | Subject)", "(0 + Length | Subject)", "(1 + NSynSet | Subject)"). In the latter case, the list can be composed of (i) a vector of random intercepts to be evaluated (ran.intercepts), (ii) a vector of random slopes to be evaluated (slopes), (iii) a vector specifying, for each element of slopes, whether the correlation between the slope and by-variables specified in by.vars should be added (corr), and (iv) a vector of “by” variables for the random slopes (by.vars). Values that can be supplied to the corr argument are 1 (add correlation), 0 (do not add correlation), and NA (for when the "slope" is a factor variable). Note that if a term in slopes is a factor variable, the corr value tied to it will be automatically set to NA. Also note that if no values are supplied to corr, a vector of 0 as long as the slopes vector will be automatically supplied. For example list(ran.intercepts = "Word", slopes = c("Frequency", "Length", "NSynSet","Class"), corr = c(0, 0, 1, NA), by.vars = "Subject"). Another example is list(slopes = c("Trial", "Class"), by.vars = "Subject"), where the corr argument will be equal to c(0, NA).

alpha

If the method is F, it is the p-value (from pamer.fnc) above which a model term is dropped. In this case, it defaults to the value passed to argument threshold, i.e., 0.05. Otherwise it is the p-value threshold above which a test (see method) is performed between a model with the term under consideration and a simpler model without it (in this case, defaults to 0, i.e. all terms will be tested).

alphaitem

Alpha value for the evaluation of by-item random intercepts. Defaults to 0.05 or to the specified threshold.

if.warn.not.add

Logical. If a warning is issued after fitting a model with a new random effect (e.g., false convergence or the like), should the random effect nevertheless be evaluated? Defaults to TRUE, meaning that if such a warning is issued, the random effect will not be added to the random effects structure of the model. If set to FALSE, the random effect will be evaluated for inclusion as any other random effects would be via log likelihood ratio testing even if a warning is issued.

prune.ranefs

Logical. Whether to remove any random effect for which its variable is not also present in the fixed effects structure (with the exception of the grouping variables such as "Subjects" and "Items"). Defaults to TRUE. For example, if the random effects structure contains the terms Condition + ROI + Group, and the random effects structure contains the terms (1 | Subject) + (0 + TrialNum | Subject), the ranedom effect (0 + TrialNum | Subject) will be pruned from the model given that it is not in the model's fixed effects structure.

p.value

Whether to use upper-bound (“upper”; the default) or lower-bound (“lower”) p-values when back-fitting with method "F".

set.REML.FALSE

Logical. Whether or not to set REML to FALSE. Defaults to FALSE.

reset.REML.TRUE

Logical. Whether or not to re-set the back-fitted model to REML = TRUE.

keep.single.factors

Logical. Whether or not main effects are kept (not subjected to testing and reduction). Defaults to FALSE.

log.file.name

Should the back-fitting log be saved? Defaults to NULL, which means that a log file is saved in a temporary folder (platform dependent) as file.path(tempdir(), paste("fitLMER_log_", gsub(":", "-", gsub(" ", "_", date())), ".txt", sep = "")). The path and file name of the log can be changed to whatever the use wishes. Set to FALSE to disable.

Details

The process has three stages. In the first stage, either bfFixefLMER_F.fnc or bfFixefLMER_t.fnc is called (depending on the user's choice) and the fixed effects are back-fitted accordingly. In the second stage, ffRanefLMER.fnc is called and random effects are forward-fitted. In the third stage, the fixed effects are back-fitted again. This is done because the inclusion of certain random effects sometimes renders certain fixed effects non-significant. This process was used in Tremblay and Tucker (2011) and in Newman, Tremblay, Nichols, Neville, and Ullman (2012).

If, for example, you have many analyses to run and a cluster is available, write a bash script that will create (1) .R files that will relevel the conditions and update the model, and (2) an associated .sh job submission script to submit the .R files. For example, let's consider two ERP analyses all in a time window ranging from 100 to 250 ms. Two three-way interactions were considered: Position (factor; 1 to 6) X Length of the second word of a four-word sequence (e.g., in the middle of) X Working Memory Capacity score (continuous, from 0 to 100) and Trial (continuous; 1 to 432) X Length X Working Memory Capacity. Analyses were performed at electrodes Fp1 Fp2 AF3 AF4 F7 F3 Fz F4 F8 FC5 FC1 FC2 FC6 T7 C3 Cz C4 T8 CP5 CP1 CP2 CP6. See Tremblay and Newman (In preparation) for more details. The analysis script named Fp1-CP6_100250.sh we used on the ACEnet cluster is as follows:

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
      electrodes=(Fp1 Fp2 AF3 AF4 F7 F3 Fz F4 F8 FC5 FC1 FC2 FC6 T7 C3 Cz C4 T8 CP5 CP1 CP2 CP6)
      for e in ${electrodes[*]}; do 
            export E=$e;
            # create .R script to load data, perform necessary manipulations
            # and perform the analysis using fitLMER.fnc
            echo 'e<-Sys.getenv("E")' > $e".R"
            echo 'load("../data/eeg600_trim_v2.rda")' >> $e".R"
            echo 'dat0<-dat' >> $e".R"
            echo 'rm(dat);gc(T,T)' >> $e".R"
            echo 'dat <- dat0[dat0$Time >= 100 & dat0$Time <= 250, , drop = TRUE]' >> $e".R"
            echo 'dat <- dat[dat$Electrode == e, , drop = TRUE]' >> $e".R"
            echo 'subj<-sort(unique(dat$Subject))' >> $e".R"
            echo 'for(i in subj){' >> $e".R"
            echo 'tmp<-dat[dat$Subject==i,,drop=TRUE]' >> $e".R"
            echo 'tmp$newfact<-paste(tmp$Block,tmp$Position,sep="_")' >> $e".R"
            echo 'newvec<-vector("numeric")' >> $e".R"
            echo 'for(j in 1:length(unique(tmp$newfact))){' >> $e".R"
            echo 'newvec<-c(newvec,rep(j,nrow(tmp[tmp$newfact==unique(tmp$newfact)[j],])))' >> $e".R"
            echo '}' >> $e".R"
            echo 'tmp$Trial<-newvec' >> $e".R"
            echo 'if(grep(i,subj)[1]==1){' >> $e".R"
            echo 'newdat<-tmp' >> $e".R"
            echo '}else{' >> $e".R"
            echo 'newdat<-rbind(newdat,tmp)' >> $e".R"
            echo '}' >> $e".R"
            echo '}' >> $e".R"
            echo 'dat<-newdat' >> $e".R"
            echo 'dat$Position<-as.factor(dat$Position)' >> $e".R"
            echo 'm7 <- lmer(Amplitude ~ (Position + Trial)*(LengthBc * WMCc) + ' >> $e".R"
            echo '(1 | Subject), data = dat)' >> $e".R"
            echo 'm7b<-fitLMER.fnc(m7,item="Item",ran.effects=c("(0+Trial|Subject)",' >> $e".R"
            echo '"(0+LengthBc|Subject)","(0+Trial|Item)","(0+WMCc|Item)",' >> $e".R"
            echo '"(Position|Subject)"))' >> $e".R"
            echo 'smry<-pamer.fnc(m7b)' >> $e".R"
            echo 'save(m7b,file=file.path("..","models",paste("m7b_",e,"_100250.rda",sep="")))' >> $e".R"
            echo 'save(smry,file=file.path("..","summaries",paste("smry_m7b_",e,"_100250.rda",sep="")))' >> $e".R"
  
            ### create the job submission script for the .R file created above
            echo '#$ -S /bin/bash' > "job."$e".sh"
            echo '#$ -cwd' >> "job."$e".sh"
            echo '#$ -j y' >> "job."$e".sh"
            echo '#$ -l h_rt=48:00:00' >> "job."$e".sh"
            echo '#$ -l h_vmem=8G' >> "job."$e".sh"
            echo '#$ -R y' >> "job."$e".sh"
            echo '#$ -N '$e >> "job."$e".sh"
            echo 'R -q -f '$e'.R' >> "job."$e".sh"
            
            ### submit the job
            qsub  "job."$e".sh"
      done;

and then type in the console

1
      . Fp1-CP6_100250.sh

On the ACEnet cluster, this results in 22 independent analyses, simultaneously using a total of 22 cores and 176 GB of RAM. This analysis completes in about 30 minutes to 1 hour.

Value

A mer object with back-fitted fixed effects and forward-fitted random effects, as well as a log of the process, which is printed on screen and, optionally, printed in a log file.

Warnings

Upper-bound p-values can be anti-conservative, while lower-bound p-values can be conservative. See function pamer.fnc.

Note

The removal of a random effect from the random effects structure if the variables that compose it are not also in the fixed effects structure has been turned off in this version.

Author(s)

Antoine Tremblay, Statistics Canada, trea26@gmail.com

References

Baayen, R.H., Davidson, D.J. and Bates, D.M. (2008). Mixed-effects modeling with crossed random effects for subjects and items. Journal of Memory and Language, 59, 390–412.

Newman, A.J., Tremblay, A., Nichols, E.S., Neville, H.J., and Ullman, M.T. (2012). The Influence of Language Proficiency on Lexical Semantic Processing in Native and Late Learners of English. Journal of Cognitive Neuroscience, 25, 1205–1223.

Pinheiro, J.C. and Bates, D.M. (2000). Mixed Effects Models in S and S-Plus. New York: Springer.

Tremblay, A. and Tucker B. V. (2011). The Effects of N-gram Probabilistic Measures on the Processing and Production of Four-word Sequences. The Mental Lexicon, 6(2), 302–324.

See Also

bfFixefLMER_F.fnc; bfFixefLMER_t.fnc; ffRanefLMER.fnc; mcposthoc.fnc; pamer.fnc; mcp.fnc; relLik; romr.fnc; perSubjectTrim.fnc.

Examples

1
# see example LMERConvenienceFunctions help page.

Example output

Warning messages:
1: In rgl.init(initValue, onlyNULL) : RGL: unable to open X11 display
2: 'rgl_init' failed, running with rgl.useNULL = TRUE 
3: .onUnload failed in unloadNamespace() for 'rgl', details:
  call: fun(...)
  error: object 'rgl_quit' not found 

LMERConvenienceFunctions documentation built on Oct. 23, 2020, 5:12 p.m.