time_basisFun_mem: Fit a mixed-effects model with a set of by-time basis...

View source: R/time_basisFun_mem.R

time_basisFun_memR Documentation

Fit a mixed-effects model with a set of by-time basis functions

Description

Given a typical [generalized] mixed-effects model, augment this model in order to account for time-related variations. Additions to the original model, which are included to account for variations over time, use a random-effects structure including basis functions constructed from a vector of times (e..g, trial numbers). This allows for simultaneously controlling for time-related variations, and for recovery of those variations from the fitted model.

Usage

time_basisFun_mem(
  formula_mem,
  data_mem,
  ...,
  groupingVarName,
  timeVarName,
  basisDens = "wide",
  basis_calc_fun = "gaussian",
  backend = c("lmer", "glmer", "brm"),
  n_oos = 0
)

Arguments

formula_mem

Mixed-effects model formula, as in glmer or brm

data_mem

Data frame to fit model

...

Additional arguments to pass to the fitting function (e.g., brm)

groupingVarName

Character. Variable name, within data_mem, by which basis functions should be grouped (i.e., defined for each one).

timeVarName

Character. Variable name, within data_mem, over which basis functions should be defined (e.g., trial number).

basisDens

Numeric scalar. Distance between basis function peaks, on the same scale as timeVarName

basis_calc_fun

Name of the method to calculate bases. Currently only "gaussian" is supported

backend

Character. Name of fitting function (i.e., lmer, glmer, or brm)

n_oos

Numeric scalar. Number of times to re-fit the model with random 98/2 cross-validation. This provides an estimate of out-of-sample predictiveness.

Details

Various fitting functions (backends) can be used, such as lmer, glmer, and brm. These require that their respective packages and dependencies be installed.

In the case that out-of-sample likelihoods are desired, see attr(modelObject,'delta_logLik_outOfSample_description') for a description and attr(mod,'delta_logLik_outOfSample') for the vector of out-of-sample delta-log-likelihoods. Out-of-sample log-likelihoods are calculated as a difference from the pointwise in-sample likelihood from a static fit (i.e., a model fit with only formula_mem and no bases). This provides a uniform baseline pointwise in-sample likelihood against which various models (e.g., with different basis densities) can be compared.

Note that, if by-'groupingVar' intercepts are included in the random effects structure, then at least one random effect will be unidentified, because the combination of basis function offsets can act as an overall intercept. Also note that, if there is a sufficiently strong group-level temporal trend, it is likely that including a by-timepoint random intercept (e.g., '(1|timeVar') may be very useful.

See Also

vignette('mem_basis_vignette')

Examples


library(lme4)

## define data
nSubj <- 20
nTrials <- 200
betweenSubjSD <- .2
xEffect <- 1
changeEffect <- .5
noiseSD <- .25

## generate data
d <- do.call('rbind',replicate(nSubj,{
  dSubj <- data.frame(
    subID = paste0(sample(letters,8),collapse='')
    , trialNum = 1:nTrials
    , xVar = rbinom(nTrials,1,.5)
  )

  dSubj$xNum <- (dSubj$xVar * xEffect - xEffect/2) * rnorm(1,1,betweenSubjSD)

  dSubj$sinVar <- sin(rnorm(1,0,10) + # phase offset
                      (1:nTrials)/
                        rnorm(1,25,2) # frequency offset
  ) * changeEffect * rnorm(1,1,betweenSubjSD)

  dSubj$y <- dSubj$xNum +
    dSubj$sinVar +
    rnorm(1,0,betweenSubjSD) +
    rnorm(nTrials, 0 ,noiseSD)

  dSubj
},simplify=F))

## fit basis function model
m1 <- time_basisFun_mem(
  y ~ xVar + (0 + xVar|subID)
  ,d
  ,groupingVarName = 'subID'
  ,timeVarName = 'trialNum'
)

## extract the fitted timecourse (using `lme4` because this is an `lmer` model):
m1_fitted_timeCourse <- predict(m1,random.only=T
   ,re.form = as.formula(paste('~',gsub('(0 + xVar|subID) + ','',as.character(formula(m1))[3],fixed=T)) ) )
plot(d$trialNum,m1_fitted_timeCourse)


## let's compare two models' out-of-sample likelihoods and choose the best

## A fairly conservative, size (see m1 for conservative default)
m2 <- time_basisFun_mem(
   y ~ xVar + (0 + xVar|subID)
   ,d
   ,groupingVarName = 'subID'
   ,timeVarName = 'trialNum'
   ,basisDens = 50
   ,n_oos = 50
)

## A more-dense set of bases, every 20 trials
m3 <- time_basisFun_mem(
   y ~ xVar + (0 + xVar|subID)
   ,d
   ,groupingVarName = 'subID'
   ,timeVarName = 'trialNum'
   ,basisDens = 20
   ,n_oos = 50
)
## This takes noticeably longer to run

## approximate Cohen's D of the out-of-sample log-likelihoods for m3 over m2:
(mean(attr(m3,'delta_logLik_outOfSample')) - mean(attr(m2,'delta_logLik_outOfSample')) )/
   sd(c(attr(m3,'delta_logLik_outOfSample'),attr(m2,'delta_logLik_outOfSample')))
## Clearly m3's out-of-sample predictiveness is better than m2's, although only slightly
## What about its fitted timecourse?

m3_fitted_timeCourse <- predict(m3,random.only=T
   ,re.form = as.formula(paste('~',gsub('xVar + (0 + xVar|subID) + ','',as.character(formula(m3))[3],fixed=T)) ) )
plot(d$trialNum,m3_fitted_timeCourse)
## These look like the sin-wave oscillations that were originally generated!


akcochrane/TEfits documentation built on June 12, 2025, 11:10 a.m.