In mne-tools/mne-r: Fast Access to MNE-Python From Within R

Here we will use a multilevel model to assess the difference between two conditions over time. For now we will stick with the lme4 optimization-based implementation and use the parametric bootstrap for uncertainty estimation. Bayesian inference will be covered in a future example.

library(tidyverse)
library(lme4)
library(merTools)
library(mne)

Let's read in the raw data.

data_path <- mne$datasets$sample$data_path()
subject <- "sample"

raw_fname <- paste(data_path,
                   'MEG', 
                   subject,
                   'sample_audvis_filt-0-40_raw.fif',
                   sep = '/')

raw <- mne$io$read_raw_fif(raw_fname, preload = T)

We can now go ahead and compute evokeds.

events <- mne$find_events(raw)
storage.mode(events) <- "integer"  # R gets the events as floats.

tmin <- -0.2
tmax <- 0.5
baseline <- reticulate::tuple(NULL, 0)
event_id <- list("aud/l" = 1L, "vis/l" = 3L)
picks <- mne$pick_channels(raw$ch_names, list('MEG 1332'))
epochs <- mne$Epochs(raw = raw, events = events, event_id = event_id,
                     tmin = tmin, tmax = tmax,
                     picks = picks %>% as.integer(),
                     baseline = baseline, reject = NULL, preload = T)

Let's get the epochs into a dataframe.

# use MNE-R function.
epochs_df <- mne::get_data_frame(epochs)

Now we can set up a simple varying effects model:

$$ y_i \sim N(\alpha + \alpha[j] + T\beta[j]_i, \sigma^2) $$ Where $y_i$ is the MEG observation at a given epoch and time point, $\alpha$ is the global intercept, $\alpha[j]$ the intercept of each tome point, $T$ is our treatment information, and $\beta[j]$ is the estimated treatment effect over all $j$ time points of the epoch. The multilevel model will learn a population-shrinkage that depewnds on the variance over the time.

One of the assumptions of this model is that all $J$ coefficients come from a common distribution:

$$ \beta_{j} \sim N(\mu_{\beta[j]}, \sigma_{j}^2) $$ This is what makes this model one model instead of $j$ separate models. In this model, all $j$ coefficients are shrunk to $\mu_j$ to an extent that is proportional to the variance of the group effects.

mod1 <- lmer(observation ~ 1 + condition + (1 + condition | time),
             data = epochs_df)
mod1 %>% summary() %>% print()

As often, looking at the coefficients and text summaries of such models is not tremendously helpful. We will instead generate predictions.

probe <- expand.grid(
  condition = c("aud/l", "vis/l") %>% as.factor(),
  time = epochs_df$time %>% unique()
)

pred_mod1 <- predict(mod1, probe)
probe$pred <- pred_mod1 

ggplot(data = epochs_df,
       mapping = aes(x = time, y = observation,
                     group = interaction(condition, epoch),
                     color = condition)) +
  geom_line(size = 0.3, alpha = 0.4) +
  geom_line(
    size = 1.5, data = probe,
    mapping = aes(x = time, y = pred, group = condition,
                  color = condition)) +
  theme_minimal() +
  theme(text = element_text(size = 24, family = "Helvetica")) +
  labs(x = "times [ms]",
       y = "predicted GRAD signal [fT/cm]") +
  ylim(-300, 300)

We can see that the model predictions are recapitulating the temporal structure of the data. But how much uncertainty do we have about those predicrions?

In the absence of a full Bayesian analysis we can do an informed parametric bootstrap that takes into account all sources of uncertainty of our multilevel model, which yields more conservative estimates of compatibility intervals. We will use the merTools package for this purpose.

We'll put a few custom options which should speak for themselves.

pred_interval_mod1 <- predictInterval(
  merMod = mod1, newdata = probe, which = "full", level = 0.95,
  n.sims = 1000, stat = "mean", type = "linear.prediction",
  returnSims = T, seed = 42
)

probe_int <- bind_cols(
  probe, pred_interval_mod1)

Let's first look at the compatibility intervals.

ggplot(
    data = probe_int,
    mapping = aes(x = time, y = pred, group = condition,
                  color = condition, ymin = lwr, ymax = upr)) +
  geom_ribbon(mapping = aes(fill = condition), alpha = 0.1) +
  geom_line(size = 1) +
  theme_minimal() +
  theme(text = element_text(size = 20, family = "Helvetica")) +
  labs(x = "times [ms]",
       y = "predicted GRAD signal [fT/cm]") +
  facet_wrap(~condition) +
  guides(color = F, fill = F)

Now let's plot the single simulations on top of it. We first have to extract the simulations from the merTools output.

# let's subsample a few bootstrap simulations.
idx <- sample(1:1000, size = 100)

pred_sims <- attr(pred_interval_mod1, "sim.results")[,idx] %>%
  as.data.frame() %>%
  gather(key = "sim", value = "pred_hat")

pred_sims$sim <- pred_sims$sim %>% as.factor()
pred_sims$time <- probe_int$time
pred_sims$condition <- probe_int$condition
pred_sims$pred <- probe_int$pred

Now we can plot the prediction from each bootstrap replica. In order to not clutter things we will make generous use of transparency.

ggplot(
    data = pred_sims,
    mapping = aes(x = time, y = pred,
                  group = condition,
                  color = condition)) +
  geom_line(
    alpha = 0.05, mapping = aes(
      y = pred_hat,
      group = interaction(sim, condition))) +
  stat_summary(
      fun.ymin = function(x){quantile(x, 0.025)},
      fun.ymax = function(x){quantile(x, 0.975)},
      mapping = aes(x = time,
                    y = pred_hat,
                    fill = condition,
                    group = condition,
                    color = condition),
      geom = "ribbon", alpha = 0.1) +
  geom_line(size = 1.5) +
  theme_minimal() +
  theme(text = element_text(size = 20, family = "Helvetica")) +
  labs(x = "times [ms]",
       y = "predicted GRAD signal [fT/cm]") +
  facet_wrap(~condition) +
  guides(color = F, fill = F)