```r
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  warning = FALSE
)
```
In this vignette, we report the results of a parameter recovery study for the tva_recovery data set. We need the packages tibble, tidyr, and dplyr for data wrangling, ggplot2 for plotting, RStanTVA for the actual analyses, and brms for specifying priors.
```r
library(tibble)
library(dplyr)
library(tidyr)
library(ggplot2)
library(RStanTVA)
library(brms)
```
We want to make use of the parallelization built into Stan by activating the corresponding option in RStan:
```r
rstan_options(threads_per_chain = parallel::detectCores())
```
Each trial is simulated assuming partial report in TVA with a global processing capacity $C$, top-down selectivity $\alpha$, hemispheric bias ($r$), probit-distributed memory capacity $K$ (parameters $\mu_K$ and $\sigma_K$), and Gaussian sensory thresholds $t_0$ (parameters $\mu_0$ and $\sigma_0$).
For easier interpretation, we will later only look at the expected value of $K$ instead of the distributional parameters $\mu_K$ and $\sigma_K$. The expected value (mean) can be derived from $(\mu_K,\sigma_K)$ samples like this:
```r
# Expected value of a probit-distributed K, truncated at nS display locations:
# P(K = k) is the normal probability mass between k and k + 1, with the lower
# and upper tails assigned to K = 0 and K = nS, respectively.
meanprobitK <- Vectorize(function(mK, sigmaK, nS) {
  p <- pnorm(1:nS, mK, sigmaK)
  lps <- c(0, p)
  ups <- c(p, 1)
  sum(0:nS * (ups - lps))
}, c("mK", "sigmaK"))
```
The same is often done for $t_0$, where the expected value of a Gaussian $t_0$ is simply $\mu_0$.
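For example, with hypothetical distributional parameters $\mu_K=3.2$ and $\sigma_K=0.8$ and 6 display locations (values chosen purely for illustration):

```r
# Expected K for illustrative parameter values (not taken from the data set)
meanprobitK(3.2, 0.8, 6)
```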
Note: For a demonstration of simulated TVA trials, see the ShinyTVA app, which can be run as follows:
```r
shiny::runGitHub("mmrabe/ShinyTVA")
```
Processing speed $C$ differs between the "low" and "high" conditions so that $C_\textrm{low}=$ `r exp(tva_recovery_true_params$b["C_Intercept"])` and $C_\textrm{high}=$ `r exp(tva_recovery_true_params$b["C_Intercept"]+tva_recovery_true_params$b["C_conditionhigh"])`. There are no true differences between the conditions for any of the other model parameters.
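For reference, these true condition-level values can be recomputed directly from the fixed effects stored in tva_recovery_true_params:

```r
# True C per condition, back-transformed from the log scale
exp(tva_recovery_true_params$b["C_Intercept"])          # C_low
exp(tva_recovery_true_params$b["C_Intercept"] +
      tva_recovery_true_params$b["C_conditionhigh"])    # C_high
```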
Each of the `r n_distinct(tva_recovery$subject)` subjects has their individual parameter configuration and condition-level effects, and all of those parameters were randomly sampled from a multivariate normal distribution so that:
$$
\begin{pmatrix}
a_{\log C}\\
b_{\log C}\\
a_{\log\alpha}\\
b_{\log\alpha}\\
a_{\mu_K}\\
b_{\mu_K}\\
a_{\log\sigma_K}\\
b_{\log\sigma_K}\\
a_{\log\sigma_0}\\
b_{\log\sigma_0}\\
a_{\mu_0}\\
b_{\mu_0}\\
a_{\textrm{logit}(r)}\\
b_{\textrm{logit}(r)}
\end{pmatrix}\sim\mathcal{N}_{`r length(tva_recovery_true_params$b)`}\left(\begin{pmatrix}
`r paste0(tva_recovery_true_params$b, collapse = "\\\\")`
\end{pmatrix},\begin{pmatrix}
`r (tva_recovery_true_params$r_subject * tcrossprod(tva_recovery_true_params$s_subject)) %>% apply(1, paste, collapse = "&") %>% paste(collapse = "\\\\")`
\end{pmatrix}\right)
$$
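Schematically, a subject's parameter vector could be drawn as follows. This is a sketch only: MASS::mvrnorm stands in for whatever sampler the generation script in data-raw/tva_recovery.R actually uses, while the mean vector and covariance matrix are exactly the stored true values shown above:

```r
# Covariance matrix as displayed above: the correlation matrix scaled by the
# outer product of the per-parameter standard deviations
Sigma <- tva_recovery_true_params$r_subject *
  tcrossprod(tva_recovery_true_params$s_subject)

# Draw one subject's vector of intercepts (a) and condition effects (b)
theta_subject <- MASS::mvrnorm(1, mu = tva_recovery_true_params$b, Sigma = Sigma)
```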
The data frame tva_recovery contains simulated CombiTVA trials for `r n_distinct(tva_recovery$subject)` subjects (`r nrow(tva_recovery)/n_distinct(tva_recovery$subject)` trials per subject, i.e. `r nrow(tva_recovery)` trials in total). Half of the trials of each subject were performed in a "low" condition and the other half in a "high" condition:
data("tva_recovery") head(tva_recovery)
The package also contains the true values used for the simulation (including the subject-level parameters):
data("tva_recovery_true_params") str(tva_recovery_true_params)
These can be conveniently converted to a data frame for later comparison with the fitted values:
```r
true_params <- tva_recovery_true_params$coef_subject
true_params_narrow <- true_params %>%
  mutate(
    `italic(w)[lambda]` = w[, 2],
    `italic(K)` = meanprobitK(mK, sK, 6)
  ) %>%
  select(-w, -mK, -sK) %>%
  rename(`italic(t)[0]` = mu0, `sigma[0]` = sigma0, `italic(C)` = C) %>%
  pivot_longer(-c(subject, condition), names_to = "param", values_to = "true_value")
head(true_params_narrow)

true_params_narrow2 <- bind_rows(
  true_params_narrow %>%
    filter(condition == "low" & param != "italic(C)") %>%
    select(-condition),
  true_params_narrow %>%
    filter(param == "italic(C)") %>%
    transmute(subject, param = sprintf("%s[%s]", param, condition), true_value)
)
head(true_params_narrow2)
```
Note: The R script that generates tva_recovery is located in the package source at data-raw/tva_recovery.R.
For a single trial, given the effective exposure duration $\tau=t-t_0$ of the set of displayed items $S$, and predicted individual processing rates $\mathbf v$ for those items, the probability of a memory report $M$ is given as:
$$
P_{\mathrm{M}}\left(M\mid\tau,S,K,\mathbf{v}\right)
=\begin{cases}
1 & \textrm{if }M=\emptyset\textrm{ and }K=0\\
1 & \textrm{if }M=\emptyset\textrm{ and }\tau=0\\
\bar{F}\left(\tau;\sum_{z\in S}v_{z}\right) & \textrm{if }M=\emptyset\textrm{ and }K>0\textrm{ and }\tau>0\\
\prod_{y\in M}F\left(\tau;v_{y}\right)\prod_{z\in S\setminus M}\bar{F}\left(\tau;v_{z}\right) & \textrm{if }0<\left|M\right|<K\textrm{ and }\tau>0\\
\sum_{y\in M}\int_{0}^{\tau}f\left(t;v_{y}\right)\prod_{x\in M\setminus\{y\}}F\left(t;v_{x}\right)\prod_{z\in S\setminus M}\bar{F}\left(t;v_{z}\right)\,\mathrm{d}t & \textrm{if }\left|M\right|=K\textrm{ and }\tau>0\\
0 & \textrm{otherwise}
\end{cases}
$$

Here, $F\left(\tau;v\right)=1-e^{-v\tau}$ is the CDF of the exponentially distributed encoding duration of an item processed at rate $v$, $\bar{F}=1-F$ is its survival function, and $f$ is its density. The fourth case requires that all reported items were encoded before $\tau$ and all others were not; in the fifth case ($\left|M\right|=K$), VSTM is filled, so additionally no unreported item may have finished before the last free slot was taken at some time $t\leq\tau$.
For partial reports (i.e., when the display set contains distractors $S_{{\rm D}}$ in addition to targets $S_{{\rm T}}$), we need to iterate over the set of potential subsets of $S_{{\rm D}}$, the bounded powerset $\mathcal{P}_{\leq K-\left|R_{{\rm T}}\right|}\left(S_{{\rm D}}\right)$, that could have made it into VSTM before $\tau$ in addition to the actually reported target items $R_{{\rm T}}$:
$$
P_{{\rm PR}}\left(R_{{\rm T}}\mid\tau,S_{{\rm T}},S_{{\rm D}},K,\mathbf{v}\right)
=\begin{cases}
\sum_{R_{{\rm D}}\in\mathcal{P}_{\leq K-\left|R_{{\rm T}}\right|}\left(S_{{\rm D}}\right)}P_{{\rm M}}\left(R_{{\rm T}}\cup R_{{\rm D}}\mid\tau,S_{{\rm T}}\cup S_{{\rm D}},K,\mathbf{v}\right) & \textrm{if }S_{{\rm D}}\neq\emptyset\textrm{ and }\tau\geq0\\
P_{{\rm M}}\left(R_{{\rm T}}\mid\tau,S_{{\rm T}},K,\mathbf{v}\right) & \textrm{if }S_{{\rm D}}=\emptyset\textrm{ and }\tau\geq0\\
0 & \textrm{otherwise}
\end{cases}
$$
Note that $P_{{\rm PR}}$ simplifies to $P_{\mathrm{M}}$ if there were no distractors ($S_{{\rm D}}=\emptyset$), i.e. when the trial was effectively a whole-report trial.
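To make the product case of $P_{\mathrm{M}}$ concrete, here is a minimal numeric sketch with hypothetical rates and exposure duration, assuming the standard TVA exponential encoding durations $F(\tau;v)=1-e^{-v\tau}$:

```r
# Probability of the 0 < |M| < K case of P_M: all items in M were encoded
# before tau, all items in S \ M were not
p_m_partial <- function(tau, v, in_M) {
  Fv <- 1 - exp(-v * tau)          # P(item encoded before tau)
  prod(Fv[in_M]) * prod(1 - Fv[!in_M])
}

# e.g. 4 items processed at 20 items/s each, items 1 and 2 reported, tau = 80 ms
p_m_partial(0.08, rep(20, 4), c(TRUE, TRUE, FALSE, FALSE))
```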
Being based on RStan, RStanTVA can be used in different ways: with maximum-likelihood estimation or with Bayesian inference, and at different levels of hierarchical complexity (single-condition, single-participant/fixed-effects, fully hierarchical/mixed-effects). Hence we can also recover the model parameters in different ways.
Using MLE, we can define four different inference models:
The most straightforward model uses no regularization by priors (priors = FALSE) and fits a single value for each model parameter:
```r
m <- stantva_model(
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  save_log_lik = FALSE,
  priors = FALSE
)
```
By defining a regression model for $\log C$, which includes an intercept and a treatment effect of "high" vs. "low" (log(C) ~ 1 + condition), we can model both experimental conditions in a single model:
```r
m_nested <- stantva_model(
  formula = list(log(C) ~ 1 + condition),
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  save_log_lik = FALSE,
  priors = FALSE
)
```
Omitting the priors = FALSE argument falls back to the default priors = NULL, which uses weakly informative default priors for all parameters. The model is otherwise identical to the baseline RStanTVA-ML model above:
```r
m_reg <- stantva_model(
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  save_log_lik = FALSE
)
```
When using regularizing priors, if a parameter such as $C$ is described by a regression formula, we need to specify priors for both the intercept and the slopes:
```r
priors <-
  prior(normal(0, .5), dpar = C) +
  prior(normal(4.5, .6), dpar = C, coef = Intercept)

m_nested_reg <- stantva_model(
  formula = list(log(C) ~ 1 + condition),
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  save_log_lik = FALSE,
  priors = priors
)
```
For brevity, we only report results of the RStanTVA-MLRN fits here, since those have been found to be the most reliable (see the manuscript). The model can be fitted to each participant's data separately as follows:
```r
fit_mle_nested_reg <- readRDS("tva_recovery_cache/fit_mle_nested_reg.rds")
```
```r
fit_mle_nested_reg <- lapply(1:50, function(i) {
  d <- tva_recovery %>% filter(subject == i)
  # Retry the optimization until it converges (return_code 0)
  repeat {
    p <- optimizing(m_nested_reg, d)
    if (p$return_code == 0L) break
  }
  tibble(
    subject = i,
    param = c("italic(C)[low]", "italic(C)[high]", "alpha", "italic(w)[lambda]",
              "italic(t)[0]", "sigma[0]", "italic(K)"),
    fitted_value = c(
      exp(p$par["C_Intercept"]),
      exp(p$par["C_Intercept"] + p$par["C_conditionhigh"]),
      p$par[c("alpha", "r[2]", "mu0", "sigma0")],
      meanprobitK(p$par["mK"], p$par["sK"], 6)
    ),
    converged = p$return_code == 0L
  )
}) %>%
  bind_rows() %>%
  left_join(true_params_narrow2)
```
The critical function above is optimizing(m_nested_reg, d), which uses MLE to fit m_nested_reg to the subject-level subset d.
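To illustrate, a single-subject fit outside the loop looks like this (a minimal sketch reusing the objects defined above; the extracted parameter names match those used in the loop):

```r
# MLE fit for subject 1 only
d1 <- tva_recovery %>% filter(subject == 1)
p1 <- optimizing(m_nested_reg, d1)
p1$return_code               # 0 indicates successful convergence
exp(p1$par["C_Intercept"])   # fitted C in the "low" (reference) condition
```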
We can, in principle, fit the exact same models as above using Bayesian inference instead of MLE, simply by replacing the optimizing(...) call with sampling(...). The following code therefore fits the RStanTVA-MLRN model (m_nested_reg) to subject-level subsets d using HMC sampling in Stan:
```r
fit_bayesian_nested <- readRDS("tva_recovery_cache/fit_bayesian_nested.rds")
```
```r
fit_bayesian_nested <- lapply(1:50, function(i) {
  d <- tva_recovery %>% filter(subject == i)
  sf <- sampling(m_nested_reg, d)
  # Predict TVA parameters from the fitted object for both conditions
  p1 <- predict(sf, data.frame(subject = i, condition = "low"),
                c("C", "alpha", "r", "mu0", "sigma0", "mK", "sK"))
  p2 <- predict(sf, data.frame(subject = i, condition = "high"), "C")
  tibble(
    `italic(C)[low]` = t(p1$C),
    `italic(C)[high]` = t(p2$C),
    `alpha` = t(p1$alpha),
    `italic(w)[lambda]` = t(p1$r[, 2, 1]),
    `italic(t)[0]` = t(p1$mu0),
    `sigma[0]` = t(p1$sigma0),
    `italic(K)` = t(meanprobitK(p1$mK, p1$sK, 6))
  ) %>%
    pivot_longer(everything(), names_to = "param", values_to = "samples") %>%
    rowwise(param) %>%
    reframe(
      subject = i,
      posterior_sd = sd(samples),
      fitted_value = median(samples),
      cri = t(quantile(samples, c(.025, .975))),
      converged = Rhat(matrix(samples, ncol = sf@sim$chains)) < 1.1
    )
}) %>%
  bind_rows() %>%
  left_join(true_params_narrow2)
```
Note that we must use the method predict(fit, data, parameters) to predict the TVA parameter values from the fitted model. This returns a posterior distribution for each row (subject) in data and each specified parameter. We then calculate medians, 95% credible intervals (CrIs), and posterior SDs as summaries of the fit, and use $\hat R$ as a convergence criterion.
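For a single parameter, these summary statistics reduce to a few base-R calls (a sketch with simulated draws standing in for real posterior samples; Rhat() is the rstan diagnostic also used above):

```r
draws <- rnorm(4000)              # placeholder for real posterior draws
median(draws)                     # point estimate (fitted_value)
quantile(draws, c(.025, .975))    # 95% CrI
sd(draws)                         # posterior SD
Rhat(matrix(draws, ncol = 4))     # convergence check across 4 chains
```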
For hierarchical inference, we define mixed-effects regressions for the model parameters $C$, $\log\alpha$, $\mu_K$, $\mu_0$, and $\textrm{logit}(r)$, while keeping $\sigma_0$ and $\sigma_K$ fixed across all subjects:
```r
m_hierarchical <- stantva_model(
  formula = list(
    C ~ 1 + condition + (1 + condition | subject),
    log(alpha) ~ 1 + (1 | subject),
    mK ~ 1 + (1 | subject),
    mu0 ~ 1 + (1 | subject),
    log(r) ~ 1 + (1 | subject)
  ),
  locations = 6,
  task = "pr",
  regions = list(left = 1:3, right = 4:6),
  C_mode = "equal",
  w_mode = "regions",
  t0_mode = "gaussian",
  K_mode = "probit",
  priors =
    prior(normal(90, 20/sqrt2()), coef = Intercept, dpar = C) +
    prior(normal(0, 10/sqrt2()), dpar = C) +
    prior(gamma(2, 2/20 * sqrt2()), class = sd, coef = Intercept, dpar = C) +
    prior(gamma(2, 2/10 * sqrt2()), class = sd, dpar = C) +
    prior(normal(-0.4, 0.6/sqrt2()), coef = Intercept, dpar = alpha) +
    prior(normal(0, 0.3/sqrt2()), dpar = alpha) +
    prior(gamma(2, 2/0.6 * sqrt2()), class = sd, coef = Intercept, dpar = alpha) +
    prior(gamma(2, 2/0.3 * sqrt2()), class = sd, dpar = alpha) +
    prior(normal(0, 0.1/sqrt2()), coef = Intercept, dpar = r) +
    prior(normal(0, 0.05/sqrt2()), dpar = r) +
    prior(gamma(2, 2/0.1 * sqrt2()), class = sd, coef = Intercept, dpar = r) +
    prior(gamma(2, 2/0.05 * sqrt2()), class = sd, dpar = r) +
    prior(normal(3.5, 0.5/sqrt2()), coef = Intercept, dpar = mK) +
    prior(normal(0, 0.25/sqrt2()), dpar = mK) +
    prior(gamma(2, 2/0.5 * sqrt2()), class = sd, coef = Intercept, dpar = mK) +
    prior(gamma(2, 2/0.25 * sqrt2()), class = sd, dpar = mK) +
    prior(normal(20, 15/sqrt2()), coef = Intercept, dpar = mu0) +
    prior(normal(0, 7.5/sqrt2()), dpar = mu0) +
    prior(gamma(2, 2/15 * sqrt2()), class = sd, coef = Intercept, dpar = mu0) +
    prior(gamma(2, 2/7.5 * sqrt2()), class = sd, dpar = mu0) +
    prior(normal(0.5, 0.05/sqrt2()), coef = Intercept, dpar = sK) +
    prior(normal(0, 0.025/sqrt2()), dpar = sK) +
    prior(gamma(2, 2/0.05 * sqrt2()), class = sd, coef = Intercept, dpar = sK) +
    prior(gamma(2, 2/0.025 * sqrt2()), class = sd, dpar = sK) +
    prior(normal(20, 15/sqrt2()), coef = Intercept, dpar = sigma0) +
    prior(normal(0, 7.5/sqrt2()), dpar = sigma0) +
    prior(gamma(2, 2/15 * sqrt2()), class = sd, coef = Intercept, dpar = sigma0) +
    prior(gamma(2, 2/7.5 * sqrt2()), class = sd, dpar = sigma0)
)
```
We can then simply fit the hierarchical model m_hierarchical to the entire data set tva_recovery without subsetting:
```r
fit_bayesian_hierarchical_globals <- readRDS("tva_recovery_cache/fit_bayesian_hierarchical_globals.rds")
```
```r
fg <- sampling(m_hierarchical, tva_recovery)

fit_bayesian_hierarchical_globals <- {
  p1 <- predict(fg, tibble(subject = 1:50, condition = "low"),
                c("C", "alpha", "r", "mu0", "sigma0", "mK", "sK"))
  p2 <- predict(fg, tibble(subject = 1:50, condition = "high"), "C")
  tibble(
    subject = 1:50,
    `italic(C)[low]` = t(p1$C),
    `italic(C)[high]` = t(p2$C),
    `alpha` = t(p1$alpha),
    `italic(w)[lambda]` = t(p1$r[, 2, ]),
    `italic(t)[0]` = t(p1$mu0),
    `sigma[0]` = t(p1$sigma0),
    `italic(K)` = t(matrix(meanprobitK(p1$mK, p1$sK, 6), ncol = 50))
  )
} %>%
  pivot_longer(-subject, names_to = "param", values_to = "samples") %>%
  rowwise(c(subject, param)) %>%
  reframe(
    posterior_sd = sd(samples),
    fitted_value = median(samples),
    cri = t(quantile(samples, c(.025, .975))),
    converged = Rhat(t(samples)) < 1.1
  ) %>%
  left_join(true_params_narrow2)
```
Finally, we combine the three sets of fits and plot the recovered values (with 95% CrIs, where available) against the true values:

```r
figdat <- bind_rows(
  fit_mle_nested_reg %>% add_column(method = "MLRN"),
  fit_bayesian_hierarchical_globals %>% add_column(method = "HB"),
  fit_bayesian_nested %>% add_column(method = "NB")
) %>%
  mutate(method = factor(method, levels = c("MLRN", "NB", "HB")))

fig <- ggplot(figdat) +
  theme_minimal() +
  theme(text = element_text(family = "sans", size = 10)) +
  labs(x = "True value", y = "Recovered value") +
  facet_wrap(method ~ param, ncol = 7, scales = "free",
             labeller = function(x) label_parsed(list(sprintf("%s~(\"%s\")", x$param, x$method)))) +
  geom_abline(linetype = "dashed") +
  geom_linerange(aes(x = true_value, ymin = cri[, 1], ymax = cri[, 2]),
                 color = "gray50", linewidth = 0.2) +
  geom_point(aes(x = true_value, y = fitted_value), size = 0.5, color = "blue") +
  geom_point(aes(x = vx, y = vy), color = "transparent",
             figdat %>%
               group_by(param) %>%
               reframe(vx = range(true_value),
                       vy = range(c(fitted_value, cri), na.rm = TRUE)))
print(fig)
```