```
knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>",
  echo = TRUE,
  warning = FALSE,
  message = FALSE,
  eval = nzchar(Sys.getenv("bmscstan_eval"))
)
```

The package **bmscstan** provides useful functions to fit
Bayesian Multilevel Single Case models (BMSC) using
*Stan* [@Carpenter2017] as backend.

This approach builds on Crawford's seminal tests [@Crawford1998; @Crawford2005; @Crawford2010], which use a small control sample to assess whether the performance of a single case deviates from that of the controls. Unfortunately, Crawford's tests are limited to a few specific experimental designs and do not allow researchers to analyse more complex ones.

The BMSC approach was developed mainly to address this limitation: its purpose is to allow models to be fitted to single case and control data with the same flexibility as a Multilevel Model.

The core function of the **bmscstan** package is `BMSC`,
whose theoretical assumptions and validation are
reported in [@scandola_romano_2021].

The syntax used by the `BMSC` function is very similar to the syntax
used in the `lme4` package. The specification of random effects
is more limited, but it covers most cases
(for further details, please see `?bmscstan::randomeffects`).

To illustrate the use of the **bmscstan** package,
the datasets included in the package will be used.

These datasets contain data from a Body Sidedness Effect paradigm [@Ottoboni2005; @Tessari2012], a Simon-like paradigm used to measure body representation.

In this experimental paradigm, participants have to respond to a circle shown in the centre of the computer screen, superimposed on an irrelevant image of a left or right hand, or of a left or right foot.

The circle can be of two colours (e.g. red or blue), and participants have to press a button with the left hand when the circle is of one colour, and with the right hand when it is of the other colour.

When the irrelevant background image (foot or hand) is incongruent with the hand used to answer, the reaction times and frequency of errors are higher.

This congruency effect is considered an effect of body representation.

The two irrelevant backgrounds are administered in different experimental blocks.

The package contains two datasets: one from 16 healthy control participants, and one from an individual affected by a right unilateral brachial plexus lesion (who could nevertheless press the keyboard buttons independently).

The datasets are called `data.pt` for the single case and `data.ctrl` for the
control group, and they can be loaded using `data(BSE)`.

These datasets contain the Reaction Times `RT`, a `Body.District` factor
with levels FOOT and HAND, a `Congruency` factor (levels: Congruent,
Incongruent), and a `Side` factor (levels: Left, Right). The `data.ctrl`
dataset also contains an `ID` factor identifying the 16 control
participants.

```
library(ggplot2)
library(bmscstan)

data(BSE)

str(data.pt)
str(data.ctrl)

ggplot(data.pt, aes(y = RT, x = Body.District:Side , fill = Congruency))+
  geom_boxplot()

ggplot(data.ctrl, aes(y = RT, x = Body.District:Side , fill = Congruency))+
  geom_boxplot()+
  facet_wrap( ~ ID , ncol = 4)
```

These data seem to have some outliers. Let us see whether they are normally distributed.

```
qqnorm(data.ctrl$RT, main = "Controls")
qqline(data.ctrl$RT)

qqnorm(data.pt$RT, main = "Single Case")
qqline(data.pt$RT)
```

They are not normally distributed. Outliers will be removed by using the
`boxplot.stats` function.

```
out <- boxplot.stats( data.ctrl$RT )$out
data.ctrl <- droplevels( data.ctrl[ !data.ctrl$RT %in% out , ] )

out <- boxplot.stats( data.pt$RT )$out
data.pt <- droplevels( data.pt[ !data.pt$RT %in% out , ] )

qqnorm(data.ctrl$RT, main = "Controls")
qqline(data.ctrl$RT)

qqnorm(data.pt$RT, main = "Single Case")
qqline(data.pt$RT)
```

They are not perfect, but definitely better.
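As a small illustration of the rule applied above, `boxplot.stats` flags the values lying more than 1.5 times the length of the box (its default `coef`) beyond the hinges. The vector `rt_toy` below is hypothetical and is not taken from the package data.

```r
# Hypothetical reaction times with one extreme value
rt_toy <- c(10, 12, 11, 13, 12, 11, 50)

# $out holds the values lying more than 1.5 box lengths beyond the hinges
out_toy <- boxplot.stats(rt_toy)$out

# Keep only the observations that were not flagged
rt_clean <- rt_toy[!rt_toy %in% out_toy]

print(out_toy)
print(rt_clean)
```

Note that this filter works on values, so every observation sharing a flagged value is dropped, which is the behaviour wanted here.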

First of all, we need to think about our hypotheses and set the contrast matrices accordingly.

In all cases, our factors have only two levels. Therefore, we set
Treatment Contrasts for each factor, with the *Left* level as baseline
for `Side`, the *Congruent* level for `Congruency`, and the *FOOT* level for
`Body.District`.

In this way, each coefficient will represent the difference between the two levels.

```
contrasts( data.ctrl$Side )          <- contr.treatment( n = 2 )
contrasts( data.ctrl$Congruency )    <- contr.treatment( n = 2 )
contrasts( data.ctrl$Body.District ) <- contr.treatment( n = 2 )

contrasts( data.pt$Side )            <- contr.treatment( n = 2 )
contrasts( data.pt$Congruency )      <- contr.treatment( n = 2 )
contrasts( data.pt$Body.District )   <- contr.treatment( n = 2 )
```
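To make the meaning of treatment contrasts concrete, the following sketch fits an ordinary linear model to a hypothetical toy factor (`side_toy` and `rt_toy` are invented for illustration). It shows that the intercept equals the mean of the baseline level and that the remaining coefficient equals the difference between the two levels. It also shows why coefficient names end in `2`, as in `Body.District2`.

```r
# Hypothetical toy data: two observations per level
side_toy <- factor(c("Left", "Left", "Right", "Right"))
rt_toy   <- c(1, 3, 5, 7)

# Treatment contrasts: the first level (Left) is the baseline
contrasts(side_toy) <- contr.treatment(n = 2)

fit_toy  <- lm(rt_toy ~ side_toy)
coef_toy <- coef(fit_toy)

# Level means, for comparison with the coefficients
level_means <- tapply(rt_toy, side_toy, mean)

print(coef_toy)     # intercept, then the Right - Left difference
print(level_means)
```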

The use of the `BMSC` function should be straightforward for those who are
used to the `lme4` or `brms` syntax.

In this case, we want to fit the following model:

`RT ~ Body.District * Congruency * Side + (Congruency * Side | ID : Body.District)`

Unfortunately, `BMSC` does not directly allow the syntax `ID : Body.District`
in the specification of the random effects.

Therefore, it is necessary to create a new variable for `ID : Body.District`:

```
data.ctrl$BD_ID <- interaction( data.ctrl$Body.District , data.ctrl$ID )
```

and the model would be:

`RT ~ Body.District * Congruency * Side + (Congruency * Side | BD_ID)`
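The effect of `interaction` can be seen on a small hypothetical design (the data frame `toy` below is invented for illustration): each combination of body district and participant becomes one level of the new grouping factor.

```r
# Hypothetical design: 2 body districts crossed with 3 participants
toy <- expand.grid(
  Body.District = factor(c("FOOT", "HAND")),
  ID            = factor(1:3)
)

# One level per Body.District-by-ID combination
toy$BD_ID <- interaction(toy$Body.District, toy$ID)

print(levels(toy$BD_ID))
print(nlevels(toy$BD_ID))
```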

For further details concerning the random effects available in
`bmscstan`, please type `?bmscstan::randomeffect`.

At this point, fitting the model is easy, and it can be done with the use of a single function.

```
mdl <- BMSC(formula = RT ~ Body.District * Congruency * Side +
              (Congruency * Side | BD_ID),
            data_ctrl = data.ctrl,
            data_sc = data.pt,
            chains = 2,
            cores = 1,
            seed = 2020)
```

After fitting the model, we should check its quality by means of
Posterior Predictive P-Values [@Gelman2013] with the `bmscstan::pp_check`
function.

Thanks to this graphical function, we will see if the observed data and the data sampled from the posterior distributions of our BMSC model are similar.

If we observe strong deviations, the BMSC model is not adequately
describing the data. In this case, you might want to change the prior
distributions (see the `help` page), change the random effects structure,
or transform your dependent variable (using the logarithm or other mathematical
functions).

```
pp_check( mdl )
```

In both the controls and the single case data, the data sampled from the posterior distributions adequately resemble the observed data.
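The logic of a posterior predictive check can be sketched in base R. The example below is only an illustration under strong assumptions: the data `y_obs` are simulated, and the "posterior" draws of the mean come from a crude normal approximation rather than from a BMSC model. It compares a test statistic (here the maximum) computed on replicated datasets with the same statistic computed on the observed data.

```r
set.seed(123)

# Hypothetical observed data
y_obs <- rnorm(50, mean = 500, sd = 60)
n     <- length(y_obs)

# Crude stand-in for posterior draws of the mean (plug-in approximation,
# NOT the BMSC posterior); the residual SD is fixed at its sample estimate
n_draws  <- 1000
mu_draws <- rnorm(n_draws, mean = mean(y_obs), sd = sd(y_obs) / sqrt(n))
sigma    <- sd(y_obs)

# One replicated dataset per draw, summarised by a test statistic (the maximum)
t_rep <- vapply(
  mu_draws,
  function(mu) max(rnorm(n, mean = mu, sd = sigma)),
  numeric(1)
)

# Posterior predictive p-value: share of replicated statistics >= the observed one
ppp <- mean(t_rep >= max(y_obs))
print(ppp)
```

Values of `ppp` very close to 0 or 1 would suggest that the model fails to reproduce that feature of the data.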

A further control on our model is given by checking the Effective Sample Size (ESS) for each coefficient and the $\hat{R}$ diagnostic index [@Gelman1992].

The ESS is the "effective number of simulation draws" for any coefficient, namely the approximate number of independent draws, taking into account that successive draws in a Markov Chain Monte Carlo (MCMC) simulation are not independent of each other. For further details, see an introductory book on Bayesian Statistics. As a rule of thumb, a good ESS should be $ESS > 100$ or $ESS > 10\%$ of the total draws (remembering to exclude the burn-in simulations when counting the total iterations).

The $\hat{R}$ is an index of the convergence of the MCMC chains. In `BMSC`
the default number of chains
is 4. Usually, chains are considered to have converged when $\hat{R} < 1.1$ (*Stan*
default).
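For intuition, the basic potential scale reduction factor of @Gelman1992 can be computed by hand from a matrix of draws. The helper `rhat_basic` and the simulated chains below are hypothetical, and this is the simple non-split version, so values reported by *Stan* (which uses a refined variant) may differ slightly.

```r
# Basic (non-split) potential scale reduction factor.
# draws: matrix with one column per chain and one row per iteration
rhat_basic <- function(draws) {
  n <- nrow(draws)
  W <- mean(apply(draws, 2, var))        # mean within-chain variance
  B <- n * var(colMeans(draws))          # between-chain variance
  var_plus <- (n - 1) / n * W + B / n    # pooled variance estimate
  sqrt(var_plus / W)
}

set.seed(42)

# Four hypothetical chains sampling the same distribution
mixed <- matrix(rnorm(4000), ncol = 4)

# Same chains, but the first one is stuck in a different region
stuck <- mixed
stuck[, 1] <- stuck[, 1] + 5

rhat_mixed <- rhat_basic(mixed)
rhat_stuck <- rhat_basic(stuck)

print(c(mixed = rhat_mixed, stuck = rhat_stuck))
```

When the chains agree, the between-chain variance is small and the ratio stays close to 1; a chain exploring a different region inflates it well above the 1.1 threshold.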

In order to check these values, the `summary.BMSC` function is needed
(see next section).

`summary.BMSC` output

The output of the `bmscstan::summary.BMSC` function is divided into four main
parts:

- In the first part, the model and the selected priors are recalled.
- In the second part, the coefficients of the fixed effects for the control group are shown.
- In the third part, the coefficients of the fixed effects for the single case are shown.
- In the fourth and last part, the fixed effects coefficients for the difference between the single case and the control group are shown.

```
print( sum_mdl <- summary( mdl ) , digits = 3 )
```

In the second and fourth part of the output, we can observe a descriptive
summary reporting the mean, the standard error, the standard deviation, and
the $2.5\%$, $25\%$, $50\%$, $75\%$ and $97.5\%$ quantiles of the posterior distributions
of each coefficient. If we want the $95\%$ Credible Interval, we can consider
only the $2.5\%$ and $97.5\%$ extremes. Then, two diagnostic indexes are
reported: the `n_eff` parameter, that is the *ESS*, and the `Rhat` ($\hat{R}$).
Finally, the Savage-Dickey Bayes Factor is reported (*BF10*).
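The Savage-Dickey density ratio compares the prior and the posterior density of a coefficient at zero: $BF_{10} = p(\beta = 0) / p(\beta = 0 \mid data)$. The sketch below assumes a zero-centred normal prior with a hypothetical standard deviation `prior_sd` and estimates the posterior density with a kernel density estimate. The helper `savage_dickey_bf10` is illustrative only; it is not a function of **bmscstan**, and the package's own computation may differ.

```r
# Savage-Dickey ratio for a point null at zero, assuming a N(0, prior_sd) prior
savage_dickey_bf10 <- function(post_draws, prior_sd) {
  d <- density(post_draws)
  # Posterior density at zero, interpolated from the kernel estimate
  post_at_0 <- approx(d$x, d$y, xout = 0)$y
  # Zero outside the estimated range: treat the posterior density there as 0
  if (is.na(post_at_0)) post_at_0 <- 0
  prior_at_0 <- dnorm(0, mean = 0, sd = prior_sd)
  prior_at_0 / post_at_0
}

set.seed(2020)

# Hypothetical posterior draws far from zero (evidence for an effect)
draws_effect <- rnorm(4000, mean = 2, sd = 0.5)

# Hypothetical posterior draws centred on zero (evidence for the null)
draws_null <- rnorm(4000, mean = 0, sd = 0.5)

bf_effect <- savage_dickey_bf10(draws_effect, prior_sd = 10)
bf_null   <- savage_dickey_bf10(draws_null,   prior_sd = 10)

print(c(effect = bf_effect, null = bf_null))
```

A posterior that has moved mass away from zero gives $BF_{10} > 1$, while a posterior more concentrated around zero than the prior gives $BF_{10} < 1$. The result depends on the width of the prior.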

In the third part the diagnostic indexes are not reported because these coefficients are computed as marginal probabilities from the probabilities summarized in the second and fourth part.

`summary.BMSC` output

The first step should be checking the diagnostic indexes.

In this model, all `n_eff` values are greater than $10\%$ of the total iterations
(default iterations: 4000, default warmup iterations: 2000, default chains: 4,
hence a threshold of `r (4000 - 2000) * 4 / 10`). Also, all $\hat{R}$ values are below $1.1$. Finally, we already
saw that the Posterior Predictive P-values show that the model is
representative of the data.
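The threshold can be recomputed for any sampler setting. The helper `post_warmup_draws` below is hypothetical, and the sketch assumes that the iteration and warmup defaults quoted above (4000 and 2000) apply whenever they are not overridden. Since the model in this vignette was fitted with `chains = 2`, the corresponding threshold is shown as well.

```r
# Number of post-warmup draws across all chains
post_warmup_draws <- function(iter, warmup, chains) {
  (iter - warmup) * chains
}

# Defaults quoted in the text: 4000 iterations, 2000 warmup, 4 chains
threshold_default <- 0.10 * post_warmup_draws(4000, 2000, 4)

# The model above was fitted with chains = 2
threshold_fitted <- 0.10 * post_warmup_draws(4000, 2000, 2)

print(c(default = threshold_default, fitted = threshold_fitted))
```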

Then, it is important to look at what the fixed effects of the Control Group show before examining the differences with the single case.

In this analysis, there are 5 fixed effects whose $BF_{10}$ is greater than 3 [@Raftery1995].

```
tmp <- sum_mdl[[1]][sum_mdl[[1]]$BF10 > 3,c("BF10","mean","2.5%","97.5%")]

colnames(tmp) <- c("$BF_{10}$", "$\\mu$", "low $95\\%~CI$", "up $95\\%~CI$")

knitr::kable(
  tmp,
  digits = 3
)
```

We can have a general overview of the coefficients of the model with the
`plot.BMSC` function.

```
plot( mdl , who = "control" )
```

The interaction between Body District and Congruency needs further
analysis to better understand the phenomenon. The `pairwise.BMSC` function
is useful for this purpose.

```
pp <- pairwise.BMSC(mdl = mdl ,
                    contrast = "Body.District2:Congruency2" ,
                    who = "control")

print( pp , digits = 3 )
```

The output of this function is divided into two parts:

- a first part, called "Marginal distributions", where the marginal distributions of each level of the coefficients are reported with a $BF_{10}$ against zero.
- a second part, called "Table of contrasts", with all possible pairwise comparisons and their $BF_{10}$.
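The relation between the two parts can be illustrated with simulated draws. Under treatment contrasts, the posterior of each cell is obtained by summing the draws of the relevant coefficients, and each pairwise contrast is the draw-by-draw difference between two cells. All draws below are hypothetical, and this is only one plausible way to obtain such quantities; the internal computations of `pairwise.BMSC` may differ.

```r
set.seed(7)
n_draws <- 2000

# Hypothetical posterior draws of four treatment-coded coefficients
b0     <- rnorm(n_draws, 600, 10)  # Intercept (FOOT, Congruent)
b_bd   <- rnorm(n_draws, -40, 10)  # Body.District2 (HAND - FOOT)
b_cong <- rnorm(n_draws,  20, 10)  # Congruency2 (Incongruent - Congruent)
b_int  <- rnorm(n_draws,  15, 10)  # Body.District2:Congruency2

# Marginal posterior of each cell, obtained by summing the relevant draws
cells <- cbind(
  "FOOT Congruent"   = b0,
  "HAND Congruent"   = b0 + b_bd,
  "FOOT Incongruent" = b0 + b_cong,
  "HAND Incongruent" = b0 + b_bd + b_cong + b_int
)

# All pairwise differences between cells, draw by draw
cell_pairs <- combn(colnames(cells), 2)
contrasts_toy <- apply(cell_pairs, 2, function(p) cells[, p[1]] - cells[, p[2]])
colnames(contrasts_toy) <- paste(cell_pairs[1, ], "-", cell_pairs[2, ])

# Posterior mean and 95% interval of each contrast
contrast_summary <- t(apply(
  contrasts_toy, 2,
  function(d) c(mean = mean(d), quantile(d, c(0.025, 0.975)))
))

print(round(contrast_summary, 1))
```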

It is also possible to plot the results of this function with the use of
`plot.pairwise.BMSC`.

```
plot( pp )
```

Finally, it is possible to plot the marginal posterior distributions for each effect with $BF_{10} > 3$.

```
p1 <- pairwise.BMSC(mdl , contrast = "Body.District2" , who = "control" )

plot( p1 )[[1]] +
  ggtitle("Body District" , subtitle = "Marginal effects")

plot( p1 )[[2]] +
  ggtitle("Body District" , subtitle = "Contrasts")

p2 <- pairwise.BMSC(mdl , contrast = "Congruency2" , who = "control" )

plot( p2 )[[1]] +
  ggtitle("Congruency" , subtitle = "Marginal effects")

plot( p2 )[[2]] +
  ggtitle("Congruency" , subtitle = "Contrasts")

p3 <- pairwise.BMSC(mdl , contrast = "Side2" , who = "control" )

plot( p3 )[[1]] +
  ggtitle("Side" , subtitle = "Marginal effects")

plot( p3 )[[2]] +
  ggtitle("Side" , subtitle = "Contrasts")
```

Finally, the difference between the Control Group and the Single Case is of interest.

A general plot can be obtained in the following way, plotting both the Control Group and the Single Case:

```
plot( mdl ) +
  theme_bw( base_size = 18 )+
  theme( legend.position = "bottom",
         legend.direction = "horizontal")
```

or plotting only the difference

```
plot( mdl ,
      who = "delta" ) +
  theme_bw( base_size = 18 )
```

The relevant coefficients are:

```
tmp <- sum_mdl[[3]][sum_mdl[[3]]$BF10 > 3,c("BF10","mean","2.5%","97.5%")]

colnames(tmp) <- c("$BF_{10}$", "$\\mu$", "low $95\\%~CI$", "up $95\\%~CI$")

knitr::kable(
  tmp,
  digits = 3
)
```

The Intercept coefficient is showing us that the single case is generally slower than the Control Sample (generally speaking, when you analyse healthy controls against a single case with a specific disease, the single case is slower).

All the main effects can be further analysed by simply looking at their
estimates (knowing the contrast matrix and the direction of the estimate
you can understand which level is greater than the other), or by means
of the `pairwise.BMSC` function, if you also want marginal effects
and automatic plots.

The interactions require the use of the `pairwise.BMSC` function.

```
p4 <- pairwise.BMSC(mdl ,
                    contrast = "Body.District2:Congruency2" ,
                    who = "delta")

print( p4 , digits = 3 )
```

The `pairwise.BMSC` function shows that in all cases the marginal effects of the
RTs were greater than zero, but differences were present only in
the comparison between FOOT Congruent and the other cases.

```
plot( p4 , type = "interval")

plot( p4 , type = "area")

plot( p4 , type = "hist")
```

In this case we can observe that the single case was more facilitated by the FOOT Congruent condition than the Control Group.

If the interpretation of the results is difficult, it can be useful to look at what happens in the Single Case marginal effects.

```
p5 <- pairwise.BMSC(mdl ,
                    contrast = "Body.District2:Congruency2" ,
                    who = "singlecase")

plot( p5 , type = "hist")[[1]]
```

```
p6 <- pairwise.BMSC(mdl ,
                    contrast = "Body.District2:Side2" ,
                    who = "delta")

print( p6 , digits = 3 )

plot( p6 , type = "hist")[[1]] +
  theme_bw( base_size = 18)+
  theme( strip.text.y = element_text( angle = 0 ) )
```

In this case, we can see that the left - right difference in the single case is always present, with faster RTs in the left foot than in the other cases.

```
p7 <- pairwise.BMSC(mdl ,
                    contrast = "Body.District2:Congruency2:Side2" ,
                    who = "delta")

print( p7 , digits = 3 )

plot( p7 , type = "hist")[[1]] +
  theme_bw( base_size = 18)+
  theme( strip.text.y = element_text( angle = 0 ) )
```

Here we can see that the effect was driven by the facilitation that the single case had in the Left Congruent Foot condition compared to the Control Group.

The **bmscstan** package has wrapper functions to interface with the `loo`
package, in order to diagnose and compare BMSC models.

Leave-One-Out scores, diagnostics and comparisons are computed separately for the Control Group and the Single Case data.

In order to see the Leave-One-Out and the Pareto smoothed importance sampling
(PSIS) results, it is possible to use the function `BMSC_loo`:

```
print( loo1 <- BMSC_loo( mdl ) )

plot( loo1 )
```

Model comparison can be done by means of the `BMSC_loo_compare` function:

```
mdl.null <- BMSC(formula = RT ~ 1 +
                   (Congruency * Side | BD_ID),
                 data_ctrl = data.ctrl,
                 data_sc = data.pt,
                 cores = 1,
                 chains = 2,
                 seed = 2021)

print( loo2 <- BMSC_loo( mdl.null ) )

plot( loo2 )

BMSC_loo_compare( list( loo1, loo2 ) )
```

Further details on LOO, PSIS and their use can be found in the **loo**
package and in @Vehtari2016 and @Vehtari2015.

This section gives a brief example of how to use the package for binomial data.

We start by simulating the data.

```
######################################
# simulation of controls' group data
######################################

# Number of levels for each condition and trials
NCond   <- 2
Ntrials <- 20
NSubjs  <- 40

betas <- c( 0.5 , 0 )

data.sim <- expand.grid(
  trial = 1:Ntrials,
  ID    = factor(1:NSubjs),
  Cond  = factor(1:NCond)
)

### d.v. generation
y <- rep( times = nrow(data.sim) , NA )

# cheap simulation of individual random intercepts
set.seed(1)
rsubj <- rnorm(NSubjs , sd = 0.1)

for( i in 1:length( levels( data.sim$ID ) ) ){

  sel <- which( data.sim$ID == as.character(i) )

  mm <- model.matrix(~ 1 + Cond , data = data.sim[ sel , ] )

  set.seed(1 + i)
  y[sel] <- mm %*% as.matrix(betas + rsubj[i]) +
    rnorm( n = Ntrials * NCond )
}

data.sim$y <- y
data.sim$bin <- sapply(
  LaplacesDemon::invlogit(data.sim$y),
  function(x) rbinom( 1, 1, x)
)

data.sim.bin <- aggregate( bin ~ Cond * ID, data = data.sim, FUN = sum)
data.sim.bin$n <- aggregate( bin ~ Cond * ID,
                             data = data.sim, FUN = length)$bin

######################################
# simulation of patient data
######################################

betas.pt <- c( 0 , 2 )

data.pt <- expand.grid(
  trial = 1:Ntrials,
  Cond  = factor(1:NCond)
)

### d.v. generation
mm <- model.matrix(~ 1 + Cond , data = data.pt )

set.seed(5)
data.pt$y <- (mm %*% as.matrix(betas.pt + betas) +
  rnorm( n = Ntrials * NCond ))[,1]

data.pt$bin <- sapply(
  LaplacesDemon::invlogit(data.pt$y),
  function(x) rbinom( 1, 1, x)
)

data.pt.bin <- aggregate( bin ~ Cond, data = data.pt, FUN = sum)
data.pt.bin$n <- aggregate( bin ~ Cond, data = data.pt, FUN = length)$bin

plot(x = data.sim.bin$Cond, y = data.sim.bin$bin, ylim = c(0,20))
points(x = data.pt.bin$Cond, y = data.pt.bin$bin, col = "red")
```

The boxplot represents the control participants, the red dot the single case.

Now, we can specify the model:

`cbind(bin, n) ~ Cond`

The right-hand side of the formula follows the usual lmer- and brms-like
syntax. In the left-hand side of the formula, `brms` and `lme4` have
divergent notations.

In future, the `bmscstan` package will be able to use both notations.
For the moment, it is necessary to use the `lme4` notation `cbind(bin, n)`,
where:

- `bin` is the number of observed events (the sum of the binary responses)
- `n` is the total number of trials
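The two columns can be built from trial-level data with `aggregate`, as done in the simulation above. The small data frame `trials_toy` below is hypothetical and only shows the shape of the aggregated data expected on the left-hand side of the formula. Base R's `plogis` is used here as the inverse logit.

```r
# Hypothetical trial-level data: 1 = event observed, 0 = not observed
trials_toy <- data.frame(
  Cond = factor(rep(1:2, each = 5)),
  bin  = c(1, 0, 1, 1, 0,   1, 1, 1, 0, 1)
)

# Number of observed events per condition
agg_toy <- aggregate(bin ~ Cond, data = trials_toy, FUN = sum)

# Total number of trials per condition
agg_toy$n <- aggregate(bin ~ Cond, data = trials_toy, FUN = length)$bin

# Observed proportions
agg_toy$prop <- agg_toy$bin / agg_toy$n

# Inverse logit: maps a value on the logit scale back to a probability
p_half <- plogis(0)

print(agg_toy)
```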

```
mdlBin <- BMSC(formula = cbind(bin, n) ~ 1 + Cond,
               data_ctrl = data.sim.bin,
               data_sc = data.pt.bin,
               seed = 2022,
               chains = 2,
               family = "binomial",
               cores = 1)

print( summary( mdlBin ) , digits = 3 )
```

In this vignette we have seen how to use the package **bmscstan**
and its functions to analyse and make sense of Single Case data.

The output of the main functions is rich in information, and the Bayesian Inference can be done by taking into account the Savage-Dickey $BF_{10}$, or the $95\%$ CI [see @Kruschke2014 for further details].

In this vignette there is almost no discussion concerning how to test the Single Case fixed effects (third part of the main output), but it was used to better understand what happens in the differences between the single case and the control group.

However, if your hypotheses focus on the behaviour of the patient, and not only on the differences between single case and the control group, it will be important to analyse in detail also that part.
