glmer2stan: Define Stan model using glmer notation
In rmcelreath/glmer2stan: RStan models defined by glmer formulas

Description Usage Arguments Details Author(s) References See Also Examples

Using standard formula notation from glmer (lme4), defines a Stan model (rstan) and optionally samples from the posterior. Can optionally compute DIC or WAIC. Supports model families: "gaussian", "binomial", "poisson", "ordered", "gamma", and the zero-augmented family "zigamma". A number of custom mixture and multiple-outcome models can be specified by using lists of formulas and family names.

glmer2stan( formula , data , family="gaussian" , varpriors="flat" , 
    sample=TRUE , warmup=5000 , iter=10000 , chains=1 , initmethod="zero" , 
    extract=FALSE , calcDIC=TRUE , calcWAIC=FALSE , verbose=TRUE , 
    fixed_prefix="beta_" , vary_prefix="vary_" , ... )
    
lm2stan( formula , data , ... )
glm2stan( formula , data , family , ... )
lmer2stan ( formula , data , ... )

`formula`	Model formula or list of formulas, using `glmer` notation for varying effects.
`data`	Data frame or list
`family`	Model family name or list of names for outcome(s). Valid choices are: "gaussian", "binomial" (logit link), "poisson" (log link), "ordered" (cumulative logit), "gamma" (log link), "zigamma" (zero-augmented gamma, logit and log links).
`varpriors`	Variance prior presets. Valid choices are `'weak'` and `'flat'`. See details below.
`sample`	Whether or not to sample from the posterior (`TRUE`) or just return model code (`FALSE`)
`warmup`	rstan parameter: number of adaptation samples
`iter`	rstan parameter: total number of samples, including warmup
`chains`	rstan parameter: number of chains
`initmethod`	Method of determining initial values for sampling. Options are "zero" for all zeros on fixed effects (except Intercepts) and ones on standard deviations, "random" for random inits within parameter support, and "lme4" for estimates from a `glmer` fit.
`extract`	If TRUE, extracts samples (minus warmup) from resulting `rstan` fit and returns only the samples, not the entire `stanfit` object
`calcDIC`	If TRUE, computes the Deviance Information Criterion, DIC, after sampling
`calcWAIC`	If TRUE, computes the Widely Applicable Information Criterion, WAIC, after sampling
`verbose`	If TRUE, displays various progress messages
`fixed_prefix`	Text to prepend to fixed effect parameter names. Default is 'beta_'.
`vary_prefix`	Text to prepend to varying effect parameter names. Default is 'vary_'.
`...`	additional parameters to pass to `glmer2stan` and `stan`

These commands use the varying effects and fixed effects structures of a lmer or glmer model formula to define the corresponding model in rstan. The functions lm2stan, glm2stan, and lmer2stan just provide simplified aliases for glmer2stan. All processing is done by glmer2stan.

There are two modes.

First, when option sample=FALSE, the command will define the model code and the data and init lists for stan, returning them in a list with named slots model, data, and init. These objects can later be passed to stan to sample from the model. You can use show to display the model code in a readable format.

Second, when sample=TRUE, the glmer2stan will begin rstan sampling, returning the stanfit object. Once sampling is done, the functions stanmer and varef provide summary and varying effect estimates, respectively, for the stanfit object. Also, all of the summary, plotting, and display functions defined by rstan still operate normally.

It is possible to pass lists of formulas and families. In that case, glmer2stan defines multiple outcome models that may share varying effects. Using lists in this way, it is possible to define mixture models, as well as multilevel MANOVA-style models. For example, the family list('binomial','poisson') corresponds to a zero-inflated poisson. Any varying effect grouping variables shared across formulas define large blocks of varying effects that bridge outcome variables.

Note that glmer2stan does not yet correctly handle models that pair varying slopes with non-varying intercepts. For example, y ~ (0+x|id) + x indicates a model with varying slopes on x, but no varying intercepts. This is a legitimate (although unusual) model, and glmer will estimate it, but glmer2stan will not define it properly. A later release may fix this issue. For now, manually editing the model code is necessary.

When calcDIC=TRUE, additional code is added to the model that computes the deviance at each sample. These can later be used to simplify computing DIC. When sampling is also done through glmer2stan, DIC will be computed and displayed when the chain completes. DIC is defined as 2*Dbar - Dhat, where Dbar is the average deviance and Dhat is the deviance at the average values of all parameters. All samples after warmup are used to compute these averages. See Section 8.6 of Lunn et al 2013 for details.

When calcWAIC=TRUE, the Stan model code is not altered, but glmer2stan will compute WAIC (Widely Applicable Information Criterion), when sampling completes. WAIC requires computing the likelihood for each data point, for each sample from the posterior. The formula used for the penalty term is:

p = \Sum_i \var_s(\log L(y_i|θ_s))

where the summation is over data points i and the variance is over posterior samples s. See Watanabe 2010. See also Gelman, Hwang & Vehtari 2013 for alternative formulas. Note that binomial family models should be constructed in logistic regression form, with 0/1 outcomes, for properly computing WAIC. Currently, WAIC only works for single formula models. Zero-inflated and multiple-outcome models require some additional choices to be made, in order to define WAIC, since WAIC requires partitioning the data into 'points'. In the future, some default decisions for such models may be incorporated. In the meantime, glmer2stan merely warns the user.

When choosing family='ordered', the parameter vector cutpoints will contain the ordered intercepts. The outcome variable must be integer valued with minimum 1.

Family 'gamma' uses a log link on the mean of the gamma density. The rate (inverse scale) parameter for the gamma distribution is returned as theta. glmer2stan will try to guess a good initial value for theta by maximum likelihood search, but manual choice of inits may be necessary in some cases.

When using the default initmethod of 'zero', the code tries to guess good starting values for any Intercept parameters. All other fixed effects are initialized at zero.

When using initmethod='lme4' to initialize parameters, variance components may be initialized to defaults. This is necessary whenever glmer returns a boundary estimate: zero variance or -1/+1 correlation.

The varpriors='weak' priors for variance components are sigma ~ gamma(2,1e-4) prior for standard deviations of varying effects and Rho ~ lkj_corr(1.5) prior for correlation structure of the multivariate effects. The eta=1.5 value creates a nearly-uniform correlation prior, but with low prior probability for correlations near -1 and +1. Changing eta to 1 creates a uniform correlation prior. Estimates are still reported in a merged Sigma variance-covariance matrix, even though priors are defined separately for standard deviations and the correlation matrix. See the model code for details. The varpriors='flat' preset uses uniform priors. Stan does not need nor benefit from the use of fully conjugate priors, so they are not provided as a preset. However, any prior can of course be manually added to the model code, using option sample=FALSE.

All top-level (fixed effect) regression parameters are assigned default normal(0,100) priors. To change these, set sample=FALSE and edit the model code directly.

Richard McElreath

Gelman, A., J. Hwang, and A. Vehtari. 2013. Understanding predictive information criteria for Bayesian models.

Lunn, D., C. Jackson, N. Best, A. Thomas, and D. Spiegelhalter. 2013. The BUGS Book. CRC Press.

Watanabe, S. 2010. Asymptotic equivalence of Bayes cross validation and Widely Applicable Information Criterion in singular learning theory. Journal of Machine Learning Research 11:3571-3594.

glmer,stan

# gaussian test

# simulate data
library(MASS)
N <- 1000 # 1000 cases
J <- 100 # 100 clusters
J2 <- 20
NperJ <- N/J
sigma <- 2 # top-level standard deviation
mu <- c(10,-0.5) # means of varying effects coefficients
x <- runif(N,min=-2,max=2) # predictor variable
x2 <- runif(N,min=-2,max=2)
id <- rep( 1:J , each=NperJ ) # cluster id's
id2 <- rep( 1:J2 , each=N/J2 )
Sigma <- matrix( 0 , nrow=2 , ncol=2 ) # var-cov matrix
Sigma[1,1] <- 2
Sigma[2,2] <- 0.2
Sigma[1,2] <- Sigma[2,1] <- -0.8 * sqrt( Sigma[1,1] * Sigma[2,2] )
beta <- mvrnorm( J , mu=mu , Sigma=Sigma )
y <- rnorm( N , mean=beta[id,1]+beta[id,2]*x , sd=sigma )

# display model code
model.code <- glmer2stan( y ~ (1+x+x2|id) + (1+x|id2) + x + x2 , 
  data=list(y=y,x=x,x2=x2,id=id,id2=id2) , family="gaussian" , 
  calcDIC=TRUE , sample=FALSE )
show(model.code)

# now fit rstan model
stanresult <- glmer2stan( y ~ (1+x+x2|id) + (1+x|id2) + x + x2 , 
  data=list(y=y,x=x,x2=x2,id=id,id2=id2) , family="gaussian" , 
  calcDIC=TRUE , sample=TRUE )

# summarize
stanmer(stanresult)
varef(stanresult)