Translating lme4 models to sommer

The sommer package was developed to provide R users a powerful and reliable multivariate mixed model solver. The package is focused on two approaches: 1) p > n (more effects to estimate than observations) using the mmer() function, and 2) n > p (more observations than effects to estimate) using the mmec() function. The core algorithms are coded in C++ using the Armadillo library. This package allows the user to fit mixed models with the advantage of specifying the variance-covariance structure for the random effects, and specifying heterogeneous variances, and obtaining other parameters such as BLUPs, BLUEs, residuals, fitted values, variances for fixed and random effects, etc.

The purpose of this vignette is to show how to translate the syntax formula from lme4 models to sommer models. Feel free to remove the comment marks from the lme4 code so you can compare the results.

1) Random slopes with same intercept 2) Random slopes and random intercepts (without correlation) 3) Random slopes and random intercepts (with correlation) 4) Random slopes with a different intercept 5) Other models not available in lme4

1) Random slopes

This is the simplest model people use when a random effect is desired and the levels of the random effect are considered to have the same intercept.

# install.packages("lme4")
# library(lme4)
library(sommer)
data(DT_sleepstudy)
DT <- DT_sleepstudy
###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (1 | Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name        Variance Std.Dev.
#  Subject  (Intercept) 1378.2   37.12   
#  Residual              960.5   30.99   
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmer(Reaction ~ Days,
            random= ~ Subject, 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

# fm2 <- mmec(Reaction ~ Days,
#             random= ~ Subject, 
#             data=DT, tolParInv = 1e-6, verbose = FALSE)
# summary(fm2)$varcomp

2) Random slopes and random intercepts (without correlation)

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable. In addition the || in lme4 assumes that slopes and intercepts have no correlation.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (Days || Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups    Name        Variance Std.Dev.
#  Subject   (Intercept) 627.57   25.051  
#  Subject.1 Days         35.86    5.988  
#  Residual              653.58   25.565  
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmer(Reaction ~ Days,
            random= ~ Subject + vsr(Days, Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

# fm2 <- mmec(Reaction ~ Days,
#             random= ~ Subject + vsc(dsc(Days), isc(Subject)),
#             data=DT, tolParInv = 1e-6, verbose = FALSE)
# summary(fm2)$varcomp

Notice that Days is a numerical (not factor) variable.

3) Random slopes and random intercepts (with correlation)

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable. In addition a single | in lme4 assumes that slopes and intercepts have a correlation to be estimated.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (Days | Subject), data=DT)
# summary(fm1) # or # vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name        Variance Std.Dev. Corr
#  Subject  (Intercept) 612.10   24.741       
#           Days         35.07    5.922   0.07
#  Residual             654.94   25.592       
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
## no equivalence using mmer() to find the correlation between the 2 vc
## using mmec() the model would be
# mm <- diag(2)+matrix(.02,2,2)
# fm3 <- mmec(Reaction ~ Days,
#             random=~ covc( vsc(isc(Subject)), vsc(isc(Days), isc(Subject)), theta = mm ), 
#             nIters = 100,
#             data=DT)
# summary(fm3)$varcomp # or # 
# cov2cor(fm3$theta[[1]])

Notice that this last model required to specify the initial scaled variance components because an unstructured covariance normally has difficulty to converge.

4) Random slopes with a different intercept

This is the a model where you assume that the random effect has different intercepts based on the levels of another variable but there's not a main effect. The 0 in the intercept in lme4 assumes that random slopes interact with an intercept but without a main effect.

###########
## lme4
###########
# fm1 <- lmer(Reaction ~ Days + (0 + Days | Subject), data=DT)
# summary(fm1) # or vc <- VarCorr(fm1); print(vc,comp=c("Variance"))
# Random effects:
#  Groups   Name Variance Std.Dev.
#  Subject  Days  52.71    7.26   
#  Residual      842.03   29.02   
# Number of obs: 180, groups:  Subject, 18
###########
## sommer
###########
fm2 <- mmer(Reaction ~ Days,
            random= ~ vsr(Days, Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

# fm2 <- mmec(Reaction ~ Days,
#             random= ~ vsc(dsc(Days), isc(Subject)), 
#             data=DT, tolParInv = 1e-6, verbose = FALSE)
# summary(fm2)$varcomp

4) Other models available in sommer but not in lme4

One of the strengths of sommer is the availability of other variance covariance structures. In this section we show 4 models available in sommer that are not available in lme4 and might be useful.

library(orthopolynom)
## diagonal model
fm2 <- mmer(Reaction ~ Days,
            random= ~ vsr(dsr(Daysf), Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp
## unstructured model
fm2 <- mmer(Reaction ~ Days,
            random= ~ vsr(usr(Daysf), Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp
## random regression (legendre polynomials)
fm2 <- mmer(Reaction ~ Days,
            random= ~ vsr(leg(Days,1), Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp
## unstructured random regression (legendre)
fm2 <- mmer(Reaction ~ Days,
            random= ~ vsr(usr(leg(Days,1)), Subject), 
            data=DT, tolParInv = 1e-6, verbose = FALSE)
summary(fm2)$varcomp

# same can be done with the mmec function

Literature

Covarrubias-Pazaran G. 2016. Genome assisted prediction of quantitative traits using the R package sommer. PLoS ONE 11(6):1-15.

Covarrubias-Pazaran G. 2018. Software update: Moving the R package sommer to multivariate mixed models for genome-assisted prediction. doi: https://doi.org/10.1101/354639

Bernardo Rex. 2010. Breeding for quantitative traits in plants. Second edition. Stemma Press. 390 pp.

Gilmour et al. 1995. Average Information REML: An efficient algorithm for variance parameter estimation in linear mixed models. Biometrics 51(4):1440-1450.

Henderson C.R. 1975. Best Linear Unbiased Estimation and Prediction under a Selection Model. Biometrics vol. 31(2):423-447.

Kang et al. 2008. Efficient control of population structure in model organism association mapping. Genetics 178:1709-1723.

Lee, D.-J., Durban, M., and Eilers, P.H.C. (2013). Efficient two-dimensional smoothing with P-spline ANOVA mixed models and nested bases. Computational Statistics and Data Analysis, 61, 22 - 37.

Lee et al. 2015. MTG2: An efficient algorithm for multivariate linear mixed model analysis based on genomic information. Cold Spring Harbor. doi: http://dx.doi.org/10.1101/027201.

Maier et al. 2015. Joint analysis of psychiatric disorders increases accuracy of risk prediction for schizophrenia, bipolar disorder, and major depressive disorder. Am J Hum Genet; 96(2):283-294.

Rodriguez-Alvarez, Maria Xose, et al. Correcting for spatial heterogeneity in plant breeding experiments with P-splines. Spatial Statistics 23 (2018): 52-71.

Searle. 1993. Applying the EM algorithm to calculating ML and REML estimates of variance components. Paper invited for the 1993 American Statistical Association Meeting, San Francisco.

Yu et al. 2006. A unified mixed-model method for association mapping that accounts for multiple levels of relatedness. Genetics 38:203-208.

Tunnicliffe W. 1989. On the use of marginal likelihood in time series model estimation. JRSS 51(1):15-27.



Try the sommer package in your browser

Any scripts or data that you put into this service are public.

sommer documentation built on Nov. 13, 2023, 9:05 a.m.