lmer: Fit Linear Mixed-Effects Models
In lme4: Linear Mixed-Effects Models using 'Eigen' and S4

Description

Fit a linear mixed-effects model (LMM) to data, via restricted maximum likelihood (REML) or maximum likelihood.

Usage

lmer(formula, data = NULL, REML = TRUE, control = lmerControl(),
     start = NULL, verbose = 0L, subset, weights, na.action,
     offset, contrasts = NULL, devFunOnly = FALSE)

Arguments

`formula`	a two-sided linear formula object describing both the fixed-effects and random-effects part of the model, with the response on the left of a `~` operator and the terms, separated by `+` operators, on the right. Random-effects terms are distinguished by vertical bars (`|`) separating expressions for design matrices from grouping factors. By default, non-scalar random effects (where the design matrix has more than one column, e.g. `(1+x|f)`) are fitted with unstructured (general positive semidefinite) covariance matrices. Two vertical bars (`||`) can be used to specify multiple uncorrelated random effects for the same grouping variable. With default settings, the `||`-syntax works only for design matrices containing numeric (continuous) predictors; to fit models with independent categorical effects, use `diag(f|g)` or set `options(lme4.doublevert.default = "diag_special")` (see `getDoublevertDefault`). Tags preceding a random effect term specify covariance structure:
- `us`: unstructured, positive semidefinite (default: `us(f|g)` is equivalent to `(f|g)`).
- `diag`: diagonal (all correlations set to zero). Specify `diag(f|g, hom = TRUE)` to fit a homogeneous diagonal covariance matrix.
- `cs`: compound symmetric (all pairwise correlations set identical). Specify `cs(f|g, hom = TRUE)` for homogeneous variances.
- `ar1`: autoregressive order 1. Note that AR1 models are homogeneous by default; specify `ar1(f|g, hom = FALSE)` for heterogeneous variances.
`data`	an optional data frame containing the variables named in `formula`. By default the variables are taken from the environment from which `lmer` is called. While `data` is optional, the package authors strongly recommend its use, especially when later applying methods such as `update` and `drop1` to the fitted model (such methods are not guaranteed to work properly if `data` is omitted). If `data` is omitted, variables will be taken from the environment of `formula` (if specified as a formula) or from the parent frame (if specified as a character vector).
`REML`	logical scalar - Should the estimates be chosen to optimize the REML criterion (as opposed to the log-likelihood)?
`control`	a list (of correct class, resulting from `lmerControl()` or `glmerControl()` respectively) containing control parameters, including the nonlinear optimizer to be used and parameters to be passed through to the nonlinear optimizer; see the `*lmerControl` documentation for details.
`start`	a numeric vector or a named list with one optional component named `par` or `theta`, giving starting values for covariance parameters. Numeric `start` is equivalent to `list(par = start)`. Parameters corresponding to unstructured covariance matrices are on the scale of the Cholesky factor of the relative covariance matrix. By default, all relative covariance matrices are identity matrices.
`verbose`	integer scalar. If `> 0` verbose output is generated during the optimization of the parameter estimates. If `> 1` verbose output is generated during the individual penalized iteratively reweighted least squares (PIRLS) steps.
`subset`	an optional expression indicating the subset of the rows of `data` that should be used in the fit. This can be a logical vector, or a numeric vector indicating which observation numbers are to be included, or a character vector of the row names to be included. All observations are included by default.
`weights`	an optional vector of ‘prior weights’ to be used in the fitting process. Should be `NULL` or a numeric vector. Prior `weights` are not normalized or standardized in any way. In particular, the diagonal of the residual covariance matrix is the squared residual standard deviation parameter `sigma` times the vector of inverse `weights`. Therefore, if the `weights` have relatively large magnitudes, then in order to compensate, the `sigma` parameter will also need to have a relatively large magnitude.
`na.action`	a function that indicates what should happen when the data contain `NA`s. The default action (`na.omit`, inherited from the 'factory fresh' value of `getOption("na.action")`) strips any observations with any missing values in any variables.
`offset`	this can be used to specify an a priori known component to be included in the linear predictor during fitting. This should be `NULL` or a numeric vector of length equal to the number of cases. One or more `offset` terms can be included in the formula instead or as well, and if more than one is specified their sum is used. See `model.offset`.
`contrasts`	an optional list. See the `contrasts.arg` of `model.matrix.default`.
`devFunOnly`	logical - return only the deviance evaluation function. Note that because the deviance function operates on variables stored in its environment, it may not return exactly the same values on subsequent calls (but the results should always be within machine tolerance).
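
The scaling described under `weights` can be checked numerically. The following is a minimal sketch, not part of the package's own examples: it assumes the `sleepstudy` data shipped with lme4 and uses a hypothetical constant weight column `w` (all values equal to 4). Because the residual variance is `sigma^2 / weights`, multiplying every weight by 4 should leave the fixed effects unchanged while roughly doubling `sigma`.

```r
library(lme4)

# Baseline fit with implicit unit weights
fm_unit <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)

# Hypothetical constant prior weight of 4 for every observation
sleepstudy_w <- transform(sleepstudy, w = 4)
fm_w4 <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy_w,
              weights = w)

# Weights are not normalized, so sigma must grow to compensate:
# residual variance = sigma^2 / w, hence sigma scales by sqrt(4) = 2
sigma_ratio <- sigma(fm_w4) / sigma(fm_unit)
print(sigma_ratio)

# The fixed-effect estimates should agree up to optimizer tolerance
print(rbind(unit = fixef(fm_unit), w4 = fixef(fm_w4)))
```

In practice this means that `sigma` is only interpretable as a residual standard deviation relative to the scale of the supplied weights.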

Details

If the formula argument is specified as a character vector, the function will attempt to coerce it to a formula. However, this is not recommended (users who want to construct formulas by pasting together components are advised to use as.formula or reformulate); model fits will work but subsequent methods such as drop1, update, etc. may fail.
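
As a hedged illustration of the advice above, the sketch below assembles a model formula from strings with `reformulate` instead of passing a character vector to `lmer`. The variable names `fixed_terms` and `random_terms` are hypothetical and only serve the example.

```r
library(lme4)

# Hypothetical pieces of a model specification, stored as strings
fixed_terms  <- c("Days")
random_terms <- c("(Days | Subject)")

# reformulate() returns a formula object rather than a character vector
f <- reformulate(c(fixed_terms, random_terms), response = "Reaction")
print(f)

fm <- lmer(f, data = sleepstudy)

# With a real formula and an explicit `data` argument, update() can refit
fm_ml <- update(fm, REML = FALSE)
print(isREML(fm_ml))
```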
When handling perfectly collinear predictor variables (i.e. fixed-effect design matrices of less than full rank), [gn]lmer is not as sophisticated as modeling frameworks such as lm and glm. While it does automatically drop collinear variables (with a message rather than a warning), it does not automatically fill in NA values for the dropped coefficients; these can be added via fixef(fitted.model, add.dropped=TRUE). This information can also be retrieved via attr(getME(fitted.model, "X"), "col.dropped").
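
The behaviour described above can be reproduced with a deliberately rank-deficient design. This is a minimal sketch under the default `lmerControl()` settings; the column `Days2` is a hypothetical, perfectly collinear copy of `Days` created only for illustration.

```r
library(lme4)

# Days2 is an exact multiple of Days, so the fixed-effect design
# matrix has less than full rank (illustrative data only)
d <- transform(sleepstudy, Days2 = 2 * Days)

# lmer() reports the dropped column with a message and fits the reduced model
fm_rd <- lmer(Reaction ~ Days + Days2 + (1 | Subject), data = d)

# Only the retained coefficients are returned by default
coef_kept <- fixef(fm_rd)
print(coef_kept)

# add.dropped = TRUE reinstates the dropped coefficient as NA
coef_full <- fixef(fm_rd, add.dropped = TRUE)
print(coef_full)

# Indices of the dropped columns, as stored on the model matrix
print(attr(getME(fm_rd, "X"), "col.dropped"))
```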
The deviance function returned when devFunOnly is TRUE takes a single numeric vector argument that defines the scaled variance-covariance matrices of the random effects:
- In the case of unstructured covariances, this vector is directly mapped to the theta vector, which represents the unique non-zero values in the Cholesky factor of the (scaled) covariance matrix. For models with only simple (intercept-only) random effects, par (and thus theta) is a vector of the standard deviations of the random effects. For more complex or multiple random effects, running getME(.,"par") or (equivalently) getME(., "theta") to retrieve the theta vector for a fitted model and examining the names of the vector is probably the easiest way to determine the correspondence between the elements of the theta vector and elements of the lower triangles of the Cholesky factors of the random effects.
- For structured covariances, the getTheta method translates the parameter vector to the theta (Cholesky-factor element) scale for internal use. The parameter vector is usually composed of a set of standard-deviation values (one if hom = TRUE or many if hom = FALSE), followed by one or more parameters that determine the correlation matrix.
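
The relationship between the deviance function and the theta vector can be sketched as follows for the unstructured (default) case. This is an illustration rather than a prescribed workflow: it uses `getME(., "theta")` and assumes that "scaled" means relative to the residual standard deviation, so that for an intercept-only random effect `theta * sigma` recovers the random-effect standard deviation.

```r
library(lme4)

form <- Reaction ~ Days + (Days | Subject)

# Fitted model, and the corresponding deviance function on its own
fm     <- lmer(form, data = sleepstudy)
devfun <- lmer(form, data = sleepstudy, devFunOnly = TRUE)

# The names of theta indicate which Cholesky-factor element each value is
theta_hat <- getME(fm, "theta")
print(theta_hat)

# Evaluating the deviance function at the fitted theta should reproduce
# the REML criterion of the fitted model (REML = TRUE is the default)
dev_at_opt <- devfun(theta_hat)
print(c(devfun = dev_at_opt, REMLcrit = REMLcrit(fm)))

# Intercept-only case: theta is the random-effect SD on the scaled
# (relative to sigma) scale
fm0       <- lmer(Reaction ~ Days + (1 | Subject), data = sleepstudy)
theta0    <- unname(getME(fm0, "theta"))
sd_subj   <- unname(attr(VarCorr(fm0)$Subject, "stddev"))
sd_theta  <- theta0 * sigma(fm0)
print(c(from_theta = sd_theta, from_VarCorr = sd_subj))
```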

Value

An object of class merMod (more specifically, an object of subclass lmerMod), for which many methods are available (e.g. methods(class="merMod")).
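
A few commonly used extractor methods are sketched below as an illustration; this is a small selection, not a complete list, and the object name `fm` is arbitrary.

```r
library(lme4)

fm <- lmer(Reaction ~ Days + (Days | Subject), data = sleepstudy)

# The fit is an lmerMod object, which extends merMod
print(class(fm))

fe <- fixef(fm)           # fixed-effect estimates
re <- ranef(fm)$Subject   # conditional modes, one row per Subject
print(fe)
print(head(re))

print(VarCorr(fm))        # random-effect and residual (co)variances
print(isREML(fm))         # TRUE, since REML = TRUE is the default
```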

Note

In earlier versions of the lme4 package, a method argument was used. Its functionality has been replaced by the REML argument.

Also, lmer(.) formerly allowed a family argument (to effectively switch to glmer(.)). This was deprecated in summer 2013 and disabled in spring 2019.

Examples

## linear mixed models - reference values from older code
(fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy))
summary(fm1) # with its own print method; see class?merMod
plot(fm1) # plotting the model diagnostics; see ?plot.merMod

str(terms(fm1))
stopifnot(identical(terms(fm1, fixed.only=FALSE),
                    terms(model.frame(fm1))))
attr(terms(fm1, FALSE), "dataClasses") # fixed.only=FALSE needed for dataClasses

## Maximum Likelihood (ML), and "monitor" iterations via 'verbose':
fm1_ML <- update(fm1, REML=FALSE, verbose = 1)
(fm2 <- lmer(Reaction ~ Days + (Days || Subject), sleepstudy))
anova(fm1, fm2)
sm2 <- summary(fm2)
print(fm2, digits=7, ranef.comp="Var") # the print.merMod()         method
print(sm2, digits=3, corr=FALSE)       # the print.summary.merMod() method

## Fit sex-specific variances by constructing numeric dummy variables
## for sex and sex:age; in this case the estimated variance differences
## between groups in both intercept and slope are zero ...
data(Orthodont,package="nlme")
Orthodont$nsex <- as.numeric(Orthodont$Sex=="Male")
Orthodont$nsexage <- with(Orthodont, nsex*age)
lmer(distance ~ age + (age|Subject) + (0+nsex|Subject) +
     (0 + nsexage|Subject), data=Orthodont)
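
The `||` term used for `fm2` above can be read as shorthand for separate random-effects terms that share one grouping factor. The sketch below assumes the default handling of `||` with a numeric predictor and compares it with the explicitly split specification; the object names are illustrative.

```r
library(lme4)

# Double-bar syntax: uncorrelated random intercept and slope
fm_dbl <- lmer(Reaction ~ Days + (Days || Subject), data = sleepstudy)

# The same model written as two separate terms for the same grouping factor
fm_split <- lmer(Reaction ~ Days + (1 | Subject) + (0 + Days | Subject),
                 data = sleepstudy)

# Both fits should report the same variance components ...
print(VarCorr(fm_dbl))
print(VarCorr(fm_split))

# ... and the same REML log-likelihood, up to optimizer tolerance
ll_dbl   <- as.numeric(logLik(fm_dbl))
ll_split <- as.numeric(logLik(fm_split))
print(c(double_bar = ll_dbl, split = ll_split))
```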

lme4 documentation built on March 6, 2026, 1:07 a.m.