predict.lmm: Predicted Outcome Value by a Linear Mixed Model.
In bozenne/repeated: Repeated Measurement Models for Discrete Times

predict.lmm

R Documentation

Predicted Outcome Value by a Linear Mixed Model.

Description

Estimate the expected outcome conditional on covariates and possibly on other outcomes based on a linear mixed model.

Usage

## S3 method for class 'lmm'
predict(
  object,
  newdata,
  type = "static",
  p = NULL,
  se = NULL,
  robust = FALSE,
  df = NULL,
  level = 0.95,
  keep.data = NULL,
  format = "long",
  export.vcov = FALSE,
  simplify = TRUE,
  ...
)

Arguments

`object`	an `lmm` object.
`newdata`	[data.frame] a dataset containing covariate values to condition on. For 'dynamic' predictions (`type = "dynamic"`), it should also contain cluster, timepoint, and outcome values.
`type`	[character] evaluate the expected outcome conditional on covariates only (`"static"`), the contribution of each variable to this 'static' prediction (`"terms"`), or the expected outcome conditional on covariates and outcome values at other timepoints (`"dynamic"`). Based on the observed outcome and the 'dynamic' prediction for the missing outcome, can also evaluate the change from the first repetition (`"change"`), the area under the curve (`"auc"`), and the area under the curve minus baseline (`"auc-b"`).
`p`	[numeric vector] value of the model coefficients at which to evaluate the predictions. Only relevant when they differ from the fitted values.
`se`	[logical] should the standard error and confidence intervals for the predictions be output? It can also be a logical vector of length 2 to indicate the type of uncertainty to be accounted for: estimation and residual variance. In particular `c(TRUE,TRUE)` provides prediction intervals.
`robust`	[logical] should robust standard errors (aka sandwich estimator) be output instead of the model-based standard errors? Can also be `2` to compute the degrees-of-freedom w.r.t. robust standard errors instead of w.r.t. model-based standard errors.
`df`	[logical] should a Student's t-distribution be used to model the distribution of the predicted mean? Otherwise a normal distribution is used.
`level`	[numeric,0-1] the confidence level of the confidence intervals.
`keep.data`	[logical] should the dataset relative to which the predicted means are evaluated be output alongside the predicted values? Only possible in the long format.
`format`	[character] should the prediction be output in a matrix format with clusters in rows and timepoints in columns (`"wide"`), or in a data.frame/vector with as many rows as observations (`"long"`)?
`export.vcov`	[logical] should the variance-covariance matrix of the prediction error be output as an attribute (`"vcov"`)?
`simplify`	[logical] simplify the data format (vector instead of data.frame) and column names (no mention of the time variable) when possible.
`...`	Not used. For compatibility with the generic method.
`vcov`	[logical] should the variance-covariance matrix of the predictions be output as an attribute.

Details

Static predictions are made using the linear predictor X\beta, while dynamic predictions use the conditional normal distribution of the missing outcome given the observed outcomes. So if outcome 1 is observed but not outcome 2, the prediction for outcome 2 is obtained as X_2\beta + \sigma_{21}\sigma^{-1}_{11}(Y_1-X_1\beta). In that case, the uncertainty is computed as the sum of the conditional variance \sigma_{22}-\sigma_{21}\sigma^{-1}_{11}\sigma_{12} and the uncertainty about the estimated conditional mean (obtained via the delta method using numerical derivatives).
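
As a rough illustration of the conditional normal computation, the base R sketch below evaluates the conditional mean and conditional variance of a missing outcome given an observed one, scaling the covariance by the inverse variance of the observed outcome. All numeric values (`mu`, `Sigma`, `y`) are hypothetical and are not extracted from an `lmm` object; the sketch does not reproduce the delta-method term for the uncertainty about the estimated conditional mean.

```r
## Hypothetical values, for illustration only (not extracted from an lmm object)
mu    <- c(2, 3)                      # linear predictors X_1 beta and X_2 beta
Sigma <- matrix(c(4, 2,
                  2, 9), nrow = 2)    # residual variance-covariance matrix
y     <- c(5, NA)                     # outcome 1 observed, outcome 2 missing

obs <- which(!is.na(y))               # indices of the observed outcomes
mis <- which(is.na(y))                # indices of the missing outcomes

## regression coefficients of the missing outcomes on the observed ones
B <- Sigma[mis, obs, drop = FALSE] %*% solve(Sigma[obs, obs, drop = FALSE])

## conditional mean and conditional variance of the missing outcomes
cond.mean <- drop(mu[mis] + B %*% (y[obs] - mu[obs]))
cond.var  <- Sigma[mis, mis, drop = FALSE] - B %*% Sigma[obs, mis, drop = FALSE]

cond.mean
cond.var
```

Because the index sets `obs` and `mis` are derived from the missing-value pattern, the same lines would apply unchanged to more than two timepoints, for instance when the first two of three outcomes are observed.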

The model terms are computed similarly to stats::predict.lm, by centering the design matrix around the mean value of the covariates used to fit the model. The centered design matrix is then multiplied by the mean coefficients, and columns assigned to the same variable (e.g. a three-level factor variable) are summed together.
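
The centering-and-summing steps can be sketched with base R on an ordinary linear model. The example below uses `stats::lm` and the built-in `mtcars` dataset rather than an `lmm` object, so it only illustrates the computation described above and is not the package's own code.

```r
## Illustration with stats::lm on a built-in dataset (not an lmm object)
e.lm <- lm(mpg ~ wt + factor(cyl), data = mtcars)

X      <- model.matrix(e.lm)          # design matrix used to fit the model
beta   <- coef(e.lm)                  # mean coefficients
assign <- attr(X, "assign")           # maps each column to a model term (0 = intercept)

## center each column around its mean in the fitting data
Xc <- sweep(X, MARGIN = 2, STATS = colMeans(X), FUN = "-")

## multiply each centered column by its coefficient
contrib <- sweep(Xc, MARGIN = 2, STATS = beta, FUN = "*")

## sum the columns assigned to the same variable (e.g. the factor dummies)
labels <- attr(terms(e.lm), "term.labels")
manual <- sapply(seq_along(labels), function(k) {
  rowSums(contrib[, assign == k, drop = FALSE])
})
colnames(manual) <- labels

## compare with the built-in computation
all.equal(manual, predict(e.lm, type = "terms"), check.attributes = FALSE)
```

The intercept column becomes zero after centering, which is why it does not appear among the term contributions.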

Value

When format="long", a data.frame containing the following columns:

estimate: predicted mean.
se: uncertainty about the predicted mean.
df: degrees-of-freedom.
lower: lower bound of the confidence interval of the predicted mean.
upper: upper bound of the confidence interval of the predicted mean.
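
To show how these columns could relate to one another, here is a minimal base R sketch that assumes a symmetric Wald-type interval built from the estimate, its standard error, and a Student's t or normal quantile. This form is an assumption made for illustration; the source does not spell out the interval formula, and all numeric values are hypothetical.

```r
## Hypothetical values, for illustration only
estimate <- 2.5    # predicted mean
se       <- 0.4    # standard error of the predicted mean
df       <- 30     # degrees-of-freedom
level    <- 0.95   # confidence level

alpha <- 1 - level

## quantile of the Student's t-distribution, and of the normal distribution
q.t    <- qt(1 - alpha / 2, df = df)
q.norm <- qnorm(1 - alpha / 2)

## assumed Wald-type confidence intervals
ci.t    <- c(lower = estimate - q.t * se,    upper = estimate + q.t * se)
ci.norm <- c(lower = estimate - q.norm * se, upper = estimate + q.norm * se)

rbind(student = ci.t, normal = ci.norm)
```

With finite degrees-of-freedom the Student's t interval is wider than the normal one, and the two get closer as `df` grows.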

When format="wide", a matrix containing the predicted means (one row per cluster, one column per timepoint).
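
The relation between the two layouts can be illustrated with base R. The sketch below rearranges hypothetical long-format predictions into a cluster-by-timepoint matrix; the column names `id`, `visit`, and `estimate` and the values are assumptions chosen for the example, and this is not the code used by the method.

```r
## Hypothetical long-format predictions, for illustration only
pred.long <- data.frame(id       = rep(1:2, each = 3),
                        visit    = rep(1:3, times = 2),
                        estimate = c(1.0, 1.5, 2.0, 0.8, 1.2, 1.9))

## one row per cluster, one column per timepoint
pred.wide <- tapply(pred.long$estimate,
                    INDEX = list(id = pred.long$id, visit = pred.long$visit),
                    FUN = identity)
pred.wide
```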

Examples

## simulate data in the long format
set.seed(10)
dL <- sampleRem(100, n.times = 3, format = "long")

## fit Linear Mixed Model
eUN.lmm <- lmm(Y ~ visit + X1 + X2 + X5,
               repetition = ~visit|id, structure = "UN", data = dL)

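## static prediction at given covariate values
newd <- data.frame(X1 = 1, X2 = 2, X5 = 3, visit = factor(1:3, levels = 1:3))
predict(eUN.lmm, newdata = newd)
## also output the covariate values
predict(eUN.lmm, newdata = newd, keep.data = TRUE)
## account for estimation and residual variance (prediction intervals)
predict(eUN.lmm, newdata = newd, keep.data = TRUE, se = c(TRUE,TRUE))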

## dynamic prediction
## no outcome observed
newd.d1 <- cbind(newd, Y = c(NA,NA,NA))
predict(eUN.lmm, newdata = newd.d1, keep.data = TRUE, type = "dynamic")
## outcome observed at the first visit only
newd.d2 <- cbind(newd, Y = c(6.61,NA,NA))
predict(eUN.lmm, newdata = newd.d2, keep.data = TRUE, type = "dynamic")
## another observed value at the first visit
newd.d3 <- cbind(newd, Y = c(1,NA,NA))
predict(eUN.lmm, newdata = newd.d3, keep.data = TRUE, type = "dynamic")
## outcomes observed at the first two visits
newd.d4 <- cbind(newd, Y = c(1,1,NA))
predict(eUN.lmm, newdata = newd.d4, keep.data = TRUE, type = "dynamic")

bozenne/repeated documentation built on July 16, 2025, 11:16 p.m.