residuals: Extract The Residuals From a Linear Mixed Model

residualsR Documentation

Extract The Residuals From a Linear Mixed Model

Description

Extract or compute the residuals of a linear mixed model.

Usage

## S3 method for class 'lmm'
residuals(
  object,
  type = "response",
  var = NULL,
  data = NULL,
  p = NULL,
  format = "long",
  keep.data = FALSE,
  simplify = TRUE,
  ...
)

## S3 method for class 'clmm'
residuals(object, ...)

## S3 method for class 'mlmm'
residuals(object, simplify = TRUE, ...)

Arguments

object

a lmm object.

type

[character] type of residual to output: raw residuals ("response"), Pearson residuals ("pearson"), normalized residuals ("normalized", scaled residual "scaled"), or partial residuals ("partial" or "partial-center"). Can also be "all" to output all except partial residuals. See detail section.

var

[character vector] name of the variable relative to which the partial residuals should be computed.

data

[data.frame] dataset relative to which the residuals should be computed. Only relevant if differs from the dataset used to fit the model.

p

[numeric vector] value of the model coefficients at which to evaluate the residuals. Only relevant if differs from the fitted values.

format

[character] Should the residuals be output in a matrix format with clusters in row and timepoints in columns ("wide"), or in a data.frame/vector with as many rows as observations ("long")

keep.data

[logical] Should the dataset relative to which the residuals are evaluated be output along side the residual values? Only possible in the long format.

simplify

[logical] Simplify the data format (vector instead of data.frame) and column names (no mention of the time variable) when possible. Otherwise, information about the call and reference values used for partial residuals be added as an attribute.

...

Not used. For compatibility with the generic method.

Details

The argument type defines how the residuals are computed:

  • "fitted": fitted value X_{ij} \hat{\beta}.

  • "response": raw residual, i.e. observed outcome minus fitted value \varepsilon_{ij} = Y_{ij} - X_{ij} \hat{\beta}.

  • "pearson": each raw residual is divided by its modeled standard deviation \varepsilon_{ij} = \frac{Y_{ij} - X_{ij} \hat{\beta}}{\sqrt{\hat{\omega}_{ij}}}.

  • "studentized": same as "pearson" but excluding the contribution of the cluster in the modeled standard deviation \varepsilon_{ij} = \frac{Y_{ij} - X_{ij} \hat{\beta}}{\sqrt{\hat{\omega}_{ij}-\hat{q}_{ij}}}.

  • "normalized": raw residuals are multiplied, within clusters, by the inverse of the (upper) Cholesky factor of the modeled residual variance covariance matrix \varepsilon_{ij} = ( Y_{i} - X_{i} \hat{\beta} )\hat{C}^{-1}.

  • "normalized2": raw residuals are multiplied, within clusters, by the inverse of the modeled residual variance covariance matrix \varepsilon_{ij} = ( Y_{i} - X_{i} \hat{\beta} )\hat{Omega}^{-1}.

  • "scaled": scaled residuals (see PROC MIXED in SAS). Numerically identical to "normalized" but computed by sequentially scaling and centering the residuals, to make them conditionally independent of previous residuals from the same cluster at previous repetitions.

  • "partial": partial residuals (\gamma E + \hat{\varepsilon}). A reference level can be also be specified via the attribute "reference" to change the absolute level of the partial residuals. "partial-center": partial residuals with centered continuous covariates (\gamma E + \hat{\varepsilon} where E has been centered, i.e., has 0-mean)

where

  • X=(E,W) the design matrix. For partial residuals, it is split according to the variable(s) in argument var (E) and the rest (W).

  • Y the outcome

  • \hat{\beta}=(\hat{\gamma},\hat{\delta}) the estimated mean coefficients relative to X=(E,W)

  • \hat{\Omega} the modeled variance-covariance of the residuals and \hat{\omega} its diagonal elements

  • \hat{C} the upper Cholesky factor of \hat{\Omega}, i.e. upper triangular matrix satisfying \hat{C}^{t} \hat{C} = \hat{\Omega}

  • \hat{Q}_i= X_i (X^{t}\hat{\Omega}X)^{-1}X_i^{t} a cluster specific correction factor, approximating the contribution of cluster i to \hat{\Omega}. Its diagonal elements are denoted \hat{q}_i.

  • \hat{D}_i the upper Cholesky factor of \hat{\Omega}-\hat{Q}_i

Value

lmm: a vector or a data.frame when format="long" (one line per observation, one column per type of residual), a matrix when format="wide" (one line per cluster, one column per timepoint).

Examples


#### simulate data in the long format ####
set.seed(10)
dL <- sampleRem(100, n.times = 3, format = "long")

#### Linear Model ####
e.lm <- lmm(Y ~ visit + X1 + X2 + X6, data = dL)

## partial residuals
residuals(e.lm, type = "partial", var = "X6")
## residuals(e.lm) + dL$X6 * coef(e.lm)["X6"]
e.reslm <- residuals(e.lm, type = "partial", var = "X6", keep.data = TRUE, simplify = FALSE)
plot(e.reslm)

## partial residuals with specific reference
type <- "partial"
attr(type,"reference") <- data.frame(visit=factor(2,1:3),X2=0,X6=3)
residuals(e.lm, type = type, var = "X1")
## residuals(e.lm) + dL$X1 * coef(e.lm)["X1"] + coef(e.lm)["visit2"]

## partial residuals with centered covariates
residuals(e.lm, type = "partial-center", var = "X1")
## residuals(e.lm) + (dL$X1-mean(dL$X1)) * coef(e.lm)["X1"]

#### Linear Mixed Model ####
eUN.lmm <- lmm(Y ~ visit + X1 + X2 + X5 + X6,
               repetition = ~visit|id, structure = "UN", data = dL)

## residuals
e.resL <- residuals(eUN.lmm, type = "normalized",
                    keep.data = TRUE, simplify = FALSE)
plot(e.resL, type = "qqplot")
plot(e.resL, type = "scatterplot", labeller = ggplot2::label_both)
e.resW <- residuals(eUN.lmm, format = "wide", type = "normalized",
                    simplify = FALSE)
plot(e.resW, type = "correlation")

## residuals and predicted values
residuals(eUN.lmm, type = "all")
residuals(eUN.lmm, type = "all", keep.data = TRUE)

## partial residuals
residuals(eUN.lmm, type = "partial", var = c("(Intercept)","X6"))
residuals(eUN.lmm, type = "partial", var = c("X6"))

LMMstar documentation built on Nov. 9, 2023, 1:06 a.m.