lmm: Fit Linear Mixed Model

View source: R/lmm.R

lmmR Documentation

Fit Linear Mixed Model

Description

Fit a linear mixed model defined by a mean and a covariance structure. g

Usage

lmm(
  formula,
  repetition,
  structure,
  data,
  weights = NULL,
  scale.Omega = NULL,
  method.fit = NULL,
  df = NULL,
  type.information = NULL,
  trace = NULL,
  control = NULL
)

Arguments

formula

[formula] Specify the model for the mean. On the left hand side the outcome and on the right hand side the covariates affecting the mean value. E.g. Y ~ Gender + Gene.

repetition

[formula] Specify the structure of the data: the time/repetition variable and the grouping variable, e.g. ~ time|id.

structure

[character] type of covariance structure, either "CS" (compound symmetry) or "UN" (unstructured).

data

[data.frame] dataset (in the long format) containing the observations.

weights

[formula or character] variable in the dataset used to weight the log-likelihood and its derivative. Should be constant within cluster.

scale.Omega

[formula or character] variable in the dataset used to rescale the residual variance-covariance matrix. Should be constant within cluster.

method.fit

[character] Should Restricted Maximum Likelihoood ("REML") or Maximum Likelihoood ("ML") be used to estimate the model parameters?

df

[logical] Should the degree of freedom be computed using a Satterthwaite approximation?

type.information

[character] Should the expected information be computed (i.e. minus the expected second derivative) or the observed inforamtion (i.e. minus the second derivative).

trace

[interger, >0] Show the progress of the execution of the function.

control

[list] Control values for the optimization method. The element optimizer indicates which optimizer to use and additional argument will be pass to the optimizer.

Details

Computation time the lmm has not been developped to be a fast function as, by default, it uses REML estimation with the observed information matrix and uses a Satterthwaite approximation to compute degrees of freedom (this require to compute the third derivative of the log-likelihood which is done by numerical differentiation). The computation time can be substantially reduced by using ML estimation with the expected information matrix and no calculation of degrees of freedom: arguments method.fit="ML", type.information="expected", df=FALSE. This will, however, lead to less accurate p-values and confidence intervals in small samples.

By default, the estimation of the model parameters will be made using a Newton Raphson algorithm. This algorithm does not ensure that the residual covariance matrix is positive definite and therefore may sometimes fail. When using an unstructured pattern, i.e. structure="UN" the nlme::gls may be preferable as it can ensures positve definitness. See argument optimizer in LMMstar.options.

Argument control: when using the optimizer "FS", the following elements can be used

  • init: starting values for the model parameters.

  • n.iter: maximum number of interations of the optimization algorithm.

  • tol.score: score value below which convergence has been reached.

  • tol.param: difference in estimated parameters from two successive iterations below which convergence has been reached.

  • trace: display progress of the optimization procedure.

Argument repetition: when numeric, it will be converted into a factor variable, possibly adding a leading 0 to preserve the ordering. This transformation may cause inconsistency when combining results between different lmm object. This is why the grouping variable should preferably be of type character or factor.

Value

an object of class lmm containing the estimated parameter values, the residuals, and relevant derivatives of the likelihood.

See Also

summary.lmm for a summary of the model fit.
model.tables.lmm for a data.frame containing estimates with their uncertainty.
plot.lmm for a graphical display of the model fit or diagnostic plots.
levels.lmm to display the reference level.
anova.lmm for testing linear combinations of coefficients (F-test, multiple Wald tests)
getVarCov.lmm for extracting estimated residual variance-covariance matrices.
residuals.lmm for extracting residuals or creating residual plots (e.g. qqplots).
predict.lmm for evaluating mean and variance of the outcome conditional on covariates or other outcome values.

Examples

#### 1- simulate data in the long format ####
set.seed(10)
dL <- sampleRem(100, n.times = 3, format = "long")
dL$X1 <- as.factor(dL$X1)
dL$X2 <- as.factor(dL$X2)

#### 2- fit Linear Mixed Model ####
eCS.lmm <- lmm(Y ~ X1 * X2 + X5, repetition = ~visit|id, structure = "CS", data = dL)

logLik(eCS.lmm) ## -670.9439
summary(eCS.lmm)


#### 3- estimates ####
## reference level
levels(eCS.lmm)$reference
## mean parameters
coef(eCS.lmm)
model.tables(eCS.lmm)
confint(eCS.lmm)

if(require(emmeans)){
  dummy.coef(eCS.lmm)
}

## all parameters
coef(eCS.lmm, effects = "all")
model.tables(eCS.lmm, effects = "all")
confint(eCS.lmm, effects = "all")

## variance-covariance structure
sigma(eCS.lmm)

#### 4- diagnostic plots ####
quantile(residuals(eCS.lmm))
quantile(residuals(eCS.lmm, type = "normalized"))

## Not run: 
if(require(ggplot2)){
  ## investigate misspecification of the mean structure
  plot(eCS.lmm, type = "scatterplot")
  ## investigate misspecification of the variance structure
  plot(eCS.lmm, type = "scatterplot2")
  ## investigate misspecification of the correlation structure
  plot(eCS.lmm, type = "correlation")
  ## investigate misspecification of the residual distribution
  plot(eCS.lmm, type = "qqplot")
}

## End(Not run)

#### 5- statistical inference ####
anova(eCS.lmm) ## effect of each variable
anova(eCS.lmm, effects = "X11-X21=0") ## test specific coefficient
## test several hypothese with adjustment for multiple comparisons
summary(anova(eCS.lmm, effects = c("X11=0","X21=0")))

#### 6- prediction ####
## conditional on covariates
newd <- dL[1:3,]
predict(eCS.lmm, newdata = newd, keep.newdata = TRUE)
## conditional on covariates and outcome
newd <- dL[1:3,]
newd$Y[3] <- NA
predict(eCS.lmm, newdata = newd, type = "dynamic", keep.newdata = TRUE)

#### EXTRA ####
if(require(mvtnorm)){
## model for the average over m replicates
## (only works with independent replicates)
Sigma1 <- diag(1,1,1); Sigma5 <- diag(1,5,5)
n <- 100
dfW <- rbind(data.frame(id = 1:n, rep = 5, Y = rowMeans(rmvnorm(n, sigma = Sigma5))),
             data.frame(id = (n+1):(2*n), rep = 1, Y = rmvnorm(n, sigma = Sigma1)))

e.lmmW <- lmm(Y~1, data = dfW, scale.Omega=~rep, control = list(optimizer = "FS"))
e.lmm0 <- lmm(Y~1, data = dfW, control = list(optimizer = "FS"))
model.tables(e.lmmW, effects = "all")
model.tables(e.lmm0, effects = "all")
## TRUE standard error is 1

}


LMMstar documentation built on Jan. 7, 2023, 1:20 a.m.