In CecileProust-Lima/lcmm: Extended Mixed Models Using Latent Classes and Latent Processes

knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

library(lcmm)

Background and definitions

Each dynamic phenomenon can be characterized by a latent process $(\Lambda(t))$ which evolves in continuous time $t$. Sometimes, this latent process is measured through several markers so that the latent process is their common factor. Function multlcmm treats this idea and extend the linear mixed model theory to several markers measuring the same underlying quantity, these markers not being necessarily Gaussian. ## The latent process mixed model for multivariate markers

The latent process mixed model is introduced in Proust-Lima et al. (2006 - https://doi.org/10.1111/j.1541-0420.2006.00573.x, 2013 - https://doi.org/10.1111/bmsp.12000 and 2022 - https://doi.org/10.1016/j.ymeth.2022.01.005 ). The quantity of interest defined as a latent process is modeled according to time using a linear mixed model: $$\Lambda(t) = X(t) \beta + Z(t)u_i +w_i(t)$$ where: - $X(t)$ and $Z(t)$ are vectors of covariates ($Z(t)$ is included in $X(t)$); - $\beta$ are the fixed effects (i.e., population mean effects); - $u_i$ are the random effects (i.e., individual effects); they are distributed according to a zero-mean multivariate normal distribution with covariance matrix $B$; - $(w_i(t))$ is a Gaussian process that might be added in the model to relax the intra-subject correlation structure. This structural model for $\Lambda(t)$ according to time and covariates is exactly the same as in the [univariate case](latent_process_model_with_lcmm.html). Now, instead of defining one equation of observation, we define K equations of observation for the K different markers with $Y_{ijk}$ the observation for subject $i$, marker $k$ and occasion $j$. As in the univariate case, several types of markers can be handled by defining a marker-specific link function $H_{k}$. The marker-specific equation of observation also includes potentially some contrasts $\gamma_k$ on covariates and a marker and subject specific random intercept so that:

$$Y_{ijk} = H_{k}(~ \Lambda(t_{ijk})+ X_{cijk}\gamma_{k} + \alpha_{ik} + \epsilon_{ijk} ~ ; \eta_{k}) $$ where: - $\alpha_{ik} \tilde{} N(0,\sigma_k^2)$ - $X_{cijk}$ vector of covariates - $\gamma_k$ are contrasts (with the sum over $k$ equal to 0) - $t_{ijk}$ the time of measurement for subject $i$, marker $k$ and occasion $j$; - $\epsilon_{ijk}$ an independent Gaussian error with mean zero and variance $\sigma_{\epsilon_k}^2$; - $H_k$ the link function (parameterized by $\eta_k$) that transforms the latent process into the scale and metric of marker $k$. The link functions are the same as in the univariate case (in lcmm). For continuous outcomes, $H^{-1}$ is a parametric family of increasing monotonic functions among: - the linear transformation: this reduces to the linear mixed model (2 parameters) - the Beta cumulative distribution family rescaled (4 parameters) - the basis of quadratic I-splines with m knots (m+2 parameters) For binary and ordinal outcomes, $H_k$ links each level of the variable with an interval of values for the latent process plus measurement errors. This corresponds to the (cumulative) probit model. We describe below a case with continuous link functions. For ordinal markers, see [vignette](latent_process_model_with_multlcmm_IRT.html). ## Identifiability As in any latent variable model, the metric of the latent variable has to be defined. In contrast with lcmm function, here the variance of the first random effect $u_i$ is set to 1 and the mean intercept (in $\beta$) is set to 0. # Example with cognitive process

In this example, we study cognitive trajectory over time when cognition is defined as the common factor underlying three psychometric tests: **MMSE**, **BVRT** and **IST**. Here the timescale is years since entry into the cohort, the trajectory is assumed quadratic in time (both at individual and population level) and the model is adjusted for $age$ at entry. To further investigate the effect of gender, both mean effects on the common factor and differential effects (contrasts) on each marker are included (not in interaction with time in this example).

## Model considered :

$$Y_{ijk} = H_k(~ \beta_{1}t_{ij} + \beta_{2}t_{ij}^2 +\beta_{3}age75_{i} + \beta_{4}male_{i} +\gamma_{k}male_{i} \\ +u_{0i}+u_{1i}t_{ij}+u_{2i}t_{ij}^2+ \alpha_{ik} + \omega_i(t_{ijk}) +\epsilon_{ijk} ~~, \eta_k)$$ *Where : * $u_{i}\tilde{}N(0,B)$ and V($u_{0i}$)=1, $\omega_i(t)$ is a Brownian process, $\alpha_{ik}\tilde{}N(0,\sigma_k^2)$ and for k = 1,2,3: $Y_{ij1}= MMSE_{ij}$ , $Y_{ij2}= IST_{ij}$ and $Y_{ij3}= BVRT_{ij})$ ## Estimation with different link functions We first create the variable time and recenter and scale age to avoid numerical problems wzxhzdk:2 wzxhzdk:3 ### linear link functions By default, all the link functions are set to linear: wzxhzdk:4 ### nonlinear link functions Depending on the nature of the data, some nonlinear link functions may be necessary. Here for instance, the MMSE is highly skewed: wzxhzdk:5 As in the univariate case, Beta CDF or splines can be considered. The link function family can either be the same for all the markers (even if the parameters will be different): wzxhzdk:6 Or the link functions can be chosen differently. For instance, wzxhzdk:7 ### Fixing some transformation parameters Note that the splines transformation may sometimes involve parameters so close to 0 that it entails a lack of convergence (since the parameter is at the boundary of the parameter space). This often happens with MMSE. For instance, in the example below, convergence cannot be reached easily because the third parameter of MMSE transformation is lower than $10^{-4}$. wzxhzdk:8 This problem can be easily dealt with by fixing this parameter with posfix option. To do so, the position of the parameter can be identified from the vector of estimates (21st parameter here): wzxhzdk:9 And the model can be refitted from these estimates and the newly fixed parameter: wzxhzdk:10 With this constraint, the model converges correctly. ## Comparison of the models

Objects `mult_lin`, `mult_beta`, `mult_betaspl`, 'mult_splines2' are multivariate latent process mixed models that assume the exact same trajectory for the underlying latent process but different link functions. As in the univariate case, the models can be compared using information criteria. The `summarytable` give us such information. wzxhzdk:11 Models involving Beta transformations and splines transformations seem to fit a lot better in terms of AIC than the linear transformations showing the departure from normality. The transformations can be plotted and compared between models: wzxhzdk:12 Except from the linear transformations, all the estimation transformations are really close. ## Postfit outputs ### Estimated link functions: Confidence intervals of the link functions can be obtained by the Monte Carlo method: wzxhzdk:13 ### Summary The summary of the model includes convergence, goodness of fit criteria and estimated parameters. wzxhzdk:14 From the estimates, the underlying cognition has a quadratic trajectory over time, older subjects at baseline have systematically lower cognitive level. There is no difference according to gender. However, there are significantly differential effects of gender on psychometric tests (p=0.0003) with a systematically higher BVRT for men and higher IST levels for women. ### Variance explained For multivariate data, the latent process is the common underlying factor of the different markers. Thus, we can compute the residual variance of each marker explained the latent process. This variance explained is conditional on the covariates and computed for a specific time. wzxhzdk:15 For example, the common factor explained 42% of MMSE residual variation while it explained 26% of the BVRT residual variation at time 0. ### Graph of predicted trajectories for the markers As with lcmm function, predicted trajectories of the markers can be computed according to a covariate profile and then plotted. wzxhzdk:16 ### Goodness of fit: graph of the residuals As in any mixed model, we expect the subject-specicif residuals (bottom right panel) to be Gaussian. wzxhzdk:17 ### Goodness of fit: graph of the predictions versus observations The mean predictions and observations can be plotted according to time. Note that the predictions and observations are in the scale of the latent process (observations are transformed with the estimated link functions): wzxhzdk:18 # To go further ... ## heterogeneous profiles of trajectories The latent process mixed model for multivariate markers extends to the heterogeneous case with latent classes. The same strategy as explained with [hlme](latent_class_model_with_hlme.html) can be used. ## joint analysis of a time to event The latent process mixed model for multivariate markers extends to the case of a joint model. This is done in [Jointlcmm when one latent process is involved](joint_latent_class_model_with_Jointlcmm.html) or mpjlcmm when [multivariate latent processes are considered](multivariate_latent_class_model_with_mpjlcmm.html).

CecileProust-Lima/lcmm documentation built on June 13, 2025, 8:43 a.m.

rdrr.io home R language documentation Run R code online

CRAN packages Bioconductor packages R-Forge packages GitHub packages

Note that we can't provide technical support on individual packages. You should contact the package authors for that.

CecileProust-Lima/lcmm
Extended Mixed Models Using Latent Classes and Latent Processes

In CecileProust-Lima/lcmm: Extended Mixed Models Using Latent Classes and Latent Processes

Background and definitions

R Package Documentation

Browse R Packages

We want your feedback!

CecileProust-Lima/lcmm Extended Mixed Models Using Latent Classes and Latent Processes

In CecileProust-Lima/lcmm: Extended Mixed Models Using Latent Classes and Latent Processes

Background and definitions

R Package Documentation

Browse R Packages

We want your feedback!

CecileProust-Lima/lcmm
Extended Mixed Models Using Latent Classes and Latent Processes