Description Usage Arguments Details Value Author(s) References See Also Examples
This function constitutes a multivariate extension of function lcmm
.
It fits multivariate mixed models and multivariate latent class mixed models
for multivariate longitudinal outcomes of different types. It handles
continuous longitudinal outcomes (Gaussian or nonGaussian, curvilinear) as
well as ordinal longitudinal outcomes (with cumulative probit measurement model).
The model assumes that all the outcomes measure the same underlying latent process
defined as their common factor, and each outcome is related to this latent common
factor by a specific parameterized link function. At the latent process level, the
model estimates a standard linear mixed model or a latent class linear mixed
model when heterogeneity in the population is investigated (in the same way
as in functions hlme
and lcmm
). Parameters of the nonlinear link
functions and of the latent process mixed model are estimated simultaneously
using a maximum likelihood method.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67  multlcmm(
fixed,
mixture,
random,
subject,
classmb,
ng = 1,
idiag = FALSE,
nwg = FALSE,
randomY = FALSE,
link = "linear",
intnodes = NULL,
epsY = 0.5,
cor = NULL,
data,
B,
convB = 1e04,
convL = 1e04,
convG = 1e04,
maxiter = 100,
nsim = 100,
prior,
range = NULL,
subset = NULL,
na.action = 1,
posfix = NULL,
partialH = FALSE,
verbose = TRUE,
returndata = FALSE,
methInteg = "QMC",
nMC = NULL,
var.time = NULL
)
mlcmm(
fixed,
mixture,
random,
subject,
classmb,
ng = 1,
idiag = FALSE,
nwg = FALSE,
randomY = FALSE,
link = "linear",
intnodes = NULL,
epsY = 0.5,
cor = NULL,
data,
B,
convB = 1e04,
convL = 1e04,
convG = 1e04,
maxiter = 100,
nsim = 100,
prior,
range = NULL,
subset = NULL,
na.action = 1,
posfix = NULL,
partialH = FALSE,
verbose = TRUE,
returndata = FALSE,
methInteg = "QMC",
nMC = NULL,
var.time = NULL
)

fixed 
a twosided linear formula object for specifying the
fixedeffects in the linear mixed model at the latent process level. The
response outcomes are separated by 
mixture 
a onesided formula object for the classspecific fixed
effects in the latent process mixed model (to specify only for a number of
latent classes greater than 1). Among the list of covariates included in

random 
an optional onesided formula for the randomeffects in the
latent process mixed model. At least one random effect should be included
for identifiability purposes. Covariates with a randomeffect are separated
by 
subject 
name of the covariate representing the grouping structure. 
classmb 
an optional onesided formula describing the covariates in
the classmembership multinomial logistic model. Covariates included are
separated by 
ng 
number of latent classes considered. If 
idiag 
optional logical for the variancecovariance structure of the
randomeffects. If 
nwg 
optional logical of classspecific variancecovariance of the
randomeffects. If 
randomY 
optional logical for including an outcomespecific random
intercept. If 
link 
optional vector of families of parameterized link functions to
estimate (one by outcome). Option "linear" (by default) specifies a linear
link function. Other possibilities include "beta" for estimating a link
function from the family of Beta cumulative distribution functions,
"thresholds" for using a threshold model to describe the correspondence
between each level of an ordinal outcome and the underlying latent process and
"Splines" for approximating the link function by Isplines. For this latter
case, the number of nodes and the nodes location should be also specified.
The number of nodes is first entered followed by 
intnodes 
optional vector of interior nodes. This argument is only required for a Isplines link function with nodes entered manually. 
epsY 
optional definite positive real used to rescale the marker in (0,1) when the beta link function is used. By default, epsY=0.5. 
cor 
optional indicator for inclusion of an autocorrelated Gaussian
process in the latent process linear (latent process) mixed model. Option
"BM" indicates a brownian motion with parameterized variance. Option "AR"
specifies an autoregressive process of order 1 with parameterized variance
and correlation intensity. Each option should be followed by the time
variable in brackets as 
data 
data frame containing the variables named in 
B 
optional specification for the initial values for the parameters.
Three options are allowed: (1) a vector of initial values is entered (the
order in which the parameters are included is detailed in 
convB 
optional threshold for the convergence criterion based on the parameter stability. By default, convB=0.0001. 
convL 
optional threshold for the convergence criterion based on the loglikelihood stability. By default, convL=0.0001. 
convG 
optional threshold for the convergence criterion based on the derivatives. By default, convG=0.0001. 
maxiter 
optional maximum number of iterations for the Marquardt iterative algorithm. By default, maxiter=100. 
nsim 
number of points used to plot the estimated link functions. By default, nsim=100. 
prior 
name of the covariate containing the prior on the latent class membership. The covariate should be an integer with values in 0,1,...,ng. When there is no prior, the value should be 0. When there is a prior for the subject, the value should be the number of the latent class (in 1,...,ng). 
range 
optional vector indicating the range of the outcomes (that is the minimum and maximum). By default, the range is defined according to the minimum and maximum observed values of the outcome. The option should be used only for Beta and Splines transformations. 
subset 
optional vector giving the subset of observations in

na.action 
Integer indicating how NAs are managed. The default is 1 for 'na.omit'. The alternative is 2 for 'na.fail'. Other options such as 'na.pass' or 'na.exclude' are not implemented in the current version. 
posfix 
Optional vector giving the indices in vector B of the parameters that should not be estimated. Default to NULL, all parameters are estimated. 
partialH 
optional logical for Splines link functions only. Indicates whether the parameters of the link functions can be dropped from the Hessian matrix to define convergence criteria. 
verbose 
logical indicating if information about computation should be reported. Default to TRUE. 
returndata 
logical indicating if data used for computation should be returned. Default to FALSE, data are not returned. 
methInteg 
character indicating the type of integration if ordinal outcomes are considered. 'MCO' for ordinary Monte Carlo, 'MCA' for antithetic Monte Carlo, 'QMC' for quasi Monte Carlo. Default to "QMC". 
nMC 
integer, number of Monte Carlo simulations. By default, 1000 points are used if at least one threshold link is specified. 
var.time 
optional character indicating the name of the time variable. 
A. THE PARAMETERIZED LINK FUNCTIONS
multlcmm
function estimates multivariate latent class mixed models
for different types of outcomes by assuming a parameterized link function
for linking each outcome Y_k(t) with the underlying latent common factor
L(t) they measure. To fix the latent process dimension, we chose to
constrain at the latent process level the (first) intercept of the latent
class mixed model at 0 and the standard error of the first random effect at
1.
1. With the "linear" link function, 2 parameters are required for the following transformation (Y(t)  b1)/b2
2. With the "beta" link function, 4 parameters are required for the following transformation: [ h(Y(t)',b1,b2)  b3]/b4 where h is the Beta CDF with canonical parameters c1 and c2 that can be derived from b1 and b2 as c1=exp(b1)/[exp(b2)*(1+exp(b1))] and c2=1/[exp(b2)*(1+exp(b1))], and Y(t)' is the rescaled outcome i.e. Y(t)'= [ Y(t)  min(Y(t)) + epsY ] / [ max(Y(t))  min(Y(t)) +2*epsY ].
3. With the "splines" link function, n+2 parameters are required for the following transformation b_1 + b_2*I_1(Y(t)) + ... + b_n+2 I_n+1(Y(t)), where I_1,...,I_n+1 is the basis of quadratic Isplines. To constraint the parameters to be positive, except for b_1, the program estimates b_k^* (for k=2,...,n+2) so that b_k=(b_k^*)^2. This parameterization may lead in some cases to problems of convergence that we are currently addressing.
4. With the "thresholds" link function for an ordinal outcome with levels 0,...,C, C1 parameters are required for the following transformation: Y(t)=c <=> b_c < L(t) <= b_c+1 with b_0 =  infinity and b_C+1=+infinity. To constraint the parameters to be increasing, except for the first parameter b_1, the program estimates b_k^* (for k=2,...C1) so that b_k=b_k1+(b_k^*)^2.
Details of these parameterized link functions can be found in the papers: ProustLima et al. (Biometrics 2006), ProustLima et al. (BJMSP 2013), ProustLima et al. (arxiv 2021  https://arxiv.org/abs/2109.13064)
B. THE VECTOR OF PARAMETERS B
The parameters in the vector of initial values B
or in the vector of
maximum likelihood estimates best
are included in the following
order: (1) ng1 parameters are required for intercepts in the latent class
membership model, and if covariates are included in classmb
, ng1
paramaters should be entered for each one; (2) for all covariates in
fixed
, one parameter is required if the covariate is not in
mixture
, ng paramaters are required if the covariate is also in
mixture
; When ng=1, the intercept is not estimated and no parameter
should be specified in B
. When ng>1, the first intercept is not
estimated and only ng1 parameters should be specified in B
; (3) for
all covariates included with contrast()
in fixed
, one
supplementary parameter per outcome is required excepted for the last
outcome for which the parameter is not estimated but deduced from the
others; (4) if idiag=TRUE
, the variance of each randomeffect
specified in random
is required excepted the first one (usually the
intercept) which is constrained to 1. (5) if idiag=FALSE
, the
inferior triangular variancecovariance matrix of all the randomeffects is
required excepted the first variance (usually the intercept) which is
constrained to 1. (5) only if nwg=TRUE
and ng
>1, ng1
parameters for classspecific proportional coefficients for the variance
covariance matrix of the randomeffects; (6) if cor
is specified, the
standard error of the Brownian motion or the standard error and the
correlation parameter of the autoregressive process; (7) the standard error
of the outcomespecific Gaussian errors (one per outcome); (8) if
randomY=TRUE
, the standard error of the outcomespecific random
intercept (one per outcome); (9) the parameters of each parameterized link
function: 2 for "linear", 4 for "beta", n+2 for "splines" with n nodes.
C. CAUTIONS REGARDING THE USE OF THE PROGRAM
Some caution should be made when using the program. Convergence criteria are very strict as they are based on the derivatives of the loglikelihood in addition to the parameter and loglikelihood stability. In some cases, the program may not converge and reach the maximum number of iterations fixed at 100. In this case, the user should check that parameter estimates at the last iteration are not on the boundaries of the parameter space.
If the parameters are on the boundaries of the parameter space, the identifiability of the model is critical. This may happen especially with splines parameters that may be too close to 0 (lower boundary) or classmb parameters that are too high or low (perfect classification). When identifiability of some parameters is suspected, the program can be run again from the former estimates by fixing the suspected parameters to their value with option posfix. This usually solves the problem. An alternative is to remove the parameters of the Beta or Splines link function from the inverse of the Hessian with option partialH.
If not, the program should be run again with other initial values, with a higher maximum number of iterations or less strict convergence tolerances.
Specifically when investigating heterogeneity (that is with ng>1): (1) As
the loglikelihood of a latent class model can have multiple maxima, a
careful choice of the initial values is crucial for ensuring convergence
toward the global maximum. The program can be run without entering the
vector of initial values (see point 2). However, we recommend to
systematically enter initial values in B
and try different sets of
initial values. (2) The automatic choice of initial values we provide
requires the estimation of a preliminary linear mixed model. The user should
be aware that first, this preliminary analysis can take time for large
datatsets and second, that the generated initial values can be very not
likely and even may converge slowly to a local maximum. This is the reason
why several alternatives exist. The vector of initial values can be directly
specified in B
the initial values can be generated (automatically or
randomly) from a model with ng=
. Finally, function gridsearch
performs an automatic grid search.
D. NUMERICAL INTEGRATION WITH THE THRESHOLD LINK FUNCTION
When dealing only with continuous outcomes, the computation of the likelihood does not require any numerical integration over the randomeffects, so that the estimation procedure is relatively fast. When at least one ordinal outcome is modeled, a numerical integration over the randomeffects is required in each computation of the individual contribution to the likelihood. This achieved using a MonteCarlo procedure. We allow three options: the standard MonteCarlo simulations, as well as antithetic MonteCarlo and quasi MonteCarlo methods as proposed in Philipson et al (2020).
The list returned is:
ns 
number of grouping units in the dataset 
ng 
number of latent classes 
loglik 
loglikelihood of the model 
best 
vector of parameter estimates in the same order as
specified in 
V 
vector containing the upper triangle matrix of variancecovariance
estimates of 
gconv 
vector of convergence criteria: 1. on the parameters, 2. on the likelihood, 3. on the derivatives 
conv 
status of convergence: =1 if the convergence criteria were satisfied, =2 if the maximum number of iterations was reached, =4 or 5 if a problem occured during optimisation 
call 
the matched call 
niter 
number of Marquardt iterations 
N 
internal information used in related functions 
idiag 
internal information used in related functions 
pred 
table of individual predictions and residuals in the underlying
latent process scale; it includes marginal predictions (pred_m), marginal
residuals (resid_m), subjectspecific predictions (pred_ss) and
subjectspecific residuals (resid_ss) averaged over classes, the transformed
observations in the latent process scale (obs) and finally the
classspecific marginal and subjectspecific predictions (with the number of
the latent class: pred_m_1,pred_m_2,...,pred_ss_1,pred_ss_2,...). If 
pprob 
table of posterior classification and posterior individual classmembership probabilities 
Xnames 
list of covariates included in the model 
predRE 
table containing individual predictions of the randomeffects : a column per randomeffect, a line per subject. 
cholesky 
vector containing the estimates of the Cholesky transformed parameters of the variancecovariance matrix of the randomeffects 
estimlink 
table containing the simulated values of each outcome and the corresponding estimated link function 
epsY 
definite positive reals used to rescale the markers in (0,1) when the beta link function is used. By default, epsY=0.5. 
linktype 
indicators of link function types: 0 for linear, 1 for beta, 2 for splines and 3 for thresholds 
linknodes 
vector of nodes useful only for the 'splines' link functions 
data 
the original data set (if returndata is TRUE) 
Cecile ProustLima and Viviane Philipps
ProustLima C, Philipps V, Liquet B (2017). Estimation of Extended Mixed Models Using Latent Classes and Latent Processes: The R Package lcmm. Journal of Statistical Software, 78(2), 156. doi:10.18637/jss.v078.i02
Proust and JacqminGadda (2005). Estimation of linear mixed models with a mixture of distribution for the randomeffects. Comput Methods Programs Biomed 78: 16573.
Proust, JacqminGadda, Taylor, Ganiayre, and Commenges (2006). A nonlinear model with latent process for cognitive evolution using multivariate longitudinal data. Biometrics 62, 101424.
ProustLima, Dartigues and JacqminGadda (2011). Misuse of the linear mixed model when evaluating risk factors of cognitive decline. Amer J Epidemiol 174(9): 107788.
ProustLima, Amieva, JacqminGadda (2013). Analysis of multivariate mixed longitudinal data: A flexible latent process approach. Br J Math Stat Psychol 66(3): 47087.
Commenges, ProustLima, Samieri, Liquet (2012). A universal approximate crossvalidation criterion and its asymptotic distribution, Arxiv.
Philipson, Hickey, Crowther, KolamunnageDona (2020). Faster Monte Carlo estimation of semiparametric joint models of timetoevent and multivariate longitudinal data. Computational Statistics & Data Analysis 151.
ProustLima, Philipps, Perrot, Blanchin, Sebille (2021). Modeling repeated selfreported outcome data: a continuoustime longitudinal Item Response Theory model. https://arxiv.org/abs/2109.13064
postprob
, plot.multlcmm
, predictL
,
predictY
lcmm
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42  ## Not run:
# Latent process mixed model for two curvilinear outcomes. Link functions are
# aproximated by Isplines, the first one has 3 nodes (i.e. 1 internal node 8),
# the second one has 4 nodes (i.e. 2 internal nodes 12,25)
m1 < multlcmm(Ydep1+Ydep2~1+Time*X2+contrast(X2),random=~1+Time,
subject="ID",randomY=TRUE,link=c("4manualsplines","3manualsplines"),
intnodes=c(8,12,25),data=data_lcmm)
# to reduce the computation time, the same model is estimated using
# a vector of initial values
m1 < multlcmm(Ydep1+Ydep2~1+Time*X2+contrast(X2),random=~1+Time,
subject="ID",randomY=TRUE,link=c("4manualsplines","3manualsplines"),
intnodes=c(8,12,25),data=data_lcmm,
B=c(1.071, 0.192, 0.106, 0.005, 0.193, 1.012, 0.870, 0.881,
0.000, 0.000, 7.520, 1.401, 1.607 , 1.908, 1.431, 1.082,
7.528, 1.135 , 1.454 , 2.328, 1.052))
# output of the model
summary(m1)
# estimated link functions
plot(m1,which="linkfunction")
# variation percentages explained by linear mixed regression
VarExpl(m1,data.frame(Time=0))
#### Heterogeneous latent process mixed model with linear link functions
#### and 2 latent classes of trajectory
m2 < multlcmm(Ydep1+Ydep2~1+Time*X2,random=~1+Time,subject="ID",
link="linear",ng=2,mixture=~1+Time,classmb=~1+X1,data=data_lcmm,
B=c( 18,20.77,1.16,1.41,1.39,0.32,0.16,0.26,1.69,1.12,1.1,10.8,
1.24,24.88,1.89))
# summary of the estimation
summary(m2)
# posterior classification
postprob(m2)
# longitudinal predictions in the outcomes scales for a given profile of covariates
newdata < data.frame(Time=seq(0,5,length=100),X1=0,X2=0,X3=0)
predGH < predictY(m2,newdata,var.time="Time",methInteg=0,nsim=20)
head(predGH)
## End(Not run)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.