joint: Fit joint model for survival and longitudinal data measured...

View source: R/joint.R

jointR Documentation

Fit joint model for survival and longitudinal data measured with error


This generic function fits a joint model with random latent association, building on the formulation described in Wulfsohn and Tsiatis (1997) while allowing for the presence of longitudinal and survival covariates, and three choices for the latent process. The link between the longitudinal and survival processes can be proportional or separate. When failure is attributable to 2 separate causes, a competing risks joint model is fitted as per Williamson et al. (2008).


  model = c("intslope", "int", "quad"),
  sepassoc = FALSE,
  longsep = FALSE,
  survsep = FALSE,
  verbose = FALSE



an object of class jointdata containing the variables named in the formulae arguments.


a formula object with the response variable, and the covariates to include in the longitudinal sub-model.


a formula object with the survival time, censoring indicator and the covariates to include in the survival sub-model. The response must be a survival object as returned by the Surv function.


a character string specifying the type of latent association. This defaults to the intercept and slope version as seen in Wulfsohn and Tsiatis (1997). For association via the random intercept only, choose model = "int", whereas for a quadratic association, use model = "quad". Computing times are commensurate with the type of association structure chosen.


logical value: if TRUE then the joint model is fitted with separate association, see Details.


logical value: if TRUE, parameter estimates and log-likelihood from a separate linear mixed model analysis of the longitudinal data (see the lme function for details) are returned.


if TRUE, parameter estimates and log-likelihood from a separate analysis of the survival data using the Cox proportional hazards model are returned (see coxph function for details).


the number of quadrature points across which the integration with respect to the random effects will be performed. Defaults to gpt = 3 which produces stable estimates in most datasets.


the number of quadrature points which the log-likelihood is evaluated over following a model fit. This defaults to lgpt = 10, though lgpt = 3 is often sufficient.

the maximum number of iterations of the EM algorithm that the function will perform. Defaults to = 200, though more iterations may be necessary for large, complex data.


the tolerance level before convergence of the algorithm is deemed to have occurred. Default value is tol = 0.001.


if TRUE, the parameter estimates at each iteration of the EM algorithm are printed. Default is verbose = FALSE.


The joint function fits a joint model to survival and longitudinal data. The formulation is similar to Wulfsohn and Tsiatis (1997). A linear mixed effects model is assumed for the longitudinal data, namely

Y_i = X_{i1}(t_i)^Tβ_1 + D_i(t_i)^T U_i + ε_i,

where U_i is a vector of random effects, (U_{0i}, … U_{qi}) whose length depends on the model chosen, i.e. q = 1 for the random intercept model. D_i is the random effects covariate matrix, which will be time-dependent for all but the random intercept model. X_{i1} is the longitudinal design matrix for unit i, and t_i is the vector of measurement times for subject i. Measurement error is represented by ε_i.

The Cox proportional hazards model is adopted for the survival data, namely

λ(t) = λ_0(t) \exp\{{X_{i2}(t)^T β_2 + D_i(t)(γ^T U_i)}\}.

The parameter γ determines the level of association between the two processes. For the intercept and slope model with separate association we have

D_i(t) (γ^T U_i) = γ_0 U_{0i} + γ_1 U_{1i} t,

whereas under proportional association

D_i(t)(γ^T U_i) = γ (U_{0i} + U_{1i} t).

X_{i2} is the vector of survival covariates for unit i. The baseline hazard function is λ_0(t).

The function uses an EM algorithm to estimate parameters in the joint model. Starting values are provided by calls to standard R functions lme and coxph for the longitudinal and survival components, respectively.


A list containing the parameter estimates from the joint model and, if required, from either or both of the separate analyses. The combined log-likelihood from a separate analysis and the log-likelihood from the joint model are also produced as part of the fit.

Competing risks

If failure can be attributed to 2 causes, i.e. so-called competing risks events data, then a cause-specific hazards model is adopted, namely

λ_g(t) = λ_{0g}(t) \exp\{{X_{i2}(t)^T β_2^{(g)} + D_i(t)(γ^T U_i)}\},

where g = 1,2 denotes the failure type, β_2^{(g)} (g = 1,2) are cause-specific hazard parameters corresponding to the same covariates, and λ_{0g}(t) are cause-specific baseline hazard functions. For this data, a proportional association structure is assumed (i.e. sepassoc = FALSE) and a random-intercepts and random-slopes model must be used (i.e. model = "intslope"). Note that the function only permits 2 failure types. The model is specified in full by Williamson et al. (2008). The function joint() automatically detects whether competing risks are present by counting the number of unique components in the event column on the event time data.

Separate models

Both longsep and survsep ignore any latent association (i.e. γ = 0) between the longitudinal and survival processes but their output can be used to compare with the results from the joint model. If interest is solely in the individual processes then the user should instead make use of the functions lme and coxph mentioned above. Furthermore, if interest is in the separate effect of each random effect (this is for intercept and slope or quadratic models only) upon the survival data, the user should set sepassoc = TRUE.


Since numerical integration is required, it is advisable to check the stability of the maximum likelihood estimates with an increasing number of Gauss-Hermite quadrature points. joint() uses gpt = 3 by default, as this has been adequate for many datasets. However, for certain datasets and models, this might be too small.


Pete Philipson


Wulfsohn MS, Tsiatis AA. A joint model for survival and longitudinal data measured with error. Biometrics. 1997; 53(1): 330-339.

Henderson R, Diggle PJ, Dobson A. Joint modelling of longitudinal measurements and event time data. Biostatistics. 2000; 1(4): 465-480.

Williamson PR, Kolamunnage-Dona R, Philipson P, Marson AG. Joint modelling of longitudinal and competing risks data. Stat Med. 2008; 27: 6426-6438.

See Also

lme, coxph, jointdata, jointplot.


## Standard joint model

heart.surv <- UniqueVariables(heart.valve,
                              var.col = c("fuyrs", "status"),
                              id.col = "num")
heart.long <- heart.valve[, c("num", "time", "log.lvmi")]
heart.cov <- UniqueVariables(heart.valve,
                             c("age", "hs", "sex"),
                             id.col = "num")
heart.valve.jd <- jointdata(longitudinal = heart.long,
                            baseline = heart.cov,
                            survival = heart.surv,
                            id.col = "num",
                            time.col = "time")
fit <- joint(data = heart.valve.jd,
             long.formula = log.lvmi ~ 1 + time + hs,
             surv.formula = Surv(fuyrs, status) ~ hs,
             model = "intslope")
## Competing risks joint model (same data as Williamson et al. 2008)

## Not run: 
epileptic$interaction <- with(epileptic, time * (treat == "LTG"))
longitudinal <- epileptic[, c(1:3, 13)]
survival <- UniqueVariables(epileptic, c(4, 6), "id")
baseline <- UniqueVariables(epileptic, "treat", "id")
data <- jointdata(longitudinal = longitudinal,
                  survival = survival,
                  baseline = baseline,
                  id.col = "id",
                  time.col = "time")
fit2 <- joint(data = data,
              long.formula = dose ~ time + treat + interaction,
              surv.formula = Surv(with.time, with.status2) ~ treat,
              longsep = FALSE, survsep = FALSE,
              gpt = 3)

## End(Not run)

joineR documentation built on Jan. 23, 2023, 5:39 p.m.