Description Usage Arguments Details Value See Also Examples

Fits a shared parameter joint model for longitudinal and time-to-event (e.g. survival) data under a Bayesian framework using Stan.

1 2 3 4 5 6 7 8 9 10 11 | ```
stan_jm(formulaLong, dataLong, formulaEvent, dataEvent, time_var, id_var,
family = gaussian, assoc = "etavalue", lag_assoc = 0, grp_assoc,
epsilon = 1e-05, basehaz = c("bs", "weibull", "piecewise"),
basehaz_ops, qnodes = 15, init = "prefit", weights,
priorLong = normal(), priorLong_intercept = normal(),
priorLong_aux = cauchy(0, 5), priorEvent = normal(),
priorEvent_intercept = normal(), priorEvent_aux = cauchy(),
priorEvent_assoc = normal(), prior_covariance = lkj(),
prior_PD = FALSE, algorithm = c("sampling", "meanfield", "fullrank"),
adapt_delta = NULL, max_treedepth = 10L, QR = FALSE,
sparse = FALSE, ...)
``` |

`formulaLong` |
A two-sided linear formula object describing both the
fixed-effects and random-effects parts of the longitudinal submodel,
similar in vein to formula specification in the | |||||||||

`dataLong` |
A data frame containing the variables specified in
| |||||||||

`formulaEvent` |
A two-sided formula object describing the event
submodel. The left hand side of the formula should be a | |||||||||

`dataEvent` |
A data frame containing the variables specified in
| |||||||||

`time_var` |
A character string specifying the name of the variable
in | |||||||||

`id_var` |
A character string specifying the name of the variable in
| |||||||||

`family` |
The family (and possibly also the link function) for the
longitudinal submodel(s). See | |||||||||

`assoc` |
A character string or character vector specifying the joint
model association structure. Possible association structures that can
be used include: "etavalue" (the default); "etaslope"; "etaauc";
"muvalue"; "muslope"; "muauc"; "shared_b"; "shared_coef"; or "null".
These are described in the | |||||||||

`lag_assoc` |
A non-negative scalar specifying the time lag that should be
used for the association structure. That is, the hazard of the event at
time | |||||||||

`grp_assoc` |
Character string specifying the method for combining information
across lower level units clustered within an individual when forming the
association structure. This is only relevant when a grouping factor is
specified in | |||||||||

`epsilon` |
The half-width of the central difference used to numerically
calculate the derivate when the | |||||||||

`basehaz` |
A character string indicating which baseline hazard to use
for the event submodel. Options are a B-splines approximation estimated
for the log baseline hazard ( | |||||||||

`basehaz_ops` |
A named list specifying options related to the baseline
hazard. Currently this can include: `df` A positive integer specifying the degrees of freedom for the B-splines if `basehaz = "bs"` , or the number of intervals used for the piecewise constant baseline hazard if`basehaz = "piecewise"` . The default is 6.`knots` An optional numeric vector specifying the internal knot locations for the B-splines if `basehaz = "bs"` , or the internal cut-points for defining intervals of the piecewise constant baseline hazard if`basehaz = "piecewise"` . Knots cannot be specified if`df` is specified. If not specified, then the default is to use`df - 4` knots if`basehaz = "bs"` , or`df - 1` knots if`basehaz = "piecewise"` , which are placed at equally spaced percentiles of the distribution of observed event times.
| |||||||||

`qnodes` |
The number of nodes to use for the Gauss-Kronrod quadrature that is used to evaluate the cumulative hazard in the likelihood function. Options are 15 (the default), 11 or 7. | |||||||||

`init` |
The method for generating the initial values for the MCMC.
The default is | |||||||||

`weights` |
Experimental and should be used with caution. The user can optionally supply a 2-column data frame containing a set of 'prior weights' to be used in the estimation process. The data frame should contain two columns: the first containing the IDs for each individual, and the second containing the corresponding weights. The data frame should only have one row for each individual; that is, weights should be constant within individuals. | |||||||||

`priorLong, priorEvent, priorEvent_assoc` |
The prior distributions for the regression coefficients in the longitudinal submodel(s), event submodel, and the association parameter(s). Can be a call to one of the various functions provided by rstanarm for specifying priors. The subset of these functions that can be used for the prior on the coefficients can be grouped into several "families":
See the priors help page for details on the families and
how to specify the arguments for all of the functions in the table above.
To omit a prior —i.e., to use a flat (improper) uniform prior—
| |||||||||

`priorLong_intercept, priorEvent_intercept` |
The prior distributions
for the intercepts in the longitudinal submodel(s) and event submodel.
Can be a call to
| |||||||||

`priorLong_aux` |
The prior distribution for the "auxiliary" parameters
in the longitudinal submodels (if applicable).
The "auxiliary" parameter refers to a different parameter
depending on the
If fitting a multivariate joint model, you have the option to specify a list of prior distributions, however the elements of the list that correspond to any longitudinal submodel which does not have an auxiliary parameter will be ignored. | |||||||||

`priorEvent_aux` |
The prior distribution for the "auxiliary" parameters
in the event submodel. The "auxiliary" parameters refers to different
parameters depending on the baseline hazard. For | |||||||||

`prior_covariance` |
Cannot be | |||||||||

`prior_PD` |
A logical scalar (defaulting to | |||||||||

`algorithm` |
A string (possibly abbreviated) indicating the
estimation approach to use. Can be | |||||||||

`adapt_delta` |
Only relevant if | |||||||||

`max_treedepth` |
A positive integer specifying the maximum treedepth
for the non-U-turn sampler. See the | |||||||||

`QR` |
A logical scalar defaulting to | |||||||||

`sparse` |
A logical scalar (defaulting to | |||||||||

`...` |
Further arguments passed to the function in the rstan
package ( |

The `stan_jm`

function can be used to fit a joint model (also
known as a shared parameter model) for longitudinal and time-to-event data
under a Bayesian framework. The underlying
estimation is carried out using the Bayesian C++ package Stan
(http://mc-stan.org/).

The joint model may be univariate (with only one longitudinal submodel) or
multivariate (with more than one longitudinal submodel).
For the longitudinal submodel a (possibly multivariate) generalised linear
mixed model is assumed with any of the `family`

choices
allowed by `glmer`

. If a multivariate joint model is specified
(by providing a list of formulas in the `formulaLong`

argument), then
the multivariate longitudinal submodel consists of a multivariate generalized
linear model (GLM) with group-specific terms that are assumed to be correlated
across the different GLM submodels. That is, within
a grouping factor (for example, patient ID) the group-specific terms are
assumed to be correlated across the different GLM submodels. It is
possible to specify a different outcome type (for example a different
family and/or link function) for each of the GLM submodels, by providing
a list of `family`

objects in the `family`

argument. Multi-level
clustered data are allowed, and that additional clustering can occur at a
level higher than the individual-level (e.g. patients clustered within
clinics), or at a level lower than the individual-level (e.g. tumor lesions
clustered within patients). If the clustering occurs at a level lower than
the individual, then the user needs to indicate how the lower level
clusters should be handled when forming the association structure between
the longitudinal and event submodels (see the `grp_assoc`

argument
described above).

For the event submodel a parametric
proportional hazards model is assumed. The baseline hazard can be estimated
using either a cubic B-splines approximation (`basehaz = "bs"`

, the
default), a Weibull distribution (`basehaz = "weibull"`

), or a
piecewise constant baseline hazard (`basehaz = "piecewise"`

).
If the B-spline or piecewise constant baseline hazards are used,
then the degrees of freedom or the internal knot locations can be
(optionally) specified. If
the degrees of freedom are specified (through the `df`

argument) then
the knot locations are automatically generated based on the
distribution of the observed event times (not including censoring times).
Otherwise internal knot locations can be specified
directly through the `knots`

argument. If neither `df`

or
`knots`

is specified, then the default is to set `df`

equal to 6.
It is not possible to specify both `df`

and `knots`

.

Time-varying covariates are allowed in both the
longitudinal and event submodels. These should be specified in the data
in the same way as they normally would when fitting a separate
longitudinal model using `lmer`

or a separate
time-to-event model using `coxph`

. These time-varying
covariates should be exogenous in nature, otherwise they would perhaps
be better specified as an additional outcome (i.e. by including them as an
additional longitudinal outcome in the joint model).

Bayesian estimation of the joint model is performed via MCMC. The Bayesian
model includes independent priors on the
regression coefficients for both the longitudinal and event submodels,
including the association parameter(s) (in much the same way as the
regression parameters in `stan_glm`

) and
priors on the terms of a decomposition of the covariance matrices of the
group-specific parameters.
See `priors`

for more information about the priors distributions
that are available.

Gauss-Kronrod quadrature is used to numerically evaluate the integral
over the cumulative hazard in the likelihood function for the event submodel.
The accuracy of the numerical approximation can be controlled using the
number of quadrature nodes, specified through the `qnodes`

argument. Using a higher number of quadrature nodes will result in a more
accurate approximation.

The association structure for the joint model can be based on any of the following parameterisations:

current value of the linear predictor in the longitudinal submodel (

`"etavalue"`

)first derivative (slope) of the linear predictor in the longitudinal submodel (

`"etaslope"`

)the area under the curve of the linear predictor in the longitudinal submodel (

`"etaauc"`

)current expected value of the longitudinal submodel (

`"muvalue"`

)the area under the curve of the expected value from the longitudinal submodel (

`"muauc"`

)shared individual-level random effects (

`"shared_b"`

)shared individual-level random effects which also incorporate the corresponding fixed effect as well as any corresponding random effects for clustering levels higher than the individual) (

`"shared_coef"`

)interactions between association terms and observed data/covariates (

`"etavalue_data"`

,`"etaslope_data"`

,`"muvalue_data"`

,`"muslope_data"`

). These are described further below.interactions between association terms corresponding to different longitudinal outcomes in a multivariate joint model (

`"etavalue_etavalue(#)"`

,`"etavalue_muvalue(#)"`

,`"muvalue_etavalue(#)"`

,`"muvalue_muvalue(#)"`

). These are described further below.no association structure (equivalent to fitting separate longitudinal and event models) (

`"null"`

or`NULL`

)

More than one association structure can be specified, however,
not all possible combinations are allowed.
Note that for the lagged association structures baseline values (time = 0)
are used for the instances
where the time lag results in a time prior to baseline. When using the
`"etaauc"`

or `"muauc"`

association structures, the area under
the curve is evaluated using Gauss-Kronrod quadrature with 15 quadrature
nodes. By default, `"shared_b"`

and `"shared_coef"`

contribute
all random effects to the association structure; however, a subset of the
random effects can be chosen by specifying their indices between parentheses
as a suffix, for example, `"shared_b(1)"`

or `"shared_b(1:3)"`

or
`"shared_b(1,2,4)"`

, and so on.

In addition, several association terms (`"etavalue"`

, `"etaslope"`

,
`"muvalue"`

, `"muslope"`

) can be interacted with observed
data/covariates. To do this, use the association term's main handle plus a
suffix of `"_data"`

then followed by the model matrix formula in
parentheses. For example if we had a variable in our dataset for gender
named `sex`

then we might want to obtain different estimates for the
association between the current slope of the marker and the risk of the
event for each gender. To do this we would specify
`assoc = c("etaslope", "etaslope_data(~ sex)")`

.

It is also possible, when fitting a multivariate joint model, to include
interaction terms between the association terms themselves (this only
applies for interacting `"etavalue"`

or `"muvalue"`

). For example,
if we had a joint model with two longitudinal markers, we could specify
`assoc = list(c("etavalue", "etavalue_etavalue(2)"), "etavalue")`

.
The first element of list says we want to use the value of the linear
predictor for the first marker, as well as it's interaction with the
value of the linear predictor for the second marker. The second element of
the list says we want to also include the expected value of the second marker
(i.e. as a "main effect"). Therefore, the linear predictor for the event
submodel would include the "main effects" for each marker as well as their
interaction.

There are additional examples in the **Examples** section below.

A stanjm object is returned.

`stanreg-objects`

, `stanmvreg-methods`

,
`print.stanmvreg`

, `summary.stanmvreg`

,
`posterior_traj`

, `posterior_survfit`

,
`posterior_predict`

, `posterior_interval`

,
`pp_check`

, `ps_check`

, `stan_mvmer`

.

1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 | ```
#####
# Univariate joint model, with association structure based on the
# current value of the linear predictor
f1 <- stan_jm(formulaLong = logBili ~ year + (1 | id),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
time_var = "year",
# this next line is only to keep the example small in size!
chains = 1, cores = 1, seed = 12345, iter = 1000)
print(f1)
summary(f1)
#####
# Univariate joint model, with association structure based on the
# current value and slope of the linear predictor
f2 <- stan_jm(formulaLong = logBili ~ year + (year | id),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
assoc = c("etavalue", "etaslope"),
time_var = "year",
chains = 1, cores = 1, seed = 12345, iter = 1000)
print(f2)
#####
# Univariate joint model, with association structure based on the
# lagged value of the linear predictor, where the lag is 2 time
# units (i.e. 2 years in this example)
f3 <- stan_jm(formulaLong = logBili ~ year + (1 | id),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
time_var = "year",
assoc = "etavalue", lag_assoc = 2,
chains = 1, cores = 1, seed = 12345, iter = 1000)
print(f3)
#####
# Univariate joint model, where the association structure includes
# interactions with observed data. Here we specify that we want to use
# an association structure based on the current value of the linear
# predictor from the longitudinal submodel (i.e. "etavalue"), but we
# also want to interact this with the treatment covariate (trt) from
# pbcLong data frame, so that we can estimate a different association
# parameter (i.e. estimated effect of log serum bilirubin on the log
# hazard of death) for each treatment group
f4 <- stan_jm(formulaLong = logBili ~ year + (1 | id),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
time_var = "year",
assoc = c("etavalue", "etavalue_data(~ trt)"),
chains = 1, cores = 1, seed = 12345, iter = 1000)
print(f4)
######
# Multivariate joint model, with association structure based
# on the current value and slope of the linear predictor in the
# first longitudinal submodel and the area under the marker
# trajectory for the second longitudinal submodel
mv1 <- stan_jm(
formulaLong = list(
logBili ~ year + (1 | id),
albumin ~ sex + year + (year | id)),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
assoc = list(c("etavalue", "etaslope"), "etaauc"),
time_var = "year",
chains = 1, cores = 1, seed = 12345, iter = 100)
print(mv1)
#####
# Multivariate joint model, where the association structure is formed by
# including the expected value of each longitudinal marker (logBili and
# albumin) in the linear predictor of the event submodel, as well as their
# interaction effect (i.e. the interaction between the two "etavalue" terms).
# Note that whether such an association structure based on a marker by
# marker interaction term makes sense will depend on the context of your
# application -- here we just show it for demostration purposes).
mv2 <- stan_jm(
formulaLong = list(
logBili ~ year + (1 | id),
albumin ~ sex + year + (year | id)),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
assoc = list(c("etavalue", "etavalue_etavalue(2)"), "etavalue"),
time_var = "year",
chains = 1, cores = 1, seed = 12345, iter = 100)
#####
# Multivariate joint model, with one bernoulli marker and one
# Gaussian marker. We will artificially create the bernoulli
# marker by dichotomising log serum bilirubin
pbcLong$ybern <- as.integer(pbcLong$logBili >= mean(pbcLong$logBili))
mv3 <- stan_jm(
formulaLong = list(
ybern ~ year + (1 | id),
albumin ~ sex + year + (year | id)),
dataLong = pbcLong,
formulaEvent = Surv(futimeYears, death) ~ sex + trt,
dataEvent = pbcSurv,
family = list(binomial, gaussian),
time_var = "year",
chains = 1, cores = 1, seed = 12345, iter = 1000)
``` |

rstanarm documentation built on Oct. 4, 2019, 1:04 a.m.

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.