# jm_bamlss: Fit Flexible Additive Joint Models In bamlss: Bayesian Additive Models for Location Scale and Shape (and Beyond)

## Description

Family object to fit a flexible additive joint model for longitudinal and survival data under a Bayesian approach as presented in Koehler et al. (2016). All parts of the joint model can be specified as structured additive predictors. See the details and examples.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 ## JM family object. jm_bamlss(...) ## "bamlss.frame" transformer function ## to set up joint models. jm.transform(x, y, data, terms, knots, formula, family, subdivisions = 25, timedependent = c("lambda", "mu", "alpha", "dalpha"), timevar = NULL, idvar = NULL, alpha = .Machine$double.eps, mu = NULL, sigma = NULL, sparse = TRUE, ...) ## Posterior mode optimizing engine. jm.mode(x, y, start = NULL, weights = NULL, offset = NULL, criterion = c("AICc", "BIC", "AIC"), maxit = c(100, 1), nu = c("lambda" = 0.1, "gamma" = 0.1, "mu" = 0.1, "sigma" = 0.1, "alpha" = 0.1, "dalpha" = 0.1), update.nu = TRUE, eps = 0.0001, alpha.eps = 0.001, ic.eps = 1e-08, nback = 40, verbose = TRUE, digits = 4, ...) ## Sampler function. jm.mcmc(x, y, family, start = NULL, weights = NULL, offset = NULL, n.iter = 1200, burnin = 200, thin = 1, verbose = TRUE, digits = 4, step = 20, ...) ## Predict function, set to default in jm_bamlss(). jm.predict(object, newdata, type = c("link", "parameter", "probabilities", "cumhaz"), dt, steps, id, FUN = function(x) { mean(x, na.rm = TRUE) }, subdivisions = 100, cores = NULL, chunks = 1, verbose = FALSE, ...) ## Survival plot. jm.survplot(object, id = 1, dt = NULL, steps = 10, points = TRUE, rug = !points)  ## Arguments  x The x list, as returned from function bamlss.frame (and transformed by function jm.transform()), holding all model matrices and other information that is used for fitting the model. y The model response, as returned from function bamlss.frame. data A data.frame or list containing the model response variable(s) and covariates specified in the formula in long format. By default the variables are taken from environment(formula): typically the environment from which bamlss is called. terms The corresponding terms.bamlss object needed for processing. knots An optional list containing user specified knots, see the documentation of function gam. formula The corresponding bamlss.formula. family The bamlss.family object. subdivisions How many time points should be created for each individual. timedependent A character vector specifying the names of parameters in x that are time-dependent. Time grid design matrices are only computed for these parameters. timevar A character specifying the name of the survival time variable in the data set. idvar Depending on the type of data set, this is the name of the variable specifying identifier of individuals. alpha Numeric, a starting value for the intercept of the association parameter alpha. mu Numeric, a starting value for the intercept of the mu parameter. sigma Numeric, a starting value for the intercept of the sigma parameter. sparse Logical, should sparse matrix structures be used for updating and sampling of mu parameter model terms? start A named numeric vector containing possible starting values, the names are based on function parameters. weights Currently not supported. offset Currently not supported. criterion Information criterion to be used, e.g., for smoothing variance selection. Options are the corrected AIC "AICc" (see Details), the "BIC" and "AIC". Defaults to "AICc"? maxit Vector containing the maximum number of iterations for the backfitting algorithm with maxit[1] defining the iterations for the full model and maxit[2] the iterations within each predictor. maxit[2] defaults to 1 if only one value is specified. nu Vector of step lengths for parameter updates of one Newton-Raphson update for each predictor of the joint model. update.nu Should the updating step length be optimized in each iteration of the backfitting algorithm? Uses nu as starting value if set to TRUE. eps The relative convergence tolerance of the backfitting algorithm. alpha.eps The relative convergence tolerance of the backfitting algorithm for predictor alpha. ic.eps The relative convergence tolerance of the information criterion used, e.g., for smoothing variance selection. nback For computing ic.eps, how many iterations back should be included when computing relative convergence tolerance of the information criterion. verbose Print information during runtime of the algorithm. digits Set the digits for printing when verbose = TRUE. n.iter the number of MCMC iterations. burnin the burn-in phase of the sampler, i.e., the number of starting samples that should be removed. thin the thinning parameter for MCMC simulation. E.g., thin = 10 means, that only every 10th sampled parameter will be stored. step How many times should algorithm runtime information be printed, divides n.iter. object A "bamlss" object processed with the JM optimizer function jm.mode() ans/or sampler function jm.mcmc() for which the survival plot should be created. newdata Dataset for which to create predictions. Not needed for conditional survival probabilities. type Character string indicating which type of predictions to compute. link returns estimates for all predictors with the respective link functions applied, "parameter" returns the estimates for all pedictors, "probabilities" returns the survival probabilities conditional on the survival up to the last longitudinal measurement, and "cumhaz" return the cumulative hazard up to the survival time or for a time window after the last longitudinal measurement. id Integer or character, that specifies the individual for which the plot should be created. dt The time window after the last observed measurement for which predictions should be computed. The default is 0.4 * max(obstime) and obstime are the individual's longitudinal measurement times. steps Integer, the number of steps for which to evaluate the conditional survival probability up to dt. FUN A function that should be applied on the samples of predictors or parameters, depending on argument type. cores Specifies the number of cores that should be used for prediction. Note that this functionality is based on the parallel package. chunks Should computations be split into chunks? Prediction is then processed sequentially. points Should longitudinal observations be added to the plot. rug Should longitudinal observed time points be added on the x-axis to the plot. ... Currently not used. ## Details We refer to the paper of Koehler et al. (2016) for details on the flexible additive joint model. In short, we model the hazard of subject i an event at time t as h_{i}(t) = \exp [η_{λ i}(t) + η_{γ i}+η_{α i}(t) \cdot η_{μ i}(t)] with predictor η_{λ} for all survival covariates that are time-varying or have a time-varying coefficient (including the log baseline hazard), predictor η_{γ} for baseline survival covariates, predictor η_{α} representing the potentially time-varying association between the longitudinal marker η_{μ} and the hazard. The longitudinal response y_{ij} at time points t_{ij} is modeled as y_{ij}=η_{μ i}(t_{ij})+e_{ij} with independent normal errors N(0, \exp[η_{σ i}(t_{ij})]^2). Each predictor η_{ki} is a structured additive predictor, i.e. a sum of functions of covariates η_{ki} = ∑_{m=1}^{M_k} f_{km}(\bm{x}_{ki}). Each of these functions can be modeled parametrically or using basis function evaluations from the smooth constructors in mgcv such as s, te and ti and can include smooth time-varying, random or spatial effects. For the Bayesian estimation of these effects we specify corresponding priors: For linear or parametric terms we use vague normal priors, smooth and random effect terms are regularized by placing generic multivariate normal priors on the coefficients and for anisotropic smooths, when multiple smoothing variance parameters are involved, more complex prior are in place (cf. Koehler et al., 2016). We use inverse Gamma hyper-priors, i.e. IG(0.001, 0.001) to obtain an inverse Gamma full conditional for the variance parameters. We estimate the posterior mode by maximizing the log-posterior of the model using a Newton-Raphson procedure, the posterior mean is obtained via derivative-based Metropolis-Hastings sampling. We recommend to use posterior mode estimates for a quick model assessment. In order to draw correct inferences from the model, posterior mean estimates should be computed. We approximate integration in the survival part of the likelihood using trapezoidal rule. For posterior mode estimation. ## Note Please note that in order to make use of the sparse setup in fitting random slopes (cf. first example), no observation time should be exactly 0. ## References Koehler M, Umlauf N, Beyerlein, A., Winkler, C. Ziegler, A.-G., Greven S (2016). Flexible Bayesian Additive Joint Models with an Application to Type 1 Diabetes Research. arXiv:1611.01485 [stat], http://arxiv.org/abs/1611.01485. Umlauf N, Klein N, Zeileis A (2016). Bayesian Additive Models for Location Scale and Shape (and Beyond). (to appear) ## See Also bamlss, bamlss.frame. ## Examples   1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 ## Not run: set.seed(123) ## Simulate survival data ## with random intercepts/slopes and a linear effect of time, ## constant association alpha and no effect of the derivative d <- simJM(nsub = 200, long_setting = "linear", alpha_setting = "constant", dalpha_setting = "zero", full = FALSE) # remove 0s from the longitudinal time variable to allow sparse calculations d$obstime <- d$obstime - sqrt(.Machine$double.eps) ## Formula of the according joint model f <- list( Surv2(survtime, event, obs = y) ~ s(survtime, bs = "ps"), gamma ~ s(x1, bs = "ps"), mu ~ obstime + s(id, bs = "re") + s(id, obstime, bs = "re"), sigma ~ 1, alpha ~ 1, dalpha ~ -1 ) ## Joint model estimation ## jm_bamlss() sets the default optimizer and sampler function. ## First, posterior mode estimates are computed using function ## jm.mode(), afterwards the sampler jm.mcmc() is started. b <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id") ## Plot estimated effects. plot(b) ## Predict event probabilities for two individuals ## at 12 time units after their last longitudinal measurement. ## The event probability is conditional on their survival ## up to their last observed measurement. nd <- subset(d, id %in% c(192, 127)) p <- predict(b, type = "probabilities", dt = 12, FUN = c95) print(p) ## Plot of survival probabilities and ## corresponding longitudinal effects ## for individual id. jm.survplot(b, id = 3) jm.survplot(b, id = 30) ## Simulate survival data ## with functional random intercepts and a nonlinear effect ## of time, time-varying association alpha and no effect ## of the derivative. ## Note: This specification is the simJM default. d <- simJM(nsub = 200, full = FALSE) ## Formula of the according joint model ## number of knots for the smooth nonlinear effect of time klong <- 8 f <- list( Surv2(survtime, event, obs = y) ~ s(survtime, bs = "ps"), gamma ~ s(x1, bs = "ps"), mu ~ ti(id, bs = "re") + ti(obstime, bs = "ps", k = klong) + ti(id, obstime, bs = c("re", "ps"), k = c(nlevels(d\$id), klong)) + s(x2, bs = "ps"), sigma ~ 1, alpha ~ s(survtime, bs = "ps"), dalpha ~ -1 ) ## Estimating posterior mode only using jm.mode() b_mode <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id", sampler = FALSE) ## Estimating posterior means using jm.mcmc() ## with starting values generated from posterior mode b_mean <- bamlss(f, data = d, family = "jm", timevar = "obstime", idvar = "id", optimizer = FALSE, start = parameters(b_mode), results = FALSE) ## Plot effects. plot(b_mean, model = "alpha") ## End(Not run) 

bamlss documentation built on May 31, 2017, 3:41 a.m.