Nothing
#' Simulate data from a joint model
#'
#' @description This function simulates longitudinal and time-to-event data from
#' a joint model.
#'
#' @param n the number of subjects to simulate data for.
#' @inheritParams joint
#' @param ntms the maximum number of (discrete) time points to simulate repeated
#' longitudinal measurements at.
#' @param b1 a vector specifying the coefficients of the fixed effects in the
#' longitudinal sub-model. The order in each row is intercept, a continuous
#' covariate, covariate, a binary covariate, the measurement time.
#' @param b2 a vector of \code{length = 2} specifying the coefficients for the
#' time-to-event baseline covariates, in the order of a continuous covariate
#' and a binary covariate.
#' @param gamma a vector of specifying the latent association parameter(s) for
#' the longitudinal outcome. It must be of length 1 if \code{sepassoc =
#' FALSE}.
#' @param sigu a positive-definite matrix specifying the variance-covariance
#' matrix. If \code{model = "int"}, the matrix has dimension \code{dim = c(1,
#' 1)}; if \code{model = "intslope"}, the matrix has dimension \code{dim =
#' c(2, 2)}; else if \code{model = "quad"}, the matrix has dimension \code{dim
#' = c(3, 3)}. If \code{D = NULL} (default), an identity matrix is assumed.
#' @param vare a numeric value specifying the residual standard error.
#' @param theta0,theta1 parameters controlling the failure rate. See Details.
#' @param censoring logical: if \code{TRUE}, includes an independent censoring
#' time.
#' @param censlam a scale (\eqn{>0}) parameter for an exponential distribution
#' used to simulate random censoring times for when \code{censoring = TRUE}.
#' @param truncation logical: if \code{TRUE}, adds a truncation time for a
#' maximum event time in the case of \code{model = "int"} or \code{model =
#' "intslope"}.
#' @param trunctime a truncation time for use when \code{truncation = TRUE}. For
#' \code{model = "quad"}, \code{trunctime} is required, and defaults to
#' \code{max(ntms)} if not specified.
#' @param gridstep the step-size for the grid used to simulate event times when
#' \code{model = "quad"}. Default is \code{gridstep = 0.01}. See Details.
#'
#' @details The function \code{simjoint} simulates data from a joint model,
#' similar to that performed in Henderson et al. (2000). It works by first
#' simulating longitudinal data for all possible follow-up times using random
#' draws for the multivariate Gaussian random effects and residual error
#' terms. Data can be simulated assuming either random-intercepts only
#' (\code{model = "int"}) in each of the longitudinal sub-models;
#' random-intercepts and random-slopes (\code{model = "intslope"}); or
#' quadratic random effects structures (\code{model = "quad"}). The failure
#' times are simulated from proportional hazards time-to-event models, using
#' the following methodologies:
#'
#' \describe{
#'
#' \item{\code{model = "int"}}{The baseline hazard function is specified to be
#' an exponential distribution with
#'
#' \deqn{\lambda_0(t) = \exp{\theta_0}.}
#'
#' Simulation is conditional on known time-independent effects, and the
#' methodology of Bender et al. (2005) is used to simulate the failure time.}
#'
#' \item{\code{model = "intslope"}}{The baseline hazard function is specified
#' to be a Gompertz distribution with
#'
#' \deqn{\lambda_0(t) = \exp{\theta_0 + \theta_1 t}.}
#'
#' In the usual representation of the Gompertz distribution, \eqn{\theta_1} is
#' the shape parameter, and the scale parameter is equivalent to
#' \eqn{\exp(\theta_0)}. Simulation is conditional on on a predictable
#' (linear) time-varying process, and the methodology of Austin (2012) is used
#' to simulate the failure time.}
#'
#' \item{\code{model="quad"}}{The baseline hazard function is specified as per
#' \code{model="intslope"}. The integration technique used for the above two
#' cases is complex in quadratic (and higher order) models, therefore we use a
#' different approach. We note that hazard function can be written as
#'
#' \deqn{\lim_{dt \rightarrow 0} \lambda(t) dt = \lim_{dt \rightarrow 0}
#' P[t \le T \le t + dt | T \ge t].}
#'
#' In the simulation routine the parameter \code{gridstep} acts as \eqn{dt} in
#' that we choose a nominally small value, which multiplies the hazard and
#' this scaled hazard is equivalent to the probability of having an event in
#' the interval \eqn{(t, t + dt)}, or equivalently \eqn{(t, t +
#' }\code{gridstep}\eqn{)}. A vector of possible times is set up for each
#' individual, ranging from 0 to \code{trunctime} in increments of \eqn{dt} (or
#' \code{gridstep}). The failure probability at each time is compared to an
#' independent \eqn{U(0, 1)} draw, and if the probability does not exceed the
#' random draw then the survival time is set as \code{trunctime}, otherwise it
#' is the generated time from the vector of candidate times. The minimum of
#' these candidate times (i.e. when we deem the event to have first happened)
#' is taken as the survival time.}
#'
#' }
#'
#' @author Pete Philipson
#' @keywords datagen survival
#'
#' @references
#'
#' Austin PC. Generating survival times to simulate Cox proportional hazards
#' models with time-varying covariates. \emph{Stat Med.} 2012; \strong{31(29)}:
#' 3946-3958.
#'
#' Bender R, Augustin T, Blettner M. Generating survival times to simulate Cox
#' proportional hazards models. \emph{Stat Med.} 2005; \strong{24}: 1713-1723.
#'
#' Henderson R, Diggle PJ, Dobson A. Joint modelling of longitudinal
#' measurements and event time data. \emph{Biostatistics.} 2000; \strong{1(4)}:
#' 465-480.
#'
#' @return A list of 2 \code{data.frame}s: one recording the requisite
#' longitudinal outcomes data, and one recording the time-to-event data.
#' @export
#'
#' @examples
#' simjoint(10, sepassoc = TRUE)
simjoint <- function(n = 500, model = c("intslope", "int", "quad"), sepassoc = FALSE,
ntms = 5, b1 = c(1, 1, 1, 1), b2 = c(1, 1), gamma = c(1, 0.1),
sigu, vare = 0.01, theta0 = -3, theta1 = 1, censoring = TRUE,
censlam = exp(-3), truncation = FALSE, trunctime = max(ntms),
gridstep = 0.01) {
model <- match.arg(model)
if (model != "intslope" && model != "int" && model != "quad") {
stop(paste("Unknown model:", model))
}
ran <- 2 # default: model = "intslope"
if (model == "int") {
ran <- 1
} else if (model == "quad") {
ran <- 3
}
lat <- ran
if (!sepassoc) {
lat <- 1
}
if (length(gamma) != lat) {
warning("Number of association parameters do not match model choice\n")
}
gamma <- rep(gamma, length = ran)
if (missing(sigu)) {
sigu <- diag(ran)
}
if (length(sigu) != ran^2) {
warning("Dimension of covariance matrix does not match chosen model\n")
if(length(sigu) > ran^2) {
sigu <- sigu[1:ran, 1:ran]
} else {
sigu <- diag(ran) * sigu[1]
}
}
if (model == "int") {
if(sigu < 0) {
stop("Variance must be positive")
}
} else {
if (!isSymmetric(sigu)) {
stop("Covariance matrix is not symmetric")
}
if (any(eigen(sigu)$values < 0) || (det(sigu) <= 0)) {
stop("Covariance matrix must be positive semi-definite")
}
}
sim <- simdat(n, model, sepassoc, ntms, ran, b1, b2, gamma, sigu, vare,
theta0, theta1,
censoring, censlam, truncation, trunctime, gridstep)
list(longitudinal = sim$longdat, survival = sim$survdat)
}
Any scripts or data that you put into this service are public.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.