sim_joint_data_set: Simulate Individuals from a Joint Survival and Marker Model
In SimSurvNMarker: Simulate Survival Time and Markers

View source: R/sim.R

sim_joint_data_set

R Documentation

Simulate Individuals from a Joint Survival and Marker Model

Description

Simulates individuals from the following model

\vec U_i \sim N^{(K)}(\vec 0, Ψ)

\vec Y_{ij} \mid \vec U_i = \vec u_i \sim N^{(r)}(\vec μ_i(s_{ij}, \vec u_i), Σ)

h(t \mid \vec u) = \exp(\vecω^\top\vec b(t) + \vecδ^\top\vec z_i + \vec 1^\top(diag(\vec α) \otimes \vec x_i^\top)vec(Γ) + \vec 1^\top(diag(\vec α) \otimes \vec g(t)^\top)vec(B) + \vec 1^\top(diag(\vec α) \otimes \vec m(t)^\top)\vec u)

with

\vecμ_i(s, \vec u) = (I \otimes \vec x_i^\top)vec(Γ) + ≤ft(I \otimes \vec g(s)^\top\right)vec(B) + ≤ft(I \otimes \vec m(s)^\top\right) \vec u

where h(t \mid \vec u) is the conditional hazard function.

Usage

sim_joint_data_set(
  n_obs,
  B,
  Psi,
  omega,
  delta,
  alpha,
  sigma,
  gamma,
  b_func,
  m_func,
  g_func,
  r_z,
  r_left_trunc,
  r_right_cens,
  r_n_marker,
  r_x,
  r_obs_time,
  y_max,
  use_fixed_latent = TRUE,
  m_func_surv = m_func,
  g_func_surv = g_func,
  gl_dat = get_gl_rule(30L),
  tol = .Machine$double.eps^(1/4)
)

Arguments

`n_obs`	integer with the number of individuals to draw.
`B`	coefficient matrix for time-varying fixed effects. Use `NULL` if there is no effect.
`Psi`	the random effects' covariance matrix.
`omega`	numeric vector with coefficients for the baseline hazard.
`delta`	coefficients for fixed effects in the log hazard.
`alpha`	numeric vector with association parameters.
`sigma`	the noise's covariance matrix.
`gamma`	coefficient matrix for the non-time-varying fixed effects. Use `NULL` if there is no effect.
`b_func`	basis function for the baseline hazard like `poly`.
`m_func`	basis function for `U` like `poly`.
`g_func`	basis function for `B` like `poly`.
`r_z`	generator for the covariates in the log hazard. Takes an integer for the individual's id.
`r_left_trunc`	generator for the left-truncation time. Takes an integer for the individual's id.
`r_right_cens`	generator for the right-censoring time. Takes an integer for the individual's id.
`r_n_marker`	function to generate the number of observed markers. Takes an integer for the individual's id.
`r_x`	generator for the covariates in for the markers. Takes an integer for the individual's id.
`r_obs_time`	function to generate the observations times given the number of observed markers. Takes an integer for the number of markers and an integer for the individual's id.
`y_max`	maximum survival time before administrative censoring.
`use_fixed_latent`	logical for whether to include the \vec 1^\top(diag(\vec α) \otimes \vec x_i^\top)vec(Γ) term in the log hazard. Useful if derivatives of the latent mean should be used.
`m_func_surv`	basis function for `U` like `poly` in the log hazard. Can be different from `m_func`. Useful if derivatives of the latent mean should be used.
`g_func_surv`	basis function for `B` like `poly` in the log hazard. Can be different from `g_func`. Useful if derivatives of the latent mean should be used.
`gl_dat`	Gauss–Legendre quadrature data. See `get_gl_rule`.
`tol`	convergence tolerance passed to `uniroot`.

Examples

#####
# example with polynomial basis functions
b_func <- function(x){
  x <- x - 1
  cbind(x^3, x^2, x)
}
g_func <- function(x){
  x <- x - 1
  cbind(x^3, x^2, x)
}
m_func <- function(x){
  x <- x - 1
  cbind(x^2, x, 1)
}

# parameters
delta <- c(-.5, -.5, .5)
gamma <- matrix(c(.25, .5, 0, -.4, 0, .3), 3, 2)
omega <- c(1.4, -1.2, -2.1)
Psi <- structure(c(0.18, 0.05, -0.05, 0.1, -0.02, 0.06, 0.05, 0.34, -0.25,
                   -0.06, -0.03, 0.29, -0.05, -0.25, 0.24, 0.04, 0.04,
                   -0.12, 0.1, -0.06, 0.04, 0.34, 0, -0.04, -0.02, -0.03,
                   0.04, 0, 0.1, -0.08, 0.06, 0.29, -0.12, -0.04, -0.08,
                   0.51), .Dim = c(6L, 6L))
B <- structure(c(-0.57, 0.17, -0.48, 0.58, 1, 0.86), .Dim = 3:2)
sig <- diag(c(.6, .3)^2)
alpha <- c(.5, .9)

# generator functions
r_n_marker <- function(id)
  # the number of markers is Poisson distributed
  rpois(1, 10) + 1L
r_obs_time <- function(id, n_markes)
  # the observations times are uniform distributed
  sort(runif(n_markes, 0, 2))
r_z <- function(id)
  # return a design matrix for a dummy setup
  cbind(1, (id %% 3) == 1, (id %% 3) == 2)
r_x <- r_z # same covariates for the fixed effects
r_left_trunc <- function(id)
  # no left-truncation
  0
r_right_cens <- function(id)
  # right-censoring time is exponentially distributed
  rexp(1, rate = .5)

# simulate
gl_dat <- get_gl_rule(30L)
y_max <- 2
n_obs <- 100L
set.seed(1)
dat <- sim_joint_data_set(
  n_obs = n_obs, B = B, Psi = Psi, omega = omega, delta = delta,
  alpha = alpha, sigma = sig, gamma = gamma, b_func = b_func,
  m_func = m_func, g_func = g_func, r_z = r_z, r_left_trunc = r_left_trunc,
  r_right_cens = r_right_cens, r_n_marker = r_n_marker, r_x = r_x,
  r_obs_time = r_obs_time, y_max = y_max)

# checks
stopifnot(
  NROW(dat$survival_data) == n_obs,
  NROW(dat$marker_data) >= n_obs,
  all(dat$survival_data$y <= y_max))

SimSurvNMarker documentation built on Nov. 10, 2022, 5:12 p.m.