simrec: simrec
In simrec: Simulation of Recurrent Event Data for Non-Constant Baseline Hazard

Description

Simulation of recurrent event data for non-constant baseline hazard (total-time model)

This function simulates recurrent event data following the multiplicative intensity model described in Andersen and Gill [1], with the baseline hazard being a function of the total/calendar time. To induce between-subject heterogeneity, a random effect covariate (frailty term) can be incorporated. Data for individual i are generated according to the intensity process

Y_i(t) \cdot \lambda_0(t) \cdot Z_i \cdot \exp(\beta^t X_i),

where X_i defines the covariate vector and \beta the regression coefficient vector. \lambda_0(t) denotes the baseline hazard, being a function of the total/calendar time t, and Y_i(t) the predictable process that equals one as long as individual i is under observation and at risk for experiencing events. Z_i denotes the frailty variable with (Z_i)_i iid with E(Z_i)=1 and Var(Z_i)=\theta. The parameter \theta describes the degree of between-subject-heterogeneity. Data output is in the counting process format.
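The frailty conditions E(Z_i)=1 and Var(Z_i)=\theta can be met with standard parameterizations of the gamma and lognormal distributions. The following is a minimal base R sketch under that assumption; `draw_frailty` is a hypothetical helper for illustration and is not claimed to be the code used inside `simrec`.

```r
# Hypothetical helper (not part of simrec): draw n frailty values
# with E(Z) = 1 and Var(Z) = theta.
draw_frailty <- function(n, theta, dist = c("gamma", "lognormal")) {
  dist <- match.arg(dist)
  if (theta == 0) return(rep(1, n))  # no frailty effect: Z = 1 for everyone
  if (dist == "gamma") {
    # mean = shape * scale = 1, variance = shape * scale^2 = theta
    rgamma(n, shape = 1 / theta, scale = theta)
  } else {
    # sdlog^2 = log(1 + theta) and meanlog = -sdlog^2 / 2 give mean 1, variance theta
    s2 <- log(1 + theta)
    rlnorm(n, meanlog = -s2 / 2, sdlog = sqrt(s2))
  }
}

set.seed(1)
z <- draw_frailty(1e5, theta = 0.25, dist = "gamma")
c(mean = mean(z), var = var(z))  # empirical moments, close to 1 and 0.25
```

Larger values of `theta` spread the individual intensities further apart, so some simulated subjects accumulate many events while others have few.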

Usage

simrec(
  N,
  fu.min,
  fu.max,
  cens.prob = 0,
  dist.x = "binomial",
  par.x = 0,
  beta.x = 0,
  dist.z = "gamma",
  par.z = 0,
  dist.rec,
  par.rec,
  pfree = 0,
  dfree = 0
)

Arguments

`N`	Number of individuals
`fu.min`	Minimum length of follow-up.
`fu.max`	Maximum length of follow-up. Each individual's length of follow-up is generated from a uniform distribution on `[fu.min, fu.max]`. If `fu.min=fu.max`, all individuals have a common follow-up.
`cens.prob`	Gives the probability of being censored due to loss to follow-up before `fu.max`. For a random set of individuals defined by a B(N,`cens.prob`)-distribution, the time to censoring is generated from a uniform distribution on `[0, fu.max]`. Default is `cens.prob=0`, i.e. no censoring due to loss to follow-up.
`dist.x`	Distribution of the covariate(s) `X`. If there is more than one covariate, `dist.x` must be a vector of distributions with one entry for each covariate. Possible values are `"binomial"` and `"normal"`, default is `dist.x="binomial"`.
`par.x`	Parameters of the covariate distribution(s). For `"binomial"`, `par.x` is the probability for `x=1`. For `"normal"`, `par.x=c(\mu, \sigma)` where `\mu` is the mean and `\sigma` is the standard deviation of a normal distribution. If one of the covariates is defined to be normally distributed, `par.x` must be a list, e.g. `dist.x <- c("binomial", "normal")` and `par.x <- list(0.5, c(1,2))`. Default is `par.x=0`, i.e. `x=0` for all individuals.
`beta.x`	Regression coefficient(s) for the covariate(s) `x`. If there is more than one covariate, `beta.x` must be a vector of coefficients with one entry for each covariate. `simrec` generates as many covariates as there are entries in `beta.x`. Default is `beta.x=0`, corresponding to no effect of the covariate `x`.
`dist.z`	Distribution of the frailty variable `Z` with `E(Z)=1` and `Var(Z)=\theta`. Possible values are `"gamma"` for a Gamma distributed frailty and `"lognormal"` for a lognormal distributed frailty. Default is `dist.z="gamma"`.
`par.z`	Parameter `\theta` for the frailty distribution: this parameter gives the variance of the frailty variable `Z`. Default is `par.z=0`, which causes `Z=1`, i.e. no frailty effect.
`dist.rec`	Form of the baseline hazard function. Possible values are `"weibull"` or `"gompertz"` or `"lognormal"` or `"step"`.
`par.rec`	Parameters for the distribution of the event data. If `dist.rec="weibull"`, the hazard function is `\lambda_0(t)=\lambda\nu t^{\nu - 1}`, where `\lambda>0` is the scale and `\nu>0` is the shape parameter. Then `par.rec=c(\lambda, \nu)`. A special case of this is the exponential distribution for `\nu=1`. If `dist.rec="gompertz"`, the hazard function is `\lambda_0(t)=\lambda exp(\alpha t)`, where `\lambda>0` is the scale and `\alpha\in(-\infty,+\infty)` is the shape parameter. Then `par.rec=c(\lambda, \alpha)`. If `dist.rec="lognormal"`, the hazard function is `\lambda_0(t)=[(1/(\sigma t))\phi((ln(t)-\mu)/\sigma)]/[\Phi(-(ln(t)-\mu)/\sigma)]`, where `\phi` is the probability density function and `\Phi` is the cumulative distribution function of the standard normal distribution, `\mu\in(-\infty,+\infty)` is a location parameter and `\sigma>0` is a shape parameter. Then `par.rec=c(\mu,\sigma)`. Note that specifying `dist.rec="lognormal"` together with covariates does not give the usual lognormal model (in which covariates act on the parameters of the lognormal distribution, resulting in non-proportional hazards); it only defines the baseline hazard, and covariate effects are incorporated under the proportional hazards assumption. If `dist.rec="step"`, the hazard function is `\lambda_0(t)=a, t<=t_1, and \lambda_0(t)=b, t>t_1`. Then `par.rec=c(a,b,t_1)`.
`pfree`	Probability that after experiencing an event the individual is not at risk for experiencing further events for a length of `dfree` time units. Default is `pfree=0`.
`dfree`	Length of the risk-free interval. Must be in the same time unit as `fu.max`. Default is `dfree=0`, i.e. the individual is continuously at risk for experiencing events until end of follow-up.
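To make the four `dist.rec` options concrete, the baseline hazards described for `par.rec` can be written out in base R. This is an illustrative sketch only: `baseline_hazard` is a hypothetical helper, not a function exported by simrec, and the lognormal case is computed as the lognormal density divided by the lognormal survival function.

```r
# Hypothetical helper (not part of simrec): evaluate the baseline hazard
# lambda_0(t) at times t > 0 for a given dist.rec / par.rec combination.
baseline_hazard <- function(t, dist.rec, par.rec) {
  switch(dist.rec,
    # par.rec = c(lambda, nu): scale and shape
    weibull = par.rec[1] * par.rec[2] * t^(par.rec[2] - 1),
    # par.rec = c(lambda, alpha): scale and shape
    gompertz = par.rec[1] * exp(par.rec[2] * t),
    # par.rec = c(mu, sigma): density over survival function
    lognormal = dlnorm(t, par.rec[1], par.rec[2]) /
      plnorm(t, par.rec[1], par.rec[2], lower.tail = FALSE),
    # par.rec = c(a, b, t_1): hazard a up to t_1, hazard b afterwards
    step = ifelse(t <= par.rec[3], par.rec[1], par.rec[2]),
    stop("unknown dist.rec: ", dist.rec)
  )
}

t_grid <- c(0.5, 1, 2)
baseline_hazard(t_grid, "weibull", c(1, 2))   # increasing hazard 2 * t
baseline_hazard(t_grid, "weibull", c(1, 1))   # nu = 1: constant (exponential) hazard
baseline_hazard(t_grid, "step", c(2, 3, 1))   # jumps from a = 2 to b = 3 after t_1 = 1
```

Plotting such a function over `[0, fu.max]` is a quick way to check that a chosen `par.rec` gives a plausible event rate before running a simulation.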

Details

Simulation of recurrent event data for non-constant baseline hazard in the total time model with risk-free intervals and possibly a competing event. The simrec package also allows the data to be cut to an interim data set and provides plotting functionality.

Data are simulated by extending the methods proposed by Bender et al. [2] to the multiplicative intensity model.
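The idea of extending the inversion approach to recurrent events on the total time scale can be sketched for the Weibull case, where the cumulative baseline hazard is \lambda t^\nu. Given the previous event time, the next event time is obtained by solving for the time at which the accumulated intensity equals an exponential draw `-log(U)`. The function below is a simplified illustration under these assumptions (no risk-free intervals, no censoring before `fu`); `sim_event_times` and its argument `rel_risk`, standing for `Z_i * exp(\beta^t X_i)`, are hypothetical names and not the package's internal code.

```r
# Hypothetical helper (not part of simrec): event times of one subject on the
# total time scale for a Weibull baseline hazard, via cumulative hazard inversion.
sim_event_times <- function(lambda, nu, rel_risk, fu) {
  times <- numeric(0)
  t_prev <- 0
  repeat {
    u <- runif(1)
    # solve lambda * rel_risk * (t^nu - t_prev^nu) = -log(u) for t
    t_next <- (t_prev^nu - log(u) / (lambda * rel_risk))^(1 / nu)
    if (t_next > fu) break  # next event would fall after end of follow-up
    times <- c(times, t_next)
    t_prev <- t_next
  }
  times
}

set.seed(42)
sim_event_times(lambda = 1, nu = 2, rel_risk = exp(0.3), fu = 2)
```

Because each new time is computed from the previous event time rather than from zero, the hazard keeps following the calendar time `t` instead of restarting after every event.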

Value

The output is a data.frame consisting of the columns:

`id`	An integer number for identification of each individual
`x`	or `x.V1, x.V2, ...` - depending on the covariate matrix. Contains the randomly generated value of the covariate(s) `X` for each individual.
`z`	Contains the randomly generated value of the frailty variable `Z` for each individual.
`start`	The start of interval `[start, stop]`, when the individual starts to be at risk for a next event.
`stop`	The time of an event or censoring, i.e. the end of interval `[start, stop]`.
`status`	An indicator of whether an event occurred at time `stop` (`status=1`) or the individual is censored at time `stop` (`status=0`).
`fu`	Length of follow-up period `[0,fu]` for each individual.

For each individual there is one line per event experienced, plus one line if the individual is censored. The data format corresponds to the counting process format.
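The counting process layout can be illustrated by converting one individual's event times into `start`/`stop`/`status` rows. This is a simplified sketch that ignores risk-free intervals and covariate columns; `to_counting_process` is a hypothetical helper, not part of simrec.

```r
# Hypothetical helper (not part of simrec): turn the event times of one
# individual into counting process rows, ignoring risk-free intervals.
to_counting_process <- function(id, event_times, fu) {
  event_times <- sort(event_times[event_times <= fu])
  stop_times <- event_times
  status <- rep(1, length(event_times))
  # add a censored row unless the last event falls exactly on the end of follow-up
  if (length(event_times) == 0 || max(event_times) < fu) {
    stop_times <- c(stop_times, fu)
    status <- c(status, 0)
  }
  # each interval starts where the previous one stopped
  start_times <- c(0, head(stop_times, -1))
  data.frame(id = id, start = start_times, stop = stop_times,
             status = status, fu = fu)
}

# two events at 0.5 and 1.2, follow-up until 2:
# rows (0, 0.5], (0.5, 1.2] with status 1 and (1.2, 2] with status 0
to_counting_process(id = 1, event_times = c(0.5, 1.2), fu = 2)
```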

Author(s)

Katharina Ingel, Stella Preussler, Antje Jahn-Eimermacher, Federico Marini

Maintainer: Antje Jahn-Eimermacher jahna@uni-mainz.de

Katharina Ingel, Stella Preussler, Antje Jahn-Eimermacher. Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, Germany

References

[1] Andersen P, Gill R (1982): Cox's regression model for counting processes: a large sample study. The Annals of Statistics 10:1100-1120
[2] Bender R, Augustin T, Blettner M (2005): Generating survival times to simulate Cox proportional hazards models. Statistics in Medicine 24:1713-1723
[3] Jahn-Eimermacher A, Ingel K, Ozga AK, Preussler S, Binder H (2015): Simulating recurrent event data with hazard functions defined on a total time scale. BMC Medical Research Methodology 15:16

Examples

### Example:
### A sample of 10 individuals

N <- 10

### with a binomially distributed covariate with a regression coefficient
### of beta=0.3, and a standard normally distributed covariate with a
### regression coefficient of beta=0.2,

dist.x <- c("binomial", "normal")
par.x <- list(0.5, c(0, 1))
beta.x <- c(0.3, 0.2)

### a gamma distributed frailty variable with variance 0.25

dist.z <- "gamma"
par.z <- 0.25

### and a Weibull-shaped baseline hazard with scale parameter lambda=1
### and shape parameter nu=2.

dist.rec <- "weibull"
par.rec <- c(1, 2)

### Subjects are to be followed for two years with 20% of the subjects
### being censored according to a uniformly distributed censoring time
### within [0,2] (in years).

fu.min <- 2
fu.max <- 2
cens.prob <- 0.2

### After each event a subject is not at risk for experiencing further events
### for a period of 30 days with a probability of 50%.

dfree <- 30 / 365
pfree <- 0.5

simdata <- simrec(
  N, fu.min, fu.max, cens.prob, dist.x, par.x, beta.x, dist.z, par.z,
  dist.rec, par.rec, pfree, dfree
)
# print(simdata)  # only run for small N!
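A quick check of simulated output is to count events per individual from the `id` and `status` columns. The sketch below uses only base R; `events_per_subject` is a hypothetical helper, and the small `toy` data frame stands in for `simdata` so that the block runs without the package installed.

```r
# Hypothetical helper (not part of simrec): number of events per individual
# for data in counting process format with columns id and status.
events_per_subject <- function(dat) {
  n_events <- tapply(dat$status, dat$id, sum)
  data.frame(id = as.integer(names(n_events)), n.events = as.vector(n_events))
}

# toy data in the same layout as the simrec output (values made up for illustration)
toy <- data.frame(
  id     = c(1, 1, 1, 2),
  start  = c(0, 0.5, 1.2, 0),
  stop   = c(0.5, 1.2, 2, 2),
  status = c(1, 1, 0, 0)
)
events_per_subject(toy)
# events_per_subject(simdata)  # same call on the simulated data
```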

simrec documentation built on Sept. 8, 2023, 6:18 p.m.