simrec: simrec

simrec    R Documentation

simrec

Description

Simulation of recurrent event data for non-constant baseline hazard (total-time model)

This function simulates recurrent event data following the multiplicative intensity model described in Andersen and Gill [1], with the baseline hazard being a function of the total/calendar time. To induce between-subject heterogeneity, a random-effect covariate (frailty term) can be incorporated. Data for individual i are generated according to the intensity process

Y_i(t) * \lambda_0(t)* Z_i *exp(\beta^t X_i),

where X_i denotes the covariate vector and \beta the regression coefficient vector. \lambda_0(t) denotes the baseline hazard, being a function of the total/calendar time t, and Y_i(t) the predictable process that equals one as long as individual i is under observation and at risk for experiencing events. Z_i denotes the frailty variable, with (Z_i)_i iid, E(Z_i)=1 and Var(Z_i)=\theta. The parameter \theta describes the degree of between-subject heterogeneity. Data output is in the counting process format.

Usage

simrec(
  N,
  fu.min,
  fu.max,
  cens.prob = 0,
  dist.x = "binomial",
  par.x = 0,
  beta.x = 0,
  dist.z = "gamma",
  par.z = 0,
  dist.rec,
  par.rec,
  pfree = 0,
  dfree = 0
)

Arguments

N

Number of individuals

fu.min

Minimum length of follow-up.

fu.max

Maximum length of follow-up. Each individual's length of follow-up is generated from a uniform distribution on [fu.min, fu.max]. If fu.min=fu.max, then all individuals have a common follow-up.

cens.prob

Gives the probability of being censored due to loss to follow-up before fu.max. For a random set of individuals defined by a B(N,cens.prob)-distribution, the time to censoring is generated from a uniform distribution on [0, fu.max]. Default is cens.prob=0, i.e. no censoring due to loss to follow-up.
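The follow-up and censoring rules described for fu.min, fu.max and cens.prob can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's internal code: the names n.cens and cens.id are hypothetical, and it is assumed that censored individuals simply have their follow-up replaced by the uniform censoring time.

```r
set.seed(1)                      # for reproducibility of the sketch
N <- 10
fu.min <- 1
fu.max <- 2
cens.prob <- 0.2

# Number of individuals lost to follow-up, drawn from B(N, cens.prob)
n.cens <- rbinom(1, N, cens.prob)
# Random subset of individuals that are censored early (may be empty)
cens.id <- sample(seq_len(N), n.cens)

# Planned follow-up: uniform on [fu.min, fu.max]
fu <- runif(N, min = fu.min, max = fu.max)
# Assumption: censored individuals get a follow-up that is uniform on [0, fu.max]
fu[cens.id] <- runif(n.cens, min = 0, max = fu.max)
```

With cens.prob=0 the set cens.id is empty and every follow-up stays within [fu.min, fu.max].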

dist.x

Distribution of the covariate(s) X. If there is more than one covariate, dist.x must be a vector of distributions with one entry for each covariate. Possible values are "binomial" and "normal", default is dist.x="binomial".

par.x

Parameters of the covariate distribution(s). For "binomial", par.x is the probability for x=1. For "normal", par.x=c(\mu, \sigma) where \mu is the mean and \sigma is the standard deviation of a normal distribution. If one of the covariates is defined to be normally distributed, par.x must be a list, e.g. dist.x <- c("binomial", "normal") and par.x <- list(0.5, c(1,2)). Default is par.x=0, i.e. x=0 for all individuals.
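As a rough illustration of how a vector dist.x pairs up with a list par.x, the following base R sketch draws one column per covariate. It is an assumption about how such a specification could be interpreted, not simrec's own code; the anonymous function and the object x are hypothetical.

```r
set.seed(1)
N <- 5
dist.x <- c("binomial", "normal")
par.x <- list(0.5, c(1, 2))      # P(x = 1) = 0.5; mean 1 and sd 2

# One draw of length N per covariate; mapply binds them into a matrix
x <- mapply(function(d, p) {
  if (d == "binomial") {
    rbinom(N, size = 1, prob = p[1])
  } else {
    rnorm(N, mean = p[1], sd = p[2])
  }
}, dist.x, par.x)

dim(x)   # N rows, one column per covariate
```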

beta.x

Regression coefficient(s) for the covariate(s) x. If there is more than one covariate, beta.x must be a vector of coefficients with one entry for each covariate. simrec generates as many covariates as there are entries in beta.x. Default is beta.x=0, corresponding to no effect of the covariate x.

dist.z

Distribution of the frailty variable Z with E(Z)=1 and Var(Z)=\theta. Possible values are "gamma" for a Gamma distributed frailty and "lognormal" for a lognormal distributed frailty. Default is dist.z="gamma".

par.z

Parameter \theta for the frailty distribution: this parameter gives the variance of the frailty variable Z. Default is par.z=0, which causes Z=1, i.e. no frailty effect.
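The constraints E(Z)=1 and Var(Z)=\theta pin down the parameters of both frailty distributions. The sketch below shows one parameterisation that satisfies them using base R generators; draw_frailty is a hypothetical helper written for this illustration and is not part of simrec.

```r
# Hypothetical helper (not part of simrec): draw N frailty values with
# mean 1 and variance theta.
draw_frailty <- function(N, theta, dist = c("gamma", "lognormal")) {
  dist <- match.arg(dist)
  if (theta == 0) {
    return(rep(1, N))            # par.z = 0: Z = 1, no frailty effect
  }
  if (dist == "gamma") {
    # shape * scale = 1 and shape * scale^2 = theta
    rgamma(N, shape = 1 / theta, scale = theta)
  } else {
    # exp(meanlog + sdlog^2 / 2) = 1 and exp(sdlog^2) - 1 = theta
    sdlog <- sqrt(log(1 + theta))
    rlnorm(N, meanlog = -sdlog^2 / 2, sdlog = sdlog)
  }
}

set.seed(1)
z <- draw_frailty(1e5, theta = 0.25, dist = "gamma")
c(mean = mean(z), var = var(z))  # close to 1 and 0.25 for large N
```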

dist.rec

Form of the baseline hazard function. Possible values are "weibull", "gompertz", "lognormal" and "step".

par.rec

Parameters for the distribution of the event data. If dist.rec="weibull" the hazard function is

\lambda_0(t)=\lambda*\nu* t^{\nu - 1},

where \lambda>0 is the scale and \nu>0 is the shape parameter. Then par.rec=c(\lambda, \nu). A special case of this is the exponential distribution for \nu=1.

If dist.rec="gompertz", the hazard function is

\lambda_0(t)=\lambda*exp(\alpha t),

where \lambda>0 is the scale and \alpha\in(-\infty,+\infty) is the shape parameter. Then par.rec=c(\lambda, \alpha).

If dist.rec="lognormal", the hazard function is

\lambda_0(t)=[(1/(\sigma t))*\phi((ln(t)-\mu)/\sigma)]/[\Phi(-(ln(t)-\mu)/\sigma)],

where \phi is the probability density function and \Phi is the cumulative distribution function of the standard normal distribution, \mu\in(-\infty,+\infty) is a location parameter and \sigma>0 is a shape parameter. Then par.rec=c(\mu,\sigma). Note that specifying dist.rec="lognormal" together with covariates does not specify the usual lognormal model (in which covariates act on the parameters of the lognormal distribution, resulting in non-proportional hazards). It only defines the baseline hazard, and covariate effects are incorporated under the proportional hazards assumption.

If dist.rec="step", the hazard function is

\lambda_0(t)=a for t<=t_1, and \lambda_0(t)=b for t>t_1.

Then par.rec=c(a,b,t_1).
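To compare the four baseline hazard shapes, they can be written down as a single R function. This is a sketch for illustration only: baseline_hazard is a hypothetical helper, not a simrec function, and the lognormal branch is expressed as the lognormal density divided by the lognormal survival function.

```r
# Hypothetical helper (not part of simrec): baseline hazard lambda_0(t)
# for the four dist.rec options, with par.rec ordered as documented.
baseline_hazard <- function(t, dist.rec, par.rec) {
  switch(dist.rec,
    weibull = par.rec[1] * par.rec[2] * t^(par.rec[2] - 1),
    gompertz = par.rec[1] * exp(par.rec[2] * t),
    # density divided by survival function of the lognormal distribution
    lognormal = dlnorm(t, par.rec[1], par.rec[2]) /
      plnorm(t, par.rec[1], par.rec[2], lower.tail = FALSE),
    step = ifelse(t <= par.rec[3], par.rec[1], par.rec[2]),
    stop("unknown dist.rec: ", dist.rec)
  )
}

t <- c(0.5, 1, 2)
baseline_hazard(t, "weibull", c(1, 1))    # nu = 1: constant hazard (exponential case)
baseline_hazard(t, "step", c(0.5, 2, 1))  # a = 0.5 up to t_1 = 1, then b = 2
```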

pfree

Probability that after experiencing an event the individual is not at risk for experiencing further events for a length of dfree time units. Default is pfree=0.

dfree

Length of the risk-free interval. Must be in the same time unit as fu.max. Default is dfree=0, i.e. the individual is continuously at risk for experiencing events until end of follow-up.

Details

Simulation of recurrent event data for non-constant baseline hazard in the total time model with risk-free intervals and possibly a competing event. The simrec package also allows the data to be cut to an interim data set, and provides plotting functionality.

Data are simulated by extending the methods proposed by Bender et al. [2] to the multiplicative intensity model.
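One way to read this extension: given the previous event time t_prev, the next event time t solves \Lambda_0(t) - \Lambda_0(t_prev) = E / (Z_i * exp(\beta^t X_i)), where \Lambda_0 is the cumulative baseline hazard and E is a unit exponential variable. For the Weibull hazard, \Lambda_0(t)=\lambda*t^{\nu}, so the equation has a closed-form solution. The sketch below applies this for a single individual. It is a simplified illustration, not the package's implementation: sim_weibull_events is a hypothetical helper, and risk-free intervals are ignored.

```r
# Hypothetical helper (not part of simrec): event times of one individual
# under a Weibull baseline hazard, without risk-free intervals.
sim_weibull_events <- function(fu, lambda, nu, z = 1, lin.pred = 0) {
  rate <- z * exp(lin.pred)        # multiplier Z_i * exp(beta^t X_i)
  times <- numeric(0)
  t.prev <- 0
  repeat {
    e <- rexp(1)                   # unit exponential, i.e. -log(U)
    # Solve Lambda_0(t) - Lambda_0(t.prev) = e / rate,
    # with Lambda_0(t) = lambda * t^nu
    t.next <- (t.prev^nu + e / (lambda * rate))^(1 / nu)
    if (t.next > fu) break         # next event would fall after follow-up
    times <- c(times, t.next)
    t.prev <- t.next
  }
  times
}

set.seed(1)
ev <- sim_weibull_events(fu = 2, lambda = 1, nu = 2)
ev   # increasing event times within the follow-up period
```

Under these assumptions the expected number of events per individual is Z_i * exp(\beta^t X_i) * \Lambda_0(fu), which gives a quick sanity check for simulated data.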

Value

The output is a data.frame consisting of the columns:

id

An integer number for identification of each individual

x

or x.V1, x.V2, ... - depending on the covariate matrix. Contains the randomly generated value of the covariate(s) X for each individual.

z

Contains the randomly generated value of the frailty variable Z for each individual.

start

The start of interval [start, stop], when the individual starts to be at risk for a next event.

stop

The time of an event or censoring, i.e. the end of interval [start, stop].

status

An indicator of whether an event occurred at time stop (status=1) or the individual is censored at time stop (status=0).

fu

Length of follow-up period [0,fu] for each individual.

For each individual there are as many rows as events experienced, plus one row if the individual is censored. The data format corresponds to the counting process format.
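The layout can be illustrated with a small hand-written data frame. The values below are made up for illustration and were not produced by simrec: individual 1 has two events and is then censored at fu=2, individual 2 is censored at 1.2 without any event.

```r
# Hand-written example of the counting process layout (made-up values)
d <- data.frame(
  id     = c(1, 1, 1, 2),
  x      = c(1, 1, 1, 0),
  z      = c(0.8, 0.8, 0.8, 1.3),
  start  = c(0.0, 0.7, 1.5, 0.0),
  stop   = c(0.7, 1.5, 2.0, 1.2),
  status = c(1, 1, 0, 0),
  fu     = c(2.0, 2.0, 2.0, 1.2)
)

# Number of events per individual
tapply(d$status, d$id, sum)
# Total time at risk per individual (sum of interval lengths)
tapply(d$stop - d$start, d$id, sum)
```

Because no risk-free intervals are present in this toy example, the time at risk of each individual equals its follow-up time fu.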

Author(s)

Katharina Ingel, Stella Preussler, Antje Jahn-Eimermacher, Federico Marini

Maintainer: Antje Jahn-Eimermacher jahna@uni-mainz.de

Katharina Ingel, Stella Preussler, Antje Jahn-Eimermacher. Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI), University Medical Center of the Johannes Gutenberg-University Mainz, Germany

References

  1. Andersen P, Gill R (1982): Cox's regression model for counting processes: a large sample study. The Annals of Statistics 10:1100-1120

  2. Bender R, Augustin T, Blettner M (2005): Generating survival times to simulate Cox proportional hazards models. Statistics in Medicine 24:1713-1723

  3. Jahn-Eimermacher A, Ingel K, Ozga AK, Preussler S, Binder H (2015): Simulating recurrent event data with hazard functions defined on a total time scale. BMC Medical Research Methodology 15:16

See Also

Useful links:

simreccomp

Examples

### Example:
### A sample of 10 individuals

N <- 10

### with a binomially distributed covariate with a regression coefficient
### of beta=0.3, and a standard normally distributed covariate with a
### regression coefficient of beta=0.2,

dist.x <- c("binomial", "normal")
par.x <- list(0.5, c(0, 1))
beta.x <- c(0.3, 0.2)

### a gamma distributed frailty variable with variance 0.25

dist.z <- "gamma"
par.z <- 0.25

### and a Weibull-shaped baseline hazard with scale parameter lambda=1
### and shape parameter nu=2.

dist.rec <- "weibull"
par.rec <- c(1, 2)

### Subjects are to be followed for two years with 20% of the subjects
### being censored according to a uniformly distributed censoring time
### within [0,2] (in years).

fu.min <- 2
fu.max <- 2
cens.prob <- 0.2

### After each event a subject is not at risk for experiencing further events
### for a period of 30 days with a probability of 50%.

dfree <- 30 / 365
pfree <- 0.5

simdata <- simrec(
  N, fu.min, fu.max, cens.prob, dist.x, par.x, beta.x, dist.z, par.z,
  dist.rec, par.rec, pfree, dfree
)
# print(simdata)  # only run for small N!

simrec documentation built on Sept. 8, 2023, 6:18 p.m.