Description Usage Arguments Details Value Note Author(s) References Examples
Simulate survival times from standard parametric survival distributions, 2component mixture distributions, or a userdefined hazard or log hazard function.
1 2 3 4 5 6  simsurv(dist = c("weibull", "exponential", "gompertz"), lambdas, gammas, x,
betas, tde, tdefunction = NULL, mixture = FALSE, pmix = 0.5, hazard,
loghazard, cumhazard, logcumhazard, idvar = NULL, ids = NULL,
nodes = 15, maxt = NULL, interval = c(1e08, 500),
rootsolver = c("uniroot", "dfsane"), rootfun = log,
seed = sample.int(.Machine$integer.max, 1), ...)

dist 
Character string specifying the parametric survival distribution.
Can be 
lambdas 
A numeric vector corresponding to the scale parameters for
the exponential, Weibull or Gompertz distributions. This vector should be
length one for any of the standard parametric distributions, or length two
for a 2component mixture distribution when 
gammas 
A numeric vector corresponding to the shape parameters for
the Weibull or Gompertz distributions. This vector should be length
one for the standard Weibull or Gompertz distributions, or length two
for a 2component mixture distribution when 
x 
A data frame containing the covariate values for each individual.
Each row of the data frame should supply covariate data for one individual.
The column names should correspond to the named elements in 
betas 
A named vector, a data frame, or a list of data frames,
containing the "true" parameters (e.g. log hazard ratios).
If a standard exponential, Weibull, or Gompertz distribution, or a 2component
mixture distribution is being used then 
tde 
A named vector, containing the "true" parameters that will be used
to create time dependent effects (i.e. nonproportional hazards). The
values specified in 
tdefunction 
An optional function of time to which covariates specified
in 
mixture 
Logical specifying whether to use a 2component mixture
model for the survival distribution. If 
pmix 
Scalar between 0 and 1 defining the mixing parameter when

hazard 
Optionally, a userdefined hazard function, with arguments

loghazard 
Optionally, a userdefined log hazard function, with arguments

cumhazard 
Optionally, a userdefined cumulative hazard function, with
arguments 
logcumhazard 
Optionally, a userdefined log cumulative hazard function, with
arguments 
idvar 
The name of the ID variable identifying individuals. This is
only required when 
ids 
A vector containing the unique values of 
nodes 
Integer specifying the number of quadrature nodes to use for the GaussKronrod quadrature. Can be 7, 11, or 15. 
maxt 
The maximum event time. For simulated event times greater than

interval 
The interval over which to search for the

rootsolver 
Character string specifying the function to use for
univariate root finding when required. This can currently be

rootfun 
A function to apply to each side of the root finding equation
when numerical root finding is used to solve for the simulated event time.
An appropriate function helps to improve numerical stability. The default
is to use a log transformation; that is, to solve log(S(T))  log(U) = 0
where S(T) is the survival probability at the event time and
U is a uniform random variate. Suitable alternatives might be to
specify 
seed 
The 
... 
Other arguments passed to 
The simsurv
function simulates survival times from
standard parametric survival distributions (exponential, Weibull, Gompertz),
2component mixture distributions, or a userdefined hazard or log hazard function.
Baseline covariates can be included under a proportional hazards assumption.
Time dependent effects (i.e. nonproportional hazards) can be included by
interacting covariates with time (by specifying them in the tde
argument); the default behaviour is to interact the covariates with linear
time, however, they can be interacted with some other function of time simply
by using the tdefunction
argument.
Under the 2component mixture distributions (obtained by setting
mixture = TRUE
) the baseline survival at time t is taken to be
S(t) = p * S_1(t) + (1  p) * S_2(t)
where S_1(t) and S_2(t) are the baseline survival under each
component of the mixture distribution and p is the mixing parameter
specified via the argument pmix
. Each component of the mixture
distribution is assumed to be either exponential, Weibull or Gompertz.
The 2component mixture distributions can allow for a variety of flexible
baseline hazard functions (see Crowther and Lambert (2013) for some examples).
If the user wishes to provide a userdefined hazard or log hazard function
(instead of using one of the standard parametric survival distributions) then
this is also possible via the hazard
or loghazard
argument.
If a userdefined hazard or log hazard function is specified, then this is
allowed to be timedependent, and the resulting cumulative hazard function
does not need to have a closedform solution. The survival times are
generated using the approach described in Crowther and Lambert (2013),
whereby the cumulative hazard is evaluated using numerical quadrature and
survival times are generated using an iterative algorithm which nests the
quadraturebased evaluation of the cumulative hazard inside Brent's (1973)
univariate root finder (for the latter the uniroot
function is used). Not requiring a closed form solution to the cumulative
hazard function has the benefit that survival times can be generated for
complex models such as joint longitudinal and survival models; the
Examples section provides an example of this.
For the exponential distribution, with scale parameter
λ > 0, the baseline hazard and survival
functions used by simsurv
are:
h(t) = λ and
S(t) = \exp(λ).
Our parameterisation is equivalent to the one used by Wikipedia, the
dexp
function, the eha package, and the flexsurv
package, except what we call the scale parameter
they call the rate parameter.
For the Weibull distribution, with shape parameter γ > 0
and scale parameter λ > 0, the baseline
hazard and survival functions used by simsurv
are:
h(t) = γ λ t ^ {γ  1} and
S(t) = \exp(λ t ^ {γ}). Setting γ equal
to 1 leads to the exponential distribution as a special case.
Our parameterisation differs from the one used by Wikipedia,
dweibull
, the phreg
modelling
function in the eha package, and the
flexsurvreg
modelling function in the
flexsurv package. The parameterisation used in those
functions can be achieved by transforming the scale parameter via the
relationship b = λ ^ {\frac{1}{γ}}, or equivalently
λ = b ^ {γ} where b is the scale parameter under
their parameterisation of the Weibull distribution.
For the Gompertz distribution, with and shape parameter γ > 0
and scale parameter λ > 0, the baseline
hazard and survival functions used by simsurv
are:
h(t) = λ \exp(γ t) and
S(t) = \exp(\frac{λ (\exp(γ t)  1)}{γ}).
Setting γ equal to 0 leads to the exponential distribution
as a special case.
Our parameterisation is equivalent to the one used by the
dgompertz
and flexsurvreg
functions in the flexsurv package, except they use slightly different
terminology. Their parameterisation can be achieved via the relationship
a = γ and b = λ where a and b are their
shape and rate parameters, respectively.
Our parameterisation differs from the one used in the
dgompertz
and phreg
functions
in the eha package. Their parameterisation can be achieved via
the relationship a = λ and b = \frac{1}{γ} where
a and b are their shape and scale parameters, respectively.
Our parameterisation differs from the one used by Wikipedia. Their parameterisation can be achieved via the relationship a = \frac{λ}{γ} and b = γ where a and b are their shape and scale parameters, respectively.
A data frame with a row for each individual, and the following three columns:
id
The individual identifier
eventtime
The simulated event (or censoring) time
status
The event indicator, 1 for failure, 0 for censored
This package is modelled on the userwritten survsim
package
available in the Stata software (see Crowther and Lambert (2012)).
Sam Brilleman ([email protected])
Crowther MJ, and Lambert PC. (2013) Simulating biologically plausible complex survival data. Statistics in Medicine 32, 4118–4134. doi: 10.1002/sim.5823
Bender R, Augustin T, and Blettner M. (2005) Generating survival times to simulate Cox proportional hazards models. Statistics in Medicine 24(11), 1713–1723.
Brent R. (1973) Algorithms for Minimization without Derivatives. Englewood Cliffs, NJ: PrenticeHall.
Crowther MJ, and Lambert PC. (2012) Simulating complex survival data. The Stata Journal 12(4), 674–687. http://www.statajournal.com/sjpdf.html?articlenum=st0275
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87  # Simpler examples
# Generate times from a Weibull model including a binary
# treatment variable, with log(hazard ratio) = 0.5, and censoring
# after 5 years:
set.seed(9911)
covs < data.frame(id = 1:100, trt = stats::rbinom(100, 1L, 0.5))
s1 < simsurv(lambdas = 0.1, gammas = 1.5, betas = c(trt = 0.5),
x = covs, maxt = 5)
head(s1)
# Generate times from a Gompertz model:
s2 < simsurv(dist = "gompertz", lambdas = 0.1, gammas = 0.05, x = covs)
# Generate times from a 2component mixture Weibull model:
s3 < simsurv(lambdas = c(0.1, 0.05), gammas = c(1, 1.5),
mixture = TRUE, pmix = 0.5, x = covs, maxt = 5)
# Generate times from userdefined log hazard function:
fn < function(t, x, betas, ...)
(1 + 0.02 * t  0.03 * t ^ 2 + 0.005 * t ^ 3)
s4 < simsurv(loghazard = fn, x = covs, maxt = 1.5)
# Generate times from a Weibull model with diminishing treatment effect:
s5 < simsurv(lambdas = 0.1, gammas = 1.5, betas = c(trt = 0.5),
x = covs, tde = c(trt = 0.05), tdefunction = "log")
# Complex examples
# Here we present an example of simulating survival times
# based on a joint longitudinal and survival model
# First we define the hazard function to pass to simsurv
# (NB this is a Weibull proportional hazards regression submodel
# from a joint longitudinal and survival model with a "current
# value" association structure).
haz < function(t, x, betas, ...) {
betas[["shape"]] * (t ^ (betas[["shape"]]  1)) * exp(
betas[["betaEvent_intercept"]] +
betas[["betaEvent_binary"]] * x[["Z1"]] +
betas[["betaEvent_continuous"]] * x[["Z2"]] +
betas[["betaEvent_assoc"]] * (
betas[["betaLong_intercept"]] +
betas[["betaLong_slope"]] * t +
betas[["betaLong_binary"]] * x[["Z1"]] +
betas[["betaLong_continuous"]] * x[["Z2"]]
)
)
}
# Then we construct data frames with the true parameter
# values and the covariate data for each individual
set.seed(5454) # set seed before simulating data
N < 20 # number of individuals
# Population (fixed effect) parameters
betas < data.frame(
shape = rep(2, N),
betaEvent_intercept = rep(11.9,N),
betaEvent_binary = rep(0.6, N),
betaEvent_continuous = rep(0.08, N),
betaEvent_assoc = rep(0.03, N),
betaLong_binary = rep(1.5, N),
betaLong_continuous = rep(1, N),
betaLong_intercept = rep(90, N),
betaLong_slope = rep(2.5, N)
)
# Individualspecific (random effect) parameters
b_corrmat < matrix(c(1, 0.5, 0.5, 1), 2, 2)
b_sds < c(20, 3)
b_means < rep(0, 2)
b_z < MASS::mvrnorm(n = N, mu = b_means, Sigma = b_corrmat)
b < sapply(1:length(b_sds), function(x) b_sds[x] * b_z[,x])
betas$betaLong_intercept < betas$betaLong_intercept + b[,1]
betas$betaLong_slope < betas$betaLong_slope + b[,2]
# Covariate data
covdat < data.frame(
Z1 = stats::rbinom(N, 1, 0.45), # a binary covariate
Z2 = stats::rnorm(N, 44, 8.5) # a continuous covariate
)
# Then we simulate the survival times based on the
# hazard function, covariates, and true parameter values
times < simsurv(hazard = haz, x = covdat, betas = betas, maxt = 10)
head(times)

Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.