Parametric regression for time-to-event data using the generalized F and other flexible distributions. Users may easily extend this function with their own survival distributions.


`formula` |
A formula expression in conventional R linear modelling
syntax. The response must be a survival object as returned by the
`Surv` function. If there are no covariates, specify `1` on the right-hand side. By default, covariates are placed on the “location” parameter of the distribution, typically the "scale" or "rate" parameter, through a linear model, or a log-linear model if this parameter must be positive. This gives an accelerated failure time model or a proportional hazards model, depending on the distribution. Covariates can be placed on other parameters by using the name of
the parameter as a function in the formula. For example, in a
Weibull model, a formula can express the scale parameter in terms
of age and a treatment variable. |

`data` |
A data frame in which to find variables supplied in `formula`. |

`weights` |
Optional vector of case weights. |

`subset` |
Vector of integers or logicals specifying the subset of the observations to be used in the fit. |

`na.action` |
A missing-data filter function, applied after any
`subset` argument has been used. Default is `options()$na.action`. |

`dist` |
Either a string identifying a built-in distribution, such as `"gengamma"` or `"weibull"`,
or a list specifying a custom distribution. See section “Custom distributions” below for how to construct this list. The parameterisations of the built-in distributions used here are
the same as in the corresponding R distribution functions.
Note that the Weibull parameterisation is different from that
in `survreg`. Similarly in the exponential distribution, the rate, rather than the mean, is modelled on covariates. |

`inits` |
A numeric vector giving initial values for each unknown parameter.
If not specified, default initial values are chosen from a simple
summary of the uncensored survival times. For example, the mean
is often used to initialize scale parameters. |

`fixedpars` |
Vector of indices of parameters whose values will be
fixed at their initial values during the optimisation. The indices
are ordered with parameters of the baseline distribution coming
first, followed by covariate effects. For example, in a stable
generalized gamma model with two covariates, the three generalized
gamma parameters have indices 1 to 3 and the covariate effects have
indices 4 and 5, so the third generalized gamma parameter (a shape
parameter) is fixed by including 3 in this vector. |

`cl` |
Width of symmetric confidence intervals for maximum likelihood estimates, by default 0.95. |

`...` |
Optional arguments to the general-purpose R optimisation function `optim`. |
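The `formula` and `fixedpars` conventions described above can be sketched in base R without fitting anything. The variable names `time`, `dead`, `age`, `treat` and `sex` and the placeholder parameter names are hypothetical, and `shape(sex)` simply follows the description of using a parameter name as a function in the formula. The formula is only constructed and inspected here, not passed to `flexsurvreg`.

```r
# Hypothetical variable names; the formula is built and inspected, not fitted.
# Terms outside any wrapper would go on the default "location" parameter;
# shape(sex) follows the documented syntax for placing sex on the shape parameter.
f <- Surv(time, dead) ~ age + treat + shape(sex)
term_labels <- attr(terms(f), "term.labels")
print(term_labels)

# fixedpars indexing: baseline parameters come first, then covariate effects.
# Placeholder names for a three-parameter distribution with two covariates.
par_names <- c("par1", "par2", "par3", "beta1", "beta2")
fixedpars <- 3                 # index of the third baseline parameter
print(par_names[fixedpars])
```

Listing the parameters in this order before choosing `fixedpars` is a simple way to avoid fixing the wrong index.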

Parameters are estimated by maximum likelihood using the
algorithms available in the standard R `optim` function.
Parameters defined to be positive are estimated on the log scale.
Confidence intervals are estimated from the Hessian at the maximum,
and transformed back to the original scale of the parameters.
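The estimation approach described above can be illustrated with a minimal base R sketch. It is not the code used inside `flexsurvreg`: it assumes simulated right-censored Weibull data and a hand-written likelihood, and all object names are illustrative. Both positive parameters are optimised on the log scale, standard errors come from the Hessian returned by `optim`, and the interval limits are exponentiated back to the natural scale.

```r
set.seed(1)
n <- 200
t_event <- rweibull(n, shape = 1.5, scale = 10)  # simulated event times
t_cens  <- runif(n, 5, 30)                       # simulated censoring times
time    <- pmin(t_event, t_cens)
status  <- as.numeric(t_event <= t_cens)         # 1 = event, 0 = right-censored

# Negative log-likelihood, with both positive parameters on the log scale
negloglik <- function(theta) {
  shape <- exp(theta[1])
  scale <- exp(theta[2])
  ll_event <- dweibull(time, shape, scale, log = TRUE)
  ll_cens  <- pweibull(time, shape, scale, lower.tail = FALSE, log.p = TRUE)
  -sum(ifelse(status == 1, ll_event, ll_cens))
}

opt <- optim(c(0, log(mean(time))), negloglik, method = "BFGS", hessian = TRUE)

cl  <- 0.95
se  <- sqrt(diag(solve(opt$hessian)))            # standard errors on the log scale
z   <- qnorm(1 - (1 - cl) / 2)
res <- exp(cbind(est   = opt$par,
                 lower = opt$par - z * se,
                 upper = opt$par + z * se))      # back-transformed to natural scale
rownames(res) <- c("shape", "scale")
print(res)
```

Because the intervals are computed on the log scale and then exponentiated, they are asymmetric around the estimate on the natural scale and can never include negative values.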

The usage of `flexsurvreg` is intended to be as similar as possible to
`survreg` in the survival package.

A list of class `"flexsurvreg"` with the following elements.

`call` |
A copy of the function call, for use in post-processing. |

`dlist` |
List defining the survival distribution used. |

`res` |
Matrix of maximum likelihood estimates and confidence limits, with parameters on their natural scales. |

`res.t` |
Matrix of maximum likelihood estimates and confidence
limits, with parameters all transformed to the real line. |

`loglik` |
Log-likelihood |

`AIC` |
Akaike's information criterion (-2*log likelihood + 2*number of estimated parameters) |
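As a small worked instance of the AIC formula above, using hypothetical values for the log-likelihood and the number of estimated parameters (neither comes from a real fit):

```r
loglik <- -97.5   # hypothetical maximised log-likelihood
npars  <- 2       # hypothetical number of estimated parameters
aic <- -2 * loglik + 2 * npars
print(aic)
```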

`flexsurvreg` is intended to be easy to extend to handle
new distributions. To define a new distribution for use in
`flexsurvreg`, construct a list with the following
elements:

`name`

: A string naming the distribution. If this is called `"dist"`, for example, then there must be a function called `ddist` in the working environment which defines the probability density, and a function called `pdist` which defines the cumulative distribution function. These functions may be in an add-on package (see below for an example) or may be user-written. Arguments other than parameters must be named in the conventional way: for example, `x` for the first argument of the density function, as in `dnorm(x, ...)`, and `q` for the first argument of the probability function.

`pars`

: Vector of strings naming the parameters of the distribution. These must be the same names as the arguments of the density and probability functions.

`location`

: Name of the parameter which can be modelled as a linear function of covariates, possibly after transformation.

`transforms`

: Vector of R functions which transform the range of values taken by each parameter onto the real line. For example, `c(log, log)` for a distribution with two positive parameters.

`inv.transforms`

: Vector of R functions defining the corresponding inverse transformations.

`inits`

: A function of the uncensored survival times `t`, which returns a vector of reasonable initial values for maximum likelihood estimation of each parameter. For example, `function(t){ c(1, mean(t)) }` will always initialize the first of two parameters at 1, and the second (a scale parameter, for instance) at the mean of `t`.
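The list elements above can be sketched for a user-written distribution. This is an illustration under assumptions, not code from the package: `dmyllogis` and `pmyllogis` are hypothetical hand-written log-logistic functions (named to avoid clashing with any package), the `log` argument is an extra convenience, and whether a given version of `flexsurvreg` needs further arguments has not been checked. The list is built and sanity-checked in base R only, not passed to `flexsurvreg`.

```r
# Hypothetical user-written log-logistic density; first argument named x
dmyllogis <- function(x, shape, scale, log = FALSE) {
  z <- (x / scale)^shape
  logd <- log(shape) - log(scale) +
    (shape - 1) * (log(x) - log(scale)) - 2 * log1p(z)
  if (log) logd else exp(logd)
}

# Matching cumulative distribution function; first argument named q
pmyllogis <- function(q, shape, scale) {
  1 / (1 + (q / scale)^(-shape))
}

# Custom distribution list following the elements described above
custom.myllogis <- list(name = "myllogis",
                        pars = c("shape", "scale"),
                        location = "scale",
                        transforms = c(log, log),
                        inv.transforms = c(exp, exp),
                        inits = function(t){ c(1, median(t)) })

# Sanity checks before using the list in a model fit
total_mass  <- integrate(dmyllogis, 0, Inf, shape = 2, scale = 3)$value
median_prob <- pmyllogis(3, shape = 2, scale = 3)   # scale is the median
start_vals  <- custom.myllogis$inits(c(2, 5, 11))
round_trip  <- custom.myllogis$inv.transforms[[1]](custom.myllogis$transforms[[1]](2))
print(c(total_mass, median_prob, start_vals, round_trip))
```

Checking that the density integrates to one and that each inverse transform undoes its transform catches the most common mistakes in a custom distribution before any optimisation is attempted.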

For example, suppose we want to use a log-logistic survival
distribution. This is available in the CRAN package eha, which
provides conventionally-defined density and probability functions called
`dllogis` and `pllogis`. See the
Examples below for the custom list in this case, and the
subsequent command to fit the model.

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Jackson, C. H., Sharples, L. D. and Thompson, S. G. (2010) Survival models in health economic evaluations: balancing fit and parsimony to improve prediction. International Journal of Biostatistics 6(1):Article 34.

Cox, C. (2008) The generalized *F* distribution: An umbrella for
parametric survival analysis. Statistics in Medicine 27:4301-4312.

Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007) Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Statistics in Medicine 26:4352-4374.

`flexsurvspline` for flexible survival modelling using the
spline model of Royston and Parmar.

`plot.flexsurvreg` and `lines.flexsurvreg` to
plot fitted survival, hazards and cumulative hazards from models fitted
by `flexsurvreg` and `flexsurvspline`.

```
## survival provides Surv() and the ovarian data; flexsurv provides flexsurvreg()
library(survival)
library(flexsurv)
data(ovarian)
## Compare generalized gamma fit with Weibull
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="gengamma")
fitg
fitw <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")
fitw
plot(fitg)
lines(fitw, col="blue", lwd.ci=1, lty.ci=1)
## Identical AIC, probably not enough data in this simple example for a
## very flexible model to be worthwhile.
## Custom distribution
library(eha) ## make "dllogis" and "pllogis" available to the working environment
custom.llogis <- list(name="llogis",
                      pars=c("shape","scale"),
                      location="scale",
                      transforms=c(log, log),
                      inv.transforms=c(exp, exp),
                      inits=function(t){ c(1, median(t)) })
fitl <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.llogis)
fitl
lines(fitl, col.fit="purple", col.ci="purple")
```
