Description Usage Arguments Details Value Custom distributions Author(s) References See Also Examples
Parametric regression for time-to-event data using the generalized F and other flexible distributions. Users may easily extend this function with their own survival distributions.
1 2 |
formula |
A formula expression in conventional R linear modelling
syntax. The response must be a survival object as returned by the
Only If there are no covariates, specify By default, covariates are placed on the “location” parameter of the distribution, typically the "scale" or "rate" parameter, through a linear model, or a log-linear model if this parameter must be positive. This gives an accelerated failure time model or a proportional hazards model, depending on the distribution. Covariates can be placed on other parameters by using the name of
the parameter as a function in the formula. For example, in a
Weibull model, the following expresses the scale parameter in terms
of age and a treatment variable
| |||||||||||||||||||
data |
A data frame in which to find variables supplied in | |||||||||||||||||||
weights |
Optional vector of case weights. | |||||||||||||||||||
subset |
Vector of integer or logicals specifying the subset of the observations to be used in the fit. | |||||||||||||||||||
na.action |
a missing-data filter function, applied after any
'subset' argument has been used. Default is | |||||||||||||||||||
dist |
Either one of the following strings identifying a built-in distribution:
or a list specifying a custom distribution. See section “Custom distributions” below for how to construct this list. The parameterisations of the built-in distributions used here are
the same as in their built-in distribution functions:
Note that the Weibull parameterisation is different from that
in Similarly in the exponential distribution, the rate, rather than the mean, is modelled on covariates. | |||||||||||||||||||
inits |
A numeric vector giving initial values for each unknown parameter.
If not specified, default initial values are chosen from a simple
summary of the uncensored survival time, for example the mean
is often used to initialize scale parameters. See the object
| |||||||||||||||||||
fixedpars |
Vector of indices of parameters whose values will be
fixed at their initial values during the optimisation. The indices
are ordered with parameters of the baseline distribution coming
first, followed by covariate effects. For example, in a stable
generalized Gamma model with two covariates, to fix the third
of three generalized gamma parameters (the shape | |||||||||||||||||||
cl |
Width of symmetric confidence intervals for maximum likelihood estimates, by default 0.95. | |||||||||||||||||||
... |
Optional arguments to the general-purpose R
optimisation routine |
Parameters are estimated by maximum likelihood using the
algorithms available in the standard R optim
function.
Parameters defined to be positive are estimated on the log scale.
Confidence intervals are estimated from the Hessian at the maximum,
and transformed back to the original scale of the parameters.
The usage of flexsurvreg
is intended to be as similar as possible to
survreg
in the survival package.
A list of class "flexsurvreg"
with the following elements.
call |
A copy of the function call, for use in post-processing. |
dlist |
List defining the survival distribution used. |
res |
Matrix of maximum likelihood estimates and confidence limits, with parameters on their natural scales. |
res.t |
Matrix of maximum likelihood estimates and confidence
limits, with parameters all transformed to the real line. The |
loglik |
Log-likelihood |
AIC |
Akaike's information criterion (-2*log likelihood + 2*number of estimated parameters) |
flexsurvreg
is intended to be easy to extend to handle
new distributions. To define a new distribution for use in
flexsurvreg
, construct a list with the following
elements:
name
:A string naming the distribution.
If this is called "dist"
, for example, then there must be a
function called ddist
in the working environment which
defines the probability density,
and a function called pdist
which defines the probability
distribution or cumulative density. These functions may be in an add-on
package (see below for an example) or may be user-written.
Arguments other than parameters must be named in the conventional
way – for example x
for the first argument of the density
function, as in dnorm(x, ...)
and q
for the first
argument of the probability function.
pars
:Vector of strings naming the parameters of the distribution. These must be the same names as the arguments of the density and probability functions.
location
:Name of the parameter which can be modelled as a linear function of covariates, possibly after transformation.
transforms
:Vector of R functions which transform the range of
values taken by each parameter onto the real line. For example,
c(log, log)
for a distribution with two positive parameters.
inv.transforms
:Vector of R functions defining the corresponding inverse transformations.
inits
:A function of the uncensored survival times t
,
which returns a vector of reasonable initial values for maximum
likelihood estimation of each parameter. For example,
function(t){ c(1, mean(t)) }
will always initialize the first
of two parameters at 1, and the second (a scale
parameter, for instance) at the mean of t
.
For example, suppose we want to use a log-logistic survival
distribution. This is available in the CRAN package eha, which
provides conventionally-defined density and probability functions called
dllogis
and pllogis
. See the
Examples below for the custom list in this case, and the
subsequent command to fit the model.
Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>
Jackson, C. H. and Sharples, L. D. and Thompson, S. G. (2010) Survival models in health economic evaluations: balancing fit and parsimony to improve prediction. International Journal of Biostatistics 6(1):Article 34.
Cox, C. (2008) The generalized F distribution: An umbrella for parametric survival analysis. Statistics in Medicine 27:4301-4312.
Cox, C., Chu, H., Schneider, M. F. and Muñoz, A. (2007) Parametric survival analysis and taxonomy of hazard functions for the generalized gamma distribution. Statistics in Medicine 26:4252-4374
flexsurvspline
for flexible survival modelling using the
spline model of Royston and Parmar.
plot.flexsurvreg
and lines.flexsurvreg
to
plot fitted survival, hazards and cumulative hazards from models fitted
by flexsurvreg
and flexsurvspline
.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 | data(ovarian)
## Compare generalized gamma fit with Weibull
fitg <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="gengamma")
fitg
fitw <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist="weibull")
fitw
plot(fitg)
lines(fitw, col="blue", lwd.ci=1, lty.ci=1)
## Identical AIC, probably not enough data in this simple example for a
## very flexible model to be worthwhile.
## Custom distribution
library(eha) ## make "dllogis" and "pllogis" available to the working environment
custom.llogis <- list(name="llogis",
pars=c("shape","scale"),
location="scale",
transforms=c(log, log),
inv.transforms=c(exp, exp),
inits=function(t){ c(1, median(t)) })
fitl <- flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian, dist=custom.llogis)
fitl
lines(fitl, col.fit="purple", col.ci="purple")
|
Loading required package: survival
Call:
flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian,
dist = "gengamma")
Estimates:
est L95% U95% se
mu 6.426 4.984 7.868 0.736
sigma 1.426 0.888 2.292 0.345
Q -0.766 -3.340 1.807 1.313
N = 26, Events: 12, Censored: 14
Total time at risk: 15588
Log-likelihood = -96.94907, df = 3
AIC = 199.8981
Call:
flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian,
dist = "weibull")
Estimates:
est L95% U95% se
shape 1.108 0.674 1.822 0.281
scale 1225.419 690.421 2174.979 358.714
N = 26, Events: 12, Censored: 14
Total time at risk: 15588
Log-likelihood = -97.9539, df = 2
AIC = 199.9078
Attaching package: 'eha'
The following objects are masked from 'package:flexsurv':
Hgompertz, Hllogis, Hlnorm, Hweibull, dgompertz, dllogis,
hgompertz, hllogis, hlnorm, hweibull, pgompertz, pllogis,
qgompertz, qllogis, rgompertz, rllogis
Call:
flexsurvreg(formula = Surv(futime, fustat) ~ 1, data = ovarian,
dist = custom.llogis)
Estimates:
est L95% U95% se
shape 1.376 0.842 2.249 0.345
scale 837.753 470.310 1492.271 246.770
N = 26, Events: 12, Censored: 14
Total time at risk: 15588
Log-likelihood = -97.3547, df = 2
AIC = 198.7094
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.