flexsurvspline: Flexible survival regression using the Royston/Parmar spline...
In flexsurv: Flexible parametric survival models


Flexible parametric modelling of time-to-event data using the spline model of Royston and Parmar (2002).

flexsurvspline(formula, data, k=0, knots=NULL, scale="hazard", weights, subset, na.action, inits=NULL, fixedpars=NULL, cl=0.95, ...)

`formula`	A formula expression in conventional R linear modelling syntax. The response must be a survival object as returned by the `Surv` function, and any covariates are given on the right-hand side. For example, `Surv(time, dead) ~ age + sex`. If there are no covariates, specify `1` on the right-hand side, for example `Surv(time, dead) ~ 1`. Only `Surv` objects of `type="right"` or `type="counting"`, corresponding to right-censored and/or left-truncated observations, are supported.
`data`	A data frame in which to find variables supplied in `formula`. If not given, the variables should be in the working environment.
`k`	Number of knots in the spline. The default `k=0` gives a Weibull, log-logistic or lognormal model, if `scale` is `"hazard"`, `"odds"` or `"normal"` respectively. `k` is equivalent to `df-1` in the notation of `stpm` for Stata. The knots are chosen as equally-spaced quantiles of the log uncensored survival times, for example, at the median with one knot, or at the 33% and 67% quantiles of log time with two knots. To override this default knot placement, specify `knots` instead.
`knots`	Locations of knots on the axis of log time. If not specified, knot locations are chosen as described in `k` above. Either `k` or `knots` must be specified. If both are specified, `knots` overrides `k`.
`scale`	If `"hazard"`, the log cumulative hazard is modelled as a spline function of log time. If `"odds"`, the log cumulative odds is modelled as a spline function of log time. If `"normal"`, -InvPhi(S(t)) is modelled as a spline function of log time, where InvPhi() is the inverse normal distribution function `qnorm`.
`weights`	Optional vector of case weights.
`subset`	Vector of integer or logicals specifying the subset of the observations to be used in the fit.
`na.action`	A missing-data filter function, applied after any `subset` argument has been used. Default is `options()$na.action`.
`inits`	A numeric vector giving initial values for each unknown parameter. If not specified, default initial values are chosen by estimating the baseline survival at each observed death time from the equivalent Cox model, transforming to the log cumulative hazard log(H) (or the equivalent under the odds or normal models), then performing a linear regression of log(H) on the spline basis and covariates.
`fixedpars`	Vector of indices of parameters whose values will be fixed at their initial values during optimisation. The indices are ordered with the intercept `"gamma0"` first, then the remaining spline coefficients `"gamma1","gamma2"...` followed by covariate effects.
`cl`	Width of symmetric confidence intervals for maximum likelihood estimates, by default 0.95.
`...`	Optional arguments to the general-purpose R optimisation routine `optim`. See `flexsurvreg` for examples.
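
The default knot placement described under `k` can be sketched in base R. This is a minimal illustration, not the package's internal code: `default_knots` is a hypothetical helper, and the toy `time` and `status` vectors are invented for the example.

```r
# Hypothetical helper (not part of flexsurv): internal knots at equally
# spaced quantiles of the log uncensored event times.
default_knots <- function(time, status, k) {
  if (k == 0) return(numeric(0))        # no internal knots
  logt  <- log(time[status == 1])       # uncensored observations only
  probs <- seq_len(k) / (k + 1)         # 1/2 for k = 1; 1/3 and 2/3 for k = 2
  unname(quantile(logt, probs))
}

# Toy data, invented for illustration (status 1 = event, 0 = censored)
time   <- c(1, 2, 4, 8, 16, 32, 5)
status <- c(1, 1, 1, 1, 1, 0, 0)

knots1 <- default_knots(time, status, k = 1)  # median of the log event times
knots2 <- default_knots(time, status, k = 2)  # 33% and 67% quantiles
```

Values computed this way could be adjusted and passed as `knots` to override the default placement.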

In the spline-based survival model of Royston and Parmar (2002), a transformation g(S(t,z)) of the survival function is modelled as a natural cubic spline function of log time x = log(t) plus linear effects of covariates z.

g(S(t,z)) = s(x, gamma) + beta^T z

The proportional hazards model (scale="hazard") defines g(S(t,z)) = log(-log(S(t,z))) = log(H(t,z)), the log cumulative hazard.

The proportional odds model (scale="odds") defines g(S(t,z)) = log(1/S(t,z) - 1), the log cumulative odds.

The probit model (scale="normal") defines g(S(t,z)) = -InvPhi(S(t,z)), where InvPhi() is the inverse normal distribution function qnorm.

With no knots, the spline reduces to a linear function, and these models are equivalent to Weibull, log-logistic and lognormal models respectively.
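
The equivalence with no knots can be checked numerically in base R: for each distribution, the transformed survival function should have a constant slope in log time. The parameter values below are arbitrary choices for illustration, and the log-logistic survival function is written out by hand because base R has no log-logistic distribution functions.

```r
# Arbitrary illustrative time points and parameter values
t <- c(0.5, 1, 2, 4, 8)
x <- log(t)
slopes <- function(g) diff(g) / diff(x)   # slope between successive log times

# Weibull, scale = "hazard": g(S) = log(-log(S))
S_weib <- pweibull(t, shape = 1.5, scale = 2, lower.tail = FALSE)
slope_hazard <- slopes(log(-log(S_weib)))

# Log-logistic, scale = "odds": g(S) = log(1/S - 1), S(t) = 1 / (1 + (t/b)^a)
a <- 2; b <- 3
S_llog <- 1 / (1 + (t / b)^a)
slope_odds <- slopes(log(1 / S_llog - 1))

# Log-normal, scale = "normal": g(S) = -qnorm(S)
S_lnorm <- plnorm(t, meanlog = 0.3, sdlog = 0.8, lower.tail = FALSE)
slope_normal <- slopes(-qnorm(S_lnorm))

# Each row is constant, so each g(S(t)) is linear in log(t)
rbind(slope_hazard, slope_odds, slope_normal)
```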

Natural cubic splines are cubic splines constrained to be linear beyond boundary knots kmin,kmax. The spline function is defined as

s(x,gamma) = gamma0 + gamma1 x + gamma2 v1(x) + ... + gamma_{m+1} vm(x)

where vj(x) is the jth basis function

vj(x) = (x - kj)^3_+ - λ_j(x - kmin)^3_+ - (1 -λ_j) (x - kmax)^3_+

λ_j = (kmax - kj) / (kmax - kmin)

and (x - a)_+ = max(0, x - a).
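
The basis functions above translate directly into base R. The following is a sketch written from the formulas in this section, not the code used inside flexsurv; `pos_cube` and `spline_basis` are hypothetical names, and the knot values in the usage lines are arbitrary.

```r
# (u)_+^3: cube of the positive part
pos_cube <- function(u) pmax(u, 0)^3

# Hypothetical helper: design matrix with columns 1, x, v1(x), ..., vm(x).
# x: log times; knots: internal knots k1..km; kmin, kmax: boundary knots.
spline_basis <- function(x, knots, kmin, kmax) {
  if (length(knots) == 0) return(cbind(1, x))   # no knots: linear in x
  lambda <- (kmax - knots) / (kmax - kmin)
  v <- sapply(seq_along(knots), function(j) {
    pos_cube(x - knots[j]) -
      lambda[j] * pos_cube(x - kmin) -
      (1 - lambda[j]) * pos_cube(x - kmax)
  })
  cbind(1, x, matrix(v, nrow = length(x)))
}

# s(x, gamma) is the basis matrix times the coefficient vector
spline_value <- function(x, gamma, knots, kmin, kmax) {
  drop(spline_basis(x, knots, kmin, kmax) %*% gamma)
}

# Arbitrary example: two internal knots, evaluated beyond kmax
B_above <- spline_basis(c(3, 4, 5), knots = c(0.5, 1), kmin = 0, kmax = 2)
# Second differences of each basis column vanish: linear beyond kmax
diff(B_above[, 3:4], differences = 2)
```

Below `kmin` every term of vj(x) is zero, so only the linear part gamma0 + gamma1 x remains there.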

Parameters gamma,beta are estimated by maximum likelihood using the algorithms available in the standard R optim function. Confidence intervals are estimated from the Hessian at the maximum.
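
To illustrate the estimation step, here is a minimal sketch for the simplest case only: no knots, no covariates, hazard scale, and right-censored data. It is not the package's implementation. The simulated data, the censoring scheme, the starting values and the use of the Nelder-Mead method are all assumptions made for this example.

```r
set.seed(1)
n <- 500
t_event <- rweibull(n, shape = 1.5, scale = 2)  # arbitrary true values
t_cens  <- runif(n, 0, 6)                       # independent censoring times
time    <- pmin(t_event, t_cens)
status  <- as.numeric(t_event <= t_cens)        # 1 = event, 0 = censored

# Negative log-likelihood with log H(t) = gamma0 + gamma1 * log(t).
# The hazard is h(t) = dH/dt = H(t) * gamma1 / t, and each observation
# contributes status * log h(t) - H(t).
negloglik <- function(gamma) {
  if (gamma[2] <= 0) return(Inf)                # hazard must be positive
  x    <- log(time)
  logH <- gamma[1] + gamma[2] * x
  logh <- logH + log(gamma[2]) - x
  -sum(status * logh - exp(logH))
}

fit <- optim(c(0, 1), negloglik, hessian = TRUE)  # Nelder-Mead by default

# Wald confidence intervals from the Hessian at the maximum
est <- fit$par
se  <- sqrt(diag(solve(fit$hessian)))
cl  <- 0.95
z   <- qnorm(1 - (1 - cl) / 2)
res <- cbind(est = est, lower = est - z * se, upper = est + z * se)
rownames(res) <- c("gamma0", "gamma1")
res
```

With these simulated data, the estimates should lie near the values implied by the true Weibull parameters, gamma0 = -1.5 log(2) and gamma1 = 1.5.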

A list of class "flexsurvreg" with the following elements.

`call`	A copy of the function call, for use in post-processing.
`k`	Number of knots.
`knots`	Location of knots on the log time axis.
`res`	Matrix of maximum likelihood estimates and confidence limits. Spline coefficients are labelled `"gamma..."`, and covariate effects are labelled with the names of the covariates. Coefficients `gamma1,gamma2,...` here are the equivalent of `s0,s1,...` in Stata `streg`, and `gamma0` is the equivalent of the `xb` constant term. To reproduce results, use the `noorthog` option in Stata, since no orthogonalisation is performed on the spline basis here. In the Weibull model, for example, `gamma0,gamma1` are `-shape log(scale), shape` respectively in `dweibull` or `flexsurvreg` notation, or (`-Intercept/scale`, `1/scale`) in `survreg` notation. In the log-logistic model with shape `a` and scale `b` (as in `dllogis` from the eha package), `1/b^a` is equivalent to `exp(gamma0)`, and `a` is equivalent to `gamma1`. In the log-normal model with log-scale mean `mu` and standard deviation `sigma`, `-mu/sigma` is equivalent to `gamma0` and `1/sigma` is equivalent to `gamma1`.
`loglik`	The maximised log-likelihood. This will differ from Stata, where the sum of the log uncensored survival times is added to the log-likelihood in survival models, to remove dependency on the time scale.
`AIC`	Akaike's information criterion (-2 * log likelihood + 2 * number of estimated parameters).
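
The parameter correspondences listed under `res` can be written as small conversion functions. These helpers are hypothetical (they are not exported by flexsurv) and simply invert the relationships stated above for models with no knots.

```r
# Hypothetical helper: hazard scale, k = 0, to dweibull notation.
# gamma0 = -shape * log(scale), gamma1 = shape
gamma_to_weibull <- function(gamma0, gamma1) {
  c(shape = gamma1, scale = exp(-gamma0 / gamma1))
}

# Hypothetical helper: hazard scale, k = 0, to survreg notation.
# gamma0 = -Intercept / scale, gamma1 = 1 / scale
gamma_to_survreg <- function(gamma0, gamma1) {
  c(Intercept = -gamma0 / gamma1, scale = 1 / gamma1)
}

# Hypothetical helper: normal scale, k = 0, to log-normal mean and sd.
# gamma0 = -mu / sigma, gamma1 = 1 / sigma
gamma_to_lnorm <- function(gamma0, gamma1) {
  c(meanlog = -gamma0 / gamma1, sdlog = 1 / gamma1)
}

# Round trip with arbitrary values: shape 1.5 and scale 2
w <- gamma_to_weibull(gamma0 = -1.5 * log(2), gamma1 = 1.5)
w
```

The log-logistic case has the same form as the Weibull one: shape `a = gamma1` and scale `b = exp(-gamma0/gamma1)`, since `exp(gamma0) = 1/b^a`.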

Christopher Jackson <chris.jackson@mrc-bsu.cam.ac.uk>

Royston, P. and Parmar, M. (2002). Flexible parametric proportional-hazards and proportional-odds models for censored survival data, with application to prognostic modelling and estimation of treatment effects. Statistics in Medicine 21(15):2175-2197.

flexsurvreg for flexible survival modelling using fully parametric distributions including the generalized F and gamma.

plot.flexsurvreg and lines.flexsurvreg to plot fitted survival, hazards and cumulative hazards from models fitted by flexsurvspline and flexsurvreg.

library(flexsurv)
data(bc)
bc$recyrs <- bc$rectime/365

## Best-fitting model to breast cancer data from Royston and Parmar (2002)
## One internal knot (2 df) and cumulative odds scale
spl <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=1, scale="odds")

## Fitted survival
plot(spl, ci=TRUE, lwd=3, lwd.ci=1, col.ci="gray")

## Simple Weibull model fits much less well
splw <- flexsurvspline(Surv(recyrs, censrec) ~ group, data=bc, k=0, scale="hazard")
lines(splw, col="blue")

## Alternative way of fitting the Weibull
splw2 <- flexsurvreg(Surv(recyrs, censrec) ~ group, data=bc, dist="weibull")