cph | R Documentation |
Modification of Therneau's coxph
function to fit the Cox model and
its extension, the Andersen-Gill model. The latter allows for interval
time-dependent covariables, time-dependent strata, and repeated events.
The Survival
method for an object created by cph
returns an S
function for computing estimates of the survival function.
The Quantile
method for cph
returns an S function for computing
quantiles of survival time (median, by default).
The Mean
method returns a function for computing the mean survival
time. This function issues a warning if the last follow-up time is uncensored,
unless a restricted mean is explicitly requested.
cph(formula = formula(data), data=environment(formula),
weights, subset, na.action=na.delete,
method=c("efron","breslow","exact","model.frame","model.matrix"),
singular.ok=FALSE, robust=FALSE,
model=FALSE, x=FALSE, y=FALSE, se.fit=FALSE,
linear.predictors=TRUE, residuals=TRUE, nonames=FALSE,
eps=1e-4, init, iter.max=10, tol=1e-9, surv=FALSE, time.inc,
type=NULL, vartype=NULL, debug=FALSE, ...)
## S3 method for class 'cph'
Survival(object, ...)
# Evaluate result as g(times, lp, stratum=1, type=c("step","polygon"))
## S3 method for class 'cph'
Quantile(object, ...)
# Evaluate like h(q, lp, stratum=1, type=c("step","polygon"))
## S3 method for class 'cph'
Mean(object, method=c("exact","approximate"), type=c("step","polygon"),
n=75, tmax, ...)
# E.g. m(lp, stratum=1, type=c("step","polygon"), tmax, \dots)
formula |
an S formula object with a |
object |
an object created by |
data |
name of an S data frame containing all needed variables. Omit this to use a data frame already in the S “search list”. |
weights |
case weights |
subset |
an expression defining a subset of the observations to use in the fit. The default
is to use all observations. Specify for example |
na.action |
specifies an S function to handle missing data. The default is the function |
method |
for For |
singular.ok |
If |
robust |
if |
model |
default is |
x |
default is |
y |
default is |
se.fit |
default is |
linear.predictors |
set to |
residuals |
set to |
nonames |
set to |
eps |
convergence criterion - change in log likelihood. |
init |
vector of initial parameter estimates. Defaults to all zeros.
Special residuals can be obtained by setting some elements of |
iter.max |
maximum number of iterations to allow. Set to |
tol |
tolerance for declaring singularity for matrix inversion (available only when survival5 or later package is in effect) |
surv |
set to |
time.inc |
time increment used in deriving |
type |
(for For |
vartype |
see |
debug |
set to |
... |
other arguments passed to |
times |
a scalar or vector of times at which to evaluate the survival estimates |
lp |
a scalar or vector of linear predictors (including the centering constant) at which to evaluate the survival estimates |
stratum |
a scalar stratum number or name (e.g., |
q |
a scalar quantile or a vector of quantiles to compute |
n |
the number of points at which to evaluate the mean survival time, for
|
tmax |
For |
If there is any strata by covariable interaction in the model such that
the mean X beta varies greatly over strata, method="approximate"
may
not yield very accurate estimates of the mean in Mean.cph
.
For method="approximate"
if you ask for an estimate of the mean for
a linear predictor value that was outside the range of linear predictors
stored with the fit, the mean for that observation will be NA
.
For Survival
, Quantile
, or Mean
, an S function is returned. Otherwise,
in addition to what is listed below, formula/design information and
the components
maxtime, time.inc, units, model, x, y, se.fit
are stored, the last 5
depending on the settings of options by the same names.
The vectors or matrix stored if y=TRUE
or x=TRUE
have rows deleted according to subset
and
to missing data, and have names or row names that come from the
data frame used as input data.
n |
table with one row per stratum containing number of censored and uncensored observations |
coef |
vector of regression coefficients |
stats |
vector containing the named elements |
var |
variance/covariance matrix of coefficients |
linear.predictors |
values of predicted X beta for observations used in fit, normalized to have overall mean zero, then having any offsets added |
resid |
martingale residuals |
loglik |
log likelihood at initial and final parameter values |
score |
value of score statistic at initial values of parameters |
times |
lists of times (if |
surv |
lists of underlying survival probability estimates |
std.err |
lists of standard errors of estimate log-log survival |
surv.summary |
a 3 dimensional array if |
center |
centering constant, equal to overall mean of X beta. |
Frank Harrell
Department of Biostatistics, Vanderbilt University
fh@fharrell.com
coxph
, survival-internal
,
Surv
, residuals.cph
,
cox.zph
, survfit.cph
,
survest.cph
, survfit.coxph
,
survplot
, datadist
,
rms
, rms.trans
, anova.rms
,
summary.rms
, Predict
,
fastbw
, validate
, calibrate
,
plot.Predict
, ggplot.Predict
,
specs.rms
, lrm
, which.influence
,
na.delete
,
na.detail.response
, print.cph
,
latex.cph
, vif
, ie.setup
,
GiniMd
, dxy.cens
,
concordance
# Simulate data from a population model in which the log hazard
# function is linear in age and there is no age x sex interaction
require(survival)
require(ggplot2)
n <- 1000
set.seed(731)
age <- 50 + 12*rnorm(n)
label(age) <- "Age"
sex <- factor(sample(c('Male','Female'), n,
rep=TRUE, prob=c(.6, .4)))
cens <- 15*runif(n)
h <- .02*exp(.04*(age-50)+.8*(sex=='Female'))
dt <- -log(runif(n))/h
label(dt) <- 'Follow-up Time'
e <- ifelse(dt <= cens,1,0)
dt <- pmin(dt, cens)
units(dt) <- "Year"
dd <- datadist(age, sex)
options(datadist='dd')
S <- Surv(dt,e)
f <- cph(S ~ rcs(age,4) + sex, x=TRUE, y=TRUE)
cox.zph(f, "rank") # tests of PH
anova(f)
ggplot(Predict(f, age, sex)) # plot age effect, 2 curves for 2 sexes
survplot(f, sex) # time on x-axis, curves for x2
res <- resid(f, "scaledsch")
time <- as.numeric(dimnames(res)[[1]])
z <- loess(res[,4] ~ time, span=0.50) # residuals for sex
plot(time, fitted(z))
lines(supsmu(time, res[,4]),lty=2)
plot(cox.zph(f,"identity")) #Easier approach for last few lines
# latex(f)
f <- cph(S ~ age + strat(sex), surv=TRUE)
g <- Survival(f) # g is a function
g(seq(.1,1,by=.1), stratum="sex=Male", type="poly") #could use stratum=2
med <- Quantile(f)
plot(Predict(f, age, fun=function(x) med(lp=x))) #plot median survival
# Fit a model that is quadratic in age, interacting with sex as strata
# Compare standard errors of linear predictor values with those from
# coxph
# Use more stringent convergence criteria to match with coxph
f <- cph(S ~ pol(age,2)*strat(sex), x=TRUE, eps=1e-9, iter.max=20)
coef(f)
se <- predict(f, se.fit=TRUE)$se.fit
require(lattice)
xyplot(se ~ age | sex, main='From cph')
a <- c(30,50,70)
comb <- data.frame(age=rep(a, each=2),
sex=rep(levels(sex), 3))
p <- predict(f, comb, se.fit=TRUE)
comb$yhat <- p$linear.predictors
comb$se <- p$se.fit
z <- qnorm(.975)
comb$lower <- p$linear.predictors - z*p$se.fit
comb$upper <- p$linear.predictors + z*p$se.fit
comb
age2 <- age^2
f2 <- coxph(S ~ (age + age2)*strata(sex))
coef(f2)
se <- predict(f2, se.fit=TRUE)$se.fit
xyplot(se ~ age | sex, main='From coxph')
comb <- data.frame(age=rep(a, each=2), age2=rep(a, each=2)^2,
sex=rep(levels(sex), 3))
p <- predict(f2, newdata=comb, se.fit=TRUE)
comb$yhat <- p$fit
comb$se <- p$se.fit
comb$lower <- p$fit - z*p$se.fit
comb$upper <- p$fit + z*p$se.fit
comb
# g <- cph(Surv(hospital.charges) ~ age, surv=TRUE)
# Cox model very useful for analyzing highly skewed data, censored or not
# m <- Mean(g)
# m(0) # Predicted mean charge for reference age
#Fit a time-dependent covariable representing the instantaneous effect
#of an intervening non-fatal event
rm(age)
set.seed(121)
dframe <- data.frame(failure.time=1:10, event=rep(0:1,5),
ie.time=c(NA,1.5,2.5,NA,3,4,NA,5,5,5),
age=sample(40:80,10,rep=TRUE))
z <- ie.setup(dframe$failure.time, dframe$event, dframe$ie.time)
S <- z$S
ie.status <- z$ie.status
attach(dframe[z$subs,]) # replicates all variables
f <- cph(S ~ age + ie.status, x=TRUE, y=TRUE)
#Must use x=TRUE,y=TRUE to get survival curves with time-dep. covariables
#Get estimated survival curve for a 50-year old who has an intervening
#non-fatal event at 5 days
new <- data.frame(S=Surv(c(0,5), c(5,999), c(FALSE,FALSE)), age=rep(50,2),
ie.status=c(0,1))
g <- survfit(f, new)
plot(c(0,g$time), c(1,g$surv[,2]), type='s',
xlab='Days', ylab='Survival Prob.')
# Not certain about what columns represent in g$surv for survival5
# but appears to be for different ie.status
#or:
#g <- survest(f, new)
#plot(g$time, g$surv, type='s', xlab='Days', ylab='Survival Prob.')
#Compare with estimates when there is no intervening event
new2 <- data.frame(S=Surv(c(0,5), c(5, 999), c(FALSE,FALSE)), age=rep(50,2),
ie.status=c(0,0))
g2 <- survfit(f, new2)
lines(c(0,g2$time), c(1,g2$surv[,2]), type='s', lty=2)
#or:
#g2 <- survest(f, new2)
#lines(g2$time, g2$surv, type='s', lty=2)
detach("dframe[z$subs, ]")
options(datadist=NULL)
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.