coxed.gam: Predict expected durations using the GAM method

Description

This function is called by coxed and is not intended to be used by itself.

Usage

coxed.gam(cox.model, newdata = NULL, k = -1, coef = NULL,
  b.ind = NULL, warn = TRUE)

Arguments

cox.model

The output from a Cox proportional hazards model estimated with the coxph function in the survival package or with the cph function in the rms package

newdata

An optional data frame in which to look for variables with which to predict. If omitted, the fitted values are used

k

The number of knots in the GAM smoother. The default is -1, which employs the choose.k function from the mgcv package to choose the number of knots

coef

A vector of new coefficients to replace the coefficients attribute of the cox.model. Used primarily for bootstrapping, to recalculate durations using new coefficients derived from a bootstrapped sample. If NULL, the original coefficients are employed

b.ind

A vector of observation numbers from the estimation sample, used to construct a bootstrapped sample with replacement

warn

If TRUE, displays warnings, and if FALSE suppresses them

Details

This function employs the GAM method of generating expected durations described in Kropko and Harden (2018), which proceeds according to five steps. First, it uses coefficient estimates from the Cox model, so researchers must first estimate the model just as they always have. Then the method computes expected values of risk for each observation by matrix-multiplying the covariates by the estimated coefficients from the model, then exponentiating the result. This creates the exponentiated linear predictor (ELP). Then the observations are ranked from smallest to largest according to their values of the ELP. This ranking is interpreted as the expected order of failure; the larger the value of the ELP, the sooner the model expects that observation to fail, relative to the other observations.
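The ELP and ranking steps described above can be sketched in a few lines of base R. This is only an illustration under stated assumptions: the covariate matrix X and the coefficient vector beta are hypothetical stand-ins for the model matrix and the estimated coefficients of a fitted Cox model, and the code does not reproduce the internals of coxed.gam.

```r
set.seed(1)

# Hypothetical covariates and coefficients (assumptions for illustration,
# standing in for the model matrix and coef() of a fitted Cox model)
X <- cbind(x1 = rnorm(6), x2 = rbinom(6, 1, 0.5))
beta <- c(x1 = 0.8, x2 = -0.5)

# Exponentiated linear predictor (ELP): exp(X %*% beta)
elp <- exp(drop(X %*% beta))

# Rank observations from smallest to largest ELP;
# a larger ELP means the model expects that observation to fail sooner
elp_rank <- rank(elp)

data.frame(X, elp = elp, rank = elp_rank)
```

Because exponentiation is monotonic, ranking the ELP gives the same order as ranking the linear predictor itself.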

The next step is to connect the model's expected risk for each observation (the ELP) to duration time (the observed durations). A generalized additive model (GAM) fits a model to data by using a series of locally-estimated polynomial splines set by the user (see, for example, Wood, Pya, and Saefken 2016). It is a flexible means of allowing for the possibility of nonlinear relationships between variables. coxed.gam uses a GAM with a cubic regression spline to draw a smoothed line summarizing the bivariate relationship between the observed durations and the ranks. The GAM fit can then be used directly to compute expected durations, given the covariates, for each observation in the data.
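As a rough sketch of this smoothing step, the code below fits durations against ELP ranks with a cubic regression spline using gam from the mgcv package. The data are simulated, the names rank_elp, dur, and gam_sketch are hypothetical, and details such as the treatment of censored observations may differ from what coxed.gam actually does.

```r
library(mgcv)

set.seed(2)
n <- 200

# Simulated stand-ins (assumptions): ELP ranks 1..n and durations that
# tend to shorten as the rank (expected risk) increases
rank_elp <- sample(n)
dur <- rexp(n, rate = rank_elp / n + 0.1)
dat <- data.frame(dur = dur, rank_elp = rank_elp)

# Cubic regression spline (bs = "cr"); k = -1 leaves the basis
# dimension at the mgcv default
gam_sketch <- gam(dur ~ s(rank_elp, bs = "cr", k = -1), data = dat)

# Fitted durations with approximate 95% bands from the standard errors
p <- predict(gam_sketch, se.fit = TRUE)
fit_data <- data.frame(
  rank_elp = rank_elp,
  gam_fit  = as.numeric(p$fit),
  lb       = as.numeric(p$fit - 1.96 * p$se.fit),
  ub       = as.numeric(p$fit + 1.96 * p$se.fit)
)
head(fit_data[order(fit_data$rank_elp), ])
```

The fitted values play the role of expected durations: each observation's rank is mapped through the smooth to a duration on the original time scale.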

Value

Returns a list containing the following components:

exp.dur A vector of predicted mean durations for the estimation sample if newdata is omitted, or else for the specified new data.
gam.model Output from the gam function in which the durations are fit against the exponentiated linear predictors from the Cox model.
gam.data Fitted values and confidence intervals from the GAM model.

Author(s)

Jonathan Kropko <jkropko@virginia.edu> and Jeffrey J. Harden <jharden2@nd.edu>

References

Kropko, J. and Harden, J. J. (2018). Beyond the Hazard Ratio: Generating Expected Durations from the Cox Proportional Hazards Model. British Journal of Political Science https://doi.org/10.1017/S000712341700045X

Wood, S.N., N. Pya and B. Saefken (2016). Smoothing parameter and model selection for general smooth models (with discussion). Journal of the American Statistical Association 111, 1548-1575 http://dx.doi.org/10.1080/01621459.2016.1180986

See Also

gam, coxed

Examples

# Load the packages that provide Surv(), coxph(), and coxed.gam()
library(survival)
library(coxed)

mv.surv <- Surv(martinvanberg$formdur, event = rep(1, nrow(martinvanberg)))
mv.cox <- coxph(mv.surv ~ postel + prevdef + cont + ident + rgovm +
    pgovno + tpgovno + minority, method = "breslow", data = martinvanberg)

ed <- coxed.gam(mv.cox)
summary(ed$gam.data)
summary(ed$gam.model)
ed$exp.dur

#Plotting the GAM fit
## Not run:
require(ggplot2)
ggplot(ed$gam.data, aes(x=rank.xb, y=y)) +
    geom_point() +
    geom_line(aes(x=rank.xb, y=gam_fit)) +
    geom_ribbon(aes(ymin=gam_fit_95lb, ymax=gam_fit_95ub), alpha=.5) +
    xlab("Cox model LP rank (smallest to largest)") +
    ylab("Duration")

## End(Not run)

#Running coxed.gam() on a bootstrap sample and with new coefficients
bsample <- sample(1:nrow(martinvanberg), nrow(martinvanberg), replace=TRUE)
newcoefs <- rnorm(8)
ed2 <- coxed.gam(mv.cox, b.ind=bsample, coef=newcoefs)

coxed documentation built on Aug. 2, 2020, 9:07 a.m.