library(knitr) opts_chunk$set( fig.align = "center", cache = TRUE, message = FALSE, fig.height = 5, fig.width = 5)
library(magrittr) library(dplyr) library(ggplot2) theme_set(theme_bw()) library(pamm) library(mgcv) library(survival) Set1 <- RColorBrewer::brewer.pal(9, "Set1")
We first illustrate the estimation on non-linear smooth effects using
the veteran
data from the survival package:
data("veteran", package="survival") veteran %<>% mutate( trt = 1+(trt == 2), prior = 1*(prior==10)) %>% filter(time < 400) head(veteran)
There are two continuous covariates in the data set that could potentially
exhibit non-linear effects, age
and karno
. We fit a Cox-PH model and
a PAM of the form
$$ \lambda(t|x) = \lambda_0(t)\exp(f(x_{karno})) $$
and
$$ \lambda(t|x) = \exp(f_0(t_j) + f(x_{karno})) $$ respectively:
## Cox-PH cph <- coxph(Surv(time, status) ~ pspline(karno, df=0), data=veteran) ## PAM ped <- split_data(Surv(time, status) ~ ., data=veteran, id="id") pam <- gam(ped_status ~ s(tend) + s(karno, bs="ps"), data=ped, family=poisson(), offset=offset)
# crate data set with varying karno values (from min to max in 100 steps) # and add term contribution of karno from PAM and Cox-PH models karno_df <- ped %>% make_newdata(expand = "karno", n = 100) %>% add_term(pam, term="karno") %>% mutate( cox = predict(cph, newdata=., type="terms")[, "pspline(karno, df = 0)"] - predict(cph, newdata=data.frame(karno=mean(veteran$karno)))) ggplot(karno_df, aes(x=karno, ymin=low, ymax=high)) + geom_line(aes(y=fit, col="PAM")) + geom_ribbon(alpha=0.2) + geom_line(aes(y=cox, col="Cox PH"))+ scale_colour_manual(name="Method",values=c("Cox PH"=Set1[1],"PAM"=1)) + xlab(expression(x[plain(karno)])) + ylab(expression(hat(f)(x[plain(karno)])))
Both methods are in good agreement. The higher the Karnofsky-Score the lower the
expected log-hazard. For further evaluation of the Karnofsky Score effect using
time-varying terms of the form $f(t)\cdot x_{karno}$ and $f(x_{karno}, t)$ see
vignette("tveffects", package="pam")
.
In the following we consider another example using using data presented in the
respective vignette in the survival
package
(see vignette("splines", package = "survival")
).
The example presented in the vignette goes as follows, using the standard
base R workflow and termplot
for visualization:
data("mgus", package = "survival") mfit <- coxph(Surv(futime, death) ~ sex + pspline(age, df = 4), data = mgus) mfit
termplot(mfit, term = 2, se = TRUE, col.term = 1, col.se = 1)
The equivalent fit using PAMs requires the additional step of transforming the
data into a suitable format. We then use mgcv::gam
to fit
the model:
mgus.ped <- split_data(Surv(futime, death)~sex + age, data = mgus, id = "id") head(mgus.ped) pamfit <- gam(ped_status ~ s(tend) + sex + s(age, bs = "ps"), data = mgus.ped, method = "REML", family = poisson(), offset = offset) summary(pamfit)
For visualization of the smooth effects we can use the default mgcv::plot.gam
function:
layout(matrix(1:2, nrow = 1)) termplot(mfit, term = 2, se = TRUE, col.term = 1, col.se = 1) plot(pamfit, select = 2, se = TRUE, ylim = c(-4, 3.5))
In this example the PAM approach estimates a linear effect of age, which is
consistent with the estimation using coxph
, as there is only weak non-linearity.
Another example from the same vignette shows that estimated effects are very similar if the effect of the covariate is in fact strongly non-linear:
fit <- coxph(Surv(futime, death) ~ age + pspline(hgb, 4), mgus2) mgus2.ped <- split_data(Surv(futime, death)~age + hgb, data = mgus2, id = "id") pamfit2 <- gam(ped_status~s(tend) + age + s(hgb), data = mgus2.ped, family = poisson(), offset = offset)
layout(matrix(1:2, nrow = 1)) termplot(fit, term = 2, se = TRUE, col.term = 1, col.se = 1) plot(pamfit2, select = 2, se = TRUE, ylim = c(-0.5, 2))
The vignette in the survival
package further discusses enforcing monotonicity
contraints on the effect of hgb
. These can be achieved here more easily
using the functionality provided in the scam
package (see also @pya2015).
The usage is exactly the same as before, replacing the call to gam
with a
call to scam
and specifying bs = "mpd"
. Note that the fit using contraints
takes considerably longer compared to the unconstrained fit.
library(scam) mpam <- scam(ped_status ~ s(tend) + age + s(hgb, bs = "mpd"), data = mgus2.ped, family = poisson(), offset = offset)
layout(matrix(1:2, nrow = 1)) plot(pamfit2, select = 2, se = TRUE, ylim = c(-0.5, 2), main="Unconstrained fit") # 1.72 = intercept difference between pamfit2 and mpam const <- abs(coef(pamfit2)[1] - coef(mpam)[1]) plot(mpam, select = 2, se = TRUE, shift = const, ylim = c(-0.5-const, 2-const), main="Fit with monotonicity constraint")
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.