knitr::opts_chunk$set(
  collapse = TRUE,
  comment = "#>"
)

requireNamespace("devtools")

matchsurv provides R functions to estimate excess risk regression models on both the hazard and the cumulative incidence scale.

Installation

You can install matchsurv from github with:

# install.packages("devtools")
devtools::install_github("cribosch/matchsurv",build_vignettes = TRUE )

libraries

library(matchsurv)

#other useful packages 
library(timereg)
library(geepack)
library(ggplot2) # for the plot

These functions work with matched cohort studies where the outcome is time-to-event. The excess risk can be estimated in a survival setting on the hazard scale and in a cometing risk setting also in terms of cumulative incidence. Data can be simulated in both the two settings.

Excess hazard model

Data simulation

We simulate matched cohort data in the survival setting with 5000 exposed subjects and 5 matched unexposed subjects for each exposed.

haz.data<-sim.data.MatchH(5000,5)
head(haz.data)

The optioncompeting=TRUE of sim.data.MatchH functions allows to estimate data in a competing risk setting where the excess risk is defined in terms of cause specific hazard. When nullmod=TRUE, the excess risk does not depend on covariates, so no covariates are simulated.

Model estimation

To get the model estimates we first need to set up the data correctly; both the two approaches work on the exposed-unexposed pair information.

set.hazd<-compdata(Surv(time, status)~x+z+cc, clust=id, idControl=j, data=haz.data)
head(set.hazd)

We suggest to set all the possible coviariates in the perdictor part of the formula; this is not the estimated model yet. The dataset we obtain with compdata will be used in the model estimation.

To estimate the model, run:

exc.haz.mod<-matchpropexc(Surv(entry,exit,status)~strata(z)+factor(x), data=set.hazd)

Note that it is possible to specify possible strata or interactions.

Results

To visualise the estimated effects:

summary(exc.haz.mod)

To estimate the cumulative baseline excess hazard:

exccumhaz(exc.haz.mod, time = seq(0,25,5))

It is possible to specify a vector of timepoints where to estimate the cumulative baseline excess hazard. To plot the estimated cumulative hazard, we can use the function excplot; the option stratas= allows to visualize only some strata (strata are identified with a number, 0 refers to the first stratum). The option relsurv=TRUE plots the relative survival estimated through the excess risk model.

excplot(exc.haz.mod)

A better overview of the function excplot() is given by example("excplot").

Excess cumulative incidence model

Data simulation

Matched cohort data in a competing risk setting with two possible causes of failure are generated:

cif.data<-sim.data.MatchCR(1000,5)
head(cif.data)

Model estimate

data set up

tp<-c(0.5,1,2,5,10,15,25) # define vector of time points
set.cifd<-compcomp(Event(time=FALSE,time2=time,cause=cause)~X1+X2,
                   data=cif.data,
                   cluster=i, #cluster ID
                   idControl=j, #subject ID
                   time.points = tp, #time points
                   cens.formula = NULL, #censoring weight model
                   event=1 #event of interest
)
head(set.cifd)

GEE

exc.cif.mod<-geese(Rt~-1+factor(h)+X1+X2,
                data=set.cifd,
                family="gaussian", #error distribution
                mean.link = "log", #link function for Rt
                corstr="independence", #correlation structure
                id=clust.num, #cluster vector
                weights=weights #censoring weights
                )

Results

predicted excess CIFs

af<-paste0("-1+factor(h)+X1+X2") #model formula
newd<-data.frame(expand.grid(h=tp,X1=c(0,1),X2=c(0.8,1.5,2.5))) # newdata
# define the different subjects for whom the excess risk is predicted
strata.levels<-factor(1:6, levels=1:6, 
                      labels =paste0(rep("X1=",6),
                                     expand.grid(X1=c(0,1),X2=c(0.8,1.5,2.5))[,1],
                                     rep(", X2=",6),
                                     expand.grid(X1=c(0,1),X2=c(0.8,1.5,2.5))[,2]))

pred.exc.cif<-ecif.pred(exc.cif.mod,times = tp,dataset = newd, 
                     formula = af, strata.levels = strata.levels)
head(pred.exc.cif,8)

the plot

p<-ggplot()
plot.exc.cif<-p+
  geom_step(aes(x=time, y=lower.ci, alpha=0.1), data=pred.exc.cif,lty="dotted", size=0.7)+
  geom_step(aes(x=time, y=upper.ci, alpha=0.1), data=pred.exc.cif,lty="dotted", size=0.7)+
  geom_step(aes(x=time, y=cif),col="black", data=pred.exc.cif, size=0.8)+
  scale_x_continuous(name="Time since entry",limits = c(0,30),breaks = seq(0,30,5), labels=seq(5,35,5))+
  scale_y_continuous(name="Excess risk")+ 
  facet_wrap(~strata,ncol = 2)+
  geom_abline(slope = 0,intercept = 0, size=0.2)+
  theme(legend.position = 'none');plot.exc.cif

estimate effects

ecif.coef(exc.cif.mod,times = tp, link = "log")


cribosch/matchsurv documentation built on Aug. 15, 2019, 11:55 a.m.