knitr::opts_chunk$set( collapse = TRUE, comment = "#>", fig.path = "README-" )
survMCOD is an R package to fit Cox regression models for the hazard of death due to a disease of interest based on multiple cause-of-death data ("Multiple-cause analysis"). Such data are usually obtained by extracting all the diseases mentioned on the death certificate, not only the so-called "underlying cause of death". This multiple-cause mortality model thus acknowledges that death may be caused by several disease processes acting together. It was proposed by Moreno-Betancur et al. (2017), and it is the first formal extension of the single-cause competing risks model to that setting.
Specifically, using all deaths partially attributed to the disease of interest, we model the pure hazard rate of the disease of interest. This is the rate of deaths caused exclusively by that disease, and is thus a quantity that is conceptually closer to the marginal "causal" hazard than the cause-specific hazard. The latter is the quantity modeled when using competing risks Cox regression based on the underlying cause of death and ignoring all other diseases mentioned on the death certificate ("Single-cause analysis").
Note: Please note that the version available on GitHub is the most up-to-date development version of the package. A stable version of the package will be available from CRAN once it is released.
The survMCOD package requires the survival and Matrix packages.
The survMCOD package can be installed directly from GitHub using the devtools package. To do this you should first check you have devtools installed by executing the following commands from within your R session:
if (!require(devtools)) { install.packages("devtools") }
Then execute the following commands to install survMCOD:
library(devtools) install_github("moreno-betancur/survMCOD")
The full reference manual of the package can be dowloaded here.
To illustrate the use of the package, we use data simulated using the package's simMCOD
function (use ?simMCOD
for more details):
library(survMCOD) set.seed(234) #we set a seed to ensure we get the same simulated dataset everytime datEx<-simMCOD(n=4000,xi=-1,rho=-2,phi=0, pgen=c(1,0,0.75,0.25,0.125,0.083), lambda=0.001,v=2,pUC=c(1,0.75)) head(datEx)
The dataset is of the format required by survMCOD
. It contains one row per invididual, and the variables consist of:
When analysing your own survival data with multiple causes of death, a key preliminary step to fitting models with survMCOD
will be to assign a weight to each death that represents the proportion of the death attributed to the disease of interest. The user is referred to Moreno-Betancur et al. (2017), Piffaretti et al. (2016) and Rey et al. (2017) for descriptions and discussions of various weight-attribution strategies.
The assumptions of the multiple-cause model and details of the estimation procedure are provided in Moreno-Betancur et al. (2017). A key feature of the model is that a Cox model for the disease of interest and a Cox model for other causes need to be fitted simultaneously, and deaths with a weight between 0 and 1 will contribute to both. This is why the user needs to specify regressors for each of these models using the two arguments formula
and formOther
of the function as follows (use ?survMCOD
and ?SurvM
for more details).
fitMCOD<-survMCOD(SurvM(time=TimeEntry,time2=TimeExit,status=Status, weight=Pi)~X1, formOther=~Z1,data=datEx,UC_indicator="UC")
One aspect of the multiple-cause model is that a fully parametric model for the log ratio of the baseline pure hazards needs to be posited. The current default is to parametrise this a piecewise constant function with cut-offs at the 25th, 50th and 75th percentile of the failure time distribution of pure events (with weight=1). Future versions will provide the user control over this. Thus the model fit provides regression coefficient estimates for the models relating to the disease of interest and other diseases (as log-hazard ratios) and also estimates of the piecewise constant estimated log ratio of the baseline hazards. These results are extracted as follows:
fitMCOD[["Multiple-cause"]]
The results from the single-cause analysis can also be extracted from the object, as follows:
fitMCOD[["Single-cause"]]
The convergence of the multiple-cause analysis should be checked using (use ?check.survMCOD
for more details):
check.survMCOD(fitMCOD)
Healthy convergence is seen by curves showing variation in estimates across the first three points, followed by a stabilisation of the curve around the final estimate.
Future versions will define a proper class of for objects created by survMCOD
and summary and other such methods.
If you find any bugs, please report them via email to Margarita Moreno-Betancur.
Moreno-Betancur M, Sadaoui H, Piffaretti C, Rey G. Survival analysis with multiple causes of death: Extending the competing risks model. Epidemiology 2017; 28(1): 12-19.
Piffaretti C, Moreno-Betancur M, Lamarche-Vadel A, Rey G. Quantifying cause-related mortality by weighting multiple causes of death. Bulletin of the World Health Organization 2016; 94:870-879B.
Rey G, Piffaretti C, Rondet C, Lamarche-Vadel A, Moreno-Betancur M. Analyse de la mortalite par cause : ponderation des causes multiples. Bulletin Epidemiologique Hebdomadaire, 2017; (1): 13-9.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.