Demonstration of a survival analysis workflow using the dataAnalysisMisc package.

Aims: Easy survival analysis and visualization using standard tools as kaplan-meier plots as well as forest plots for the visualization of CoxPH models (uni and multivariate). Automatic model selection (multivariate) is supported. Patient characteristics tables are automatically created (tex file for easy adaption / pdf is pdflatex is available). Standard model assumptions hold true (e.g. no paired observations).

Install and update the package

As bugs might be eleminated from time to time and functionality might increase, an update might be a good idea :)

# use devtools to get the package from github
# devtools installed?
if (!("devtools" %in% installed.packages())) install.packages("devtools")
devtools::install_github("mknoll/dataAnalysisMisc", dependencies=T)

Load the package

require(dataAnalysisMisc)

Exemplary data

We use the retinopathy data from the survival package and construct a survival object. We treat events as death.

It is ignored in this case, that we do have two observations for each patients (left and right eye), we treat them as independent observations from two patients. Don't do that with real data, go e.g. for an gee using cluster().

require(survival)
data("retinopathy")

srv <- Surv(retinopathy$futime, retinopathy$status)

Kaplan Meier

Median OS will be drawn if reached. The offsetNRisk variable moves the number at risk table in y axis.

One group KM plots:

plotKM(srv, rep("All", length(retinopathy[,1])), col="black", offsetNRisk=-0.2,
       xlab="Time [months]", ylab="OS")

More group KM plots:

plotKM(srv, retinopathy$laser, offsetNRisk=-0.2, ylab="OS", xlab="Time [months]")

"Optimal" cutoff

If you want to know if your data has the potential to allow grouping into prognostically different cohorts by dichotomizing a continuous variable, you can search for an "optimal" cutoff, e.g. by varying the respective cutoff and test for differences in the resulting groups. Please do not abuse this approach ;-)

The CoxPH likelihood ratio test p-value is shown.

res <- dataAnalysisMisc::findOptCutoff(retinopathy$age, srv, delta=0.01)
res[1:4,]

plotKM(srv, ifelse(retinopathy$age <= res$cutoff[1], "low", "high"), offsetNRisk=-0.2, ylab="OS", xlab="Time [months]")

Forestplot

We want to conduct univariate CoxPH analyses and create and forest plot for the different variables.

Let's select the variables of interest.

data <- retinopathy[,c("laser", "eye", "type", "trt")]

Conduct univariate analyses:

res <- plotForest(srv,data)

We can do that with multiple observations per subject as well:

res <- plotForest(srv,data,subject=retinopathy$id)

And multivariate analysis:

res <- plotForestMV(srv, data)

Multiple observations / subject:

res <- plotForestMV(srv, data, subject=retinopathy$id)

Model selection

We can use the famous step() function to select models, and pass the direction value directly as selection parameter.

res <- plotForestMV(srv, data, selection="both")

Or, if we do have too many variables, we can give a p-value cutoff for the univarate analyses. If passed, the will be included in the multivariate analysis.

res <- plotForestMV(srv, data, selection=0.4)

Patient characteristics

Let's create a cool patient characteristics table using the data we just build our models with.

If you do have pdflatex on your machine, a pdf will be created. Otherwise, go for the .tex file to do your final adjustments :)

res <- createPatChar(data, latex=T, outdir=tempdir())
res
knitr::include_graphics(res$pdffile)

For multiple observations / subject:

subjVar <- list(laser=which(retinopathy$trt==0))
res <- createPatChar(data, latex=T,subject=retinopathy$id, subjVar=subjVar, outdir=tempdir())
res
knitr::include_graphics(res$pdffile)


mknoll/dataAnalysisMisc documentation built on Feb. 4, 2024, 8:16 a.m.