library(knitr)
opts_chunk$set(prompt = TRUE, comment = "")
hook_output <- knit_hooks$get("output")
knit_hooks$set(output = function(x, options) {
  lines <- options$output.lines
  if (is.null(lines)) {
    return(hook_output(x, options))  # pass to default hook
  }
  x <- unlist(strsplit(x, "\n"))
  more <- "..."
  if (length(lines)==1) {        # first n lines
    if (length(x) > lines) {
      # truncate the output, but add ....
      x <- c(head(x, lines), more)
    }
  } else {
    x <- c(more, x[lines], more)
  }
  # paste these lines together
  x <- paste(c(x, ""), collapse = "\n")
  hook_output(x, options)
})
knitr::opts_chunk$set(fig.width = 6, fig.height = 3.5) 

In this vignette, we demonstrate how to create event plots and mean cumulative function in reReg package. We will illustrate the usage of our functions with the readmission data from the frailtypack package [@rondeau2012frailtypack], [@gonzalez2005sex]. The data contains re-hospitalization times after surgery in patients diagnosed with colorectal cancer. In this data set, the recurrent event is the readmission and the terminal event is either death or end of study. See ?readmission for data details.

library(reReg)
packageVersion("reReg")
data(readmission, package = "frailtypack")
head(readmission)

For illustration, we first remove subjects who has more than one terminal events.

readmission <- subset(readmission, !(id %in% c(60, 109, 280)))

Event plots

An easy way to glance at recurrent event data is by plotting the event plots. Event plots can be created by applying the generic function plot() to Recur objects directly, as shown in Figure 1.

reObj <- with(readmission, Recur(t.stop, id, event, death))
plot(reObj)
reObj <- with(readmission, Recur(t.stop, id, event, death))
plot(reObj, base_size = 7)

The gray horizontal lines represent each subjects' follow-up times, the green dots marks the recurrent events, and the red triangles marks terminal events. With the default setting, the event plot is arranged so that subjects with longer follow-up times are presented on the top. The plot() method returns a ggplot2 object [@R:ggplot] to allow extensive flexibility and customization. Common graphical options like xlab, ylab, main, and more can be directly passed down to plot().

plot(reObj, cex = 1.5, xlab = "Time in days", ylab = "Patients", 
     main = "Event plot for readmission data", 
     terminal.name = "Death", 
     recurrent.name = "Hospital readmission")
plot(reObj, cex = 1.5, xlab = "Time in days", ylab = "Patients", 
     main = "Event plot for readmission data", 
     terminal.name = "Death", 
     recurrent.name = "Hospital readmission", base_size = 7)

Applying the plot() function to Recur objects internally calls the plotEvent() function, which allows users to configure event plots in a model formula. The synopsis of plotEvents() is:

args(plotEvents)

The arguments are as follows

The required argument for plotEvent() is formula. The full list of these parameters with their default values are displayed below.

args(reReg:::plotEvents.control)

The parameters xlab, ylab, and main are adopted from the base R graphics package for modifying labels to the x-axis, the y-axis, and the main title, respectively. The parameters terminal.name and recurrent.name are character labels displayed in the legend. The parameter recurrent.type specifies the recurrent event types to be displayed in the legend when more than one recurrent event types is specified in the Recur object. The parameter legend.position is adopted from the % the legend.position components of the ggplot2 theme to control the location of the legend box. The parameters cex and alpha control the size and the transparency of the symbols on the event plots, respectively. The base_size parameter controls the base font size in pts.

Stratifying the event plot could provide new insights into the different covariate effects. For example, an event plot stratified by the binary variable sex can be produced with the following code.

plotEvents(Recur(t.stop, id, event, death) ~ sex, data = readmission)
plotEvents(Recur(t.stop, id, event, death) ~ sex, data = readmission, base_size = 7)

Similarly, event plot stratified by sex and chemo can be produced with the following code. The additional argument base_size = 5 is included to display the full legend labels.

plotEvents(Recur(t.stop, id, event, death) ~ sex + chemo, data = readmission, base_size = 5)
plotEvents(Recur(t.stop, id, event, death) ~ sex + chemo, data = readmission, base_size = 5)

The plotEvents() can accommodate multiple recurrent types specified via the event argument in the Recur object. To illustrate this feature, we perturb the event indicator so that the value of event indicates the index of the different recurrent event. The following command is used to create the stratified event plot with different recurrent event.

readmission$event2 <- with(readmission, ifelse(t.stop > 500 & event > 0, 2, event))
plotEvents(Recur(t.stop, id, event2, death) ~ sex, data = readmission)
readmission$event2 <- with(readmission, ifelse(t.stop > 500 & event > 0, 2, event))
plotEvents(Recur(t.stop, id, event2, death) ~ sex, data = readmission, base_size = 7)

The different types of recurrent events are marked in different colors in Figure 5. The plotEvents() function can also create event plots in calendar time by setting calendarTime = TRUE. For illustration, we create a new simulated data based on readmission with time intervals shift proportionally by chemo. We further convert the time intervals to a Date class. The construction of the new data and the stratified event plot in Figure 6 is included in the following

readmission2 <- readmission
readmission2$t.start <- as.Date(readmission2$t.start + as.numeric(readmission2$chemo) * 5, 
  origin = "20-01-01")
readmission2$t.stop <- as.Date(readmission2$t.stop + as.numeric(readmission2$chemo) * 5, 
  origin = "20-01-01")
plotEvents(Recur(t.start %2% t.stop, id, event2, death) ~ sex, data = readmission2, 
           calendarTime = TRUE)
readmission2 <- readmission
readmission2$t.start <- as.Date(readmission2$t.start + as.numeric(readmission2$chemo) * 5, 
  origin = "20-01-01")
readmission2$t.stop <- as.Date(readmission2$t.stop + as.numeric(readmission2$chemo) * 5, 
  origin = "20-01-01")
plotEvents(Recur(t.start %2% t.stop, id, event2, death) ~ sex, data = readmission2, 
           calendarTime = TRUE, base_size = 7)

In this case, subjects with later censoring times are presented on the top and the date string are printed on the axis label of the event plot.

Mean cumulative function plots

Let $N(t)$ be the number of recurrent events occurring over the interval $[0, t]$ and $D$ be the failure time of interest subjects to right censoring by $C$. Define the composite censoring time $Y = \min(D, C, \tau)$ and the failure event indicator $\Delta = I{D\le \min(C, \tau)}$, where $\tau$ is the maximum follow-up time. We assume the recurrent event process $N(\cdot)$ is observed up to $Y$. Let $X$ be a $p$-dimensional covariate vector. Consider a random sample of $n$ subjects, the observed data are independent and identically distributed (iid) copies of ${Y_i, \Delta_i, X_i, N_i(t), 0\le t\le Y_i}$, where the subscript $i$ denotes the index of a subject for $i= 1, \ldots, n$. Let $m_i = N_i(Y_i)$ be the number of recurrent events the $i$th subject experienced before time $Y_i$, then the jump times of $N_i(t)$ give the observed recurrent event times $t_{i1}, \ldots, t_{im_i}$ when $m_i > 0$. Thus, the observed data can also be expressed as iid copies of ${Y_i, \Delta_i, X_i, m_i, (t_{i1}, \ldots, t_{m_i})}$.

Under independence censoring, a nonparametric estimate of $\Lambda(t)$ known as the Nelson-Aalen estimator is \citep{lawless1995some} $$\widehat\Lambda(t) = \sum_{i = 1}^n\int_0^t\frac{d N_i(u)}{\sum_{j = 1}^nI(Y_i \ge u)},$$ which is also known as the mean cumulative function (MCF). The MCF presents the average number of recurrent events per subject observed by time~$t$ while adjusting for the risk set. In the case when all subjects remain at risk of recurrent events throughout the study, i.e., $n = \sum_{j = 1}^nI(Y_i \ge t)$, the Nelson-Aalen estimator reduces to the cumulative sample mean function introduced in Chapter 1 of @cook2007statistical.

The MCF is useful in describing and comparing the average number of events of an individual and between groups. Thus, it provides additional insights into the longitudinal patterns of the recurrent process. Under the independent censoring assumption, the Nelson-Aalen estimate can be created by applying the generic function plot() to the Recur object with an additional argument mcf = TRUE. The following command plots the Nelson-Aalen estimate in Figure 7. The 95\% confidence interval is enabled by additionally setting mcf.conf.int = TRUE.

plot(reObj, mcf = TRUE, mcf.conf.int = TRUE)

To create the cumulative sample mean function, one needs to additionally specify the argument mcf.adjustRiskset = FALSE. Plotting the Recur object with the argument mcf = TRUE internally calls the mcf() function from the reda package [@R:reda]. The mcf() function is imported when the reReg package is loaded. The usage of the mcf() function is similar to that of the plotEvent(), where the recurrent event process is specified in a model formula with Recur objects as the formula response and the covariates specified in the model formula should be a combination of factor variables. The mcf() computes the overall sample MCF when an intercept-only-model is specified. When covariates are specified in the model formula, the sample MCF for each level of the combination of the factor variables will be computed. The following code is used to produce Figure 8, which displays the Nelson-Aalon estimates for sex = "Female"and sex = "Male".

mcf0 <- mcf(Recur(t.start %2% t.stop, id, event, death) ~ sex, data = readmission)
plot(mcf0, conf.int = TRUE)

In the presence of informative censoring, the NPMLE of [@wang1986asymptotic] can be plotted by first fitting an intercept-only model with reReg(), then applying the generic function plot(). The Recur object is used as the formula response. The following code illustrate this feature with output displayed in Figure 9.

mcf1 <- reReg(Recur(t.start %2% t.stop, id, event, death) ~ 1, data = readmission)
plot(mcf1)
mcf1 <- reReg(Recur(t.start %2% t.stop, id, event, death) ~ 1, data = readmission)
plot(mcf1, base_size = 7)

The 95\% confidence interval is computed based on the non-parametric bootstrap approach with 200 bootstrap replicates as specified by the argument B = 200. Separate MCFs can be created and plotted for each level of a factor variable using the subset option in reReg(). The basebind() function can then be applied to combine the MCFs into one plot to allow easy visual comparison. The synopsis for basebind() is shown below.

args(basebind)

The arguments are

When fitting regression models with reReg(), the baseline() function can be applied to combine the estimates of baseline functions across groups. The following code is used to create Figure~ 9. where the NPMLEs for sex = "Female" and sex = "Male" are presented. The corresponding 95\% confidence intervals are computed with 200 bootstrap replicates.

mcf2 <- reReg(Recur(t.start %2% t.stop, id, event, death) ~ 1, subset = sex == "Female", data = readmission, B = 200)
mcf3 <- update(mcf2, subset = sex == "Male")
g1 <- plot(mcf2)
g2 <- plot(mcf3)
basebind(g1, g2, legend.title = "Sex", legend.labels = c("Female", "Male"))
mcf2 <- reReg(Recur(t.start %2% t.stop, id, event, death) ~ 1, subset = sex == "Female", data = readmission, B = 200)
mcf3 <- update(mcf2, subset = sex == "Male")
g1 <- plot(mcf2)
g2 <- plot(mcf3)
basebind(g1, g2, legend.title = "Sex", legend.labels = c("Female", "Male"), 
         control = list(base_size = 7, legend.position = "topleft"))

Linking with ggplot

We will take the following readmission data for example.

library(ggplot2)
library(ggplot2)
(f <- plotEvents(Recur(t.stop, id, event, death) ~ chemo, data = readmission))

Add axis breaks

f + scale_y_continuous(breaks = seq(0, 2000, 100)) + ylab("Days since surgery")

Replace labeling strip in event plot

labels <- c("chemo = Treated" = "Yes",
            "chemo = NonTreated" = "No")
f + facet_grid(chemo ~., scales = "free", space = "free", switch = "both",
               labeller = labeller(chemo = labels)) +
  xlab("Received chemo?")

Customize bar style

f + geom_rect(xmin = f$data$id - .45, # bar height
              xmax = f$data$id + .45, # bar height
              ymin = 0, ymax = f$data$time2, # bar width
              fill = 1 + as.numeric(f$data$chemo), # bar color
              alpha = .25) # transparency

Reference



stc04003/reReg documentation built on April 26, 2024, 1:32 p.m.