knitr::opts_chunk$set(prompt = TRUE, comment = "")

The regression model implemented in the reReg package accommodate informative censoring via the use of a subject-specific frailty variable. In contrast to existing frailty models, the implemented estimation procedure does not require distributional information on the frailty variable. Using the borrowing-strength approach in the estimating procedure, our model allows users to specify any combination of the sub-models between the recurrent event process and the terminal events when they are fitted jointly.

In this vignette, we demonstrate how to use the reReg() function in the reReg package to fit semiparametric regression models to recurrent event data.

Notations

Let $N(t)$ be the number of recurrent events occurring over the interval $[0, t]$ and $D$ be the failure time of interest subjects to right censoring by $C$. Define the composite censoring time $Y = \min(D, C, \tau)$ and the failure event indicator $\Delta = I{D\le \min(C, \tau)}$, where $\tau$ is the maximum follow-up time. We assume the recurrent event process $N(\cdot)$ is observed up to $Y$. Let $X$ be a $p$-dimensional covariate vector. Consider a random sample of $n$ subjects, the observed data are independent and identically distributed (iid) copies of ${Y_i, \Delta_i, X_i, N_i(t), 0\le t\le Y_i}$, where the subscript $i$ denotes the index of a subject for $i= 1, \ldots, n$. Let $m_i = N_i(Y_i)$ be the number of recurrent events the $i$th subject experienced before time $Y_i$, then the jump times of $N_i(t)$ give the observed recurrent event times $t_{i1}, \ldots, t_{im_i}$ when $m_i > 0$. Thus, the observed data can also be expressed as iid copies of ${Y_i, \Delta_i, X_i, m_i, (t_{i1}, \ldots, t_{m_i})}$. The primary interests in recurrent event data analysis is in making inference about the recurrent event process and the failure event and understanding the corresponding covariate effects.

Model assumption

We assume a generalized joint frailty scale-change model for the rate function of the recurrent event process and the hazard function of the failure time formulated as follow: \begin{equation} \begin{cases} \lambda(t) = \displaystyle Z\lambda_0(te^{X^\top\alpha})e^{X^\top\beta},&\ h(t) = \displaystyle Zh_0(te^{X^\top\eta})e^{X^\top\theta}, &t\in[0, \tau], \end{cases} \end{equation} where $Z$ is a latent shared frailty variable to account for association between the two types of outcomes, $\lambda_0(\cdot)$ is the baseline rate function, $h_0(\cdot)$ is the baseline hazard function, and the regression coefficients $(\alpha, \eta)$ and $(\beta, \theta)$ correspond to the shape and size parameters of the rate function and hazard function, respectively. In contrast to many shared-frailty models that require a parametric assumption, following the idea of @Wang2001, the reReg() function implements semiparametric estimation procedures that do not require the knowledge about the frailty distribution. As a result, the dependence between recurrent events and failure event is left unspecified and the proposed implementations accommodate informative censoring. The reReg() function fits the recurrent event data under the above joint model setting. The arguments of reReg() are as follows

library(reReg)
args(reReg)

Specifying model via the model argument

When the interest is the covariate effects on the risk of recurrent events and the terminal event is treated as nuisances, model = "cox" and model = "gsc" give the Cox-type model and the generalized scale-change rate model considered in @Wang2001 and @xu2019generalized, respectively. When the recurrent event process and the terminal events are modeled jointly, the types of rate function and hazard function can be specified simultaneously in model separated by ``|''. The model types for the hazard function are cox, ar, am, and gsc. For examples, the joint frailty Cox-type model of @Huang2004 and the joint frailty accelerated mean model of @xu2017joint can be called by model = "cox|cox" and model = "am|am", respectively. Depending on the application, users can specify different model types for the rate function and the hazard function. For example, model = "cox|ar" postulate a Cox-type proportional model for the recurrent event rate function and an accelerated rate model for the terminal event hazard function, or $\alpha = \theta = 0$ in (1).

Variance estimation

The nonparametric bootstrap method for clustered data is adopted to estimate the standard errors of the estimators. The bootstrap samples are formed by resampling the subjects with replacement of the same size from the original data. The above estimating procedures are then applied to each bootstrap sample to provide one draw of the bootstrap estimate. With a large number of replicates, the asymptotic variance matrices are estimated by the sample variance of the bootstrap estimates. In an attempt to overcome the computational burden in bootstrap variance estimation, parallel computing techniques based on methods in the parallel package will be applied when boot.parallel = TRUE}. The number of CPU cores used in the parallel computing is controlled by the argument boot.parCl}, whose default value is half of the total number of CPU cores available on the current host identified by parallel::detectCores() \%/\% 2}. When fitting the generalized scale-change rate model of @xu2019generalized, e.g., with model = "sc", an efficient resampling-based sandwich estimator is also available via se = "sand". The number of bootstrap or resampling is controlled by the argument B. When B = 0 (default), the variance estimation procedure will not be performed and only the point estimates will be returned, which can be useful when the variance estimation is time consuming. Future work will be devoted to generalize the efficient resampling-based sandwich estimator to all the sub-models.

The control list

The complete control list consists of the following parameters:

The default equation solver ("BB::dfsane") uses the derivative-free Barzilai-Borwein spectral approach for solving nonlinear equations implemented in dfsane() from the package BB [@Vara:Gilb:BB:2009]. Setting solver = "BB::BBsolve" calls the wrapper function BBsolve() in the BB package to locate a root with different Barzilai-Borwein step-lengths, non-monotonicity parameters, and initialization approaches. Based on our observation, the "BB::BBsolve" algorithm generally exhibited more reliable convergence but the solver = "BB::dfsane" algorithm provides a better balance between convergence and speed. The alternative options, solver = "BB::BBoptim" and solver = "optim", attempts to identify roots by minimizing the $\ell_2$-norm of the estimating functions. The options solver = "BB::BBoptim" and solver = "optim" call the BBoptim() function from the package BB and the base function optim(), respectively.

Examples

We will illustrate the usage of reReg with simulated data generated from the simGSC() function. Readers are referred to the vignette on Simulating recurrent event data for using simGSC() to generate recurrent event data.

Joint Cox model of @Huang2004

A simulated data following the joint Cox model can be generated by ```{R, cox, cache = TRUE} set.seed(1); datCox <- simGSC(200, summary = TRUE)

Since the default parameters in `simGSC()` are $\alpha = (0, 0)^\top$, $\beta = (-1, -1)^\top$, $\eta = (0, 0)^\top$, 
and $\theta = (-1, -1)^\top$, the underlying true model has the form:
$$\begin{cases}
\lambda(t) = Z \lambda_0(t)e^{-X_1 - X_2}, &\\
h(t) = Z h_0(t)e^{-X_1 - X_2}, & t\in[0, 60].
\end{cases}$$
The model fit is
```{R, coxfit, cache = TRUE}
fit.cox <- reReg(Recur(t.start %to% t.stop, id, event, status) ~ x1 + x2, 
                 B = 200, data = datCox, model = "cox|cox")
summary(fit.cox)

The baseline functions can be plotted via plot(): ```{R, coxplot} plot(fit.cox)

#### Joint accelerated mean model of @xu2017joint
A simulated data following the joint accelerated mean model can be generated by
```{R, am, cache = TRUE}
set.seed(1)
par0 <- list(alpha = c(1, 1), beta = c(1, 1), eta = -c(1, 1), theta = -c(1, 1))
datam <- simGSC(200, par = par0, summary = TRUE)

The underlying true model has the form: $$\begin{cases} \lambda(t) = Z \lambda_0(te^{X_1 + X_2})e^{X_1 + X_2}, &\ h(t) = Z h_0(te^{-X_1 - X_2})e^{-X_1 - X_2}, & t\in[0, 60]. \end{cases}$$ The model fit is ```{R, amfit, cache = TRUE} fit.am <- reReg(Recur(t.start %to% t.stop, id, event, status) ~ x1 + x2, B = 200, data = datam, model = "am|am") summary(fit.am)

The baseline functions can be plotted via `plot()`: 
```{R, amplot}
plot(fit.am)

Joint Cox/accelerated rate model

A simulated model following a special case of the generalized scale-change model where the rate function has a Cox structure and the hazard function has an accelerated rate model structure can be generated by ```{R, coxar, cache = TRUE} set.seed(1) par0 <- list(eta = c(1, 1), theta = c(0, 0)) datCoxAr <- simGSC(200, par = par0, summary = TRUE)

The underlying true model has the form:
$$\begin{cases}
\lambda(t) = Z \lambda_0(t)e^{-X_1 - X_2}.&\\
h(t) = Z h_0(te^{X_1 + X_2})& t\in[0, 60].
\end{cases}$$
The model fit is
```{R, coxarfit, cache = TRUE}
fit.CoxAr <- reReg(Recur(t.start %to% t.stop, id, event, status) ~ x1 + x2,
                   B = 200, data = datCoxAr, model = "cox|ar")
summary(fit.CoxAr)

The baseline functions can be plotted via plot(): {R, coxarplot} plot(fit.CoxAr)

Other popular methods

Some methods that assumes Z = 1 and requires independent censoring are also implemented in reReg. These includes the methods proposed by @lin2000semiparametric, @ghosh2002, and @ghosh2003, that can be called by specifying model = "cox.LWYY", model = "cox.GL", and model = "am.GL", respectively.

It is also worth noting that methods in multi-state models could also be applied to analyze recurrent events data. See the vignette on Multi-state models and competing risks for more details.

Reference



stc04003/reReg documentation built on Sept. 20, 2023, 7:30 a.m.