library(knitr) opts_chunk$set(prompt = TRUE, comment = "") hook_output <- knit_hooks$get("output") knit_hooks$set(output = function(x, options) { lines <- options$output.lines if (is.null(lines)) { return(hook_output(x, options)) # pass to default hook } x <- unlist(strsplit(x, "\n")) more <- "..." if (length(lines)==1) { # first n lines if (length(x) > lines) { # truncate the output, but add .... x <- c(head(x, lines), more) } } else { x <- c(more, x[lines], more) } # paste these lines together x <- paste(c(x, ""), collapse = "\n") hook_output(x, options) }) knitr::opts_chunk$set(fig.width = 6, fig.height = 3.5)
The Cox-type proportional rate model (available in reReg()
via model = "cox.LWYY"
)
is a common semiparametric regression model for
recurrent event processes under the noninformative censoring assumption.
Since the construction of the pseudo-partial score function ignores the
dependency among recurrent events,
the efficiency of the Cox-type proportional rate model can be improved by combining
a system of weighted pseudo-partial score equations via the generalized method of moments (GMM)
and empirical likelihood (EL) estimation [@huang2022improved].
In this vignette, we demonstrate the proposed GMM and EL procedures in the reReg
package.
We will illustrate the idea with simulated data generated from the simGSC()
function.
library(reReg) packageVersion("reReg")
We first generate a simulated data where the recurrent event process is generated from a Cox-type proportional rate model.
This can be achieved with the default settings of simGSC()
;
see the vignette on simulating recurrent event data for illustrations of simGSC()
.
set.seed(1); head(dat <- simGSC(50, frailty = rep(1, 50)))
We can fit the Cox-type proportional rate model [@lin2000semiparametric] with reReg()
via the following specification.
fit <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat) summary(fit)
@huang2022improved proposed improved semiparametric estimation methods of the above proportional rate model. By combining different weighted pseudo-partial score functions through generalized methods of moments or empirical likelihood methods, @huang2022improved showed that substantial efficiency gain can be achieved without imposing additional model assumptions than those assumed in @lin2000semiparametric. Specifically, the efficiency gain is a result of combining a set of asymptotically independently and identically distributed estimating equations derived from the weighted pseudo-partial score equations of @lin2000semiparametric.
Two available approaches to combine estimating equations are the generalized method of moments (GMM)
and the empirical likelihood (EL) estimations, both of which can be specified with the cppl
argument
within the control
list.
In addition to the combination approach, weight functions to be combined with
the weighted pseudo-partial score functions can also be specified with the cppl.wfun
argument within the control
list.
In theory, combining more weight functions leads to better asymptotic efficiency,
especially when the sample size is large.
However, as observed in simulation studies of @huang2022improved,
combining one or two weighted functions can achieve substantial efficiency gain under scenarios
with small to moderate sample sizes.
When the sample size is small, @huang2022improved suggests using a combination of the
unit weight and a decreasing function, since the decreasing function can downweight the
later time period, in which the events are more effectively censored.
For those reason, the reReg()
currently only allows user to specify up to two weight functions with
cppl.wfun
.
The reReg()
allows users choose either the cumulative baseline rate function or the Gehan's weight
as two possible options for the weight functions.
Additionally, the reReg()
also allows users
to specify arbitrary function in cppl.wfun
via function formulas.
The following gives an example of the improved estimation by combining estimating equations via GMM with the cumulative baseline rate function and the Gehan's weight. ```{R cox2, cache = TRUE} fit2 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat, control = list(cppl = "GMM", cppl.wfun = list("Gehan", "cumbase"))) summary(fit2)
For this data set, the standard errors changed from `r fit$par1.se[1]` and `r fit$par1.se[2]` to `r fit2$par1.se[1]` and `r fit2$par1.se[2]`, respectively. This achieves a `r 100 * round(1 - fit2$par1.se / fit$par1.se, 4)[1]`\% and `r 100 * round(1 - fit2$par1.se / fit$par1.se, 4)[2]`\% efficiency gain, respectively. The above result can be called by updating the original fit, `fit`, with the `update()` function, as illustrated in the following. ```{R, eval = FALSE} fit2 <- update(fit, control = list(cppl = "GMM", cppl.wfun = list("Gehan", "cumbase")))
The following gives an example of the improved estimation by combining estimating equations
via EL with the user-specified function $1 / (x + 1)$ and the Gehan's weight.
{R cox3, cache = TRUE}
fit3 <- reReg(Recur(t.stop, id, event) ~ x1 + x2, model = "cox.LWYY", data = dat,
control = list(cppl = "EL", cppl.wfun = list(function(x) 1 / (x + 1), "Gehan")))
summary(fit3)
For this data set, the standard errors changed from r fit$par1.se[1]
and r fit$par1.se[2]
to r fit3$par1.se[1]
and r fit3$par1.se[2]
, respectively.
This achieves a r 100 * round(1 - fit3$par1.se / fit$par1.se, 4)[1]
\% and
r 100 * round(1 - fit3$par1.se / fit$par1.se, 4)[2]
\% efficiency gain, respectively.
Although the above example assumes the covariates are time-independent,
the proposed approach and the implementation allow time-dependent covariates.
Although this vignette is focus on improving the proportional rate model,
the proposed approaches by @huang2022improved can be extended to handle other types of rate function.
Such an extension will be investigated and the reReg
package will be expanded accordingly.
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.