

Fits a quantile regression model to possibly censored and truncated data, e.g., survival data.


`formula` |
an object of class “ |

`data` |
an optional data frame containing the variables in the model. |

`weights` |
an optional vector of weights to be used in the fitting process. The weights will always be normalized to sum to the sample size. This implies that, for example, using double weights will not halve the standard errors. |

`p` |
numerical vector indicating the order of the quantile(s) to be fitted. |

`CDF` |
an object of class “ |

`control` |
a list of operational parameters for the optimization algorithm, usually passed via |

`...` |
for future arguments. |
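The normalization described for `weights` can be illustrated with a few lines of base R. This is only a sketch of the arithmetic, not the package's internal code; the helper name `normalize_weights` is hypothetical.

```r
# Hypothetical helper: rescale weights so that they sum to the sample size.
normalize_weights <- function(w) w * length(w) / sum(w)

w <- c(1, 2, 3, 4)
w_norm <- normalize_weights(w)
sum(w_norm)                                   # equals length(w)

# Doubling every weight leaves the normalized weights unchanged,
# which is why doubling cannot shrink the standard errors.
all.equal(normalize_weights(2 * w), w_norm)
```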

This function implements the method described in Frumento and Bottai (2017) for censored, truncated quantile regression, and the method described in Frumento (2021) for interval-censored quantile regression.

The left side of `formula` must be of the form `Surv(time, event)` if the data are right-censored, `Surv(time0, time, event)` if the data are right-censored and left-truncated (`time0 < time`, `time0` can be `-Inf`), and `Surv(time1, time2, type = "interval2")` if the data are interval-censored (use `time1 = time2` for exact observations, `time1 = -Inf` or `NA` for left-censored, and `time2 = Inf` or `NA` for right-censored). Using `Surv(time)` is also allowed and indicates that the data are neither censored nor truncated.
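As an illustration of the three response types, the sketch below builds each `Surv` object from a handful of made-up values. The numbers are arbitrary, and only `Surv` from the survival package is used; open-ended intervals are coded with `NA` here.

```r
library(survival)

# Right-censored: observed time and event indicator (1 = event, 0 = censored)
s_right <- Surv(time = c(2.1, 3.5, 0.8), event = c(1, 0, 1))

# Right-censored and left-truncated: entry time, observed time, event indicator
s_trunc <- Surv(c(0.5, 1.0, 0.2), c(2.1, 3.5, 0.8), c(1, 0, 1))

# Interval-censored ("interval2"), in this order:
# exact, left-censored, right-censored, censored in the interval (1, 5)
time1 <- c(2, NA, 3, 1)
time2 <- c(2, 4, NA, 5)
s_int <- Surv(time1, time2, type = "interval2")

# Inspect how each object is stored
attr(s_right, "type")
attr(s_trunc, "type")
attr(s_int, "type")
s_int
```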

The conditional distribution function (`CDF`) of the response variable represents a nuisance parameter and is estimated preliminarily via `pchreg`. If missing, `CDF = pchreg(formula)` is used as default. See the “Note” and the documentation of `pchreg`.

Estimation is carried out using an algorithm for gradient-based optimization. To estimate the asymptotic covariance matrix, standard two-step procedures are used (e.g., Ackerberg et al., 2012).

An object of class “`ctqr`”, which is a list with the following items:

`p` |
the quantile(s) being estimated. |

`coefficients` |
a named vector or matrix of quantile regression coefficients. |

`call` |
the matched call. |

`n.it` |
the number of iterations. |

`converged` |
logical. The convergence status. |

`fitted` |
the fitted values. |

`terms` |
the |

`mf` |
the model frame used. |

`covar` |
the estimated asymptotic covariance matrix. |

`CDF` |
the used |

Note that the dimension of all items, except `call`, `terms`, `mf`, and `CDF`, is the same as the dimension of `p`. For example, if `p = c(0.25,0.5,0.75)`, `coefficients` and `fitted` will be 3-column matrices; `n.it` and `converged` will be vectors of 3 elements; and `covar` will be a list of three covariance matrices.
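A quick way to check these dimensions is to fit three quantiles on simulated data and inspect the returned items. The sketch below assumes that ctqr, pch, and survival are installed; the data-generating process mirrors the one used in the Examples, and the seed and sample size are arbitrary.

```r
library(survival)
library(pch)
library(ctqr)

set.seed(1)
n <- 500
x1 <- runif(n); x2 <- runif(n)            # covariates
time <- runif(n, 0, 1 + x1 + x2)          # true event time
cens <- runif(n, 0, 5)                    # censoring time
y <- pmin(time, cens)                     # observed time
d <- (time <= cens)                       # TRUE = event, FALSE = censored

fit <- ctqr(Surv(y, d) ~ x1 + x2, p = c(0.25, 0.5, 0.75))

dim(fit$coefficients)    # one column per quantile
dim(fit$fitted)          # one column per quantile
length(fit$n.it)         # one entry per quantile
length(fit$converged)    # one entry per quantile
length(fit$covar)        # one covariance matrix per quantile
```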

The generic accessor functions `summary`, `plot`, `predict`, `coef`, `terms`, and `nobs` can be used to extract information from the model. The functions `waldtest` (from the package lmtest) and `linearHypothesis` (from the package car) can be used to perform Wald tests and to test linear restrictions. These functions, however, will only work if `p` is scalar.

NOTE 1. The first-step estimator (the `CDF` argument) is computed using the `pchreg` function of the pch package. To be correctly embedded in `ctqr`, a `pch` object must be constructed using the same observations, in the same order.
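One way to satisfy this requirement is to build the `pch` object and the `ctqr` model from the same data frame. The sketch below is an assumption-laden illustration on simulated data (the names `dat`, `sub`, `CDF_sub`, and `fit_sub` are hypothetical), in which an analysis restricted to a subset uses that same subset in both steps.

```r
library(survival)
library(pch)
library(ctqr)

set.seed(2)
n <- 500
dat <- data.frame(x1 = runif(n), x2 = runif(n))
time <- runif(n, 0, 1 + dat$x1 + dat$x2)   # true event time
cens <- runif(n, 0, 5)                     # censoring time
dat$y <- pmin(time, cens)                  # observed time
dat$d <- as.numeric(time <= cens)          # 1 = event, 0 = censored

# The analysis is restricted to a subset, so the first step uses
# exactly the same rows, in the same order, as the second step.
sub <- dat[dat$x1 > 0.2, ]
CDF_sub <- pchreg(Surv(y, d) ~ x1 + x2, data = sub)
fit_sub <- ctqr(Surv(y, d) ~ x1 + x2, data = sub, p = 0.5, CDF = CDF_sub)

# Both steps should be based on the same number of observations
nobs(fit_sub) == nrow(sub)
```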

If the first-step estimator is biased, and there is censoring or truncation, the estimates of the quantile regression coefficients and their standard errors will also be biased.

If the data are neither censored nor truncated, the `CDF` does not enter the estimating equation of the model. However, since the first-step estimator is used to compute the starting points, the final estimates may be sensitive to the supplied `CDF`.

NOTE 2. Right-censoring is a special case of interval censoring, in which exact events are identified by `time2 = time1`, while censored observations have `time2 = Inf`. Note, however, that `ctqr(Surv(time1, time2, type = "interval2") ~ x)` will not be identical to `ctqr(Surv(time = time1, event = (time2 < Inf)) ~ x)`. The estimating equation used for interval-censored data is that described in Frumento (2021), while that used for right-censored data is that of Frumento and Bottai (2017). The two estimating equations are only asymptotically equivalent (see Frumento 2021 for details).
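The two parameterizations can be compared directly on simulated right-censored data. The following is a sketch under the assumption that ctqr, pch, and survival are installed; the data-generating process and the object names (`fit_int`, `fit_right`) are made up for illustration.

```r
library(survival)
library(pch)
library(ctqr)

set.seed(3)
n <- 500
x <- runif(n)
time <- runif(n, 0, 1 + x)                  # true event time
cens <- runif(n, 0, 3)                      # censoring time
time1 <- pmin(time, cens)                   # observed time
time2 <- ifelse(time <= cens, time1, Inf)   # Inf marks right-censored rows

# Same data, coded as interval-censored and as right-censored
fit_int   <- ctqr(Surv(time1, time2, type = "interval2") ~ x, p = 0.5)
fit_right <- ctqr(Surv(time = time1, event = (time2 < Inf)) ~ x, p = 0.5)

# The two sets of estimates are expected to be close, but not identical
cbind(interval = drop(coef(fit_int)), right = drop(coef(fit_right)))
```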

Paolo Frumento <paolo.frumento@unipi.it>

Ackerberg, D., Chen, X., and Hahn, J. (2012). A practical asymptotic variance estimator for two-step semiparametric estimators. *The Review of Economics and Statistics*, 94 (2), 481-498.

Frumento, P., and Bottai, M. (2017). An estimating equation for censored and truncated quantile regression. *Computational Statistics and Data Analysis*, Vol.113, pp.53-63. ISSN: 0167-9473.

Frumento, P. (2021). A quantile regression estimator for interval-censored data (unpublished).

`plot.ctqr`, `predict.ctqr`, `pchreg`

```
library(survival)  # Surv
library(pch)       # pchreg
library(ctqr)      # ctqr

# Using simulated data

# Example 1 - censored data ####################################################
n <- 1000
x1 <- runif(n); x2 <- runif(n) # covariates
t <- runif(n, 0, 1 + x1 + x2) # time variable (e.g., time to death)
c <- runif(n,0,5) # censoring variable (e.g., end of follow-up)
y <- pmin(t,c) # observed variable = min(t,c)
d <- (t <= c) # 1 = event (e.g., death), 0 = censored
CDF1 <- pchreg(Surv(y,d) ~ x1 + x2)
model1 <- ctqr(Surv(y,d) ~ x1 + x2, p = 0.5, CDF = CDF1)
model2 <- ctqr(Surv(y,d) ~ x1, p = 0.5, CDF = CDF1)
# model1 is identical to ctqr(Surv(y,d) ~ x1 + x2, p = 0.5)
# model2 is NOT identical to ctqr(Surv(y,d) ~ x1, p = 0.5),
# which would have default CDF = pchreg(Surv(y,d) ~ x1)

# Example 2 - censored and truncated data ######################################
n <- 1000
x1 <- runif(n); x2 <- runif(n) # covariates
t <- runif(n, 0, 1 + x1 + x2) # time variable
c <- runif(n,0,5) # censoring variable
y <- pmin(t,c) # observed variable = min(t,c)
d <- (t <= c) # 1 = event, 0 = censored
z <- rnorm(n) # truncation variable (e.g., time at enrollment)
w <- which(y > z) # data are only observed when y > z
z <- z[w]; y <- y[w]; d <- d[w]; x1 <- x1[w]; x2 <- x2[w]
# implement various CDFs and choose the model with smallest AIC
# (quadratic terms need I(); a bare x1^2 in a formula reduces to x1)
CDFs <- list(
  pchreg(Surv(z,y,d) ~ x1 + x2, breaks = 5),
  pchreg(Surv(z,y,d) ~ x1 + x2, breaks = 10),
  pchreg(Surv(z,y,d) ~ x1 + x2 + x1:x2, breaks = 5),
  pchreg(Surv(z,y,d) ~ x1 + x2 + I(x1^2) + I(x2^2), breaks = 10)
)
CDF <- CDFs[[which.min(sapply(CDFs, AIC))]]
summary(ctqr(Surv(z,y,d) ~ x1 + x2, p = 0.5, CDF = CDF))

# Example 3 - interval-censored data ###########################################
# t is only known to be in the interval (t1,t2) ################################
n <- 1000
x1 <- runif(n); x2 <- runif(n) # covariates
t <- runif(n, 0, 10*(1 + x1 + x2)) # time variable
t1 <- floor(t) # lower extreme of the interval
t2 <- ceiling(t) # upper extreme of the interval
model <- ctqr(Surv(t1,t2, type = "interval2") ~ x1 + x2, p = 0.5)
```
