# survrtrunc: Nonparametric estimator of survival from right-truncated,... In flexsurv: Flexible Parametric Survival and Multi-State Models

 survrtrunc R Documentation

## Nonparametric estimator of survival from right-truncated, uncensored data

### Description

Estimates the survivor function from right-truncated, uncensored data by reversing time, interpreting the data as left-truncated, applying the Kaplan-Meier / Lynden-Bell estimator and transforming back.

### Usage

```survrtrunc(t, rtrunc, tmax, data = NULL, eps = 0.001, conf.int = 0.95)
```

### Arguments

 `t` Vector of observed times from an initial event to a final event. `rtrunc` Individual-specific right truncation points, so that each individual's survival time `t` would not have been observed if it was greater than the corresponding element of `rtrunc`. If any of these are greater than `tmax`, then the actual individual-level truncation point for these individuals is taken to be `tmax`. `tmax` Maximum possible time to event that could have been observed. `data` Data frame to find `t` and `rtrunc` in. If not supplied, these should be in the working environment. `eps` Small number that is added to `t` before implementing the time-reversed estimator, to ensure the risk set is consistent between forward and reverse time scales. It should be just large enough that `t+eps` is not `==t`. This should not need changing from the default of 0.001, unless `t` are extremely large or small and the data are rounded to integer. `conf.int` Confidence level, defaulting to 0.95.

### Details

Note that this does not estimate the untruncated survivor function - instead it estimates the survivor function truncated above at a time defined by the maximum possible time that might have been observed in the data.

Define X as the time of the initial event, Y as the time of the final event, then we wish to determine the distribution of T = Y- X.

Observations are only recorded if Y ≤q t_{max}. Then the distribution of T in the resulting sample is right-truncated by `rtrunc` = t_{max} - X.

Equivalently, the distribution of t_{max} - T is left-truncated, since it is only observed if t_{max} - T ≥q X. Then the standard Kaplan-Meier type estimator as implemented in `survfit` is used (as described by Lynden-Bell, 1971) and the results transformed back.

This situation might happen in a disease epidemic, where X is the date of disease onset for an individual, Y is the date of death, and we wish to estimate the distribution of the time T from onset to death, given we have only observed people who have died by the date t_{max}.

If the estimated survival is unstable at the highest times, then consider replacing `tmax` by a slightly lower value, then if necessary, removing individuals with `t > tmax`, so that the estimand is changed to the survivor function truncated over a slightly narrower interval.

### Value

A list with components:

`time` Time points where the estimated survival changes.

`surv` Estimated survival at `time`, truncated above at `tmax`.

`se.surv` Standard error of survival.

`std.err` Standard error of -log(survival). Named this way for consistency with `survfit`.

`lower` Lower confidence limits for survival.

`upper` Upper confidence limits for survival.

### References

D. Lynden-Bell (1971) A method of allowing for known observational selection in small samples applied to 3CR quasars. Monthly Notices of the Royal Astronomical Society, 155:95–118.

Seaman, S., Presanis, A. and Jackson, C. (2020) Review of methods for estimating distribution of time to event from right-truncated data.

### Examples

```
## simulate some event time data
set.seed(1)
X <- rweibull(100, 2, 10)
T <- rweibull(100, 2, 10)

## truncate above
tmax <- 20
obs <- X + T < tmax
rtrunc <- tmax - X
dat <- data.frame(X, T, rtrunc)[obs,]
sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)
plot(sf, conf.int=TRUE)
## Kaplan-Meier estimate ignoring truncation is biased
sfnaive <- survfit(Surv(T) ~ 1, data=dat)
lines(sfnaive, conf.int=TRUE, lty=2, col="red")

## truncate above the maximum observed time
tmax <- max(X + T) + 10
obs <- X + T < tmax
rtrunc <- tmax - X
dat <- data.frame(X, T, rtrunc)[obs,]
sf <-    survrtrunc(T, rtrunc, data=dat, tmax=tmax)
plot(sf, conf.int=TRUE)
## estimates identical to the standard Kaplan-Meier
sfnaive <- survfit(Surv(T) ~ 1, data=dat)
lines(sfnaive, conf.int=TRUE, lty=2, col="red")

```

flexsurv documentation built on June 18, 2022, 1:06 a.m.