View source: R/landest_functions.R

delta.land.obs | R Documentation |

Estimates the probability of survival past some specified time and the treatment effect, defined as the difference in survival at the specified time, using landmark estimation for an observational study setting

```
delta.land.obs(tl, dl, treat, tt, landmark, short = NULL, z.cov = NULL,
var = FALSE, conf.int = FALSE, ps.weights = NULL, weight.perturb = NULL,
perturb.ps = FALSE, cov.for.ps = NULL, bw = NULL)
```

`tl` |
observed event time of primary outcome, equal to min(T, C) where T is the event time and C is the censoring time. |

`dl` |
event indicator, equal to I(T<C) where T is the event time and C is the censoring time. |

`treat` |
treatment indicator, should be 0/1. |

`tt` |
the time of interest, function estimates the probability of survival past this time |

`landmark` |
the landmark time |

`short` |
a matrix of intermediate event information, there should be two columns for each intermediate event, the first column contains the observed intermediate event time, equal to min(TS, C) where TS is the event time and C is the censoring time, and the second column contains the event indicator, equal to I(TS<C) |

`z.cov` |
matrix of baseline covariate information |

`var` |
TRUE or FALSE; indicates whether variance estimates for the treatment effect and survival in each group are requested, default is FALSE. |

`conf.int` |
TRUE or FALSE; indicates whether 95% confidence intervals for the treatment effect and survival in each group are requested, default is FALSE. |

`ps.weights` |
propensity score (or inverse probability of treatment) weights |

`weight.perturb` |
a (n1+n0) by x matrix of weights where n1 = length of tl for treatment group 1 and n0 = length of tl for treatment group 0; used for perturbation-resampling, default is null. If var or conf.int is TRUE and weight.perturb is not provided, the function generates exponential(1) weights. |

`perturb.ps` |
TRUE or FALSE indicating whether the weight.perturb matrix includes the perturbed propensity score (or inverse probability of treatment) weights; if cov.for.ps is supplied instead of ps.weights, this is forced to be TRUE. |

`cov.for.ps` |
matrix of covariates to be used to estimate propensity score (or inverse probability of treatment) weights; either ps.weights or cov.for.ps must be supplied. |

`bw` |
bandwidth used for kernel estimation, default is NULL |

Let `T_{Li}`

denote the time of the primary event of interest for person `i`

, `T_{Si}`

denote the time of the available intermediate event(s), `C_i`

denote the censoring time, `Z_{i}`

denote the vector of baseline (pretreatment) covariates, and `G_i`

be the treatment group indicator such that `G_i = 1`

indicates treatment and `G_i = 0`

indicates control. Due to censoring, we observe `X_{Li}= min(T_{Li}, C_{i})`

and `\delta_{Li} = I(T_{Li}\leq C_{i})`

and `X_{Si}= min(T_{Si}, C_{i})`

and `\delta_{Si} = I(T_{Si}\leq C_{i})`

. This function estimates survival at time t within each treatment group, `S_j(t) = P(T_{L} > t | G = j)`

for `j = 1,0`

and the treatment effect defined as `\Delta(t) = S_1(t) - S_0(t)`

.

To derive these estimates using landmark estimation for an observational study setting, we first decompose the quantity into two components `S_j (t)= S_j(t|t_0) S_j(t_0)`

using a landmark time `t_0`

and estimate each component separately incorporating inverse probability of treatment weights (IPTW) to account for potential selection bias. Let `W_j(Z_i) = {P(G_{i} = j | Z_i)}`

, and `\hat{W}_j(Z_i)`

be the estimated propensity score (or probability of treatment, see ps.wgt.fun for more information). In this presentation, we assume `Z_i`

indicates the vector of baseline (pretreatment) covariates and that `Z_i`

is used to estimate the propensity scores and incorporated into the survival and treatment effect estimation. However, the function allows one to use different subsets of `Z_i`

for the propensity score estimation vs. survival estimation, as is appropriate in the setting of interest. Intermediate event information is used in estimation of the conditional component `S_j(t|t_0)`

,

`S_j(t|t_0)= P(T_L>t |T_L> t_0,G=j)=E[E[I(T_L>t | T_L> t_0,G=j,H)]]=E[S_{j,H} (t|t_0)]`

where `S_{j,H}(t|t_0) = P(T_L>t | T_L> t_0,G=j,H)`

and `H = \{Z, I(T_S \leq t_0), min(T_S, t_0) \}.`

Then `S_{j,H}(t|t_0)`

is estimated in two stages. The first stage involves fitting a weighted Cox proportional hazards model among individuals with `X_L> t_0`

to obtain an estimate of `\beta`

, denoted as `\hat{\beta}`

,

`S_{j,H}(t|t_0)=\exp \{-\Lambda_{j,0} (t|t_0) \exp(\beta^{T} H) \} `

where `\Lambda_{j,0} (t|t_0)`

is the cumulative baseline hazard in group `j`

. Specifically, `\hat{\beta}`

is the solution to the weighted Cox partial likelhoodand, with weights `\hat{W}_j(Z_i)^{-1}`

. The second stage uses a weighted nonparametric kernel Nelson-Aalen estimator to obtain a local constant estimator for the conditional hazard `\Lambda_{j,u}(t|t_0) = -\log [S_{j,u}(t|t_0)]`

as

`\hat{\Lambda}_{j,u}(t|t_0) = \int_{t_0}^t \frac{\sum_i \hat{W}_j(Z_i)^{-1} K_h(\hat{U}_i - u) dN_i(z)}{\sum_i \hat{W}_j(Z_i)^{-1} K_h(\hat{U}_i - u) Y_i(z)}`

where `S_{j,u}(t|t_0)=P(T_L>t | T_L> t_0,G=j,\hat{U}=u), \hat{U} = \hat{\beta}^{T} H, Y_i(t)=I(T_L \geq t),N_i (t)=I(T_L\leq t)I(T_L<C),K(\cdot)`

is a smooth symmetric density function, `K_h (x/h)/h`

, `h=O(n^{-v})`

is a bandwidth with `1/2 > v > 1/4`

, and the summation is over all individuals with `G=j`

and `X_L>t_0`

. The resulting estimate for `S_{j,u}(t|t_0)`

is `\hat{S}_{j,u}(t|t_0) = \exp \{-\hat{\Lambda}_{j,u}(t|t_0)\}`

, and the final estimate

`\hat{S}_j(t|t_0) = \frac{n^{-1} \sum_{i =1}^n \hat{W}_j(Z_i)^{-1} \hat{S}_j(t|t_0, H_i) I(G_i=1)I(X_{Li} > t_0)}{n^{-1} \sum_{i =1}^n \hat{W}_j(Z_i)^{-1} I(G_i=1)I(X_{Li} > t_0) }`

is a consistent estimate of `S_j(t|t_0)`

.

Estimation of `S_j(t_0)`

uses a similar two-stage approach but using only baseline covariates, to obtain `\hat{S}_j(t_0)`

. The final overall estimate of survival at time `t`

is, `\hat{S}_{LM,j} (t)= \hat{S}_j(t|t_0) \hat{S}_j(t_0)`

. The treatment effect in terms of the difference in survival at time `t`

is estimated as `\hat{\Delta}_{LM}(t) = \hat{S}_{LM,1}(t) - \hat{S}_{LM,0}(t).`

To obtain an appropriate `h`

we first use the bandwidth selection procedure given by Scott(1992) to obtain `h_{opt}`

; and then we let `h = h_{opt}n_j^{-0.10}`

.

To obtain variance estimates and construct confidence intervals, we use a perturbation-resampling method. Specifically, let `\{V^{(b)}=(V_1^{(b)}, . . . ,V_n^{(b)})^{T}, b=1,...B\}`

be
`n\times B`

independent copies of a positive random variable U from a known distribution with unit mean and unit variance such as an Exp(1) distribution. To estimate the variance of our estimates, we appropriately weight the estimates using these perturbation weights to obtain perturbed values: `\hat{S}_{LM,0} (t)^{(b)}`

, `\hat{S}_{LM,1} (t)^{(b)}`

, and `\hat{\Delta}_{LM} (t)^{(b)}, b=1,...B`

. We then estimate the variance of each estimate as the empirical variance of the perturbed quantities. To construct confidence intervals, one can either use the empirical percentiles of the perturbed samples or a normal approximation.

A list is returned:

`S.estimate.1` |
the estimate of survival at the time of interest for treatment group 1, |

`S.estimate.0` |
the estimate of survival at the time of interest for treatment group 0, |

`delta.estimate` |
the estimate of treatment effect at the time of interest |

`S.var.1 ` |
the variance estimate of |

`S.var.0 ` |
the variance estimate of |

`delta.var` |
the variance estimate of |

`p.value` |
the p-value from testing |

`conf.int.normal.S.1` |
a vector of size 2; the 95% confidence interval for |

`conf.int.normal.S.0` |
a vector of size 2; the 95% confidence interval for |

`conf.int.normal.delta` |
a vector of size 2; the 95% confidence interval for |

`conf.int.quantile.S.1` |
a vector of size 2; the 95% confidence interval for |

`conf.int.quantile.S.0` |
a vector of size 2; the 95% confidence interval for |

`conf.int.quantile.delta` |
a vector of size 2; the 95% confidence interval for |

Layla Parast

Parast, L. & Griffin B.A. (2017). Landmark Estimation of Survival and Treatment Effects in Observational Studies. Lifetime Data Analysis, 23:161-182.

Rosenbaum, P. R., & Rubin, D. B. (1983). The central role of the propensity score in observational studies for causal effects. Biometrika, 70(1), 41-55.

Rosenbaum, P. R., & Rubin, D. B. (1984). Reducing bias in observational studies using subclassification on the propensity score. Journal of the American Statistical Association, 79(387), 516-524.

```
data(example_obs)
W.weight = ps.wgt.fun(treat = example_obs$treat, cov.for.ps = as.matrix(example_obs$Z))
#executable but takes time
#delta.land.obs(tl=example_obs$TL, dl = example_obs$DL, treat = example_obs$treat, tt=2,
#landmark = 1, short = cbind(example_obs$TS,example_obs$DS), z.cov = as.matrix(example_obs$Z),
#ps.weights = W.weight)
#delta.land.obs(tl=example_obs$TL, dl = example_obs$DL, treat = example_obs$treat, tt=2,
#landmark = 1, short = cbind(example_obs$TS,example_obs$DS), z.cov = as.matrix(example_obs$Z),
#cov.for.ps = as.matrix(example_obs$Z))
```

Embedding an R snippet on your website

Add the following code to your website.

For more information on customizing the embed code, read Embedding Snippets.