R.multiple.surv: Calculates the proportion of treatment effect explained by...

Description Usage Arguments Details Value Note Author(s) References Examples

View source: R/Functions_Rsurrogate.R

Description

This function calculates the proportion of treatment effect on the primary outcome explained by multiple surrogate markers measured at t_0 and primary outcome information up to t_0. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. The user can also request an estimate of the incremental value of the surrogate marker information.

Usage

1
2
3
R.multiple.surv(xone, xzero, deltaone, deltazero, sone, szero, type =1, t, 
weight.perturb = NULL, landmark, extrapolate = FALSE, transform = FALSE, 
conf.int = FALSE, var = FALSE, incremental.value = FALSE, approx = T)

Arguments

xone

numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time.

xzero

numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time.

deltaone

numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time.

deltazero

numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time.

sone

matrix of numeric values; surrogate marker measurements at t_0 for treated observations. If X_{1i}<t_0, then the surrogate marker measurements should be NA.

szero

matrix of numeric values; surrogate marker measurements at t_0 for control observations. If X_{0i}<t_0, then the surrogate marker measurements should be NA.

type

type of estimate; options are 1 = two-stage robust estimator, 2 = weighted two-stage robust estimator, 3 = double-robust estimator, 4 = two-stage model-based estimator, 5 = weighted estimator, 6 = double-robust model-based estimator; default is 1.

t

the time of interest.

weight.perturb

weights used for perturbation resampling.

landmark

the landmark time t_0 or time of surrogate marker measurement.

extrapolate

TRUE or FALSE; indicates whether the user wants to use extrapolation.

transform

TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker pseudo-score.

conf.int

TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE.

var

TRUE or FALSE; indicates whether a variance estimate is requested, default is FALSE.

incremental.value

TRUE or FALSE; indicates whether the user would like to see the incremental value of the surrogate marker information, default is FALSE.

approx

TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE.

Details

Let G \in \{A, B\} be the binary treatment indicator and we assume that subjects are randomly assigned to either treatment group A or B at baseline. Let T denote the time to the occurrence of the primary outcome, death for example, and S = (S_1,S_2,...,S_k)^{T} denote the vector of k surrogate markers measured at a given time t_0. Let T^{(g)} and S^{(g)} denote the counterfactual event time and surrogate marker measurements under treatment G = g for g \in \{A, B\}. In practice, we only observe (T, S)=(T^{(A)}, S^{(A)}) or (T^{(B)}, S^{(B)}) depending on whether G=A or B. The treatment effect, Δ(t), is the treatment difference in survival rates at time t > t_0, Δ(t)=E\{ I(T^{(A)}>t)\} - E\{I(T^{(B)}>t)\} = P(T^{(A)}>t) - P(T^{(B)}>t) where I(\cdot) is the indicator function. For individuals who are censored or experience the primary outcome before t_0, we assume that their S information is not available.

The surrogate marker information at time t_0 is defined as a combination of the observed information on I(T >t_0) and the observed S at t_0, denoted by Q_{t_0} = \{Q_{t_0}^{(g)}, g = A, B\}, where Q_{t_0} ^{(g)} = \{ T ^{(g)} \wedge t_0, S ^{(g)} I(T ^{(g)} > t_0)\}. With information on Q_{t_0}, the residual treatment effect is defined as: Δ_{S}(t,t_0) = E\{ I(T ^{(A)} > t) - I(T ^{(B)}>t) \mid Q_{t_0}^{(A)} = Q_{t_0}^{(B)} \} =P(T^{(B)} > t_0) \int_{S} ψ_A(t \mid S, t_0) dF_{B}(S \mid t_0) -P(T^{(B)}> t) where S = (s_1, ..., s_k)^{T}, F_{g}(S \mid t_0) = P(S^{(g)}≤ S \mid T^{(g)}>t_0), ψ_g(t\mid S,t_0) = P(T^{(g)}> t\mid S^{(g)}=S, T^{(g)}> t_0). The proportion of the treatment effect on the primary outcome that is explained by the treatment effect on Q_{t_0} is R_{S}(t,t_0)=1-Δ_{S}(t,t_0)/Δ. This function provides 6 different estimators for R_{S}(t,t_0) using censored data.

Due to censoring, the observed data consist of n observations \{(G_i, X_{i}, δ_{i}, S_{i}I(X_{i} > t_0)), i=1,...,n\} from the two treatment groups, where X_{i} = \min(T_{i}, C_{i}), δ_{i} = I(T_{i} < C_{i}), and C_{i} denotes the censoring time for the ith subject. We assume independent censoring i.e., (T_i, S_i) \perp C_i \mid G_i. For ease of notation, we also let \{(X_{gi}, δ_{gi}, S_{gi}I(X_{gi} > t_0)), i=1,...,n_g\} denote n_g = ∑_{i=1}^n I(G_i = g) observations from treatment group g \in \{A,B\}, where X_{gi}=\min(T_{gi}, C_{gi}) and δ_{gi}=I(T_{gi}<C_{gi}). Without loss of generality, we assume that \bar{π}_g = n_g/n \to π_g \in (0,1) as n\to ∞. Throughout, we estimate the treatment effect Δ(t) =P(T^{(A)}>t) - P(T^{(B)}>t) as \hat{Δ}(t) = n_A^{-1} ∑_{i=1}^{n_A} \frac{I(X_{Ai}>t)}{\hat{W}^C_A(t)} - n_B^{-1} ∑_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}^C_B(t)} where \hat{W}^C_g(t) is the Kaplan-Meier estimator of W^C_g(t) = P(C_{g} > t).

We first describe the two-stage robust estimator which involves a two-stage procedure combining the use of a working model and a nonparametric estimation procedure for Δ_{S}(t,t_0). The idea is simply to summarize S into a univariate score U and then construct a nonparametric estimator for R_S(t,t_0) treating U as S. To construct U, we approximate the conditional distribution of T^{(A)} \mid S^{(A)}, T^{(A)} > t_0 by using a working semiparametric model such as the landmark proportional hazards model q_A(S) = P(T^{(A)} >t \mid T^{(A)} >t_0, S^{(A)}) = \exp\{-Λ_0(t|t_0) \exp(β ^{T} S^{(A)}) \}, t>t_0, where Λ_0(t|t_0) is the unspecified baseline cumulative hazard function for T^{(A)} conditional on \{T^{(A)} > t_0\} and β is an unknown vector of coefficients. Let \hat{β} be the maximizer of the corresponding log partial likelihood function and \hat{Λ}_0(t|t_0) be the Breslow-type estimate of baseline hazard. If one were to assume that this working model is correctly specified, then a consistent estimate of Δ_{S}(t,t_0) would simply be: \hat{Δ}_{S}^M=n_B^{-1} ∑_{i=1}^{n_B} [ \exp \{ -\hat{Λ}_0(t|t_0)\exp(\hat{β} ^{T} S_{Bi}) \} \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}]. We refer to this estimate as the two-stage model-based estimator (option 4 for type). Instead of relying on correct specification of this model, we use the resulting score U = β_0^{T}S as a univariate “pseudo-marker" to summarize the k surrogates. In the second stage, to estimate Δ_{S}(t,t_0), we apply a nonparametric approach with S represented by the univariate marker U. Specifically, we use a kernel Nelson-Aalen estimator to nonparametrically estimate φ_A(t|u, t_0)=P(T^{(A)}> t\mid U^{(A)}=u, T^{(A)}> t_0) = \exp\{-Λ_A(t|u, t_0 )\} as \hat φ_A(t|u, t_0) = \exp\{-\hat{Λ}_A(t|u, t_0) \}, where \hat{Λ}_A(t|u, t_0) = \int_{t_0}^t \frac{∑_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\}dN_{Ai}(z)}{∑_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\} Y_{Ai}(z)}, \hat{U}_{Ai} = \hat{β} ^{T} S_{Ai}, \hat{U}_{Bi} = \hat{β} ^{T} S_{Bi}, Y_{Ai} = I(X_{Ai} ≥q t), N_{Ai}(t) = I(X_{Ai} ≤q t) δ_{Ai}, K(\cdot) is a smooth symmetric density function, K_h(x) = K(x/h)/h, and γ(\cdot) is a given monotone transformation function. We then estimate Δ_{S}(t,t_0) as \hat{Δ}_{S}(t,t_0) = n_B^{-1} ∑_{i=1}^{n_B} [\hat{φ}_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}] and \hat{R}_{S}(t,t_0) =1- \hat{Δ}_{S}(t,t_0)/\hat{Δ}(t). We refer to this estimate as the two-stage robust estimator (option 1 for type).

The next estimator borrows ideas from the extensive causal inference literature focusing on double robust estimators two-stage weighted estimator with a propensity score weight explicitly balancing the two treatment groups with respect to the distribution of S. The weighting enables us to “adjust" the distribution of S^{(A)} before constructing the conditional survival estimate \hat φ_A(t|u, t_0). This approach results in a double-robust estimator of Δ_{S}(t, t_0), which is consistent when either U^{(A)} captures all the information about the relationship between I(T^{(A)}≥ t) and S^{(A)} or the propensity score model for π(S,t_0)=P(G_i=B|S_i=S, T_{i} > t_0) is correctly specified. While π(S,t_0) depends on t_0, for simplicity, we drop t_0 from our notation and simply use π(S).

Regression models can be imposed to obtain estimates for π(S). For example, a simple logistic regression model can be imposed for \tilde{π}(S)=P(G_i=B|S_i=S, X_{i}>t_0) with logit\{\tilde{π}(S)\} = α_0 + α_1 ^{T} S, where α_0 and α_1 are estimated only among those with X_{gi} > t_0 to account for censoring. The propensity score of interest, π(S), can be derived from \tilde{π}(S) directly since logit\{π(S)\}=logit\{\tilde{π}(S)\} + \log\{W_A^C(t_0)/W_B^C(t_0)\}, which follows from the assumption that (T_{gi}, S_{gi}) \perp C_{gi}. We then modify the above expression by weighting observations with the estimated L(S_{Ai})=π(S_{Ai})/\{1-π(S_{Ai})\} and obtain

\hat{Λ}^w_A(t|u, t_0) = \int_{t_0}^t \frac{∑_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\}dN_{Ai}(z)}{∑_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\} Y_{Ai}(z)},

, where \hat{L}(S_{gi}) = \exp(\hat{α}_0+\hat{α}_1^{T} S_{gi})\hat{W}^C_B(t_0)/\hat{W}^C_A(t_0).

Subsequently, we define \hat{Δ}^w_{S}(t,t_0) = n_B^{-1} ∑_{i=1}^{n_B} [\hat{φ}^w_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}] and \hat{R}^w_{S}(t,t_0) =1- \hat{Δ}^w_{S}(t,t_0)/\hat{Δ}(t) where \hat φ^w_A(t|t_0,u) = \exp\{-\hat{Λ}^w_A(t|t_0,u) \}. We refer to this estimate as the weighted two-stage robust estimator (option 2 for type).

While the two-stage weighted estimator reflects one way to enhance the robustness of an initial estimator, the idea of combining a propensity-score type model and a regression-type model has certainly been extensively studied in the causal inference literature and a more familiar double-robust estimator can be constructed as: \hat{Δ}_{S}^{DR}(t,t_0) = n^{-1} [∑_{i=1}^{n_A}\frac{I(X_{Ai}>t)}{\hat{W}_{A}^C(t)\bar{π}_B} \hat{L}(S_{Ai}) - ∑_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}_{B}^C(t)\bar{π}_B} ] - n^{-1} [ ∑_{i=1}^{n_A} \frac{ \hat{φ}_A(t\mid\hat{U}_{Ai},t_0) I(X_{Ai} > t_0) }{\hat{W}_{A}^C(t_0)\bar{π}_B} \hat{L}(S_{Ai}) - ∑_{i=1}^{n_B}\frac{ \hat{φ}_A(t\mid \hat{U}_{Bi},t_0) I(X_{Bi} > t_0) }{\hat{W}_{B}^C(t_0)\bar{π}_B} ] and \hat{R}_{S}^{DR}(t,t_0) =1- \hat{Δ}_{S}^{DR}(t,t_0)/\hat{Δ}(t), where \hat{φ}_A(t|\hat{U}_{gi},t_0) is the (unweighted) estimate of φ_A(t\mid u,t_0) used in \hat{Δ}_{S}(t,t_0). We refer to this estimate as the double robust estimator (option 3 for type).

The weighted estimator (option type 5) is defined as: \hat{Δ}^{PS}_{S}(t,t_0) = n^{-1} ∑_{i=1}^n \{\frac{I(X_{i}>t)}{\hat{W}_{G_i}^C(t)\bar{π}_B} [ I(G_i = A)\hat{L}(S_{Ai}) - I(G_i = B ) ] \} and \hat{R}^{PS}_{S}(t,t_0) =1- \hat{Δ}^{PS}_{S}(t,t_0)/\hat{Δ}(t). This estimator completely relies on the correct specification of π(S). The double-robust model-based estimator (option 6 for type) is defined as \hat{Δ}_{S}^{DR2}(t,t_0) and \hat{R}_{S}^{DR2}(t,t_0) =1- \hat{Δ}_{S}^{DR2}(t,t_0)/\hat{Δ}(t) which are constructed parallel to the construction of \hat{Δ}_{S}^{DR}(t,t_0) i.e., a combination of \hat{Δ}_{S}^M(t,t_0) and \hat{R}^{PS}_{S}(t,t_0).

Variance estimates are obtained using perturbation resampling. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. An estimate of the incremental value of the surrogate marker information can also be requested; this essentially compared the proportion explained by the surrogate information vs. the proportion explained by T alone up to t_0. Details can be found in Parast, L., Cai, T., Tian, L. (2021). Evaluating Multiple Surrogate Markers with Censored Data. Biometrics, In press.

Value

A list is returned:

delta

the estimate, \hat{Δ}(t), described in delta.estimate documentation.

delta.s

the residual treatment effect estimate, \hat{Δ}_S(t,t_0).

R.s

the estimated proportion of treatment effect explained by the set of markers, \hat{R}_S(t,t_0).

delta.var

the variance estimate of \hat{Δ}(t); if var = TRUE or conf.int = TRUE.

delta.s.var

the variance estimate of \hat{Δ}_S(t,t_0); if var = TRUE or conf.int = TRUE.

R.s.var

the variance estimate of \hat{R}_S(t,t_0); if var = TRUE or conf.int = TRUE.

conf.int.normal.delta

a vector of size 2; the 95% confidence interval for \hat{Δ}(t) based on a normal approximation; if conf.int = TRUE.

conf.int.quantile.delta

a vector of size 2; the 95% confidence interval for \hat{Δ}(t) based on sample quantiles of the perturbed values; if conf.int = TRUE.

conf.int.normal.delta.s

a vector of size 2; the 95% confidence interval for \hat{Δ}_S(t,t_0) based on a normal approximation; if conf.int = TRUE.

conf.int.quantile.delta.s

a vector of size 2; the 95% confidence interval for \hat{Δ}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE.

conf.int.normal.R.s

a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on a normal approximation; if conf.int = TRUE.

conf.int.quantile.R.s

a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE.

conf.int.fieller.R.s

a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on Fieller's approach; if conf.int = TRUE.

delta.t

the estimate, \hat{Δ}_T(t,t_0); if incremental.vaue = TRUE.

R.t

the estimated proportion of treatment effect explained by survival only, \hat{R}_T(t,t_0); if incremental.vaue = TRUE.

incremental.value

the estimate of the incremental value of the surrogate markers, \hat{IV}_S(t,t_0); if incremental.vaue = TRUE.

delta.t.var

the variance estimate of \hat{Δ}_T(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE.

R.t.var

the variance estimate of \hat{R}_T(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE.

incremental.value.var

the variance estimate of \hat{IV}_S(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE.

conf.int.normal.delta.t

a vector of size 2; the 95% confidence interval for \hat{Δ}_T(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.quantile.delta.t

a vector of size 2; the 95% confidence interval for \hat{Δ}_T(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.normal.R.t

a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.quantile.R.t

a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.fieller.R.t

a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on Fieller's approach; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.normal.iv

a vector of size 2; the 95% confidence interval for \hat{IV}_S(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE.

conf.int.quantile.iv

a vector of size 2; the 95% confidence interval for \hat{IV}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE.

Note

If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".

Author(s)

Layla Parast

References

Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.

Details can be found in Parast, L., Cai, T., Tian, L. (2021). Evaluating Multiple Surrogate Markers with Censored Data. Biometrics, In press.

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
data(d_example_multiple)
names(d_example_multiple)
## Not run: 
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone = 
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone = 
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0), 
type =1, t = 1, landmark=0.5)
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone = 
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone = 
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0), 
type =1, t = 1, landmark=0.5, conf.int = T)	
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone = 
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone = 
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0), 
type =3, t = 1, landmark=0.5)

## End(Not run)

Rsurrogate documentation built on Nov. 14, 2021, 9:07 a.m.