Description Usage Arguments Details Value Note Author(s) References Examples
View source: R/Functions_Rsurrogate.R
This function calculates the proportion of treatment effect on the primary outcome explained by multiple surrogate markers measured at t_0 and primary outcome information up to t_0. The user can also request a variance estimate, estimated using perturbating-resampling, and a 95% confidence interval. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. The user can also request an estimate of the incremental value of the surrogate marker information.
1 2 3 |
xone |
numeric vector, the observed event times in the treatment group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
xzero |
numeric vector, the observed event times in the control group, X = min(T,C) where T is the time of the primary outcome and C is the censoring time. |
deltaone |
numeric vector, the event indicators for the treatment group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
deltazero |
numeric vector, the event indicators for the control group, D = I(T<C) where T is the time of the primary outcome and C is the censoring time. |
sone |
matrix of numeric values; surrogate marker measurements at t_0 for treated observations. If X_{1i}<t_0, then the surrogate marker measurements should be NA. |
szero |
matrix of numeric values; surrogate marker measurements at t_0 for control observations. If X_{0i}<t_0, then the surrogate marker measurements should be NA. |
type |
type of estimate; options are 1 = two-stage robust estimator, 2 = weighted two-stage robust estimator, 3 = double-robust estimator, 4 = two-stage model-based estimator, 5 = weighted estimator, 6 = double-robust model-based estimator; default is 1. |
t |
the time of interest. |
weight.perturb |
weights used for perturbation resampling. |
landmark |
the landmark time t_0 or time of surrogate marker measurement. |
extrapolate |
TRUE or FALSE; indicates whether the user wants to use extrapolation. |
transform |
TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker pseudo-score. |
conf.int |
TRUE or FALSE; indicates whether a 95% confidence interval for delta is requested, default is FALSE. |
var |
TRUE or FALSE; indicates whether a variance estimate is requested, default is FALSE. |
incremental.value |
TRUE or FALSE; indicates whether the user would like to see the incremental value of the surrogate marker information, default is FALSE. |
approx |
TRUE or FALSE indicating whether an approximation should be used when calculating the probability of censoring; most relevant in settings where the survival time of interest for the primary outcome is greater than the last observed event but before the last censored case, default is TRUE. |
Let G \in \{A, B\} be the binary treatment indicator and we assume that subjects are randomly assigned to either treatment group A or B at baseline. Let T denote the time to the occurrence of the primary outcome, death for example, and S = (S_1,S_2,...,S_k)^{T} denote the vector of k surrogate markers measured at a given time t_0. Let T^{(g)} and S^{(g)} denote the counterfactual event time and surrogate marker measurements under treatment G = g for g \in \{A, B\}. In practice, we only observe (T, S)=(T^{(A)}, S^{(A)}) or (T^{(B)}, S^{(B)}) depending on whether G=A or B. The treatment effect, Δ(t), is the treatment difference in survival rates at time t > t_0, Δ(t)=E\{ I(T^{(A)}>t)\} - E\{I(T^{(B)}>t)\} = P(T^{(A)}>t) - P(T^{(B)}>t) where I(\cdot) is the indicator function. For individuals who are censored or experience the primary outcome before t_0, we assume that their S information is not available.
The surrogate marker information at time t_0 is defined as a combination of the observed information on I(T >t_0) and the observed S at t_0, denoted by Q_{t_0} = \{Q_{t_0}^{(g)}, g = A, B\}, where Q_{t_0} ^{(g)} = \{ T ^{(g)} \wedge t_0, S ^{(g)} I(T ^{(g)} > t_0)\}. With information on Q_{t_0}, the residual treatment effect is defined as: Δ_{S}(t,t_0) = E\{ I(T ^{(A)} > t) - I(T ^{(B)}>t) \mid Q_{t_0}^{(A)} = Q_{t_0}^{(B)} \} =P(T^{(B)} > t_0) \int_{S} ψ_A(t \mid S, t_0) dF_{B}(S \mid t_0) -P(T^{(B)}> t) where S = (s_1, ..., s_k)^{T}, F_{g}(S \mid t_0) = P(S^{(g)}≤ S \mid T^{(g)}>t_0), ψ_g(t\mid S,t_0) = P(T^{(g)}> t\mid S^{(g)}=S, T^{(g)}> t_0). The proportion of the treatment effect on the primary outcome that is explained by the treatment effect on Q_{t_0} is R_{S}(t,t_0)=1-Δ_{S}(t,t_0)/Δ. This function provides 6 different estimators for R_{S}(t,t_0) using censored data.
Due to censoring, the observed data consist of n observations \{(G_i, X_{i}, δ_{i}, S_{i}I(X_{i} > t_0)), i=1,...,n\} from the two treatment groups, where X_{i} = \min(T_{i}, C_{i}), δ_{i} = I(T_{i} < C_{i}), and C_{i} denotes the censoring time for the ith subject. We assume independent censoring i.e., (T_i, S_i) \perp C_i \mid G_i. For ease of notation, we also let \{(X_{gi}, δ_{gi}, S_{gi}I(X_{gi} > t_0)), i=1,...,n_g\} denote n_g = ∑_{i=1}^n I(G_i = g) observations from treatment group g \in \{A,B\}, where X_{gi}=\min(T_{gi}, C_{gi}) and δ_{gi}=I(T_{gi}<C_{gi}). Without loss of generality, we assume that \bar{π}_g = n_g/n \to π_g \in (0,1) as n\to ∞. Throughout, we estimate the treatment effect Δ(t) =P(T^{(A)}>t) - P(T^{(B)}>t) as \hat{Δ}(t) = n_A^{-1} ∑_{i=1}^{n_A} \frac{I(X_{Ai}>t)}{\hat{W}^C_A(t)} - n_B^{-1} ∑_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}^C_B(t)} where \hat{W}^C_g(t) is the Kaplan-Meier estimator of W^C_g(t) = P(C_{g} > t).
We first describe the two-stage robust estimator which involves a two-stage procedure combining the use of a working model and a nonparametric estimation procedure for Δ_{S}(t,t_0). The idea is simply to summarize S into a univariate score U and then construct a nonparametric estimator for R_S(t,t_0) treating U as S. To construct U, we approximate the conditional distribution of T^{(A)} \mid S^{(A)}, T^{(A)} > t_0 by using a working semiparametric model such as the landmark proportional hazards model q_A(S) = P(T^{(A)} >t \mid T^{(A)} >t_0, S^{(A)}) = \exp\{-Λ_0(t|t_0) \exp(β ^{T} S^{(A)}) \}, t>t_0, where Λ_0(t|t_0) is the unspecified baseline cumulative hazard function for T^{(A)} conditional on \{T^{(A)} > t_0\} and β is an unknown vector of coefficients. Let \hat{β} be the maximizer of the corresponding log partial likelihood function and \hat{Λ}_0(t|t_0) be the Breslow-type estimate of baseline hazard. If one were to assume that this working model is correctly specified, then a consistent estimate of Δ_{S}(t,t_0) would simply be: \hat{Δ}_{S}^M=n_B^{-1} ∑_{i=1}^{n_B} [ \exp \{ -\hat{Λ}_0(t|t_0)\exp(\hat{β} ^{T} S_{Bi}) \} \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}]. We refer to this estimate as the two-stage model-based estimator (option 4 for type). Instead of relying on correct specification of this model, we use the resulting score U = β_0^{T}S as a univariate “pseudo-marker" to summarize the k surrogates. In the second stage, to estimate Δ_{S}(t,t_0), we apply a nonparametric approach with S represented by the univariate marker U. Specifically, we use a kernel Nelson-Aalen estimator to nonparametrically estimate φ_A(t|u, t_0)=P(T^{(A)}> t\mid U^{(A)}=u, T^{(A)}> t_0) = \exp\{-Λ_A(t|u, t_0 )\} as \hat φ_A(t|u, t_0) = \exp\{-\hat{Λ}_A(t|u, t_0) \}, where \hat{Λ}_A(t|u, t_0) = \int_{t_0}^t \frac{∑_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\}dN_{Ai}(z)}{∑_{i=1}^{n_A} I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\} Y_{Ai}(z)}, \hat{U}_{Ai} = \hat{β} ^{T} S_{Ai}, \hat{U}_{Bi} = \hat{β} ^{T} S_{Bi}, Y_{Ai} = I(X_{Ai} ≥q t), N_{Ai}(t) = I(X_{Ai} ≤q t) δ_{Ai}, K(\cdot) is a smooth symmetric density function, K_h(x) = K(x/h)/h, and γ(\cdot) is a given monotone transformation function. We then estimate Δ_{S}(t,t_0) as \hat{Δ}_{S}(t,t_0) = n_B^{-1} ∑_{i=1}^{n_B} [\hat{φ}_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}] and \hat{R}_{S}(t,t_0) =1- \hat{Δ}_{S}(t,t_0)/\hat{Δ}(t). We refer to this estimate as the two-stage robust estimator (option 1 for type).
The next estimator borrows ideas from the extensive causal inference literature focusing on double robust estimators two-stage weighted estimator with a propensity score weight explicitly balancing the two treatment groups with respect to the distribution of S. The weighting enables us to “adjust" the distribution of S^{(A)} before constructing the conditional survival estimate \hat φ_A(t|u, t_0). This approach results in a double-robust estimator of Δ_{S}(t, t_0), which is consistent when either U^{(A)} captures all the information about the relationship between I(T^{(A)}≥ t) and S^{(A)} or the propensity score model for π(S,t_0)=P(G_i=B|S_i=S, T_{i} > t_0) is correctly specified. While π(S,t_0) depends on t_0, for simplicity, we drop t_0 from our notation and simply use π(S).
Regression models can be imposed to obtain estimates for π(S). For example, a simple logistic regression model can be imposed for \tilde{π}(S)=P(G_i=B|S_i=S, X_{i}>t_0) with logit\{\tilde{π}(S)\} = α_0 + α_1 ^{T} S, where α_0 and α_1 are estimated only among those with X_{gi} > t_0 to account for censoring. The propensity score of interest, π(S), can be derived from \tilde{π}(S) directly since logit\{π(S)\}=logit\{\tilde{π}(S)\} + \log\{W_A^C(t_0)/W_B^C(t_0)\}, which follows from the assumption that (T_{gi}, S_{gi}) \perp C_{gi}. We then modify the above expression by weighting observations with the estimated L(S_{Ai})=π(S_{Ai})/\{1-π(S_{Ai})\} and obtain
\hat{Λ}^w_A(t|u, t_0) = \int_{t_0}^t \frac{∑_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\}dN_{Ai}(z)}{∑_{i=1}^{n_A} \hat{L}(S_{Ai}) I(X_{Ai}>t_0) K_h\{γ(\hat{U}_{Ai}) - γ(u)\} Y_{Ai}(z)},
, where \hat{L}(S_{gi}) = \exp(\hat{α}_0+\hat{α}_1^{T} S_{gi})\hat{W}^C_B(t_0)/\hat{W}^C_A(t_0).
Subsequently, we define \hat{Δ}^w_{S}(t,t_0) = n_B^{-1} ∑_{i=1}^{n_B} [\hat{φ}^w_A(t|\hat{U}_{Bi},t_0) \frac{I(X_{Bi} > t_0)}{\hat{W}^C_B(t_0)} - \frac{I(X_{Bi} > t)}{\hat{W}^C_B(t)}] and \hat{R}^w_{S}(t,t_0) =1- \hat{Δ}^w_{S}(t,t_0)/\hat{Δ}(t) where \hat φ^w_A(t|t_0,u) = \exp\{-\hat{Λ}^w_A(t|t_0,u) \}. We refer to this estimate as the weighted two-stage robust estimator (option 2 for type).
While the two-stage weighted estimator reflects one way to enhance the robustness of an initial estimator, the idea of combining a propensity-score type model and a regression-type model has certainly been extensively studied in the causal inference literature and a more familiar double-robust estimator can be constructed as: \hat{Δ}_{S}^{DR}(t,t_0) = n^{-1} [∑_{i=1}^{n_A}\frac{I(X_{Ai}>t)}{\hat{W}_{A}^C(t)\bar{π}_B} \hat{L}(S_{Ai}) - ∑_{i=1}^{n_B} \frac{I(X_{Bi}>t)}{\hat{W}_{B}^C(t)\bar{π}_B} ] - n^{-1} [ ∑_{i=1}^{n_A} \frac{ \hat{φ}_A(t\mid\hat{U}_{Ai},t_0) I(X_{Ai} > t_0) }{\hat{W}_{A}^C(t_0)\bar{π}_B} \hat{L}(S_{Ai}) - ∑_{i=1}^{n_B}\frac{ \hat{φ}_A(t\mid \hat{U}_{Bi},t_0) I(X_{Bi} > t_0) }{\hat{W}_{B}^C(t_0)\bar{π}_B} ] and \hat{R}_{S}^{DR}(t,t_0) =1- \hat{Δ}_{S}^{DR}(t,t_0)/\hat{Δ}(t), where \hat{φ}_A(t|\hat{U}_{gi},t_0) is the (unweighted) estimate of φ_A(t\mid u,t_0) used in \hat{Δ}_{S}(t,t_0). We refer to this estimate as the double robust estimator (option 3 for type).
The weighted estimator (option type 5) is defined as: \hat{Δ}^{PS}_{S}(t,t_0) = n^{-1} ∑_{i=1}^n \{\frac{I(X_{i}>t)}{\hat{W}_{G_i}^C(t)\bar{π}_B} [ I(G_i = A)\hat{L}(S_{Ai}) - I(G_i = B ) ] \} and \hat{R}^{PS}_{S}(t,t_0) =1- \hat{Δ}^{PS}_{S}(t,t_0)/\hat{Δ}(t). This estimator completely relies on the correct specification of π(S). The double-robust model-based estimator (option 6 for type) is defined as \hat{Δ}_{S}^{DR2}(t,t_0) and \hat{R}_{S}^{DR2}(t,t_0) =1- \hat{Δ}_{S}^{DR2}(t,t_0)/\hat{Δ}(t) which are constructed parallel to the construction of \hat{Δ}_{S}^{DR}(t,t_0) i.e., a combination of \hat{Δ}_{S}^M(t,t_0) and \hat{R}^{PS}_{S}(t,t_0).
Variance estimates are obtained using perturbation resampling. If a confidence interval is requested three versions are provided: a normal approximation based interval, a quantile based interval and Fieller's confidence interval, all using perturbation-resampling. An estimate of the incremental value of the surrogate marker information can also be requested; this essentially compared the proportion explained by the surrogate information vs. the proportion explained by T alone up to t_0. Details can be found in Parast, L., Cai, T., Tian, L. (2021). Evaluating Multiple Surrogate Markers with Censored Data. Biometrics, In press.
A list is returned:
delta |
the estimate, \hat{Δ}(t), described in delta.estimate documentation. |
delta.s |
the residual treatment effect estimate, \hat{Δ}_S(t,t_0). |
R.s |
the estimated proportion of treatment effect explained by the set of markers, \hat{R}_S(t,t_0). |
delta.var |
the variance estimate of \hat{Δ}(t); if var = TRUE or conf.int = TRUE. |
delta.s.var |
the variance estimate of \hat{Δ}_S(t,t_0); if var = TRUE or conf.int = TRUE. |
R.s.var |
the variance estimate of \hat{R}_S(t,t_0); if var = TRUE or conf.int = TRUE. |
conf.int.normal.delta |
a vector of size 2; the 95% confidence interval for \hat{Δ}(t) based on a normal approximation; if conf.int = TRUE. |
conf.int.quantile.delta |
a vector of size 2; the 95% confidence interval for \hat{Δ}(t) based on sample quantiles of the perturbed values; if conf.int = TRUE. |
conf.int.normal.delta.s |
a vector of size 2; the 95% confidence interval for \hat{Δ}_S(t,t_0) based on a normal approximation; if conf.int = TRUE. |
conf.int.quantile.delta.s |
a vector of size 2; the 95% confidence interval for \hat{Δ}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE. |
conf.int.normal.R.s |
a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on a normal approximation; if conf.int = TRUE. |
conf.int.quantile.R.s |
a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE. |
conf.int.fieller.R.s |
a vector of size 2; the 95% confidence interval for \hat{R}_S(t,t_0) based on Fieller's approach; if conf.int = TRUE. |
delta.t |
the estimate, \hat{Δ}_T(t,t_0); if incremental.vaue = TRUE. |
R.t |
the estimated proportion of treatment effect explained by survival only, \hat{R}_T(t,t_0); if incremental.vaue = TRUE. |
incremental.value |
the estimate of the incremental value of the surrogate markers, \hat{IV}_S(t,t_0); if incremental.vaue = TRUE. |
delta.t.var |
the variance estimate of \hat{Δ}_T(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE. |
R.t.var |
the variance estimate of \hat{R}_T(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE. |
incremental.value.var |
the variance estimate of \hat{IV}_S(t,t_0); if var = TRUE or conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.normal.delta.t |
a vector of size 2; the 95% confidence interval for \hat{Δ}_T(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.quantile.delta.t |
a vector of size 2; the 95% confidence interval for \hat{Δ}_T(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.normal.R.t |
a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.quantile.R.t |
a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.fieller.R.t |
a vector of size 2; the 95% confidence interval for \hat{R}_T(t,t_0) based on Fieller's approach; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.normal.iv |
a vector of size 2; the 95% confidence interval for \hat{IV}_S(t,t_0) based on a normal approximation; if conf.int = TRUE and incremental.vaue = TRUE. |
conf.int.quantile.iv |
a vector of size 2; the 95% confidence interval for \hat{IV}_S(t,t_0) based on sample quantiles of the perturbed values; if conf.int = TRUE and incremental.vaue = TRUE. |
If the treatment effect is not significant, the user will receive the following message: "Warning: it looks like the treatment effect is not significant; may be difficult to interpret the residual treatment effect in this setting".
Layla Parast
Parast, L., Cai, T., & Tian, L. (2017). Evaluating surrogate marker information using censored data. Statistics in Medicine, 36(11), 1767-1782.
Details can be found in Parast, L., Cai, T., Tian, L. (2021). Evaluating Multiple Surrogate Markers with Censored Data. Biometrics, In press.
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 | data(d_example_multiple)
names(d_example_multiple)
## Not run:
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =1, t = 1, landmark=0.5)
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =1, t = 1, landmark=0.5, conf.int = T)
R.multiple.surv(xone = d_example_multiple$x1, xzero = d_example_multiple$x0, deltaone =
d_example_multiple$delta1, deltazero = d_example_multiple$delta0, sone =
as.matrix(d_example_multiple$s1), szero = as.matrix(d_example_multiple$s0),
type =3, t = 1, landmark=0.5)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.