R.s.estimate.me: Calculates the proportion of treatment effect explained...

Description

This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. This function is intended to be used for a fully observed continuous outcome. The user must specify what type of estimation they would like (parametric or nonparametric estimation of the proportion explained, denoted by R) and what estimator they would like (see below for details).

Usage

R.s.estimate.me(sone, szero, yone, yzero, parametric = FALSE, estimator = "n", 
me.variance, extrapolate = TRUE, transform = FALSE, naive = FALSE, Ronly = TRUE)

Arguments

sone

numeric vector or matrix; surrogate marker for treated observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with n_1 (number of treated observations) rows and n.s (number of surrogate markers) columns.

szero

numeric vector or matrix; surrogate marker for control observations, assumed to be continuous. If there are multiple surrogates then this should be a matrix with n_0 (number of control observations) rows and n.s (number of surrogate markers) columns.

yone

numeric vector; primary outcome for treated observations, assumed to be continuous.

yzero

numeric vector; primary outcome for control observations, assumed to be continuous.

parametric

TRUE or FALSE; indicates whether the user wants the parametric approach to be used (TRUE) or nonparametric (FALSE).

estimator

options are "d","q","n" for parametric and "q","n" for nonparametric; "d" stands for the disattenuated estimator, "q" stands for the SIMEX estimator with quadratic extrapolation, "n" stands for the SIMEX estimator with a nonlinear extrapolation. Note that the nonlinear extrapolation may have convergence issues with a small sample size; if this occurs, please consider using quadratic extrapolation instead.

me.variance

the variance of the measurement error; must be provided.

extrapolate

TRUE or FALSE; indicates whether the user wants to use extrapolation.

transform

TRUE or FALSE; indicates whether the user wants to use a transformation for the surrogate marker.

naive

TRUE or FALSE; indicates whether the user wants the naive estimate (not correcting for measurement error) to also be calculated.

Ronly

TRUE or FALSE; indicates whether the user wants only R (and corresponding variance and confidence intervals) to be returned.
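The convergence caveat noted for estimator = "n" can be illustrated with a small base R sketch. It is not the package's internal code: the helper name extrapolate_to_minus_one, the starting values, and the toy inputs are all assumptions made for illustration. It fits the nonlinear extrapolant α_0 + α_1/(α_2 + λ) with nls() and falls back to the quadratic extrapolant if that fit fails.

```r
# Hypothetical helper (not part of Rsurrogate): extrapolate averaged SIMEX
# estimates to lambda = -1, preferring the nonlinear form and falling back
# to the quadratic form if nls() fails or returns a non-finite value.
extrapolate_to_minus_one <- function(lambda, est) {
  d <- data.frame(lambda = lambda, est = est)
  nonlinear <- tryCatch({
    fit <- nls(est ~ a0 + a1 / (a2 + lambda), data = d,
               start = list(a0 = max(est) + 1, a1 = -1, a2 = 2))  # assumed starting values
    a <- coef(fit)
    list(value = unname(a["a0"] + a["a1"] / (a["a2"] - 1)), method = "nonlinear")
  }, error = function(e) NULL)
  if (!is.null(nonlinear) && is.finite(nonlinear$value)) return(nonlinear)
  fit <- lm(est ~ lambda + I(lambda^2), data = d)
  list(value = unname(predict(fit, newdata = data.frame(lambda = -1))),
       method = "quadratic")
}

# Toy inputs (made up): a smooth increasing curve plus a little noise.
set.seed(10)
lambda <- seq(0, 2, by = 0.25)
est <- 4 - 6 / (3 + lambda) + rnorm(length(lambda), sd = 0.01)
res <- extrapolate_to_minus_one(lambda, est)
res
```

The method element records which extrapolant was actually used, so a silent fallback is visible in the output.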

Details

While there are many methods available to quantify the value of a surrogate marker, most assume that the marker is measured without error. This function calculates the proportion of treatment effect on the primary outcome explained by the treatment effect on a surrogate marker, correcting for measurement error in the surrogate marker. The user can choose either the parametric framework or nonparametric framework for estimation. Within the parametric framework there are three options for measurement error correction: the disattenuated estimator, the SIMEX estimator with quadratic extrapolation, and the SIMEX estimator with nonlinear extrapolation. Within the nonparametric framework there are two options for measurement error correction: the SIMEX estimator with quadratic extrapolation and the SIMEX estimator with nonlinear extrapolation. We describe each below.

Let G be the binary treatment indicator with G=1 indicating treatment and G=0 indicating control (or placebo). We assume throughout that subjects are randomly assigned to treatment or control at baseline. Let Y and S denote the continuous primary outcome and continuous surrogate marker, respectively, where S is measured post-baseline and is assumed to be a biomarker, clinical measurement, psychological test score, or other physiological measurement. In the absence of measurement error, the observed data consist of \{Y_i, S_i, G_i\} for i \in \{1,...,n\}. With measurement error, instead of observing S we observe W = S + U, where E(U|S) = 0 and the variance of U is σ_u^2. Such measurement error may be attributable to, for example, laboratory error. Thus, our observed data will consist of \{Y_i, W_i, G_i\} for i \in \{1,...,n\}. Throughout, we assume that σ_u^2 is known. Here, we are interested in estimating the proportion of the treatment effect on the primary outcome that is explained by the treatment effect on the surrogate marker, denoted as R_S.
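As a minimal sketch of this setup, the following base R code simulates \{Y_i, W_i, G_i\} under the additive error model W = S + U. The sample size, coefficients, and σ_u^2 = 0.5 are arbitrary illustrative choices, not values from the package or the paper.

```r
# Simulated data under W = S + U (all numeric settings are assumptions).
set.seed(1)
n <- 2000
G <- rbinom(n, 1, 0.5)                     # randomized treatment indicator
S <- rnorm(n, mean = 1 + 1.5 * G, sd = 1)  # true surrogate (unobserved in practice)
Y <- 2 + 1 * G + 2 * S + rnorm(n)          # primary outcome
sigma2_u <- 0.5                            # measurement error variance, assumed known
U <- rnorm(n, mean = 0, sd = sqrt(sigma2_u))
W <- S + U                                 # observed, error-prone surrogate

# The variance of W exceeds the variance of S by roughly sigma2_u.
c(var_S = var(S), var_W = var(W), difference = var(W) - var(S))
```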

To estimate R_S parametrically, we assume the following models: E(Y|G) = β_0 + β_1 G and E(Y|G,S) = β_0^* + β_1^*G + β_2^* S. It can be shown that if these models hold, R_S=1-β_1^*/β_1. When W = S+U is available instead of S, this measurement error does not affect estimation of β_1, but it does affect estimation of β_1^* and β_2^*. Since estimation of R_S relies on estimation of β_1 and β_1^*, we focus on the effect of measurement error on β_1^* estimation. The attenuation bias for \hat β_1^* and \hat R can be written out in closed form when the proportion of treatment effect is parametrically estimated as described above, when these specified models hold, and when the surrogate marker S is measured with error. There exist two methods to eliminate this bias when estimating R_S. Taking advantage of the fact that we can express the attenuation bias in closed form, the first is a straightforward disattenuated estimator: \hat β _{1A} = \hat{β}_1^* - \frac{ \hat{β}_2^* \{Ω^2_{W} Ω_{GW}-Ω_{GW}(Ω^2_{W} - σ_u^2)\}}{Ω^2_{G}(Ω^2_{W} - σ_u^2)-Ω_{GW}Ω_{GW}} and \hat{R}_{A} = 1- \left[ \hat{β}_1^* - \frac{ \hat{β}_2^* \{Ω^2_{W} Ω_{GW}-Ω_{GW}(Ω^2_{W} - σ_u^2)\}}{Ω^2_{G}(Ω^2_{W} - σ_u^2)-Ω_{GW}Ω_{GW}} \right] / \hat{β}_1, where Ω^2_{G} and Ω^2_{W} denote the sample variances of G and W, and Ω_{GW} denotes their sample covariance.
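The disattenuated estimator can be sketched in a few lines of base R. This is an illustration of the formula for \hat β_{1A} on simulated data, not the package's internal code; the data-generating values (true R_S = 0.75, σ_u^2 = 0.5) are assumptions chosen so the attenuation is easy to see.

```r
# Simulated data; under these assumed settings beta_1 = 4, beta_1^* = 1, R_S = 0.75.
set.seed(2)
n <- 5000
G <- rbinom(n, 1, 0.5)
S <- rnorm(n, mean = 1 + 1.5 * G)
Y <- 2 + 1 * G + 2 * S + rnorm(n)
sigma2_u <- 0.5
W <- S + rnorm(n, sd = sqrt(sigma2_u))

# Unadjusted model E(Y|G) and naive adjusted model using W in place of S.
beta1    <- unname(coef(lm(Y ~ G))["G"])
fit_w    <- lm(Y ~ G + W)
b1_naive <- unname(coef(fit_w)["G"])
b2_naive <- unname(coef(fit_w)["W"])

# Sample variances and covariance entering the correction term.
om2_G <- var(G); om2_W <- var(W); om_GW <- cov(G, W)
num <- om2_W * om_GW - om_GW * (om2_W - sigma2_u)
den <- om2_G * (om2_W - sigma2_u) - om_GW * om_GW
b1_dis <- b1_naive - b2_naive * num / den

R_naive <- 1 - b1_naive / beta1
R_dis   <- 1 - b1_dis / beta1
c(R_naive = R_naive, R_dis = R_dis)
```

In this simulation the naive estimate understates the proportion explained, while the disattenuated estimate should land close to the value used to generate the data.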

The second method to eliminate this bias uses Simulation Extrapolation (SIMEX) estimation, which is a simulation-based method that involves first generating additional measurement error and observing how it affects the bias of the parameter estimate of interest, and then extrapolating this information to a setting with no measurement error. To incorporate SIMEX estimation within our surrogate marker framework, we define W_{b,i}(λ) = W_i + λ^{1/2} σ_u ε_{i,b} for b=1,...,B where B=50, ε_{i,b} \sim N(0,1), σ_u is assumed known, and λ \in \{0,0.25,0.5,0.75,1.0,1.25,1.5,1.75,2.0\}. For each iteration b and each λ value, we obtain \hat β_{1b}^*(λ) by fitting the regression model E(Y \mid G, W_b(λ)) = β_{0b}^* + β_{1b}^* G + β_{2b}^* W_{b}(λ). We then calculate the average estimate over the iterations b=1,...,B for each λ value, denoted as \hat β^*_{1,S,σ^2_u(1+λ)} = B^{-1} ∑_{b=1}^B \hat β_{1b}^*(λ). The second step, extrapolation, takes these average estimates for each λ value and extrapolates using a function G(Γ, λ) to obtain the estimated quantity at λ=-1. For the extrapolation step, we use either a quadratic or a nonlinear extrapolation, i.e., we solve for Γ = (α_0, α_1, α_2)^T in \hat β^*_{1,S,σ^2_u(1+λ)} = α_0 + α_1 λ + α_2 λ^2 or in \hat β^*_{1,S,σ^2_u(1+λ)}= α_0 + α_1 /( α_2 + λ), respectively. Using the estimates of α_0, α_1, α_2, we calculate the predicted \hat β^*_{1,S,σ^2_u(1+λ)} when λ = -1. In essence, the simulations produce successively larger total measurement error variances (1+λ)σ^2_u, and the extrapolation to λ = -1 corresponds to a measurement error variance of 0. We denote the resulting estimator of β_1^* as \hat{β}^*_{1,SIMEX} = G(\hat Γ, -1) and define \hat{R}_{SIMEX} = 1- \hat{β}^*_{1,SIMEX}/ \hat β_1.
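A minimal sketch of the two SIMEX steps in the parametric setting follows, using base R and the quadratic extrapolant only. It is an illustration on simulated data, not the package's implementation: the data-generating values are assumptions, and in each simulation step the outcome is regressed on the treatment indicator and the error-inflated surrogate W_b(λ).

```r
# Simulated data (assumed settings; true R_S = 0.75).
set.seed(3)
n <- 2000
G <- rbinom(n, 1, 0.5)
S <- rnorm(n, mean = 1 + 1.5 * G)
Y <- 2 + 1 * G + 2 * S + rnorm(n)
sigma2_u <- 0.5
W <- S + rnorm(n, sd = sqrt(sigma2_u))

lambdas <- seq(0, 2, by = 0.25)
B <- 50

# Simulation step: add extra error with variance lambda * sigma2_u, refit,
# and average the treatment coefficient over the B replicates.
avg_b1 <- sapply(lambdas, function(lam) {
  mean(replicate(B, {
    W_b <- W + sqrt(lam) * sqrt(sigma2_u) * rnorm(n)
    coef(lm(Y ~ G + W_b))["G"]
  }))
})

# Extrapolation step: quadratic in lambda, evaluated at lambda = -1.
d <- data.frame(lambda = lambdas, est = avg_b1)
quad <- lm(est ~ lambda + I(lambda^2), data = d)
b1_simex <- unname(predict(quad, newdata = data.frame(lambda = -1)))

beta1    <- unname(coef(lm(Y ~ G))["G"])
b1_naive <- unname(coef(lm(Y ~ G + W))["G"])
c(R_naive = 1 - b1_naive / beta1, R_simex = 1 - b1_simex / beta1)
```

The quadratic extrapolant generally removes only part of the bias, so R_simex is expected to fall between the naive estimate and the value used to generate the data.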

While the parametric approach to estimate the proportion of treatment effect explained by S is most commonly used in clinical practice, previous work has demonstrated biased results when the assumed models are not correctly specified. An alternative approach involves estimating the treatment effect, Δ, and the residual treatment effect, Δ_S, since R_S is defined as 1-Δ_S/Δ. The quantity Δ can be estimated simply by \hat{Δ} = n_1^{-1}∑_{i=1}^{n} Y_i I(G_i = 1) - n_0^{-1}∑_{i=1}^{n} Y_i I(G_i = 0), where n_1 and n_0 denote the number of individuals in the treatment and control groups, respectively. The quantity Δ_S can be estimated nonparametrically using kernel smoothing as \hat{Δ}_S = n_0^{-1} ∑_{i: G_i = 0}\hat{μ}_1(S_i) - n_0^{-1}∑_{i=1}^{n} Y_i I(G_i = 0) where \hat{μ}_1(s) = \{ ∑_{j: G_j = 1} K_h(S_j - s)Y_j \}/ \{∑_{j:G_j = 1} K_h(S_j - s)\}, K(\cdot) is a smooth symmetric density function with finite support, K_h(\cdot)=K(\cdot/h)/h and h is a specified bandwidth such that h=O(n_1^{-ν}) with ν \in (1/4,1/2).
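The kernel-smoothed estimator can be sketched in base R as below, for a surrogate measured without error. Two simplifying assumptions are made for illustration: a Gaussian kernel is used in place of a finite-support kernel, and the bandwidth is taken as bw.nrd(s1) * n1^(-0.1), which gives a rate of n_1^{-0.3}. The simulated data and the names s1, s0, y1, y0 are likewise illustrative.

```r
# Simulated data (assumed settings): Delta = 2, Delta_S = 1, so R_S = 0.5.
set.seed(4)
n1 <- 1000; n0 <- 1000
s1 <- rnorm(n1, mean = 1.5); s0 <- rnorm(n0, mean = 1)
y1 <- 3 + 2 * s1 + rnorm(n1)
y0 <- 2 + 2 * s0 + rnorm(n0)

# Bandwidth of order n1^(-0.3), within the stated range for nu (assumed choice).
h <- bw.nrd(s1) * n1^(-0.1)

# Kernel estimate of E(Y | S = s, G = 1) using the treated observations.
mu1_hat <- function(s) {
  k <- dnorm((s1 - s) / h) / h
  sum(k * y1) / sum(k)
}

delta_hat   <- mean(y1) - mean(y0)                      # treatment effect
delta_s_hat <- mean(sapply(s0, mu1_hat)) - mean(y0)     # residual treatment effect
R_hat <- 1 - delta_s_hat / delta_hat
c(delta = delta_hat, delta_s = delta_s_hat, R = R_hat)
```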

When W = S + U is available instead of S, estimation of Δ is not affected whereas estimation of Δ_S is affected and thus, the nonparametric estimation procedure described above results in a biased estimate of R_S. Unlike the parametric approach, the attenuation bias cannot be expressed in closed form. Within this nonparametric framework, SIMEX estimation can be used to correct for measurement error. We implement the estimation procedure as described above where we first generate additional measurement error to obtain W_{b,i}(λ) and, for each iteration b and each λ value, obtain \hat{Δ}_{S,b}(λ) = n_0^{-1} ∑_{i: G_i = 0} \left\{ \frac{∑_{j: G_j = 1} K_h(W_{b,j}(λ) - W_{b,i}(λ))Y_j}{∑_{j:G_j = 1} K_h(W_{b,j}(λ)- W_{b,i}(λ))} \right\} - n_0^{-1}∑_{i=1}^{n} Y_i I(G_i = 0). We then calculate the average estimate over the iterations b=1,...,B for each λ value, denoted as \hat{Δ}_{S,σ_u^2(1+λ)} = B^{-1} ∑_{b=1}^B \hat{Δ}_{S,b}(λ), and extrapolate using a function G(Γ, λ); we specifically use the quadratic and nonlinear functions as in the parametric setting. We denote the resulting estimator of Δ_S as \hat{Δ}_{S,SIMEX} = G(\hat Γ, -1) and define \hat{R}_{S,SIMEX} = 1- \hat{Δ}_{S,SIMEX} / \hat Δ.
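The nonparametric SIMEX procedure can be sketched by combining kernel smoothing with the simulation and extrapolation steps. As before, this is an illustrative base R sketch on simulated data rather than the package's code: the Gaussian kernel, the bandwidth rule, the helper name delta_s, and the data-generating values are all assumptions, and only the quadratic extrapolant is shown.

```r
# Simulated data (assumed settings): Delta = 3, Delta_S = 1, so R_S = 2/3.
set.seed(5)
n1 <- 500; n0 <- 500
sigma2_u <- 0.5
s1 <- rnorm(n1, mean = 2); s0 <- rnorm(n0, mean = 1)
y1 <- 3 + 2 * s1 + rnorm(n1)
y0 <- 2 + 2 * s0 + rnorm(n0)
w1 <- s1 + rnorm(n1, sd = sqrt(sigma2_u))   # observed surrogate, treated
w0 <- s0 + rnorm(n0, sd = sqrt(sigma2_u))   # observed surrogate, control

# Hypothetical helper: kernel estimate of the residual treatment effect.
delta_s <- function(w1, w0) {
  h <- bw.nrd(w1) * length(w1)^(-0.1)       # assumed bandwidth rule
  K <- dnorm(outer(w0, w1, "-") / h)        # n0 x n1 matrix of kernel weights
  mean(drop(K %*% y1) / rowSums(K)) - mean(y0)
}

lambdas <- seq(0, 2, by = 0.25)
B <- 50

# Simulation step: inflate the error in both groups and average over replicates.
avg <- sapply(lambdas, function(lam) {
  mean(replicate(B, delta_s(w1 + sqrt(lam * sigma2_u) * rnorm(n1),
                            w0 + sqrt(lam * sigma2_u) * rnorm(n0))))
})

# Extrapolation step: quadratic in lambda, evaluated at lambda = -1.
d <- data.frame(lambda = lambdas, est = avg)
quad <- lm(est ~ lambda + I(lambda^2), data = d)
delta_s_simex <- unname(predict(quad, newdata = data.frame(lambda = -1)))

delta_s_naive <- delta_s(w1, w0)
delta_hat <- mean(y1) - mean(y0)
c(R_naive = 1 - delta_s_naive / delta_hat, R_simex = 1 - delta_s_simex / delta_hat)
```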

In this function, parametric estimation is equivalent to Freedman's approach in the R.s.estimate documentation; nonparametric estimation is equivalent to the robust approach in the R.s.estimate documentation. Variance estimates for all estimators are calculated in this function based on derived closed form variance expressions. For all approaches, confidence intervals for Δ_S can be constructed using a normal approximation; confidence intervals for R_S can be constructed using either a normal approximation or using Fieller's method, all of which are provided in this function. Details regarding the asymptotic properties of these estimators and closed form variance calculation can be found in: Parast, L., Garcia, TP, Prentice, RL, Carroll, RJ (2021). Robust Methods to Correct for Measurement Error when Evaluating a Surrogate Marker. Biometrics, In press.
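To show how the two kinds of interval for R_S differ, the sketch below computes a normal-approximation interval (via the delta method) and a generic Fieller interval for R = 1 - a/b. The inputs are made-up numbers standing in for an estimate a of β_1^* (or Δ_S), an estimate b of β_1 (or Δ), and their variances and covariance. The package derives these quantities from its own closed form expressions, and its Fieller implementation may differ in detail from this textbook version.

```r
# Hypothetical inputs (made up for illustration).
a <- 1;  b <- 4
var_a <- 0.04; var_b <- 0.09; cov_ab <- 0.01
z <- qnorm(0.975)

R_hat <- 1 - a / b

# Normal approximation: delta-method variance of the ratio a / b.
var_ratio <- var_a / b^2 - 2 * a * cov_ab / b^3 + a^2 * var_b / b^4
ci_normal <- R_hat + c(-1, 1) * z * sqrt(var_ratio)

# Fieller: all theta with (a - theta * b)^2 <= z^2 * Var(a - theta * b),
# i.e. theta between the roots of A * theta^2 - 2 * Bc * theta + C = 0.
A  <- b^2 - z^2 * var_b
Bc <- a * b - z^2 * cov_ab
C  <- a^2 - z^2 * var_a
disc <- Bc^2 - A * C
stopifnot(A > 0, disc > 0)   # otherwise the Fieller set is not a bounded interval
theta <- (Bc + c(-1, 1) * sqrt(disc)) / A
ci_fieller <- sort(1 - theta)

rbind(normal = ci_normal, fieller = ci_fieller)
```

The Fieller interval is generally asymmetric around the point estimate, and the two intervals tend to agree closely when the denominator b is estimated precisely relative to its size.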

Value

A list is returned:

R.naive

the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE

R.naive.var

the estimated variance of the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE

R.naive.CI.normal

the 95% confidence interval using the normal approximation for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE

R.naive.CI.fieller

the 95% confidence interval using Fieller's approach for the naive estimate of the proportion of treatment effect explained by the surrogate marker; only if naive = TRUE

B1star.naive

the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE

B1star.naive.var

the estimated variance of the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE

B1star.naive.CI.normal

the 95% confidence interval using the normal approximation for the naive estimate of the adjusted regression coefficient for treatment; only if naive = TRUE and Ronly = FALSE and parametric = TRUE

deltas.naive

the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE

deltas.naive.var

the estimated variance of the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE

deltas.naive.CI.normal

the 95% confidence interval using the normal approximation for the naive estimate of the residual treatment effect; only if naive = TRUE and Ronly = FALSE and parametric = FALSE

R.corrected.dis

the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d"

R.corrected.var.dis

the estimated variance of the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator = "d"

R.corrected.CI.normal.dis

the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d"

R.corrected.CI.fieller.dis

the 95% confidence interval using Fieller's approach for the corrected disattenuated estimate of the proportion of treatment effect explained by the surrogate marker; only if parametric = TRUE and estimator ="d"

B1star.corrected.dis

the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE

B1star.corrected.var.dis

the estimated variance of the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE

B1star.corrected.CI.normal.dis

the 95% confidence interval using the normal approximation for the corrected disattenuated estimate of the adjusted regression coefficient for treatment; only if parametric = TRUE and estimator = "d" and Ronly = FALSE

R.corrected.q

the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q"

R.corrected.var.q

the estimated variance of the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q"

R.corrected.CI.normal.q

the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q"

R.corrected.CI.fieller.q

the 95% confidence interval using Fieller's approach for the corrected SIMEX (quadratic) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "q"

B1star.corrected.q

the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE

B1star.corrected.var.q

the estimated variance of the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE

B1star.corrected.CI.normal.q

the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the adjusted regression coefficient for treatment; only if estimator = "q" and Ronly = FALSE and parametric = TRUE

deltas.corrected.q

the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE

deltas.corrected.var.q

the estimated variance of the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE

deltas.corrected.CI.normal.q

the 95% confidence interval using the normal approximation for the corrected SIMEX (quadratic) estimate of the residual treatment effect; only if estimator = "q" and Ronly = FALSE and parametric = FALSE

R.corrected.nl

the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n"

R.corrected.var.nl

the estimated variance of the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n"

R.corrected.CI.normal.nl

the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n"

R.corrected.CI.fieller.nl

the 95% confidence interval using Fieller's approach for the corrected SIMEX (nonlinear) estimate of the proportion of treatment effect explained by the surrogate marker; only if estimator = "n"

B1star.corrected.nl

the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE

B1star.corrected.var.nl

the estimated variance of the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE

B1star.corrected.CI.normal.nl

the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the adjusted regression coefficient for treatment; only if estimator = "n" and Ronly = FALSE and parametric = TRUE

deltas.corrected.nl

the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE

deltas.corrected.var.nl

the estimated variance of the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE

deltas.corrected.CI.normal.nl

the 95% confidence interval using the normal approximation for the corrected SIMEX (nonlinear) estimate of the residual treatment effect; only if estimator = "n" and Ronly = FALSE and parametric = FALSE

Author(s)

Layla Parast

References

Parast, L., Garcia, TP, Prentice, RL, Carroll, RJ (2021). Robust Methods to Correct for Measurement Error when Evaluating a Surrogate Marker. Biometrics, In press.

Examples

data(d_example_me)
names(d_example_me)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1, 
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = 0.5, 
naive= TRUE, Ronly = FALSE)
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1, 
szero=d_example_me$s0, parametric = TRUE, estimator = "q", me.variance = 0.5, 
naive= FALSE, Ronly = TRUE)

#estimating measurement error variance with replicates
replicates = rbind(cbind(d_example_me$s1_rep1, d_example_me$s1_rep2, 
d_example_me$s1_rep3), cbind(d_example_me$s0_rep1, d_example_me$s0_rep2, 
d_example_me$s0_rep3))
mean.i = apply(replicates,1,mean, na.rm = TRUE)
num.i = apply(replicates,1,function(x) sum(!is.na(x)))
var.u = sum((replicates-mean.i)^2, na.rm = TRUE)/sum(num.i)
var.u
R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, sone=d_example_me$s1, 
szero=d_example_me$s0, parametric = TRUE, estimator = "d", me.variance = var.u, 
naive= TRUE, Ronly = FALSE)

R.s.estimate.me(yone=d_example_me$y1, yzero=d_example_me$y0, 
sone=d_example_me$s1, szero=d_example_me$s0, parametric = FALSE, estimator = "q", 
me.variance = 0.5, naive= FALSE, Ronly = TRUE)

Rsurrogate documentation built on Nov. 14, 2021, 9:07 a.m.