View source: R/FixedBinBinIT.r
FixedBinBinIT | R Documentation |
The function FixedBinBinIT uses the information-theoretic approach (Alonso & Molenberghs, 2007) to estimate trial- and individual-level surrogacy based on fixed-effect models when both S and T are binary variables. The user can specify whether a (weighted or unweighted) full, semi-reduced, or reduced model should be fitted. See the Details section below.
FixedBinBinIT(Dataset, Surr, True, Treat, Trial.ID, Pat.ID,
Model=c("Full"), Weighted=TRUE, Min.Trial.Size=2, Alpha=.05,
Number.Bootstraps=50, Seed=sample(1:1000, size=1))
Dataset |
A |
Surr |
The name of the variable in |
True |
The name of the variable in |
Treat |
The name of the variable in |
Trial.ID |
The name of the variable in |
Pat.ID |
The name of the variable in |
Model |
The type of model that should be fitted, i.e., |
Weighted |
Logical. In practice it is often the case that different trials (or other clustering units) have different sample sizes. Univariate models are used to assess surrogacy in the information-theoretic approach, so it can be useful to adjust for heterogeneity in information content between the trial-specific contributions (particularly when trial-level surrogacy measures are of primary interest and when the heterogeneity in sample sizes is large). If |
Min.Trial.Size |
The minimum number of patients that a trial should contain to be included in the analysis. If the number of patients in a trial is smaller than the value specified by |
Alpha |
The |
Number.Bootstraps |
The standard errors and confidence intervals for |
Seed |
The seed to be used in the bootstrap procedure. Default |
Individual-level surrogacy
The following univariate generalised linear models are fitted:
g_{T}(E(T_{ij}))=\mu_{Ti}+\beta_{i}Z_{ij},
g_{T}(E(T_{ij}|S_{ij}))=\gamma_{0i}+\gamma_{1i}Z_{ij}+\gamma_{2i}S_{ij},
where i and j are the trial and subject indicators, g_{T} is an appropriate link function (i.e., a logit link when binary endpoints are considered), S_{ij} and T_{ij} are the surrogate and true endpoint values of subject j in trial i, and Z_{ij} is the treatment indicator for subject j in trial i. \mu_{Ti} and \beta_{i} are the trial-specific intercepts and treatment effects on the true endpoint in trial i. \gamma_{0i} and \gamma_{1i} are the trial-specific intercepts and treatment effects on the true endpoint in trial i after accounting for the effect of the surrogate endpoint, and \gamma_{2i} is the trial-specific effect of the surrogate endpoint on the true endpoint.
The -2 log likelihood values of the previous models in each of the i trials (i.e., L_{1i} and L_{2i}, respectively) are subsequently used to compute individual-level surrogacy based on the so-called Variance Reduction Factor (VRF; for details, see Alonso & Molenberghs, 2007):
R^2_{h}= 1 - \frac{1}{N} \sum_{i} \exp \left(-\frac{L_{2i}-L_{1i}}{n_{i}} \right),
where N is the number of trials and n_{i} is the number of patients within trial i.
When (i) it can be assumed that the treatment-corrected association between the surrogate and the true endpoint is constant across trials, or (ii) all data come from a single clinical trial (i.e., when N=1), the previous expression simplifies to:
R^2_{h.ind}= 1 - \exp \left(-\frac{L_{2}-L_{1}}{N} \right).
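The two individual-level measures above can be sketched in base R on simulated data. This is only an illustration, not the package's internal code: the column names (Trial, Treat, Surr, True) and the helper lr_stat are hypothetical. The sketch writes the exponent in terms of the non-negative likelihood-ratio statistic G2 = 2 (log likelihood with S minus log likelihood without S), which is an assumed sign convention chosen so that the measures fall between 0 and 1, and it divides the pooled statistic by the total number of patients.

```r
# Illustrative sketch on simulated data (base R only); names are hypothetical.
set.seed(1)
n_trials <- 5
n_per    <- 200
dat <- do.call(rbind, lapply(seq_len(n_trials), function(i) {
  Treat <- rbinom(n_per, 1, 0.5)                               # treatment indicator Z
  Surr  <- rbinom(n_per, 1, plogis(-0.3 + 0.8 * Treat))        # binary surrogate S
  True  <- rbinom(n_per, 1, plogis(-0.5 + 0.4 * Treat + 1.5 * Surr))  # binary true endpoint T
  data.frame(Trial = i, Treat = Treat, Surr = Surr, True = True)
}))
dat$Trial <- factor(dat$Trial)

# Likelihood-ratio statistic for adding S to a logistic model for T
lr_stat <- function(without_S, with_S, data) {
  m1 <- glm(without_S, family = binomial, data = data)
  m2 <- glm(with_S,    family = binomial, data = data)
  2 * (as.numeric(logLik(m2)) - as.numeric(logLik(m1)))
}

# R2h: fit both models separately in every trial, then average over trials
by_trial <- split(dat, dat$Trial)
G2_i <- sapply(by_trial, function(d) lr_stat(True ~ Treat, True ~ Treat + Surr, d))
n_i  <- sapply(by_trial, nrow)
R2h_by_trial <- 1 - exp(-G2_i / n_i)
R2h <- mean(R2h_by_trial)

# R2h.ind: trial-specific intercepts and treatment effects, one common S effect
G2 <- lr_stat(True ~ 0 + Trial + Trial:Treat,
              True ~ 0 + Trial + Trial:Treat + Surr, dat)
R2h_ind <- 1 - exp(-G2 / nrow(dat))

print(round(c(R2h = R2h, R2h.ind = R2h_ind), 3))
```

Because T is binary, every value produced this way stays below 0.75.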
The upper bound of R^2_{h.ind} does not reach 1 when T is binary, i.e., its maximum is 0.75. Kent (1983) claims that 0.75 is a reasonable upper bound and thus R^2_{h.ind} can usually be interpreted without paying special consideration to the discreteness of T. Alternatively, to address the upper bound problem, a scaled version of the mutual information can be used when both S and T are binary (Joe, 1989):
R^2_{b.ind}= \frac{I(T,S)}{\min[H(T), H(S)]},
where the entropy of T and S in the previous expression can be estimated using the log likelihood functions of the GLMs shown above.
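The upper bound of 0.75 and the effect of the scaling can be illustrated with a small base R sketch. It uses plug-in estimates of entropy and mutual information from the observed 2 x 2 table and ignores treatment and trial, which is a simplification assumed here (the function itself is described as using the GLM log likelihoods). It also assumes the relation R^2_h = 1 - exp(-2 I(T,S)) between the measure and the mutual information. The helper names entropy, mutual_info, and R2b are hypothetical.

```r
# Plug-in entropy of a discrete variable (natural logarithm)
entropy <- function(x) {
  p <- as.numeric(table(x)) / length(x)
  -sum(p * log(p))
}

# Plug-in mutual information from the joint table; empty cells contribute 0
mutual_info <- function(s, tr) {
  p  <- unclass(table(s, tr)) / length(s)
  pe <- outer(rowSums(p), colSums(p))   # product of the marginals
  sum((p * log(p / pe))[p > 0])
}

# Scaled mutual information
R2b <- function(s, tr) mutual_info(s, tr) / min(entropy(s), entropy(tr))

# Best case: balanced binary T that is perfectly predicted by S
S  <- rep(c(0, 1), each = 50)
Tr <- S
R2h_max <- 1 - exp(-2 * mutual_info(S, Tr))  # about 0.75, since I(T,S) = log(2)
R2b_max <- R2b(S, Tr)                        # about 1

# Imperfect association: T agrees with S for roughly 80% of the subjects
set.seed(4)
Tr2 <- ifelse(runif(100) < 0.8, S, 1 - S)
R2b_partial <- R2b(S, Tr2)
```

In the perfect-association case the unscaled measure stops at 0.75 while the scaled version reaches 1, which is the motivation for the rescaling.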
Trial-level surrogacy
When a full or semi-reduced model is requested (by using the argument Model=c("Full") or Model=c("SemiReduced") in the function call), trial-level surrogacy is assessed by fitting the following univariate models:
S_{ij}=\mu_{Si}+\alpha_{i}Z_{ij}+\varepsilon_{Sij}, (1)
T_{ij}=\mu_{Ti}+\beta_{i}Z_{ij}+\varepsilon_{Tij}, (1)
where i and j are the trial and subject indicators, S_{ij} and T_{ij} are the surrogate and true endpoint values of subject j in trial i, Z_{ij} is the treatment indicator for subject j in trial i, \mu_{Si} and \mu_{Ti} are the fixed trial-specific intercepts for S and T, and \alpha_{i} and \beta_{i} are the fixed trial-specific treatment effects on S and T, respectively. The error terms \varepsilon_{Sij} and \varepsilon_{Tij} are assumed to be independent.
When a reduced model is requested by the user (by using the argument Model=c("Reduced") in the function call), the following univariate models are fitted:
S_{ij}=\mu_{S}+\alpha_{i}Z_{ij}+\varepsilon_{Sij}, (2)
T_{ij}=\mu_{T}+\beta_{i}Z_{ij}+\varepsilon_{Tij}, (2)
where \mu_{S} and \mu_{T} are the common intercepts for S and T. The other parameters are the same as defined above, and \varepsilon_{Sij} and \varepsilon_{Tij} are again assumed to be independent.
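The difference between the first-stage models (1) and (2) can be sketched in base R on simulated data. The sketch assumes a logistic link because S and T are binary (equations (1) and (2) are written on the linear scale), and the column names Trial, Treat, Surr, and True are placeholders rather than the package's internals. Only the surrogate models are shown; the models for T have the same structure.

```r
# Illustrative sketch on simulated data (base R only); names are hypothetical.
set.seed(2)
n_trials <- 4
n_per    <- 150
dat <- do.call(rbind, lapply(seq_len(n_trials), function(i) {
  Treat <- rbinom(n_per, 1, 0.5)
  Surr  <- rbinom(n_per, 1, plogis(-0.2 + 0.7 * Treat))
  True  <- rbinom(n_per, 1, plogis(-0.4 + 0.5 * Treat))
  data.frame(Trial = i, Treat = Treat, Surr = Surr, True = True)
}))
dat$Trial <- factor(dat$Trial)

# Models (1): trial-specific intercepts and trial-specific treatment effects
full_S <- glm(Surr ~ 0 + Trial + Trial:Treat, family = binomial, data = dat)

# Models (2): one common intercept, trial-specific treatment effects
red_S <- glm(Surr ~ 1 + Trial:Treat, family = binomial, data = dat)

# Pull out the trial-specific estimates used in the second stage
cf           <- coef(full_S)
mu_S_hat     <- cf[paste0("Trial", levels(dat$Trial))]   # intercepts from (1)
alpha_hat    <- cf[grepl(":Treat$", names(cf))]          # treatment effects from (1)
cr           <- coef(red_S)
alpha_hat_red <- cr[grepl(":Treat$", names(cr))]         # treatment effects from (2)
```

Fitting one model with trial-specific intercepts and slopes gives the same point estimates as fitting a separate model in every trial, which keeps the sketch short.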
When a full model is requested (by using the argument Model=c("Full") in the function call, i.e., when models (1) are fitted), the following model is subsequently fitted:
\widehat{\beta}_{i}=\lambda_{0}+\lambda_{1}\widehat{\mu}_{Si}+\lambda_{2}\widehat{\alpha}_{i}+\varepsilon_{i}, (3)
where the parameter estimates for \beta_i, \mu_{Si}, and \alpha_i are based on models (1) (see above). When a weighted model is requested (using the argument Weighted=TRUE in the function call), model (3) is a weighted regression model (with weights based on the number of observations in trial i). The -2 log likelihood value of the (weighted or unweighted) model (3) (L_1) is subsequently compared to the -2 log likelihood value of an intercept-only model (\widehat{\beta}_{i}=\lambda_{3}; L_0), and R^2_{ht} is computed based on the Variance Reduction Factor (for details, see Alonso & Molenberghs, 2007):
R^2_{ht}= 1 - \exp \left(-\frac{L_1-L_0}{N} \right),
where N is the number of trials.
When a semi-reduced or reduced model is requested (by using the argument Model=c("SemiReduced") or Model=c("Reduced") in the function call), the following model is fitted:
\widehat{\beta}_{i}=\lambda_{0}+\lambda_{1}\widehat{\alpha}_{i}+\varepsilon_{i},
where the parameter estimates for \beta_i and \alpha_i are based on models (1) when a semi-reduced model is fitted or on models (2) when a reduced model is fitted. The -2 log likelihood value of this (weighted or unweighted) model (L_1) is subsequently compared to the -2 log likelihood value of an intercept-only model (\widehat{\beta}_{i}=\lambda_{3}; L_0), and R^2_{ht} is computed based on the reduction in the likelihood (as described above).
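The second-stage regressions described above can be sketched in base R. The trial-level estimates below are simulated placeholders rather than output of the function, the helper r2ht is hypothetical, and using the trial sizes n_i directly as weights is an assumption (the text only says the weights are based on the number of observations per trial). The exponent is written with the non-negative likelihood-ratio statistic of the regression against the intercept-only model, which is an assumed sign convention.

```r
# Simulated stand-ins for the first-stage estimates of N trials (hypothetical values)
set.seed(3)
N <- 30
est <- data.frame(
  n_i       = sample(50:300, N, replace = TRUE),  # patients per trial
  mu_S_hat  = rnorm(N),
  alpha_hat = rnorm(N, mean = 1, sd = 0.5)
)
est$beta_hat <- 0.2 + 0.3 * est$mu_S_hat + 0.9 * est$alpha_hat + rnorm(N, sd = 0.3)

# Trial-level R2ht from a (weighted or unweighted) second-stage regression
r2ht <- function(formula, data, weighted = TRUE) {
  data$w <- if (weighted) data$n_i else rep(1, nrow(data))
  fit1 <- lm(formula, data = data, weights = w)                  # model of interest
  fit0 <- lm(update(formula, . ~ 1), data = data, weights = w)   # intercept-only model
  G2 <- 2 * (as.numeric(logLik(fit1)) - as.numeric(logLik(fit0)))
  1 - exp(-G2 / nrow(data))
}

R2ht_full <- r2ht(beta_hat ~ mu_S_hat + alpha_hat, est)  # full model, as in model (3)
R2ht_semi <- r2ht(beta_hat ~ alpha_hat, est)             # semi-reduced or reduced model
R2ht_unw  <- r2ht(beta_hat ~ alpha_hat, est, weighted = FALSE)
```

For a normal linear second-stage model this quantity coincides with the ordinary (weighted) coefficient of determination, which offers a convenient sanity check on the computation.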
An object of class FixedBinBinIT with components,
Data.Analyze |
Prior to conducting the surrogacy analysis, data of patients who have a missing value for the surrogate and/or the true endpoint are excluded. In addition, the data of trials (i) in which only one type of the treatment was administered, and (ii) in which either the surrogate or the true endpoint was a constant (i.e., all patients within a trial had the same surrogate and/or true endpoint value) are excluded. In addition, the user can specify the minimum number of patients that a trial should contain in order to include the trial in the analysis. If the number of patients in a trial is smaller than the value specified by |
Obs.Per.Trial |
A |
Trial.Spec.Results |
A |
R2ht |
A |
R2h.ind |
A |
R2h |
A |
R2b.ind |
A |
R2h.Ind.By.Trial |
A |
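The exclusion rules described for Data.Analyze can be sketched in base R as follows. The helper clean_trials and the toy data are hypothetical, and the sketch only mirrors the rules as stated in the text (missing endpoints, a single treatment arm, a constant endpoint, and the minimum trial size); it is not the package's own code.

```r
# Hypothetical helper that applies the stated exclusion rules
clean_trials <- function(d, min_trial_size = 2) {
  # Drop patients with a missing surrogate and/or true endpoint
  d <- d[!is.na(d$Surr) & !is.na(d$True), ]
  # Decide per trial whether it can be analysed
  keep <- sapply(split(d, d$Trial.ID), function(x) {
    nrow(x) >= min_trial_size &&          # enough patients
      length(unique(x$Treat)) > 1 &&      # both treatments administered
      length(unique(x$Surr)) > 1 &&       # surrogate is not constant
      length(unique(x$True)) > 1          # true endpoint is not constant
  })
  d[d$Trial.ID %in% names(keep)[keep], ]
}

toy <- data.frame(
  Trial.ID = c(1, 1, 1, 1,  2, 2, 2,  3, 3, 3, 3,  4),
  Treat    = c(0, 1, 0, 1,  1, 1, 1,  0, 1, 0, 1,  0),
  Surr     = c(0, 1, 1, 0,  0, 1, 0,  0, 1, 1, 0,  NA),
  True     = c(0, 1, 0, 1,  1, 0, 1,  1, 1, 1, 1,  1)
)
cleaned <- clean_trials(toy)
# Trial 2 has one treatment only, trial 3 has a constant true endpoint,
# and trial 4 has no complete observation, so only trial 1 remains.
```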
Wim Van der Elst, Ariel Alonso, & Geert Molenberghs
Alonso, A., & Molenberghs, G. (2007). Surrogate marker evaluation from an information theory perspective. Biometrics, 63, 180-186.
Joe, H. (1989). Relative entropy measures of multivariate dependence. Journal of the American Statistical Association, 84, 157-164.
Kent, J. T. (1983). Information gain and a general measure of correlation. Biometrika, 70, 163-173.
FixedBinContIT, FixedContBinIT, plot Information-Theoretic BinCombn
## Not run: # Time consuming (>5sec) code part
# Generate data with continuous Surr and True
Sim.Data.MTS(N.Total=5000, N.Trial=50, R.Trial.Target=.9, R.Indiv.Target=.9,
Fixed.Effects=c(0, 0, 0, 0), D.aa=10, D.bb=10, Seed=1,
Model=c("Full"))
# Dichotomize Surr and True
Surr_Bin <- Data.Observed.MTS$Surr
Surr_Bin[Data.Observed.MTS$Surr>.5] <- 1
Surr_Bin[Data.Observed.MTS$Surr<=.5] <- 0
True_Bin <- Data.Observed.MTS$True
True_Bin[Data.Observed.MTS$True>.15] <- 1
True_Bin[Data.Observed.MTS$True<=.15] <- 0
Data.Observed.MTS$Surr <- Surr_Bin
Data.Observed.MTS$True <- True_Bin
# Assess surrogacy using info-theoretic framework
Fit <- FixedBinBinIT(Dataset = Data.Observed.MTS, Surr = Surr,
True = True, Treat = Treat, Trial.ID = Trial.ID,
Pat.ID = Pat.ID, Number.Bootstraps=100)
# Examine results
summary(Fit)
plot(Fit, Trial.Level = FALSE, Indiv.Level.By.Trial=TRUE)
plot(Fit, Trial.Level = TRUE, Indiv.Level.By.Trial=FALSE)
## End(Not run)