ICABinBin: Assess surrogacy in the causal-inference single-trial setting...

Description


The function ICA.BinBin quantifies surrogacy in the single-trial causal-inference framework (individual causal association and causal concordance) when both the surrogate and the true endpoints are binary outcomes. See Details below.


Usage

ICA.BinBin(pi1_1_, pi1_0_, pi_1_1, pi_1_0, pi0_1_, pi_0_1,
           Monotonicity=c("General"), Sum_Pi_f = seq(from=0.01, to=0.99, by=.01),
           M=10000, Volume.Perc=0, Seed=sample(1:100000, size=1))



Arguments

pi1_1_: A scalar or vector that contains values for P(T=1,S=1|Z=0), i.e., the probability that S=T=1 under treatment Z=0. A vector is specified to account for uncertainty, i.e., rather than keeping P(T=1,S=1|Z=0) fixed at one estimated value, a distribution can be specified (see Examples below) from which a value is drawn in each run.


pi1_0_: A scalar or vector that contains values for P(T=1,S=0|Z=0).


pi_1_1: A scalar or vector that contains values for P(T=1,S=1|Z=1).


pi_1_0: A scalar or vector that contains values for P(T=1,S=0|Z=1).


pi0_1_: A scalar or vector that contains values for P(T=0,S=1|Z=0).


pi_0_1: A scalar or vector that contains values for P(T=0,S=1|Z=1).


Monotonicity: Specifies which assumptions regarding monotonicity should be made: Monotonicity=c("General"), Monotonicity=c("No"), Monotonicity=c("True.Endp"), Monotonicity=c("Surr.Endp"), or Monotonicity=c("Surr.True.Endp"). See Details below. Default Monotonicity=c("General").


Sum_Pi_f: A scalar or vector that specifies the grid of values G = {g_1, g_2, ..., g_k} to be considered when the sensitivity analysis is conducted. See Details below. Default Sum_Pi_f = seq(from=0.01, to=0.99, by=.01).


M: The number of runs that are conducted for a given value of Sum_Pi_f. This argument is only used when Volume.Perc=0 (the default); when Volume.Perc is not 0, the number of runs is determined by the 'volume' of the parameter space (see Volume.Perc). Default M=10000.


Volume.Perc: Note that the marginals that are observable in the data impose a number of restrictions on the unidentified probabilities. For example, under monotonicity for S and T, it holds that π_{0111} <= min(π_{0·1·}, π_{·1·1}) and π_{1100} <= min(π_{1·0·}, π_{·1·0}). Thus, when min(π_{0·1·}, π_{·1·1})=0.10 and min(π_{1·0·}, π_{·1·0})=0.08, all valid π_{0111} <= 0.10 and all valid π_{1100} <= 0.08. The argument Volume.Perc specifies the fraction of the 'volume' of the parameter space that is explored. This volume is computed based on the grids G = 0, 0.01, ..., up to the maximum possible value for the counterfactual probability at hand. E.g., in the previous example, the grids contain 11 and 9 values, so the 'volume' of the parameter space is 11*9=99, and when Volume.Perc=1 is used, a total of 99 runs will be conducted for each given value of Sum_Pi_f. Notice that when monotonicity is not assumed, relatively high values of Volume.Perc will lead to a large number of runs and consequently a long analysis time. Default Volume.Perc=0.


Seed: The seed to be used to generate π_r. Default Seed=sample(1:100000, size=1).
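
The 'volume' arithmetic described for Volume.Perc can be reproduced in a few lines. The following is only a sketch under the assumptions stated in the comments: the helper n_grid and all variable names are made up for this illustration and are not part of the package, and the way a fraction other than 1 is rounded is not documented here.

```r
# Grid step used for the counterfactual probabilities (0, 0.01, ...).
step <- 0.01

# Upper bounds taken from the example in the text
# (monotonicity for both S and T).
max_pi_0111 <- 0.10
max_pi_1100 <- 0.08

# Number of grid points from 0 up to and including the bound.
# The small tolerance guards against floating-point error in the division.
n_grid <- function(upper, step) floor(upper / step + 1e-8) + 1

# 'Volume' = product of the grid sizes.
volume <- n_grid(max_pi_0111, step) * n_grid(max_pi_1100, step)
volume

# ASSUMPTION: the number of runs per value of Sum_Pi_f is the volume
# multiplied by Volume.Perc (the text only states the Volume.Perc=1 case).
volume_perc <- 1
runs <- volume * volume_perc
runs
```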


Details

In the continuous normal setting, surrogacy can be assessed by studying the association between the individual causal effects on S and T (see ICA.ContCont). In that setting, the Pearson correlation is the obvious measure of association.

When S and T are binary endpoints, multiple alternatives exist. Alonso et al. (2014) proposed the individual causal association (ICA; R_{H}^{2}), which captures the association between the individual causal effects of the treatment on S (Δ_S) and T (Δ_T) using information-theoretic principles.

The function ICA.BinBin computes R_{H}^{2} based on plausible values of the potential outcomes. Denote by Y' = (T_0, T_1, S_0, S_1) the vector of potential outcomes. The vector Y can take 16 values, and the set of parameters π_{ijpq} = P(T_0=i, T_1=j, S_0=p, S_1=q) (with i, j, p, q each equal to 0 or 1) fully characterizes its distribution.
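
To make the link between a vector π and R_{H}^{2} concrete, here is a minimal sketch that enumerates the 16 potential-outcome patterns and computes an information-theoretic association between Δ_T = T_1 - T_0 and Δ_S = S_1 - S_0. It is an illustration, not the package's code. R_{H}^{2} is assumed here to be the mutual information between Δ_T and Δ_S divided by the smaller of their two entropies; the text above does not spell the formula out, so consult Alonso et al. before relying on it. The names outcomes, entropy and r2h are hypothetical.

```r
# All 16 potential-outcome patterns (T0, T1, S0, S1).
outcomes <- expand.grid(T0 = 0:1, T1 = 0:1, S0 = 0:1, S1 = 0:1)

# Individual causal effects for each pattern (values in -1, 0, 1).
delta_T <- outcomes$T1 - outcomes$T0
delta_S <- outcomes$S1 - outcomes$S0

# Shannon entropy of a probability vector (0 * log 0 is taken as 0).
entropy <- function(p) {
  p <- p[p > 0]
  -sum(p * log(p))
}

# Association for one probability vector `pi_vec` (length 16, in the
# same row order as `outcomes`).
# ASSUMPTION: mutual information divided by the smaller entropy.
# The result is undefined (NaN) when one of the entropies is zero.
r2h <- function(pi_vec) {
  joint <- tapply(pi_vec, list(delta_T, delta_S), sum)  # 3 x 3 table
  h_T <- entropy(rowSums(joint))
  h_S <- entropy(colSums(joint))
  mutual_info <- h_T + h_S - entropy(as.vector(joint))
  mutual_info / min(h_T, h_S)
}

# Two extreme cases: all patterns equally likely (effects independent),
# and S always equal to T in both arms (effects identical).
pi_uniform <- rep(1 / 16, 16)
pi_perfect <- with(outcomes, as.numeric(T0 == S0 & T1 == S1)) / 4

r2h(pi_uniform)
r2h(pi_perfect)
```

Under the assumed definition, the first call should be numerically zero and the second numerically one.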

However, the parameters π_{ijpq} are not all functionally independent, e.g., π_{····}=1. When no assumptions regarding monotonicity are made, the data impose a total of 7 restrictions, and thus only 9 probabilities in π_{ijpq} are allowed to vary freely (for details, see Alonso et al., 2014). Based on the data and assuming SUTVA, the marginal probabilities π_{1·1·}, π_{1·0·}, π_{·1·1}, π_{·1·0}, π_{0·1·}, and π_{·0·1} can be computed (by hand or using the function MarginalProbs). Define the vector

b' = (1, π_{1·1·}, π_{1·0·}, π_{·1·1}, π_{·1·0}, π_{0·1·}, π_{·0·1})

and let A be a contrast matrix such that the identified restrictions can be written as a system of linear equations

A π = b.

The matrix A has rank 7 and can be partitioned as A = (A_r | A_f), and similarly the vector π can be partitioned as π' = (π_r' | π_f') (where f refers to the submatrix/vector given by the last 9 columns/components of A/π). Using these partitions, the previous system of linear equations can be rewritten as

A_r π_r + A_f π_f = b.
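
The restrictions can be written out explicitly. The sketch below builds one possible version of A as a 0/1 indicator matrix and checks that it has rank 7. The ordering of the 16 cells is an assumption made for illustration and need not match the ordering used inside the package; the marginals in b are the ones used in the first example of this page. As a sanity check, a hypothetical π in which the outcomes under Z=0 are independent of those under Z=1 is constructed and verified to satisfy A π = b.

```r
# Enumerate the 16 cells pi_{ijpq} = P(T0 = i, T1 = j, S0 = p, S1 = q).
# ASSUMPTION: this cell order is illustrative, not the package's order.
cells <- expand.grid(T0 = 0:1, T1 = 0:1, S0 = 0:1, S1 = 0:1)

# Each row of A flags the cells that add up to one entry of b.
A <- with(cells, rbind(
  total  = rep(1, 16),                      # sum of all cells = 1
  pi1_1_ = as.numeric(T0 == 1 & S0 == 1),   # P(T=1,S=1|Z=0)
  pi1_0_ = as.numeric(T0 == 1 & S0 == 0),   # P(T=1,S=0|Z=0)
  pi_1_1 = as.numeric(T1 == 1 & S1 == 1),   # P(T=1,S=1|Z=1)
  pi_1_0 = as.numeric(T1 == 1 & S1 == 0),   # P(T=1,S=0|Z=1)
  pi0_1_ = as.numeric(T0 == 0 & S0 == 1),   # P(T=0,S=1|Z=0)
  pi_0_1 = as.numeric(T1 == 0 & S1 == 1)    # P(T=0,S=1|Z=1)
))

# Right-hand side, using the marginals of the first example.
b <- c(1, 0.2619048, 0.2857143, 0.6372549, 0.07843137, 0.1349206, 0.127451)

qr(A)$rank  # rank of the restriction matrix

# One (hypothetical) member of the solution set: (T0, S0) independent
# of (T1, S1). The 2 x 2 tables are indexed as [T + 1, S + 1].
P0 <- matrix(0, 2, 2)
P0[2, 2] <- b[2]; P0[2, 1] <- b[3]; P0[1, 2] <- b[6]
P0[1, 1] <- 1 - sum(P0)
P1 <- matrix(0, 2, 2)
P1[2, 2] <- b[4]; P1[2, 1] <- b[5]; P1[1, 2] <- b[7]
P1[1, 1] <- 1 - sum(P1)

pi_indep <- P0[cbind(cells$T0 + 1, cells$S0 + 1)] *
            P1[cbind(cells$T1 + 1, cells$S1 + 1)]

# Largest violation of A pi = b (should be numerically zero).
max(abs(A %*% pi_indep - b))
```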

The following algorithm is used to generate plausible distributions for Y. First, select a value from the grid specified using Sum_Pi_f in the function call. For k=1 to M (specified using M in the function call), generate a vector π_f with 9 components that are sampled uniformly subject to the restriction that their sum equals the selected value of Sum_Pi_f (the function RandVec, which uses the randfixedsum algorithm written by Roger Stafford, is used to obtain these components). Next, π_r = A_r^{-1}(b - A_f π_f) is computed, and only the π_r vectors for which all components are in the [0, 1] range are retained. This procedure is repeated for each of the Sum_Pi_f values. Based on the retained vectors, R_H^2 is computed. The obtained values can be used to conduct a sensitivity analysis during the validation exercise.
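
The sampling steps above can be sketched in base R as follows. This is an illustration under several assumptions, not the package's implementation: the cell ordering and the choice of which 7 columns form A_r (taken from the pivoting of qr) are arbitrary, and the uniform draw of π_f uses normalised exponential variates (a flat Dirichlet scaled to the required sum) as a stand-in for RandVec. The marginals in b are those of the first example on this page, and the grid and M are kept small to keep the run short.

```r
set.seed(1)

# Restriction matrix A (rows: total, then the six observable marginals).
# ASSUMPTION: illustrative cell order, not the package's order.
cells <- expand.grid(T0 = 0:1, T1 = 0:1, S0 = 0:1, S1 = 0:1)
A <- with(cells, rbind(
  rep(1, 16),
  T0 == 1 & S0 == 1, T0 == 1 & S0 == 0,
  T1 == 1 & S1 == 1, T1 == 1 & S1 == 0,
  T0 == 0 & S0 == 1, T1 == 0 & S1 == 1
))
b <- c(1, 0.2619048, 0.2857143, 0.6372549, 0.07843137, 0.1349206, 0.127451)

# Pick 7 linearly independent columns for A_r; the other 9 form A_f.
idx_r <- sort(qr(A)$pivot[1:7])
idx_f <- setdiff(1:16, idx_r)
A_r <- A[, idx_r]
A_f <- A[, idx_f]

# One run: draw pi_f with the required sum, solve for pi_r, and return
# the full vector only if every component of pi_r lies in [0, 1].
sample_pi <- function(sum_pi_f) {
  x <- rexp(length(idx_f))
  pi_f <- sum_pi_f * x / sum(x)        # stand-in for RandVec
  pi_r <- solve(A_r, b - A_f %*% pi_f)
  if (!all(pi_r >= 0 & pi_r <= 1)) return(NULL)
  pi_full <- numeric(16)
  pi_full[idx_r] <- pi_r
  pi_full[idx_f] <- pi_f
  pi_full
}

# Repeat M times for each value in a (shortened) Sum_Pi_f grid.
M <- 500
grid <- c(0.25, 0.50, 0.75)
kept <- list()
for (s in grid) {
  for (k in seq_len(M)) {
    candidate <- sample_pi(s)
    if (!is.null(candidate)) kept[[length(kept) + 1]] <- candidate
  }
}

length(kept)  # number of valid pi vectors retained
```

Every retained vector satisfies A π = b by construction; the acceptance rate depends on the marginals and on the value of Sum_Pi_f.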

The previous developments hold when no monotonicity is assumed. When monotonicity for S, for T, or for both S and T is assumed, some of the probabilities in π are zero. For example, when monotonicity is assumed for T, then P(T_0 <= T_1)=1, or equivalently, π_{1000}=π_{1010}=π_{1001}=π_{1011}=0. When monotonicity is assumed, the procedure described above is modified accordingly (for details, see Alonso et al., 2014). When a general analysis is requested (using Monotonicity=c("General") in the function call), all settings are considered (no monotonicity, monotonicity for S alone, for T alone, and for both S and T).
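
The cells that are forced to zero under each monotonicity assumption can be listed by enumeration. In the sketch below, monotonicity for S is taken, by analogy with the definition given for T, to mean P(S_0 <= S_1)=1; that reading and the variable names are assumptions of this illustration.

```r
# The 16 cells, labelled "ijpq" for (T0 = i, T1 = j, S0 = p, S1 = q).
cells <- expand.grid(T0 = 0:1, T1 = 0:1, S0 = 0:1, S1 = 0:1)
cells$label <- with(cells, paste0(T0, T1, S0, S1))

# Monotonicity for T rules out T0 = 1, T1 = 0.
zero_T <- cells$label[cells$T0 > cells$T1]

# Monotonicity for S (ASSUMED analogous) rules out S0 = 1, S1 = 0.
zero_S <- cells$label[cells$S0 > cells$S1]

# Monotonicity for both S and T rules out the union of the two sets.
zero_ST <- union(zero_T, zero_S)

zero_T
zero_S
length(zero_ST)       # cells fixed at zero under both assumptions
16 - length(zero_ST)  # cells that can still be non-zero
```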

To account for the uncertainty in the estimation of the marginal probabilities, a vector of values can be specified from which a random draw is made in each run (see Examples below).
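
As a sketch of how the marginal probabilities can be computed by hand, and of one possible way to build vectors of plausible values, consider the code below. The counts are hypothetical: they were chosen so that the resulting proportions match the marginals used in the first example on this page, and they are not taken from any data set described here. The parametric bootstrap via rmultinom is just one option for expressing uncertainty (the second example on this page uses uniform distributions instead); it is not claimed to be what the package authors recommend.

```r
set.seed(1)

# HYPOTHETICAL cell counts per arm, ordered as
# (T=1,S=1), (T=1,S=0), (T=0,S=1), (T=0,S=0).
n_control <- c(33, 36, 17, 40)   # Z = 0
n_treated <- c(65, 8, 13, 16)    # Z = 1

# Point estimates of the observable marginals.
p_control <- n_control / sum(n_control)
p_treated <- n_treated / sum(n_treated)

estimates <- c(
  pi1_1_ = p_control[1], pi1_0_ = p_control[2], pi0_1_ = p_control[3],
  pi_1_1 = p_treated[1], pi_1_0 = p_treated[2], pi_0_1 = p_treated[3]
)
round(estimates, 7)

# One way to obtain vectors of plausible values: resample each arm's
# counts from a multinomial distribution (parametric bootstrap).
n_draws <- 1000
boot_control <- rmultinom(n_draws, sum(n_control), p_control) / sum(n_control)
boot_treated <- rmultinom(n_draws, sum(n_treated), p_treated) / sum(n_treated)

# Each row of the bootstrap matrices is a vector that could be passed
# to the corresponding argument, e.g. pi1_1_ = pi1_1_draws.
pi1_1_draws <- boot_control[1, ]
pi_1_1_draws <- boot_treated[1, ]
summary(pi1_1_draws)
```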


Value

An object of class ICA.BinBin with components:


An object of class data.frame that contains the valid π vectors.


The vector of the R_H^2 values.


The vector of odds ratios for T.


The vector of odds ratios for S.


The vector of the entropies of Δ_T.


The assumption regarding monotonicity that was made.


The 'volume' of the parameter space when monotonicity is not assumed. It is only provided when the argument Volume.Perc is used (i.e., when it is not equal to 0).


The 'volume' of the parameter space when monotonicity for T is assumed. It is only provided when the argument Volume.Perc is used.


The 'volume' of the parameter space when monotonicity for S is assumed. It is only provided when the argument Volume.Perc is used.


The 'volume' of the parameter space when monotonicity for S and T is assumed. It is only provided when the argument Volume.Perc is used.


Author(s)

Wim Van der Elst, Paul Meyvisch, Ariel Alonso & Geert Molenberghs


References

Alonso, A., Van der Elst, W., & Molenberghs, G. (2015). Validation of surrogate endpoints: the binary-binary setting from a causal inference perspective.

See Also

ICA.ContCont, MICA.ContCont


Examples

## Not run: # Time consuming code part
library(Surrogate)

# Compute R2_H given the marginals specified as the pi's, considering
# all monotonicity scenarios (general case)
ICA <- ICA.BinBin(pi1_1_=0.2619048, pi1_0_=0.2857143, pi_1_1=0.6372549,
                  pi_1_0=0.07843137, pi0_1_=0.1349206, pi_0_1=0.127451, Seed=1,
                  Monotonicity=c("General"),
                  Sum_Pi_f = seq(from=0.01, to=.99, by=.01), M=10000)

# Obtain plot of the results
plot(ICA, R2_H=TRUE)

# Example 2 where the uncertainty in the estimation
# of the marginals is taken into account
ICA_BINBIN2 <- ICA.BinBin(pi1_1_=runif(10000, 0.2573, 0.4252),
                          pi1_0_=runif(10000, 0.1769, 0.3310),
                          pi_1_1=runif(10000, 0.5947, 0.7779),
                          pi_1_0=runif(10000, 0.0322, 0.1442),
                          pi0_1_=runif(10000, 0.0617, 0.1764),
                          pi_0_1=runif(10000, 0.0254, 0.1315),
                          Sum_Pi_f = seq(from=0.01, to=0.99, by=.01),
                          M=50000, Seed=1)

# Plot results
plot(ICA_BINBIN2, R2_H=TRUE)

## End(Not run)

Surrogate documentation built on May 19, 2017, 8:40 p.m.