View source: R/computeInformation.R
computeThreePointInfo (R Documentation)
Three-point information is defined and computed as the difference between mutual information and conditional mutual information, e.g.
I(X;Y;Z|U) = I(X;Y|U) - I(X;Y|U,Z)
For discrete or categorical variables, the three-point information is computed with the empirical frequencies minus a complexity cost (computed as BIC or with the Normalized Maximum Likelihood).
computeThreePointInfo(
x,
y,
z,
df_conditioning = NULL,
maxbins = NULL,
cplx = c("nml", "bic"),
n_eff = -1,
sample_weights = NULL,
is_continuous = NULL
)
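As a quick sketch of the identity above (not part of the package documentation): in the unconditional case, i3 should match the difference of two mutual information estimates obtained with computeMutualInfo. The $info field of the computeMutualInfo result and the simulated variables are assumptions for illustration only.

library(miic)
set.seed(1)
N <- 1000
Z <- runif(N)
X <- Z + rnorm(N, sd = 0.2)
Y <- Z + rnorm(N, sd = 0.2)
i3 <- computeThreePointInfo(X, Y, Z)$i3
i_xy   <- computeMutualInfo(X, Y)$info                                   # I(X;Y)
i_xy_z <- computeMutualInfo(X, Y, df_conditioning = data.frame(Z))$info  # I(X;Y|Z)
# The two estimates should be close; they may differ slightly because
# the discretizations are computed independently
message("i3 = ", i3, "   I(X;Y) - I(X;Y|Z) = ", i_xy - i_xy_z)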
x: [a vector] The X vector that contains the observational data of the first variable.
y: [a vector] The Y vector that contains the observational data of the second variable.
z: [a vector] The Z vector that contains the observational data of the third variable.
df_conditioning: [a data frame] The data frame of the observations of the set of conditioning variables.
maxbins: [an integer] When the data contain continuous variables, the maximum number of bins allowed during the discretization. A smaller number makes the computation faster, a larger number allows finer discretization.
cplx: [a string] The complexity model: "nml" for the Normalized Maximum Likelihood (default) or "bic" for the Bayesian Information Criterion.
n_eff: [an integer] The effective number of samples. When there is significant autocorrelation between successive samples, you may want to specify an effective number of samples that is lower than the total number of samples.
sample_weights: [a vector of floats] Individual weights for each sample, used for the same reason as the effective number of samples but with individual weights.
is_continuous: [a vector of booleans] Specify whether each variable is to be treated as continuous (TRUE) or discrete (FALSE). Must be of length 'ncol(df_conditioning) + 3', in the order x, y, z, followed by the columns of df_conditioning (see the usage sketch after this list).
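A minimal usage sketch (not from the original documentation) showing how these arguments fit together, assuming a conditioning data frame with one continuous and one discrete column and that factors are accepted for discrete variables; the variable names are made up for illustration:

library(miic)
set.seed(42)
N <- 500
u1 <- rnorm(N)                                              # continuous conditioning variable
u2 <- factor(sample(c("a", "b", "c"), N, replace = TRUE))   # discrete conditioning variable
x <- u1 + rnorm(N)
y <- u1 + rnorm(N)
z <- rnorm(N)
res <- computeThreePointInfo(
  x, y, z,
  df_conditioning = data.frame(u1, u2),
  cplx = "bic",     # BIC complexity cost instead of the default NML
  maxbins = 20,     # cap on the number of bins used for discretization
  # one flag per variable: x, y, z, then the conditioning columns
  is_continuous = c(TRUE, TRUE, TRUE, TRUE, FALSE)
)
res$i3k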
For variables X, Y, Z and a set of conditioning variables U, the conditional three-point information is defined as
Ik(X;Y;Z|U) = Ik(X;Y|U) - Ik(X;Y|U,Z)
where Ik is the shifted or regularized conditional mutual information. See computeMutualInfo for the definition of Ik.
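As a minimal numerical check of this relation (not part of the package documentation), the two regularized mutual information terms can be estimated with computeMutualInfo, assuming its result stores the regularized estimate in the $infok field as described in its help page:

library(miic)
set.seed(7)
N <- 1000
U <- data.frame(u = runif(N))
Z <- U$u + rnorm(N, sd = 0.5)
X <- Z + rnorm(N, sd = 0.5)
Y <- Z + rnorm(N, sd = 0.5)
res <- computeThreePointInfo(X, Y, Z, df_conditioning = U)
ik_xy_u  <- computeMutualInfo(X, Y, df_conditioning = U)$infok                 # Ik(X;Y|U)
ik_xy_uz <- computeMutualInfo(X, Y, df_conditioning = cbind(U, Z = Z))$infok   # Ik(X;Y|U,Z)
# res$i3k and the difference below should be close; they may differ slightly
# because the discretizations are computed independently
message("i3k = ", res$i3k, "   Ik(X;Y|U) - Ik(X;Y|U,Z) = ", ik_xy_u - ik_xy_uz)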
A list that contains:
i3: the estimation of (conditional) three-point information without the complexity cost.
i3k: the estimation of (conditional) three-point information with the complexity cost (i3k = i3 - cplx).
i2: for reference, the estimation of (conditional) mutual information I(X;Y|U) used in the estimation of i3.
i2k: for reference, the estimation of regularized (conditional) mutual information Ik(X;Y|U) used in the estimation of i3k.
Cabeli et al., PLoS Comput. Biol. 2020, Learning clinical networks from medical records based on information estimates in mixed-type data
Affeldt et al., UAI 2015, Robust Reconstruction of Causal Graphical Models based on Conditional 2-point and 3-point Information
library(miic)
N <- 1000
# Dependence, conditional independence: X <- Z -> Y
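# X and Y are dependent but become independent given Z,
# so i3 and i3k are expected to be positive.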
Z <- runif(N)
X <- Z * 2 + rnorm(N, sd = 0.2)
Y <- Z * 2 + rnorm(N, sd = 0.2)
res <- computeThreePointInfo(X, Y, Z)
message("I(X;Y;Z) = ", res$i3)
message("Ik(X;Y;Z) = ", res$i3k)
# Independence, conditional dependence: X -> Z <- Y
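# X and Y are independent but become dependent given Z (a collider),
# so i3 and i3k are expected to be negative.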
X <- runif(N)
Y <- runif(N)
Z <- X + Y + rnorm(N, sd = 0.1)
res <- computeThreePointInfo(X, Y, Z)
message("I(X;Y;Z) = ", res$i3)
message("Ik(X;Y;Z) = ", res$i3k)