View source: R/entropy_tetravar.R
| entropy_tetravar | R Documentation |
Computes tetravariate entropies, expected conditional entropies, and expected conditional joint entropies for all quadruples of variables in a multivariate discrete data set.
entropy_tetravar(dat, dec = 2)
dat |
data frame with rows as observations and columns as variables. Variables must be categorical with finite range spaces. |
dec |
number of decimals used for rounding the entropy values. Default is 2. |
For four variables X, Y, Z, and U, the tetravariate entropy is denoted H(X,Y,Z,U). The expected conditional entropies are computed as
EH(U|X,Y,Z) = H(X,Y,Z,U) - H(X,Y,Z)
and
EH(Z|X,Y,U) = H(X,Y,Z,U) - H(X,Y,U).
The expected conditional joint entropy is computed as
EJ(X,Y|Z,U) = H(X,Z,U) + H(Y,Z,U) - H(Z,U) - H(X,Y,Z,U).
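This quantity measures the deviation from conditional independence of the form
X ⊥ Y | Z, U.
It is non-negative and equals zero exactly when X and Y are conditionally independent given Z and U in the observed data; smaller values indicate weaker conditional dependence.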
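The formulas above can be checked by hand on a small example. The following is a minimal sketch in base R, not the package's implementation: `joint_entropy` is a hypothetical helper, the `toy` data are invented for illustration, and base-2 logarithms (entropy in bits) are an assumption.

```r
# Hypothetical helper (not part of the package): joint Shannon entropy of the
# selected columns, from empirical relative frequencies.
# Assumption: logarithms are taken to base 2, so entropy is measured in bits.
joint_entropy <- function(dat, vars) {
  counts <- table(dat[, vars, drop = FALSE])  # cross-tabulate the columns
  p <- counts[counts > 0] / sum(counts)       # drop empty cells, normalize
  -sum(p * log2(p))
}

# Toy data invented for illustration: 8 observations of 4 binary variables
toy <- data.frame(
  X = c(0, 0, 1, 1, 0, 1, 0, 1),
  Y = c(0, 1, 0, 1, 0, 1, 1, 0),
  Z = c(0, 0, 0, 0, 1, 1, 1, 1),
  U = c(0, 1, 1, 0, 0, 1, 0, 1)
)

# Tetravariate entropy H(X,Y,Z,U)
H_XYZU <- joint_entropy(toy, c("X", "Y", "Z", "U"))

# Expected conditional entropies EH(U|X,Y,Z) and EH(Z|X,Y,U)
EH_U_XYZ <- H_XYZU - joint_entropy(toy, c("X", "Y", "Z"))
EH_Z_XYU <- H_XYZU - joint_entropy(toy, c("X", "Y", "U"))

# Expected conditional joint entropy EJ(X,Y|Z,U)
EJ_XY_ZU <- joint_entropy(toy, c("X", "Z", "U")) +
  joint_entropy(toy, c("Y", "Z", "U")) -
  joint_entropy(toy, c("Z", "U")) -
  H_XYZU

# Round to two decimals, analogous to the default dec = 2
round(c(H_XYZU = H_XYZU, EH_U_XYZ = EH_U_XYZ,
        EH_Z_XYU = EH_Z_XYU, EJ_XY_ZU = EJ_XY_ZU), 2)
```

The sketch handles a single assignment of roles to the four variables; entropy_tetravar reports these quantities for all quadruples of variables in dat.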
A data frame with one row for each ordered decomposition of four variables into a pair of interest (X, Y) and a pair of conditioning variables (Z, U). The columns are:
X |
first variable in the pair of interest. |
Y |
second variable in the pair of interest. |
Z |
first conditioning variable. |
U |
second conditioning variable. |
H_XYZU |
tetravariate entropy H(X,Y,Z,U). |
EH_U_XYZ |
expected conditional entropy EH(U|X,Y,Z). |
EH_Z_XYU |
expected conditional entropy EH(Z|X,Y,U). |
EJ_XY_ZU |
expected conditional joint entropy EJ(X,Y|Z,U). |
Termeh Shafie
Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.
entropy_trivar, entropy_bivar, prediction_power
# use internal data set
data(lawdata)
# extract node attributes
df_att <- lawdata[[4]]
# data editing:
# 1. discretize 'years' and 'age' into three approximately balanced groups
# 2. recode selected variables so categories start at 0
att_var <- data.frame(
status = df_att$status - 1,
gender = df_att$gender,
office = df_att$office - 1,
years = ifelse(df_att$years <= 3, 0,
ifelse(df_att$years <= 13, 1, 2)),
age = ifelse(df_att$age <= 35, 0,
ifelse(df_att$age <= 45, 1, 2)),
practice = df_att$practice,
lawschool = df_att$lawschool - 1
)
# compute tetravariate entropy quantities for five selected variables
entropy_tetravar(
dat = att_var[, c("gender", "years", "age", "office", "practice")]
)