| entropy_trivar | R Documentation | 
Computes trivariate entropies of all triples of (discrete) variables in a multivariate data set.
entropy_trivar(dat)
| dat | dataframe with rows as observations and columns as variables. All variables must be categorical, either as observed or after transformation, with finite range spaces. | 
Trivariate entropies can be used to check for functional relationships and
stochastic independence between triples of variables.
The trivariate entropy H(X,Y,Z) of three discrete random variables X, Y and Z
is bounded according to 
H(X,Y) <= H(X,Y,Z) <= H(X,Z) + H(Y,Z) - H(Z).
The increment between the trivariate entropy and its lower bound equals the expected conditional entropy EH(Z|X,Y) = H(X,Y,Z) - H(X,Y).
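The bounds above can be checked numerically on a small example. The following is a minimal base-R sketch, not the package's implementation: the helper `joint_entropy` and the toy vectors `x`, `y`, `z` are hypothetical names introduced here, and entropies are assumed to be measured in bits (base-2 logarithm) and estimated from observed relative frequencies.

```r
# Hypothetical helper (not part of the package): joint entropy in bits
# of one or more discrete variables, from observed relative frequencies.
joint_entropy <- function(...) {
  counts <- table(...)
  p <- counts[counts > 0] / sum(counts)  # drop empty cells: 0 * log2(0) is taken as 0
  -sum(p * log2(p))
}

# Toy data: x and y are balanced bits, z is their XOR,
# so z is a function of the pair (x, y).
x <- c(0, 0, 1, 1)
y <- c(0, 1, 0, 1)
z <- (x + y) %% 2

h_xy  <- joint_entropy(x, y)      # lower bound H(X,Y)
h_xyz <- joint_entropy(x, y, z)   # trivariate entropy H(X,Y,Z)
upper <- joint_entropy(x, z) + joint_entropy(y, z) - joint_entropy(z)

# Increment over the lower bound: expected conditional entropy of z given (x, y)
eh_z_given_xy <- h_xyz - h_xy

c(lower = h_xy, H = h_xyz, upper = upper, EH = eh_z_given_xy)
```

In this example the increment over the lower bound is zero, which reflects that `z` is fully determined by `x` and `y`. The trivariate entropy lies strictly below the upper bound, so `x` and `y` are not conditionally independent given `z`.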
Dataframe in which the first three columns identify the possible triples of variables (V1,V2,V3)
and the fourth column gives the trivariate entropies H(V1,V2,V3).
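To illustrate the shape of such a result, here is a minimal base-R sketch that builds a comparable table for a toy dataframe. It is an illustration under stated assumptions, not the function's actual code: `trivar_sketch`, `joint_entropy_df`, the column name `H`, and the `toy` data are all hypothetical, only unordered triples of distinct variables are listed, and entropies are assumed to be in bits.

```r
# Hypothetical helper: joint entropy in bits of all columns of a dataframe.
joint_entropy_df <- function(d) {
  counts <- table(d)                       # cross-tabulate all columns
  p <- counts[counts > 0] / sum(counts)    # relative frequencies of observed cells
  -sum(p * log2(p))
}

# Hypothetical sketch: one row per unordered triple of distinct variables.
trivar_sketch <- function(dat) {
  triples <- t(combn(names(dat), 3))       # each row holds three variable names
  data.frame(
    V1 = triples[, 1],
    V2 = triples[, 2],
    V3 = triples[, 3],
    H  = apply(triples, 1, function(v) joint_entropy_df(dat[, v])),
    stringsAsFactors = FALSE
  )
}

# Toy data with four categorical variables (assumed, for illustration only)
toy <- data.frame(
  a = c(0, 0, 1, 1),
  b = c(0, 1, 0, 1),
  c = c(0, 1, 1, 0),
  d = c(0, 0, 0, 1)
)
res <- trivar_sketch(toy)
res
```

With four variables there are choose(4, 3) = 4 triples, so `res` has four rows and four columns.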
Termeh Shafie
Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.
entropy_bivar, prediction_power
# use internal data set
data(lawdata)
df.att <- lawdata[[4]]
# three steps of data editing:
# 1. categorize variables 'years' and 'age' into
# three approximately equally sized groups (cutoffs based on the cdf)
# 2. make sure all outcomes start from the value 0 (optional)
# 3. remove variable 'senior' as it consists of only unique values (thus redundant)
df.att.ed <- data.frame(
    status = df.att$status,
    gender = df.att$gender,
    office = df.att$office - 1,
    years = ifelse(df.att$years <= 3, 0,
        ifelse(df.att$years <= 13, 1, 2)
    ),
    age = ifelse(df.att$age <= 35, 0,
        ifelse(df.att$age <= 45, 1, 2)
    ),
    practice = df.att$practice,
    lawschool = df.att$lawschool - 1
)
# calculate trivariate entropies
H.triv <- entropy_trivar(df.att.ed)