entropy_trivar: Trivariate Entropy


View source: R/entropy_trivar.R

Description

Computes trivariate entropies of all triples of (discrete) variables in a multivariate data set.

Usage

entropy_trivar(dat)

Arguments

dat

Data frame with rows as observations and columns as variables. All variables must be categorical (either observed as such or transformed) with finite range spaces.

Details

Trivariate entropies can be used to check for functional relationships and stochastic independence between triples of variables. The trivariate entropy H(X,Y,Z) of three discrete random variables X, Y and Z is bounded according to

H(X,Y) <= H(X,Y,Z) <= H(X,Z) + H(Y,Z) - H(Z).

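The increment between the trivariate entropy and its lower bound equals the expected conditional entropy of Z given X and Y, that is, H(X,Y,Z) - H(X,Y). The lower bound is attained when Z is a function of (X,Y), and the upper bound is attained when X and Y are conditionally independent given Z.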
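The bounds can be checked numerically on a small data set. The following is a minimal base R sketch, not the package's implementation: joint_entropy() is a hypothetical helper, the toy data are made up for illustration, and base-2 logarithms are an assumption.

```r
# Empirical joint entropy of the columns of a data frame.
# joint_entropy() is a hypothetical helper, not part of netropy;
# the base-2 logarithm is an assumption.
joint_entropy <- function(df) {
  p <- as.vector(table(df)) / nrow(df)  # relative frequency of each cell
  p <- p[p > 0]                         # empty cells contribute nothing
  -sum(p * log2(p))
}

# Toy data (illustrative only): z is a deterministic function of (x, y)
toy <- data.frame(x = c(0, 0, 1, 1, 0, 1),
                  y = c(0, 1, 0, 1, 1, 0))
toy$z <- as.integer(xor(toy$x, toy$y))

h_xyz <- joint_entropy(toy)
h_xy  <- joint_entropy(toy[c("x", "y")])
h_xz  <- joint_entropy(toy[c("x", "z")])
h_yz  <- joint_entropy(toy[c("y", "z")])
h_z   <- joint_entropy(toy["z"])

lower <- h_xy                # H(X,Y)
upper <- h_xz + h_yz - h_z   # H(X,Z) + H(Y,Z) - H(Z)

# Both comparisons should be TRUE (up to floating-point tolerance)
c(lower_ok = h_xyz >= lower - 1e-12, upper_ok = h_xyz <= upper + 1e-12)
```

Because z is fully determined by x and y in this toy example, the trivariate entropy coincides with its lower bound, which is the signature of a functional relationship.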

Value

Data frame in which the first three columns identify a triple of variables (V1,V2,V3) and the fourth column gives the trivariate entropy H(V1,V2,V3) of that triple.
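To get a feel for the size of the result, the triples can be enumerated in base R. This is only a sketch under the assumption that each unordered triple of variables appears once; the variable names are taken from the edited data frame in the Examples, and the column names V1, V2, V3 follow the description above rather than the actual output of entropy_trivar.

```r
# Variable names as used in the edited data frame in the Examples
vars <- c("status", "gender", "office", "years", "age", "practice", "lawschool")

# combn() returns one column per unordered triple;
# transpose to get one row per triple
triples <- as.data.frame(t(combn(vars, 3)), stringsAsFactors = FALSE)
names(triples) <- c("V1", "V2", "V3")  # assumed column names

nrow(triples)     # equals choose(length(vars), 3)
head(triples, 3)
```

Under this assumption the number of rows grows as choose(k, 3) in the number of variables k, so wide data sets produce long result tables.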

Author(s)

Termeh Shafie

References

Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.

Nowicki, K., Shafie, T., & Frank, O. (Forthcoming 2022). Statistical Entropy Analysis of Network Data.

See Also

entropy_bivar, prediction_power

Examples

# use internal data set
data(lawdata)
df.att <- lawdata[[4]]

# three steps of data editing:
# 1. categorize variables 'years' and 'age' into three groups of
# approximately equal size (cutoffs based on the empirical cdf)
# 2. make sure all outcomes start from the value 0 (optional)
# 3. remove variable 'senior' as it consists of only unique values (thus redundant)
df.att.ed <- data.frame(
   status   = df.att$status,
   gender   = df.att$gender,
   office   = df.att$office-1,
   years    = ifelse(df.att$years<=3,0,
              ifelse(df.att$years<=13,1,2)),
   age      = ifelse(df.att$age<=35,0,
                ifelse(df.att$age<=45,1,2)),
   practice = df.att$practice,
   lawschool= df.att$lawschool-1)

# calculate trivariate entropies
H.triv <- entropy_trivar(df.att.ed)

netropy documentation built on Feb. 2, 2022, 9:07 a.m.