entropy_bivar: Bivariate Entropy


View source: R/entropy_bivar.R

Description

Computes the bivariate entropies between all pairs of (discrete) variables in a multivariate data set.

Usage

entropy_bivar(dat)

Arguments

dat

Data frame with rows as observations and columns as variables. All variables must be categorical, either as observed or after transformation, with finite range spaces.

Details

The bivariate entropy H(X,Y) of two discrete random variables X and Y can be used to check for functional relationships and stochastic independence between pairs of variables. The bivariate entropy is bounded according to

H(X) <= H(X,Y) <= H(X) + H(Y)

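where H(X) and H(Y) are the univariate entropies. The lower bound is attained exactly when Y is a function of X, and the upper bound is attained exactly when X and Y are stochastically independent.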
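The bounds can be illustrated with a small base R sketch. The helper `emp_entropy` and the toy vectors are hypothetical and are not part of netropy. The sketch assumes entropies are measured in bits (base-2 logarithm) and estimated from observed relative frequencies.

```r
# Hypothetical helper (not part of netropy): Shannon entropy in bits of the
# empirical joint distribution of one or more discrete vectors.
emp_entropy <- function(...) {
  p <- as.vector(prop.table(table(...)))  # relative frequencies of all cells
  p <- p[p > 0]                           # empty cells contribute nothing
  -sum(p * log2(p))
}

# Toy data, made up for illustration
x     <- rep(0:3, each = 2)   # four equally frequent outcomes
y_fun <- x %/% 2              # y_fun is a function of x
y_ind <- rep(0:1, times = 4)  # y_ind is empirically independent of x

# Functional relationship: H(X,Y) equals the lower bound H(X)
emp_entropy(x, y_fun) - emp_entropy(x)

# Independence: H(X,Y) equals the upper bound H(X) + H(Y)
emp_entropy(x, y_ind) - (emp_entropy(x) + emp_entropy(y_ind))
```

Both differences are zero for this toy data. With real data the bivariate entropy typically falls strictly between the two bounds, and its distance from each bound indicates how close the pair is to a functional relationship or to independence.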

Value

Upper triangular matrix of bivariate entropies, with rows and columns corresponding to the variables in dat. The univariate entropies are given on the diagonal.
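The layout of the returned matrix can be mimicked in base R as follows. This is only a sketch of the described structure, not the package's implementation. The helper `emp_entropy`, the toy data frame, the base-2 logarithm, and the use of NA below the diagonal are all assumptions made for illustration.

```r
# Hypothetical helper (not part of netropy): empirical Shannon entropy in bits.
emp_entropy <- function(...) {
  p <- as.vector(prop.table(table(...)))
  p <- p[p > 0]
  -sum(p * log2(p))
}

# Toy data frame of categorical variables, made up for illustration
toy <- data.frame(
  v1 = c(0, 0, 1, 1, 2, 2, 3, 3),
  v2 = c(0, 0, 0, 0, 1, 1, 1, 1),
  v3 = c(0, 1, 0, 1, 0, 1, 0, 1)
)

k <- ncol(toy)
H <- matrix(NA_real_, k, k, dimnames = list(names(toy), names(toy)))
for (i in seq_len(k)) {
  for (j in i:k) {
    # diagonal: univariate entropy; above the diagonal: bivariate entropy
    H[i, j] <- if (i == j) {
      emp_entropy(toy[[i]])
    } else {
      emp_entropy(toy[[i]], toy[[j]])
    }
  }
}

H        # bivariate entropies above the diagonal
diag(H)  # univariate entropies
```

Reading an entry H[i, j] against diag(H)[i] and diag(H)[j] gives the comparison with the lower and upper bounds for that pair of variables.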

Author(s)

Termeh Shafie

References

Frank, O., & Shafie, T. (2016). Multivariate entropy analysis of network data. Bulletin of Sociological Methodology/Bulletin de Méthodologie Sociologique, 129(1), 45-63.

Nowicki, K., Shafie, T., & Frank, O. (Forthcoming 2022). Statistical Entropy Analysis of Network Data.

See Also

joint_entropy, entropy_trivar, redundancy, prediction_power

Examples

# load the package and use its internal data set
library(netropy)
data(lawdata)
df.att <- lawdata[[4]]

# three steps of data editing:
# 1. categorize variables 'years' and 'age' into three groups of
# approximately equal size (cut points based on the cdf)
# 2. make sure all outcomes start from the value 0 (optional)
# 3. remove variable 'senior' as it consists only of unique values (thus redundant)
df.att.ed <- data.frame(
  status    = df.att$status,
  gender    = df.att$gender,
  office    = df.att$office - 1,
  years     = ifelse(df.att$years <= 3, 0,
              ifelse(df.att$years <= 13, 1, 2)),
  age       = ifelse(df.att$age <= 35, 0,
              ifelse(df.att$age <= 45, 1, 2)),
  practice  = df.att$practice,
  lawschool = df.att$lawschool - 1
)

# calculate bivariate entropies
H.biv <- entropy_bivar(df.att.ed)
# univariate entropies are then given as
diag(H.biv)

netropy documentation built on Feb. 2, 2022, 9:07 a.m.