```r
knitr::opts_chunk$set(echo = TRUE)
library(tidyverse)
library(entropies)
```
The following dataset is defined in the help page for mtcars.
```r
mtcars2 <- within(mtcars, {
  vs   <- factor(vs, labels = c("V", "S"))
  am   <- factor(am, labels = c("automatic", "manual"))
  cyl  <- ordered(cyl)
  gear <- ordered(gear)
  carb <- ordered(carb)
})
summary(mtcars2)
```
Now compute the simple entropies for it.
```r
raw.entropies(mtcars2)  # should return an error!
```
The problem with this dataset is that there is only one observation per group of discrete variables. This makes it impossible to estimate the entropy with the 1-NN estimator, which returns -Inf.
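We can check this directly by counting the observations per combination of the discrete variables (a quick sketch using dplyr, loaded above via tidyverse):

```r
# How many distinct combinations of the discrete variables are there,
# and how many of them contain a single observation?
mtcars2 %>%
  count(vs, am, cyl, gear, carb) %>%
  summarise(groups = n(), singletons = sum(n == 1))
```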
Iris has only one discrete component, so it is not a very complex example of estimation.
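To confirm, a quick look at the column classes:

```r
# iris has four numeric (double) columns and one factor, Species
sapply(iris, class)
```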
First analyze the continuous component.
```r
# Select only the continuous component
X <- iris %>% select_if(function(x) is.double(x))
raw.entropies(X)
```
Then analyze the discrete component.
```r
X <- iris %>% select_if(function(x) !is.double(x))
raw.entropies(X)
```
Next, analyze everything together.
```r
X <- iris
raw.entropies(X)
```
We can also do this with other datasets that mix categorical and numerical observations, but NOT with every dataset: some are not observations of a process, just data tables.
```r
raw.entropies(mtcars2)  # will raise an error due to too few observations
```
Yet others do not have enough observations for the combinatorial entropy estimator for uniform distributions.
```r
library(mlbench)
data("Ionosphere")
str(Ionosphere)
# X <- Ionosphere
thisCmap <- sapply(Ionosphere, function(x) is.numeric(x))
print(sprintf("There are %d continuous and %d discrete features",
              sum(thisCmap), sum(!thisCmap)))
# sum(thisDmap)
#
# To test multivariate.grid
# X <- Ionosphere[!thisDmap]
raw.entropies(Ionosphere)
# IndepTest::KLentropy(
#   multivariate.grid(Ionosphere[!thisDmap], type = "uniform"),
#   k = 10
# )$Unweighted[1]/log(2)
```
A good point to continue exploring is the vignettes on the SMET and CMET, whose datasets mix discrete and continuous data.
In the following use cases we test the primitive raw.jentropies.
A first case just tests iris against iris, to see how much information there is in this null transformation.
First, compare raw.entropies on two column-bound copies of iris against a single copy.
```r
X <- iris
Y <- iris
# Since R will not allow these two to be column-bound with duplicated
# names, we modify their colnames to make them unique in this context
# of cbind.
colnames(X) <- map_chr(colnames(X), function(x) sprintf("x%s", x))
colnames(Y) <- map_chr(colnames(Y), function(y) sprintf("y%s", y))
XY <- cbind(X, Y)
bed <- raw.entropies(XY)
# Compare to the basic iris entropy
comparison <- rbind(
  cbind(dname = "iris", raw.entropies(iris)),
  cbind(dname = "double iris", bed)
)
comparison
```
Observations:
- The uniform entropy seems to be working properly, because we enforce the calculation of the entropies for each component RV in the vector.
- However, we would expect the entropy for the continuous component not to increase due to duplication, but it does (column H_P_Xc; see the check after this list).
- Q.FVA: Is this an indirect method to select the k in FNN::entropy?
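To see the discrepancy directly, we can pull just that column from the comparison (a sketch assuming, as the column name above indicates, that raw.entropies returns H_P_Xc):

```r
# Inspect only the continuous-component entropy for both datasets;
# duplication should leave H_P_Xc unchanged, yet it increases.
comparison %>% select(dname, H_P_Xc)
```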
Next, compare the joint entropies to see how this affects M_P_X and VI_P_X.
```r
X <- iris
Y <- iris
jed <- raw.jentropies(X, Y)
# Some checks
print(mutate(jed, balance = DeltaH_P + VI_P + M_P))
```
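Since Y is an exact copy of X, we would expect the variation of information to vanish and all the shared information to appear as mutual information. A sketch of the normalized balance, using only the columns referenced above:

```r
# Express each term of the balance as a fraction of the total,
# to see how the entropy splits between DeltaH_P, VI_P and M_P.
jed_bal <- mutate(jed, balance = DeltaH_P + VI_P + M_P)
mutate(jed_bal,
       DeltaH_frac = DeltaH_P / balance,
       VI_frac     = VI_P / balance,
       M_frac      = M_P / balance)
```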
```r
sessionInfo()
```