ClusterShannonInfo: Shannon Information


ClusterShannonInfo    R Documentation

Shannon Information

Description

Computes the Shannon information [Shannon, 1948] for each column (clustering) of ClsMatrix.

Usage

ClusterShannonInfo(ClsMatrix)

Arguments

ClsMatrix

[1:n,1:C] matrix of C clusterings. Each column is defined as follows:

[1:n] numerical vector defining the classification, i.e., the main output of the clustering algorithm for the n cases of data. It has k unique numbers representing the arbitrary labels of the clustering.

Details

Info[1:d] = sum(-p * log(p)/MaxInfo) over all unique cases with probability p in ClsMatrix[,c]. For a column with k clusters, MaxInfo = -(1/k)*log(1/k).
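The formula can be illustrated with a minimal sketch in base R. It reflects one reading of the definition above (one normalized term per cluster) and is not the package's own code; the helper name shannon_terms is hypothetical and exists only for this illustration.

```r
# Hypothetical helper, not part of FCPS: normalized Shannon terms of one clustering.
shannon_terms <- function(cls) {
  p <- as.numeric(table(cls)) / length(cls)  # relative frequency p of each label
  k <- length(p)                             # number of clusters in this column
  max_info <- -(1 / k) * log(1 / k)          # term of a perfectly balanced clustering
  -p * log(p) / max_info                     # one value per cluster
}

# Three clusters of equal size: every term equals the balanced reference.
balanced <- rep(1:3, each = 4)
shannon_terms(balanced)

# One dominant cluster and two singletons: the terms deviate from 1.
skewed <- c(rep(1, 10), 2, 3)
shannon_terms(skewed)
```

Under this reading, a column with a single cluster (k = 1) gives MaxInfo = 0, so the term is undefined (NaN in R).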

Value

Info

[1:max.nc,1:C] matrix of Shannon information as defined in Details. Each column represents one Cls of ClsMatrix, and each row yields the information of one cluster, up to the ClusterNo k. If k < max.nc (the highest number of clusters), the remaining rows are filled with NaN.

ClusterNo

Number of clusters k found for each Cls, respectively

MaxInfo

max per column of Info

MinInfo

min per column of Info

MedianInfo

median per column of Info

MeanInfo

mean per column of Info
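The shape of these outputs can be sketched in base R without the package. The following is an illustration under stated assumptions, not the FCPS implementation: shannon_terms is a hypothetical stand-in for the per-cluster computation, all lower-case variable names are made up for this example, and ignoring the NaN padding with na.rm = TRUE is an assumption about how the per-column summaries treat the filled rows.

```r
# Hypothetical stand-in for the per-cluster computation (not part of FCPS).
shannon_terms <- function(cls) {
  p <- as.numeric(table(cls)) / length(cls)
  k <- length(p)
  -p * log(p) / (-(1 / k) * log(1 / k))
}

# Three clusterings of iris with 2, 3 and 4 clusters, one per column.
data <- as.matrix(iris[, 1:4])
hc <- hclust(dist(data), method = "complete")
clsm <- sapply(2:4, function(k) cutree(hc, k))

# Number of clusters per column and the largest of them.
cluster_no <- apply(clsm, 2, function(x) length(unique(x)))
max_nc <- max(cluster_no)

# Info-like matrix: one column per clustering, padded with NaN below row k.
info <- matrix(NaN, nrow = max_nc, ncol = ncol(clsm))
for (j in seq_len(ncol(clsm))) {
  terms <- shannon_terms(clsm[, j])
  info[seq_along(terms), j] <- terms
}

# Per-column summaries; the padding is skipped here (assumption).
max_info    <- apply(info, 2, max,    na.rm = TRUE)
min_info    <- apply(info, 2, min,    na.rm = TRUE)
median_info <- apply(info, 2, median, na.rm = TRUE)
mean_info   <- apply(info, 2, mean,   na.rm = TRUE)
```

Without na.rm = TRUE, every summary of a padded column would itself be NaN or NA, which is why the sketch drops the padding before aggregating.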

Note

Reimplemented from Alfred Ultsch's Matlab version, but not yet verified.

Author(s)

Michael Thrun

References

[Shannon, 1948] Shannon, C. E.: A Mathematical Theory of Communication, Bell System Technical Journal, Vol. 27(3), pp. 379-423, doi:10.1002/j.1538-7305.1948.tb01338.x, 1948.

Examples

library(FCPS)

# Read the iris dataset from the standard R package datasets
data <- as.matrix(iris[, 1:4])
max.nc <- 7

# Create the clusterings for the data set (here with method "complete")
# for 2 to 8 clusters, one clustering per column of clsm
hc <- hclust(dist(data), method = "complete")
clsm <- matrix(data = 0, nrow = nrow(data), ncol = max.nc)
for (i in 2:(max.nc + 1)) {
  clsm[, i - 1] <- cutree(hc, i)
}

ClusterShannonInfo(clsm)

FCPS documentation built on Oct. 19, 2023, 5:06 p.m.