pe: Partition Entropy
In zcebeci/fcvalid: Internal Validity Indexes for Fuzzy and Possibilistic Clustering

pe	R Documentation

Partition Entropy

Description

Computes the Partition Entropy (Bezdek, 1974) in order to validate the result of a fuzzy and/or possibilistic clustering analysis.

Usage

pe(u, m, t=NULL, eta, tidx="f")

Arguments

`u`	an object of class ‘ppclust’ containing the clustering results from a fuzzy clustering algorithm in the package ppclust. Alternatively, a numeric data frame or matrix containing the membership matrix.
`t`	a numeric data frame or matrix containing the cluster prototypes. It should be specified if `u` is not an object of class ‘ppclust’ and `tidx` is set to ‘e’ or ‘g’.
`m`	a number specifying the fuzzy exponent. It should be specified if `u` is not an object of class ‘ppclust’.
`eta`	a number specifying the typicality exponent. It should be specified if `u` is not an object of class ‘ppclust’ and `tidx` is set to ‘e’ or ‘g’.
`tidx`	a character specifying the type of index. The default is ‘f’ for fuzzy index. The other options are ‘e’ for extended and ‘g’ for generalized index.

Details

Partition Entropy (PE), called Classification Entropy in some of the literature (Bezdek, 1974), is one of the earliest fuzzy validity indexes. Its formula is simple because it requires only the membership degree matrix of a fuzzy c-partition of the data set:

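I_{PE} = -\frac{1}{n} \sum\limits_{i=1}^{n} \sum\limits_{j=1}^{k} u_{ij} \; \log_{\alpha} (u_{ij})

where n is the number of objects, k is the number of clusters, u_{ij} is the membership degree of object i to cluster j, and the logarithmic base is \alpha \in (1, \infty). By convention, u_{ij} \log_{\alpha} (u_{ij}) = 0 whenever u_{ij} = 0.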

The value of I_{PE} ranges from 0 to \log_{\alpha} k (0 \leq I_{PE} \leq \log_{\alpha} k). A value of 0 means the clustering is entirely crisp, whereas a value of \log_{\alpha} k means the clustering has reached maximum fuzziness (Halkidi et al., 2002b). The latter indicates either that the dataset has no clustering structure or that the algorithm failed to reveal it. The optimal clustering is the one with the minimum value of I_{PE}.
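
The formula and its two bounds can be checked by hand with a few lines of base R. The sketch below is only an illustration: `partition_entropy` is a hypothetical helper, not a function of fcvalid, and it assumes the natural logarithm as the base, which may differ from the base that `pe` uses internally.

```r
# Hypothetical helper (not part of fcvalid): PE of a membership matrix.
# Assumption: natural logarithm unless another base is supplied.
partition_entropy <- function(u, base = exp(1)) {
  u <- as.matrix(u)
  # Apply the convention u * log(u) = 0 whenever u = 0
  terms <- ifelse(u > 0, u * log(u, base = base), 0)
  -sum(terms) / nrow(u)
}

# Entirely crisp partition: every object belongs fully to one cluster
u_crisp <- rbind(c(1, 0, 0),
                 c(0, 1, 0),
                 c(0, 0, 1),
                 c(1, 0, 0))

# Maximally fuzzy partition: equal membership 1/k in each of k = 3 clusters
u_fuzzy <- matrix(1/3, nrow = 4, ncol = 3)

pe_crisp <- partition_entropy(u_crisp)  # lower bound, 0
pe_fuzzy <- partition_entropy(u_fuzzy)  # upper bound, log(3)
```

With these toy matrices the crisp partition gives the lower bound 0 and the uniform partition gives the upper bound log(k) for k = 3; membership matrices from a real clustering run fall between the two.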

Value

`pe`	PE index value if `tidx` is ‘f’
`pe.e`	extended PE index value if `tidx` is ‘e’
`pe.g`	generalized PE index value if `tidx` is ‘g’

Author(s)

Zeynel Cebeci

References

Bezdek, J.C. (1974). Cluster validity with fuzzy sets. Journal of Cybernetics, 3(3):58-72. <doi:10.1080/01969727308546047>

Halkidi, M., Batistakis, Y. & Vazirgiannis, M. (2002b). Clustering validity checking methods: part II. ACM Sigmod Record, 31(3):19-27. <doi:10.1145/601858.601862>

Examples

# Load the dataset iris and use the first four feature columns 
data(iris)
x <- iris[,1:4]

# Run FCM algorithm in the package ppclust 
res.fcm <- ppclust::fcm(x, centers=3)

# Compute the PE index using res.fcm, which is a ppclust object
idx <- pe(res.fcm)
print(idx)
 
# Compute the PE index using the membership matrix U
idx <- pe(res.fcm$u)
print(idx)

# Run UPFC algorithm in the package ppclust 
res.upfc <- ppclust::upfc(x, centers=3)
# Compute the generalized PE index using res.upfc, which is a ppclust object
idx <- pe(res.upfc, tidx="g")
print(idx)

zcebeci/fcvalid documentation built on Oct. 4, 2022, 9:01 p.m.