cwb: Composed Within and Between Scattering Index
In zcebeci/fcvalid: Internal Validity Indexes for Fuzzy and Possibilistic Clustering

cwb	R Documentation

Composed Within and Between Scattering Index

Description

Computes the CWB index (Rezaee et al., 1998) to validate the result of a fuzzy and/or possibilistic cluster analysis.

Usage

cwb(x, u, v, m, t=NULL, eta, av=1, tidx="f")

Arguments

`x`	an object of class ‘ppclust’ containing the clustering results from a fuzzy clustering algorithm in the package ppclust. Alternatively, a numeric data frame or matrix containing the data set.
`u`	a numeric data frame or matrix containing the fuzzy membership values. It should be specified if `x` is not an object of ‘ppclust’.
`v`	a numeric data frame or matrix containing the cluster prototypes. It should be specified if `x` is not an object of ‘ppclust’.
`t`	a numeric data frame or matrix containing the typicality values. It should be specified if `x` is not an object of ‘ppclust’ and the option ‘e’ or ‘g’ is assigned to `tidx`.
`m`	a number specifying the fuzzy exponent. It should be specified if `x` is not an object of ‘ppclust’.
`eta`	a number specifying the typicality exponent. It should be specified if `x` is not an object of ‘ppclust’ and `tidx` is either ‘e’ or ‘g’.
`av`	a number specifying the weighting factor (α in the formula under Details) that counterbalances the two terms of the index. It should equal Dis(k_{max}), the value of the separation term at the largest number of clusters considered. The default value is 1.
`tidx`	a character specifying the type of index. The default is ‘f’ for fuzzy index. The other options are ‘e’ for extended and ‘g’ for generalized index.

Details

The Composed Within and Between Scattering (CWB) index, defined by Rezaee et al. (1998), is a validation index that combines the average compactness and the separation of fuzzy partitions generated by the fuzzy c-means algorithm. The index is computed as follows:

I_{CWB}=α \; Scat(k) + Dis(k)

Here α is the weighting factor (the argument `av`). The first term, Scat(k), is the average scattering (variation) within the clusters for a partition with k clusters, and is defined as follows:

Scat(k) = \frac{\frac{1}{k} ∑\limits_{j=1}^k ||σ(v_j)||}{||σ(X)||}

where ||x||=√{x^T x} and σ(X) is the variance of the data set X.

In the above equation, σ(v_j) is the fuzzy variation of the cluster j. Its component for the feature p is:

σ_{v_j}^p=\frac{1}{n} ∑\limits_{i=1}^n u_{ij}(\vec{x}_{i}^p - \vec{v}_{j}^p)^2
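
The definition of Scat(k) can be traced step by step in a few lines of base R. The following is a minimal sketch, not the package's implementation: `scat_sketch` is a hypothetical helper name, the toy matrices are made up for illustration, and σ(X) is assumed to be the per-feature variance of the data computed with divisor n.

```r
# Sketch of Scat(k); `scat_sketch` is a hypothetical helper, not part of fcvalid.
scat_sketch <- function(x, u, v) {
  x <- as.matrix(x); u <- as.matrix(u); v <- as.matrix(v)
  n <- nrow(x); k <- nrow(v)
  # Euclidean norm ||a|| = sqrt(a^T a)
  vnorm <- function(a) sqrt(sum(a^2))
  # Assumption: sigma(X) is the per-feature variance of the data, divisor n
  sigma_x <- colMeans(sweep(x, 2, colMeans(x))^2)
  # ||sigma(v_j)|| for each cluster j
  sigma_v_norms <- vapply(seq_len(k), function(j) {
    sq_dev <- sweep(x, 2, v[j, ])^2        # (x_i^p - v_j^p)^2, an n x p matrix
    vnorm(colSums(u[, j] * sq_dev) / n)    # sigma_{v_j}^p for every p, then its norm
  }, numeric(1))
  mean(sigma_v_norms) / vnorm(sigma_x)
}

# Toy data (made up): four points, two clusters, crisp memberships
x <- rbind(c(0, 0), c(0, 1), c(4, 4), c(4, 5))
v <- rbind(c(0, 0.5), c(4, 4.5))
u <- rbind(c(1, 0), c(1, 0), c(0, 1), c(0, 1))
scat_value <- scat_sketch(x, u, v)
print(scat_value)
```

Tight clusters relative to the overall spread of the data give a small Scat(k), which is why this term tends to decrease as k grows.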

The second term, Dis(k), measures the total separation between the clusters. This term generally increases with the number of clusters and is influenced by the geometry of the cluster centers (Rezaee et al., 1998). It is defined as follows:

Dis(k) = \frac{D_{max}}{D_{min}} ∑\limits_{j=1}^k \Big(∑\limits_{l=1}^k ||v_j - v_l|| \Big)^{-1}

where D_{max}=\max{(||v_j - v_l||)} and D_{min}=\min{(||v_j - v_l||)}, both taken over pairs of distinct clusters (j ≠ l).
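
The separation term depends only on the cluster prototypes, so it can be sketched directly from a matrix of centers. The code below is an illustrative sketch rather than the package's code: `dis_sketch` is a hypothetical helper name, the prototype matrix is made up, and D_{max} and D_{min} are assumed to be taken over distinct pairs of prototypes.

```r
# Sketch of Dis(k); `dis_sketch` is a hypothetical helper, not part of fcvalid.
dis_sketch <- function(v) {
  v <- as.matrix(v)
  d <- as.matrix(dist(v))        # pairwise Euclidean distances ||v_j - v_l||
  off_diag <- d[upper.tri(d)]    # distinct pairs only (assumption: j != l)
  # The zero diagonal adds nothing to the inner sums over l
  (max(off_diag) / min(off_diag)) * sum(1 / rowSums(d))
}

# Toy prototypes (made up): three collinear centers
v <- rbind(c(0, 0), c(3, 4), c(6, 8))
dis_value <- dis_sketch(v)
print(dis_value)
```

Adding a prototype close to an existing one shrinks D_{min} and so inflates the ratio D_{max}/D_{min}, which penalizes partitions with nearly coincident centers.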

The optimal clustering is obtained at the minimum value of I_{CWB}.
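
Selecting the number of clusters then amounts to evaluating the index over a range of k and taking the minimum. The sketch below illustrates only that arithmetic: the Scat(k) and Dis(k) values are made-up numbers, not results from any data set, and α is set to Dis(k_{max}) as described for the argument `av`.

```r
# Hypothetical Scat(k) and Dis(k) values for k = 2..5 (made-up, illustration only)
k_values <- 2:5
scat <- c(0.40, 0.22, 0.18, 0.16)
dis  <- c(0.30, 0.55, 1.10, 1.90)

alpha <- dis[length(dis)]              # weighting factor: Dis(k_max)
i_cwb <- alpha * scat + dis            # I_CWB = alpha * Scat(k) + Dis(k)
best_k <- k_values[which.min(i_cwb)]   # the minimum marks the preferred partition
print(data.frame(k = k_values, i_cwb = i_cwb))
print(best_k)
```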

Value

`cwb`	CWB index value, if `tidx` is ‘f’
`cwb.e`	extended CWB index value, if `tidx` is ‘e’
`cwb.g`	generalized CWB index value, if `tidx` is ‘g’

Author(s)

Zeynel Cebeci

References

Rezaee, M. R., Lelieveldt, B. P. & Reiber, J. H. (1998). A new cluster validity index for the fuzzy c-mean. Pattern Recognition Letters, 19(3):237-246. <doi:10.1016/S0167-8655(97)00168-2>

Examples

# Load the dataset iris
data(iris)
x <- iris[,1:4]

# Run FCM algorithm in the package ppclust 
res.fcm <- ppclust::fcm(x, centers=3)

# Compute the CWB index using res.fcm, which is a ppclust object
idx <- cwb(res.fcm)
print(idx)
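
# Compute the CWB index using the X, U and V matrices.
# m is passed explicitly because x is not a ppclust object here;
# m=2 is assumed to match the fuzzy exponent used by fcm() above.
idx <- cwb(res.fcm$x, res.fcm$u, res.fcm$v, m=2)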
print(idx)

# Run UPFC algorithm of the package ppclust 
res.upfc <- ppclust::upfc(x, centers=3)
# Compute the generalized CWB index using res.upfc, which is a ppclust object
idx <- cwb(res.upfc, tidx="g")
print(idx)

zcebeci/fcvalid documentation built on Oct. 4, 2022, 9:01 p.m.