Gustafson Kessel Improved Covariance Estimation

Description

This function performs Gustafson Kessel clustering on the dataset X.

Usage

fuzzy.GK(X, K = 2, m = 1.5, max.iteration = 100, threshold = 10^-5,
  RandomNumber = 0, rho = rep(1, K), gamma = 0)

Arguments

X

data frame n x p

K

number of clusters (must be > 1)

m

fuzzifier / degree of fuzziness

max.iteration

maximum number of iterations allowed to reach convergence

threshold

threshold of convergence

RandomNumber

seed used to generate the initial fuzzy membership matrix

rho

cluster volumes (one value per cluster; defaults to 1 for every cluster)

gamma

tuning parameter for the covariance estimate (proportion of the data covariance blended into each cluster covariance)

Details

This function performs the Gustafson Kessel (1978) extension of the Fuzzy C-Means algorithm, as improved by Babuska et al. (2002). Gustafson Kessel (GK) is a fuzzy clustering method that partitions a dataset into K clusters. The number of clusters (K) must be greater than 1. The parameter m controls the overlap, or fuzziness, of the clusters and must be specified. The maximum number of iterations and the threshold determine when the algorithm stops. RandomNumber is the seed used to generate the initial fuzzy membership matrix.
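
The seeded start described above can be sketched in a few lines of base R. This is only an illustration: `init_membership` is a hypothetical helper, and the package's own initialisation may differ in detail. The one property assumed here is that each row of the membership matrix sums to 1.

```r
# Hypothetical helper, not part of the package: draw a random fuzzy
# membership matrix (n rows, K columns) whose rows each sum to 1.
init_membership <- function(n, K, seed = 0) {
  set.seed(seed)                       # reproducible start, like RandomNumber
  U <- matrix(runif(n * K), nrow = n, ncol = K)
  U / rowSums(U)                       # normalise every row to sum to 1
}

U0 <- init_membership(n = 150, K = 3, seed = 1234)
```

Calling the helper twice with the same seed returns the same matrix, which is why fixing the seed makes a clustering run repeatable.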

Clustering produces a fuzzy membership matrix (U) and fuzzy cluster centroids (V). Each data point is assigned the label of the cluster in which its membership is greatest. The centroids, or cluster centers, can be used to interpret the clusters. Both the memberships and the centroids are computed from the distances between data points and centroids. Gustafson Kessel measures distance with a norm induced by each cluster's covariance matrix, so clusters can have both spherical and ellipsoidal shapes.
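
To make the covariance-induced distance concrete, here is a minimal sketch assuming the usual GK form, in which the norm-inducing matrix is (rho * det(F))^(1/p) times the inverse of the cluster covariance F. The helper `gk_distance` is hypothetical and is not the package's internal code; in the usage lines the ordinary sample covariance stands in for a fuzzy cluster covariance.

```r
# Hypothetical helper (not part of the package): squared GK distance from
# every row of X to one cluster centre v, given that cluster's covariance
# matrix F_i and volume rho_i.
gk_distance <- function(X, v, F_i, rho_i = 1) {
  X <- as.matrix(X)
  p <- ncol(X)
  # Norm-inducing matrix A = (rho * det(F))^(1/p) * F^{-1}
  A <- (rho_i * det(F_i))^(1 / p) * solve(F_i)
  centred <- sweep(X, 2, v)            # subtract the centre from each row
  rowSums((centred %*% A) * centred)   # (x - v)' A (x - v), one value per row
}

# Assumption for illustration: one "cluster" covering all of iris, with the
# sample mean as centre and the sample covariance as F_i.
X  <- as.matrix(iris[, 1:4])
d2 <- gk_distance(X, v = colMeans(X), F_i = cov(X))
```

With an identity covariance and unit volume the result reduces to the squared Euclidean distance, which corresponds to the spherical case; any other covariance stretches the distance along the cluster's principal axes and gives ellipsoidal clusters.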

Babuska et al. improve the covariance estimation by tuning each cluster covariance with the covariance of the data. The tuning parameter (gamma) determines the proportions of the data covariance and the cluster covariance used to estimate the new cluster covariance. Besides this tuning, Babuska et al. improve the algorithm with a decomposition of the covariance matrix so that it becomes non-singular.
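
The two improvements can be sketched as follows. This is an illustration under stated assumptions, not the package's implementation: `improve_covariance` is a hypothetical helper, the blend uses a scaled identity built from the determinant of the data covariance (the form usually given for Babuska et al., 2002), and `beta` is an assumed cap on the condition number that is not an argument of `fuzzy.GK`.

```r
# Hypothetical helper, not part of the package. beta is an assumed
# condition-number cap and is NOT an argument of fuzzy.GK.
improve_covariance <- function(F_i, F_data, gamma = 0, beta = 1e15) {
  p <- ncol(F_i)
  # Step 1: blend the cluster covariance with a scaled identity derived
  # from the covariance of the whole data set.
  F_new <- (1 - gamma) * F_i + gamma * det(F_data)^(1 / p) * diag(p)
  # Step 2: eigen-decompose, raise eigenvalues that are too small relative
  # to the largest one, then rebuild the matrix.
  eig <- eigen(F_new, symmetric = TRUE)
  lambda <- eig$values
  floor_value <- max(lambda) / beta
  lambda[lambda < floor_value] <- floor_value
  eig$vectors %*% diag(lambda, nrow = p) %*% t(eig$vectors)
}

# A singular 2 x 2 covariance becomes invertible after the correction.
S <- matrix(1, 2, 2)
R <- improve_covariance(S, F_data = diag(2), gamma = 0, beta = 1e3)
```

With gamma = 0 only the eigenvalue correction applies; with gamma = 1 the cluster covariance is replaced entirely by the scaled identity, which pushes the cluster toward a spherical shape.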

Value

func.obj the computed value of the objective function

U n x K matrix of fuzzy memberships

V K x p matrix of fuzzy centroids

D n x K matrix of the computed distances from each data point to each centroid

Clust.desc cluster description (the dataset with an additional column holding the cluster label)
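
As an illustration of how a cluster label follows from a membership matrix, the sketch below takes the column with the largest membership in each row. The small matrix `U_toy` is made-up example data, not output from `fuzzy.GK`, and the package may derive its label column differently.

```r
# Made-up 3 x 3 membership matrix (rows = data points, columns = clusters).
U_toy <- matrix(c(0.70, 0.20, 0.10,
                  0.10, 0.30, 0.60,
                  0.25, 0.50, 0.25),
                nrow = 3, byrow = TRUE)

# Hard label = index of the largest membership in each row;
# ties are broken in favour of the first cluster.
labels <- max.col(U_toy, ties.method = "first")
```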

References

Babuska, R., van der Veen, P. J., & Kaymak, U. (2002). Improved Covariance Estimation for Gustafson-Kessel Clustering. IEEE, 1081-1084.

Balasko, B., Abonyi, J., & Feil, B. (2002). Fuzzy Clustering and Data Analysis Toolbox: For Use with Matlab. Veszprem, Hungary.

Gustafson, D. E., & Kessel, W. C. (1978). Fuzzy Clustering With A Fuzzy Covariance Matrix. 761-766.

Examples

library(RcmdrPlugin.FuzzyClust)
data(iris)
cl <- fuzzy.GK(X = iris[, 1:4], K = 3, m = 2, RandomNumber = 1234,
  gamma = 0, max.iteration = 20)