Description Usage Arguments Details Value Author(s) References See Also Examples
Partitions a numeric data set by using the Gustafson-Kessel (GK) algorithm within the PFCM (Possibilistic Fuzzy C-Means) clustering algorithm (Ojeda-Magaina et al, 2006).
1 2 3 4 |
x |
a numeric vector, data frame or matrix. |
centers |
an integer specifying the number of clusters or a numeric matrix containing the initial cluster centers. |
memberships |
a numeric matrix containing the initial membership degrees. If missing, it is internally generated. |
m |
a number greater than 1 to be used as the fuzziness exponent or fuzzifier. The default is 2. |
eta |
a number greater than 1 to be used as the typicality exponent. The default is 2. |
a |
a number for the relative importance of the fuzzy part of the objective function. The default is 1. |
b |
a number for the relative importance of the possibilistic part of the objective function. The default is 1. |
K |
a number greater than 0 to be used as the weight of penalty term. The default is 1. |
omega |
a numeric vector of reference distances. If missing, it is internally generated. |
dmetric |
a string for the distance metric. The default is sqeuclidean for the squared Euclidean distances. See |
pw |
a number for the power of Minkowski distance calculation. The default is 2 if the |
alginitv |
a string for the initialization of cluster prototypes matrix. The default is kmpp for K-means++ initialization method (Arthur & Vassilvitskii, 2007). For the list of alternative options see |
alginitu |
a string for the initialization of memberships degrees matrix. The default is imembrand for random sampling of initial membership degrees. |
nstart |
an integer for the number of starts for clustering. The default is 1. |
iter.max |
an integer for the maximum number of iterations allowed. The default is 1000. |
con.val |
a number for the convergence value between the iterations. The default is 1e-09. |
fixcent |
a logical flag to make the initial cluster centers not changed along the different starts of the algorithm. The default is |
fixmemb |
a logical flag to make the initial membership degrees not changed along the different starts of the algorithm. The default is |
stand |
a logical flag to standardize data. Its default value is |
numseed |
a seeding number to set the seed of R's random number generator. |
Gustafson-Kessel clustering within Possibilistic Fuzzy C-Means (GKPFCM) algorithm is an improvement for PFCM algorithm that consists of the modification of the distance metric for d_{ij_A}. The original PFCM uses the Euclidean distance whereas GKPFCM uses the Mahalanobis distance with GK algorithm. Babuska et al (2002) have proposed an improvement for calculating the covariance matrix \mathbf{F}_j as follows:
\mathbf{F}_j := (1 - γ) \mathbf{F}_j + γ (\mathbf{F}_0)^{1/n} \mathbf{I}
In the above equation, \mathbf{F}_j involves a weighted identity matrix. The eigenvalues λ_{ij} and the eigenvectors Φ_{ij} of the resulting matrix are calculated, and the maximum eigenvalue λ_{i,max} = max_{j}/ λ_{ij} is determined. With the obtained results, λ_{i,max} = λ_{ij}/β, \forall j, which satisfies λ_{i,max} / λ_{ij} ≥q β is calculated. Finally, \mathbf{F}_j is recomputed with the following equation:
\mathbf{F}_j = [Φ_{j,1},…, Φ_{j,n}] diag(λ_{j,1}, …, λ_{j,n}) [Φ_{j,1},…, Φ_{j,n}]^{-1} \;\; \forall j
The objective function of GKPFCM is:
J_{GK}(\mathbf{X}; \mathbf{V}, \mathbf{A}, \mathbf{U}) = ∑\limits_{i=1}^n ∑\limits_{j=1}^k u_{ij}^m d_{A_j}(\vec{x}_i, \vec{v}_j)
m is the fuzzifier to specify the amount of fuzziness for the clustering; 1≤q m≤q ∞. It is usually chosen as 2.
η is the typicality exponent to specify the amount of typicality for the clustering; 1≤q η≤q ∞. It is usually chosen as 2.
The objective function J_{GKPFCM} is minimized by using the following update equations:
u_{ij} =\Bigg[∑\limits_{j=1}^k \Big(\frac{d_{A_j}(\vec{x}_i, \vec{v}_j)}{d_{A_j}(\vec{x}_i, \vec{v}_l)}\Big)^{2/(m-1)} \Bigg]^{-1} \;\;; 1≤q i ≤q n \;,\; 1 ≤q l ≤q k
t_{ij} =\Bigg[∑\limits_{j=1}^k \Big(\frac{d_{A_j}(\vec{x}_i, \vec{v}_j))}{d_{A_j}(\vec{x}_i, \vec{v}_l))}\Big)^{2/(η-1)} \Bigg]^{-1} \;;\; 1 ≤q i ≤q n \;,\; 1 ≤q l ≤q k
\vec{v}_{j} =\frac{∑\limits_{i=1}^n (u_{ij}^m + t_{ij}^η) \vec{x}_i}{∑\limits_{i=1}^n (u_{ij}^m + t_{ij}^η)} \;\;; {1≤q j≤q k}
an object of class ‘ppclust’, which is a list consists of the following items:
x |
a numeric matrix containing the processed data set. |
v |
a numeric matrix containing the final cluster prototypes (centers of clusters). |
u |
a numeric matrix containing the fuzzy memberships degrees of the data objects. |
d |
a numeric matrix containing the distances of objects to the final cluster prototypes. |
k |
an integer for the number of clusters. |
m |
a number for the fuzzifier. |
eta |
a number greater than 1 to be used as the typicality exponent. |
a |
a number for the relative importance of the fuzzy part of the objective function. |
b |
a number for the relative importance of the possibilistic part of the objective function. |
omega |
a numeric vector of reference distances. |
cluster |
a numeric vector containing the cluster labels found by defuzzying the fuzzy membership degrees of the objects. |
csize |
a numeric vector containing the number of objects in the clusters. |
iter |
an integer vector for the number of iterations in each start of the algorithm. |
best.start |
an integer for the index of start that produced the minimum objective functional. |
func.val |
a numeric vector for the objective function values in each start of the algorithm. |
comp.time |
a numeric vector for the execution time in each start of the algorithm. |
stand |
a logical value, |
wss |
a number for the within-cluster sum of squares for each cluster. |
bwss |
a number for the between-cluster sum of squares. |
tss |
a number for the total within-cluster sum of squares. |
twss |
a number for the total sum of squares. |
algorithm |
a string for the name of partitioning algorithm. It is ‘FCM’ with this function. |
call |
a string for the matched function call generating this ‘ppclust’ object. |
Zeynel Cebeci & Cagatay Cebeci
Arthur, D. & Vassilvitskii, S. (2007). K-means++: The advantages of careful seeding, in Proc. of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms, p. 1027-1035. <http://ilpubs.stanford.edu:8090/778/1/2006-13.pdf>
Gustafson, D. E. & Kessel, W. C. (1979). Fuzzy clustering with a fuzzy covariance matrix. In Proc. of IEEE Conf. on Decision and Control including the 17th Symposium on Adaptive Processes, San Diego. pp. 761-766. <doi:10.1109/CDC.1978.268028>
Babuska, R., van der Veen, P. J. & Kaymak, U. (2002). Improved covariance estimation for Gustafson-Kessel clustering. In Proc. of Int. Conf. on Fuzzy Systems, Hawaii, 2002, pp. 1081-1085. <https://tr.scribd.com/document/209211977/Fuzzy-and-Neural-Control>.
Ojeda-Magaina, B., Ruelas, R., Corona-Nakamura, M. A. & Andina, D. (2006). An improvement to the possibilistic fuzzy c-means clustering algorithm. In Proc. of IEEE World Automation Congress, 2006 (WAC'06). pp. 1-8. <doi:10.1109/WAC.2006.376056>
ekm
,
fcm
,
fcm2
,
fpcm
,
fpppcm
,
gg
,
gk
,
hcm
,
pca
,
pcm
,
pcmr
,
pfcm
,
upfc
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 | ## Not run:
# Load dataset iris
data(iris)
x <- iris[,-5]
# Initialize the prototype matrix using K-means++
v <- inaparc::kmpp(x, k=3)$v
# Initialize the memberships degrees matrix
u <- inaparc::imembrand(nrow(x), k=3)$u
# Run FCM with the initial prototypes and memberships
gkpfcm.res <- gkpfcm(x, centers=v, memberships=u, m=2)
# Show the fuzzy membership degrees for the top 5 objects
head(gkpfcm.res$u, 5)
## End(Not run)
|
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.