Prototype-Based Partitions of Clusterings
Description
Compute prototype-based partitions of a cluster ensemble by minimizing ∑ w_b u_{bj}^m d(x_b, p_j)^e, the sum of the case-weighted and membership-weighted e-th powers of the dissimilarities between the elements x_b of the ensemble and the prototypes p_j, for suitable dissimilarities d and exponents e.
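Written out in display form, the criterion reads as follows. The upper limit B (the number of ensemble elements) and the constraints on the memberships are the usual fuzzy c-means conventions and are assumed here rather than stated above.

```latex
\min_{u,\,p} \; \sum_{b=1}^{B} \sum_{j=1}^{k} w_b \, u_{bj}^{m} \, d(x_b, p_j)^{e},
\qquad u_{bj} \ge 0, \quad \sum_{j=1}^{k} u_{bj} = 1 \ \text{for each } b .
```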
Usage
cl_pclust(x, k, method = NULL, m = 1, weights = 1,
          control = list())
Arguments
x 
an ensemble of partitions or hierarchies, or something coercible to that (see cl_ensemble).
k 
an integer giving the number of classes to be used in the partition. 
method 
the consensus method to be employed, see cl_consensus.
m 
a number not less than 1 controlling the softness of the partition (as the “fuzzification parameter” of the fuzzy c-means algorithm). The default value of 1 corresponds to hard partitions obtained from a generalized k-means problem; values greater than one give partitions of increasing softness obtained from a generalized fuzzy c-means problem.
weights 
a numeric vector of non-negative case weights. Recycled to the number of elements in the ensemble given by x.
control 
a list of control parameters. See Details. 
Details
Partitioning is performed using pclust via a family constructed from method. The dissimilarities d and exponent e are implied by the consensus method employed, and inferred via a registration mechanism currently only made available to built-in consensus methods. The default methods compute Least Squares Euclidean consensus clusterings, i.e., use Euclidean dissimilarity d and e = 2.
For m = 1, the partitioning procedure was introduced by Gaul and Schader (1988) for “Clusterwise Aggregation of Relations” (with the same domains), containing equivalence relations, i.e., hard partitions, as a special case.
Available control parameters are as for pclust.
The fixed-point approach employed is a heuristic which cannot be guaranteed to find the global minimum (this is already true for the computation of consensus clusterings). Standard practice is to use the best solution found in “sufficiently many” replications of the base algorithm.
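The fixed-point idea and the advice on replications can be illustrated with a small base R sketch. This is a toy under stated assumptions, not the implementation used by pclust: points in the plane stand in for the ensemble elements, d is Euclidean distance and e = 2 (so the optimal prototypes are weighted means), and the function name toy_pclust and its maxiter and reltol arguments are hypothetical. The membership update for m > 1 is the standard fuzzy c-means formula (Bezdek, 1981).

```r
## Toy sketch (hypothetical helper, not part of any package):
## alternate between updating memberships and prototypes.
toy_pclust <- function(x, k, m = 1, weights = 1, maxiter = 100, reltol = 1e-8) {
    x <- as.matrix(x)
    n <- nrow(x)
    w <- rep_len(weights, n)                  # recycle case weights
    p <- x[sample.int(n, k), , drop = FALSE]  # random initial prototypes
    old <- Inf
    for (iter in seq_len(maxiter)) {
        ## Squared distances d(x_b, p_j)^2 as an n x k matrix.
        d2 <- vapply(seq_len(k),
                     function(j) rowSums(sweep(x, 2, p[j, ])^2),
                     numeric(n))
        ## Membership step for fixed prototypes.
        if (m == 1) {
            ## Hard case: all membership goes to the nearest prototype.
            u <- matrix(0, n, k)
            u[cbind(seq_len(n), max.col(-d2, ties.method = "first"))] <- 1
        } else {
            ## Soft case: u_bj proportional to (d^2)^(-1 / (m - 1)).
            u <- pmax(d2, .Machine$double.eps)^(-1 / (m - 1))
            u <- u / rowSums(u)
        }
        value <- sum(w * u^m * d2)            # criterion for current u and p
        if (old - value <= reltol * (abs(value) + reltol)) break
        old <- value
        ## Prototype step for fixed memberships: weighted means
        ## with weights w_b * u_bj^m.
        wu <- w * u^m
        cs <- colSums(wu)
        keep <- cs > 0                        # leave empty classes unchanged
        p[keep, ] <- (crossprod(wu, x) / cs)[keep, , drop = FALSE]
    }
    list(prototypes = p, membership = u, value = value)
}

## Replicate from different random starts and keep the best solution.
set.seed(1)
x <- rbind(matrix(rnorm(40, mean = 0, sd = 0.3), ncol = 2),
           matrix(rnorm(40, mean = 3, sd = 0.3), ncol = 2),
           cbind(rnorm(20, 0, 0.3), rnorm(20, 3, 0.3)))
runs <- lapply(1:10, function(i) toy_pclust(x, k = 3, m = 1))
values <- vapply(runs, `[[`, numeric(1), "value")
best <- runs[[which.min(values)]]
soft <- toy_pclust(x, k = 3, m = 2)
```

Comparing the entries of values shows whether different starts reach different local minima; best is the run with the smallest criterion value, and soft shows that m = 2 yields fractional memberships rather than 0/1 assignments.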
Value
An object of class "cl_partition" representing the obtained “secondary” partition by an object of class "cl_pclust", which is a list containing at least the following components.
prototypes 
a cluster ensemble with the k prototypes. 
membership 
an object of class "cl_membership" with the membership values u_{bj}.
cluster 
the class ids of the nearest hard partition. 
silhouette 
Silhouette information for the partition, see silhouette.
validity 
precomputed validity measures for the partition. 
m 
the softness control argument. 
call 
the matched call. 
d 
the dissimilarity function d = d(x, p) employed. 
e 
the exponent e employed. 
References
J. C. Bezdek (1981). Pattern recognition with fuzzy objective function algorithms. New York: Plenum.
W. Gaul and M. Schader (1988). Clusterwise aggregation of relations. Applied Stochastic Models and Data Analysis, 4:273–282.
Examples
## Use a precomputed ensemble of 50 k-means partitions of the
## Cassini data.
data("CKME")
CKME <- CKME[1 : 30] # for saving precious time ...
diss <- cl_dissimilarity(CKME)
hc <- hclust(diss)
plot(hc)
## This suggests using a partition with three classes, which can be
## obtained using cutree(hc, 3). Could use cl_consensus() to compute
## prototypes as the least squares consensus clusterings of the classes,
## or alternatively:
set.seed(123)
x1 <- cl_pclust(CKME, 3, m = 1)
x2 <- cl_pclust(CKME, 3, m = 2)
## Agreement of solutions.
cl_dissimilarity(x1, x2)
table(cl_class_ids(x1), cl_class_ids(x2))
