batchkmeans: Generic K-means Clustering


Description

Generic function to perform K-means clustering on some data.

Usage

batchkmeans(data, ncenters, init = c("prototypes", "random", "cluster"),
prototypes, weights, max.iter, verbose = FALSE, keepdata = TRUE, ...)

Arguments

data

the data to cluster. Acceptable data types depend on the available methods (see details)

ncenters

the number of clusters

init

the initialisation method (see details)

prototypes

Initial values for the prototypes (the exact representation of the prototypes depends on the data type). If missing, initial prototypes are chosen via the method specified by the init parameter (see details)

weights

optional weights for the data points

max.iter

maximal number of iterations of the algorithm

verbose

switch for tracing the clustering process

keepdata

if TRUE, the original data are returned as part of the result object

...

additional arguments to be passed to methods

Details

In yasomi, the batchkmeans generic function is implemented by two methods, which provide K-means for two distinct data representations.

If the initial value of prototypes is not provided, it is obtained by one of the following methods, as specified by the init parameter:

"prototypes"

the standard method randomly chooses a subset of the data of the requested size (with repetition if the requested size is larger than the data size). If the weights parameter is given, the probability of choosing a data point is proportional to its weight.

"random"

the "random" method generates prototypes randomly and uniformly in the hypercube spanned by the data for standard Euclidean data. For dissimilarity data or for kernel data, the method generates prototypes via random convex combinations of the data points. In all cases, the optional weights are not taken into account by this method.

"cluster"

the clustering initialisation method builds a random partition of the data into balanced clusters and uses the centres of mass of those clusters as initial prototypes. The optional weights are not taken into account for balancing the clusters.
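
The three initialisation strategies can be illustrated for standard Euclidean data with the base R sketch below. It is not yasomi's code: the helper names (init_from_data, init_uniform, init_partition) are hypothetical, the data are assumed to be a numeric matrix with one observation per row, and init_partition assumes at least as many observations as clusters.

```r
# Illustrative base R sketches; these are NOT yasomi's internal functions.

# "prototypes": draw ncenters data points, optionally favouring heavy points.
init_from_data <- function(data, ncenters, weights = NULL) {
  n <- nrow(data)
  # Sample with replacement only when more prototypes than points are requested.
  idx <- sample.int(n, size = ncenters, replace = ncenters > n, prob = weights)
  data[idx, , drop = FALSE]
}

# "random": uniform draws in the bounding hypercube of the data.
init_uniform <- function(data, ncenters) {
  lo <- apply(data, 2, min)
  hi <- apply(data, 2, max)
  d <- ncol(data)
  u <- matrix(runif(ncenters * d), nrow = ncenters, ncol = d)
  # Rescale column j from [0, 1] to [lo[j], hi[j]].
  sweep(sweep(u, 2, hi - lo, `*`), 2, lo, `+`)
}

# "cluster": random balanced partition; prototypes are the cluster means.
init_partition <- function(data, ncenters) {
  n <- nrow(data)
  # rep_len() yields labels whose counts differ by at most one;
  # sample() shuffles which observation receives which label.
  labels <- sample(rep_len(seq_len(ncenters), n))
  # rowsum() adds up the rows of each group; dividing by the sizes gives means.
  rowsum(data, labels) / as.vector(table(labels))
}

set.seed(1)
x <- as.matrix(iris[, 1:4])
p1 <- init_from_data(x, 3)
p2 <- init_uniform(x, 3)
p3 <- init_partition(x, 3)
```

Each call returns a matrix with one prototype per row. Only init_from_data accepts weights, which matches the description above: the other two strategies ignore them.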

Value

An object of class "batchkmeans", a list with components including

prototypes

a representation of the prototypes that depends on the actual method

classif

a vector of integers indicating the cluster to which each observation has been assigned

errors

a vector containing the evolution of the quantisation error during the fitting process

data

the original data if the function is called with keepdata = TRUE

weights

the weights of the data points if the function is called with keepdata = TRUE and if weights were given

The list will generally contain additional components specific to each implementation. The returned object will also generally have another class more specific than "batchkmeans".
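
To make the prototypes, classif and errors components concrete, here is a minimal batch K-means loop in base R. It is a sketch under stated assumptions, not the yasomi implementation: simple_batch_kmeans is a hypothetical helper, the data are an unweighted numeric matrix, and the quantisation error is taken to be the mean squared Euclidean distance to the assigned prototype, which this page does not spell out.

```r
# Hypothetical helper, not part of yasomi: plain batch K-means on a numeric matrix.
simple_batch_kmeans <- function(data, prototypes, max.iter = 50) {
  k <- nrow(prototypes)
  errors <- numeric(0)
  classif <- integer(nrow(data))
  for (iter in seq_len(max.iter)) {
    # Squared Euclidean distances: observations in rows, prototypes in columns.
    d2 <- outer(rowSums(data^2), rowSums(prototypes^2), `+`) -
      2 * tcrossprod(data, prototypes)
    # Assignment step: index of the closest prototype for each observation.
    new_classif <- max.col(-d2, ties.method = "first")
    # Assumed error measure: mean squared distance to the assigned prototype.
    errors <- c(errors, mean(d2[cbind(seq_len(nrow(data)), new_classif)]))
    # Stop once the assignments no longer change.
    if (iter > 1 && all(new_classif == classif)) break
    classif <- new_classif
    # Update step: each prototype becomes the mean of its cluster;
    # an empty cluster keeps its previous prototype.
    for (j in seq_len(k)) {
      members <- classif == j
      if (any(members)) {
        prototypes[j, ] <- colMeans(data[members, , drop = FALSE])
      }
    }
  }
  list(prototypes = prototypes, classif = classif, errors = errors)
}

set.seed(42)
x <- as.matrix(iris[, 1:4])
fit <- simple_batch_kmeans(x, prototypes = x[sample(nrow(x), 3), ])
table(fit$classif)
```

In this sketch, errors holds one value per iteration and never increases, because both the assignment step and the update step can only lower the mean squared distance.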

Author(s)

Fabrice Rossi

See Also

See batchsom for the Self-Organising Map, which provides both clustering and visualisation.


yasomi documentation built on May 2, 2019, 5:59 p.m.