batchsom.dist: Fit a Self-Organising Map to dissimilarity data
In yasomi: Yet Another Self Organising Map Implementation


batchsom.dist is used to fit a Self-Organising Map to dissimilarity data.

## S3 method for class 'dist'
batchsom(data, somgrid, init=c("pca","random"),
                prototypes,weights,
                mode = c("continuous","stepwise"), min.radius, max.radius,
                steps, decrease = c("power", "linear"), max.iter,
                kernel = c("gaussian", "linear"), normalised,
                assignment = c("single", "heskes"), cut = 1e-07,
                verbose = FALSE, keepdata = TRUE, ...)

`data`	the data to which the SOM will be fitted represented by a dissimilarity matrix (an object of class `"dist"` from the `proxy` package) with all pairwise dissimilarities between the observations.
`somgrid`	an object of class `"somgrid"` that specifies the prior structure of the Self-Organising Map: see `somgrid`
`prototypes`	a matrix of initial values for the prototypes. It contains linear coefficients that describe the prototypes as virtual linear combinations of the initial data points. It therefore has one row for each prototype (as specified by the prior structure `somgrid`) and as many columns as there are data points. If missing, it is chosen via the method specified by the `init` parameter (see details)
`init`	the initialisation method (see details)
`weights`	optional weights for the data points
`mode`	annealing mode. With `"continuous"` (the default), the standard annealing strategy for SOM is used: the influence of neighbours changes at each epoch of the algorithm, from `max.radius` to `min.radius` in exactly `steps` steps. With `"stepwise"`, the algorithm performs several epochs (at most `max.iter`) for each of the `steps` radii (from `max.radius` to `min.radius`) and changes the neighbours' influence only when the classification remains stable from one epoch to the next. The `max.iter` parameter provides a safeguard against cycling behaviour.
`min.radius`	the minimum neighbourhood influence radius. If missing, the value depends on the chosen `kernel` but in practice ensures purely local learning (see details)
`max.radius`	the maximal neighbourhood influence radius. If missing, two thirds of the prior structure diameter plus one
`steps`	the number of radii to use during annealing
`decrease`	the radii generating formula (`"power"` or `"linear"`), i.e., the way the `steps` radii are generated from the extremal values given by `min.radius` and `max.radius`
`max.iter`	maximal number of epochs for one radius in the `"stepwise"` annealing mode (defaults to 75)
`kernel`	the kernel used to transform distances in the prior structure into influence coefficients
`normalised`	switch for normalising the neighbouring interactions. Has no influence with the `"single"` assignment method
`assignment`	the assignment method used to compute the best matching unit (BMU) of an observation during training. With `"single"` (the default), the standard BMU calculation is used: the best unit for an observation is the one whose prototype is closest to this observation. With `"heskes"`, Tom Heskes' variant is used: a weighted fit of all the prototypes to an observation is used to compute the best unit. The rationale is that the BMU's prototype and its neighbouring units' prototypes must be close to the observation.
`cut`	minimal value below which neighbouring interactions are not taken into account
`verbose`	switch for tracing the fitting process
`keepdata`	if `TRUE`, the original data are returned as part of the result object
`...`	additional arguments to be passed to the initialisation method
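
The interplay between `min.radius`, `max.radius`, `steps`, `decrease`, `kernel` and `cut` can be illustrated with a small base R sketch. The helpers `radii.sketch` and `kernel.sketch` are hypothetical and are not part of yasomi; the interpolation and kernel formulas below are assumptions chosen for illustration, and the package may use different ones.

```r
# Hypothetical helper (not a yasomi function): generate `steps` radii
# going from max.radius down to min.radius.
radii.sketch <- function(min.radius, max.radius, steps,
                         decrease = c("power", "linear")) {
  decrease <- match.arg(decrease)
  u <- seq(0, 1, length.out = steps)
  if (decrease == "linear") {
    # assumed form: arithmetic interpolation between the two extremes
    max.radius + u * (min.radius - max.radius)
  } else {
    # assumed form: geometric interpolation between the two extremes
    max.radius * (min.radius / max.radius)^u
  }
}

# Hypothetical helper (not a yasomi function): turn grid distances `d`
# into influence coefficients for a given radius.
kernel.sketch <- function(d, radius, kernel = c("gaussian", "linear"),
                          cut = 1e-07) {
  kernel <- match.arg(kernel)
  h <- if (kernel == "gaussian") {
    exp(-(d / radius)^2)        # assumed Gaussian form
  } else {
    pmax(0, 1 - d / radius)     # assumed linear (triangular) form
  }
  h[h < cut] <- 0               # interactions below `cut` are ignored
  h
}

radii.linear <- radii.sketch(1, 10, steps = 5, decrease = "linear")
radii.power <- radii.sketch(1, 10, steps = 5, decrease = "power")
influence <- kernel.sketch(c(0, 1, 3), radius = 2, kernel = "linear")
```

Under these assumptions both schedules start at `max.radius` and end at `min.radius`; the power schedule simply spends more epochs at small radii.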

This function implements the relational Self-Organising Map algorithm, in which virtual linear combinations of the original data are used to represent the prototypes. If the initial value of the prototypes is not provided, it is obtained by a call to a function specified by the init parameter. If its value is "pca", the prototypes are obtained by a call to sominit.pca.dist (this is also the case when init is not specified), while sominit.random.dist is called when init is "random". In both cases, the additional parameters submitted to the method are transmitted to the initialisation method.
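
To make the idea of virtual linear combinations concrete, here is a minimal base R sketch of the usual relational distance formula: for a prototype with coefficients alpha (summing to one) and a matrix D2 of squared dissimilarities, the squared dissimilarity between observation i and the prototype is (D2 alpha)_i - 0.5 * alpha' D2 alpha. The sketch uses Euclidean data so that the result can be compared with an explicit computation. The names `alpha`, `rel.d2` and `bmu` are illustrative only, and the code is not taken from the yasomi sources.

```r
# Euclidean toy data, so the relational formula can be checked directly
X <- scale(as.matrix(iris[1:4]))
D2 <- as.matrix(dist(X))^2           # squared pairwise dissimilarities

# three illustrative prototypes: one row of coefficients per prototype,
# one column per observation, each row summing to one
set.seed(1)
alpha <- matrix(runif(3 * nrow(X)), nrow = 3)
alpha <- alpha / rowSums(alpha)

# relational squared distance between each observation and each prototype
cross <- D2 %*% t(alpha)                         # n x 3: (D2 alpha_k)_i
self <- 0.5 * rowSums((alpha %*% D2) * alpha)    # 0.5 * alpha_k' D2 alpha_k
rel.d2 <- sweep(cross, 2, self)                  # subtract self[k] from column k

# explicit computation, only possible because the data are Euclidean here
P <- alpha %*% X                                 # prototype coordinates
explicit.d2 <- sapply(1:3, function(k) rowSums(sweep(X, 2, P[k, ])^2))

# "single" assignment: the BMU is the unit with the closest prototype
bmu <- apply(rel.d2, 1, which.min)
```

With a general dissimilarity matrix only the relational part of the computation is available, since no coordinates exist for the prototypes.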

An object of class "som" and of class "relationalsom", a list with components including

`somgrid`	as in the arguments to `batchsom`
`prototypes`	a matrix containing the virtual coordinates of the prototypes: each row of the matrix sums to one and can be interpreted as the coefficients of a linear combination of the original observations.
`classif`	a vector of integers indicating to which unit each observation has been assigned
`errors`	a vector containing the evolution of the quantisation error during the fitting process
`control`	a list containing all the parameters used to fit the SOM
`data`	the original data if the function is called with `keepdata = TRUE`
`weights`	the weights of the data points if the function is called with `keepdata = TRUE` and if the `weights` argument is given
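
As a rough illustration of how these components can be read, the sketch below builds a hypothetical stand-in list with the same component names (it is not produced by `batchsom`, and the numbers are made up for the example) and extracts a few summaries from it.

```r
# Hypothetical stand-in for a fitted object: 4 units, 6 observations.
# The values are invented for illustration only.
fit <- list(
  prototypes = rbind(c(0.5, 0.5, 0, 0, 0, 0),
                     c(0, 0, 1, 0, 0, 0),
                     c(0, 0, 0, 0.25, 0.25, 0.5),
                     rep(1, 6) / 6),
  classif = c(1L, 1L, 2L, 3L, 3L, 3L),
  errors = c(2.1, 1.4, 1.1)
)

# each prototype row holds the coefficients of a virtual combination
coef.sums <- rowSums(fit$prototypes)

# number of observations per unit; tabulate() keeps empty units,
# which table() would silently drop
unit.sizes <- tabulate(fit$classif, nbins = nrow(fit$prototypes))

# observations assigned to unit 3
members <- which(fit$classif == 3L)

# last recorded quantisation error
final.error <- fit$errors[length(fit$errors)]
```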

Fabrice Rossi

See sominit.pca.dist and sominit.random.dist for some control on the initial configuration of the prototypes, som.tune for the optimisation of some magic parameters (such as the radii), umatrix and distance.grid for visual analysis of the distances between the prototypes.

library(yasomi)
data(iris)
# scaling and dissimilarity computation
data <- dist(scale(iris[1:4]))

# a small hexagonal grid
sg <- somgrid(xdim=7,ydim=7,topo="hex")

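# fit the SOM (random initialisation); method is not an argument of
# batchsom itself and is passed on to sominit.random.dist through ...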
som <- batchsom(data,sg,init="random",method="cluster")

# and display the umatrix
umatrix(som)