batchsom.dist: Fit a Self-Organising Map to dissimilarity data


Description

batchsom.dist is used to fit a Self-Organising Map to dissimilarity data.

Usage

## S3 method for class 'dist'
batchsom(data, somgrid, init=c("pca","random"),
                prototypes,weights,
                mode = c("continuous","stepwise"), min.radius, max.radius,
                steps, decrease = c("power", "linear"), max.iter,
                kernel = c("gaussian", "linear"), normalised,
                assignment = c("single", "heskes"), cut = 1e-07,
                verbose = FALSE, keepdata = TRUE, ...)

Arguments

data

the data to which the SOM will be fitted, represented by a dissimilarity matrix (an object of class "dist" from the proxy package) containing all pairwise dissimilarities between the observations.

somgrid

an object of class "somgrid" that specifies the prior structure of the Self-Organising Map: see somgrid

prototypes

a matrix of initial values for the prototypes. It contains linear coefficients that describe the prototypes as virtual linear combinations of the initial data points. It therefore has one row for each prototype (as specified by the prior structure somgrid) and as many columns as there are data points. If missing, the initial values are chosen via the method specified by the init parameter (see details)

init

the initialisation method (see details)

weights

optional weights for the data points

mode

annealing mode:

"continuous" (default)

this is the standard annealing strategy for SOM: the influence of neighbours changes at each epoch of the algorithm, going from max.radius to min.radius in exactly as many steps as given by the steps argument.

"stepwise"

in this strategy, the algorithm performs several epochs (at most max.iter) for each of the radii (there are steps of them, from max.radius down to min.radius). The algorithm changes the neighbourhood influence only once the classification remains stable from one epoch to the next. The max.iter parameter provides a safeguard against cycling behaviour.

min.radius

the minimum neighbourhood influence radius. If missing, the value depends on the chosen kernel but ensures in practice that learning is purely local (see details)

max.radius

the maximal neighbourhood influence radius. If missing, it is set to two thirds of the prior structure diameter plus one

steps

the number of radii to use during annealing

decrease

the radii generating formula ("power" or "linear"), i.e., the way the steps radii are generated from the extremal values given by min.radius and max.radius

max.iter

maximal number of epochs for one radius in the "stepwise" annealing mode (defaults to 75)

kernel

the kernel used to transform distances in the prior structure into influence coefficients

normalised

switch for normalising the neighbouring interactions. Has no influence with the "single" assignment method

assignment

the assignment method used to compute the best matching unit (BMU) of an observation during training:

"single" (default)

this is the standard BMU calculation approach, in which the best unit for an observation is the unit whose prototype is closest to that observation

"heskes"

Tom Heskes' variant for the BMU in which a weighted fit of all the prototypes to an observation is used to compute the best unit. The rationale is that the BMU's prototype and its neighbouring units' prototypes must be close to the observation.

cut

minimal value below which neighbouring interactions are not taken into account

verbose

switch for tracing the fitting process

keepdata

if TRUE, the original data are returned as part of the result object

...

additional arguments to be passed to the initialisation method
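
As a rough illustration of how min.radius, max.radius, steps and decrease interact, the sketch below generates a decreasing sequence of radii in base R. It is an assumption about what "power" and "linear" mean (geometric versus arithmetic interpolation between the two extremal values); the formulas actually used by yasomi may differ, and hypothetical.radii is a made-up helper that is not part of the package.

```r
# Illustrative sketch only: the exact formulas used by yasomi may differ.
# hypothetical.radii() is a made-up helper, not a function of the package.
hypothetical.radii <- function(min.radius, max.radius, steps,
                               decrease = c("power", "linear")) {
    decrease <- match.arg(decrease)
    # position of each radius along the schedule, from 0 (start) to 1 (end)
    t <- seq(0, 1, length.out = steps)
    switch(decrease,
           # geometric interpolation: constant ratio between successive radii
           power = max.radius * (min.radius / max.radius)^t,
           # arithmetic interpolation: constant difference between successive radii
           linear = max.radius + t * (min.radius - max.radius))
}

# Both schedules start at max.radius and end at min.radius
hypothetical.radii(1, 8, 4, "power")
hypothetical.radii(1, 8, 4, "linear")
```

Under this assumption, the "power" schedule shrinks the radius quickly at first and spends more steps near min.radius, while the "linear" schedule spreads the radii evenly.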

Details

This function implements the relational Self-Organising Map algorithm, in which virtual linear combinations of the original data are used to represent the prototypes. If the initial value of prototypes is not provided, it is obtained by a call to a function specified by the init parameter. If its value is "pca", prototypes are obtained by a call to sominit.pca.dist (this is also the case when init is not specified), while sominit.random.dist is called when init is "random". In both cases, the additional parameters submitted to the method are transmitted to the initialisation method.
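
To make the idea of "virtual linear combinations" concrete, here is a minimal base R sketch of the standard relational distance identity. It is not yasomi's internal code. It assumes squared Euclidean dissimilarities and a coefficient vector (called alpha here, a name chosen for illustration) that sums to one, and it checks that the distance between each observation and a prototype can be computed from the dissimilarity matrix alone, without ever forming the prototype's coordinates.

```r
# Illustrative sketch (base R only), not the package's implementation.
set.seed(1)
X <- scale(iris[1:4])            # observations with known coordinates
D2 <- as.matrix(dist(X))^2       # squared pairwise dissimilarities

# One hypothetical prototype: coefficients over the observations, summing to one
alpha <- runif(nrow(X))
alpha <- alpha / sum(alpha)

# Relational formula: squared distance from every observation to the prototype,
# computed from the dissimilarities only
Da <- as.vector(D2 %*% alpha)
rel.d2 <- Da - 0.5 * sum(alpha * Da)

# Direct computation, possible here only because coordinates are available
proto <- colSums(X * alpha)      # explicit prototype: sum_j alpha_j * x_j
direct.d2 <- unname(rowSums(sweep(X, 2, proto)^2))

# TRUE when both computations agree up to numerical tolerance
all.equal(rel.d2, direct.d2)
```

With a general dissimilarity (one not derived from Euclidean coordinates), only the relational formula is available, which is why the prototypes are stored as rows of coefficients rather than as points.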

Value

An object of class "som" and of class "relationalsom", a list with components including

somgrid

as in the arguments to batchsom

prototypes

a matrix containing the virtual coordinates of the prototypes: each row of the matrix sums to one and can be interpreted as the coefficients of a linear combination of the original observations.

classif

a vector of integers indicating the unit to which each observation has been assigned

errors

a vector containing the evolution of the quantisation error during the fitting process

control

a list containing all the parameters used to fit the SOM

data

the original data if the function is called with keepdata = TRUE

weights

the weights of the data points if the function is called with keepdata = TRUE and if weights were given

Author(s)

Fabrice Rossi

See Also

See sominit.pca.dist and sominit.random.dist for some control over the initial configuration of the prototypes, som.tune for the optimisation of some magic parameters (such as the radii), and umatrix and distance.grid for visual analysis of the distances between the prototypes.

Examples

library(yasomi)
data(iris)
# scaling and dissimilarity computation
data <- dist(scale(iris[1:4]))

# a small hexagonal grid
sg <- somgrid(xdim=7,ydim=7,topo="hex")

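# fit the SOM (random initialisation; method is not an argument of batchsom,
# so it is passed on through ... to the initialisation function)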
som <- batchsom(data,sg,init="random",method="cluster")

# and display the umatrix
umatrix(som)

yasomi documentation built on May 2, 2019, 5:59 p.m.