sominit.random: Initialise the prototypes of a SOM via some random sample


Description

Initialise the prototypes of a Self-Organising Map in one of three ways: as a random subset of the data, as the centres of mass of the clusters of a random partition of the data, or as random points sampled uniformly in the hypercube spanned by the data.

Usage

sominit.random(data, somgrid, method=c("prototypes","random","cluster"),...)
## Default S3 method:
sominit.random(data, somgrid, method=c("prototypes","random","cluster"),weights,...)
## S3 method for class 'dist'
sominit.random(data, somgrid, method=c("prototypes","random","cluster"),weights,...)
## S3 method for class 'kernelmatrix'
sominit.random(data, somgrid, method=c("prototypes","random","cluster"),weights,...)

Arguments

data

the data to which the SOM will be fitted. This can be, e.g., a matrix or data frame of observations (which should be scaled), a distance matrix or a kernel matrix

somgrid

a somgrid object

method

the initialisation method (see details)

weights

optional weights for the data points

...

additional parameters

Details

There are three methods for generating the initial prototypes:

"prototypes"

the standard method randomly chooses a subset of the data of the requested size (with repetition if the grid size is larger than the data size). If the weights parameter is given, the probability of choosing a data point is proportional to its weight.

"random"

the "random" method generates prototypes randomly and uniformly in the hypercube spanned by the data for standard Euclidean data. For dissimilarity or kernel data, the method generates prototypes via random convex combinations of the data points. In the Euclidean case, the optional weights are not taken into account as they do not modify the definition of the span of the data. In the dissimilarity/kernel case, weights are used to define the prior importance of each observation in the random convex combinations: if the first observation has weight 2 while the second has weight 1, then on average the coefficient of the first observation in random convex combinations will be twice that of the second observation.

"cluster"

the clustering initialisation method builds a random partition of the data into balanced clusters and uses the centres of mass of those clusters as initial prototypes. The optional weights are taken into account for balancing the clusters: the algorithm produces random clusters with approximately identical total weights.
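
For Euclidean data, the three methods above can be sketched in base R as follows. This is only an illustration of the ideas, not the code used by yasomi: the helper names init_prototypes, init_random and init_cluster are hypothetical, and the cluster sketch assumes equal weights (so it balances cluster sizes rather than total weights) and at least as many observations as grid units.

```r
# Hypothetical helpers written in base R; they are NOT part of yasomi.

# "prototypes": sample rows of the data, with replacement only when the
# grid has more units than there are observations. Optional weights are
# passed to sample() as selection probabilities.
init_prototypes <- function(X, n_units, weights = NULL) {
  idx <- sample(nrow(X), n_units, replace = n_units > nrow(X), prob = weights)
  X[idx, , drop = FALSE]
}

# "random": draw each coordinate uniformly between the column minimum and
# maximum, i.e. in the hypercube spanned by the data.
init_random <- function(X, n_units) {
  lo <- apply(X, 2, min)
  hi <- apply(X, 2, max)
  U <- matrix(runif(n_units * ncol(X)), nrow = n_units)
  sweep(sweep(U, 2, hi - lo, "*"), 2, lo, "+")
}

# "cluster" (equal weights assumed): shuffle cluster labels so that the
# groups have nearly equal sizes, then return the mean of each group.
init_cluster <- function(X, n_units) {
  groups <- sample(rep_len(seq_len(n_units), nrow(X)))
  do.call(rbind, lapply(seq_len(n_units), function(k) {
    colMeans(X[groups == k, , drop = FALSE])
  }))
}

set.seed(1)
X <- cbind(rnorm(500), rnorm(500))
n_units <- 7 * 7                      # size of a 7 x 7 grid

P1 <- init_prototypes(X, n_units)     # rows copied from X
P2 <- init_random(X, n_units)         # points inside the bounding box of X
P3 <- init_cluster(X, n_units)        # centres of random balanced clusters
```

Each result has one row per grid unit and one column per variable, which is the shape described under Value.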

Value

A matrix containing appropriate initial prototypes. It is compatible with the SOM prior structure (i.e., it has as many rows as the grid has units) and with the data.

Author(s)

Fabrice Rossi

See Also

sominit.pca for a PCA based initialisation.

Examples

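library(yasomi)

## two-dimensional Gaussian sample
X <- cbind(rnorm(500), rnorm(500))

## 7 x 7 rectangular grid
sg <- somgrid(xdim=7, ydim=7, topo="rect")

## first listed method ("prototypes"): one initial prototype per grid unit
proto <- sominit.random(X, sg)

## data as red crosses, initial prototypes as filled dots
plot(X, pch="+", col="red", xlim=range(X[,1], proto[,1]),
     ylim=range(X[,2], proto[,2]))
points(proto, pch=20)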
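
The weighted random convex combinations described in Details for dissimilarity and kernel data can be illustrated without the package. One way to obtain coefficients whose averages are proportional to the weights is to draw independent gamma variables with shapes equal to the weights and normalise each row (a Dirichlet draw). This is an assumption made for illustration, not necessarily how yasomi draws the coefficients, and random_convex_coefs is a hypothetical helper.

```r
# Hypothetical helper (not part of yasomi): each row holds the
# coefficients of one random convex combination of the observations,
# i.e. non-negative values that sum to one.
random_convex_coefs <- function(n_units, weights) {
  n <- length(weights)
  # column j is filled with gamma draws of shape weights[j]
  G <- matrix(rgamma(n_units * n, shape = rep(weights, each = n_units)),
              nrow = n_units)
  G / rowSums(G)   # normalise each row
}

set.seed(42)
coefs <- random_convex_coefs(10000, weights = c(2, 1))

## average coefficients: the first is about twice the second
colMeans(coefs)
```

With weights 2 and 1, the expected coefficients under this scheme are 2/3 and 1/3, which matches the "twice as large on average" behaviour stated in Details.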

yasomi documentation built on May 2, 2019, 5:59 p.m.