superClass: Create Super-Clusters from SOM Results

superClass    R Documentation

Create Super-Clusters from SOM Results

Description

Aggregate the resulting clustering of the SOM algorithm into super-clusters.

Usage

superClass(sommap, method, members, k, h, ...)

## S3 method for class 'somSC'
print(x, ...)

## S3 method for class 'somSC'
summary(object, ...)

## S3 method for class 'somSC'
plot(
  x,
  what = c("obs", "prototypes", "add"),
  type = c("dendrogram", "grid", "hitmap", "lines", "meanline", "barplot", "boxplot",
    "mds", "color", "poly.dist", "pie", "graph", "dendro3d", "projgraph"),
  plot.var = TRUE,
  show.names = TRUE,
  names = 1:prod(x$som$parameters$the.grid$dim),
  ...
)

## S3 method for class 'somSC'
projectIGraph(object, init.graph, ...)

cutree(object, k = NULL, h = NULL)

Arguments

sommap

A somRes object.

method

Argument passed to the hclust function.

members

Argument passed to the hclust function.

k

Argument passed to the cutree function (number of super-clusters to cut the dendrogram).

h

Argument passed to the cutree function (height at which to cut the dendrogram).

...

Used for plot.somSC: further arguments passed to the function plot (when type = "dendrogram"), to plot.myGrid (when type = "grid"), or to plot.somRes (all other cases).

x

A somSC object.

object

A somSC object.

what

What to plot. Can be either the observations (obs), the prototypes (prototypes), an additional variable (add), or NULL if not appropriate.
Automatically set to "obs" for type "hitmap" and to "prototypes" for type "grid". Defaults to "obs" otherwise.
If what = "add", the function plot.somRes is also run with the argument what set to "add".

type

The type of plot to draw. The default value is "dendrogram", which plots the dendrogram of the clustering. Case "grid" plots the grid with colors corresponding to the clusters of the super-clustering. Case "projgraph" uses an igraph object passed to the argument variable and plots the projected graph as defined by the method projectIGraph. All other cases are those available in the function plot.somRes; the super-clusters are superimposed over these plots.

plot.var

A boolean indicating whether a plot showing the evolution of the explained variance should also be drawn. This argument is only used when type = "dendrogram". Defaults to TRUE.

show.names

Whether the cluster titles must be printed in the center of the grid or not for type = "grid". Defaults to TRUE (titles displayed), as shown in Usage.

names

If show.names = TRUE, values of the titles to display for type = "grid". Defaults to the cluster numbers, 1:prod(x$som$parameters$the.grid$dim), as shown in Usage.

init.graph

An igraph object which is projected according to the super-clusters. The vertices of init.graph must correspond to the rows of the original dataset processed by SOM (note that case "korresp" is not handled by this function). In the projected graph, the vertices are positioned at the center of gravity of the super-clusters (more details in the section Details below).

Details

The superClass method can be used in two ways:

  • to choose the number of super-clusters via an hclust object: both arguments k and h can then be NULL. In this case, superClass only returns the dendrogram of the hierarchical clustering, which can then be cut with the method cutree (for which either k or h must be specified);

  • to cut the clustering into super-clusters: either argument k or argument h must then be specified (see cutree for details).

The squared distance between prototypes is passed to the algorithm.
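The two workflows above can be sketched as follows. This is a minimal illustration, not the package's reference usage: it reuses the iris example from the Examples section and, for the first workflow, cuts the hclust object stored in the tree element directly with stats::cutree, which is an assumption made here to stay within documented base R behavior. The object names sc.tree, sc.cut and cl.from.tree are chosen for this illustration only.

```r
library(SOMbrero)

set.seed(11051729)
my.som <- trainSOM(x.data = iris[, 1:4])

# Way 1: neither k nor h is given, so only the dendrogram is computed.
sc.tree <- superClass(my.som)
# The hclust object is stored in the 'tree' element and can be cut afterwards.
# Assumption: stats::cutree is applied to the hclust object itself.
cl.from.tree <- stats::cutree(sc.tree$tree, k = 4)

# Way 2: k is given, so the prototypes are assigned to super-clusters directly.
sc.cut <- superClass(my.som, k = 4)

# Number of prototypes falling in each super-cluster
table(sc.cut$cluster)
```

In both cases there is one super-cluster label per unit of the grid, that is, prod(my.som$parameters$the.grid$dim) labels in total.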

summary on a somSC object produces a complete summary of the results: it displays the number of clusters and super-clusters and the clustering itself, and performs ANOVA analyses. For type = "numeric", the ANOVA is performed for each input variable and tests the difference of this variable across the super-clusters of the map. For type = "relational", a dissimilarity ANOVA is performed (see Anderson, 2001), except that, in the present version, a crude estimate of the p-value is used, which is based on the Fisher distribution and not on a permutation test.
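To make the numeric case more concrete, the sketch below runs a one-way ANOVA by hand for a single input variable. It is only an illustration of the idea and not necessarily what summary computes internally. It assumes that the clustering element of the somRes object holds, for each observation, the index of the unit it is assigned to; obs.sc and fit are names chosen for this illustration.

```r
library(SOMbrero)

set.seed(11051729)
my.som <- trainSOM(x.data = iris[, 1:4])
sc <- superClass(my.som, k = 4)

# Assumption: my.som$clustering gives the unit of each observation, so indexing
# the prototype-level super-clustering by it yields one label per observation.
obs.sc <- factor(sc$cluster[my.som$clustering])

# One-way ANOVA of one input variable across the super-clusters
fit <- aov(iris$Sepal.Length ~ obs.sc)
summary(fit)
```

Repeating the last two lines for each column of iris[, 1:4] gives one test per input variable.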

On plots, the different super-clusters are identified in the following ways:

  • either with different colors, when type is set among: "grid" (N, K, R), "hitmap" (N, K, R), "lines" (N, K, R), "barplot" (N, K, R), "boxplot", "poly.dist" (N, K, R), "mds" (N, K, R), "dendro3d" (N, K, R), "graph" (R), "projgraph" (R);

  • or with titles, when type is set among: "color" (N, K), "pie" (N, R).

In the list above, the charts available for a numerical SOM are indicated with an N, those available for a korresp SOM with a K, and those available for a relational SOM with an R.

projectIGraph produces a projected graph from the igraph object passed to the argument variable, as described in (Olteanu and Villa-Vialaneix, 2015). The attributes of this graph are the same as the ones obtained from the SOM map itself in the function projectIGraph. plot.somSC used with type = "projgraph" calculates this graph and represents it by positioning the super-vertices at the center of gravity of the super-clusters. This feature can be combined with pie.graph = TRUE to superimpose the information from an external factor related to the individuals in the original dataset (or, equivalently, to the vertices of the graph).
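A minimal sketch of the projection is given below. It assumes that the lesmis dataset shipped with SOMbrero is available and provides both an igraph object (lesmis) and a matching dissimilarity matrix (dissim.lesmis); neither is described on this page, so treat these names, as well as mis.som, sc.mis and proj, as assumptions of the illustration.

```r
library(SOMbrero)

# Assumption: data(lesmis) loads 'lesmis' (igraph object) and 'dissim.lesmis'
# (dissimilarity matrix whose rows match the vertices of 'lesmis').
data(lesmis)

set.seed(11051729)
mis.som <- trainSOM(x.data = dissim.lesmis, type = "relational")
sc.mis <- superClass(mis.som, k = 5)

# Project the original graph onto the super-clusters
proj <- projectIGraph(sc.mis, init.graph = lesmis)

# Attributes described in the Value section
igraph::vertex_attr_names(proj)  # expected to include "name" and "size"
igraph::V(proj)$size             # number of original vertices per super-cluster
```

The same projected graph is what plot draws when called with type = "projgraph" and the igraph object passed to the argument variable.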

Value

The superClass method returns an object of class somSC, which is a list of the following elements:

cluster

The super-clustering of the prototypes (only if either k or h is given by the user).

tree

An hclust object.

som

The somRes object given as argument (see trainSOM for details).

The projectIGraph method returns an object of class igraph with the following attributes:

layout

provides the layout of the projected graph according to the center of gravity of the super-clusters positioned on the SOM grid (graph attribute);

name and size

are, respectively, the vertex number on the grid and the number of vertices included in the corresponding cluster (vertex attributes);

weight

gives the number of edges (or the sum of the weights) between the vertices of the two corresponding clusters (edge attribute).

Author(s)

Élise Maigné elise.maigne@inrae.fr
Madalina Olteanu olteanu@ceremade.dauphine.fr
Nathalie Vialaneix nathalie.vialaneix@inrae.fr

References

Anderson M.J. (2001). A new method for non-parametric multivariate analysis of variance. Austral Ecology, 26, 32-46.

Olteanu M., Villa-Vialaneix N. (2015). Using SOMbrero for clustering and visualizing graphs. Journal de la Société Française de Statistique, 156, 95-119.

See Also

hclust, cutree, trainSOM, plot.somRes

Examples

set.seed(11051729)
my.som <- trainSOM(x.data = iris[,1:4])
# choose the number of super-clusters
sc <- superClass(my.som)
plot(sc)
# cut the clustering
sc <- superClass(my.som, k = 4)
summary(sc)
plot(sc)
plot(sc, type = "grid")
plot(sc, what = "obs", type = "hitmap")

# cut the clustering with a different number of clusters
sc <- superClass(my.som, k = 5)
summary(sc)

SOMbrero documentation built on Aug. 18, 2025, 5:36 p.m.