aweSOMplot | R Documentation |
Plot interactive visualizations of self-organizing maps (SOM), as an HTML page. The plot can represent general map information, or selected categorical or numeric variables (not necessarily the ones used during training). Hover over the map to focus on the selected cell or variable, and display further information.
aweSOMplot(
  som,
  type = c("Hitmap", "Cloud", "UMatrix", "Circular", "Barplot", "Boxplot",
           "Radar", "Line", "Color", "Pie", "CatBarplot"),
  data = NULL,
  variables = NULL,
  superclass = NULL,
  obsNames = NULL,
  scales = c("contrast", "range", "same"),
  values = c("mean", "median", "prototypes"),
  size = 400,
  palsc = c("Set3", "viridis", "grey", "rainbow", "heat", "terrain", "topo",
            "cm", rownames(RColorBrewer::brewer.pal.info)),
  palvar = c("viridis", "grey", "rainbow", "heat", "terrain", "topo", "cm",
             rownames(RColorBrewer::brewer.pal.info)),
  palrev = FALSE,
  showAxes = TRUE,
  transparency = TRUE,
  boxOutliers = TRUE,
  showSC = TRUE,
  pieEqualSize = FALSE,
  showNames = TRUE,
  legendPos = c("beside", "below", "none"),
  legendFontsize = 14,
  cloudType = c("cellPCA", "kPCA", "PCA", "proximity", "random"),
  cloudSeed = NA,
  elementId = NULL
)
som |
kohonen object, a SOM created by the kohonen::som function. |
type |
character, the plot type. The default "Hitmap" is a population map. "Cloud" plots the observations as a scatterplot within each cell (see Details). "UMatrix" plots the average distance of each cell to its neighbors, on a color scale. "Circular" (barplot), "Barplot", "Boxplot", "Radar" and "Line" are for numeric variables. "Color" (heat map) is for a single numeric variable. "Pie" (pie chart) and "CatBarplot" are for a single categorical (factor) variable. |
data |
data.frame containing the variables to plot. This is typically not the training data, but rather the unscaled original data: results are easier to read in the original units, and extra variables not used in training can be plotted. If not provided, the training data is used. |
variables |
character vector containing the names of the variable(s) to plot. See Details. |
superclass |
integer vector, the superclass of each cell of the SOM. |
obsNames |
character vector, names of the observations to be displayed when hovering over the cells of the SOM. Must have a length equal to the number of data rows. If not provided, the row names of data will be used. |
scales |
character, controls the scaling of the variables on the plot. See Details. |
values |
character, the type of value to be displayed. The default "mean" uses the observation means (from data) for each cell. Alternatively, "median" uses the observation medians for each cell, and "prototypes" uses the SOM's prototype values. |
size |
numeric, plot size, in pixels. Default 400. |
palsc |
character, the color palette used to represent the superclasses as background of the cells. Default is "Set3". Can be "viridis", "grey", "rainbow", "heat", "terrain", "topo", "cm", or any palette name of the RColorBrewer package. |
palvar |
character, the color palette used to represent the variables. Default is "viridis", available choices are the same as for palsc. |
palrev |
logical, whether the color palette for variables is reversed. Default is FALSE. |
showAxes |
logical, whether to display the axes (for "Circular", "Barplot", "Boxplot", "Radar", "Line", "CatBarplot"), default TRUE. |
transparency |
logical, whether to use transparency when focusing on a variable, default TRUE. |
boxOutliers |
logical, whether outliers in "Boxplot" are displayed, default TRUE. |
showSC |
logical, whether to display superclasses as labels in the "Color" and "UMatrix" plots, default TRUE. |
pieEqualSize |
logical, whether "Pie" should display pies of equal size. The default FALSE displays pies with areas proportional to the number of observations in the cells. |
showNames |
logical, whether to display the observation names in a box below the plot. |
legendPos |
character, whether and where to display the legend (if applicable). Possible values are "beside", "below" or "none". |
legendFontsize |
numeric, font size to use for the legend, and for the tooltip information of the "Cloud" plot. Default is 14. |
cloudType |
character, for "Cloud" type, controls how the point coordinates are computed, see Details. |
cloudSeed |
numeric, for "Cloud" type with cloudType "random", seed for the pseudo-random placement of the points. If NA (the default), no seed will be set. |
elementId |
character, user-defined elementId of the widget. Can be useful for user extensions when embedding the result in an html page. |
The selected variables must be numeric for types "Circular", "Barplot", "Boxplot", "Radar", "Color" and "Line", or factor for types "Pie" and "CatBarplot". If not provided, all columns of data will be selected. If a numeric variable is provided to a "Cloud", "Pie" or "CatBarplot", it will be split into a maximum of 8 classes. For "Cloud" plots, the first element of variables is used to color the points (and can be "None" for no coloring), the following elements (if any) are used in the information box of each point.
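The automatic split into at most 8 classes can be anticipated by binning a numeric variable yourself, which gives control over the breaks. The following is a minimal base R sketch; `bin_numeric` is a hypothetical helper, not part of aweSOM, and equal-width binning is an assumption (the package's own splitting rule is not described here).

```r
# Hypothetical helper (not part of aweSOM): bin a numeric vector into at most
# `max_classes` equal-width intervals so it can be passed as a factor.
bin_numeric <- function(x, max_classes = 8) {
  n_classes <- min(max_classes, length(unique(x)))
  if (n_classes < 2) {
    return(factor(x))  # constant vector: a single class
  }
  cut(x, breaks = n_classes, include.lowest = TRUE)
}

# Example: turn a numeric iris column into a factor with at most 8 levels.
sepal_classes <- bin_numeric(iris$Sepal.Length)
table(sepal_classes)
```

The resulting factor could then be added as a column of the data.frame passed as data and named in variables for a "Pie" or "CatBarplot" plot.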
Variables scales: All values that are used for the plots (means, medians, prototypes) are scaled to 0-1 for display (minimum height to maximum height). The scales parameter controls how this scaling is done.
"contrast": for each variable, the minimum height is the minimum observed mean/median/prototype on the map, the maximum height is the maximum on the map. This ensures maximal contrast on the plot.
"range": observation range; for each variable, the minimum height corresponds to the minimum of that variable over the whole dataset, the maximum height to the maximum of the variable on the whole dataset.
"same": same scales; all heights are displayed on the same scale, using the global minimum and maximum of the dataset.
Cloud plot: five types of cloud plots are available, controlled by the cloudType argument:
"cellPCA": (default) the point coordinates are computed cell by cell, by computing a PCA on the training data of that cell only. Points close to the center of the cell are close to the mean of its observations. Points far apart within a cell are likely to have different characteristics.
"kPCA": the point coordinates are computed globally, by a kernel PCA performed on all the differences between the training data and their winning prototypes. Points close to the center of their cell are close to their prototype, and points with similar placements in the clouds thus have a similar difference to their prototype. Not recommended for large datasets (eg. > 1000 observations), as it tends to take too much memory.
"PCA": the point coordinates are computed globally, by a PCA performed on all the differences between the training data and their winning prototypes. Points close to the center of their cell are close to their prototype, and points with similar placements in the clouds thus have a similar difference to their prototype.
"proximity": the point coordinates are computed one by one, based on the distances of the observation's training data to its cell's prototype and to its second best matching prototypes among its cell's neighbors. Points close to their cell's center are close to their closest prototype, while points close to another cell are close to that cell's prototype.
"random": the point coordinates are random samples from a uniform distribution.
Returns an object of class htmlwidget.
library(aweSOM)

## Build training data
dat <- iris[, c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")]
### Scale training data
dat <- scale(dat)

## Train SOM
### Initialization (PCA grid)
init <- somInit(dat, 4, 4)
ok.som <- kohonen::som(dat, grid = kohonen::somgrid(4, 4, 'hexagonal'),
                       rlen = 100, alpha = c(0.05, 0.01),
                       radius = c(2.65, -2.65), init = init,
                       dist.fcts = 'sumofsquares')

## Group cells into superclasses (PAM clustering)
superclust <- cluster::pam(ok.som$codes[[1]], 2)
superclasses <- superclust$clustering

## Observations cloud ('Cloud')
variables <- c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width")
aweSOMplot(som = ok.som, type = 'Cloud', data = iris,
           variables = c("Species", variables), superclass = superclasses)

## Not run:
## Population map ('Hitmap')
aweSOMplot(som = ok.som, type = 'Hitmap', superclass = superclasses)

## Plots for numerical variables
## Circular barplot
aweSOMplot(som = ok.som, type = 'Circular', data = iris,
           variables = variables, superclass = superclasses)
## Barplot (numeric variables)
aweSOMplot(som = ok.som, type = 'Barplot', data = iris,
           variables = variables, superclass = superclasses)

## Plots for categorical variables (iris species, not used for training)
## Pie
aweSOMplot(som = ok.som, type = 'Pie', data = iris,
           variables = "Species", superclass = superclasses)
## Barplot (categorical variables)
aweSOMplot(som = ok.som, type = 'CatBarplot', data = iris,
           variables = "Species", superclass = superclasses)
## End(Not run)
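Because the return value is an htmlwidget, it can presumably be written to a standalone HTML file with htmlwidgets::saveWidget. The sketch below assumes the aweSOM, kohonen and htmlwidgets packages are installed and does nothing otherwise; the output file name is arbitrary.

```r
# Sketch only: runs if aweSOM, kohonen and htmlwidgets are all installed.
pkgs <- c("aweSOM", "kohonen", "htmlwidgets")
have_pkgs <- all(vapply(pkgs, requireNamespace, logical(1), quietly = TRUE))

if (have_pkgs) {
  # Train a small SOM on the scaled iris measurements (same setup as above).
  dat <- scale(iris[, 1:4])
  init <- aweSOM::somInit(dat, 4, 4)
  ok.som <- kohonen::som(dat, grid = kohonen::somgrid(4, 4, "hexagonal"),
                         rlen = 100, alpha = c(0.05, 0.01),
                         radius = c(2.65, -2.65), init = init,
                         dist.fcts = "sumofsquares")

  # Build the widget and save it to a temporary HTML file.
  widget <- aweSOM::aweSOMplot(som = ok.som, type = "Hitmap")
  out_file <- file.path(tempdir(), "som_hitmap.html")
  # selfcontained = FALSE writes dependencies next to the file (no pandoc needed).
  htmlwidgets::saveWidget(widget, file = out_file, selfcontained = FALSE)
}
```

Setting elementId in the aweSOMplot call may help when the saved widget has to be targeted by custom scripts or styles in the embedding page.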