grpeval: Finding groups in data

View source: R/grpeval.R

Description

Designed to assist users who wish to employ self-organizing maps (SOM) as a clustering tool. Applies standard approaches to help identify grouping structure in multivariate data.

Usage

grpeval(x, kmx = 10, itermax = NULL, nstarts = NULL, symsize = 1)

Arguments

x

a data frame object containing the data to be examined

kmx

user-specified maximum number of clusters/groups to examine. Default is 10.

itermax

maximum number of iterations allowed for kmeans. Default is 500*k.

nstarts

number of random initializations for kmeans to employ. Default is 5.

symsize

sets symbol size on plots
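
For orientation, the two k-means arguments above resemble the iter.max and nstart arguments of stats::kmeans(). The following is a minimal sketch of a single k-means fit using the documented defaults. It assumes (this page does not confirm it) that itermax and nstarts are passed through to those arguments, and it uses the built-in iris data as a stand-in for real input.

```r
# Minimal sketch, not the grpeval() source: one k-means fit with the
# documented defaults. The mapping itermax -> iter.max and
# nstarts -> nstart is an assumption.
x <- scale(iris[, 1:4])   # built-in data, standardized (stand-in input)
k <- 3                    # number of clusters for this single fit

set.seed(42)              # random starts, so fix the seed for repeatability
fit <- kmeans(x, centers = k, iter.max = 500 * k, nstart = 5)

fit$size                  # number of observations in each cluster
fit$tot.withinss          # total within-cluster sum of squares for this k
```

Using several random starts reduces the chance that a poor initialization determines the reported sum of squares.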

Details

Many unsupervised learning algorithms (e.g., SOM, k-means) require the user to supply the number of groups to seek. This tool assists with that decision using two traditional strategies from cluster analysis.

First, multi-dimensional scaling (MDS) is used as an ordination technique to visualize the information in the data's distance matrix. Low-dimensional visualizations that represent the similarity of individual observations can assist in understanding patterns in multivariate data. MDS constructs a 2-D mapping that projects the pairwise distances among the observations onto a configuration of points in an abstract coordinate space. Similar objects lie closer together, so obvious clustering appears as multiple isolated regions of high point density.

Second, k-means is applied repeatedly to assess internally how grouping structure changes as a function of the number of clusters. Results are presented as a scree plot of the total within-cluster sum of squares (WCSS) for each data partition. An ideal plot shows a clear 'elbow': the point beyond which WCSS decreases more slowly as the number of groups increases.

MDS is implemented via cmdscale() and k-means via kmeans(), both from the stats package.
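
The two diagnostics described above can be approximated with base R alone. The sketch below is illustrative and is not the grpeval() source: it assumes Euclidean distances via dist(), reuses the documented defaults (kmx = 10, 500*k iterations, 5 starts), and uses the built-in iris data as a stand-in.

```r
# Illustrative sketch using only base R (stats, graphics).
x <- scale(iris[, 1:4])            # built-in data, standardized

# Panel a: classical MDS of the Euclidean distance matrix, 2-D configuration
mds <- cmdscale(dist(x), k = 2)

# Panel b: total within-cluster sum of squares (WCSS) for k = 1..kmx
set.seed(1)
kmx <- 10
wcss <- sapply(seq_len(kmx), function(k) {
  kmeans(x, centers = k, iter.max = 500 * k, nstart = 5)$tot.withinss
})

# Draw the two panels side by side
op <- par(mfrow = c(1, 2))
plot(mds, xlab = "MDS 1", ylab = "MDS 2", main = "a) MDS")
plot(seq_len(kmx), wcss, type = "b",
     xlab = "Number of clusters (k)", ylab = "Total WCSS",
     main = "b) Scree plot")
par(op)
```

In the scree plot, a candidate number of groups is the value of k at which the curve bends and further increases in k yield only small reductions in WCSS.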

Value

Panel a illustrates multi-dimensional scaling (MDS) results. Panel b presents a scree plot of k-means results. A data frame with cluster/group statistics is also returned.

Examples

library(sommix)
# NIEHS Mixtures Workshop dataset1
data(dataset1)
# Standardize columns 3 to 9, then examine grouping structure
grpeval(scale(dataset1[, 3:9]))

johnlpearce/sommix documentation built on Jan. 7, 2021, 11:38 p.m.