In WeightedCluster: Clustering of Weighted Data

as.clustrange

R Documentation

Build a clustrange object to compare different clustering solutions.

Description

Build a clustrange object to compare different clustering solutions.

Usage

as.clustrange(object, diss, weights=NULL, R=1,  samplesize=NULL, ...)
## S3 method for class 'twins'
as.clustrange(object, diss, weights=NULL, R=1, samplesize=NULL, 
		ncluster=20, ...) 
## S3 method for class 'hclust'
as.clustrange(object, diss, weights=NULL, R=1, samplesize=NULL, 
		ncluster=20, ...) 
## S3 method for class 'dtclust'
as.clustrange(object, diss, weights=NULL, R=1, samplesize=NULL, 
		ncluster=20, labels = TRUE, ...)
## S3 method for class 'clustrange'
plot(x, stat="noCH", legendpos="bottomright", 
    norm="none", withlegend=TRUE, lwd=1, col=NULL, ylab="Indicators", 
	xlab="N clusters", conf.int=0.9, ci.method="none", ci.alpha=.3, line="t0", ...)

Arguments

`object`	The object to convert, such as a data.frame, a matrix, or hierarchical clustering output (an hclust, twins or dtclust object).
`diss`	A dissimilarity matrix or a dist object (see `dist`).
`weights`	Optional numerical vector containing weights.
`R`	Optional number of bootstrap replications used to build confidence intervals.
`samplesize`	Size of each bootstrap sample. Defaults to the sum of weights.
`ncluster`	Integer. Maximum number of clusters. The range will include all clustering solutions from two to `ncluster` clusters.
`labels`	Logical. If `TRUE`, the rules used to assign an object to a cluster are used to label the clusters (instead of numbers).
`x`	A `clustrange` object to be plotted.
`stat`	Character. The list of statistics to plot, or "noCH" to plot all statistics except "CH" and "CHsq", or "all" for all statistics. See `wcClusterQuality` for a list of possible values. It is also possible to use "RHC" to plot the quality measure 1-HC. Unlike HC, RHC should be maximized, like all other quality measures.
`legendpos`	Character. Legend position, see `legend`.
`norm`	Character. Normalization method applied to the statistics. One of "none" (no normalization), "range" (computed as (value - min)/(max - min)), "zscore" (adjusted by mean and standard deviation) or "zscoremed" (adjusted by median and median of the difference to the median).
`withlegend`	Logical. If `FALSE`, the legend is not plotted.
`lwd`	Numeric. Line width, see `par`.
`col`	A vector of line colors, see `par`. If `NULL`, a default set of colors is used.
`xlab`	x axis label.
`ylab`	y axis label.
`conf.int`	Confidence level used to build the confidence interval (default: 0.9).
`ci.method`	Method used to build the confidence interval (only available if bootstrap has been used, see `R` above). One of "none" (do not plot a confidence interval), "norm" (based on a normal approximation) or "perc" (based on percentiles).
`ci.alpha`	Alpha (transparency) value of the color used to plot the interval.
`line`	Which value should be plotted by the line? One of "t0" (value for the actual sample), "mean" (average over all bootstraps) or "median" (median over all bootstraps).
`...`	Additional parameters passed to/from methods.
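
The `norm` options can be illustrated with a minimal base R sketch. `normalize_stat` is a hypothetical helper written for this illustration; it is not part of WeightedCluster, and it assumes that "zscoremed" uses the median of the absolute differences to the median as its scale, which may differ from the package's internal computation.

```r
# Hypothetical helper (not a WeightedCluster function): rescales the values
# of one statistic observed across a range of clustering solutions.
normalize_stat <- function(x, norm = c("none", "range", "zscore", "zscoremed")) {
  norm <- match.arg(norm)
  switch(norm,
    none = x,
    range = (x - min(x)) / (max(x) - min(x)),
    zscore = (x - mean(x)) / sd(x),
    zscoremed = {
      med <- median(x)
      # Assumption: the scale is the median absolute difference to the median
      (x - med) / median(abs(x - med))
    }
  )
}

# Made-up values of one statistic for solutions with 2 to 5 clusters
asw <- c(0.20, 0.35, 0.30, 0.25)
normalize_stat(asw, "range")
normalize_stat(asw, "zscoremed")
```

Normalizing puts statistics with very different scales on a comparable axis, which is why the plotting examples use it when several statistics are drawn together.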

Details

as.clustrange converts objects to clustrange objects. A clustrange object contains a list of clustering solutions with their associated statistics and can be used to find the optimal clustering solution.

If object is a data.frame or a matrix, each column should be a clustering solution to be evaluated.
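
As a hedged illustration of the data.frame case, the sketch below compares three candidate partitions stored as columns. The built-in `USArrests` data, the Euclidean distance and the column names `ward3`, `average3` and `ward4` are choices made for this example only, not something the package prescribes.

```r
library(WeightedCluster)

# Dissimilarities on a small built-in dataset (illustrative choice)
d <- as.matrix(dist(scale(USArrests)))

# Candidate partitions obtained from two hierarchical clusterings
hcWard <- hclust(as.dist(d), method = "ward.D")
hcAvg <- hclust(as.dist(d), method = "average")
candidates <- data.frame(
  ward3 = cutree(hcWard, k = 3),
  average3 = cutree(hcAvg, k = 3),
  ward4 = cutree(hcWard, k = 4)
)

# Each column is evaluated as one clustering solution
candRange <- as.clustrange(candidates, diss = d)

# One set of quality statistics per candidate solution
candRange$stats
```

This form is useful when the solutions do not come from a single hierarchy, for instance when comparing algorithms at the same number of clusters.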

If object is an hclust or twins object (i.e. hierarchical clustering output, see hclust, diana or agnes), the function computes all clustering solutions ranging from two to ncluster clusters and computes the associated statistics.
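
Conceptually, the range of solutions corresponds to cutting the tree at every number of clusters from two to ncluster. The base R sketch below only illustrates that idea; it is not the package's internal code, and the column names `cluster2`, `cluster3`, ... are an assumption made for readability.

```r
# Illustrative hierarchy on a small built-in dataset
hc <- hclust(dist(scale(USArrests)), method = "ward.D")
ncluster <- 5

# cutree() accepts a vector of k and returns one column per solution
solutions <- as.data.frame(cutree(hc, k = 2:ncluster))
names(solutions) <- paste0("cluster", 2:ncluster)  # assumed naming

head(solutions)
```

A data.frame built this way has one clustering solution per column, which is the layout described above for data.frame or matrix input.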

Value

An object of class clustrange with the following elements:

clustering:: A data.frame of all clustering solutions.
stats:: A matrix containing the clustering statistics of each cluster solution.

Examples

## WeightedCluster also attaches TraMineR (mvad, seqdef, seqdist)
library(WeightedCluster)
data(mvad)
## Aggregating state sequence
aggMvad <- wcAggregateCases(mvad[, 17:86], weights=mvad$weight)

## Creating state sequence object
mvad.seq <- seqdef(mvad[aggMvad$aggIndex, 17:86], weights=aggMvad$aggWeights)

## Compute distances using the Hamming distance
diss <- seqdist(mvad.seq, method="HAM")

## Ward clustering (the "ward" method was renamed "ward.D" in R 3.1.0)
wardCluster <- hclust(as.dist(diss), method="ward.D", members=aggMvad$aggWeights)

## Computing clustrange from Ward clustering
wardRange <- as.clustrange(wardCluster, diss=diss, 
		weights=aggMvad$aggWeights, ncluster=15)

## Plot all statistics (standardized)
plot(wardRange, stat="all", norm="zscoremed", lwd=3)

## Plot HC, RHC and ASWw
plot(wardRange, stat=c("HC", "RHC", "ASWw"), norm="zscore", lwd=3)
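
The bootstrap arguments `R`, `conf.int`, `ci.method` and `line` are not exercised above. The following sketch shows one way they might be combined; it uses the small built-in `USArrests` data rather than `mvad` so that it stays self-contained and quick to run, and the values `R = 50` and `ncluster = 6` are arbitrary choices for illustration.

```r
library(WeightedCluster)

# Small illustrative dataset so that the bootstrap runs quickly
d <- as.matrix(dist(scale(USArrests)))
hc <- hclust(as.dist(d), method = "ward.D")

# Request 50 bootstrap replications when building the clustrange object
bootRange <- as.clustrange(hc, diss = d, ncluster = 6, R = 50)

# Plot the bootstrap median with 90% percentile confidence intervals
plot(bootRange, stat = c("ASWw", "HC"), ci.method = "perc",
     conf.int = 0.9, line = "median")
```

With the default `R=1` no bootstrap is run, so `ci.method` should then be left at "none".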

WeightedCluster documentation built on April 12, 2025, 9:13 a.m.