shadow: Cluster Shadows and Silhouettes

Description

Compute and plot shadows and silhouettes.

Usage

## S4 method for signature 'kccasimple'
shadow(object, ...)
## S4 method for signature 'kcca'
Silhouette(object, data=NULL, ...)

Arguments

object

An object of class "kcca" or "kccasimple".

data

Data to compute silhouette values for. If the cluster object was created with save.data=TRUE, then these are used by default.

...

Currently not used.

Details

The shadow value of each data point is defined as twice the distance to the closest centroid divided by the sum of the distances to the closest and second-closest centroids. If the shadow value of a point is close to 0, the point is close to its cluster centroid. If the shadow value is close to 1, the point is almost equidistant from its two nearest centroids. Thus, a cluster that is well separated from all other clusters should have many points with small shadow values.
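
The definition above can be reproduced by hand in base R. The following is a minimal sketch, not the package's implementation: the toy points, the two centroids, and the names x, centers, d1, d2, shadow_value and cluster_shadow are all made up for illustration, and Euclidean distance is assumed. Averaging the values per cluster is also an assumption about what a per-cluster summary might look like.

```r
# Toy data: six points in the plane and two hypothetical centroids
# (made-up values, not taken from Nclus or from any flexclust object).
x <- rbind(c(0, 0), c(1, 0), c(0, 1),
           c(10, 10), c(9, 10), c(5, 5))
centers <- rbind(c(0, 0), c(10, 10))

# Euclidean distance from every point (rows) to every centroid (columns).
d <- sapply(seq_len(nrow(centers)), function(k) {
  sqrt(rowSums(sweep(x, 2, centers[k, ])^2))
})

# Distances to the closest (d1) and second-closest (d2) centroid.
d_sorted <- t(apply(d, 1, sort))
d1 <- d_sorted[, 1]
d2 <- d_sorted[, 2]

# Shadow value as described above: 2 * d1 / (d1 + d2), between 0 and 1.
shadow_value <- 2 * d1 / (d1 + d2)

# Assign each point to its closest centroid (ties go to the first one)
# and average the shadow values within each cluster.
cluster <- max.col(-d, ties.method = "first")
cluster_shadow <- tapply(shadow_value, cluster, mean)
```

In this sketch the point (0, 0) sits exactly on a centroid and gets shadow value 0, while (5, 5) is equidistant from both centroids and gets shadow value 1.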

The silhouette value of a data point is defined as the scaled difference between the average dissimilarity of the point to all points in its own cluster and the smallest average dissimilarity to the points of a different cluster. Large silhouette values indicate good separation.
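
Written as a formula, the usual silhouette definition (Rousseeuw's) reads as follows. The symbols a(i), b(i) and s(i) are notation introduced here, and it is an assumption that the "scaled difference" above uses the customary scaling by the larger of the two averages.

```latex
% a(i): average dissimilarity of point i to the other points in its own cluster
% b(i): smallest average dissimilarity of point i to the points of any other cluster
s(i) = \frac{b(i) - a(i)}{\max\{a(i),\, b(i)\}}, \qquad -1 \le s(i) \le 1
```

Under this scaling, s(i) is close to 1 when a point is much closer to its own cluster than to any other, and close to 0 when the two averages are about equal.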

The main difference between silhouette values and shadow values is that we replace average dissimilarities to points in a cluster by dissimilarities to point averages (=centroids). See Leisch (2009) for details.

Author(s)

Friedrich Leisch

References

Friedrich Leisch. Neighborhood graphs, stripes and shadow plots for cluster visualization. Statistics and Computing, 2009. Accepted for publication on 2009-06-16.

See Also

silhouette

Examples

data(Nclus)
set.seed(1)
c5 <- cclust(Nclus, 5, save.data=TRUE)
c5
plot(c5)

## high shadow values indicate clusters with *bad* separation
shadow(c5)
plot(shadow(c5))

## high Silhouette values indicate clusters with *good* separation
Silhouette(c5)
plot(Silhouette(c5))

Example output

Loading required package: grid
Loading required package: lattice
Loading required package: modeltools
Loading required package: stats4
kcca object of family 'kmeans' 

call:
cclust(x = Nclus, k = 5, save.data = TRUE)

cluster sizes:

  1   2   3   4   5 
198 105  52 147  48 

        1         2         3         4         5 
0.3832332 0.3894565 0.6327909 0.3979773 0.6106751 
        1         2         3         4         5 
0.6537156 0.6333430 0.3344578 0.6336956 0.4170439 

flexclust documentation built on May 2, 2019, 10:59 a.m.