View source: R/fviz_silhouette.R
Silhouette (Si) analysis is a cluster validation approach that
measures how well an observation is clustered and estimates the average
distance between clusters. fviz_silhouette() provides an elegant ggplot2-based
visualization of silhouette information from i) the result of
silhouette(), pam(), clara() and fanny() [in the cluster package];
ii) eclust() and hcut() [in factoextra].
Read more: Clustering Validation Statistics.
fviz_silhouette(sil.obj, label = FALSE, print.summary = TRUE, ...)
sil.obj | an object of class silhouette, or the result of pam(), clara() or fanny() [in the cluster package], or of eclust() or hcut() [in factoextra].
label | logical value. If TRUE, x axis tick labels are shown.
print.summary | logical value. If TRUE, a summary of cluster silhouettes is printed by fviz_silhouette().
... | other arguments to be passed to the function ggpubr::ggpar().
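As a rough illustration of the arguments above, the sketch below shows the x axis tick labels, suppresses the printed summary, and forwards two arguments through ... to ggpubr::ggpar(). The use of pam() on the scaled iris data and the particular ggpar() arguments chosen (legend, ggtheme) are assumptions made for this example, not something this page prescribes.

```r
library(cluster)     # pam()
library(factoextra)  # fviz_silhouette(); also attaches ggplot2

# Illustrative input: PAM with 3 clusters on the scaled iris measurements
pam.res <- pam(scale(iris[, -5]), 3)

p <- fviz_silhouette(
  pam.res,
  label = TRUE,              # show x axis tick labels
  print.summary = FALSE,     # do not print the per-cluster summary
  legend = "bottom",         # forwarded to ggpubr::ggpar()
  ggtheme = theme_minimal()  # forwarded to ggpubr::ggpar()
)
p
```

Because the return value is a ggplot, further ggplot2 layers or themes can also be added to p with + after the call.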
- Observations with a large silhouette Si (close to 1) are very well clustered.
- A small Si (around 0) means that the observation lies between two clusters.
- Observations with a negative Si are probably placed in the wrong cluster.
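These rules of thumb follow from how a silhouette width is usually defined: Si = (b - a) / max(a, b), where a is the mean distance from observation i to the other members of its own cluster and b is the smallest mean distance from i to the members of any other cluster. This definition is not spelled out on this page; it is the standard one, which cluster::silhouette() is understood to implement. The sketch below recomputes Si by hand for a single observation and compares it with the package value. The k-means setup, the choice of observation 1, and all variable names are assumptions made for illustration.

```r
library(cluster)

set.seed(123)
x  <- scale(iris[, -5])
cl <- kmeans(x, centers = 3, nstart = 25)$cluster
d  <- as.matrix(dist(x))   # full Euclidean distance matrix

i   <- 1                   # observation to inspect (arbitrary choice)
own <- cl[i]               # cluster that observation i was assigned to

# a: mean distance from i to the other members of its own cluster
a <- mean(d[i, cl == own & seq_along(cl) != i])

# b: smallest mean distance from i to the members of any other cluster
b <- min(sapply(setdiff(unique(cl), own), function(k) mean(d[i, cl == k])))

s_manual <- (b - a) / max(a, b)

# Value reported by the cluster package for the same observation
s_pkg <- unname(silhouette(cl, dist(x))[i, "sil_width"])

isTRUE(all.equal(s_manual, s_pkg))
```

When a is much smaller than b the ratio approaches 1, when a and b are similar it is near 0, and when a exceeds b it turns negative, which matches the three cases listed above.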
Returns a ggplot object.
Alboukadel Kassambara alboukadel.kassambara@gmail.com
fviz_cluster, hcut, hkmeans, eclust, fviz_dend
library(factoextra)
set.seed(123)
# Data preparation
# +++++++++++++++
data("iris")
head(iris)
# Remove species column (5) and scale the data
iris.scaled <- scale(iris[, -5])
# K-means clustering
# +++++++++++++++++++++
km.res <- kmeans(iris.scaled, 3, nstart = 2)
# Visualize kmeans clustering
fviz_cluster(km.res, iris[, -5], ellipse.type = "norm") +
  theme_minimal()
# Visualize silhouette information
require("cluster")
sil <- silhouette(km.res$cluster, dist(iris.scaled))
fviz_silhouette(sil)
# Identify observations with a negative silhouette width
neg_sil_index <- which(sil[, "sil_width"] < 0)
sil[neg_sil_index, , drop = FALSE]
## Not run:
# PAM clustering
# ++++++++++++++++++++
require(cluster)
pam.res <- pam(iris.scaled, 3)
# Visualize pam clustering
fviz_cluster(pam.res, ellipse.type = "norm") +
  theme_minimal()
# Visualize silhouette information
fviz_silhouette(pam.res)
# Hierarchical clustering
# ++++++++++++++++++++++++
# Use hcut(), which computes hclust and cuts the tree
hc.cut <- hcut(iris.scaled, k = 3, hc_method = "complete")
# Visualize dendrogram
fviz_dend(hc.cut, show_labels = FALSE, rect = TRUE)
# Visualize silhouette information
fviz_silhouette(hc.cut)
## End(Not run)
Loading required package: ggplot2
Welcome! Related Books: `Practical Guide To Cluster Analysis in R` at https://goo.gl/13EFCZ
  Sepal.Length Sepal.Width Petal.Length Petal.Width Species
1          5.1         3.5          1.4         0.2  setosa
2          4.9         3.0          1.4         0.2  setosa
3          4.7         3.2          1.3         0.2  setosa
4          4.6         3.1          1.5         0.2  setosa
5          5.0         3.6          1.4         0.2  setosa
6          5.4         3.9          1.7         0.4  setosa
Loading required package: cluster
  cluster size ave.sil.width
1       1   50          0.64
2       2   47          0.35
3       3   53          0.39
     cluster neighbor   sil_width
[1,]       2        3 -0.01058434
[2,]       2        3 -0.02489394
  cluster size ave.sil.width
1       1   50          0.63
2       2   45          0.35
3       3   55          0.38
  cluster size ave.sil.width
1       1   49          0.64
2       2   24          0.48
3       3   77          0.32