cluster_stats: Compute CH index and silhouette width

View source: R/helpers.R

Description

Computes the raw and normalized Calinski-Harabasz (CH) index and silhouette width for each number of clusters from k_min to k_max.
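To make the two statistics concrete, the following is a minimal sketch of how they could be computed from a distance matrix and a hierarchical clustering. It is not the package's implementation: the toy data (USArrests), the helper names ch_index and rescale01, the output column names, and the min-max normalization are all assumptions chosen for illustration. The CH index is computed from squared pairwise distances, and the silhouette width comes from cluster::silhouette.

```r
library(cluster)  # provides silhouette()

# Toy inputs (assumption): Euclidean distances on a built-in dataset
d <- dist(scale(USArrests))
model <- hclust(d, method = "ward.D2")

# Hypothetical helper: Calinski-Harabasz index from pairwise distances only
ch_index <- function(d, cl) {
  d2 <- as.matrix(d)^2
  n <- nrow(d2)
  k <- length(unique(cl))
  # Total sum of squares: squared distances over all pairs, divided by n
  total_ss <- sum(d2) / (2 * n)
  # Within-cluster sum of squares, one term per cluster
  within_ss <- sum(vapply(unique(cl), function(g) {
    idx <- cl == g
    sum(d2[idx, idx]) / (2 * sum(idx))
  }, numeric(1)))
  between_ss <- total_ss - within_ss
  (between_ss / (k - 1)) / (within_ss / (n - k))
}

# One row of statistics per number of clusters
results <- do.call(rbind, lapply(2:5, function(k) {
  cl <- cutree(model, k = k)
  sil <- silhouette(cl, d)
  data.frame(
    k = k,
    ch = ch_index(d, cl),
    silhouette = mean(sil[, "sil_width"])
  )
}))

# Assumed normalization: min-max rescaling to [0, 1]
rescale01 <- function(x) (x - min(x)) / (max(x) - min(x))
results$ch_norm <- rescale01(results$ch)
results$silhouette_norm <- rescale01(results$silhouette)
results
```

Higher values of either statistic indicate better-separated clusters, so putting both on a common scale makes them easier to compare across the tested values of k.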

Usage

cluster_stats(dist_matrix, cluster_model, k_min, k_max)

Arguments

dist_matrix

a distance matrix

cluster_model

a clustering model such as the output from hclust

k_min

the minimum number of clusters to test

k_max

the maximum number of clusters to test
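As a rough illustration of how cluster_model, k_min and k_max relate, the sketch below cuts an hclust tree at every number of clusters in the tested range. It assumes, without confirmation from the source, that the function evaluates each integer from k_min to k_max. The toy data and variable names are illustrative only.

```r
# Toy inputs (assumption): not the package's own data
d <- dist(scale(USArrests))
model <- hclust(d, method = "ward.D2")

k_min <- 2
k_max <- 5

# cutree() accepts a vector of k values and returns a matrix
# with one column of cluster memberships per value of k
memberships <- cutree(model, k = k_min:k_max)
dim(memberships)

# Number of distinct clusters found in each column
n_clusters <- apply(memberships, 2, function(col) length(unique(col)))
n_clusters
```

Each column of memberships is a candidate partition for which statistics such as the CH index or silhouette width can then be computed.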

Value

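A tibble containing the raw and normalized Calinski-Harabasz index and silhouette width for each number of clusters tested.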

Examples

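library(TraMineR)

# Load the school-to-work transition example data shipped with TraMineR
data(mvad)
# List the distinct states found in the monthly status columns
seqstatl(mvad[, 17:86])
mvad.alphabet <- c("employment", "FE", "HE", "joblessness", "school",
                   "training")
mvad.labels <- c("employment", "further education", "higher education",
                 "joblessness", "school", "training")
# Build the state sequence object from columns 17 to 86
mvad.seq <- seqdef(mvad, 17:86, alphabet = mvad.alphabet,
                   labels = mvad.labels, xtstep = 6)
# Pairwise dissimilarities using the dynamic Hamming distance
dist_matrix <- TraMineR::seqdist(seqdata = mvad.seq, method = "DHD")
# Hierarchical clustering with Ward's method
cluster_model <- hclust(d = as.dist(dist_matrix), method = "ward.D2")

# Compute the statistics for 2 through 5 clusters
cluster_stats(
  dist_matrix = as.dist(dist_matrix),
  cluster_model = cluster_model,
  k_min = 2,
  k_max = 5
)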

joemarlo/sequenchr documentation built on Sept. 29, 2021, 12:23 a.m.