index.G2 | R Documentation |
Calculates G2 internal cluster quality index - Baker & Hubert adaptation of Goodman & Kruskal's Gamma statistic
index.G2(d,cl)
d |
'dist' object |
cl |
A vector of integers indicating the cluster to which each object is allocated |
See file $R_HOME\library\clusterSim\pdf\indexG2_details.pdf for further details
calculated G2 index
Marek Walesiak marek.walesiak@ue.wroc.pl, Andrzej Dudek andrzej.dudek@ue.wroc.pl
Department of Econometrics and Computer Science, University of Economics, Wroclaw, Poland
Everitt, B.S., Landau, E., Leese, M. (2001), Cluster analysis, Arnold, London, p. 104. ISBN 9780340761199.
Gatnar, E., Walesiak, M. (Eds.) (2004), Metody statystycznej analizy wielowymiarowej w badaniach marketingowych [Multivariate statistical analysis methods in marketing research], Wydawnictwo AE, Wroclaw, p. 339.
Gordon, A.D. (1999), Classification, Chapman & Hall/CRC, London, p. 62. ISBN 9781584880134.
Hubert, L. (1974), Approximate evaluation technique for the single-link and complete-link hierarchical clustering procedures, "Journal of the American Statistical Association", vol. 69, no. 347, 698-704. Available at: \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1080/01621459.1974.10480191")}.
Milligan, G.W., Cooper, M.C. (1985), An examination of procedures of determining the number of cluster in a data set, "Psychometrika", vol. 50, no. 2, 159-179. Available at: \Sexpr[results=rd]{tools:::Rd_expr_doi("10.1007/BF02294245")}.
index.G1
, index.G3
, index.S
, index.H
,
index.KL
, index.Gap
, index.C
, index.DB
# Example 1
library(clusterSim)
data(data_ratio)
d <- dist.GDM(data_ratio)
c <- pam(d, 5, diss = TRUE)
icq <- index.G2(d,c$clustering)
#print(icq)
# Example 2
library(clusterSim)
data(data_ordinal)
d <- dist.GDM(data_ordinal, method="GDM2")
# nc - number_of_clusters
min_nc=2
max_nc=6
res <- array(0,c(max_nc-min_nc+1, 2))
res[,1] <- min_nc:max_nc
clusters <- NULL
for (nc in min_nc:max_nc)
{
cl2 <- pam(d, nc, diss=TRUE)
res[nc-min_nc+1,2] <- G2 <- index.G2(d,cl2$cluster)
clusters <- rbind(clusters,cl2$cluster)
}
print(paste("max G2 for",(min_nc:max_nc)[which.max(res[,2])],"clusters=",max(res[,2])))
print("clustering for max G2")
print(clusters[which.max(res[,2]),])
plot(res, type="p", pch=0, xlab="Number of clusters", ylab="G2", xaxt="n")
axis(1, c(min_nc:max_nc))
Add the following code to your website.
For more information on customizing the embed code, read Embedding Snippets.