Iteratively collapses the rows of a table (typically a contingency table) by selecting, at each step, the pair of rows whose combination creates the smallest loss of chi-squared.
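A single greedy step of this kind can be sketched in base R. This is only an illustration of the idea, not the package's implementation: the helpers `chisq_stat()` and `best_merge()` and the toy table are hypothetical and are not part of greenclust.

```r
# Hypothetical helper (not part of greenclust): Pearson chi-squared statistic
# of a table, computed without continuity correction.
chisq_stat <- function(m) {
  expected <- outer(rowSums(m), colSums(m)) / sum(m)
  sum((m - expected)^2 / expected)
}

# Hypothetical helper: try every pair of rows, merge the pair by summing,
# and report the pair whose merge loses the least chi-squared.
best_merge <- function(m) {
  base <- chisq_stat(m)
  pairs <- combn(nrow(m), 2)
  loss <- apply(pairs, 2, function(p) {
    merged <- rbind(m[-p, , drop = FALSE], m[p[1], ] + m[p[2], ])
    base - chisq_stat(merged)
  })
  best <- pairs[, which.min(loss)]
  list(rows = rownames(m)[best], loss = min(loss))
}

# Toy table (made up for illustration): rows "a" and "b" have similar profiles.
tab <- matrix(c(10, 20,
                11, 19,
                30,  5),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("a", "b", "c"), c("no", "yes")))

step <- best_merge(tab)
step$rows
```

In this toy table, rows "a" and "b" have nearly identical column proportions, so they are the pair selected. Repeating the step on the merged table until one row remains would give the full sequence of merges.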
a numeric matrix or data frame
a logical indicating whether to apply a continuity correction if and when the clustered table reaches a 2x2 dimension.
if TRUE, prints the clustered table along with r-squared and p-value at each step
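To see what a continuity correction changes on a 2x2 table, the sketch below uses base R's `chisq.test()`. It only illustrates the effect of the correction; whether greenclust computes its statistic this way internally is an assumption, and the counts are made up.

```r
# Made-up 2x2 table of counts, for illustration only
tab2 <- matrix(c(12, 5, 7, 9), nrow = 2)

# Same test with and without Yates' continuity correction
with_corr    <- chisq.test(tab2, correct = TRUE)
without_corr <- chisq.test(tab2, correct = FALSE)

# Compare the two statistics and p-values side by side
c(corrected   = unname(with_corr$statistic),
  uncorrected = unname(without_corr$statistic))
c(corrected   = with_corr$p.value,
  uncorrected = without_corr$p.value)
```

The corrected test yields a smaller statistic and a larger p-value here, so the correction makes the final 2x2 step more conservative.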
An object of class greenclust, which is compatible with most hclust object functions, such as rect.hclust(). The height vector represents the proportion of chi-squared, relative to the original table, seen at each clustering step. The greenclust object also includes a vector for the chi-squared test p-value at each step and a boolean vector indicating whether the step had a tie for "winner".
Greenacre, M.J. (1988) "Clustering the Rows and Columns of a Contingency Table," Journal of Classification 5, 39-51. https://doi.org/10.1007/BF01901670
# Combine Titanic passenger attributes into a single category
tab <- t(as.data.frame(apply(Titanic, 4:1, FUN=sum)))

# Remove rows with all zeros
tab <- tab[apply(tab, 1, sum) > 0, ]

# Perform clustering on contingency table
grc <- greenclust(tab)

# Plot r-squared and p-values for each potential cut point
greenplot(grc)

# Get clusters at suggested cut point
clusters <- greencut(grc)

# Plot dendrogram with clusters marked
plot(grc)
rect.hclust(grc, max(clusters))