greencut | R Documentation |
Cuts a greenclust tree at an automatically determined number of groups.
greencut(g, k = NULL, h = NULL)
g | a tree as produced by greenclust |
k | an integer scalar with the desired number of groups |
h | a numeric scalar with the desired height at which the tree should be cut |
The cut point is calculated by finding the number of groups/clusters that results in a collapsed contingency table with the most-significant (lowest p-value) chi-squared test. If there are ties, the smallest number of groups wins.
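The selection rule described above can be sketched in base R. This is an illustration only, not the greenclust implementation: the tree is an ordinary hclust() fit on row profiles used as a stand-in for a greenclust tree, the data are the built-in HairEyeColor counts, and the names ks, pvals, and best_k are hypothetical.

```r
# Hair-by-Eye counts from a built-in dataset, as a plain numeric matrix
tab <- unclass(margin.table(HairEyeColor, c(1, 2)))

# Stand-in tree: Ward clustering of row profiles (NOT Greenacre's method)
hc <- hclust(dist(prop.table(tab, 1)), method = "ward.D2")

# For each candidate number of groups, collapse the rows by group
# membership and run a chi-squared test on the collapsed table
ks <- 2:nrow(tab)
pvals <- sapply(ks, function(k) {
  groups <- cutree(hc, k = k)
  collapsed <- rowsum(tab, groups)   # sum the rows within each group
  suppressWarnings(chisq.test(collapsed)$p.value)
})

# which.min() returns the first minimum, so ties go to the smaller k
best_k <- ks[which.min(pvals)]
best_k
```

Because ks is in increasing order, taking the first minimum reproduces the tie-breaking rule stated above.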
If a certain number of groups is required or a specific r-squared (1 - height) threshold is targeted, values for either k or h may be provided. (While the regular cutree function could also be used in this circumstance, it may still be useful to have the additional attributes that greencut() provides.) As with cutree(), k overrides h if both are given.
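The precedence of k over h can be checked directly with base R's cutree(). The sketch below uses an ordinary hclust() tree on the built-in USArrests data as a stand-in; greencut() is described above as following the same convention, but it is not called here.

```r
# An ordinary hierarchical clustering as a stand-in tree
hc <- hclust(dist(USArrests))

by_k <- cutree(hc, k = 3)           # cut by number of groups only
both <- cutree(hc, k = 3, h = 50)   # h is ignored because k is supplied

identical(by_k, both)
```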
greencut returns a vector of group memberships, with the resulting r-squared value and p-value as object attributes, accessible via attr.
Greenacre, M.J. (1988) "Clustering the Rows and Columns of a Contingency Table," Journal of Classification 5, 39-51. doi:10.1007/BF01901670
greenclust
, greenplot
,
assign.cluster
# Combine Titanic passenger attributes into a single category
# and create a contingency table for the non-zero levels
tab <- t(as.data.frame(apply(Titanic, 4:1, FUN=sum)))
tab <- tab[apply(tab, 1, sum) > 0, ]
grc <- greenclust(tab)
greencut(grc)
plot(grc)
rect.hclust(grc, max(greencut(grc)),
border=unique(greencut(grc))+1)