# greenclust: Row Clustering Using Greenacre's Method

## Description

Iteratively collapses the rows of a table (typically a contingency table), at each step merging the pair of rows whose combination produces the smallest loss of chi-squared.
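To make the idea concrete, here is a minimal base-R sketch of the greedy collapsing loop described above. It is not the package's implementation: `chisq_stat()` and `collapse_once()` are hypothetical helpers written for illustration, the toy table is made up, and ties are resolved by simply taking the first best pair.

```r
# Pearson chi-squared statistic for a table of counts, computed by hand
# (no continuity correction, no p-value).
chisq_stat <- function(tab) {
  expected <- outer(rowSums(tab), colSums(tab)) / sum(tab)
  sum((tab - expected)^2 / expected)
}

# Hypothetical helper: one collapsing step. Tries every pair of rows and
# merges the pair whose combined row leaves chi-squared highest, i.e. the
# pair with the smallest loss. Assumes the table has row names and no
# all-zero rows or columns.
collapse_once <- function(tab) {
  pairs <- combn(nrow(tab), 2)
  merge_pair <- function(p) {
    rbind(tab[-p, , drop = FALSE], colSums(tab[p, , drop = FALSE]))
  }
  stats <- apply(pairs, 2, function(p) chisq_stat(merge_pair(p)))
  best <- pairs[, which.max(stats)]  # first pair wins if there is a tie
  merged <- merge_pair(best)
  rownames(merged)[nrow(merged)] <- paste(rownames(tab)[best], collapse = "+")
  merged
}

# Toy table (made-up counts): rows a and b have identical profiles,
# as do, approximately, rows c and d.
tab <- matrix(c(20, 10,
                22, 11,
                 5, 30,
                 6, 28),
              ncol = 2, byrow = TRUE,
              dimnames = list(c("a", "b", "c", "d"), c("yes", "no")))

original <- chisq_stat(tab)
step <- tab
while (nrow(step) > 2) {
  step <- collapse_once(step)
  cat(nrow(step), "rows, share of original chi-squared kept:",
      round(chisq_stat(step) / original, 3), "\n")
}
```

Rows with similar profiles are merged first because combining them changes the statistic the least; `greenclust()` records the full merge sequence rather than stopping at a fixed number of rows.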

## Usage

```
greenclust(x, correct = FALSE, verbose = FALSE)
```

## Arguments

- `x`: a numeric matrix or data frame.
- `correct`: a logical indicating whether to apply a continuity correction if and when the clustered table reaches a 2x2 dimension.
- `verbose`: if `TRUE`, prints the clustered table along with r-squared and p-value at each step.
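As a rough illustration of what a continuity correction does to a 2x2 table, the sketch below uses base R's `chisq.test()`, whose own `correct` argument applies Yates' correction only to 2x2 tables. That `greenclust()` handles its `correct` argument in the same way is an assumption here, and the counts are made up.

```r
# A 2x2 table of made-up counts; this is the only shape for which
# chisq.test() applies the continuity correction.
tab2 <- matrix(c(12, 5,
                  7, 9),
               nrow = 2, byrow = TRUE,
               dimnames = list(c("group1", "group2"), c("yes", "no")))

uncorrected <- chisq.test(tab2, correct = FALSE)
corrected   <- chisq.test(tab2, correct = TRUE)

# The correction shrinks the statistic, which in turn raises the p-value.
c(uncorrected = unname(uncorrected$statistic),
  corrected   = unname(corrected$statistic))
c(uncorrected = uncorrected$p.value,
  corrected   = corrected$p.value)
```

Because the correction only matters at the 2x2 stage, it would affect at most the final steps of a clustering run, not the earlier merges.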

## Value

An object of class `greenclust`, which is compatible with most `hclust` object functions, such as `plot()` and `rect.hclust()`. The height vector represents the proportion of chi-squared, relative to the original table, seen at each clustering step. The greenclust object also includes a vector for the chi-squared test p-value at each step and a logical vector indicating whether the step had a tie for "winner".

## References

Greenacre, M.J. (1988) "Clustering the Rows and Columns of a Contingency Table," Journal of Classification 5, 39-51. https://doi.org/10.1007/BF01901670

## See Also

`greencut`, `greenplot`, `assign.cluster`

## Examples

```
# Combine Titanic passenger attributes into a single category
tab <- t(as.data.frame(apply(Titanic, 4:1, FUN=sum)))

# Remove rows with all zeros
tab <- tab[apply(tab, 1, sum) > 0, ]

# Perform clustering on contingency table
grc <- greenclust(tab)

# Plot r-squared and p-values for each potential cut point
greenplot(grc)

# Get clusters at suggested cut point
clusters <- greencut(grc)

# Plot dendrogram with clusters marked
plot(grc)
rect.hclust(grc, max(clusters))
```