hierarchical_cluster: Fit a Hierarchical Cluster
In trinker/hclustext: Optimized Hierarchical Clustering for Text Data

Fit a hierarchical cluster to text data. Before distances are calculated, tf-idf weighting (see weightTfIdf) is applied to the DocumentTermMatrix. Cosine dissimilarity is then used to generate the distance matrix supplied to hclust; method defaults to "ward.D2". The cosine dissimilarity is computed with a faster implementation under the hood (see cosine_distance), and hclust is used to calculate the fit quickly. In short, this is a wrapper function optimized for clustering text data.
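The steps described above (tf-idf weighting, cosine dissimilarity, then hclust) can be sketched in base R on a toy matrix. This is only an illustration of the idea, not the package's implementation: the term counts are invented, the tf-idf variant shown (length-normalized term frequency times log2 inverse document frequency) is one common choice and is an assumption here, and the plain matrix arithmetic stands in for the optimized cosine_distance.

```r
# Toy document-term matrix: rows are documents, columns are term counts.
# The terms and counts are made up for illustration.
dtm <- matrix(
    c(2, 1, 0, 0,
      1, 2, 0, 1,
      0, 0, 3, 1,
      0, 1, 2, 2),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("doc", 1:4), c("tax", "jobs", "energy", "oil"))
)

# tf-idf: term frequency normalized by document length, times log2(N / df).
tf    <- dtm / rowSums(dtm)
idf   <- log2(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine dissimilarity: 1 minus the cosine similarity of each pair of rows.
norms    <- sqrt(rowSums(tfidf^2))
cos_sim  <- tcrossprod(tfidf) / outer(norms, norms)
cos_dist <- as.dist(1 - cos_sim)

# Agglomerative clustering on the dissimilarities with Ward's method.
fit <- hclust(cos_dist, method = "ward.D2")

# Cut the tree into two groups of documents.
groups <- cutree(fit, k = 2)
groups
```

In this toy data, doc1 and doc2 share the "tax" and "jobs" vocabulary while doc3 and doc4 share "energy" and "oil", so the two-cluster cut separates them along those lines.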

hierarchical_cluster(x, method = "ward.D2", ...)

## S3 method for class 'data_store'
hierarchical_cluster(x, method = "ward.D2", ...)

`x`	The data to cluster (e.g., a `DocumentTermMatrix`, a `TermDocumentMatrix`, or a `data_store` object as in the examples).
`method`	The agglomeration method to be used. This must be (an unambiguous abbreviation of) one of `"single"`, `"complete"`, `"average"`, `"mcquitty"`, `"ward.D"`, `"ward.D2"`, `"centroid"`, or `"median"`.
`...`	Ignored.

Returns an object of class "hclust".
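Because the return value is an "hclust" object, the usual base R tools for that class should apply to it. The sketch below uses a stand-in fit built from invented distances (the labels `a` to `d` and the values are hypothetical) to show the main components and how a tree can be cut by a number of clusters `k` or by a height `h`. Whether assign_cluster relies on cutree internally is not stated here, so treat the parallel as an assumption.

```r
# Stand-in for a fitted model: any "hclust" object exposes the same components.
# The distances below are invented for illustration.
d <- as.dist(matrix(
    c(0.0, 0.2, 0.9, 0.8,
      0.2, 0.0, 0.7, 0.9,
      0.9, 0.7, 0.0, 0.1,
      0.8, 0.9, 0.1, 0.0),
    nrow = 4,
    dimnames = list(letters[1:4], letters[1:4])
))
fit <- hclust(d, method = "ward.D2")

fit$labels  # labels of the clustered items
fit$merge   # which items or clusters were joined at each step
fit$height  # dissimilarity at which each merge happened

# Cut the tree into a fixed number of clusters ...
by_k <- cutree(fit, k = 2)

# ... or at a fixed height on the dendrogram.
by_h <- cutree(fit, h = 0.5)
```

With these distances the two cuts agree: `a` and `b` fall in one cluster and `c` and `d` in the other, because the final merge happens well above a height of 0.5.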

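library(dplyr)
library(hclustext)

# Build a data store from the debate dialogue, grouped by person and time
x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

# Fit with the default method and plot, marking k = 4 clusters
hierarchical_cluster(x) %>%
    plot(k=4)

# Plot again, cutting at height h = .7 instead of by number of clusters
hierarchical_cluster(x) %>%
    plot(h=.7, lwd=2)

# Assign each element to a cluster using the same height cut
hierarchical_cluster(x) %>%
    assign_cluster(h=.7)

# Use a different agglomeration method
hierarchical_cluster(x, method="complete") %>%
    plot(k=6)

# Assign each element to one of k = 6 clusters
hierarchical_cluster(x) %>%
    assign_cluster(k=6)

# Build a data store from the dialogue alone, without a grouping variable
x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue))

myfit2 <- hierarchical_cluster(x2)

plot(myfit2)
plot(myfit2, 55)

assign_cluster(myfit2, k = 55)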