Fit a hierarchical cluster to text data. Before distance measures are calculated, tf-idf weighting (see weightTfIdf) is applied to the DocumentTermMatrix. Cosine dissimilarity is used to generate the distance matrix supplied to hclust. method defaults to "ward.D2". A faster cosine dissimilarity calculation is used under the hood (see cosine_distance). Additionally, hclust is used to quickly calculate the fit. Essentially, this is a wrapper function optimized for clustering text data.
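The pipeline described above (tf-idf weighting, cosine dissimilarity, then hclust) can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: the toy matrix `dtm` and its term names are made up, the tf-idf variant (length-normalized term frequency times log2 of inverse document frequency) is assumed, and the dissimilarity is computed directly rather than through cosine_distance.

```r
# Toy document-term matrix (hypothetical data): rows are documents,
# columns are term counts.
dtm <- matrix(
    c(2, 1, 0, 0,
      1, 2, 0, 0,
      0, 0, 3, 1,
      0, 1, 1, 2),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("doc", 1:4), c("tax", "jobs", "iran", "oil"))
)

# Assumed tf-idf variant: term frequency normalized by document length,
# multiplied by log2(number of documents / document frequency).
tf    <- dtm / rowSums(dtm)
idf   <- log2(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine dissimilarity: 1 minus the cosine similarity of each document pair.
norms   <- sqrt(rowSums(tfidf^2))
cos_sim <- tcrossprod(tfidf) / outer(norms, norms)
cos_dis <- stats::as.dist(1 - cos_sim)

# Agglomerative clustering on the dissimilarities with the "ward.D2" method.
fit    <- stats::hclust(cos_dis, method = "ward.D2")
groups <- stats::cutree(fit, k = 2)
```

In this toy data, doc1 and doc2 share the "tax" and "jobs" vocabulary while doc3 and doc4 share "iran" and "oil", so a two-cluster cut separates those pairs.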
hierarchical_cluster(x, method = "ward.D2", ...)

## S3 method for class 'data_store'
hierarchical_cluster(x, method = "ward.D", ...)
x        A data type (e.g.,
method   The agglomeration method to be used. This must be (an unambiguous abbreviation of) one of
...      Ignored.
Returns an object of class "hclust".
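Because the result is an "hclust" object, the standard tools in base R's stats package should apply to it. The sketch below uses a hypothetical hand-written dissimilarity matrix `d` rather than output from hierarchical_cluster, and shows the two usual ways of cutting a tree: by number of clusters (k) or by height (h).

```r
# Hypothetical dissimilarities between four documents (values in [0, 1]).
d <- stats::as.dist(matrix(
    c(0.0, 0.1, 0.9, 0.8,
      0.1, 0.0, 0.9, 0.9,
      0.9, 0.9, 0.0, 0.2,
      0.8, 0.9, 0.2, 0.0),
    nrow = 4,
    dimnames = list(letters[1:4], letters[1:4])
))

# Any agglomeration method accepted by hclust can be used here.
fit <- stats::hclust(d, method = "complete")

by_k <- stats::cutree(fit, k = 2)    # request a fixed number of clusters
by_h <- stats::cutree(fit, h = 0.5)  # cut the tree at a given height

# Convert to a dendrogram for further manipulation or plotting.
dend <- stats::as.dendrogram(fit)
```

Here both cuts give the same two groups, since the tree merges a with b and c with d well below a height of 0.5 and joins the two pairs only above it.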
library(dplyr)

x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

hierarchical_cluster(x) %>%
    plot(k = 4)

hierarchical_cluster(x) %>%
    plot(h = .7, lwd = 2)

hierarchical_cluster(x) %>%
    assign_cluster(h = .7)

hierarchical_cluster(x, method = "complete") %>%
    plot(k = 6)

hierarchical_cluster(x) %>%
    assign_cluster(k = 6)

x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue))

myfit2 <- hierarchical_cluster(x2)

plot(myfit2)
plot(myfit2, 55)

assign_cluster(myfit2, k = 55)