Fit a hierarchical cluster to text data. Before distance measures are calculated, tf-idf weighting (see weightTfIdf) is applied to the DocumentTermMatrix. Cosine dissimilarity is used to generate the distance matrix supplied to hclust. method defaults to "ward.D2". A faster cosine dissimilarity calculation is used under the hood (see cosine_distance). Additionally, hclust is used to quickly calculate the fit. Essentially, this is a wrapper function optimized for clustering text data.
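The steps described above (tf-idf weighting, cosine dissimilarity, then hclust) can be sketched with base R alone. This is an illustration under stated assumptions, not the package's implementation: the toy documents, the hand-rolled tf-idf formula (term frequency times log2(N / df)), and all variable names are made up for the example, and stats::hclust stands in for whatever routine the wrapper calls internally.

```r
# Toy corpus; documents and names are illustrative only.
docs <- c(
  a = "best way to waste money",
  b = "amazing stuff lets you stay connected all the time",
  c = "instrument to waste money and time"
)

# Document-term count matrix built with base R (rows = documents).
tokens <- strsplit(tolower(docs), "[^a-z]+")
vocab  <- sort(unique(unlist(tokens)))
dtm    <- t(sapply(tokens, function(tk) table(factor(tk, levels = vocab))))

# Assumed tf-idf: within-document term frequency times log2(N / df).
tf    <- dtm / rowSums(dtm)
idf   <- log2(nrow(dtm) / colSums(dtm > 0))
tfidf <- sweep(tf, 2, idf, `*`)

# Cosine dissimilarity = 1 - cosine similarity between document rows.
norms   <- sqrt(rowSums(tfidf^2))
cos_sim <- (tfidf %*% t(tfidf)) / (norms %o% norms)
cos_dis <- as.dist(1 - cos_sim)

# Agglomerative clustering on the dissimilarities.
fit <- stats::hclust(cos_dis, method = "ward.D2")
groups <- stats::cutree(fit, k = 2)
```

Documents a and c share several terms, so they end up in the same group when the tree is cut into two clusters.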
hierarchical_cluster(x, distance = "cosine", method = "ward.D2", ...)

## S3 method for class 'data_store'
hierarchical_cluster(x, distance = "cosine",
  method = "ward.D", ...)
x | A data store object (see …).
distance | A distance measure ("cosine" or "jaccard").
method | The agglomeration method to be used. This must be (an unambiguous abbreviation of) one of …
... | Ignored.
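For readers unfamiliar with the "jaccard" option, the sketch below shows one common definition of Jaccard dissimilarity on sets of terms: one minus the size of the intersection over the size of the union. How the package computes it is not stated here, so treat this as an assumption; jaccard_dissimilarity and the token vectors are hypothetical names for illustration. The same quantity is available from stats::dist with method = "binary" on a presence/absence matrix.

```r
# Hypothetical helper: Jaccard dissimilarity between two token vectors.
jaccard_dissimilarity <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  1 - length(intersect(a, b)) / length(union(a, b))
}

doc_x <- c("best", "way", "to", "waste", "money")
doc_y <- c("instrument", "to", "waste", "money", "and", "time")

by_hand <- jaccard_dissimilarity(doc_x, doc_y)

# Same value from base R: rows are documents, columns are term presence.
vocab    <- union(doc_x, doc_y)
presence <- rbind(x = vocab %in% doc_x, y = vocab %in% doc_y) * 1
by_dist  <- as.numeric(stats::dist(presence, method = "binary"))
```

Here three terms are shared out of eight distinct terms, so both calculations give 1 - 3/8.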
Returns an object of class "hclust".
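Because the return value is of class "hclust", the standard tools in the stats package should apply to it. The sketch below uses a small made-up dissimilarity matrix and stats::hclust as a stand-in for a fitted object; the matrix values and document labels are assumptions for illustration only.

```r
# Made-up dissimilarities between four documents (illustrative values).
labs <- paste0("doc", 1:4)
m <- matrix(
  c(0.00, 0.20, 0.90, 0.80,
    0.20, 0.00, 0.85, 0.90,
    0.90, 0.85, 0.00, 0.10,
    0.80, 0.90, 0.10, 0.00),
  nrow = 4, dimnames = list(labs, labs)
)

# Stand-in for the object returned by the clustering function.
fit <- stats::hclust(as.dist(m), method = "ward.D2")

fit$labels   # document labels carried over from the distance matrix
fit$height   # merge heights, one per agglomeration step

# Cut the tree by number of clusters or by height.
by_k <- stats::cutree(fit, k = 2)
by_h <- stats::cutree(fit, h = 0.5)

# Convert to a dendrogram for further manipulation or plotting.
dend <- stats::as.dendrogram(fit)
```

Cutting by k or by h mirrors the k and h arguments used with plot and assign_cluster in the examples.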
library(dplyr)

x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

hierarchical_cluster(x) %>%
    plot(k=4)

hierarchical_cluster(x) %>%
    plot(h=.7, lwd=2)

hierarchical_cluster(x) %>%
    assign_cluster(h=.7)

## Not run:
## interactive cutting
hierarchical_cluster(x) %>%
    plot(h=TRUE)

## End(Not run)

hierarchical_cluster(x, method="complete") %>%
    plot(k=6)

hierarchical_cluster(x) %>%
    assign_cluster(k=6)

x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue))

myfit2 <- hierarchical_cluster(x2)

plot(myfit2)
plot(myfit2, 55)

assign_cluster(myfit2, k = 55)

## Example from StackOverflow Question Response
## Asking for grouping similar texts together
## http://stackoverflow.com/q/22936951/1000343
dat <- data.frame(
    person = LETTERS[1:3],
    text = c("Best way to waste money",
        "Amazing stuff. lets you stay connected all the time",
        "Instrument to waste money and time"),
    stringsAsFactors = FALSE
)

x <- with(
    dat,
    data_store(text, person)
)

hierarchical_cluster(x) %>%
    plot(h=.9, lwd=2)

hierarchical_cluster(x) %>%
    assign_cluster(h=.9)

hierarchical_cluster(x) %>%
    assign_cluster(h=.9) %>%
    get_terms()

hierarchical_cluster(x) %>%
    assign_cluster(h=.9) %>%
    get_terms() %>%
    as_topic()

hierarchical_cluster(x) %>%
    assign_cluster(h=.9) %>%
    get_documents()