assign_cluster: Assign Clusters to Documents/Text Elements


Description

Assign clusters to documents/text elements.

Usage

assign_cluster(x, k = approx_k(get_dtm(x)), h = NULL, ...)

## S3 method for class 'hierarchical_cluster'
assign_cluster(x, k = approx_k(get_dtm(x)),
  h = NULL, ...)

Arguments

x

A hierarchical_cluster object.

k

The number of clusters (h can be supplied instead). Defaults to approx_k applied to the DocumentTermMatrix of x, i.e. approx_k(get_dtm(x)).

h

The height at which to cut the dendrogram (this determines the number of clusters). If this argument is supplied, k is ignored.

...

Ignored.
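The relationship between k and h can be illustrated with base R alone. The sketch below is not the package's implementation: it uses stats::hclust and stats::cutree on made-up toy data, and cut_sketch is a hypothetical helper that only mimics the documented rule that h takes precedence over k.

```r
## Toy data (made up): six "documents" described by two numeric features.
m <- rbind(
    doc1 = c(0.0, 0.1), doc2 = c(0.1, 0.0), doc3 = c(0.2, 0.1),
    doc4 = c(5.0, 5.1), doc5 = c(5.1, 5.0), doc6 = c(9.0, 0.0)
)
fit <- hclust(dist(m), method = "complete")

## Hypothetical helper (an assumption, not part of hclustext):
## if h is supplied it is used and k is ignored, as the argument docs describe.
cut_sketch <- function(fit, k = NULL, h = NULL) {
    if (!is.null(h)) return(cutree(fit, h = h))
    cutree(fit, k = k)
}

by_k    <- cut_sketch(fit, k = 3)         # ask for three clusters
by_h    <- cut_sketch(fit, h = 1)         # cut the tree at height 1
by_both <- cut_sketch(fit, k = 2, h = 1)  # h wins; k = 2 is ignored

by_k
by_h
by_both
```

On this toy data a cut at height 1 and a request for three clusters give the same partition, which is why either argument can be used to specify the solution.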

Value

Returns an assign_cluster object: a named vector of cluster assignments with documents as names. The object also contains the original data_storage object and a join function. join is a closure that captures the information about the assign_cluster object needed to rejoin the assignments to the original data set. The user simply passes the original data set to join: attributes(FROM_ASSIGN_CLUSTER)$join(ORIGINAL_DATA).
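To make the closure idea concrete, here is a minimal base-R sketch of how a join function stored as an attribute could capture a named assignment vector. It is an illustration under assumptions, not the package's code: make_join, the id column, and the toy data are all hypothetical, and the real join may match rows differently.

```r
## Hypothetical constructor (an assumption, not part of hclustext):
## returns a function that remembers `assignments` and adds them to a data set.
make_join <- function(assignments) {
    force(assignments)  # capture the vector now, not lazily
    function(data, id_col = "id") {
        ## look up each row's cluster by its document name
        data$cluster <- unname(assignments[as.character(data[[id_col]])])
        data
    }
}

## Toy named vector of cluster assignments with documents as names.
clusters <- c(a = 1L, b = 2L, c = 1L)
attr(clusters, "join") <- make_join(clusters)

## Toy "original data set".
original <- data.frame(
    id = c("a", "b", "c"),
    text = c("x", "y", "z"),
    stringsAsFactors = FALSE
)

## Same calling pattern as in the Value description.
joined <- attributes(clusters)$join(original)
joined
```

Because the closure carries the assignments with it, the caller only has to supply the original data.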

Examples

library(hclustext)
library(dplyr)

x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

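## plot the dendrogram for a cut at height .7
hierarchical_cluster(x) %>%
    plot(h = .7, lwd = 2)

## assign clusters by cutting at height .7
hierarchical_cluster(x) %>%
    assign_cluster(h = .7)

## plot the dendrogram for 6 clusters, fit with method = "complete"
hierarchical_cluster(x, method = "complete") %>%
    plot(k = 6)

## assign documents to 6 clusters
hierarchical_cluster(x) %>%
    assign_cluster(k = 6)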

x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue)) %>%
    hierarchical_cluster()

ca <- assign_cluster(x2, k = 55)
summary(ca)

## add to original data
attributes(ca)$join(presidential_debates_2012)

## split text into clusters
get_text(ca)

trinker/hclustext documentation built on May 31, 2019, 8:50 p.m.