assign_cluster: Assign Clusters to Documents/Text Elements

Description

Assign clusters to documents/text elements.

Usage

assign_cluster(x, k = approx_k(get_dtm(x)), h = NULL, ...)

## S3 method for class 'hierarchical_cluster'
assign_cluster(x, k = approx_k(get_dtm(x)),
  h = NULL, cut = "static", deepSplit = TRUE, minClusterSize = 1, ...)

## S3 method for class 'kmeans_cluster'
assign_cluster(x, ...)

## S3 method for class 'skmeans_cluster'
assign_cluster(x, ...)

## S3 method for class 'nmf_cluster'
assign_cluster(x, ...)

Arguments

x

An xxx_cluster object (e.g., hierarchical_cluster, kmeans_cluster, skmeans_cluster, or nmf_cluster).

k

The number of clusters (h may be supplied instead). Defaults to approx_k applied to the DocumentTermMatrix produced by data_storage.

h

The height at which to cut the dendrogram (determines the number of clusters). If this argument is supplied, k is ignored.

cut

The type of cut method to use for hierarchical_cluster; one of 'static', 'dynamic' or 'iterative'.

deepSplit

Logical. See cutreeDynamic.

minClusterSize

The minimum cluster size. See cutreeDynamic.

...

ignored.
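
The relationship between k and h can be illustrated with base R alone. The sketch below is not clustext code: it assumes that a static cut behaves like stats::cutree on an hclust tree, and the toy matrix m is made up for illustration. Cutting by a count (k) and cutting by a height (h) are two routes to the same kind of result, a named vector of cluster labels.

```r
# Sketch only: base R stand-in for a static cut of a dendrogram.
# The data are invented; two tight, well-separated groups of three "documents".
m <- rbind(
    c(0.0, 0.0), c(0.1, 0.0), c(0.0, 0.1),
    c(5.0, 5.0), c(5.1, 5.0), c(5.0, 5.1)
)
rownames(m) <- paste0("doc", 1:6)

# Build the tree from pairwise Euclidean distances
tree <- hclust(dist(m), method = "complete")

# Ask for a fixed number of clusters ...
by_k <- cutree(tree, k = 2)

# ... or cut at a height; the height determines how many clusters result
by_h <- cutree(tree, h = 1)

# Both are named integer vectors with documents as names
print(by_k)
print(table(by_h))
```

Here a height of 1 falls between the small within-group merge heights and the large between-group merge height, so both calls yield the same two groups.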

Value

Returns an assign_cluster object: a named vector of cluster assignments with documents as names. The object also contains the original data_storage object and a join function. join is a closure that captures information about the assign_cluster object, which makes rejoining the assignments to the original data set simple: the user supplies the original data set as the argument to join, i.e., attributes(FROM_ASSIGN_CLUSTER)$join(ORIGINAL_DATA).
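
The closure idea behind join can be sketched in base R. This is a hypothetical illustration, not the package's implementation: make_join, the id_col argument, and the toy data are all assumptions, and matching rows by an id column is only one way such a join could work.

```r
# Hypothetical helper (not part of clustext): build a closure that remembers
# cluster assignments and attaches them to a data frame on request.
make_join <- function(assignments) {
    force(assignments)  # capture the value now, in the closure's environment
    function(original, id_col = "id") {
        # Look up each row's cluster by its document name
        ids <- as.character(original[[id_col]])
        original$cluster <- unname(assignments[ids])
        original
    }
}

# Toy cluster assignments: a named vector with documents as names
clusters <- c(doc1 = 1L, doc2 = 2L, doc3 = 1L)
attr(clusters, "join") <- make_join(clusters)

# Toy "original data set"
original <- data.frame(
    id = c("doc1", "doc2", "doc3"),
    text = c("a", "b", "c"),
    stringsAsFactors = FALSE
)

# Same calling pattern as attributes(FROM_ASSIGN_CLUSTER)$join(ORIGINAL_DATA)
joined <- attributes(clusters)$join(original)
print(joined)
```

Because the assignments live in the closure's environment, the caller only has to pass the original data; nothing else about the clustering needs to be supplied again.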

Examples

## Not run: 
library(dplyr)

x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

hierarchical_cluster(x) %>%
    plot(h=.7, lwd=2)

hierarchical_cluster(x) %>%
    assign_cluster(h=.7)

hierarchical_cluster(x, method="complete") %>%
    plot(k=6)

hierarchical_cluster(x) %>%
    assign_cluster(k=6)


x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue)) %>%
    hierarchical_cluster()

ca2 <- assign_cluster(x2, k = 55)
summary(ca2)

## Dynamic cut
ca3 <- assign_cluster(x2, cut = 'dynamic', minClusterSize = 5)
get_text(ca3)

## add to original data
attributes(ca2)$join(presidential_debates_2012)

## split text into clusters
get_text(ca2)

## Kmeans Algorithm
kmeans_cluster(x, k=6) %>%
    assign_cluster()

x3 <- presidential_debates_2012 %>%
    with(data_store(dialogue)) %>%
    kmeans_cluster(55)

ca3 <- assign_cluster(x3)
summary(ca3)

## split text into clusters
get_text(ca3)

## End(Not run)

trinker/clustext documentation built on May 31, 2019, 8:41 p.m.