kmeans_cluster: Fit a Kmeans Cluster

Description

Fit a k-means cluster model to text data. Before distance measures are calculated, tf-idf weighting (see weightTfIdf) is applied to the DocumentTermMatrix.
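The weighting-then-clustering idea can be sketched in base R. This is an illustration only, not the package's internal code: the toy matrix `dtm` and its term names are made up, and the tf-idf formula (length-normalized term frequency times log2 inverse document frequency) is an assumption meant to approximate the default behaviour of weightTfIdf.

```r
# Toy document-term matrix: rows are documents, columns are terms.
# (Hypothetical data, not taken from presidential_debates_2012.)
dtm <- matrix(
    c(3, 0, 1, 0,
      2, 0, 2, 0,
      0, 4, 1, 0,
      0, 3, 1, 1),
    nrow = 4, byrow = TRUE,
    dimnames = list(paste0("doc", 1:4), c("tax", "health", "jobs", "war"))
)

# Term frequency: counts divided by document length
tf <- dtm / rowSums(dtm)

# Inverse document frequency: log2(number of docs / docs containing the term)
idf <- log2(nrow(dtm) / colSums(dtm > 0))

# tf-idf: scale each term column by its idf
tfidf <- sweep(tf, 2, idf, `*`)

# Cluster the weighted rows; nstart reruns from several random starts
set.seed(1)
fit <- stats::kmeans(tfidf, centers = 2, nstart = 10)
fit$cluster
```

A term that occurs in every document ("jobs" here) gets an idf of zero, so it no longer influences the distances between documents.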

Usage

kmeans_cluster(x, k = hclustext::approx_k(get_dtm(x)), ...)

## S3 method for class 'data_store'
kmeans_cluster(x, k = hclustext::approx_k(get_dtm(x)),
  ...)

Arguments

x

A data type (e.g., DocumentTermMatrix or TermDocumentMatrix).

k

The number of clusters. Defaults to approx_k applied to the DocumentTermMatrix produced by data_store.

...

Other arguments passed to kmeans.
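As a rough sketch of what passing extra arguments through `...` can look like, the hypothetical wrapper `fit_kmeans` below forwards them to stats::kmeans. The wrapper and the simulated matrix `mat` are assumptions for illustration; `nstart` and `iter.max` are real stats::kmeans arguments.

```r
# Hypothetical wrapper: everything in `...` is handed to stats::kmeans
fit_kmeans <- function(mat, k, ...) {
    stats::kmeans(mat, centers = k, ...)
}

# Simulated data: two well-separated groups of 20 points each
set.seed(42)
mat <- rbind(
    matrix(rnorm(40, mean = 0), ncol = 2),
    matrix(rnorm(40, mean = 5), ncol = 2)
)

# nstart: number of random starts; iter.max: iteration cap per start
fit <- fit_kmeans(mat, k = 2, nstart = 25, iter.max = 50)
fit$size
```

Assuming kmeans_cluster forwards `...` in the same way, a call such as `kmeans_cluster(x, k = 6, nstart = 25)` would request 25 random starts.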

Value

Returns an object of class "kmeans".
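For orientation, the sketch below inspects the components of a "kmeans" object as returned by stats::kmeans. The input matrix `m` is made up, and whether kmeans_cluster attaches further attributes to the returned object is not covered here.

```r
# Made-up numeric matrix with 30 rows and 3 columns
set.seed(7)
m <- matrix(runif(90), ncol = 3)

fit <- stats::kmeans(m, centers = 3, nstart = 5)

class(fit)         # object class
fit$cluster        # integer cluster id for each row
fit$centers        # one row of coordinates per cluster center
fit$size           # number of rows in each cluster
fit$tot.withinss   # total within-cluster sum of squares
fit$betweenss      # between-cluster sum of squares
fit$totss          # total sum of squares
```

The within-cluster and between-cluster sums of squares add up to `totss`, which gives a quick check on how much variation a given k explains.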

Examples

library(dplyr)       # provides the %>% pipe
library(kmeanstext)  # provides kmeans_cluster() and its helpers

x <- with(
    presidential_debates_2012,
    data_store(dialogue, paste(person, time, sep = "_"))
)

## K predicted
kmeans_cluster(x)

## 6 cluster model
kmeans_cluster(x, k=6)

kmeans_cluster(x, k=6) %>%
    assign_cluster()

kmeans_cluster(x, k=6) %>%
    assign_cluster() %>%
    summary()

x2 <- presidential_debates_2012 %>%
    with(data_store(dialogue))

myfit2 <- kmeans_cluster(x2, 55)

assign_cluster(myfit2)

assign_cluster(myfit2) %>%
    summary()

trinker/kmeanstext documentation built on May 31, 2019, 8:51 p.m.