get_terms: Get Terms Based on Cluster Assignment in 'assign_cluster'

Description

Get the terms associated with each of the k clusters. Terms are weighted either by tf-idf or by the weights returned from the model, and the weights are min/max scaled.

Usage

get_terms(x, min.weight = 0.6, nrow = NULL, ...)

## S3 method for class 'assign_cluster_hierarchical'
get_terms(x, min.weight = 0.6,
  nrow = NULL, ...)

## S3 method for class 'assign_cluster_kmeans'
get_terms(x, min.weight = 0.6, nrow = NULL,
  ...)

## S3 method for class 'assign_cluster_skmeans'
get_terms(x, min.weight = 0.6, nrow = NULL,
  ...)

## S3 method for class 'assign_cluster_nmf'
get_terms(x, min.weight = 0.6, nrow = NULL,
  ...)

Arguments

x

An assign_cluster object.

min.weight

The lowest min/max scaled tf-idf weight at which a term is still considered one of a document's salient terms.

nrow

The maximum number of rows to display in each of the returned data.frames.

...

Ignored.
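The interplay of min.weight and nrow can be illustrated with a minimal base R sketch. It is not the package's internal code: the weights are made-up values, the column names term and weight are placeholders, and whether the package compares with >= or > is an assumption.

```r
# Hypothetical tf-idf weights for the terms of one cluster (made-up values)
weights <- c(tax = 4.2, jobs = 3.1, energy = 1.5, plan = 0.9)

# Min/max scaling: the smallest weight maps to 0, the largest to 1
scaled <- (weights - min(weights)) / (max(weights) - min(weights))

# Keep only terms at or above the threshold, largest first
# (assumption: the cutoff is inclusive)
min.weight <- 0.6
salient <- sort(scaled[scaled >= min.weight], decreasing = TRUE)

# Put the result in a data.frame and cap the row count, as nrow would
top_terms <- head(data.frame(term = names(salient), weight = unname(salient)), 1)
top_terms
```

With these values only "tax" and "jobs" clear the 0.6 threshold, and the row cap of 1 then keeps just the top-weighted term.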

Value

Returns a list of data.frames of top weighted terms.
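A returned list of this shape can be flattened with base R alone, as in the sketch below. The list is a hypothetical stand-in for get_terms() output: the column names term and weight are placeholders, and the NULL element assumes that a cluster with no term above min.weight yields NULL, which the is.null filtering in the examples suggests.

```r
# Hypothetical stand-in for get_terms() output (column names are assumed)
terms_list <- list(
  `1` = data.frame(term = c("tax", "jobs"), weight = c(1, 0.67)),
  `2` = NULL,  # assumed: a cluster with no salient terms
  `3` = data.frame(term = "energy", weight = 1)
)

# Drop the empty clusters
non_empty <- Filter(Negate(is.null), terms_list)

# Tag each data.frame with its cluster name and stack them
combined <- do.call(
  rbind,
  Map(function(d, nm) cbind(Topic = nm, d), non_empty, names(non_empty))
)
combined
```

The textshape::tidy_list() call in the examples does the same job in one step; the base R version only spells out what happens to the list.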

Examples

library(dplyr)
library(textshape)

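## Cluster the debate dialogue, cut the tree into 55 clusters,
## and pull the salient terms for each cluster
myterms <- presidential_debates_2012 %>%
    with(data_store(dialogue)) %>%
    hierarchical_cluster() %>%
    assign_cluster(k = 55) %>%
    get_terms()

myterms

## Drop clusters without terms and stack the rest into one data.frame
textshape::tidy_list(myterms[!sapply(myterms, is.null)], "Topic")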
## Not run: 
library(ggplot2)
library(gridExtra)
library(dplyr)
library(textshape)
library(wordcloud)

max.n <- max(textshape::tidy_list(myterms)[["n"]])

myplots <- Map(function(x, y){
    x %>%
        mutate(term = factor(term, levels = rev(term))) %>%
        ggplot(aes(term, weight=n)) +
            geom_bar() +
            scale_y_continuous(expand = c(0, 0),limits=c(0, max.n)) +
            ggtitle(sprintf("Topic: %s", y)) +
            coord_flip()
}, myterms, names(myterms))

myplots[["ncol"]] <- 10

do.call(gridExtra::grid.arrange, myplots[!sapply(myplots, is.null)])

## Word clouds, one per topic
par(mfrow=c(5, 11), mar=c(0, 4, 0, 0))
Map(function(x, y){
    wordcloud::wordcloud(x[[1]], x[[2]], scale=c(1,.25),min.freq=1)
    mtext(sprintf("Topic: %s", y), col = "blue", cex=.55, padj = 1.5)
}, myterms, names(myterms))

## End(Not run)

trinker/clustext documentation built on May 31, 2019, 8:41 p.m.