Topicanalysis: Topicanalysis Class.


Description

Analyse topicmodels.

Arguments

new

New value for a label or a category.

n

Number of a topic.

n_words

An integer, the number of words to be displayed in a wordcloud.

x

Number or name of a topic.

y

Number or name of a topic cooccurring with x.

k

Number of top topics of a document to consider.

exclude

A logical value, whether to exclude the topics earmarked in the logical vector in the field exclude.

aggregation

Level of aggregation of as.zoo method.

...

Further parameters passed to worker function (such as wordcloud::wordcloud when calling $wordcloud(), for instance).

regex

A regular expression that will limit the evaluation to those documents only that are matched by the regular expression.
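To illustrate the kind of filtering the regex argument describes, here is a minimal base R sketch. It assumes that the regular expression is matched against document names and that these follow the who_date_number pattern used in the Examples; the document names themselves are made up.

```r
# Hypothetical document names following the pattern who_date_number.
doc_ids <- c(
  "Mueller_2015-03-12_1",
  "Schmidt_2015-11-05_2",
  "Mueller_2016-01-21_1"
)

# Keep only those documents whose name is matched by the regular expression.
regex <- "^Mueller_"
keep <- grepl(regex, doc_ids)
selected <- doc_ids[keep]
print(selected)
```

Any evaluation would then run on the selected documents only.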

Methods

$initialize(topicmodel)

Instantiate new Topicanalysis object. Upon initialization, labels will be the plain numbers of the topics, all exclude values are FALSE.

$cooccurrences(k = 3, regex = NULL, docs = NULL, renumber = NULL, progress = TRUE, verbose = FALSE, exclude = TRUE)

Get cooccurrences of topics. Arguments are documented with the S4 cooccurrences-method for TopicModel-objects.
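As a rough illustration of what counting cooccurrences among the top k topics of each document can look like, the following base R sketch works on a made-up document-topic matrix. The matrix gamma, the helper top_k_topics() and the counting scheme are assumptions for illustration only. They are not the implementation behind $cooccurrences(), which also reports statistics such as the chisquare column used in the Examples.

```r
# Toy document-topic matrix (hypothetical values): rows are documents, columns are topics.
gamma <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.10, 0.45, 0.40, 0.05,
    0.40, 0.35, 0.05, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc_1", "doc_2", "doc_3"), as.character(1:4))
)

# Hypothetical helper: numbers of the k most probable topics of one document.
top_k_topics <- function(probs, k) order(probs, decreasing = TRUE)[seq_len(k)]

k <- 2L
# Matrix with k rows and one column per document, holding topic numbers.
top <- apply(gamma, 1, top_k_topics, k = k)

# For each document, enumerate all unordered pairs among its top k topics.
pairs <- unlist(lapply(seq_len(ncol(top)), function(i) {
  combn(sort(top[, i]), 2, FUN = paste, collapse = "-")
}))

# Count in how many documents each pair of topics cooccurs.
cooc <- table(pairs)
print(cooc)
```

In this toy data, topics 1 and 2 share the top two ranks in two documents, topics 2 and 3 in one.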

$relabel(n, new)

Relabel topic n, assigning new label new.

$add_category(new)

Add new, a character vector, as a new category to the character vector in the field categories.

$ignorance(n, new)

Exclude topic n (i.e. add to ignore).

$wordcloud(n, n_words = 50, ...)

Generate wordcloud for topic n, with n_words words. Further arguments can be passed into wordcloud::wordcloud using the three dots.

$docs(x, y = NULL, n = 3L, s_attributes = NULL, regex = NULL)

Get documents where topic x occurs among the top n topics. If y is provided, documents are returned where both x and y are among the top n topics. If x or y is provided as a character vector, the method will look up this label in the labels field.
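The selection logic described for $docs() can be sketched in base R as follows. The matrix gamma, the labels and the helper docs_with_topic() are hypothetical and only mirror the description above (label lookup, then a test for membership among the top n topics); the actual method may differ in its details.

```r
# Toy document-topic matrix (hypothetical values): rows are documents, columns are topics.
gamma <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.10, 0.45, 0.40, 0.05,
    0.40, 0.35, 0.05, 0.20),
  nrow = 3, byrow = TRUE,
  dimnames = list(c("doc_1", "doc_2", "doc_3"), as.character(1:4))
)
labels <- c("Asyl", "Europa", "Haushalt", "Verkehr")  # hypothetical topic labels

# Hypothetical helper: names of documents with topic x (and optionally y) among the top n.
docs_with_topic <- function(gamma, labels, x, y = NULL, n = 3L) {
  # A character value is looked up in the labels, a number is used as is.
  resolve <- function(z) if (is.character(z)) match(z, labels) else as.integer(z)
  # For every document, test whether the topic is among its n most probable topics.
  in_top_n <- function(topic) {
    apply(gamma, 1, function(p) topic %in% order(p, decreasing = TRUE)[seq_len(n)])
  }
  keep <- in_top_n(resolve(x))
  if (!is.null(y)) keep <- keep & in_top_n(resolve(y))
  rownames(gamma)[keep]
}

print(docs_with_topic(gamma, labels, x = "Asyl", n = 2L))
print(docs_with_topic(gamma, labels, x = 2L, y = "Haushalt", n = 2L))
```

Passing a label or a number for x leads to the same lookup, which is why the Examples can call $docs() with either.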

$read(x, n = 3, no_token = 100)

Read document x, highlighting the top n topics of the document, with no_token tokens used to indicate each topic.

$as.zoo(x = NULL, y = NULL, k = 3, exclude = TRUE, aggregation = c(NULL, "month", "quarter", "year"))

Generate zoo object from topicmodel.

$compare(x, ...)

Compare the similarity of two topicmodels.

$find_topics(x, n = 100, word2vec = NULL)

Find a topic.

Public fields

topicmodel

A topicmodel of class TopicModel, generated from package topicmodels.

posterior

Slot to store posterior, not used at this point.

terms

The matrix with the terms of a topicmodel. Keeping the terms may speed up subsequent operations.

topics

The matrix with the topics present in documents. Keeping this matrix may speed up subsequent operations.

bundle

A partition_bundle, required to use method read to access full text.

labels

A character vector, labels for the topics.

name

A name for the Topicanalysis object. Useful if combining several objects into a bundle.

categories

A character vector with categories.

grouping

Not used at this stage.

exclude

Topics to exclude from further analysis.

type

Corpus type, necessary for applying correct template for fulltext output.

Methods

Public methods


Method new()

Usage
Topicanalysis$new(topicmodel)

Method relabel()

Usage
Topicanalysis$relabel(n, new)

Method add_category()

Usage
Topicanalysis$add_category(new)

Method ignorance()

Usage
Topicanalysis$ignorance(n, new)

Method cooccurrences()

Usage
Topicanalysis$cooccurrences(
  k = 3,
  regex = NULL,
  docs = NULL,
  renumber = NULL,
  progress = TRUE,
  verbose = FALSE,
  exclude = TRUE
)

Method wordcloud()

Usage
Topicanalysis$wordcloud(n, n_words = 50, ...)

Method docs()

Usage
Topicanalysis$docs(x, y = NULL, n = 3L, s_attributes = NULL, regex = NULL)

Method read()

Usage
Topicanalysis$read(x, n = 3L, no_token = 100L)

Method as.zoo()

Usage
Topicanalysis$as.zoo(
  x = NULL,
  y = NULL,
  k = 3L,
  exclude = TRUE,
  aggregation = c(NULL, "month", "quarter", "year")
)

Method compare()

Usage
Topicanalysis$compare(...)

Method find_topics()

Usage
Topicanalysis$find_topics(x, n = 100, word2vec = NULL)

Method clone()

The objects of this class are cloneable with this method.

Usage
Topicanalysis$clone(deep = FALSE)
Arguments
deep

Whether to make a deep clone.

Examples

data(BE_lda)
data(BE_labels)
data(BE_exclude)

BE <- Topicanalysis$new(topicmodel = BE_lda)
BE$labels <- BE_labels
BE$exclude <- BE_exclude
BE$exclude <- grepl("^\\((split|)\\)$", BE$labels)
BE$name <- "Berlin"
BE$type <- "plpr_partition"

z <- BE$as.zoo(x = "Flucht, Asyl, vorläufiger Schutz", aggregation = "year")
plot(z)

y <- BE$as.zoo(
  x = grep("Asyl", BE_labels),
  y = grep("Europ", BE_labels),
  aggregation = "year"
)
plot(y)

BE$exclude <- grepl("^\\(.*?\\)$", BE$labels)
dt <- BE$cooccurrences(k = 3L, exclude = TRUE)
dt_min <- dt[chisquare >= 10.83]


if (requireNamespace("igraph")){
  g <- igraph::graph_from_data_frame(
    d = data.frame(
      from = dt_min[["a_label"]],
      to = dt_min[["b_label"]],
      n = dt_min[["count_coi"]],
      stringsAsFactors = FALSE
    ),
    directed = TRUE
  )
  g <- igraph::as.undirected(g, mode = "collapse")
  if (interactive()){
    igraph::plot.igraph(
      g, shape = "square", vertex.color = "steelblue",
      label = igraph::V(g)$name, label.family = 11, label.cex = 0.5
    )
  }
}

topic_flucht <- 125L
topic_integration <- 241L
BE$docs(x = "Flucht, Asyl, vorläufiger Schutz")
BE$docs(x = grep("Flucht", BE$labels))
BE$docs(x = topic_flucht)
docs <- BE$docs(x = topic_flucht, y = topic_integration)

## Not run: 
li <- lapply(
  docs, 
  function(doc){
    polmineR::as.speeches(
      polmineR::partition(
        "BE",
        who = gsub("^(.*?)_.*$", "\\1", doc),
        date = gsub("^.*(\\d{4}-\\d{2}-\\d{2})_\\d+$", "\\1", doc)
      ),
      s_attribute_name = "who"
    )[[as.integer(gsub("^.*?_(\\d+)$", "\\1", doc))]]
})
BE$bundle <- as.partition_bundle(li)

read(BE$topicmodel, BE$bundle[[1]], n = 3L, no_token = 250)
read(BE$topicmodel, BE$bundle[[2]], n = 3L, no_token = 250)
read(BE$topicmodel, BE$bundle[[3]], n = 3L, no_token = 250)
for (doc in docs){
  print(doc)
  p <- BE$bundle[[doc]]
  read(BE$topicmodel, p, n = 3L, no_token = 250)
  readline(prompt = "Hit any key to continue.")
}

## End(Not run)

#############################

data(SL_lda)
data(SL_labels)
data(SL_exclude)

SL <- Topicanalysis$new(topicmodel = SL_lda)
SL$labels <- SL_labels
SL$exclude <- SL_exclude
SL$exclude <- grepl("^\\((split|)\\)$", SL$labels)
SL$name <- "Hamburg"

cp_1 <- BE$compare(SL, BE)
cp_2 <- BE$compare(SL, BE)

PolMine/polmineR.topics documentation built on March 6, 2020, 6:03 p.m.