Topic models describe documents as composed of different topics. This property can be used to obtain co-occurrence statistics of topics.
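The idea can be sketched in base R. This is a minimal illustration, not the package's implementation: it assumes that a topic "occurs" in a document when it is among that document's `k` most probable topics, and the matrix `gamma` with its values is invented for the example.

```r
# Toy document-topic probabilities (rows = documents, columns = topics).
# The numbers are invented for illustration only.
gamma <- matrix(
  c(0.50, 0.30, 0.15, 0.05,
    0.10, 0.45, 0.40, 0.05,
    0.40, 0.35, 0.05, 0.20),
  nrow = 3, byrow = TRUE
)
k <- 2L

# Assumption: a topic occurs in a document if it is among the k most
# probable topics of that document. top_k is a k x n_docs matrix.
top_k <- apply(gamma, 1, function(p) order(p, decreasing = TRUE)[seq_len(k)])

# Binary incidence matrix (documents x topics).
incidence <- matrix(0L, nrow = nrow(gamma), ncol = ncol(gamma))
for (i in seq_len(nrow(gamma))) incidence[i, top_k[, i]] <- 1L

# Cell [a, b] holds the number of documents in which topics a and b
# occur together; the diagonal holds the total occurrences per topic.
joint <- crossprod(incidence)
joint
```

Here topics 1 and 2 occur together in two documents, topics 2 and 3 in one, and topics 1 and 3 in none.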
## S4 method for signature 'TopicModel'
cooccurrences(
  .Object,
  k,
  regex = NULL,
  docs = NULL,
  renumber = NULL,
  method = "chisquare",
  progress = TRUE,
  verbose = TRUE
)

## S4 method for signature 'matrix'
cooccurrences(
  .Object,
  regex = NULL,
  docs = NULL,
  renumber = NULL,
  method = "chisquare",
  progress = TRUE,
  verbose = TRUE
)
.Object |
Either an object inheriting from the |
k |
An |
regex |
If not |
docs |
If not |
renumber |
If not |
method |
The statistic to calculate co-occurrences, "chisquare" by default. |
progress |
A |
verbose |
A |
A data.table with co-occurrence statistics with at least the following columns:
number of the topic of interest
number of the co-occurring topic
number of total occurrences of topic b; if the document-topic matrix has been renumbered, the number of documents in which at least one of the topics in a group occurs
number of total occurrences of topic a; if the document-topic matrix has been renumbered, the number of documents in which at least one of the topics in a group occurs
number of joint occurrences of topics a and b
number of occurrences of b without co-occurrence of a
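The counting rule for renumbered (grouped) topics can be illustrated with a short base R sketch. It is not the package's code: the incidence matrix and the `groups` list are hypothetical, and the sketch only shows the rule stated above, namely that a group counts as occurring in a document if at least one of its topics does.

```r
# Invented incidence matrix: rows = documents, columns = topics 1 to 4.
incidence <- matrix(
  c(1, 0, 0, 1,
    0, 1, 0, 0,
    0, 0, 1, 1,
    1, 1, 0, 0),
  nrow = 4, byrow = TRUE
)

# Hypothetical grouping: topics 1 and 2 represent the same concept.
groups <- list(school = c(1L, 2L), traffic = 4L)

# A group occurs in a document if at least one of its topics occurs there.
group_incidence <- sapply(groups, function(cols) {
  as.integer(rowSums(incidence[, cols, drop = FALSE]) > 0)
})

# Total occurrences per group (number of documents).
group_counts <- colSums(group_incidence)
group_counts
```

In this toy data the group "school" occurs in three documents, although neither of its member topics occurs in more than two.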
If argument method is not NULL, additional columns will be included in the topic co-occurrence table. E.g. if method is "chisquare", a column "exp_coi" will report the expected number of occurrences of b together with a, a column "chisquare" will report the chi-squared test statistic, and a column "rank_chisquare" will report the rank of the statistical significance of the co-occurrence of a and b according to the chi-squared test.
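To make the relation between observed and expected counts concrete, here is a textbook Pearson chi-squared test on a 2x2 table in base R. This is a sketch under assumptions: all counts are invented, and the package may compute its "chisquare" column with a different formulation.

```r
# Hypothetical counts, invented for illustration.
n_docs    <- 1000  # total number of documents
count_a   <- 100   # documents in which topic a occurs
count_b   <- 200   # documents in which topic b occurs
count_coi <- 50    # documents in which a and b occur together

# Expected number of joint occurrences if a and b were independent.
exp_coi <- count_a * count_b / n_docs

# 2x2 contingency table: rows = a present/absent, columns = b present/absent.
observed <- matrix(
  c(count_coi,           count_a - count_coi,
    count_b - count_coi, n_docs - count_a - count_b + count_coi),
  nrow = 2, byrow = TRUE
)

# Pearson chi-squared test without continuity correction.
test <- chisq.test(observed, correct = FALSE)
chisq_value <- unname(test$statistic)
chisq_value
```

With these numbers the expected joint count is 20 against 50 observed, which yields a chi-squared value of 62.5.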
data(BE_lda, BE_labels)
dt <- cooccurrences(BE_lda, k = 3L)
topics_to_drop <- grep("^\\(.*?\\)$", BE_labels)
dt_min <- dt[chisquare >= 10.83][!a %in% topics_to_drop][!b %in% topics_to_drop]
dt_min[, "a_label" := BE_labels[ dt_min[["a"]] ] ]
dt_min[, "b_label" := BE_labels[ dt_min[["b"]] ] ]
# Using the cooccurrence data for generating a network visualisation
if (requireNamespace("igraph")){
  g <- igraph::graph_from_data_frame(
    d = data.frame(
      from = dt_min[["a_label"]],
      to = dt_min[["b_label"]],
      n = dt_min[["count_coi"]],
      stringsAsFactors = FALSE
    ),
    directed = TRUE
  )
  g <- igraph::as.undirected(g, mode = "collapse")
  if (interactive()){
    igraph::plot.igraph(
      g, shape = "square", vertex.color = "steelblue",
      label = igraph::V(g)$name, label.family = 11, label.cex = 0.5
    )
  }
}
# Example of how to use the argument 'renumber' if a concept is represented by
# several topics
renumber_li <- list(
  school = grep("Grundschule", BE_labels),
  community = grep("Gemeindeentwicklung", BE_labels),
  traffic = grep("Verkehrsmittel", BE_labels)
)
dt <- cooccurrences(BE_lda, k = 3L, renumber = renumber_li)
dt[a == grep("Grundschule", BE_labels)[1]][chisquare > 10.83]
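The cutoff 10.83 used in the filters above matches the critical value of the chi-squared distribution at the 0.001 significance level. The sketch below assumes one degree of freedom, as for a 2x2 contingency table; the examples themselves do not state this.

```r
# Critical values of the chi-squared distribution with 1 degree of freedom
# (assumption: the test is based on a 2x2 table).
threshold_001 <- qchisq(0.999, df = 1)  # significance level 0.001
threshold_05  <- qchisq(0.95, df = 1)   # significance level 0.05

round(threshold_001, 2)
round(threshold_05, 2)
```

A less strict filter could use the 0.05 level instead, at the cost of retaining more weakly associated topic pairs.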