cosineTopics | R Documentation |
Calculates the similarity of all pairwise topic combinations using the Cosine Similarity.
cosineTopics(topics, progress = TRUE, pm.backend, ncpus)
topics |
[ |
progress |
[ |
pm.backend |
[ |
ncpus |
[ |
The Cosine Similarity for two topics \bm z_{i} and \bm z_{j} is calculated by
\cos(\theta \mid \bm z_{i}, \bm z_{j}) = \frac{ \sum_{v=1}^{V} n_{i}^{(v)} n_{j}^{(v)} }{ \sqrt{\sum_{v=1}^{V} \left(n_{i}^{(v)}\right)^2} \sqrt{\sum_{v=1}^{V} \left(n_{j}^{(v)}\right)^2} }
where \theta is the angle between the corresponding count vectors \bm z_{i} and \bm z_{j}, V is the vocabulary size, and n_k^{(v)} is the number of assignments of the v-th word to the k-th topic.
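The formula can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy matrix `counts`, its word-by-topic layout (words in rows, topics in columns), and all values are assumptions made for illustration only.

```r
# Toy word-by-topic count matrix: rows are words, columns are topics.
# Layout and values are illustrative assumptions, not package data.
counts <- matrix(
  c(3, 0, 1,
    0, 2, 0,
    1, 1, 4,
    0, 5, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("word", 1:4), paste0("topic", 1:3))
)

# Numerator: dot products of every pair of topic columns.
dots <- crossprod(counts)

# Denominator: product of the Euclidean norms of each pair of columns.
norms <- sqrt(diag(dots))
cos_sim <- dots / outer(norms, norms)

# Keep only the strict lower triangle, since each pair is needed once.
# Whether the package also blanks the diagonal is not assumed here.
cos_sim[upper.tri(cos_sim, diag = TRUE)] <- NA
cos_sim
```

Because counts are non-negative, every similarity lies between 0 and 1, with 1 meaning the two count vectors point in the same direction.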
[named list] with entries
sims
[lower triangular named matrix] with all pairwise similarities of the given topics.
wordslimit
[integer] = vocabulary size. See jaccardTopics for original purpose.
wordsconsidered
[integer] = vocabulary size. See jaccardTopics for original purpose.
param
[named list] with parameter type [character(1)] = "Cosine Similarity".
Other TopicSimilarity functions: dendTopics(), getSimilarity(), jaccardTopics(), jsTopics(), rboTopics()
res = LDARep(docs = reuters_docs, vocab = reuters_vocab, n = 4, K = 10, num.iterations = 30)
topics = mergeTopics(res, vocab = reuters_vocab)
cosine = cosineTopics(topics)
cosine

sim = getSimilarity(cosine)
dim(sim)