chainsDistances | R Documentation |
Computes the distance between different estimates of a topic model. Because the estimation of a topic model is random, results may differ substantially from one run to the next. This function computes the distance between distinct realizations of the estimation process. These estimates are referred to as chains.
chainsDistances(
x,
method = c("euclidean", "hellinger", "cosine", "minMax", "naiveEuclidean",
"invariantEuclidean"),
...
)
x |
a valid |
method |
the method used to measure the distance between chains. |
... |
further arguments passed to internal distance functions. |
The method
argument determines how distances are computed.
euclidean
finds the pairs of topics that minimize the total Euclidean distance
and returns that distance.
hellinger
does the same but based on the Hellinger distance.
cosine
does the same but based on the cosine distance.
minMax
computes the maximum distance among the best pairs of topics.
Inspired by the minimum-matching distance from Tang et al. (2014).
naiveEuclidean
computes the Euclidean distance without searching for the
best pairs of topics.
invariantEuclidean
searches for the best pairs of topics among all allowed
permutations of topic indices. For JST and reversed-JST models, the
two-level hierarchy of document-sentiment-topic means that some permutations
of indices represent a drastically different outcome. This setting restricts
the set of permutations to those that do not change the interpretation of
the model. Equivalent to euclidean
for LDA models.
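The difference between the matched and naive methods can be illustrated on toy data. The sketch below is not the package's implementation: it assumes each chain is a K x V matrix of topic-word distributions, assumes the total distance is the sum of per-topic distances, and finds the best pairing by brute force over all permutations. All names (`a`, `b`, `d_euclidean`, `d_hellinger`, `d_cosine`, `all_perms`, `pair_dists`, `best_pairing`) are hypothetical. The last line shows one possible reading of minMax, namely the largest distance within the best pairing.

```r
# Toy data (hypothetical): two chains, each a K x V matrix whose rows are
# topic-word distributions. Chain b is chain a with its topics reordered.
a <- rbind(c(0.7, 0.2, 0.1),
           c(0.1, 0.8, 0.1),
           c(0.2, 0.2, 0.6))
b <- a[c(2, 3, 1), ]

# Distances between two probability vectors.
d_euclidean <- function(p, q) sqrt(sum((p - q)^2))
d_hellinger <- function(p, q) sqrt(sum((sqrt(p) - sqrt(q))^2)) / sqrt(2)
d_cosine    <- function(p, q) 1 - sum(p * q) / sqrt(sum(p^2) * sum(q^2))

# All orderings of a vector; brute force, so only sensible for small K.
all_perms <- function(v) {
  if (length(v) <= 1) return(list(v))
  out <- list()
  for (i in seq_along(v)) {
    for (p in all_perms(v[-i])) {
      out[[length(out) + 1]] <- c(v[i], p)
    }
  }
  out
}

# Distance from each topic of a to its partner in b under a given pairing.
pair_dists <- function(a, b, perm, d) {
  vapply(seq_len(nrow(a)), function(k) d(a[k, ], b[perm[k], ]), numeric(1))
}

# Pairing of topics that minimizes the total distance (assumed: a sum).
best_pairing <- function(a, b, d) {
  perms  <- all_perms(seq_len(nrow(b)))
  totals <- vapply(perms, function(p) sum(pair_dists(a, b, p, d)), numeric(1))
  perms[[which.min(totals)]]
}

# Naive: compare topic k of a with topic k of b, ignoring label switching.
naive <- sum(pair_dists(a, b, seq_len(nrow(a)), d_euclidean))

# Matched: compare topics after searching for the best pairing.
best    <- best_pairing(a, b, d_euclidean)
matched <- sum(pair_dists(a, b, best, d_euclidean))

# The same search with other per-topic distances.
matched_hellinger <- sum(pair_dists(a, b, best_pairing(a, b, d_hellinger), d_hellinger))
matched_cosine    <- sum(pair_dists(a, b, best_pairing(a, b, d_cosine), d_cosine))

# One possible reading of minMax: the largest distance within the best pairing.
worst_pair <- max(pair_dists(a, b, best, d_euclidean))
```

Because `b` contains the same topics as `a` in a different order, the naive distance is positive while the matched distances are zero up to floating-point error. This is why the matched methods are less sensitive to the arbitrary labeling of topics across chains.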
A matrix of distances between the elements of x.
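To make the shape of such a result concrete, the sketch below builds a pairwise distance matrix for a list of simulated chains in base R. It is an illustration only: the chains are random K x V matrices, `make_chain` and `chain_dist` are hypothetical helpers, and the distance is a naive sum of row-wise Euclidean distances. The exact class and layout of the object returned by chainsDistances() may differ.

```r
# Hypothetical stand-in for a set of chains: a list of K x V matrices
# whose rows are normalized to sum to one.
set.seed(1)
make_chain <- function(K = 3, V = 5) {
  m <- matrix(rgamma(K * V, shape = 1), nrow = K)
  m / rowSums(m)
}
chains <- replicate(4, make_chain(), simplify = FALSE)
names(chains) <- paste0("chain", seq_along(chains))

# Naive (unmatched) distance between two chains: sum over topics of the
# Euclidean distance between rows with the same index.
chain_dist <- function(a, b) sum(sqrt(rowSums((a - b)^2)))

# Fill an n x n matrix with the distance between every pair of chains.
n <- length(chains)
D <- matrix(0, n, n, dimnames = list(names(chains), names(chains)))
for (i in seq_len(n)) {
  for (j in seq_len(n)) {
    D[i, j] <- chain_dist(chains[[i]], chains[[j]])
  }
}

print(round(D, 3))
```

The matrix is symmetric with a zero diagonal, so a chain that is far from all others shows up as a row and column of large values.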
Olivier Delmarcelle
Tang, J., Meng, Z., Nguyen, X., Mei, Q., and Zhang, M. (2014). Understanding the Limiting Factors of Topic Modeling via Posterior Contraction Analysis. In Proceedings of the 31st International Conference on Machine Learning, 32, 190–198.
plot.multiChains()
chainsScores()
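# Create an LDA model from the tokenized ECB press conferences
model <- LDA(ECB_press_conferences_tokens)
# Estimate the model with 5 chains (nChains = 5)
model <- fit(model, 10, nChains = 5)
# Distances between the chains, using the default method
chainsDistances(model)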