divergence | R Documentation |
divergence()
computes the regularized topic divergence score to help users
find the optimal number of topics for LDA.
divergence(
x,
min_size = 0.01,
select = NULL,
regularize = TRUE,
newdata = NULL,
...
)
x |
an LDA model fitted by |
min_size |
the minimum size of topics for regularized topic divergence.
Ignored when |
select |
names of topics for which the divergence is computed. |
regularize |
if |
newdata |
if provided, |
... |
additional arguments passed to textmodel_lda. |
divergence()
computes the average Jensen-Shannon divergence
between all pairs of topic vectors in x$phi
. The divergence score
is maximized when the chosen number of topics k
is optimal (Deveaud et al.,
2014). The regularized divergence penalizes topics smaller than min_size
to avoid fragmentation (Watanabe & Baturo, 2023).
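The unregularized part of this computation can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper names jsd() and mean_pairwise_jsd() are hypothetical, the base-2 logarithm is an assumption, and the min_size penalty is not reproduced because its exact form is not given here.

```r
# Jensen-Shannon divergence between two probability vectors.
# Assumption: base-2 logarithm, so the result lies in [0, 1].
jsd <- function(p, q) {
  m <- (p + q) / 2
  # Kullback-Leibler divergence; terms with a == 0 contribute 0 by convention.
  kl <- function(a, b) sum(ifelse(a > 0, a * log2(a / b), 0))
  (kl(p, m) + kl(q, m)) / 2
}

# Average JSD over all unordered pairs of rows (topics) of a phi-like matrix.
# Hypothetical helper; requires at least two topics.
mean_pairwise_jsd <- function(phi) {
  pairs <- combn(nrow(phi), 2)
  mean(apply(pairs, 2, function(ij) jsd(phi[ij[1], ], phi[ij[2], ])))
}

# Toy topic-word matrix: 3 topics over 4 words, each row sums to 1.
phi <- rbind(
  topic1 = c(0.7, 0.1, 0.1, 0.1),
  topic2 = c(0.1, 0.7, 0.1, 0.1),
  topic3 = c(0.1, 0.1, 0.4, 0.4)
)
score <- mean_pairwise_jsd(phi)
```

In this sketch, topics with more distinct word distributions yield a higher score, which is why the score can serve as a criterion when comparing models fitted with different values of k.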
Returns a single numeric value.
Deveaud, Romain et al. (2014). "Accurate and Effective Latent Concept Modeling for Ad Hoc Information Retrieval". doi:10.3166/DN.17.1.61-84. Document Numérique.
Watanabe, Kohei & Baturo, Alexander. (2023). "Seeded Sequential LDA: A Semi-supervised Algorithm for Topic-specific Analysis of Sentences". doi:10.1177/08944393231178605. Social Science Computer Review.
perplexity