Transformation from bag-of-words counts into a topic space of lower dimensionality. LDA, also called multinomial PCA, is a probabilistic extension of LSA, so LDA’s topics can be interpreted as probability distributions over words. As with LSA, these distributions are inferred automatically from a training corpus. Documents are in turn interpreted as a (soft) mixture of these topics, again as with LSA.
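To make the "topics as distributions, documents as mixtures" idea concrete, here is a minimal base R sketch. It does not call this package or fit any model: the vocabulary, the topic names and all the numbers are made up for illustration, and the variable names (`topic_word`, `doc_topics`, `doc_words`) are hypothetical.

```r
# Toy illustration only: hand-made numbers, not output of a fitted model.
vocab <- c("cat", "dog", "stock", "market")

# Each row is one topic: a probability distribution over the vocabulary.
topic_word <- rbind(
  pets    = c(0.45, 0.45, 0.05, 0.05),
  finance = c(0.05, 0.05, 0.45, 0.45)
)
colnames(topic_word) <- vocab

# A document is a soft mixture of topics (the weights sum to 1).
doc_topics <- c(pets = 0.8, finance = 0.2)

# Word distribution implied for the document:
# mixture weights multiplied by the topic-word matrix.
doc_words <- drop(doc_topics %*% topic_word)

rowSums(topic_word)  # each topic's word probabilities sum to 1
sum(doc_words)       # the implied word distribution also sums to 1
round(doc_words, 2)
```

A fitted model estimates both the topic-word distributions and the per-document mixture weights from the corpus; in this sketch they are simply written down by hand.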
corpus | Model as returned by
... | Any other options, from the official documentation of
file | Path to a saved model.
Target dimensionality (num_topics) of 200–500 is recommended as a “golden standard” (https://dl.acm.org/citation.cfm?id=1458105).
- model_lda: Single-core implementation.
- model_ldamc: Multi-core implementation.
library(gensimr)

# preprocess the raw documents and build the dictionary
docs <- prepare_documents(corpus)
dictionary <- corpora_dictionary(docs)
# convert to bag-of-words and serialize as a Matrix Market corpus
corpora <- doc2bow(dictionary, docs)
corpus_mm <- serialize_mmcorpus(corpora, auto_delete = FALSE)
# fit model
lda <- model_lda(corpus_mm, id2word = dictionary, num_topics = 2L)
# topic mixture of each document
lda_topics <- lda$get_document_topics(corpora)
get_docs_topics(lda_topics)