View source: R/topic_modeling_core.R
A wrapper for RSpectra::svds that returns a nicely-formatted latent semantic analysis topic model.
FitLsaModel(dtm, k, calc_coherence = TRUE, return_all = FALSE, ...)
dtm | A document term matrix of class
k | Number of topics
calc_coherence | Should probabilistic coherence of topics be calculated after the model is trained? Defaults to TRUE
return_all | Should all objects returned from
... | Other arguments to pass to
Latent semantic analysis (LSA) uses singular value decomposition (SVD) to factor the document term matrix (DTM). In many LSA applications, TF-IDF weights are applied to the DTM before model fitting, although this is not strictly necessary.
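To make the factorization concrete, here is a minimal sketch of a truncated SVD on a tiny dense matrix. It uses base R's svd() as a stand-in for RSpectra::svds, and the matrix dtm_toy and its row and column names are made up for illustration. It is not the function's actual implementation.

```r
# Hypothetical toy document term matrix: 4 documents (rows) by 5 terms (columns).
dtm_toy <- matrix(
  c(2, 1, 0, 0, 0,
    1, 2, 0, 0, 1,
    0, 0, 3, 1, 0,
    0, 0, 1, 2, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("doc", 1:4), paste0("term", 1:5))
)

k <- 2  # number of latent dimensions ("topics") to keep

# Full SVD with base R: dtm_toy = U %*% diag(d) %*% t(V)
dec <- svd(dtm_toy)

# Keep only the k largest singular values and their singular vectors.
U_k <- dec$u[, 1:k, drop = FALSE]  # documents x k
d_k <- dec$d[1:k]                  # k singular values
V_k <- dec$v[, 1:k, drop = FALSE]  # terms x k

# Rank-k approximation of the original matrix.
dtm_approx <- U_k %*% diag(d_k, nrow = k) %*% t(V_k)

# Frobenius norm of the approximation error.
err <- norm(dtm_toy - dtm_approx, type = "F")
```

The error of the rank-k approximation is determined entirely by the singular values that were dropped, which is why keeping only the largest k is a reasonable compression of the DTM.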
Returns a list with a minimum of three objects: phi, theta, and sv. The rows of phi index topics and the columns index tokens. The rows of theta index documents and the columns index topics. sv is a vector of singular values.
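The shapes described above can be illustrated with a short sketch. The mapping used here (theta from the left singular vectors, phi from the transposed right singular vectors, sv from the singular values) is an assumption chosen to match the stated dimensions, not something this page confirms. The toy matrix, the topic labels, and the use of base R's svd() instead of RSpectra::svds are likewise illustrative.

```r
set.seed(1)

# Hypothetical toy DTM: 6 documents by 8 tokens with random counts.
dtm_toy <- matrix(
  rpois(48, lambda = 1), nrow = 6,
  dimnames = list(paste0("doc", 1:6), paste0("tok", 1:8))
)

k <- 3  # number of topics

# Compute only the first k left and right singular vectors.
dec <- svd(dtm_toy, nu = k, nv = k)

# Assumed mapping from SVD output to the three documented objects.
sketch <- list(
  phi   = t(dec$v),   # k topics x tokens
  theta = dec$u,      # documents x k topics
  sv    = dec$d[1:k]  # k singular values
)

# Label rows and columns so the indexing is visible (labels are made up).
topic_names <- paste0("t_", seq_len(k))
dimnames(sketch$phi)   <- list(topic_names, colnames(dtm_toy))
dimnames(sketch$theta) <- list(rownames(dtm_toy), topic_names)

str(sketch)
```

With this layout, sketch$theta %*% diag(sketch$sv) %*% sketch$phi gives the rank-k approximation of the toy matrix.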
# Load the package that provides FitLsaModel and the sample data
library(textmineR)
# Load a pre-formatted dtm
data(nih_sample_dtm)
# Convert raw word counts to TF-IDF frequency weights
idf <- log(nrow(nih_sample_dtm) / Matrix::colSums(nih_sample_dtm > 0))
# Transpose so that each term (now a row) is scaled by its IDF weight
dtm_tfidf <- Matrix::t(nih_sample_dtm) * idf
# Transpose back to documents in rows, terms in columns
dtm_tfidf <- Matrix::t(dtm_tfidf)
# Fit an LSA model
model <- FitLsaModel(dtm = dtm_tfidf, k = 5)
str(model)