A wrapper for two implementations of Latent Dirichlet Allocation that returns a nicely-formatted topic model. See details, below.
dtm | A document term matrix of class |
k | Number of topics |
iterations | The number of Gibbs iterations if |
alpha | Dirichlet parameter for the distribution of topics over documents. Defaults to 0.1 |
beta | Dirichlet parameter for the distribution of words over topics. Defaults to 0.05 |
smooth | Logical indicating whether or not you want to smooth the probabilities in the rows of |
method | One of either 'gibbs' or 'vem' for either Gibbs sampling or variational expectation maximization. Defaults to 'gibbs'. See details, below. |
return_all | Logical. Do you want the raw results of the underlying function returned along with the formatted results? Defaults to |
... | Other arguments to pass to underlying functions. See details, below. |
For method = 'gibbs', this function is a wrapper for lda.collapsed.gibbs.sampler from the lda package. Additional arguments can be passed through ... to lda.collapsed.gibbs.sampler. However, some arguments cause conflicts if passed through ... because the wrapper already sets them. The arguments K, alpha, and eta for lda.collapsed.gibbs.sampler are set by the arguments k, alpha, and beta, respectively. The arguments documents and vocab for lda.collapsed.gibbs.sampler are derived from dtm and should not be supplied separately.
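To make the argument mapping concrete, here is a minimal sketch that calls lda.collapsed.gibbs.sampler directly, with k, alpha, and beta supplied where the wrapper would supply K, alpha, and eta. It is not the wrapper's internal code: the toy documents are invented for illustration, and lda::lexicalize stands in for whatever conversion the wrapper performs on dtm.

```r
library(lda)

# Hypothetical toy corpus, invented for this illustration
docs <- c("gene expression in tumor cells",
          "tumor cells and gene therapy",
          "clinical trial of a new drug",
          "drug dosage in a clinical trial")

# Build the documents/vocab inputs that the sampler expects
corpus <- lexicalize(docs, lower = TRUE)

k     <- 2     # plays the role of FitLdaModel's k     -> K
alpha <- 0.1   # plays the role of FitLdaModel's alpha -> alpha
beta  <- 0.05  # plays the role of FitLdaModel's beta  -> eta

set.seed(42)
fit <- lda.collapsed.gibbs.sampler(documents = corpus$documents,
                                   K = k,
                                   vocab = corpus$vocab,
                                   num.iterations = 200,
                                   alpha = alpha,
                                   eta = beta,
                                   compute.log.likelihood = TRUE)

dim(fit$topics)         # topics by vocabulary terms
dim(fit$document_sums)  # topics by documents
```

Because K, alpha, eta, documents, and vocab are already filled in this way, only the remaining sampler arguments (compute.log.likelihood here) are candidates for passing through ....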
For method = 'vem', this function is a wrapper for LDA from the topicmodels package. Arguments to LDA's control argument are passed through ... as well. One default behavior of LDA is worth noting: it estimates alpha and beta as part of the expectation maximization. Therefore, the values of alpha and beta passed to LDA will change during fitting unless estimate.alpha and estimate.beta are passed through ... and set to FALSE.
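The effect of estimate.alpha can be seen by calling topicmodels::LDA directly. The sketch below is an illustration under stated assumptions rather than the wrapper's own code: it uses the AssociatedPress data shipped with topicmodels instead of a dtm from this package, and it fixes only alpha, leaving beta to be estimated.

```r
library(topicmodels)

# Example data shipped with topicmodels (not the dtm used elsewhere on this page)
data("AssociatedPress", package = "topicmodels")
small_dtm <- AssociatedPress[1:20, ]

# Default VEM behavior: alpha is re-estimated during fitting
fit_default <- LDA(small_dtm, k = 5, method = "VEM",
                   control = list(seed = 1, alpha = 0.1))

# With estimate.alpha = FALSE, alpha stays at the supplied value
fit_fixed <- LDA(small_dtm, k = 5, method = "VEM",
                 control = list(seed = 1, alpha = 0.1,
                                estimate.alpha = FALSE))

fit_default@alpha  # typically differs from 0.1 after estimation
fit_fixed@alpha    # remains at the supplied 0.1
```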
The ... argument can also be used to control the underlying behavior of TmParallelApply, such as the number of cpus.
Returns a list with a minimum of two objects, phi and theta. The rows of phi index topics and the columns index tokens. The rows of theta index documents and the columns index topics.
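The orientation of the two matrices can be illustrated with base R alone. The sketch below builds stand-in phi and theta matrices from random numbers (hypothetical values, not output of this function) and assumes each row is a probability distribution that sums to 1. Under that assumption, multiplying theta by phi gives a documents-by-tokens matrix of word probabilities for each document.

```r
set.seed(1)

k      <- 3                                   # number of topics
n_docs <- 4                                   # number of documents
vocab  <- c("cell", "gene", "model", "data", "risk")

# Scale each row of a matrix so that it sums to 1
normalize_rows <- function(m) m / rowSums(m)

# phi: rows index topics, columns index tokens
phi <- normalize_rows(matrix(runif(k * length(vocab)), nrow = k,
                             dimnames = list(paste0("t_", 1:k), vocab)))

# theta: rows index documents, columns index topics
theta <- normalize_rows(matrix(runif(n_docs * k), nrow = n_docs,
                               dimnames = list(paste0("doc_", 1:n_docs),
                                               paste0("t_", 1:k))))

# Documents-by-tokens matrix: P(token | document)
doc_word <- theta %*% phi

dim(doc_word)      # n_docs rows, length(vocab) columns
rowSums(doc_word)  # each row sums to 1
```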
# Load the package and a pre-formatted dtm
library(textmineR)
data(nih_sample_dtm)

# Fit an LDA model on a sample of documents
model <- FitLdaModel(dtm = nih_sample_dtm[ sample(1:nrow(nih_sample_dtm), 20), ],
                     k = 5, iterations = 200)
str(model)

# Fit a model, include likelihoods passed to lda::lda.collapsed.gibbs.sampler
model <- FitLdaModel(dtm = nih_sample_dtm[ sample(1:nrow(nih_sample_dtm), 20), ],
                     k = 5, iterations = 200, compute.log.likelihood = TRUE)
str(model)