This function estimates topic proportions for a new corpus of documents, using the vocabulary and the topic-token probability distributions from a previously fit LDA topic model. The function samples the latent topic of each token in the new corpus using a Gibbs sampler and returns the latent topics from the last iteration.
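To illustrate the idea, the following is a minimal sketch of a Gibbs sampler that holds the topic-token distributions fixed and resamples only the token-level topics. It is not the package's implementation. The helper name `sketch_predict`, the orientation of `phi` (topics in rows, vocabulary in columns), the requirement that document IDs run from 1 to the number of documents, and the particular log-likelihood being tracked are all assumptions made for this example.

```r
# Hypothetical helper, not part of the package: resample token topics
# while keeping the topic-token probabilities `phi` fixed.
sketch_predict <- function(word.id, doc.id, phi, alpha = 0.01, n.iter = 50,
                           topics.init = NULL) {
  stopifnot(length(word.id) == length(doc.id))
  k <- nrow(phi)   # assumption: phi is a (topics x vocabulary) matrix
  n <- length(word.id)
  # start from the supplied assignments, or from random ones
  z <- if (is.null(topics.init)) sample.int(k, n, replace = TRUE) else topics.init
  # document-by-topic counts implied by the current assignments
  # (assumption: doc.id takes values 1, 2, ..., number of documents)
  ndk <- matrix(0, nrow = max(doc.id), ncol = k)
  for (i in seq_len(n)) ndk[doc.id[i], z[i]] <- ndk[doc.id[i], z[i]] + 1
  loglik <- numeric(n.iter)
  for (it in seq_len(n.iter)) {
    for (i in seq_len(n)) {
      d <- doc.id[i]
      ndk[d, z[i]] <- ndk[d, z[i]] - 1            # take token i out of the counts
      p <- (ndk[d, ] + alpha) * phi[, word.id[i]]  # unnormalised full conditional
      z[i] <- sample.int(k, 1, prob = p)           # draw a new topic for token i
      ndk[d, z[i]] <- ndk[d, z[i]] + 1            # put token i back
    }
    # log probability of the tokens given their sampled topics
    # (one possible definition; the package may track a different quantity)
    loglik[it] <- sum(log(phi[cbind(z, word.id)]))
  }
  list(topics = z, loglik = loglik)
}

# Toy input: two topics over a four-word vocabulary, two short documents
set.seed(1)
phi <- rbind(c(0.45, 0.45, 0.05, 0.05),
             c(0.05, 0.05, 0.45, 0.45))
word.id <- c(1, 2, 1, 2, 3, 4, 3, 4)
doc.id  <- c(1, 1, 1, 1, 2, 2, 2, 2)
fit <- sketch_predict(word.id, doc.id, phi, n.iter = 20)
```

Because `phi` is fixed, each token's conditional distribution depends only on the other topic assignments within its own document, which is why the sketch tracks only document-by-topic counts.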
predictLDA(word.id = integer(), doc.id = integer(), k = 10,
  n.chains = 1, n.iter = 1000, topics.init = NULL, alpha = 0.01, phi)
word.id |
Unique token ID. Can be taken directly
from the output of |
doc.id |
Unique document ID. Can be taken directly
from the output of |
k |
number of topics. |
n.chains |
number of MCMC chains. |
n.iter |
number of iterations. |
topics.init |
A vector of topics to initially
assign. The Markov property of MCMC allows one to input
the topic assignments from the last iteration of a
previous model fit. Note that this vector should be the
same length as the |
alpha |
Dirichlet hyperparameter |
phi |
The |
A list of length two. The first element is the sampled latent topic value from the last iteration (for each token). The second element is a vector with the log-likelihood values for every iteration of the Gibbs sampler.
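The returned token-level topics can be aggregated into per-document topic proportions. The sketch below shows one way to do that. The vectors are stand-in values rather than real output, and smoothing the counts with `alpha` is an assumption of this example, not something the function is documented to do.

```r
# Stand-in values: in practice `topics` would be the first element of the
# list returned by predictLDA, and `doc.id` the vector passed to it.
doc.id <- c(1, 1, 1, 2, 2, 2, 2)
topics <- c(1, 1, 3, 2, 2, 2, 1)
k <- 3
alpha <- 0.01

# document-by-topic counts; fixing the levels keeps topics with zero tokens
counts <- table(factor(doc.id), factor(topics, levels = seq_len(k)))

# smoothed proportions (assumption: add alpha to every cell before normalising)
theta <- (counts + alpha) / (rowSums(counts) + k * alpha)
```

Each row of `theta` corresponds to one document and sums to one.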
data(APinput)
# takes a while
## Not run: o <- fitLDA(APinput$word.id, APinput$doc.id, k=20)