infer_topics | R Documentation |
Given an already-trained topic model, infer topic proportions for new documents. Inference resembles the Gibbs sampling process used to train a topic model, except that the topic-word proportions are held fixed rather than updated.
infer_topics(m, instances, ...)

## S3 method for class 'mallet_model_inferred'
print(x)

## S3 method for class 'mallet_model_inferred'
summary(x)

## S3 method for class 'mallet_model_inferred'
docs_top_topics(m, n)

## S3 method for class 'mallet_model_inferred'
top_docs(m, n)
m |
either a topic inferencer object from
|
instances |
an InstanceList object. It must be compatible (i.e., its
vocabulary must correspond) with the instances on which |
n_iterations |
number of Gibbs sampling iterations |
sampling_interval |
thinning interval |
burn_in |
number of burn-in iterations |
seed |
integer random seed; set for reproducibility |
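To make the roles of n_iterations, sampling_interval, burn_in, and seed concrete, here is a minimal base-R sketch of Gibbs sampling for one new document with the topic-word probabilities held fixed. It is an illustration under stated assumptions, not the MALLET inferencer: the function infer_doc_sketch, its inputs phi and alpha, and the default values shown are all hypothetical.

```r
# Hypothetical illustration; not the MALLET inferencer itself.
# doc:   integer vector of word indices for one new document
# phi:   K x V matrix of fixed topic-word probabilities (all positive)
# alpha: length-K vector of document-topic hyperparameters
infer_doc_sketch <- function(doc, phi, alpha, n_iterations = 100,
                             sampling_interval = 10, burn_in = 10,
                             seed = NULL) {
    if (!is.null(seed)) set.seed(seed)
    K <- nrow(phi)
    # random initial topic assignment for each token
    z <- sample.int(K, length(doc), replace = TRUE)
    counts <- tabulate(z, nbins = K)
    total <- numeric(K)
    n_samples <- 0
    for (iter in seq_len(n_iterations)) {
        for (i in seq_along(doc)) {
            # remove the token's current assignment, then resample it
            counts[z[i]] <- counts[z[i]] - 1
            # phi stays fixed: only this document's topic counts change
            p <- (counts + alpha) * phi[, doc[i]]
            z[i] <- sample.int(K, 1, prob = p)
            counts[z[i]] <- counts[z[i]] + 1
        }
        # after burn-in, keep every sampling_interval-th state
        if (iter > burn_in && (iter - burn_in) %% sampling_interval == 0) {
            total <- total + (counts + alpha) / (length(doc) + sum(alpha))
            n_samples <- n_samples + 1
        }
    }
    # average of the retained samples: estimated topic proportions
    total / n_samples
}

# toy model: 2 topics over a 4-word vocabulary
phi <- rbind(c(0.45, 0.45, 0.05, 0.05),
             c(0.05, 0.05, 0.45, 0.45))
theta <- infer_doc_sketch(c(1, 2, 1, 1, 2, 3), phi,
                          alpha = c(0.1, 0.1), seed = 42)
```

Calling the sketch twice with the same seed returns identical proportions, which is the sense in which a seed gives reproducibility. If burn_in is at least n_iterations, no samples are retained and the sketch returns NaN values.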
A model object of class mallet_model_inferred, which inherits from mallet_model. This does not have all the elements of the original topic model, however; the new value of interest is the matrix of estimated document-topic weights, accessible via doc_topics. The inferencer sampling state and hyperparameters are not accessible. MALLET supplies estimated topic proportions, which we multiply by the document lengths to obtain the doc-topics matrix.
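The conversion from proportions to weights described above can be sketched in base R. The matrix theta and the vector doc_lengths are made-up values for illustration, not output from the package.

```r
# theta: documents x topics matrix of estimated proportions (rows sum to 1)
theta <- rbind(c(0.70, 0.20, 0.10),
               c(0.25, 0.25, 0.50))
# number of tokens in each document
doc_lengths <- c(10, 40)

# scale each row by its document's length; because R recycles the vector
# down the columns, a vector of length nrow(theta) multiplies row i by
# doc_lengths[i]
weights <- theta * doc_lengths

# dividing by the row sums recovers the proportions
proportions <- weights / rowSums(weights)
```

Each row of weights then sums to the corresponding document length, so the result is on the same scale as a matrix of token counts per topic.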
## Not run:
# beginning with a model m and new documents docs:
inferred_m <- make_instances(docs) %>%
    infer_topics(m, .)

# extract new doc-topic matrix
doc_topics(inferred_m)

# or a convenient data frame of high-ranking topics in each doc
docs_top_topics(inferred_m, n=3)

# or, similarly, but for high-ranking documents in each topic
top_docs(inferred_m, n=3)
## End(Not run)