load_from_mallet_state | R Documentation |
If you have created a topic model using command-line mallet or another tool,
this function loads that model into the mallet_model form suitable for use in
this package. It reads the gzipped text file representing the Gibbs sampling
state, from which the document-topic and topic-word matrices can be derived.
The model vocabulary and document ID list are obtained from the MALLET
instances file.
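To illustrate how a sampling state yields the two matrices, here is a minimal base-R sketch. It is not this package's implementation. It assumes the state file is a gzipped, space-separated table with `#`-prefixed header lines and the columns doc, source, pos, typeindex, type, topic; the toy file contents and all variable names are made up for the example.

```r
# Write a tiny stand-in for a MALLET state file (assumed layout, toy data).
state_lines <- c(
  "#doc source pos typeindex type topic",
  "#alpha : 0.5 0.5",
  "#beta : 0.01",
  "0 NA 0 0 apple 0",
  "0 NA 1 1 pear 1",
  "0 NA 2 0 apple 0",
  "1 NA 0 2 plum 1",
  "1 NA 1 1 pear 1"
)
toy_file <- tempfile(fileext = ".gz")
con <- gzfile(toy_file, "w")
writeLines(state_lines, con)
close(con)

# Read one row per token; lines starting with "#" are skipped as comments.
# quote = "" keeps apostrophes inside word types from being read as quotes.
# Caveat: a word type that itself contains "#" would be truncated here.
state <- read.table(
  gzfile(toy_file),
  comment.char = "#",
  quote = "",
  col.names = c("doc", "source", "pos", "typeindex", "type", "topic"),
  stringsAsFactors = FALSE
)

# Count topic assignments per document and per word type.
doc_topics <- table(doc = state$doc, topic = state$topic)
topic_words <- table(topic = state$topic, word = state$type)

print(doc_topics)
print(topic_words)
```

Each row of the state assigns one token to one topic, so both matrices are simple cross-tabulations of the same table. A real state file has one row per token in the corpus, which is why the function offers the bigmemory option for reading and storing it.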
load_from_mallet_state(
  mallet_state_file,
  simplified_state_file = file.path(dirname(mallet_state_file), "state.csv"),
  instances_file = NULL,
  keep_sampling_state = TRUE,
  metadata_file = NULL,
  bigmemory = TRUE
)
mallet_state_file |
name of gzipped state file |
simplified_state_file |
name of file to save "simplified"
representation of the state to ( |
instances_file |
location of the MALLET instances file used to create the model. If NULL, this step is skipped, but the resulting model object will be missing its vocabulary and document IDs. |
keep_sampling_state |
If TRUE (default), the returned object will hold a
reference to the sampling state |
metadata_file |
metadata file (CSV or TSV; optional here) |
bigmemory |
If TRUE (default), the bigmemory and
bigtabulate packages will be used to read and store the Gibbs sampling
state. If for some reason this does not work, try |
A mallet_model object.
load_mallet_model, load_mallet_model_directory, write_mallet_state
## Not run:
system("mallet train-topics --input instances.mallet \\
  --output-state topic-state.gz")
m <- load_from_mallet_state("topic-state.gz", "state.csv", "instances.mallet")
## End(Not run)