train_model    R Documentation
Description

Invokes MALLET's parallel topic modeling algorithm on a set of documents represented as an InstanceList.
Usage

train_model(
  instances,
  n_topics,
  alpha_sum = 5,
  beta = 0.01,
  n_iters = 200,
  n_max_iters = 10,
  optimize_hyperparameters = TRUE,
  n_hyper_iters = 20,
  n_burn_in = 50,
  symmetric_alpha = FALSE,
  threads = 4L,
  seed = NULL,
  metadata = NULL
)
Arguments

instances
    either an rJava reference to an InstanceList object or the name of a file from which one can be loaded (see make_instances)

n_topics
    the number of topics to train

alpha_sum
    initial sum of the hyperparameters α_k, the Dirichlet prior on each document's distribution over topics

beta
    initial value of the hyperparameter β, the Dirichlet prior on each topic's distribution over words

n_iters
    number of Gibbs sampling iterations to run

n_max_iters
    number of final "iterated conditional modes" (maximization) iterations to run after Gibbs sampling

optimize_hyperparameters
    if TRUE (the default), optimize α_k and β during sampling. If FALSE, the initial values given by alpha_sum and beta are used throughout.

n_hyper_iters
    how often (in iterations) to optimize the hyperparameters

n_burn_in
    number of initial "burn-in" iterations before hyperparameter optimization begins

symmetric_alpha
    if FALSE (the default), allow the α_k to differ from one another. If TRUE, constrain the α_k to remain equal when optimizing hyperparameters.

threads
    number of threads to run in parallel

seed
    MALLET's random number seed: set this to ensure a reproducible run of the Gibbs sampling algorithm

metadata
    not used in the modeling process, but the model object returned by the function will store a reference to it if supplied
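To illustrate the arguments, a hypothetical call might look like the following sketch. Here `insts` is assumed to be an InstanceList produced by make_instances and `meta` an optional metadata data frame; the specific parameter values are illustrative, not recommendations:

```r
m <- train_model(
    insts,               # InstanceList from make_instances()
    n_topics = 40,       # number of topics to fit
    n_iters = 500,       # Gibbs sampling iterations
    seed = 42,           # fix MALLET's RNG for a reproducible run
    metadata = meta      # stored on the model object, not used in fitting
)
```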
Details

Create the instance-list object with make_instances. MALLET's progress reporting appears on the console by default; to change this, set the package option dfrtopics.mallet_logging (see help("mallet-logging")).

If Java gives out-of-memory errors, try increasing the Java heap size to a large value, like 4 GB, by setting options(java.parameters = "-Xmx4g") before loading this package (or rJava).
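For instance, a session that raises the heap limit might begin as follows. The "-Xmx4g" value is just the example size from above; choose one that fits your machine:

```r
# The heap option must be set before rJava starts the JVM;
# setting it after the package is loaded has no effect.
options(java.parameters = "-Xmx4g")
library(dfrtopics)   # loads rJava with the enlarged heap
```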
Value

a mallet_model object
See Also

make_instances, model_dfr_documents, write_mallet_model
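A sketch of an end-to-end workflow using the functions referenced above. The preparation of `docs` and the output directory name "model" are placeholders; see help("make_instances") and help("write_mallet_model") for the actual requirements:

```r
library(dfrtopics)

insts <- make_instances(docs)    # `docs` prepared per the make_instances docs
m <- train_model(insts, n_topics = 40, n_iters = 500, seed = 42)
write_mallet_model(m, "model")   # "model" is a placeholder output directory
```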