train_model: Train a topic model
In agoldst/dfrtopics: Tools for exploring topic models of text

train_model

R Documentation

Train a topic model

Description

Invokes MALLET's parallel topic modeling algorithm on a set of documents represented as an InstanceList.

Usage

train_model(
  instances,
  n_topics,
  alpha_sum = 5,
  beta = 0.01,
  n_iters = 200,
  n_max_iters = 10,
  optimize_hyperparameters = TRUE,
  n_hyper_iters = 20,
  n_burn_in = 50,
  symmetric_alpha = FALSE,
  threads = 4L,
  seed = NULL,
  metadata = NULL
)

Arguments

`instances`	either an rJava reference to an `InstanceList` object or the name of a file into which such an object has been serialized
`n_topics`	number of topics to train
`alpha_sum`	initial sum of the hyperparameters α_k, the Dirichlet prior on each document's distribution over topics
`beta`	initial value of the hyperparameter β, the Dirichlet prior on each topic's distribution over words
`n_iters`	number of Gibbs sampling iterations to run
`n_max_iters`	number of "iterated conditional modes" iterations to run
`optimize_hyperparameters`	if TRUE (the default), optimize α_k and β. If FALSE, the value of `symmetric_alpha` is ignored.
`n_hyper_iters`	interval, in sampling iterations, between hyperparameter optimizations
`n_burn_in`	number of initial "burn-in" iterations before hyperparameter optimization
`symmetric_alpha`	if FALSE (the default), allow the α_k to be different from one another. If TRUE when `optimize_hyperparameters` is TRUE, then the sum of the alphas will still be varied by the algorithm, but all the α_k will be the same.
`threads`	number of threads to run in parallel.
`seed`	MALLET's random number seed: set this to ensure a reproducible run of the Gibbs sampling algorithm.
`metadata`	not used in the modeling process, but the model object returned by the function will store a reference to it if supplied
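
To make the interaction of `alpha_sum`, `n_burn_in`, and `n_hyper_iters` concrete, here is a minimal base-R sketch. It does not call MALLET. The helper names `initial_alpha` and `hyper_schedule` are hypothetical, and both rest on assumptions about MALLET's behavior rather than on anything this page states: that the initial α_k are all equal to `alpha_sum / n_topics`, and that optimization is attempted at every multiple of `n_hyper_iters` once the burn-in period has passed.

```r
# Hypothetical helper: the starting alpha vector, ASSUMING MALLET
# initializes every alpha_k to alpha_sum / n_topics.
initial_alpha <- function(n_topics, alpha_sum = 5) {
  rep(alpha_sum / n_topics, n_topics)
}

# Hypothetical helper: iterations at which hyperparameters would be
# re-estimated, ASSUMING optimization fires on every multiple of
# n_hyper_iters that falls strictly after the burn-in period.
hyper_schedule <- function(n_iters = 200, n_hyper_iters = 20, n_burn_in = 50) {
  if (n_hyper_iters <= 0) {
    return(integer(0))  # treat a non-positive interval as "never optimize"
  }
  iters <- seq_len(n_iters)
  iters[iters > n_burn_in & iters %% n_hyper_iters == 0]
}

alpha0 <- initial_alpha(n_topics = 40)  # 40 equal values summing to alpha_sum
schedule <- hyper_schedule()            # uses the defaults of train_model
print(schedule)
```

Under these assumptions, the default arguments give eight optimization rounds, at iterations 60, 80, and so on up to 200. Raising `n_burn_in` or `n_hyper_iters` reduces that count.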

Details

Create the instance list object with make_instances. MALLET's progress reporting appears on the console by default; to change this, set the package option dfrtopics.mallet_logging (see help("mallet-logging")).

If Java gives out-of-memory errors, try increasing the Java heap size to a large value, like 4GB, by setting options(java.parameters="-Xmx4g") before loading this package (or rJava).
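
The following sketch puts these notes together: it sets the Java heap size before the package is loaded, then trains a model from a serialized instance list with a fixed seed. The file name `instances.mallet` and the choice of 40 topics are placeholders, not values from this documentation. The call is guarded so the snippet does nothing if the package or the file is unavailable.

```r
# Set the heap size BEFORE dfrtopics (or rJava) is loaded.
options(java.parameters = "-Xmx4g")

# Placeholder path to a serialized InstanceList (hypothetical file name).
instances_file <- "instances.mallet"

if (requireNamespace("dfrtopics", quietly = TRUE) &&
    file.exists(instances_file)) {
  m <- dfrtopics::train_model(
    instances_file,
    n_topics = 40,   # illustrative value only
    n_iters = 200,
    threads = 4L,
    seed = 42L       # fixed seed, as recommended for a reproducible run
  )
}
```

The result `m` is the mallet_model object described under Value.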

Value

a mallet_model object
