LDABatch: LDA Replications on a Batch System

View source: R/LDABatch.R

LDABatch    R Documentation

Description

Performs multiple runs of Latent Dirichlet Allocation on a batch system using the batchtools package.

Usage

LDABatch(
  docs,
  vocab,
  n = 100,
  seeds,
  id = "LDABatch",
  load = FALSE,
  chunk.size = 1,
  resources,
  ...
)

Arguments

docs

[list]
Documents as received from LDAprep.

vocab

[character]
Vocabulary passed to lda.collapsed.gibbs.sampler. For additional (and required) arguments, see the ellipsis (three-dot) argument.

n

[integer(1)]
Number of replications.

seeds

[integer(n)]
Random seeds, one for each replication.

id

[character(1)]
Name for the registry's folder.

load

[logical(1)]
If a folder named id already exists, should the existing registry be loaded?

chunk.size

[integer(1)]
Requested number of jobs in each chunk. See chunk.

resources

[named list]
Computational resources for the jobs to submit. See submitJobs.

...

Additional arguments passed to lda.collapsed.gibbs.sampler. Each argument is coerced to a vector of length n. The defaults are alpha = eta = 1/K and num.iterations = 200. K has no default and must be supplied.
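
The coercion of the three-dot arguments to length n can be pictured with a short base R sketch. It is an illustration only and not the package's internal code: the helper name expand_params is hypothetical, and plain recycling with rep_len() is assumed as the coercion rule.

```r
# Hypothetical helper (not part of ldaPrototype): expands sampler arguments
# to one value per replication, assuming plain recycling with rep_len().
expand_params <- function(n, K, alpha = 1 / K, eta = 1 / K,
                          num.iterations = 200) {
  if (missing(K)) stop("K has no default and must be supplied")
  data.frame(
    K = rep_len(K, n),
    alpha = rep_len(alpha, n),
    eta = rep_len(eta, n),
    num.iterations = rep_len(num.iterations, n)
  )
}

# Four replications that share K = 15; alpha and eta fall back to 1 / K.
params <- expand_params(n = 4, K = 15)

# Two different topic numbers, recycled across four replications.
mixed <- expand_params(n = 4, K = c(10, 20))
```

Under these assumptions, every row of params describes one replication, and in mixed the defaults for alpha and eta follow the K value of the same row.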

Details

The function generates multiple LDA runs, optionally on a batch system. The integration is done by the batchtools package. After all jobs of the corresponding registry have finished, the whole registry can be transferred to your local computer for further analysis.

The function returns an LDABatch object. You can retrieve the results and all other elements of this object with getter functions (see getJob).

Value

[named list] with entries id for the registry's folder name, jobs for the submitted jobs' ids and their parameter settings, and reg for the registry itself.

See Also

Other batch functions: as.LDABatch(), getJob(), mergeBatchTopics()

Other LDA functions: LDARep(), LDA(), getTopics()

Examples

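## Not run: 
library(ldaPrototype)

# Four replications with K = 15 topics
batch = LDABatch(docs = reuters_docs, vocab = reuters_vocab, n = 4, K = 15)
batch
getRegistry(batch)
getJob(batch)
getLDA(batch, 2)  # LDA result of job 2

# Default n = 100 replications, submitted in chunks of 20 jobs
batch2 = LDABatch(docs = reuters_docs, vocab = reuters_vocab, K = 15, chunk.size = 20)
batch2
head(getJob(batch2))

## End(Not run)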
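
The effect of chunk.size = 20 with the default n = 100 can be approximated in base R. This is a rough sketch under the assumption that chunks are filled consecutively with at most chunk.size jobs. The chunk function in batchtools may order or balance chunks differently, so treat the grouping below as illustrative only.

```r
# Assumption: jobs are assigned to chunks consecutively, chunk.size per chunk.
# batchtools::chunk may distribute the jobs differently.
n <- 100          # default number of replications
chunk.size <- 20  # value used in the second example
job.ids <- seq_len(n)

# Chunk index for every job id
chunk.id <- ceiling(job.ids / chunk.size)

# Number of jobs per chunk and number of chunks
jobs.per.chunk <- table(chunk.id)
n.chunks <- length(jobs.per.chunk)
```

With these values the sketch yields five chunks of 20 jobs each, so fewer and longer-running units are submitted than with chunk.size = 1.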


JonasRieger/ldaPrototype documentation built on Feb. 5, 2023, 6:45 p.m.