clda_vem: cLDA: Variational Expectation Maximization

Description

This implements the Variational Expectation Maximization (EM) algorithm for the compound latent Dirichlet allocation (cLDA) model.

Usage

clda_vem(num_topics, vocab_size, docs_cid, docs_tf, alpha_h, gamma_h, eta_h,
  vi_max_iter, em_max_iter, vi_conv_thresh, em_conv_thresh, tau_max_iter,
  tau_step_size, estimate_alpha, estimate_gamma, estimate_eta, verbose, init_pi,
  test_doc_share = 0, test_word_share = 0)

Arguments

num_topics

Number of topics in the corpus

vocab_size

Vocabulary size

docs_cid

Collection IDs of the documents (ID indices start at 0)

docs_tf

A list of corpus documents read from the Blei corpus format using read_docs (term indices start at 0)

alpha_h

Hyperparameter for collection-level Dirichlets π

gamma_h

Hyperparameter for document-level Dirichlets θ

eta_h

Hyperparameter for the corpus-level topic Dirichlets β

vi_max_iter

Maximum number of iterations for variational inference

em_max_iter

Maximum number of iterations for variational EM

vi_conv_thresh

Convergence threshold for the document variational inference loop

em_conv_thresh

Convergence threshold for the variational EM loop

tau_max_iter

Maximum number of iterations for the constrained Newton updates of τ

tau_step_size

The step size for the constrained Newton updates of τ

estimate_alpha

If TRUE, run hyperparameter α optimization

estimate_gamma

Dummy parameter (not implemented)

estimate_eta

If TRUE, run hyperparameter η optimization

verbose

Verbosity level: 0, 1, 2, or 3

init_pi

The initial configuration of the collection-level topic mixtures, i.e., the π samples

test_doc_share

Proportion of test documents in the corpus. Must be in [0, 1)

test_word_share

Proportion of test words in each test document. Must be in [0, 1)

Details

To compute perplexity, we first partition the words in the corpus into two sets: (a) a test (held-out) set, selected from the words in the test (held-out) documents, which are identified via test_doc_share and test_word_share, and (b) a training set, i.e., the remaining words in the corpus. We then run the variational EM algorithm on the training set and compute per-word perplexity on the held-out set.
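For example, a held-out evaluation might be set up as in the sketch below. This is a minimal sketch, assuming the corpus has already been read with read_docs; the file name, hyperparameter values, iteration limits, and the shape of init_pi are illustrative assumptions rather than package defaults.

library(clda)

# Hypothetical corpus in the Blei (LDA-C) format; one 0-based collection ID per document
docs_tf  <- read_docs("corpus.ldac")
docs_cid <- c(0, 0, 1, 1, 2)

K <- 20                          # number of topics
J <- length(unique(docs_cid))    # number of collections

fit <- clda_vem(
  num_topics      = K,
  vocab_size      = 5000,
  docs_cid        = docs_cid,
  docs_tf         = docs_tf,
  alpha_h         = 0.1,
  gamma_h         = 0.1,
  eta_h           = 0.01,
  vi_max_iter     = 100,
  em_max_iter     = 50,
  vi_conv_thresh  = 1e-5,
  em_conv_thresh  = 1e-4,
  tau_max_iter    = 20,
  tau_step_size   = 0.01,
  estimate_alpha  = TRUE,
  estimate_gamma  = FALSE,
  estimate_eta    = TRUE,
  verbose         = 1,
  init_pi         = matrix(1 / K, nrow = J, ncol = K),  # uniform initial π (assumed J x K shape)
  test_doc_share  = 0.2,   # hold out 20% of documents
  test_word_share = 0.5    # hold out 50% of the words in each test document
)

With test_doc_share = 0 and test_word_share = 0 (the defaults), no words are held out and the model is fit on the full corpus.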

Value

A list of variational parameters

Note

Created on May 13, 2016
