gen_corpus: Generates a synthetic corpus based on symmetric Dirichlets


View source: R/gen_synth_corpus.R

Description

Generates documents by running the latent Dirichlet allocation (LDA) generative process with user-supplied parameter values.

Usage

gen_corpus(K, V, D, doc.size, alpha, eta)

Arguments

K

number of topics

V

vocabulary size

D

number of documents

doc.size

number of words in each document

alpha

hyperparameter for document Dirichlet sampling

eta

hyperparameter for topic Dirichlet sampling

Details

Last modified on: May 24, 2015
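
As a rough illustration of how the arguments could map onto the standard LDA generative process, the base-R sketch below draws topic-word distributions from a symmetric Dirichlet with parameter eta, document-topic distributions from a symmetric Dirichlet with parameter alpha, and then samples a topic and a word for each position in each document. The names rsymdirichlet and sketch_lda_corpus, and the fields of the returned list, are hypothetical and chosen only for this example; this is not the package's implementation, and the structure returned by gen_corpus may differ.

```r
# Hypothetical helper: draw n vectors from a symmetric Dirichlet of dimension
# `dim` with concentration `conc`, by normalizing independent Gamma draws.
rsymdirichlet <- function(n, dim, conc) {
  g <- matrix(rgamma(n * dim, shape = conc, rate = 1), nrow = n, ncol = dim)
  g / rowSums(g)  # each row now sums to 1
}

# Illustrative stand-in for the LDA generative process (an assumption, not
# the code behind gen_corpus).
sketch_lda_corpus <- function(K, V, D, doc.size, alpha, eta) {
  beta  <- rsymdirichlet(K, V, eta)    # K x V topic-word distributions
  theta <- rsymdirichlet(D, K, alpha)  # D x K document-topic distributions
  docs <- lapply(seq_len(D), function(d) {
    # Sample a topic for each word position, then a word id from that topic.
    z <- sample.int(K, doc.size, replace = TRUE, prob = theta[d, ])
    vapply(z, function(k) sample.int(V, 1L, prob = beta[k, ]), integer(1))
  })
  list(docs = docs, theta = theta, beta = beta)
}

set.seed(1)
sim <- sketch_lda_corpus(K = 2, V = 20, D = 100, doc.size = 80,
                         alpha = 7, eta = 3)
```

Here each element of sim$docs is an integer vector of doc.size word ids in 1..V. Larger values of alpha and eta pull the sampled distributions toward uniform, while values below 1 make them sparser.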

Value

a list of generated documents and their statistics

See Also

Other corpus: calc_doc_cos, calc_doc_tf

Examples

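## Generate documents with the given parameters

K        <- 2    # number of topics
V        <- 20   # vocabulary size
D        <- 100  # number of documents
alpha    <- 7    # hyperparameter for document Dirichlet sampling
eta      <- 3    # hyperparameter for topic Dirichlet sampling
doc.size <- 80   # number of words in each document

ds <- gen_corpus(K, V, D, doc.size, alpha, eta)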

clintpgeorge/ldamcmc documentation built on Feb. 22, 2020, 12:39 p.m.