View source: R/gen_synth_corpus.R
Generates documents using the LDA generative process based on a set of (α, η) configurations.
gen_synth_corpus_multi_h(K, V, J, collection.size, doc.size, alpha.vec, eta.vec)
K | number of topics
V | vocabulary size
J | number of collections
collection.size | number of documents in each collection (a list of J elements)
doc.size | number of words in each document
alpha.vec | a set of values for document-level Dirichlet sampling
eta.vec | a set of values for topic Dirichlet sampling
Last modified on: April 15, 2016
a list containing the generated corpora and their statistics
Other corpus: calc_doc_cos,
calc_doc_tf,
gen_synth_corpus2,
gen_synth_corpus_multi_alpha
## Generates documents with given parameters
J                  <- 2
K                  <- 4
V                  <- 20
alpha.vec          <- c(.2, 2)
eta.vec            <- c(.5, 2)
doc.size           <- 80
collection.size    <- c(40, 40) # number of documents in each collection
ds.name            <- paste("synth-J", J, "-K", K, "-V", V, sep = "")
ds                 <- gen_synth_corpus_multi_h(K, V, J, collection.size, doc.size, alpha.vec, eta.vec)
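
For intuition about what "the LDA generative process" means here, the following is a minimal base-R sketch for a single (alpha, eta) configuration. It is not the package's implementation: `rdirichlet_sym` and `sketch_lda_corpus` are hypothetical helpers written for this illustration, the Dirichlet priors are assumed to be symmetric, and how the package assigns the entries of `alpha.vec` and `eta.vec` to collections is not shown.

```r
# Hypothetical helper: one draw from a symmetric Dirichlet of dimension n,
# obtained by normalising independent Gamma(conc, 1) variates.
rdirichlet_sym <- function(n, conc) {
  g <- rgamma(n, shape = conc, rate = 1)
  g / sum(g)
}

# Hypothetical helper (not part of the package): generate documents for a
# single (alpha, eta) configuration, assuming symmetric priors.
sketch_lda_corpus <- function(K, V, n.docs, doc.size, alpha, eta) {
  # Topic-word distributions: K x V matrix, each row drawn from Dirichlet(eta)
  beta <- t(replicate(K, rdirichlet_sym(V, eta)))
  # Document-topic proportions: n.docs x K matrix, each row from Dirichlet(alpha)
  theta <- t(replicate(n.docs, rdirichlet_sym(K, alpha)))
  docs <- lapply(seq_len(n.docs), function(d) {
    # Draw a topic for every word position in document d
    z <- sample.int(K, doc.size, replace = TRUE, prob = theta[d, ])
    # Draw a word id for every position from its assigned topic
    vapply(z, function(k) sample.int(V, 1, prob = beta[k, ]), integer(1))
  })
  list(docs = docs, theta = theta, beta = beta)
}

set.seed(1)
corpus <- sketch_lda_corpus(K = 4, V = 20, n.docs = 40, doc.size = 80,
                            alpha = 0.2, eta = 0.5)
length(corpus$docs)       # number of documents generated
length(corpus$docs[[1]])  # number of word ids in the first document
```

Smaller values of `alpha` concentrate each document on fewer topics, and smaller values of `eta` concentrate each topic on fewer words, which is presumably why the function accepts several values of each.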