View source: R/gen_synth_corpus.R
Generates documents using the LDA generative process, based on a set of (α, η) configurations.
gen_synth_corpus2(K, V, num.docs, doc.size, alpha.h, eta.h)
K | number of topics
V | vocabulary size
doc.size | number of words in each document
J | number of collections
collection.size | number of documents in each collection (a list of J elements)
alpha.vec | a set of values for document-level Dirichlet sampling
eta.vec | a set of values for topic Dirichlet sampling
Last modified on: April 15, 2016
A list containing the generated corpus and its statistics.
Other corpus: calc_doc_cos, calc_doc_tf, gen_synth_corpus_multi_alpha, gen_synth_corpus_multi_h
## Generates documents with given parameters
K                  <- 4
V                  <- 20
alpha.h            <- .2
eta.h              <- .5
doc.size           <- 80
num.docs           <- 40 # number of documents in each collection
ds.name            <- paste("synth-K", K, "-V", V, sep = "")
ds                 <- gen_synth_corpus2(K, V, num.docs, doc.size, alpha.h, eta.h)
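
For readers unfamiliar with the LDA generative process named in the description, the following base-R sketch shows roughly what such a sampler does. It is not the package's implementation: `rdirichlet_sym` and `sketch_lda_corpus` are hypothetical names, the returned list layout is invented for illustration, and the sketch assumes that `alpha.h` and `eta.h` are scalar concentration parameters of symmetric Dirichlet priors.

```r
# Hypothetical helper: one draw from a symmetric Dirichlet(conc, ..., conc)
# of dimension n, obtained by normalizing independent gamma variates.
rdirichlet_sym <- function(n, conc) {
  g <- rgamma(n, shape = conc, rate = 1)
  g / sum(g)
}

# Hypothetical sketch of the LDA generative process (not the package's code).
# Assumes alpha.h and eta.h are scalar symmetric Dirichlet hyperparameters.
sketch_lda_corpus <- function(K, V, num.docs, doc.size, alpha.h, eta.h) {
  # Topic-word distributions: one row per topic, each ~ Dirichlet(eta.h).
  beta <- t(replicate(K, rdirichlet_sym(V, eta.h)))
  # Document-topic distributions: one row per document, each ~ Dirichlet(alpha.h).
  theta <- t(replicate(num.docs, rdirichlet_sym(K, alpha.h)))
  docs <- lapply(seq_len(num.docs), function(d) {
    # Draw a topic for every word position in document d.
    z <- sample.int(K, doc.size, replace = TRUE, prob = theta[d, ])
    # Draw a word id for each position from its assigned topic.
    vapply(z, function(k) sample.int(V, 1, prob = beta[k, ]), integer(1))
  })
  list(docs = docs, theta = theta, beta = beta)
}

set.seed(1)
sketch <- sketch_lda_corpus(K = 4, V = 20, num.docs = 40, doc.size = 80,
                            alpha.h = 0.2, eta.h = 0.5)
```

Here each element of `sketch$docs` is an integer vector of word ids in `1..V`. Small values of `alpha.h` concentrate each document on a few topics, and small values of `eta.h` concentrate each topic on a few words.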