View source: R/gen_synth_corpus.R
Generates documents using the LDA generative process, based on a set of (α, η) configurations.
gen_synth_corpus2(K, V, num.docs, doc.size, alpha.h, eta.h)
K | number of topics
V | vocabulary size
doc.size | number of words in each document
J | number of collections
collection.size | number of documents in each collection (a list of J elements)
alpha.vec | a set of values for document-level Dirichlet sampling
eta.vec | a set of values for topic Dirichlet sampling
Last modified on: April 15, 2016
a list containing the generated corpus and its statistics
Other corpus: calc_doc_cos, calc_doc_tf, gen_synth_corpus_multi_alpha, gen_synth_corpus_multi_h
## Generates documents with given parameters
K <- 4
V <- 20
alpha.h <- .2
eta.h <- .5
doc.size <- 80
num.docs <- 40 # number of documents in each collection
ds.name <- paste("synth-K", K, "-V", V, sep = "")
ds <- gen_synth_corpus2(K, V, num.docs, doc.size, alpha.h, eta.h)
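For readers unfamiliar with the LDA generative process named in the description, the following is a minimal base-R sketch of it, not the package's actual implementation. The names rdirichlet_sketch and lda_generate_sketch, and the shape of the returned list, are hypothetical and chosen only for illustration. It assumes symmetric Dirichlet priors, with alpha.h as the document-level concentration and eta.h as the topic-level concentration.

```r
## Hypothetical helper (not part of the package): draw n samples from
## Dirichlet(conc) by normalizing independent Gamma draws.
rdirichlet_sketch <- function(n, conc) {
  k <- length(conc)
  # Column j of the matrix holds Gamma(shape = conc[j]) draws
  g <- matrix(rgamma(n * k, shape = rep(conc, each = n)), nrow = n, ncol = k)
  g / rowSums(g)
}

## Hypothetical sketch of the LDA generative process with symmetric priors.
lda_generate_sketch <- function(K, V, num.docs, doc.size, alpha.h, eta.h) {
  # Topic-word distributions: one row per topic, each ~ Dirichlet(eta.h, ..., eta.h)
  beta <- rdirichlet_sketch(K, rep(eta.h, V))
  # Document-topic proportions: one row per document, each ~ Dirichlet(alpha.h, ..., alpha.h)
  theta <- rdirichlet_sketch(num.docs, rep(alpha.h, K))
  docs <- lapply(seq_len(num.docs), function(d) {
    # Draw a topic for every word position, then a word id from that topic
    z <- sample.int(K, doc.size, replace = TRUE, prob = theta[d, ])
    vapply(z, function(k) sample.int(V, 1L, prob = beta[k, ]), integer(1))
  })
  list(beta = beta, theta = theta, docs = docs)
}

set.seed(1)
sketch <- lda_generate_sketch(K = 4, V = 20, num.docs = 40, doc.size = 80,
                              alpha.h = 0.2, eta.h = 0.5)
str(sketch, max.level = 1)
```

Smaller values of alpha.h concentrate each document on fewer topics, and smaller values of eta.h concentrate each topic on fewer words, which is why sweeping over (α, η) configurations yields corpora with different structure.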