gen_synth_corpus_multi_h: Generates a Corpus with Multiple (alpha, eta)s


View source: R/gen_synth_corpus.R

Description

Generates documents using the LDA generative process, based on a set of (α, η) configurations.

Usage

gen_synth_corpus_multi_h(K, V, J, collection.size, doc.size, alpha.vec, eta.vec)

Arguments

K

number of topics

V

vocabulary size

J

number of collections

collection.size

number of documents in each collection (J elements, one per collection; e.g., c(40, 40))

doc.size

number of words in each document

alpha.vec

a set of values for document-level Dirichlet sampling

eta.vec

a set of values for topic Dirichlet sampling

Details

Last modified on: April 15, 2016
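
The LDA generative process named in the Description can be sketched in base R for a single (alpha, eta) pair. This is only an illustration under stated assumptions, not the package's implementation: the function name sketch_lda_corpus, its n.docs argument, and the returned element names are hypothetical, symmetric Dirichlet priors are assumed, and how gen_synth_corpus_multi_h assigns the values in alpha.vec and eta.vec to the J collections is not shown here.

```r
# Hypothetical sketch (not part of ldavem): LDA generative process for one
# (alpha, eta) pair, assuming symmetric Dirichlet priors.
sketch_lda_corpus <- function(K, V, n.docs, doc.size, alpha, eta) {
  # Topic-word distributions: K rows, each drawn from Dirichlet(eta, ..., eta)
  # by normalizing independent Gamma(eta, 1) draws.
  beta <- matrix(rgamma(K * V, shape = eta), nrow = K)
  beta <- beta / rowSums(beta)

  # Document-topic distributions: n.docs rows, each drawn from
  # Dirichlet(alpha, ..., alpha) in the same way.
  theta <- matrix(rgamma(n.docs * K, shape = alpha), nrow = n.docs)
  theta <- theta / rowSums(theta)

  # For every word slot, draw a topic from theta[d, ], then a word id
  # from that topic's row of beta.
  docs <- lapply(seq_len(n.docs), function(d) {
    z <- sample.int(K, doc.size, replace = TRUE, prob = theta[d, ])
    vapply(z, function(k) sample.int(V, 1L, prob = beta[k, ]), integer(1))
  })

  list(beta = beta, theta = theta, docs = docs)
}

set.seed(1)
corpus <- sketch_lda_corpus(K = 4, V = 20, n.docs = 5, doc.size = 80,
                            alpha = 0.2, eta = 0.5)
```

Smaller alpha values concentrate each document on fewer topics, and smaller eta values concentrate each topic on fewer words, which is why varying the (alpha, eta) pair changes the character of the generated corpus.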

Value

a list containing the generated corpora and their statistics

See Also

Other corpus: calc_doc_cos, calc_doc_tf, gen_synth_corpus2, gen_synth_corpus_multi_alpha

Examples

## Generates documents with given parameters

J                  <- 2
K                  <- 4
V                  <- 20
alpha.vec          <- c(.2, 2)
eta.vec            <- c(.5, 2)
doc.size           <- 80
collection.size    <- c(40, 40) # number of documents in each collection
ds.name            <- paste("synth-J", J, "-K", K, "-V", V, sep = "")

ds                 <- gen_synth_corpus_multi_h(K, V, J, collection.size, doc.size, alpha.vec, eta.vec)

clintpgeorge/ldavem documentation built on May 13, 2019, 8:01 p.m.