View source: R/gen_synth_corpus.R
Generates the words of each document by following the latent Dirichlet allocation (LDA) generative process for a given beta matrix of topic-word probabilities. It is used to test the correctness of the Gibbs sampling algorithms.
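The generative process mentioned above can be sketched in base R as follows. This is only an illustration under stated assumptions, not the package's implementation: the names rdirichlet_one and sketch_lda_corpus are hypothetical, document lengths are assumed to be Poisson draws with mean lambda.hat, and the layout of the returned list is invented for the example.

```r
# Hypothetical helper: one Dirichlet draw via normalized Gamma variates (base R only).
rdirichlet_one <- function(alpha) {
  g <- rgamma(length(alpha), shape = alpha, rate = 1)
  g / sum(g)
}

# Sketch of the LDA generative process (assumed details, not the package code).
sketch_lda_corpus <- function(D, lambda.hat, alpha.v, beta) {
  K <- nrow(beta)
  V <- ncol(beta)
  beta <- beta / rowSums(beta)  # normalize each topic row to probabilities
  lapply(seq_len(D), function(d) {
    N <- max(1L, rpois(1, lambda.hat))                   # document length (assumed Poisson)
    theta <- rdirichlet_one(as.vector(alpha.v))          # document topic mixture
    z <- sample.int(K, N, replace = TRUE, prob = theta)  # one topic per word position
    # Draw each word from the word distribution of its assigned topic.
    w <- vapply(z, function(k) sample.int(V, 1, prob = beta[k, ]), integer(1))
    list(theta = theta, topics = z, words = w)
  })
}

set.seed(1)
K <- 2; V <- 20; D <- 5
beta.sketch <- t(replicate(K, rdirichlet_one(rep(3, V))))  # K x V topic-word matrix
docs <- sketch_lda_corpus(D, lambda.hat = 80, alpha.v = rep(7, K), beta = beta.sketch)
```

Each element of docs holds the sampled topic mixture, the per-word topic assignments, and the word indices for one document, which is the kind of ground truth a Gibbs sampler's estimates can be compared against.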
gen_synth_corpus(D, lambda.hat, alpha.v, beta)
D | the number of documents in the corpus
lambda.hat | the mean of document counts
alpha.v | the vector of Dirichlet hyperparameters (K x 1) for document topic mixtures
beta | the beta matrix (counts) for topic word probabilities (K x V format)
a list of generated documents' details
K <- 2
V <- 20
D <- 100
gen.alpha.v <- array(7, c(K, 1));
gen.eta.v <- array(3, c(1, V));
lambda.hat <- 80
## Generates the synthetic beta.m
beta.m <- matrix(1e-2, nrow=K, ncol=V)
beta.m[1, ] <- rdirichlet(1, gen.eta.v);
beta.m[2, ] <- rdirichlet(1, gen.eta.v);
## Generates documents with a given beta.m
ds <- gen_synth_corpus(D, lambda.hat, gen.alpha.v, beta.m);