sim_LDA_data | R Documentation |
For a given set of parameters alpha and Beta and document-specific total word counts, simulate a document-by-term matrix.
Additional structuring variables (the numbers of topics (k),
documents (M), terms (V)) are inferred from input objects.
sim_LDA_data(N, Beta, alpha = NULL, Theta = NULL, seed = NULL)
N |
A vector of document sizes (total word counts). Must be coercible to integer. Also used to infer the total number of documents (M). |
Beta |
|
alpha |
Single positive numeric value for the Dirichlet distribution
parameter defining topics within documents. To specifically define
document topic probabilities, use |
Theta |
|
seed |
Input to |
A document-by-term matrix of counts (dim: M x V).
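# Four documents with these total word counts
N <- c(10, 22, 15, 31)
alpha <- 1.2
# Topic-by-term probabilities: 2 topics (rows) x 3 terms (columns)
Beta <- matrix(c(0.1, 0.1, 0.8, 0.2, 0.6, 0.2), 2, 3, byrow = TRUE)
sim_LDA_data(N, Beta, alpha = alpha)
# Alternatively, supply document-topic probabilities directly (4 documents x 2 topics)
Theta <- matrix(c(0.2, 0.8, 0.8, 0.2, 0.5, 0.5, 0.9, 0.1), 4, 2,
                byrow = TRUE)
sim_LDA_data(N, Beta, Theta = Theta)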
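For intuition, the standard LDA generative process behind a simulation like this can be sketched in base R. This is a minimal illustration under stated assumptions, not the package's implementation: `sketch_sim_lda` is a hypothetical helper, Beta is assumed (from the examples above) to be a k x V matrix whose rows sum to 1, and alpha is treated as the parameter of a symmetric Dirichlet distribution.

```r
# Hypothetical helper for illustration only; not part of the package.
sketch_sim_lda <- function(N, Beta, alpha, seed = NULL) {
  if (!is.null(seed)) set.seed(seed)
  k <- nrow(Beta)   # number of topics, inferred from Beta
  V <- ncol(Beta)   # number of terms, inferred from Beta
  M <- length(N)    # number of documents, inferred from N

  # Draw document-topic probabilities (M x k) from a symmetric
  # Dirichlet(alpha) by normalizing independent gamma draws.
  G <- matrix(rgamma(M * k, shape = alpha), nrow = M, ncol = k)
  Theta <- G / rowSums(G)

  # Per-document term probabilities (M x V): each row mixes the
  # topic-term distributions according to that document's Theta.
  P <- Theta %*% Beta

  # Draw each document's term counts from a multinomial with N[d] words.
  counts <- vapply(
    seq_len(M),
    function(d) rmultinom(1, size = N[d], prob = P[d, ])[, 1],
    integer(V)
  )
  t(counts)  # M x V document-by-term matrix
}

N <- c(10, 22, 15, 31)
Beta <- matrix(c(0.1, 0.1, 0.8, 0.2, 0.6, 0.2), 2, 3, byrow = TRUE)
dtm <- sketch_sim_lda(N, Beta, alpha = 1.2, seed = 1)
dim(dtm)      # M x V
rowSums(dtm)  # each row sums to the corresponding entry of N
```

Sampling counts from the mixed probabilities `Theta %*% Beta` is equivalent in distribution to drawing a topic for every word and then a term from that topic, so the sketch skips the per-word loop.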