rmultinom_sparse: Draw from multinomial distributions
In agoldst/dfrtopics: Tools for exploring topic models of text

rmultinom_sparse

R Documentation

Draw from multinomial distributions

Description

According to the generative model of LDA, documents are drawn from mixtures of multinomial distributions over the vocabulary. When we simulate from the posterior, our task in practice is: for each document d, given the number of words n allocated to topic k in d, generate the result of n multinomial trials with word probabilities given by topic k. This function tries to do this efficiently given a vector of n values (one for each document) and a vector of word weights for the topic, yielding a simulated term-document matrix of within-topic word counts.

Usage

rmultinom_sparse(nn, probs)

Arguments

`nn`	vector of trial sizes: `nn[i]` gives the number of words to draw in the `i`th trial.
`probs`	vector of word weights: `probs[j] / sum(probs)` gives the probability of word `j` in a single trial. It need not be normalized.
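
As a small illustration of how unnormalized weights map to per-trial probabilities, the following base R snippet divides each weight by the total. The weight values are made up for the example.

```r
# Illustrative unnormalized word weights for a 5-word vocabulary (assumed values)
probs <- c(8, 1, 0, 0, 1)

# Probability of each word in a single trial: its weight divided by the total
p <- probs / sum(probs)
p
```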

Details

R's built-in rmultinom has two disadvantages here. First, it is set up to generate many samples, each with the same number of trials. But we require varying the number of trials to correspond to our varying numbers of words allocated to the given topic, so we would have to call rmultinom once for each document and then rbind the results. Second, because the vocabulary can be large and topics typically allocate most of the probability to only a few words, most elements of each sample vector will be zero. But the built-in function cannot take advantage of this sparsity and will require space for a full simulated term-document matrix. This function, by contrast, returns a sparse Matrix.
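
One way to get both properties (varying trial sizes and sparse output) can be sketched as follows. This is not the package's implementation, only a minimal sketch assuming the Matrix package is available; `sparse_multinom_sketch` is a hypothetical name. It draws one word per token with `sample.int` and lets `sparseMatrix` accumulate the counts, so only nonzero cells are stored.

```r
library(Matrix)

# Hypothetical helper for illustration; not the function documented here
sparse_multinom_sketch <- function(nn, probs) {
    n_words <- length(probs)
    n_docs <- length(nn)
    # One categorical draw per token, pooled over all documents.
    # sample.int accepts unnormalized weights in prob.
    words <- sample.int(n_words, size = sum(nn), replace = TRUE, prob = probs)
    # Document index of each token: nn[i] tokens belong to document i
    docs <- rep(seq_len(n_docs), times = nn)
    # sparseMatrix adds the x values of repeated (i, j) pairs,
    # so each cell ends up holding a token count
    sparseMatrix(i = words, j = docs, x = rep(1, length(words)),
                 dims = c(n_words, n_docs))
}

set.seed(1)
nn <- c(5, 0, 12)            # words allocated to the topic in each document
probs <- c(8, 1, 0, 0, 1)    # unnormalized word weights (assumed values)
tdm <- sparse_multinom_sketch(nn, probs)
dim(tdm)        # terms in rows, documents in columns
colSums(tdm)    # each column sums to the corresponding entry of nn
```

Counting n independent categorical draws yields a multinomial sample of size n, which is why pooling the draws and splitting them by document index gives one multinomial sample per column.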

Note that the parameters are not the same as rmultinom's. The equivalent of rmultinom(n, size, prob) is rmultinom_sparse(rep(size, n), prob).
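
The base R side of this correspondence can be sketched as below, using only `rmultinom`. The probabilities and trial sizes are made-up values. The comments indicate which `rmultinom_sparse` call would play the same role according to the description above; those calls are not executed here.

```r
set.seed(42)
probs <- c(0.7, 0.2, 0.1)

# Equal trial sizes: a single rmultinom call yields 4 samples of 10 trials each.
# Per the note above, the counterpart would be rmultinom_sparse(rep(10, 4), probs).
equal <- rmultinom(4, size = 10, prob = probs)

# Varying trial sizes: base R needs one call per document, with the
# resulting samples collected as columns.
# The counterpart would be rmultinom_sparse(nn, probs).
nn <- c(3, 10, 25, 7)
varying <- vapply(nn,
                  function(n) rmultinom(1, size = n, prob = probs)[, 1],
                  integer(length(probs)))

colSums(equal)     # every column sums to 10
colSums(varying)   # column i sums to nn[i]
```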

Value

sparse Matrix of sampled term-document counts, with terms in rows and documents in columns. Notice that this means individual multinomial samples are columns of the returned matrix.
