mi_check: Posterior predictive checking for topics
In agoldst/dfrtopics: Tools for exploring topic models of text

Description

This function checks the fit of the topic model by comparing the observed mutual information (MI) for a topic to values obtained from simulations from the posterior. Large deviations from the simulated values may indicate a poorer fit.

Usage

mi_check(m, k, groups = NULL, n_reps = 20)

Arguments

`m`	`mallet_model` object with sampling state loaded via `load_sampling_state`
`k`	topic number (calculations are only done for one topic at a time)
`groups`	optional grouping factor for documents. If supplied, the MI values will be for words over groups rather than over individual documents
`n_reps`	number of simulations
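
To illustrate what grouping documents means for the calculation, here is a minimal base R sketch, not the package's internal code. The matrix `tdm` and the factor `groups` are made-up stand-ins: the columns of a topic-specific term-document matrix are summed within each group, so that MI would then be computed for words over groups.

```r
# Hypothetical topic-specific term-document matrix (words x documents).
tdm <- matrix(c(20, 10, 5,  5, 12, 6,  18, 9, 4,  2, 3, 7), nrow = 3,
              dimnames = list(c("w1", "w2", "w3"),
                              c("d1", "d2", "d3", "d4")))

# Hypothetical grouping factor: one entry per document.
groups <- factor(c("1990s", "1990s", "2000s", "2000s"))

# rowsum() sums rows by group, so transpose to put documents in rows,
# aggregate, and transpose back to words x groups.
grouped <- t(rowsum(t(tdm), group = groups))
```

The result has one column per level of `groups` and the same total word count as `tdm`.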

Details

For a given topic k, each simulation draws a new topic-specific term-document matrix from the posterior. Since a topic is simply a multinomial distribution over words, for each document d we draw as many samples from this multinomial as there were words allocated to topic k in d in the model being checked. Under the assumptions of the model, this is how the distribution p(w, d|k) arises. With the simulated topic-specific term-document matrix in hand, we recalculate the MI. The process is repeated n_reps times to obtain a reference distribution against which to compare the value from mi_topic.
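
The simulation described above can be sketched in base R as follows. This is an illustration under stated assumptions rather than the package's implementation: `mi_counts` and `simulate_tdm` are hypothetical helpers, the `observed` matrix is made up, the topic's word distribution `phi_k` is approximated by the observed word proportions (a fitted model would supply its own smoothed estimate), and MI is measured in bits.

```r
# Hypothetical helper: mutual information (in bits) between words (rows)
# and documents (columns), treating normalized counts as p(w, d | k).
mi_counts <- function(tdm) {
    p <- tdm / sum(tdm)
    expected <- outer(rowSums(p), colSums(p))  # p(w | k) * p(d | k)
    nz <- p > 0                                # skip zero cells: 0 * log(0) = 0
    sum(p[nz] * log2(p[nz] / expected[nz]))
}

# Hypothetical helper: one simulated topic-specific term-document matrix.
# For each document, draw n_dk[d] words from the multinomial phi_k.
simulate_tdm <- function(phi_k, n_dk) {
    vapply(n_dk,
           function(n) as.numeric(rmultinom(1, size = n, prob = phi_k)),
           numeric(length(phi_k)))
}

# Made-up counts of words assigned to topic k (4 words x 3 documents).
observed <- matrix(c(20, 10, 5, 1,  5, 12, 6, 2,  18, 9, 4, 8), nrow = 4)
n_dk <- colSums(observed)                  # words allocated to k per document
phi_k <- rowSums(observed) / sum(observed) # assumed stand-in for the topic

set.seed(1)
n_reps <- 20
simulated <- replicate(n_reps, mi_counts(simulate_tdm(phi_k, n_dk)))
observed_mi <- mi_counts(observed)
```

Comparing `observed_mi` to the spread of `simulated` is the check: each simulated matrix keeps the per-document word counts fixed, so any difference in MI reflects how words are distributed across documents, not document length.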

Value

A single-row data frame with `topic`, `mi`, and `deviance` columns. The `deviance` column is the observed MI standardized by the simulated values: the mean of the simulated values is subtracted and the difference is divided by their standard deviation. The vector of simulated values is available as the `"simulated"` attribute of the returned data frame.
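
As a sketch of how the returned structure might be read, the following base R code builds a mock result by hand. The numbers are invented for illustration and no model is fitted; the column names and the "simulated" attribute follow the description above.

```r
# Invented values standing in for the observed MI and the simulated MIs.
observed_mi <- 0.42
simulated <- c(0.30, 0.35, 0.28, 0.33, 0.31)

# Standardize the observed MI against the reference distribution.
deviance <- (observed_mi - mean(simulated)) / sd(simulated)

# Mock of the described return value: one row, simulations as an attribute.
result <- data.frame(topic = 3, mi = observed_mi, deviance = deviance)
attr(result, "simulated") <- simulated

# Retrieve the reference distribution for plotting or further summaries.
sims <- attr(result, "simulated")
```

A `deviance` far from zero in either direction means the observed MI sits in the tail of the simulated values.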

References

Mimno, D., and Blei, D. 2011. Bayesian Checking for Topic Models. Empirical Methods in Natural Language Processing. http://www.cs.columbia.edu/~blei/papers/MimnoBlei2011.pdf.
