View source: R/topic_stability.R
Implements a fast chi-square-like test to evaluate the stability of redundant topics.
topic_stability(
  lda_models,
  optimal_model,
  q = 0.8,
  alpha = 0.05,
  do_plot = TRUE
)
lda_models |
A list of ordered LDA models as estimated by
|
optimal_model |
A number corresponding to the optimal topic model. |
q |
Cutoff for important words, set as a quantile of the expected cumulative probability of word weights. Defaults to 0.80, meaning that the function keeps words until 80% of the distribution mass is reached and leaves out the remaining 20%. |
alpha |
Alpha level used to identify informative words from the cumulative distribution function over the cosine similarities in the topic word weights matrix. Defaults to 0.05. |
do_plot |
Plot the chi-square statistic as a function of the number of
topics. Defaults to TRUE. |
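To make the role of q more concrete, here is a minimal sketch of one possible reading of a cumulative-mass cutoff: sort a topic's word weights in decreasing order and keep words until their cumulative probability reaches q. The helper top_words_by_mass and the toy weights are hypothetical and are not part of the package; how topic_stability applies the cutoff internally may differ.

```r
# Hypothetical helper, not part of the package: keep the highest-weight
# words until their cumulative probability reaches q.
top_words_by_mass <- function(word_weights, q = 0.8) {
  stopifnot(q > 0, q <= 1)
  p <- word_weights / sum(word_weights)      # normalise to a probability vector
  ord <- order(p, decreasing = TRUE)         # heaviest words first
  cum_mass <- cumsum(p[ord])                 # cumulative probability mass
  # first position at which the mass q is reached (small tolerance for
  # floating-point rounding)
  n_keep <- which(cum_mass >= q - 1e-12)[1]
  names(p)[ord[seq_len(n_keep)]]
}

# Made-up word weights for a single topic, for illustration only
weights <- c(market = 0.40, risk = 0.25, growth = 0.15,
             asset = 0.10, bond = 0.06, yield = 0.04)
kept <- top_words_by_mass(weights, q = 0.8)
print(kept)
```

With q = 0.8, the three heaviest words already account for 80% of the mass in this toy vector, so the remaining three are dropped. A very small q, by contrast, keeps only the few top-weighted words.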
This function implements Test 3 as defined in
Lewis and Grossetti (2019). Test 3 evaluates the aggregated
stability of over-optimal topic specifications by summing each
point-wise contribution. See 'Value' to understand how topic_stability
returns the results.
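The idea of aggregating point-wise contributions into a single statistic can be sketched with the generic Pearson form, in which each term is (observed - expected)^2 / expected. This is only an illustration of the summation step under that assumption; the exact definition of Test 3 is given in Lewis and Grossetti (2019) and is not reproduced here. The function chi_square_aggregate and the numbers below are hypothetical.

```r
# Generic Pearson-style aggregation, shown for illustration only.
# This is NOT the package's Test 3 computation.
chi_square_aggregate <- function(observed, expected) {
  stopifnot(length(observed) == length(expected), all(expected > 0))
  contributions <- (observed - expected)^2 / expected  # point-wise terms
  list(contributions = contributions,
       statistic = sum(contributions))                 # aggregated statistic
}

# Made-up values, not produced by any topic model
observed <- c(12, 18, 30)
expected <- c(10, 20, 30)
res <- chi_square_aggregate(observed, expected)
print(res$contributions)
print(res$statistic)
```

Keeping the individual contributions alongside their sum makes it possible to see which points drive the aggregated statistic.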
A 'data.table' containing the following columns:
|
An integer giving the number of topics. |
|
An integer giving the degrees of freedom. |
|
A numeric giving the chi-square statistic. |
Francesco Grossetti francesco.grossetti@unibocconi.it
Craig M. Lewis craig.lewis@owen.vanderbilt.edu
Lewis, C. and Grossetti, F. (2019 - forthcoming):
A Statistical Approach for Optimal Topic Model Identification.
## Not run:
test2 <- topic_stability(
  lda_models = lda_list,
  optimal_model = test1,
  q = 0.00075,
  alpha = 0.05
)
## End(Not run)