load_word_weights | R Documentation |
The word weights matrix (the weights of words for topics) can become very large when there is a large number of topics and a substantially sized vocabulary. The functions 'save_word_weights()' and 'load_word_weights()' are tools to handle this scenario by writing the data out to disk as a sparse matrix and loading it back into the R session. Note that these functions require the 'ParallelTopicModel' class; the 'RTopicModel' class will not work.
load_word_weights(filename, minimized = TRUE, beta_coeff, normalized = TRUE, verbose = TRUE)

save_word_weights(model, destfile = tempfile(), minimized = FALSE, verbose = TRUE)
filename |
A file with word weights. |
minimized |
A 'logical' value, whether to write out only word weights with nonzero values (i.e. without the smoothing coefficient). |
beta_coeff |
For smoothing, a coefficient is added to the values of the matrix. Ideally, state the value explicitly in the function call; if it is missing, it will be guessed from the data. |
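A minimal sketch of the smoothing and normalization arithmetic on a toy matrix. The assumption here is that beta is a constant added to every raw count, so the smallest cell of a smoothed matrix (a zero count plus beta) reveals the coefficient; the exact guessing heuristic inside 'load_word_weights()' may differ.

```r
# Toy topics-by-words count matrix (2 topics, 3 words); not real model output
raw <- matrix(c(5, 0, 2,
                0, 3, 1), nrow = 2, byrow = TRUE)
beta <- 0.01
smoothed <- raw + beta             # smoothed but unnormalized values

# A plausible guess for beta: the smallest cell corresponds to a zero count
beta_guess <- min(smoothed)
stopifnot(isTRUE(all.equal(beta_guess, beta)))

# Normalization turns each topic row into a probability distribution
normalized <- smoothed / rowSums(smoothed)
stopifnot(isTRUE(all.equal(rowSums(normalized), rep(1, 2))))
```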
normalized |
A 'logical' value, whether to normalize. |
verbose |
A 'logical' value, whether to output progress messages. |
model |
A topic model (class 'jobjRef'). |
destfile |
Length-one 'character' vector, the filename of the output file. |
The function 'save_word_weights()' writes the word weights to a file (argument 'destfile') that can be parsed as a sparse matrix. Internally, it uses the method '$printTopicWordWeights()' of the 'ParallelTopicModel' class. The (parsed) content of the file is equivalent to the matrix that can be obtained directly from the class using the '$getTopicWords(FALSE, TRUE)' method. Thus, values are not normalized, but smoothed (i.e. the coefficient beta is added to the values).
bin <- system.file(package = "biglda", "extdata", "mallet", "lda_mallet.bin")
lda <- mallet_load_topicmodel(bin)
fname <- save_word_weights(lda)
word_weights <- load_word_weights(fname)