A function to calculate TF-IDF and other related statistics on a set of documents.
document_term_matrix |
A numeric matrix or data.frame with one row per document and one column per vocabulary word, where entry (i, j) is the count of word j in document i. |
vocabulary |
A character vector containing all words in the vocabulary. The vocabulary vector must have the same number of entries as the number of columns in the document_term_matrix, and the j'th column of document_term_matrix must hold the counts for the j'th entry in vocabulary. |
remove_documents_with_no_terms |
Defaults to FALSE. If TRUE, all words in the vocabulary that appear zero times in the selected set of documents will be removed. |
only_calculate_corpus_level_statistics |
Defaults to TRUE. If FALSE, TF-IDF scores will be calculated for every token in every document. |
display_rankings |
If TRUE, the function will print the top_words_to_display highest-ranked words by TF-IDF. |
top_words_to_display |
The number of top ranked words to print out if display_rankings == TRUE. |
A list object.
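The sketch below shows one way to build inputs that satisfy the argument requirements above, using base R only. The toy documents are hypothetical, and the TF-IDF formula used (raw count times log(N / document frequency)) is one common variant, assumed here for illustration; this page does not state which weighting the function applies internally, so the numbers are not necessarily what the function returns.

```r
# Three toy documents, already tokenized (hypothetical data).
documents <- list(
  c("the", "cat", "sat"),
  c("the", "dog", "sat", "sat"),
  c("the", "cat", "ran")
)

# Vocabulary: one entry per column of the document-term matrix.
vocabulary <- sort(unique(unlist(documents)))

# Count matrix: rows are documents, columns follow the order of `vocabulary`.
document_term_matrix <- t(sapply(documents, function(tokens) {
  as.numeric(table(factor(tokens, levels = vocabulary)))
}))
colnames(document_term_matrix) <- vocabulary

# The alignment that the arguments require.
stopifnot(ncol(document_term_matrix) == length(vocabulary))

# Assumed weighting (not confirmed by this page):
# tf = raw count, idf = log(N / number of documents containing the word).
document_frequency <- colSums(document_term_matrix > 0)
idf <- log(nrow(document_term_matrix) / document_frequency)
tfidf <- sweep(document_term_matrix, 2, idf, `*`)

# Corpus-level ranking: sum each word's score over documents, then sort.
corpus_tfidf <- sort(colSums(tfidf), decreasing = TRUE)
top_words_to_display <- 3  # mirrors the argument name, used here as a plain variable
print(head(corpus_tfidf, top_words_to_display))
```

Building the count matrix through `factor(tokens, levels = vocabulary)` keeps the column order tied to `vocabulary`, which is the correspondence the vocabulary argument asks for. Under the assumed weighting, a word that appears in every document (here "the") gets a score of zero.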