Multiple document comparison for textual overlap
multi_doc_compare(texts, n_grams, sd_criterion)
texts | character vector of texts; each element of the vector is one text as a single string
n_grams | integer specifying the n-gram unit
sd_criterion | numeric; a standard deviation criterion for returning documents that are unusually similar. Values of 2 to 3 tend to work well.
list
dtm | matrix | document-term matrix for all texts
histogram | a histogram of the cosine similarity values between every pair of texts
similarities | matrix | cosine similarities between every pair of texts
mean_similarity | numeric | the mean similarity across all texts
sd_similarity | numeric | the standard deviation of the similarities
check_these | dataframe | document pairs whose similarity was above the criterion; these are the ones worth checking by hand
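The following is a minimal base-R sketch of the kind of computation the returned values describe. It is not the package's implementation. The names `word_ngrams` and `sketch_doc_compare` are hypothetical, and the sketch assumes word-level n-grams, a simple lowercase alphanumeric tokenizer, and a cutoff of mean + sd_criterion * sd over the unique document pairs. The real function may tokenize, weight, or threshold differently.

```r
# Hypothetical helper: split one text into word n-grams.
# Assumption: lowercase, split on anything that is not a letter or digit.
word_ngrams <- function(text, n) {
  words <- strsplit(tolower(text), "[^a-z0-9]+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < n) return(character(0))
  starts <- seq_len(length(words) - n + 1)
  vapply(starts,
         function(i) paste(words[i:(i + n - 1)], collapse = " "),
         character(1))
}

# Hypothetical sketch that mirrors the documented return values.
sketch_doc_compare <- function(texts, n_grams = 2, sd_criterion = 2) {
  grams <- lapply(texts, word_ngrams, n = n_grams)
  vocab <- sort(unique(unlist(grams)))

  # Document-term matrix: one row per text, one column per n-gram.
  dtm <- do.call(rbind, lapply(grams, function(g) {
    as.numeric(table(factor(g, levels = vocab)))
  }))
  colnames(dtm) <- vocab

  # Cosine similarity between every pair of rows.
  norms <- sqrt(rowSums(dtm^2))
  similarities <- tcrossprod(dtm) / outer(norms, norms)

  # Summarize each unique pair once (upper triangle, no diagonal).
  upper <- upper.tri(similarities)
  pair_idx <- which(upper, arr.ind = TRUE)
  pair_sims <- similarities[upper]
  mean_similarity <- mean(pair_sims)
  sd_similarity <- sd(pair_sims)

  # Assumed criterion: flag pairs above mean + sd_criterion * sd.
  flagged <- pair_sims > mean_similarity + sd_criterion * sd_similarity
  check_these <- data.frame(
    doc_a = pair_idx[flagged, "row"],
    doc_b = pair_idx[flagged, "col"],
    similarity = pair_sims[flagged],
    row.names = NULL
  )

  list(
    dtm = dtm,
    histogram = hist(pair_sims, plot = FALSE),
    similarities = similarities,
    mean_similarity = mean_similarity,
    sd_similarity = sd_similarity,
    check_these = check_these
  )
}

texts <- c(
  "the quick brown fox jumps over the lazy dog",
  "the quick brown fox jumps over the lazy cat",
  "a stitch in time saves nine",
  "all that glitters is not gold",
  "actions speak louder than words"
)

result <- sketch_doc_compare(texts, n_grams = 2, sd_criterion = 2)
# Texts 1 and 2 share 7 of their 8 bigrams, so they are the only flagged pair.
print(result$check_these)
```

With only a handful of documents the standard deviation is estimated from very few pairs, so a lower criterion may be needed before anything is flagged. The suggested range of 2 to 3 is more meaningful on larger collections.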