Uses jaccard_shingles to compare candidate pairs from lsh to determine
which documents are actually similar.
tidy_candidates(candidates, shingles, docs = NULL, threshold = 0.8)
candidates | list of buckets with document ids from
shingles | list of documents and their shingles from
docs | optional text data to include in results
threshold | Jaccard similarity threshold
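To make the role of each argument concrete, here is a minimal base R sketch of the kind of filtering the description implies: take every pair of document ids that share a bucket, compute the Jaccard similarity of their shingle sets, and keep the pairs at or above the threshold. It is not the package's implementation. The helpers `jaccard` and `filter_candidates`, the output column names, the example data, and the use of `>=` for the cutoff are all assumptions made for illustration.

```r
# Hypothetical stand-in for the filtering step; not the package's own code.

# Jaccard similarity of two shingle sets: |A intersect B| / |A union B|.
jaccard <- function(a, b) {
  length(intersect(a, b)) / length(union(a, b))
}

# candidates: list of buckets, each a character vector of document ids.
# shingles:   named list mapping document id -> character vector of shingles.
filter_candidates <- function(candidates, shingles, threshold = 0.8) {
  # Collect every unordered pair of ids that share a bucket.
  pairs <- list()
  for (bucket in candidates) {
    ids <- sort(unique(bucket))
    if (length(ids) < 2) next
    combos <- utils::combn(ids, 2)
    for (j in seq_len(ncol(combos))) {
      pairs[[length(pairs) + 1]] <- combos[, j]
    }
  }
  # The same pair can appear in several buckets; score it only once.
  pairs <- unique(pairs)
  if (length(pairs) == 0) {
    return(data.frame(doc_a = character(0), doc_b = character(0),
                      similarity = numeric(0), stringsAsFactors = FALSE))
  }
  doc_a <- vapply(pairs, `[`, character(1), 1)
  doc_b <- vapply(pairs, `[`, character(1), 2)
  sim <- mapply(function(a, b) jaccard(shingles[[a]], shingles[[b]]),
                doc_a, doc_b)
  out <- data.frame(doc_a = doc_a, doc_b = doc_b, similarity = unname(sim),
                    stringsAsFactors = FALSE)
  # Keep only pairs whose similarity reaches the threshold (assumed inclusive).
  out[out$similarity >= threshold, , drop = FALSE]
}

# Made-up example data: d2 shares all five shingles of d1 and adds one more.
shingles <- list(
  d1 = c("the quick", "quick brown", "brown fox", "fox jumps", "jumps over"),
  d2 = c("the quick", "quick brown", "brown fox", "fox jumps", "jumps over",
         "over the"),
  d3 = c("a slow", "slow green", "green turtle")
)
candidates <- list(c("d1", "d2"), c("d1", "d3"), c("d1", "d2"))

res <- filter_candidates(candidates, shingles, threshold = 0.8)
print(res)
```

In this example the pair d1/d2 has similarity 5/6 and is kept, while d1/d3 shares no shingles and is dropped even though the two documents landed in the same bucket. This shows why a verification pass over the candidate pairs is useful.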