tidy_candidates: Tidy candidate results

View source: R/edges.r

Description

Uses jaccard_shingles to compute the Jaccard similarity of each candidate pair returned by lsh and determine which documents are actually similar.

Usage

tidy_candidates(candidates, shingles, docs = NULL, threshold = 0.8)

Arguments

candidates

List of buckets from lsh, each containing document ids.

shingles

List of documents and their shingles, as produced by shingle.

docs

Optional text data to include in the results. Defaults to NULL.

threshold

Jaccard similarity threshold. Defaults to 0.8.
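
The filtering step described above can be sketched in base R. This is not the package's implementation: filter_candidates and jaccard are hypothetical helpers, the shapes of candidates and shingles are assumed (a list of id vectors and a named list of character vectors), and keeping pairs with a score at or above the threshold is also an assumption. The actual structures returned by lsh and shingle, and the exact comparison used, may differ.

```r
# Jaccard similarity of two shingle sets: |A intersect B| / |A union B|.
jaccard <- function(a, b) {
  a <- unique(a)
  b <- unique(b)
  length(intersect(a, b)) / length(union(a, b))
}

# Hypothetical stand-in for the filtering idea, not the lshr code.
filter_candidates <- function(candidates, shingles, threshold = 0.8) {
  # Expand every bucket into its unordered pairs of document ids.
  pairs <- lapply(candidates, function(ids) {
    ids <- sort(unique(ids))
    if (length(ids) < 2) return(NULL)
    t(combn(ids, 2))
  })
  pairs <- do.call(rbind, pairs)
  if (is.null(pairs)) {
    return(data.frame(doc1 = character(0), doc2 = character(0),
                      score = numeric(0), stringsAsFactors = FALSE))
  }
  # The same pair can appear in several buckets; score it once.
  pairs <- unique(pairs)
  score <- mapply(
    function(x, y) jaccard(shingles[[x]], shingles[[y]]),
    pairs[, 1], pairs[, 2],
    USE.NAMES = FALSE
  )
  result <- data.frame(doc1 = pairs[, 1], doc2 = pairs[, 2], score = score,
                       stringsAsFactors = FALSE)
  # Assumed rule: keep pairs whose similarity reaches the threshold.
  result <- result[result$score >= threshold, , drop = FALSE]
  rownames(result) <- NULL
  result
}

# Assumed input shapes, for illustration only.
candidates <- list(c("a", "b"), c("a", "c"), c("b", "c"), c("a", "b"))
shingles <- list(
  a = c("the quick", "quick brown", "brown fox", "fox jumps", "jumps over"),
  b = c("the quick", "quick brown", "brown fox", "fox jumps", "jumps over",
        "over the"),
  c = c("a slow", "slow green", "green turtle")
)

similar <- filter_candidates(candidates, shingles, threshold = 0.8)
print(similar)
```

Documents a and b share five of six distinct shingles, so only that pair survives the 0.8 threshold; the pairs involving c share no shingles and are dropped as false positives of the bucketing step.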


zamorarr/lshr documentation built on April 24, 2021, 11:35 p.m.