lsh_query: Query an LSH cache for matches to a single document

View source: R/lsh_query.R

Description

This function retrieves the matches for a single document from an lsh_buckets object created by lsh. See lsh_candidates to retrieve all pairs of matches.

Usage

lsh_query(buckets, id)

Arguments

buckets

An lsh_buckets object created by lsh.

id

The document ID to find matches for.

Value

An lsh_candidates data frame with matches to the document specified.

See Also

lsh, lsh_candidates

Examples

library(textreuse)

dir <- system.file("extdata/legal", package = "textreuse")
minhash <- minhash_generator(200, seed = 235)
corpus <- TextReuseCorpus(dir = dir,
                          tokenizer = tokenize_ngrams, n = 5,
                          minhash_func = minhash)
buckets <- lsh(corpus, bands = 50)
lsh_query(buckets, "ny1850-match")
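The query returns candidate matches only; LSH does not compute similarity scores itself. As a hedged sketch of a likely next step, the candidates can be passed to lsh_compare with a comparison function such as jaccard_similarity, both from the same package. This assumes that the result of lsh_query is accepted by lsh_compare in the same way as the result of lsh_candidates, and that the scored data frame has columns named a, b, and score.

```r
library(textreuse)

# Same setup as the example above
dir <- system.file("extdata/legal", package = "textreuse")
minhash <- minhash_generator(200, seed = 235)
corpus <- TextReuseCorpus(dir = dir,
                          tokenizer = tokenize_ngrams, n = 5,
                          minhash_func = minhash)
buckets <- lsh(corpus, bands = 50)

# Candidate matches for one document; no similarity scores yet
matches <- lsh_query(buckets, "ny1850-match")

# Assumption: lsh_compare accepts the lsh_query result and scores
# each candidate pair using the full documents in the corpus
scored <- lsh_compare(matches, corpus, jaccard_similarity)
scored
```

Scoring only the pairs returned by the query avoids comparing the document against every other document in the corpus.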

ropensci/textreuse documentation built on May 19, 2020, 7:40 a.m.