euclidean_probability: Find Probability of Match Based on Similarity

View source: R/lsh_properties.R

euclidean_probability    R Documentation

Description

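Find the probability that two vectors at a given Euclidean distance will be returned as a candidate pair by locality-sensitive hashing (LSH) with the given hyperparameters.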

Usage

euclidean_probability(distance, n_bands, band_width, r)

Arguments

distance

The Euclidean distance between the two vectors you want to compare.

n_bands

The number of LSH bands used in hashing.

band_width

The number of hashes in each band.

r

The "r" hyperparameter used to govern the sensitivity of the hash.

Value

A decimal number giving the probability that the two items will be returned as a candidate pair by the Euclidean LSH algorithm.
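
Examples

The sketch below illustrates how a probability of this kind can be computed. It is not the package's source. It assumes the standard Gaussian (2-stable) projection hash, in which "r" acts as the bucket width, combined with the usual banding rule: a pair is a candidate if all hashes agree in at least one band. The name euclidean_probability_sketch is hypothetical, and the values returned by euclidean_probability itself may differ if the package uses a different hash family.

```r
# Hypothetical helper (not part of zoomerjoin): probability that two vectors
# at Euclidean distance `distance` become a candidate pair.
euclidean_probability_sketch <- function(distance, n_bands, band_width, r) {
  # Ratio of bucket width to distance; a larger ratio makes a collision
  # more likely.
  ratio <- r / distance

  # Single-hash collision probability, ASSUMING a Gaussian projection hash
  # of the form floor((a . x + b) / r).
  p <- 1 - 2 * pnorm(-ratio) -
    2 / (sqrt(2 * pi) * ratio) * (1 - exp(-ratio^2 / 2))

  # A band matches only if all `band_width` hashes agree; the pair is a
  # candidate if at least one of the `n_bands` bands matches.
  1 - (1 - p^band_width)^n_bands
}

# Probability of a candidate match at several distances (vectorised).
euclidean_probability_sketch(
  distance = c(0.1, 0.5, 1, 2),
  n_bands = 10, band_width = 5, r = 1
)
```

Under these assumptions the probability falls as the distance grows. Increasing n_bands raises it, while increasing band_width lowers it, so the two can be tuned together to sharpen the cutoff between near and far pairs.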


zoomerjoin documentation built on April 13, 2025, 9:08 a.m.