View source: R/find_potential_duplicates.R
find_potential_duplicates | R Documentation |
Computes distances between titles and selects those that are too close to one another, as defined by a maximum distance.
find_potential_duplicates(x, distmethod = "qgram", maxdist = 10)
x | Tibble. Table with keys and titles. |
distmethod | Character. Method used to compute distances between titles. Can be: osa, lv, dl, hamming, lcs, qgram, cosine, jaccard, jw, or soundex. |
maxdist | Numeric. Threshold to apply. Only titles whose distance from one another is less than or equal to this number are returned for checking. |
A numeric vector with the row numbers of the potential duplicates.
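The thresholding idea described above can be illustrated with a minimal sketch. This is not the package's implementation: the helper name `sketch_potential_duplicates`, the column names `key` and `title`, and the use of base R's `utils::adist` (Levenshtein distance, roughly comparable to the `lv` method) are all assumptions made for illustration. A plain data frame stands in for the tibble to avoid extra dependencies.

```r
# Hypothetical stand-in for illustration only; not the package's own code.
sketch_potential_duplicates <- function(x, maxdist = 10) {
  # Pairwise Levenshtein distances between all titles (base R).
  d <- utils::adist(x$title)
  # Ignore the diagonal: every title is at distance 0 from itself.
  diag(d) <- NA
  # Row/column indices of every pair within the threshold.
  close <- which(d <= maxdist, arr.ind = TRUE)
  # Row numbers involved in at least one close pair.
  sort(unique(as.vector(close)))
}

# Assumed column names: `key` and `title`.
titles <- data.frame(
  key = c("a1", "a2", "b1"),
  title = c(
    "Deep learning for text",
    "Deep learning for texts",
    "Accounting standards in Europe"
  ),
  stringsAsFactors = FALSE
)

rows <- sketch_potential_duplicates(titles, maxdist = 2)
print(rows)  # rows 1 and 2 differ by a single character
```

A small `maxdist` only catches near-identical titles, while a larger value returns more candidates to check by hand. The appropriate threshold also depends on the distance method, since the methods listed under `distmethod` are not measured on the same scale.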
Nicolas Mangin