find_potential_duplicates: Identify potential duplicates in a list of references

View source: R/find_potential_duplicates.R

find_potential_duplicates    R Documentation

Identify potential duplicates in a list of references

Description

Computes pairwise string distances between the titles of the references and flags as potential duplicates those whose distance does not exceed a maximum distance.

Usage

find_potential_duplicates(x, distmethod = "qgram", maxdist = 10)

Arguments

x

Tibble. Table with keys and titles.

distmethod

Character. Method used to compute distances between titles. One of: osa, lv, dl, hamming, lcs, qgram, cosine, jaccard, jw, or soundex.

maxdist

Numeric. Threshold to apply. Only titles whose pairwise distance is less than or equal to this number are returned for checking.
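
To make the interaction between distmethod and maxdist more concrete, here is a minimal base-R sketch of a q-gram distance and of the threshold test. The helper qgram_dist and its default q = 1 are assumptions made for illustration; they are not part of bibliogR, and the function itself may compute distances differently (the method names listed above match those accepted by the stringdist package).

```r
# Hypothetical helper, not part of bibliogR: q-gram distance in base R.
# The distance is the sum of absolute differences between the q-gram
# counts of the two strings.
qgram_dist <- function(a, b, q = 1) {
  # Split a string into its overlapping substrings of length q.
  grams <- function(s) {
    n <- nchar(s)
    if (n < q) return(character(0))
    substring(s, 1:(n - q + 1), q:n)
  }
  ga <- grams(a)
  gb <- grams(b)
  keys <- union(ga, gb)
  # Count each q-gram in both strings over the same set of levels.
  ca <- table(factor(ga, levels = keys))
  cb <- table(factor(gb, levels = keys))
  sum(abs(ca - cb))
}

d_same <- qgram_dist("abcd", "abcd")         # identical strings
d_near <- qgram_dist("abcd", "abce", q = 2)  # last character changed

# A pair is flagged when its distance does not exceed the threshold.
maxdist <- 10
flagged <- d_near <= maxdist
```

Because distance scales differ across methods (q-gram counts versus, say, a jw score between 0 and 1), a maxdist that suits one method is unlikely to suit another.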

Value

A numeric vector containing the row numbers of the potential duplicates.
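
The shape of the returned value can be illustrated with a small base-R sketch. It is not the package's implementation: it uses utils::adist (a Levenshtein-type distance, comparable to the lv method) instead of the default qgram method, and the column names key and title, the sample data, and the threshold of 3 are all assumptions.

```r
# Toy reference table; the column names `key` and `title` are assumed.
refs <- data.frame(
  key = c("smith2001", "smith2001a", "doe2010", "lee2015"),
  title = c("Management control systems",
            "Management control system",
            "Deep learning for accounting",
            "A theory of the firm"),
  stringsAsFactors = FALSE
)

maxdist <- 3

# Pairwise edit distances between lower-cased titles.
d <- utils::adist(tolower(refs$title))

# Keep each pair once and drop the zero self-distances on the diagonal.
d[upper.tri(d, diag = TRUE)] <- NA

# Row/column indices of pairs within the threshold (NA entries are ignored).
close <- which(d <= maxdist, arr.ind = TRUE)

# Row numbers of every reference involved in at least one close pair.
dup_rows <- sort(unique(as.vector(close)))
dup_rows  # rows 1 and 2
```

The resulting row numbers can then be used to subset the table, for example refs[dup_rows, ], so that the flagged titles can be inspected side by side.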

Author(s)

Nicolas Mangin


NicolasJBM/bibliogR documentation built on April 21, 2024, 12:16 a.m.