For a tabular set of publication records, identifies potential sets of duplicate entries and labels them with a unique identifier.
dupes_find(x, match_cols, approx_match = FALSE, string_dist = 5,
  min_length = 10, simplify_match = TRUE)
x | The dataset in which duplicate entries will be identified
match_cols | Column(s) that will be used to search for duplicate records
approx_match | Whether to perform a duplicate search using string distances or exact values
string_dist | When using approximate matching, the string distance cutoff at which records will be assumed duplicated
min_length | The minimum length for the combined matching string produced by
simplify_match | Whether to perform duplicate searches after removing all non-alphanumeric characters from the reference string generated from
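To make the interplay of match_cols, approx_match, string_dist and simplify_match more concrete, here is a minimal base R sketch of one way such a search could work. It is not the package's implementation: sketch_find_dupes is a hypothetical helper, the use of Levenshtein distance via adist() and of single-linkage grouping are assumptions, and min_length is left out because its exact behaviour is not described here.

```r
# Illustrative sketch only: sketch_find_dupes() is a hypothetical helper,
# not the package's dupes_find(). The distance measure and the grouping
# rule are assumptions made for the sake of the example.
sketch_find_dupes <- function(x, match_cols, approx_match = FALSE,
                              string_dist = 5, simplify_match = TRUE) {
  # Paste the selected columns into one matching string per record
  cols <- unname(as.list(x[, match_cols, drop = FALSE]))
  s <- do.call(paste, c(cols, sep = " "))
  if (simplify_match) {
    # Remove all non-alphanumeric characters before comparing
    s <- gsub("[^[:alnum:]]", "", s)
  }
  if (approx_match && length(s) > 1) {
    # Pairwise Levenshtein distances, then single-linkage grouping:
    # records linked by a chain of distances <= string_dist share an ID
    d <- as.dist(utils::adist(s))
    id <- stats::cutree(stats::hclust(d, method = "single"), h = string_dist)
  } else {
    # Exact matching: identical strings share an ID
    id <- match(s, unique(s))
  }
  x$matching_col <- s
  x$match_ID <- unname(id)
  x
}

# Invented records for illustration
recs <- data.frame(
  title = c("Effects of fire on soil", "Effects of fire on soil.",
            "Effect of fires on soils", "Bird migration in spring"),
  year = c(2001, 2001, 2001, 1999),
  stringsAsFactors = FALSE
)
exact <- sketch_find_dupes(recs, c("title", "year"))
approx <- sketch_find_dupes(recs, c("title", "year"), approx_match = TRUE)
```

With exact matching, only the first two records share an identifier, because stripping the trailing period makes their strings identical. With approximate matching, the third record joins them, since its string is within the distance cutoff of the first.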
An updated version of x, with one column specifying the
final string used to search for duplicates (matching_col)
and another column containing unique identifiers for each set of
duplicates (match_ID).
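As a rough illustration of how the two added columns might be used, the base R sketch below works on a hand-made data frame that only mimics the described output. The values are invented, and the numeric type of match_ID is an assumption.

```r
# Hand-made stand-in for the returned data frame; all values are invented
res <- data.frame(
  title = c("A study", "A study", "Another study"),
  matching_col = c("Astudy", "Astudy", "Anotherstudy"),
  match_ID = c(1, 1, 2),
  stringsAsFactors = FALSE
)

# Number of records sharing each identifier, repeated for every row
set_size <- ave(seq_len(nrow(res)), res$match_ID, FUN = length)

# Sets with more than one member are the potential duplicates
dupe_sets <- res[set_size > 1, ]

# Keep only the first record from each set
first_only <- res[!duplicated(res$match_ID), ]
```

Reviewing dupe_sets by eye before dropping anything is a sensible precaution, especially when approximate matching was used.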
## Not run: 
# Stack the example records on themselves so every row has a duplicate
test <- rbind(form_mm_recs, form_mm_recs)
# Search for duplicates using columns 1 and 3
test <- dupes_find(test, c(1, 3))
dupes <- dupes_return(test)
out <- dupes_rm(test)
## End(Not run)