Finds the closest string match.
The default method computes Jaro-Winkler string distances using the stringdist
package.
For strings with multiple closest matches, only the first is reported.
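As a rough illustration of the idea described above (not the package's actual implementation), the sketch below computes a Jaro-Winkler distance matrix with `stringdist::stringdistmatrix()` and picks the nearest candidate for each source string. The vectors `a` and `b` are made-up inputs, and the use of `which.min()` to resolve ties is an assumption about how "only the first is reported" might be achieved.

```r
library(stringdist)

# Made-up inputs for illustration only
a <- c("aple", "ab")
b <- c("apple", "abc", "abd")

# Jaro-Winkler distance matrix: one row per element of a, one column per element of b
d <- stringdistmatrix(a, b, method = "jw")

# which.min() returns the index of the first minimum in each row,
# so when several candidates tie, the earliest one in b is kept
idx <- apply(d, 1, which.min)

nearest <- b[idx]
nearest
```

Here "ab" is equally distant from "abc" and "abd", so the sketch keeps "abc" because it comes first in `b`.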
fuzzy_match(a, b, method = "jw", cutoff = 0.5, ...)
fuzzy_check(a, b, method = "jw", ...)
a | a source vector of strings
b | a target vector
method | method for
cutoff | numeric indicating the maximum distance threshold for a match (
... | further arguments for
For fuzzy_match, a vector of nearest string matches.
For strings with multiple closest matches, only the first is returned.
For fuzzy_check, a data.frame containing the source strings,
their closest matches, and the string distance for each match.
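To make the shape of these return values more concrete, here is a minimal sketch that builds a comparable table by hand with `stringdist::stringdistmatrix()`. It is not the package's code: the column names `source`, `match`, and `distance` are placeholders, and treating distances above the cutoff as "no match" (`NA`) is an assumption about how `cutoff` behaves.

```r
library(stringdist)

# Made-up inputs for illustration only
a <- c("aple", "bananna", "xyz")
b <- c("apple", "banana", "cherry")

d   <- stringdistmatrix(a, b, method = "jw")
idx <- apply(d, 1, which.min)  # first minimum per row

# Placeholder column names; the real fuzzy_check() output may be labelled differently
check <- data.frame(
  source   = a,
  match    = b[idx],
  distance = d[cbind(seq_along(a), idx)],
  stringsAsFactors = FALSE
)

# Assumed cutoff semantics: anything farther than 0.5 is treated as unmatched
check$match[check$distance > 0.5] <- NA
check
```

Inspecting the distance column first is a convenient way to choose a cutoff before relying on the plain vector of matches.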
library(data.table)

set.seed(575)
fruit <- sample(stringr::fruit, 30)

# Two tables drawing from the same pool of fruit names,
# each with two blocking variables
DTA <- data.table(block1 = sample(LETTERS[1:4], 20, TRUE),
                  block2 = sample(LETTERS[1:4], 20, TRUE),
                  fruit = sample(fruit, 20))
DTB <- data.table(block1 = sample(LETTERS[1:4], 20, TRUE),
                  block2 = sample(LETTERS[1:4], 20, TRUE),
                  fruit = sample(fruit, 20))

# Match across the full vectors
fuzzy_check(DTA$fruit, DTB$fruit)
fuzzy_match(DTA$fruit, DTB$fruit)

# Match within blocks: for each (block1, block2) group in DTA,
# compare only against the DTB rows in the same block
setkey(DTB, block1, block2)
DTA[ , fuzzy_check(fruit, b = DTB[.BY, fruit]),
     by = .(block1, block2)]
DTA[ , .(fruit,
         B_fruit = fuzzy_match(fruit, b = DTB[.BY, fruit])),
     by = .(block1, block2)]