Tidy stringdist calculation
tidy_stringdist(df, v1 = V1, v2 = V2, method = c("osa", "lv", "dl",
  "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"), ...)
df: a dataframe containing the strings to compare
v1: the name of the first column
v2: the name of the second column
method: one or more of the methods implemented in the stringdist package: "osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex". See
...: other parameters passed to
a tibble with the two string columns and one string-distance column per method
proust <- tidy_comb_all(c("Albertine", "Françoise", "Gilberte", "Odette", "Charles"))
tidy_stringdist(proust)
# A tibble: 10 x 12
V1 V2 osa lv dl hamming lcs qgram cosine jaccard jw
* <chr> <chr> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl> <dbl>
1 Albe… Fran… 7 7 7 7 12 10 0.497 0.692 0.444
2 Albe… Gilb… 4 4 4 Inf 5 3 0.142 0.333 0.194
3 Albe… Odet… 6 6 6 Inf 9 9 0.428 0.8 0.389
4 Albe… Char… 8 8 8 Inf 12 10 0.544 0.75 0.579
5 Fran… Gilb… 8 8 8 Inf 13 11 0.578 0.769 0.588
6 Fran… Odet… 8 8 8 Inf 13 13 0.789 0.917 0.574
7 Fran… Char… 7 7 7 Inf 12 8 0.496 0.667 0.495
8 Gilb… Odet… 5 5 5 Inf 8 8 0.4 0.778 0.375
9 Gilb… Char… 7 7 7 Inf 11 9 0.522 0.727 0.565
10 Odet… Char… 6 6 6 Inf 11 11 0.761 0.9 0.563
# … with 1 more variable: soundex <dbl>
Warning message:
In do_dist(a = b, b = a, method = method, weight = weight, q = q, :
Non-printable ascii or non-ascii characters in soundex. Results may be unreliable. See ?printable_ascii.
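The warning above is triggered by the "ç" in "Françoise", which is outside printable ASCII and makes the soundex result unreliable. A minimal sketch of one way to handle this follows. It assumes that method accepts a subset of the names listed in the usage (this matches the signature but is not confirmed by this page), and it uses printable_ascii() from the stringdist package, which the warning itself points to. The object names non_ascii and dists are illustrative only.

```r
library(tidystringdist)

proust <- tidy_comb_all(c("Albertine", "Françoise", "Gilberte", "Odette", "Charles"))

# Find strings with characters outside printable ASCII;
# these are the ones the soundex warning refers to.
all_strings <- unique(c(proust$V1, proust$V2))
non_ascii <- all_strings[!stringdist::printable_ascii(all_strings)]
print(non_ascii)

# Assumption: `method` can be narrowed to a subset of the documented names.
# Leaving out "soundex" avoids the unreliable column and its warning.
dists <- tidy_stringdist(proust, method = c("osa", "lv", "jw"))
print(dists)
```

Whether to drop soundex or clean the input strings first depends on the use case; the sketch only shows the first option.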