stringsim computes pairwise string similarities between elements of
character vectors a and b, where the vector with fewer
elements is recycled.
stringsimmatrix computes the string similarity matrix with rows
according to a and columns according to b.
stringsim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)

stringsimmatrix(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)
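a: R object (target); will be converted by as.character.
b: R object (source); will be converted by as.character.
method: Method for distance calculation. The default is "osa"; see stringdist-metrics.
useBytes: Perform byte-wise comparison, see stringdist-encoding.
q: Size of the q-gram; must be nonnegative. Only applies to method = "qgram", "jaccard" or "cosine".
...: Additional arguments are passed on to stringdist (for stringsim) and stringdistmatrix (for stringsimmatrix).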
The similarity is calculated by first calculating the distance using
stringdist, dividing the distance by the maximum
possible distance, and subtracting the result from 1.
This results in a score between 0 and 1, with 1
corresponding to complete similarity and 0 to complete dissimilarity.
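The normalization described above can be sketched in base R. This is only an illustration, not the package's implementation: it uses utils::adist (generalized Levenshtein distance with unit weights, comparable to method "lv") and assumes that the maximum possible distance for a pair is the length of the longer string. The helper name sim_lv_sketch is hypothetical.

```r
# Hypothetical helper, not part of stringdist: similarity = 1 - d / d_max.
sim_lv_sketch <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  # Recycle the shorter vector so both have the same length.
  n <- max(length(a), length(b))
  a <- rep_len(a, n)
  b <- rep_len(b, n)
  # Element-wise Levenshtein distance with unit weights.
  d <- mapply(function(x, y) drop(utils::adist(x, y)), a, b, USE.NAMES = FALSE)
  # Assumed maximum distance: length of the longer string in each pair.
  d_max <- pmax(nchar(a), nchar(b))
  # Two empty strings are treated as identical to avoid dividing 0 by 0.
  ifelse(d_max == 0, 1, 1 - d / d_max)
}

sim_lv_sketch("kitten", "sitting")  # distance 3, longer string has 7 characters: 1 - 3/7
```

For other methods the maximum possible distance is different, so the denominator in this sketch should not be read as a general rule.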
Note that complete similarity only means equality for distances satisfying
the identity property. This is not the case e.g. for q-gram based distances
(for example if q=1, anagrams are completely similar).
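The anagram case can be checked by hand. The sketch below is a base R illustration under the assumption that, for q = 1, the q-gram distance is the sum of absolute differences between per-character counts; the helper name qgram1_dist is hypothetical and is not part of stringdist.

```r
# Hypothetical helper: q-gram distance for q = 1, computed as the sum of
# absolute differences between the character counts of the two strings.
qgram1_dist <- function(a, b) {
  chars_a <- strsplit(a, "")[[1]]
  chars_b <- strsplit(b, "")[[1]]
  # Count every character over the same alphabet so the tables line up.
  alphabet <- union(chars_a, chars_b)
  count_a <- table(factor(chars_a, levels = alphabet))
  count_b <- table(factor(chars_b, levels = alphabet))
  sum(abs(count_a - count_b))
}

qgram1_dist("listen", "silent")  # 0: the strings differ, yet the distance is zero
```

Because the distance is 0, the resulting similarity is 1 even though the two strings are not equal.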
For distances where weights can be specified, the maximum distance
is currently computed by assuming that all weights are equal to 1.
stringsim returns a vector with similarities, which are values between
0 and 1 where 1 corresponds to perfect similarity (distance 0) and 0 to
complete dissimilarity. NA is returned when stringdist returns
NA. Distances equal to Inf are truncated to a
similarity of 0.
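A short illustration of these return values, assuming the stringdist package is installed. The input strings are made up for the example.

```r
library(stringdist)

# A single string in b is recycled against every element of a,
# so the result has one similarity per element of a.
s <- stringsim(c("foo", "bar", "baz"), "bar")
length(s)

# The Hamming distance between strings of unequal length is Inf,
# so the similarity is truncated to 0.
stringsim("abc", "abcd", method = "hamming")

# A missing input gives a missing similarity.
stringsim("abc", NA)

# The matrix variant has one row per element of a and one column per element of b.
m <- stringsimmatrix(c("foo", "bar", "baz"), c("foo", "bat"))
dim(m)
```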
stringsimmatrix works the same way but, like
stringdistmatrix, returns a similarity matrix instead of a
vector.
# Calculate the similarity using the default method of optimal string alignment
stringsim("ca", "abc")

# Calculate the similarity using the Jaro-Winkler method
# The p argument is passed on to stringdist
stringsim('MARTHA', 'MATHRA', method = 'jw', p = 0.1)