stringsim  R Documentation 
stringsim computes pairwise string similarities between elements of
character vectors a and b, where the vector with fewer elements is recycled.
stringsimmatrix computes the string similarity matrix with rows according
to a and columns according to b.
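The difference between the element-wise and the matrix form can be illustrated with a small sketch. The input vectors below are made up for illustration, and the example assumes a version of the stringdist package that provides stringsimmatrix.

```r
library(stringdist)

# Illustrative inputs (not from the package documentation).
a <- c("apple", "banana", "cherry", "date")
b <- c("apple", "bandana")

# Element-wise comparison: b is recycled to the length of a,
# so the result is a vector with one similarity per element of a.
sims <- stringsim(a, b, method = "lv")

# All-pairs comparison: one row per element of a, one column per element of b.
sim_matrix <- stringsimmatrix(a, b, method = "lv")

length(sims)
dim(sim_matrix)
```

Here sims compares "apple" with "apple", "banana" with "bandana", "cherry" with "apple" again, and so on, while sim_matrix holds every combination.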
stringsim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)

stringsimmatrix(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw", "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)
a 
R object (target); will be converted by 
b 
R object (source); will be converted by 
method 
Method for distance calculation. The default is 
useBytes 
Perform bytewise comparison, see 
q 
Size of the qgram; must be nonnegative. Only applies to

... 
additional arguments are passed on to 
The similarity is calculated by first calculating the distance using
stringdist, dividing the distance by the maximum possible distance, and
subtracting the result from 1.
This results in a score between 0 and 1, with 1
corresponding to complete similarity and 0 to complete dissimilarity.
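As a rough sketch of this normalization, the following uses base R's adist (Levenshtein distance with unit weights) instead of stringdist. The helper lv_similarity is hypothetical and not part of the package, and it assumes that the maximum possible Levenshtein distance is the length of the longer string; the package's own normalization may differ in detail.

```r
# Hypothetical helper, not part of stringdist: similarity = 1 - d / max_d.
lv_similarity <- function(a, b) {
  # Levenshtein distance between two single strings (all edit costs are 1).
  d <- as.numeric(utils::adist(a, b))
  # Assumed upper bound on the distance: the length of the longer string.
  max_d <- max(nchar(a), nchar(b))
  # Two empty strings have no possible distance; treat them as identical.
  if (max_d == 0) return(1)
  1 - d / max_d
}

lv_similarity("kitten", "sitting")  # distance 3, longer string has 7 characters
lv_similarity("abc", "abc")         # identical strings
lv_similarity("abc", "xyz")         # every character differs
```

Under these assumptions the three calls give 1 - 3/7, 1, and 0 respectively.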
Note that complete similarity only means equality for distances satisfying
the identity property. This is not the case e.g. for qgram based distances
(for example if q=1, anagrams are completely similar).
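The anagram case can be checked directly. This is a small illustration with made-up inputs, contrasting the q-gram method with an edit-based one.

```r
library(stringdist)

# With q = 1 only character counts matter, so anagrams have q-gram distance 0
# and are therefore reported as completely similar.
anagram_sim <- stringsim("listen", "silent", method = "qgram", q = 1)

# An edit-based distance is sensitive to character order, so the same pair
# scores below 1.
edit_sim <- stringsim("listen", "silent", method = "lv")

anagram_sim
edit_sim
```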
For distances where weights can be specified, the maximum distance
is currently computed by assuming that all weights are equal to 1.
stringsim returns a vector with similarities, which are values between
0 and 1 where 1 corresponds to perfect similarity (distance 0) and 0 to
complete dissimilarity. NA is returned when stringdist returns NA.
Distances equal to Inf are truncated to a similarity of 0.
stringsimmatrix works the same way but, analogous to stringdistmatrix,
returns a similarity matrix instead of a vector.
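The handling of NA and Inf can be seen in a short sketch. It relies on the Hamming distance being Inf for strings of unequal length, which is how stringdist documents that method; the inputs are illustrative.

```r
library(stringdist)

# A missing input yields a missing distance, hence a missing similarity.
na_sim <- stringsim(NA_character_, "abc", method = "lv")

# Hamming distance is Inf for strings of different lengths;
# the similarity is then truncated to 0 rather than becoming negative.
inf_sim <- stringsim("abc", "abcd", method = "hamming")

is.na(na_sim)
inf_sim
```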
# Calculate the similarity using the default method of optimal string alignment
stringsim("ca", "abc")

# Calculate the similarity using the Jaro-Winkler method
# The p argument is passed on to stringdist
stringsim('MARTHA', 'MATHRA', method = 'jw', p = 0.1)