stringsim | R Documentation |

`stringsim` computes pairwise string similarities between elements of
`character` vectors `a` and `b`, where the vector with fewer
elements is recycled.
`stringsimmatrix` computes the string similarity matrix with rows
according to `a` and columns according to `b`.

```
stringsim(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw",
    "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)

stringsimmatrix(
  a,
  b,
  method = c("osa", "lv", "dl", "hamming", "lcs", "qgram", "cosine", "jaccard", "jw",
    "soundex"),
  useBytes = FALSE,
  q = 1,
  ...
)
```

`a` |
R object (target); will be converted by |

`b` |
R object (source); will be converted by |

`method` |
Method for distance calculation. The default is `"osa"`. |

`useBytes` |
Perform byte-wise comparison, see |

`q` |
Size of the q-gram. |

`...` |
additional arguments are passed on to |

The similarity is calculated by first calculating the distance using
`stringdist`, dividing the distance by the maximum
possible distance, and subtracting the result from 1.
This results in a score between 0 and 1, with 1
corresponding to complete similarity and 0 to complete dissimilarity.
Note that complete similarity only means equality for distances satisfying
the identity property. This is not the case e.g. for q-gram based distances
(for example if q=1, anagrams are completely similar).
For distances where weights can be specified, the maximum distance
is currently computed by assuming that all weights are equal to 1.
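
The normalization described above can be sketched in base R. This is an illustration only, not the package's implementation: `lv_similarity` is a hypothetical helper, base R's `utils::adist` stands in for `stringdist(a, b, method = "lv")`, and the maximum possible distance is assumed to be the length of the longer string, which holds for Levenshtein distance with all weights equal to 1.

```r
# Hypothetical helper (not part of stringdist): Levenshtein-based similarity,
# computed as 1 - distance / maximum possible distance.
lv_similarity <- function(a, b) {
  a <- as.character(a)
  b <- as.character(b)
  # Pairwise Levenshtein distances; mapply recycles the shorter vector.
  d <- mapply(function(x, y) drop(utils::adist(x, y)), a, b, USE.NAMES = FALSE)
  # With unit weights the distance cannot exceed the longer string's length.
  max_d <- pmax(nchar(a), nchar(b))
  # Two empty strings would give 0/0; this sketch defines that case as 1.
  ifelse(max_d == 0, 1, 1 - d / max_d)
}

lv_similarity("kitten", "sitting")   # distance 3, longer string has 7 characters
lv_similarity(c("foo", "bar"), "foo")
```

For other methods the maximum possible distance differs, so the divisor in this sketch does not carry over unchanged.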

`stringsim` returns a vector with similarities, which are values between
0 and 1 where 1 corresponds to perfect similarity (distance 0) and 0 to
complete dissimilarity. `NA` is returned when `stringdist` returns `NA`.
Distances equal to `Inf` are truncated to a similarity of 0.
`stringsimmatrix` works the same way but, equivalent to
`stringdistmatrix`, returns a similarity matrix instead of a vector.

```
# Calculate the similarity using the default method of optimal string alignment
stringsim("ca", "abc")
# Calculate the similarity using the Jaro-Winkler method
# The p argument is passed on to stringdist
stringsim('MARTHA','MATHRA',method='jw', p=0.1)
```
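
A few further calls may help illustrate the behaviour described on this page: recycling of the shorter vector, the shape of the matrix returned by `stringsimmatrix`, the anagram case for q-gram distances with `q = 1`, and `NA` propagation. This is a short sketch that assumes the `stringdist` package is installed; the input strings are arbitrary.

```r
library(stringdist)

# Element-wise: the shorter vector ("foo") is recycled against each element of a.
s <- stringsim(c("foo", "bar", "baz"), "foo")
s

# All pairs: rows follow a, columns follow b, so this is a 2 x 3 matrix.
m <- stringsimmatrix(c("foo", "bar"), c("foo", "baz", "bar"))
dim(m)

# With q = 1 only character counts matter, so anagrams are completely similar.
anagram_sim <- stringsim("listen", "silent", method = "qgram", q = 1)
anagram_sim

# NA input propagates to the result.
na_sim <- stringsim(NA, "abc")
na_sim
```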
