Implements an approximate string matching version of R's native 'match' function. Can calculate various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance) or heuristic metrics (Jaro, Jaro-Winkler). An implementation of Soundex is provided as well. Distances can be computed between character vectors, taking proper care of encoding, or between integer vectors representing generic sequences.
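To make the idea of edit-based approximate matching concrete, here is a minimal Python sketch. It is not the package's implementation or its API: the names `levenshtein` and `approx_match` are hypothetical, indices are 0-based, and returning `None` when nothing is close enough is an assumption standing in for a "no match" result. It uses the textbook dynamic-programming Levenshtein distance and works on any sequence, so strings and integer lists are handled alike.

```python
from typing import Optional, Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Textbook dynamic-programming Levenshtein distance (illustrative only)."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(
                prev[j] + 1,         # delete x from a
                curr[j - 1] + 1,     # insert y into a
                prev[j - 1] + cost,  # substitute x by y (or keep it)
            ))
        prev = curr
    return prev[-1]


def approx_match(x: str, table: Sequence[str], max_dist: int) -> Optional[int]:
    """Hypothetical helper: 0-based index of the closest entry in `table`
    whose distance to `x` is at most `max_dist`, or None if there is none."""
    best_index, best_dist = None, max_dist + 1
    for i, candidate in enumerate(table):
        d = levenshtein(x, candidate)
        if d < best_dist:  # ties keep the earliest entry
            best_index, best_dist = i, d
    return best_index
```

For example, `approx_match("leia", ["uhura", "leela"], 5)` picks the second entry, because "leela" is two edits away from "leia" while "uhura" is four edits away.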
|Author|Mark van der Loo [aut, cre], Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb]|
|Date of publication|2016-12-16 15:25:23|
|Maintainer|Mark van der Loo <email@example.com>|
amatch: Approximate string matching
phonetic: Phonetic algorithms
printable_ascii: Detect the presence of non-printable or non-ascii characters
qgrams: Get a table of qgram counts from one or more character...
seq_amatch: Approximate matching for integer sequences.
seq_dist: Compute distance metrics between integer sequences
seq_qgrams: Get a table of qgram counts for integer sequences
seq_sim: Compute similarity scores between sequences of integers
stringdist: Compute distance metrics between strings
stringdist-encoding: String metrics in 'stringdist'
stringdist-metrics: String metrics in 'stringdist'
stringdist-package: A package for string distance calculation and approximate...
stringdist-parallelization: Multithreading and parallelization in 'stringdist'
stringsim: Compute similarity scores between strings
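The q-gram topics in the index above rest on a simple idea: slide a window of width q over a string and count each substring. The following Python sketch illustrates that idea along with two distances commonly built on it. All function names are hypothetical and the definitions are the usual textbook ones (sum of absolute count differences for the q-gram distance, one minus intersection over union of the q-gram sets for Jaccard); the package's own functions may differ in details such as the handling of empty inputs.

```python
from collections import Counter


def qgram_counts(s: str, q: int = 2) -> Counter:
    """Count every substring of length q in s (empty if len(s) < q)."""
    return Counter(s[i:i + q] for i in range(len(s) - q + 1))


def qgram_distance(a: str, b: str, q: int = 2) -> int:
    """Sum of absolute differences between the two q-gram count tables."""
    ca, cb = qgram_counts(a, q), qgram_counts(b, q)
    # A Counter returns 0 for missing keys, so q-grams unique to one
    # string contribute their full count.
    return sum(abs(ca[g] - cb[g]) for g in ca.keys() | cb.keys())


def jaccard_distance(a: str, b: str, q: int = 2) -> float:
    """1 - |A ∩ B| / |A ∪ B| over the sets of q-grams of a and b."""
    sa, sb = set(qgram_counts(a, q)), set(qgram_counts(b, q))
    union = sa | sb
    if not union:
        # Assumption: two strings without any q-grams count as identical.
        return 0.0
    return 1.0 - len(sa & sb) / len(union)
```

With q = 2, "abc" and "abd" share only the q-gram "ab", so their q-gram distance is 2 and their Jaccard distance is 2/3.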