stringdist: Approximate String Matching and String Distance Functions

Implements an approximate string matching version of R's native 'match' function. Computes various string distances based on edits (Damerau-Levenshtein, Hamming, Levenshtein, optimal string alignment), q-grams (q-gram, cosine, Jaccard distance), or heuristic metrics (Jaro, Jaro-Winkler). An implementation of Soundex is provided as well. Distances can be computed between character vectors, taking proper care of encoding, or between integer vectors representing generic sequences.
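
The following is a minimal sketch of how the features described above might be used. It assumes the package is installed from CRAN and relies on the exported functions stringdist, amatch, phonetic and seq_dist; the input strings are illustrative only.

```r
# Assumes: install.packages("stringdist") has been run beforehand.
library(stringdist)

# Edit-based distance: Levenshtein counts insertions, deletions
# and substitutions.
d_lv <- stringdist("kitten", "sitting", method = "lv")

# Optimal string alignment additionally counts a swap of two
# adjacent characters as a single edit.
d_osa <- stringdist("abc", "acb", method = "osa")

# q-gram based distance: Jaccard distance over 2-grams.
d_jac <- stringdist("night", "nacht", method = "jaccard", q = 2)

# Heuristic metric: Jaro-Winkler is method "jw" with a prefix weight p > 0.
d_jw <- stringdist("martha", "marhta", method = "jw", p = 0.1)

# Approximate version of match(): index of the closest table entry
# within maxDist, or NA if nothing is close enough.
idx <- amatch("leia", c("uhura", "leela"), maxDist = 5)

# Soundex codes for a character vector.
codes <- phonetic(c("Robert", "Rupert"))

# Distance between integer vectors representing generic sequences.
d_seq <- seq_dist(c(1L, 2L, 3L), c(1L, 3L, 2L), method = "lv")
```

Each call returns a plain numeric, integer or character vector, so the results can be combined with ordinary vectorised R code.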

Author: Mark van der Loo [aut, cre], Jan van der Laan [ctb], R Core Team [ctb], Nick Logan [ctb]
Date of publication: 2016-12-16 15:25:23
Maintainer: Mark van der Loo <mark.vanderloo@gmail.com>
License: GPL-3
Version: 0.9.4.4
https://github.com/markvanderloo/stringdist

Files

stringdist
stringdist/inst
stringdist/inst/CITATION
stringdist/tests
stringdist/tests/testthat.R
stringdist/tests/testthat
stringdist/tests/testthat/testSeqDist.R
stringdist/tests/testthat/testPhonetic.R
stringdist/tests/testthat/testAmatch.R
stringdist/tests/testthat/testQgrams.R
stringdist/tests/testthat/testStringsim.R
stringdist/tests/testthat/testStringdist.R
stringdist/src
stringdist/src/Makevars
stringdist/src/Rstringdist.c
stringdist/src/utils.c
stringdist/src/stringdist.h
stringdist/src/dictionary.h
stringdist/src/qtree.h
stringdist/src/lv.c
stringdist/src/soundex.c
stringdist/src/osa.c
stringdist/src/utf8ToInt.c
stringdist/src/lcs.c
stringdist/src/utils.h
stringdist/src/dist.h
stringdist/src/stringdist.c
stringdist/src/dl.c
stringdist/src/qgram.c
stringdist/src/jaro.c
stringdist/src/hamming.c
stringdist/NAMESPACE
stringdist/NEWS
stringdist/R
stringdist/R/seqdist.R
stringdist/R/utils.R
stringdist/R/stringsim.R
stringdist/R/doc_metrics.R
stringdist/R/phonetic.R
stringdist/R/qgrams.R
stringdist/R/stringdist.R
stringdist/R/doc_parallel.R
stringdist/R/doc_encoding.R
stringdist/R/amatch.R
stringdist/MD5
stringdist/DESCRIPTION
stringdist/man
stringdist/man/stringdist-package.Rd
stringdist/man/stringdist-metrics.Rd
stringdist/man/stringdist-parallelization.Rd
stringdist/man/seq_amatch.Rd
stringdist/man/amatch.Rd
stringdist/man/seq_qgrams.Rd
stringdist/man/qgrams.Rd
stringdist/man/printable_ascii.Rd
stringdist/man/stringsim.Rd
stringdist/man/phonetic.Rd
stringdist/man/stringdist.Rd
stringdist/man/seq_sim.Rd
stringdist/man/stringdist-encoding.Rd
stringdist/man/seq_dist.Rd