fuzzy_matcher: Fuzzy string matching

View source: R/fuzzy_matcher.R

fuzzy_matcher    R Documentation

Fuzzy string matching

Description

Produces a matrix showing the results of one or more fuzzy string matching methods applied to two sets of words.

Usage

fuzzy_matcher(
  comparand1,
  comparand2,
  distance.methods = c("lv", "cosine", "jaccard", "jw")
)

Arguments

comparand1

Word or set of words to compare. Must be a character vector.

comparand2

Word or set of words to compare against comparand1. Must be a character vector.

distance.methods

String matching method(s) to use. By default, the Levenshtein ("lv"), cosine, Jaccard, and Jaro ("jw") methods are used. Available methods:

"osa": Optimal string alignment (restricted Damerau-Levenshtein distance).
"lv": Levenshtein distance (as in R's native adist).
"dl": Full Damerau-Levenshtein distance.
"hamming": Hamming distance (a and b must have the same number of characters).
"lcs": Longest common substring distance.
"qgram": q-gram distance.
"cosine": Cosine distance between q-gram profiles.
"jaccard": Jaccard distance between q-gram profiles.
"jw": Jaro or Jaro-Winkler distance.
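As a rough illustration of two of these measures, the sketch below uses only base R: adist() for the Levenshtein distance, and two small helpers, qgrams and qgram_jaccard, for the Jaccard distance between sets of character bigrams. Both helpers are hypothetical and not part of this package, and the choice of q = 2 is an assumption. This is not necessarily how fuzzy_matcher computes its results, which this page does not describe.

```r
# Levenshtein distance with base R's adist(): the number of single-character
# insertions, deletions, and substitutions needed to turn one string into another.
words <- c("PECS book", "PECS activity book", "PECS")
lv <- adist(words, "PECS")
rownames(lv) <- words
colnames(lv) <- "PECS"

# Hypothetical helper: split a string into its set of overlapping q-grams.
qgrams <- function(x, q = 2) {
  n <- nchar(x)
  if (n < q) return(character(0))
  unique(substring(x, 1:(n - q + 1), q:n))
}

# Hypothetical helper: Jaccard distance between two q-gram sets,
# i.e. 1 minus (size of intersection / size of union).
qgram_jaccard <- function(a, b, q = 2) {
  A <- qgrams(a, q)
  B <- qgrams(b, q)
  if (length(union(A, B)) == 0) return(0)
  1 - length(intersect(A, B)) / length(union(A, B))
}

jac <- qgram_jaccard("PECS book", "PECS")
```

Here lv is a 3 x 1 matrix of edit distances against "PECS", and jac is a single value between 0 (identical bigram sets) and 1 (no shared bigrams).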

...

Other parameters passed on to the distance methods.

Value

A matrix showing the degree of match for each chosen method and each comparand.
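The exact layout of the returned matrix is not specified here. One plausible shape, assumed purely for illustration, has one row per element of comparand1 and one column per method. The sketch below builds such a matrix in base R; the methods list and its length_diff entry are hypothetical stand-ins, not the package's own distance functions.

```r
comparand1 <- c("PECS book", "PECS activity book", "PECS")
comparand2 <- "PECS"

# Hypothetical stand-ins for distance methods; both use base R only.
methods <- list(
  lv = function(a, b) as.vector(adist(a, b)),             # Levenshtein distance
  length_diff = function(a, b) abs(nchar(a) - nchar(b))   # difference in length
)

# Assumed layout: one row per element of comparand1, one column per method.
result <- sapply(methods, function(f) f(comparand1, comparand2))
rownames(result) <- comparand1
```

Naming the rows after the input words keeps each score traceable to the comparison that produced it.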

Author(s)

Chris Friedman, chris.s.friedman@gmail.com

References

http://bigdata-doctor.com/fuzzy-string-matching-survival-skill-tackle-unstructured-information-r/

Examples

fuzzy_matcher(c("PECS book", "PECS activity book", "PECS"), "PECS")


chris-s-friedman/Friedman documentation built on Feb. 12, 2023, 8:02 p.m.