In richardvogg/fuzzymatch: Detects and replaces inexact string matches.

fuzzymatch

Helps to find inexact matches (e.g. Nestlé vs Nestle) in text data.

devtools::install_github("richardvogg/fuzzymatch")

Short example from TidyTuesday (Week 5 - 2021)

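# Load the TidyTuesday data for 2021, week 5 (requires the tidytuesdayR package)
tuesdata <- tidytuesdayR::tt_load('2021', 5)
plastics <- tuesdata$plastics

# Inspect candidate matches first (find_cutoff = TRUE) to choose a cutoff
dedupes <- fuzzymatch::fuzzy_dedupes(plastics$parent_company, find_cutoff = TRUE)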

The output is sorted by string distance, closest matches first. I checked that the Nestlé / Nestle pair, at a distance of 0.067, would fall under the cutoff I chose.
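
As a rough check on where a value like 0.067 can come from, the sketch below computes the Jaro-Winkler distance between the two spellings with the stringdist package. That fuzzymatch uses this method with a prefix weight of p = 0.1 is an assumption, not something this README states; the result simply agrees with the figure reported above.

```r
# Assumption: Jaro-Winkler ("jw") with prefix weight p = 0.1. The README does
# not say which distance method fuzzymatch uses internally.
library(stringdist)

a <- "Nestl\u00e9"  # "Nestlé", written with an escape to avoid encoding issues
b <- "Nestle"

d <- stringdist(a, b, method = "jw", p = 0.1)
round(d, 3)  # 0.067

# A cutoff of 0.08 would therefore treat the two spellings as a match.
d < 0.08
```

Under this assumption, any cutoff slightly above 0.067 captures the pair, while a smaller cutoff would leave the two spellings separate.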

plastics$parent_company <- fuzzymatch::fuzzy_dedupes(plastics$parent_company, cutoff_distance = 0.08)

I was looking for the top 5 polluters. As Nestle is definitely one of them, I needed the data to be as clean as possible.
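
To illustrate why the cleaning matters for a ranking, here is a minimal base R sketch on made-up data. The company "Acme", the counts, and the column name `grand_total` are illustrative assumptions, and the manual replacement only stands in for the deduplication step.

```r
# Toy data: all values are invented, and `grand_total` is an assumed column name.
toy <- data.frame(
  parent_company = c("Nestl\u00e9", "Nestle", "Acme", "Nestle", "Acme"),
  grand_total    = c(10, 5, 12, 3, 2),
  stringsAsFactors = FALSE
)

# Before cleaning: the two spellings are counted as separate companies.
before <- aggregate(grand_total ~ parent_company, data = toy, FUN = sum)

# Stand-in for the deduplication step: map the variant onto one spelling by hand.
toy$parent_company[toy$parent_company == "Nestl\u00e9"] <- "Nestle"

# After cleaning: one total per company, largest first.
after <- aggregate(grand_total ~ parent_company, data = toy, FUN = sum)
after <- after[order(-after$grand_total), ]
head(after, 5)
```

In the uncleaned table the company's total is split across two rows, which can push it down or out of a top-5 list; after merging the spellings it is ranked on its full total.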

richardvogg/fuzzymatch documentation built on May 19, 2021, 8:50 a.m.
