View source: R/name_phonetic_matching_functions.R
fuzzify_country | R Documentation |
fuzzify_country() cleans the "country" column in the user's dataset for names
that did not find an exact match in the existing country_dictionary.
Because the country dictionary is small and there are far fewer countries than
climate actors, fuzzy matching here uses the Levenshtein distance rather than the
phonetic algorithms used for matching climate actor names.
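To illustrate Levenshtein-based matching in general, here is a minimal sketch using base R's adist(). It is not the package's implementation: the dictionary entries, the misspelled inputs, and the names country_dict, user_countries and matches are all hypothetical.

```r
# Hypothetical miniature dictionary; the real country dictionary is larger
# and its structure is not shown in this page.
country_dict <- c("Germany", "France", "United States of America", "Brazil")

# Hypothetical user entries that failed exact matching.
user_countries <- c("Germny", "Frnace", "Brasil")

# utils::adist() returns a matrix of generalized Levenshtein distances:
# one row per user entry, one column per dictionary entry.
dist_matrix <- adist(user_countries, country_dict, ignore.case = TRUE)

# For each user entry, find the dictionary entry with the smallest distance.
best_idx <- apply(dist_matrix, 1, which.min)
best_dist <- dist_matrix[cbind(seq_along(user_countries), best_idx)]

matches <- data.frame(
  input = user_countries,
  suggestion = country_dict[best_idx],
  distance = best_dist,
  stringsAsFactors = FALSE
)
print(matches)
```

With a dictionary of a few hundred names, computing the full distance matrix like this is cheap, which is presumably why an edit-distance approach is practical for countries.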
fuzzify_country(dataset, country_keydict)
dataset | User-supplied dataset containing countries |
country_keydict | Key dictionary to clean actors' countries against |
Cleaned dataset with countries standardized against the country dictionary.
A few vectors of indices are also created to store the indices of those
countries that need to be matched. The first is a vector of indices of all actors
that require cleaning. unmatched_count is a vector of indices of countries
not cleaned by the function. custom_count is a vector of indices
denoting countries for which custom actor names are given by the user, and will be
used to update the country dictionary.
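The bookkeeping described above could look roughly like the following sketch. It is an assumption about how such index vectors might be built, not the function's actual code: the example data, the name needs_cleaning, and the max_distance threshold are hypothetical, and only unmatched_count is a name taken from this page. The interactive step that would populate custom_count is omitted.

```r
# Hypothetical user dataset and dictionary.
dataset <- data.frame(
  name = c("A", "B", "C", "D"),
  country = c("France", "Frnace", "Atlantis", "Brazil"),
  stringsAsFactors = FALSE
)
country_dict <- c("France", "Brazil", "Germany")

# Indices of rows whose country has no exact match in the dictionary.
needs_cleaning <- which(!(dataset$country %in% country_dict))

# Hypothetical acceptance threshold on the Levenshtein distance.
max_distance <- 2

# Distance from each unmatched entry to every dictionary entry.
dists <- adist(dataset$country[needs_cleaning], country_dict)
best <- apply(dists, 1, which.min)
accepted <- dists[cbind(seq_along(best), best)] <= max_distance

# Replace accepted entries with their closest dictionary name.
dataset$country[needs_cleaning[accepted]] <- country_dict[best[accepted]]

# Indices of rows the fuzzy match could not clean.
unmatched_count <- needs_cleaning[!accepted]
```

Here row 2 is corrected to "France", while row 3 stays unmatched because no dictionary entry is within the threshold.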