fuzzy_match: Fuzzy match (single term).

Description

Checks a single term against a vector of terms for fuzzy matches.

Usage

fuzzy_match(term, term_vector, max_dist = 0.1, min_test_length = NA,
  skip_pure_digit = FALSE, dist_method = "jw", jw_penalty = 0)

Arguments

term

Character string (term to be evaluated for matches).

term_vector

Character vector (the terms that term is evaluated against).

max_dist

Numeric from 0 to 1. Distance threshold above which a term is not counted as a match. See agrepl.

min_test_length

Integer. Minimum length a term must have to be evaluated at all.

skip_pure_digit

Logical. If TRUE, a term that consists only of digits is not evaluated at all.

Details

Takes a character string (term) and a character vector (the population of terms) as inputs. Uses the base R agrepl function to perform fuzzy string matching; the optional arguments adjust this behavior. Returns a list containing a character vector of the terms that match the target term and a logical vector that can be used to subset the population of terms to those same values.

Value

Returns a list with two elements, "match_logic" and "matches". "match_logic" is a logical vector that can be used to subset the term population to extract the matched values. "matches" is a character vector of the matched values.
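
The relationship between the two returned elements can be illustrated with a minimal sketch built on base R agrepl. The function fuzzy_match_sketch below is hypothetical and is not the package's implementation: it ignores dist_method and jw_penalty, and it assumes that a skipped term (too short, or pure digits) simply yields no matches.

```r
# Hypothetical sketch; not the vgsample implementation.
fuzzy_match_sketch <- function(term, term_vector, max_dist = 0.1,
                               min_test_length = NA,
                               skip_pure_digit = FALSE) {
  # Assumption: a term that is skipped yields no matches at all.
  no_match <- list(match_logic = rep(FALSE, length(term_vector)),
                   matches = character(0))

  # Skip terms shorter than the minimum test length, if one is set.
  if (!is.na(min_test_length) && nchar(term) < min_test_length) {
    return(no_match)
  }

  # Optionally skip terms made up only of digits.
  if (skip_pure_digit && grepl("^[0-9]+$", term)) {
    return(no_match)
  }

  # agrepl() returns one logical value per element of term_vector.
  match_logic <- agrepl(term, term_vector, max.distance = max_dist)
  list(match_logic = match_logic, matches = term_vector[match_logic])
}

terms <- c("apple", "banana", "cherry")
res <- fuzzy_match_sketch("apple", terms)

# Subsetting the population with match_logic reproduces matches.
identical(terms[res$match_logic], res$matches)
```

Returning both elements lets a caller either read the matched terms directly or reuse match_logic to subset another object aligned with term_vector, such as the rows of a data frame.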


datavores/vgsample documentation built on May 14, 2019, 8:59 p.m.