best_match: Tries to correct misspelling of character string

View source: R/best_match.R

Description

This function uses fuzzy string matching to replace a possibly misspelled (or otherwise not fully correct) character string with the correct version of the same string.

Usage

best_match(x, key, no_match = NA, all = FALSE)

Arguments

x

is a character string (or a character vector) that should be matched to the key

key

is a vector containing the correct spellings of the character strings.

no_match

Output value if there is no match. Default is NA. The input is returned unchanged if not matched and no_match = NULL.

all

is a logical flag specifying what happens if there is more than one match. The default, FALSE, issues a warning and uses only the first match. If TRUE, the returned vector may no longer have the same length as x.

Value

If all = FALSE, the function returns a character vector of the same length as x, with each element replaced by its best match in key. Strings that could not be matched are replaced by the value of no_match (NA by default), or left unchanged if no_match = NULL. If all = TRUE, one input string can yield more than one output string, so the output may be longer than the input.
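
The handling of unmatched strings can be illustrated with a minimal base R sketch. This is not the rccmisc implementation: the helper sketch_match, its max_dist threshold and the use of edit distance via utils::adist are assumptions made purely for illustration, and its results need not coincide with those of best_match.

```r
# Hypothetical helper, NOT part of rccmisc: pick the nearest key by
# case-insensitive edit distance, falling back to no_match.
sketch_match <- function(x, key, no_match = NA, max_dist = 2) {
  # One row per element of x, one column per element of key.
  d <- utils::adist(x, key, ignore.case = TRUE)
  vapply(seq_along(x), function(i) {
    j <- which.min(d[i, ])        # first key with the smallest distance
    if (d[i, j] <= max_dist) {
      key[j]                      # close enough: use the key's spelling
    } else if (is.null(no_match)) {
      x[i]                        # no_match = NULL: keep input unchanged
    } else {
      as.character(no_match)      # otherwise substitute the no_match value
    }
  }, character(1))
}

keys <- c("hej apa", "hej bepa", "kungen", "Erik")
sketch_match(c("Hej_apa", "erik", "babian"), keys)
sketch_match(c("Hej_apa", "erik", "babian"), keys, no_match = NULL)
```

In this sketch the result always has the same length as x, mirroring the all = FALSE case. Because the result is a character vector, a non-character no_match such as FALSE is coerced to the string "FALSE".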

See Also

clean_text

Examples

best_match(c("Hej_apa!", "erik", "babian"), c("hej apa", "hej bepa", "kungen", "Erik"))
best_match(c("Hej_apa", "erik", "babian"),
   c("hej apa", "hej bepa", "kungen", "Erik"), no_match = FALSE)

Example output

[1] NA     "Erik" NA    
Warning messages:
1: In best_match(c("Hej_apa!", "erik", "babian"), c("hej apa", "hej bepa",  :
  No match!
2: In best_match(c("Hej_apa!", "erik", "babian"), c("hej apa", "hej bepa",  :
  No match!
[1] "hej apa" "Erik"    "FALSE"  
Warning message:
In best_match(c("Hej_apa", "erik", "babian"), c("hej apa", "hej bepa",  :
  No match!

rccmisc documentation built on May 2, 2019, 2:48 p.m.