std_text: Standardize text prior to matching to account for minor...

View source: R/std_text.R

std_text    R Documentation

Standardize text prior to matching to account for minor variation in character case, spacing, punctuation, or use of accents

Description

Implements the following transformations:

  1. convert to lowercase (base::tolower)

  2. strip diacritics/accents from characters, e.g. "é" becomes "e" (stringi::stri_trans_general)

  3. remove sequences of space or punctuation characters at start or end of string

  4. replace repeated whitespace characters with a single space
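The four steps above can be sketched as follows. This is a minimal illustration and not the package source: the function name std_text_sketch, the "Latin-ASCII" transform id, and the regular expressions are assumptions chosen to match the description. It requires the stringi package.

```r
# Hypothetical re-implementation of the four documented steps.
# Not the dbc source; the transform id and regexes are assumptions.
std_text_sketch <- function(x) {
  # 1. convert to lowercase
  x <- tolower(x)
  # 2. strip diacritics by transliterating Latin characters to ASCII
  x <- stringi::stri_trans_general(x, id = "Latin-ASCII")
  # 3. drop runs of space or punctuation at the start or end of each string
  x <- gsub("^[[:space:][:punct:]]+|[[:space:][:punct:]]+$", "", x)
  # 4. collapse repeated whitespace into a single space
  x <- gsub("[[:space:]]+", " ", x)
  x
}

std_text_sketch(c("CONFIRMED", "Conf.", "confirmed"))
std_text_sketch(c("R\u00e9publique d\u00e9mocratique du  Congo", "Nigeria_"))
```

Under these assumptions the underscore in "Nigeria_" is removed in step 3, because the POSIX class [[:punct:]] includes "_".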

Usage

std_text(x)

Arguments

x

A character vector

Value

A character vector the same length as x, containing the standardized version of each element

Examples

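# variation in case and trailing punctuation
std_text(c("CONFIRMED", "Conf.", "confirmed"))
# accented characters, a doubled space, and trailing punctuation
std_text(c("R\u00e9publique d\u00e9mocratique du  Congo", "Nigeria_"))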
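Since the function is meant to be applied prior to matching, a typical pattern is to standardize both the raw values and the reference values and then compare them with base::match. The sketch below uses a hypothetical stand-in, std_text_standin, built from the four documented steps (an assumption, not the package source); raw_values and allowed are made-up illustration data. It requires the stringi package.

```r
# Hypothetical stand-in for std_text(), following the documented steps.
# Not the dbc source; the transform id and regexes are assumptions.
std_text_standin <- function(x) {
  x <- stringi::stri_trans_general(tolower(x), id = "Latin-ASCII")
  x <- gsub("^[[:space:][:punct:]]+|[[:space:][:punct:]]+$", "", x)
  gsub("[[:space:]]+", " ", x)
}

# Made-up raw entries and reference values
raw_values <- c("CONFIRMED", " Conf.", "confirmed ", "Suspected")
allowed    <- c("Confirmed", "Suspected", "Probable")

# Match on the standardized form, then report the reference spelling
idx     <- match(std_text_standin(raw_values), std_text_standin(allowed))
matched <- allowed[idx]
matched
```

Standardization only absorbs differences in case, spacing, punctuation, and accents. The abbreviation " Conf." still standardizes to "conf", which does not equal "confirmed", so it stays unmatched (NA) and would need a separate rule.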

epicentre-msf/dbc documentation built on Oct. 24, 2023, 9:25 p.m.