text_normalization: Normalize text data

View source: R/text_normalization.R

Description

Normalize text data by optionally removing accents, removing special characters, and normalizing whitespace. These steps are lossy: they can remove a lot of information from your strings.
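
To make the information-loss warning concrete, the following is a minimal base R sketch that mimics the default settings listed under Usage (lowercase, drop apostrophes, turn runs of punctuation and whitespace into single spaces, trim). It does not call the package. The helper name `approx_defaults` and the order of the steps are assumptions made for illustration, so the package's actual behavior may differ in detail.

```r
# Hypothetical helper approximating the documented defaults with base R only.
# Assumption: the step order below; the package's own order may differ.
approx_defaults <- function(x) {
  x <- tolower(x)                    # lowercase = TRUE
  x <- gsub("'", "", x)              # remove = c("'")
  x <- gsub("[[:punct:]]+", " ", x)  # spaces, first pattern
  x <- gsub("[[:space:]]+", " ", x)  # spaces, second pattern
  trimws(x)                          # trim = TRUE
}

raw <- c("It's", "its", "ITS!")
normalized <- approx_defaults(raw)

# Three distinct inputs collapse to a single normalized value.
n_before <- length(unique(raw))
n_after <- length(unique(normalized))
```

Under these assumptions the contraction, the possessive, and the shouted form become indistinguishable, which is the kind of loss to weigh before normalizing.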

Usage

text_normalization(x, lowercase = TRUE, remove = c("'"),
  spaces = c("[[:punct:]]+", "[[:space:]]+"), remove_accents = FALSE,
  trim = TRUE)

Arguments

x

a character vector

lowercase

boolean, whether to convert the text to lowercase

remove

a character vector of characters (or regular expressions) to remove entirely

spaces

a character vector of characters (or regular expressions) to convert to spaces

remove_accents

boolean, whether to remove accents

trim

boolean, whether to trim the final string before returning
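
One way the arguments above could map onto processing steps is sketched below in base R. This is not the r2vec source: the function name `normalize_sketch`, the order of the steps, and the use of `iconv()` for accent removal are all assumptions chosen for illustration.

```r
# Hypothetical sketch of how the documented arguments might be applied.
# NOT the package implementation; name, step order, and iconv() are assumptions.
normalize_sketch <- function(x, lowercase = TRUE, remove = c("'"),
                             spaces = c("[[:punct:]]+", "[[:space:]]+"),
                             remove_accents = FALSE, trim = TRUE) {
  if (remove_accents) {
    # Transliterate to ASCII; the exact result is platform dependent.
    x <- iconv(x, to = "ASCII//TRANSLIT")
  }
  if (lowercase) {
    x <- tolower(x)
  }
  # Each element of `remove` is treated as a regex and deleted outright.
  for (pattern in remove) {
    x <- gsub(pattern, "", x)
  }
  # Each element of `spaces` is treated as a regex and replaced by one space.
  for (pattern in spaces) {
    x <- gsub(pattern, " ", x)
  }
  if (trim) {
    x <- trimws(x)
  }
  x
}

cleaned <- normalize_sketch("  Don't  STOP--believing!  ")
kept_case <- normalize_sketch("A-b", lowercase = FALSE, spaces = character(0))
untrimmed <- normalize_sketch(" a ", trim = FALSE)
```

In this sketch the patterns in `remove` run before those in `spaces`, so an apostrophe is deleted rather than turned into a space by the punctuation pattern. Passing `character(0)` for either argument skips that step entirely.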

Value

a character vector


zachmayer/r2vec documentation built on May 4, 2019, 9:05 p.m.