str_norm | R Documentation |
Strips special characters, optionally lowercases, and optionally re-encodes to ASCII
str_norm(x, lower = FALSE, ..., to_ASCII = TRUE)
x |
Character vector, or something coercible to character |
lower |
Lowercase the output? Defaults to FALSE. |
... |
Arguments passed to stringr::str_replace_all(), i.e. pattern and replacement |
to_ASCII |
Logical. Should the function first transform the input to ASCII (this often helps with
otherwise stubborn special characters)? Defaults to TRUE. |
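The ASCII step can be pictured with base R's iconv(), which the examples mention in a comment. This is a minimal sketch, not the function's actual code: the UTF-8 source encoding and the "ASCII//TRANSLIT" target are assumptions, and what happens to characters that cannot be transliterated depends on the platform.

```r
# Assumption: the input is UTF-8. The bullet is written as an escape so the
# script itself stays ASCII.
x <- c("COR\u2022SUR", "plain text")

# Ask iconv() to transliterate to ASCII. An unmappable character may become
# a substitute such as "?" or turn the whole element into NA, depending on
# the platform's iconv implementation.
ascii <- iconv(x, from = "UTF-8", to = "ASCII//TRANSLIT")

print(ascii)
```

Text that is already ASCII passes through unchanged, so the step only matters for input such as the bullet in the example string.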
This is a convenience function designed to streamline tasks such as fuzzy character matching, particularly with scraped text. Some options are therefore intentionally hard-coded: any run of repeated whitespace is collapsed to a single space, and the output is trimmed of whitespace on both sides.
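The hard-coded whitespace rule can be sketched in base R as follows. ws_normalize() is a hypothetical helper written for illustration, not part of the package, and the real implementation may differ.

```r
# Hypothetical helper, not part of the package: collapse every run of
# whitespace to a single space, then trim both ends.
ws_normalize <- function(x) {
  x <- as.character(x)                    # coerce, as str_norm() is documented to do
  x <- gsub("\\s+", " ", x, perl = TRUE)  # runs of spaces, tabs, newlines -> one space
  trimws(x, which = "both")               # drop leading and trailing whitespace
}

ws_normalize("  scraped \t text\n\nwith   gaps  ")
```

stringr::str_squish() performs the same two steps, so it is a reasonable drop-in when stringr is already loaded.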
If no arguments are passed to ... for stringr::str_replace_all(), generic defaults are used. These defaults are meant to provide potentially more useful output than a bare error message, but the practice somewhat violates error-handling conventions by still trying to return something, which might not be expected. As such, this behavior might change in future versions, and explicit arguments to str_replace_all (again, via ...) should always be provided.
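For comparison, a stricter alternative to falling back on defaults might look like the sketch below. strict_replace() is a hypothetical name, base R's gsub() stands in for stringr::str_replace_all() (their regex engines differ), and nothing here is taken from the package source.

```r
# Hypothetical strict variant, not the package's code: stop when the caller
# omits pattern or replacement instead of substituting defaults.
strict_replace <- function(x, pattern, replacement) {
  if (missing(pattern) || missing(replacement)) {
    stop("Supply both `pattern` and `replacement` explicitly.", call. = FALSE)
  }
  # Base R gsub() is a stand-in for stringr::str_replace_all() here.
  gsub(pattern, replacement, as.character(x), perl = TRUE)
}

# Explicit arguments: every non-word character becomes a space.
out <- strict_replace("COR-SUR (2024)", "\\W", " ")

# Missing arguments: capture the error message instead of a silent default.
msg <- tryCatch(strict_replace("COR-SUR"), error = function(e) conditionMessage(e))
```

Always passing pattern and replacement, as recommended above, makes calling code behave the same whichever of the two designs a future version adopts.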
A character vector of length equal to x, normalized according to the input arguments and ws-normalized. See details for what ws-normalized means.
x <- "Corrosion Survey Database (COR•SUR)"
str_norm(x, "\\W", " ")
str_norm(x, "\\s", " ") #keep parentheses
str_norm(x, "\\W", " ", to_ASCII = FALSE) #iconv option not used
str_norm(x, "[A-Za-z]", " ", to_ASCII = FALSE) #inverse
str_norm(Sys.Date(), "\\W", " ")
str_norm(1:10, "\\d", "-")
## Not run:
str_norm(x) #will try to use default pattern and replacement. Read the error message!
## End(Not run)