clean_labels | R Documentation |
This function standardises labels, e.g. those used as variable names or
character string values: it removes non-ASCII characters, replaces diacritics
(e.g. é, ô) with their closest ASCII equivalents, and standardises separating
characters. See details for more information on label transformation.
clean_labels(x, sep = "_", transformation = "Any-Latin; Latin-ASCII", protect = "")
x |
A vector of labels, normally provided as characters. |
sep |
A character string used as separator, defaulting to '_'. |
transformation |
a string to be passed on to |
protect |
A character string defining the punctuation that should be protected. This helps prevent meaningful symbols like > and < from being removed. |
The following changes are performed:
all non-ASCII characters are removed
all diacritics are replaced with their unaccented equivalents, e.g. 'é', 'ê' and 'è' become 'e'
all characters are set to lower case
separators are standardised to the use of a single character provided in sep (defaults to '_'); leading and trailing separators are removed
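The changes listed above can be approximated in a few lines. The following is a minimal sketch, not the implementation of clean_labels(): sketch_clean is a hypothetical helper, it assumes the stringi package is installed, it ignores the protect argument, and it turns any leftover non-ASCII character into a separator rather than deleting it.

```r
# Hypothetical helper for illustration; NOT the implementation of clean_labels().
# Assumes the stringi package is installed.
sketch_clean <- function(x, sep = "_",
                         transformation = "Any-Latin; Latin-ASCII") {
  x <- as.character(x)
  # Transliterate to Latin, then replace diacritics with ASCII equivalents
  x <- stringi::stri_trans_general(x, id = transformation)
  # Set all characters to lower case
  x <- tolower(x)
  # Collapse every run of characters other than a-z and 0-9 into one separator
  x <- gsub("[^a-z0-9]+", sep, x)
  # Drop a leading or trailing separator
  x <- ifelse(startsWith(x, sep), substring(x, nchar(sep) + 1), x)
  x <- ifelse(endsWith(x, sep), substring(x, 1, nchar(x) - nchar(sep)), x)
  x
}

# "\u00cf", "\u00ea" and "\u00eb" are the escapes for Ï, ê and ë
sketch_clean("-_-This is; A We\u00cfrD**./s\u00eant\u00ebnce...")
sketch_clean("-_-This is; A We\u00cfrD**./s\u00eant\u00ebnce...", sep = ".")
```

Unicode escapes are used instead of literal accented characters so that the strings are marked as UTF-8 regardless of the session locale.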
Because of differences between versions of the underlying transliteration
engine (ICU), the default transformations will not transliterate German
umlauts correctly. To handle them, add "de-ASCII" to the transformation
string after "Any-Latin", e.g. "Any-Latin; de-ASCII; Latin-ASCII".
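The effect of the German rules can be seen directly on the transliteration step. This sketch assumes the stringi package is installed and that the local ICU build lists the rule under the exact ID "de-ASCII"; the variable names are illustrative only.

```r
# Assumes the stringi package is installed; "M\u00fcller" is "Müller".
x <- "M\u00fcller"
default_id <- "Any-Latin; Latin-ASCII"
german_id  <- "Any-Latin; de-ASCII; Latin-ASCII"

# Default rules: the umlaut simply loses its diaeresis
stringi::stri_trans_general(x, id = default_id)

# Only try the German rules when this ICU build lists them
has_de_ascii <- "de-ASCII" %in% stringi::stri_trans_list()
if (has_de_ascii) {
  # German rules expand the umlaut before the ASCII conversion runs
  stringi::stri_trans_general(x, id = german_id)
}
```

Checking stringi::stri_trans_list() first avoids requesting a rule that the installed ICU does not provide.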
Thibaut Jombart thibautjombart@gmail.com, Zhian N. Kamvar
## Not run:
clean_labels("-_-This is; A WeÏrD**./sêntënce...")
clean_labels("-_-This is; A WeÏrD**./sêntënce...", sep = ".")
input <- c("Peter and stëven", "peter-and.stëven", "pëtêr and stëven _-")
input
clean_labels(input)

# Don't transliterate non-latin words
clean_labels(input, transformation = "Latin-ASCII")

# protect useful symbols
clean_labels(c("energy > 9000", "energy < 9000"), protect = "><")

# if you only want to clean accents, transform to lower, and transliterate,
# you can specify "[:punct:][:space:]" for protect:
clean_labels(input, protect = "[:punct:][:space:]")

# appropriately transliterate Germanic umlaute
if (stringi::stri_info()$ICU.system) {
  # This will only be true if you have the correct version of ICU installed
  clean_labels("'é', 'ê' and 'è' become 'e', 'ö' becomes 'oe', etc.",
               transformation = "Any-Latin; de-ASCII; Latin-ASCII")
}
## End(Not run)