Short utility functions to clean certain characteristics of strings. These
are combined in string_clean.
.remove_special_chars(string)
.replace_umlaute(string)
.remove_diacritics(string)
string: A character vector.
.remove_special_chars replaces every character that neither belongs to the regex classes \w or \d nor is a literal whitespace with a single whitespace. The function preserves German Umlaute and diacritical letters.
Elaboration on the Regex classes: https://stackoverflow.com/a/2998550/13542638.
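The replacement rule can be illustrated with a minimal sketch. This is not the package's implementation: the name remove_special_chars_sketch is hypothetical, and the pattern assumes that [[:alnum:]_] (which R's regex documentation gives as the equivalent of \w, and which already covers the digits of \d) is an acceptable stand-in for the classes named above. Whether letters such as "ä" count as alphanumeric depends on the session locale, so the sketch is only shown on ASCII input.

```r
# Hypothetical stand-in for illustration, not thoremisc's own code.
remove_special_chars_sketch <- function(string) {
  # Keep letters, digits, underscores and literal spaces;
  # replace every other single character with one space.
  gsub("[^[:alnum:]_ ]", " ", string)
}

remove_special_chars_sketch("hello-world.")
# "hello world "
```

Each special character is replaced one for one, so runs of special characters produce runs of spaces.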
.replace_umlaute replaces German Umlaute with their ASCII representations: "ä"->"ae", "ö"->"oe",
and "ü"->"ue". "ß" is treated as a diacritical letter and is handled by .remove_diacritics.
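The mapping listed above can be sketched as a loop over fixed-string replacements. The function name replace_umlaute_sketch is hypothetical and the code is an assumption about one way to apply the mapping, not the package's source. Only the three lowercase pairs stated above are included; how uppercase Umlaute are handled is not specified here, so the sketch leaves them untouched.

```r
# Hypothetical stand-in for illustration, not thoremisc's own code.
replace_umlaute_sketch <- function(string) {
  from <- c("ä", "ö", "ü")
  to <- c("ae", "oe", "ue")
  for (i in seq_along(from)) {
    # fixed = TRUE treats the Umlaut as a literal string, not a regex.
    string <- gsub(from[i], to[i], string, fixed = TRUE)
  }
  string
}

replace_umlaute_sketch("trörö")
# "troeroe"
```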
.remove_diacritics replaces diacritical letters (é, ç, ...) with their "plain" versions. This
function can only handle diacritical letters from Latin-based alphabets.
Elements in string containing non-Latin letters (e.g. Cyrillic) will
be replaced by NA and a warning will be given.
Reference: https://stackoverflow.com/a/20495866/13542638
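The NA-and-warning behaviour described above can be sketched as follows. This is an assumption-laden illustration, not the package's code: remove_diacritics_sketch is a hypothetical name, the three-letter mapping passed to chartr() is deliberately tiny, and the sketch assumes a UTF-8 session. It relies on base R's iconv() returning NA for elements that cannot be converted to ASCII.

```r
# Hypothetical stand-in for illustration, not thoremisc's own code.
remove_diacritics_sketch <- function(string) {
  # Tiny illustrative mapping; a real implementation would cover many more letters.
  plain <- chartr("éçñ", "ecn", string)
  # iconv() yields NA for elements that still contain non-ASCII characters.
  out <- iconv(plain, from = "UTF-8", to = "ASCII")
  if (any(is.na(out) & !is.na(string))) {
    warning("Some elements could not be converted to ASCII and were set to NA.")
  }
  out
}

remove_diacritics_sketch("café")
# "cafe"
```

An element written in Cyrillic passes through chartr() unchanged, fails the ASCII conversion, and therefore comes back as NA together with the warning.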
.remove_special_chars returns string with non-letter Unicode characters replaced by a whitespace.
.replace_umlaute returns string with any German Umlaute replaced.
.remove_diacritics returns string with diacritical letters replaced by their ASCII versions.
string_clean and string_redund_ws
thoremisc:::.remove_special_chars("This will be modified: hello-world.")
thoremisc:::.replace_umlaute("Äh, trörö in Überlingen, nicht auf dem Darß.")
thoremisc:::.remove_diacritics("Åll thëşé fūñny leŧters wîll be nørmalised.")