cleanText | R Documentation |
cleanText: Clean character strings automatically. Options are available to keep only ASCII characters, keep certain characters, convert to lower case, and apply title format.
cleanNames: Resulting names are unique and consist only of the _ character, numbers, and ASCII letters. Capitalization preferences can be specified using the lower parameter.
cleanText(
  text,
  spaces = TRUE,
  keep = "",
  lower = TRUE,
  ascii = TRUE,
  title = FALSE
)
cleanNames(df, num = "x", keep = "_", ...)
text | Character vector. |
spaces | Boolean. Keep spaces? If a character string is passed instead, spaces are replaced with that string. |
keep | Character. String (concatenated or as a vector) with all characters that are accepted and should be kept, in addition to alphanumeric characters. |
lower | Boolean. Transform all text to lower case? |
ascii | Boolean. Keep only ASCII characters? |
title | Boolean. Transform to title format (upper case on first letters)? |
df | data.frame/tibble. |
num | Character. Prefix added to names that consist only of numbers. |
... | Additional parameters passed to cleanText(). |
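To make the keep argument more concrete, the following is a minimal base R sketch of the filtering idea: alphanumeric characters and spaces pass through, plus whatever is listed in keep. sketch_clean is a hypothetical helper written only for illustration. It is not the lares implementation, it ignores the spaces, ascii, and title options, and its output may differ from what cleanText() returns.

```r
# Hypothetical helper (not part of lares): keep alphanumerics, spaces,
# and any extra characters listed in `keep`; drop everything else.
sketch_clean <- function(text, keep = "", lower = TRUE) {
  # `keep` may be one concatenated string or a vector of characters
  extra <- strsplit(paste(keep, collapse = ""), "")[[1]]
  allowed <- c(letters, LETTERS, as.character(0:9), " ", extra)
  out <- vapply(text, function(s) {
    chars <- strsplit(s, "")[[1]]
    paste(chars[chars %in% allowed], collapse = "")
  }, character(1), USE.NAMES = FALSE)
  if (lower) out <- tolower(out)
  trimws(out)
}

# Both ways of passing `keep` are treated the same in this sketch
sketch_clean("29_Feb-92()#", keep = c("#", "_"))
sketch_clean("29_Feb-92()#", keep = "#_", lower = FALSE)
```

Because the sketch works on an explicit set of allowed characters, accented letters are simply dropped here rather than transliterated.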
Inspired by janitor::clean_names.
cleanText: Character vector with transformed strings.
cleanNames: data.frame/tibble with transformed column names.
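The three guarantees described for cleanNames (restricted character set, a prefix for only-numeric names, and uniqueness) can be sketched in base R as follows. sketch_names is a hypothetical function and the exact rules are assumptions: replacing runs of disallowed characters with an underscore and using make.unique() for deduplication are illustrative choices, not necessarily what lares does.

```r
# Hypothetical helper (not part of lares): approximate column-name cleaning.
sketch_names <- function(nms, num = "x") {
  # Replace every run of characters outside [A-Za-z0-9_] with "_", then lower-case
  out <- tolower(gsub("[^A-Za-z0-9_]+", "_", nms))
  # Trim leading and trailing underscores left over from the replacement
  out <- gsub("^_+|_+$", "", out)
  # Prefix names made only of digits so they become syntactically valid
  only_num <- grepl("^[0-9]+$", out)
  out[only_num] <- paste0(num, out[only_num])
  # Append a numeric suffix to duplicates so all names are unique
  make.unique(out, sep = "_")
}

sketch_names(c("ID.", "34", "x_2", "Num 123", "ID"))
```

In this sketch "ID." and "ID" both reduce to "id", so the second one receives a suffix to keep the result unique.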
Other Data Wrangling: balance_data(), categ_reducer(), date_cuts(), date_feats(), file_name(), formatHTML(), holidays(), impute(), left(), normalize(), num_abbr(), ohe_commas(), ohse(), quants(), removenacols(), replaceall(), replacefactor(), textFeats(), textTokenizer(), vector2text(), year_month(), zerovar()
Other Text Mining: ngrams(), remove_stopwords(), replaceall(), sentimentBreakdown(), textCloud(), textFeats(), textTokenizer(), topics_rake()
cleanText("Bernardo Lares 123")
cleanText("Bèrnärdo LáreS 123", lower = FALSE)
cleanText("Bernardo Lare$", spaces = ".", ascii = FALSE)
cleanText("\\@®ì÷å %ñS ..-X", spaces = FALSE)
cleanText(c("maría", "€", "núñez_a."), title = TRUE)
cleanText("29_Feb-92()#", keep = c("#", "_"), spaces = FALSE)
# For a data.frame directly:
df <- dft[1:5, 1:6] # Dummy data
colnames(df) <- c("ID.", "34", "x_2", "Num 123", "Nòn-äscì", " white Spaces ")
print(df)
cleanNames(df)
cleanNames(df, lower = FALSE)