clean_text: Clean text strings automatically
In laresbernardo/lares: Lean Analytics and Robust Exploration Sidekick

cleanText

R Documentation

Clean text strings automatically

Description

cleanText: Clean character strings automatically. Options are available to keep only ASCII characters, keep specific extra characters, convert to lower case, or apply title format.

cleanNames: Clean the column names of a data.frame/tibble. Resulting names are unique and consist only of the _ character, numbers, and ASCII letters. Capitalization preferences can be specified using the lower parameter.

Usage

cleanText(
  text,
  spaces = TRUE,
  keep = "",
  lower = TRUE,
  ascii = TRUE,
  title = FALSE
)

cleanNames(df, num = "x", keep = "_", ...)

Arguments

`text`	Character vector.
`spaces`	Boolean. Keep spaces? If a character string is passed instead, spaces are replaced with that string.
`keep`	Character. String (concatenated or as vector) with all characters that are accepted and should be kept, in addition to alphanumeric.
`lower`	Boolean. Transform all to lower case?
`ascii`	Boolean. Only ASCII characters?
`title`	Boolean. Transform to title format (upper case on first letters).
`df`	data.frame/tibble.
`num`	Character. Prefix added to names that are only numeric.
`...`	Additional parameters passed to `cleanText()`.
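
To make the interaction of `spaces`, `keep`, `lower`, and `title` more concrete, here is a minimal base-R sketch of comparable behaviour. It is not the lares implementation: `clean_text_sketch` is a hypothetical helper written for illustration, it assumes ASCII input, and it simply drops non-ASCII characters instead of converting them as `ascii = TRUE` may do.

```r
# Hypothetical helper, NOT the lares implementation: a base-R approximation
# of the argument semantics described above, intended for ASCII input only.
clean_text_sketch <- function(text, spaces = TRUE, keep = "",
                              lower = TRUE, title = FALSE) {
  # Characters that survive: alphanumerics, spaces, and anything in `keep`
  allowed <- c(letters, LETTERS, as.character(0:9), " ",
               unlist(strsplit(paste(keep, collapse = ""), "")))
  clean_one <- function(s) {
    chars <- strsplit(s, "")[[1]]
    s <- paste(chars[chars %in% allowed], collapse = "")
    # Collapse runs of spaces and trim both ends
    s <- trimws(gsub(" +", " ", s))
    if (lower) s <- tolower(s)
    if (title) {
      # Upper-case the first letter of each space-separated word
      s <- gsub("(^| )([a-z])", "\\1\\U\\2", s, perl = TRUE)
    }
    # A character value replaces each space; FALSE removes spaces entirely
    if (is.character(spaces)) {
      s <- gsub(" ", spaces, s, fixed = TRUE)
    } else if (!isTRUE(spaces)) {
      s <- gsub(" ", "", s, fixed = TRUE)
    }
    s
  }
  vapply(as.character(text), clean_one, character(1), USE.NAMES = FALSE)
}

clean_text_sketch("29_Feb-92()#", keep = c("#", "_"), spaces = FALSE)
clean_text_sketch("Bernardo Lare$", spaces = ".")
```

In this sketch `keep` only widens the set of accepted characters, while `spaces` is applied last so that a replacement string such as "." is never stripped by the filtering step. Whether lares applies the steps in the same order is an assumption.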

Details

Inspired by janitor::clean_names.

Value

cleanText: Character vector with transformed strings.

cleanNames: data.frame/tibble with transformed column names.
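
As a rough illustration of what "unique names made only of _, numbers, and ASCII letters" can look like in practice, the following base-R sketch cleans column names, prefixes only-numeric names, and de-duplicates the result. `clean_names_sketch` is a hypothetical helper, not the lares code. Replacing invalid characters with "_" and using `make.unique()` for de-duplication are assumptions chosen for the sketch, and it is meant for ASCII column names.

```r
# Hypothetical helper, NOT the lares implementation
clean_names_sketch <- function(df, num = "x") {
  nms <- tolower(colnames(df))
  # Replace each run of characters other than a-z, 0-9, or "_" with "_"
  nms <- gsub("[^a-z0-9_]+", "_", nms)
  # Drop leading and trailing underscores left by the replacement
  nms <- gsub("^_+|_+$", "", nms)
  # Prefix names that consist only of digits
  only_digits <- grepl("^[0-9]+$", nms)
  nms[only_digits] <- paste0(num, nms[only_digits])
  # make.unique() appends a numeric suffix to repeated names
  colnames(df) <- make.unique(nms, sep = "_")
  df
}

demo <- data.frame(a = 1, b = 2, c = 3, d = 4)
colnames(demo) <- c("ID.", "34", "A b", "a.b")
colnames(clean_names_sketch(demo))
```

The last two columns collapse to the same cleaned name, so the de-duplication step is what keeps the returned data.frame usable.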

Examples

library(lares)

cleanText("Bernardo Lares 123")
cleanText("Bèrnärdo LáreS 123", lower = FALSE)
cleanText("Bernardo Lare$", spaces = ".", ascii = FALSE)
cleanText("\\@®ì÷å   %ñS  ..-X", spaces = FALSE)
cleanText(c("maría", "€", "núñez_a."), title = TRUE)
cleanText("29_Feb-92()#", keep = c("#", "_"), spaces = FALSE)

# For a data.frame directly:
df <- dft[1:5, 1:6] # Dummy data (dft is a sample dataset included in lares)
colnames(df) <- c("ID.", "34", "x_2", "Num 123", "Nòn-äscì", "  white   Spaces  ")
print(df)
cleanNames(df)
cleanNames(df, lower = FALSE)

laresbernardo/lares documentation built on July 4, 2025, 12:23 p.m.