| remove_no_date_characters | R Documentation |
This function cleans a character vector or data frame column containing date-like strings by removing all characters that are not needed for parsing or recognizing dates. It preserves:
Digits (0–9)
Letters that appear in any full month name (e.g., "January" → "J, A, N, U, R, Y")
Selected extra allowed characters: space (" "), dash ("-"), slash ("/"), and "k"/"K"
All other characters (symbols, punctuation, letters not in month names) are removed.
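The allowed letter set described above can be checked with base R alone. This is a minimal sketch that assumes the function derives its letters from the English month names, as in R's built-in `month.name` constant; the variable names are illustrative and not part of the package.

```r
# Unique lowercase letters across the twelve English month names.
month_letters <- sort(unique(unlist(strsplit(tolower(month.name), ""))))

# Letters of the alphabet that never occur in a month name.
never_in_months <- setdiff(letters, month_letters)
print(never_in_months)
# [1] "k" "q" "w" "x" "z"

# "k" is allowed explicitly, so only these letters would be stripped.
removed_letters <- setdiff(never_in_months, "k")
print(removed_letters)
# [1] "q" "w" "x" "z"
```

Under that assumption, 21 of the 26 letters survive cleaning, so the function strips symbols and punctuation far more than it strips letters.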
remove_no_date_characters(df_column)
| df_column | A character vector (or data frame column) containing date-like strings. Factors will be coerced to character. NA values are preserved. |
The function works as follows:
Converts input to character vector.
Generates the set of letters that occur in any English month name (matched case-insensitively).
Constructs a regex pattern to match all characters that are NOT digits, allowed letters, or allowed extra symbols.
Uses stringr::str_replace_all() to remove unwanted characters.
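The four steps above can be sketched as follows. This is a hypothetical re-implementation for illustration, not the package's source: the name `sketch_remove_no_date_characters` is invented here, and base R's `gsub()` stands in for `stringr::str_replace_all()` so that the example runs without extra packages.

```r
# Hypothetical sketch of the documented steps; not the package's actual code.
sketch_remove_no_date_characters <- function(df_column) {
  # Step 1: coerce to character (factors become their labels, NA stays NA).
  x <- as.character(df_column)

  # Step 2: letters occurring in any English month name, in both cases,
  # plus the explicitly allowed "k" and "K".
  month_letters <- unique(unlist(strsplit(tolower(month.name), "")))
  allowed_letters <- c(month_letters, toupper(month_letters), "k", "K")

  # Step 3: negated character class matching everything that is NOT allowed.
  # The dash is placed last so it is read literally rather than as a range.
  pattern <- paste0("[^0-9", paste(allowed_letters, collapse = ""), " /-]")

  # Step 4: delete every match (stand-in for stringr::str_replace_all()).
  gsub(pattern, "", x)
}

dates <- c("March 5, 2021!", "12.03.2021", "Week 12/2020", NA)
cleaned <- sketch_remove_no_date_characters(dates)
print(cleaned)
# [1] "March 5 2021" "12032021"     "eek 12/2020"  NA
```

The third element shows a side effect worth knowing: because the filter works per character, a non-date word such as "Week" loses only its "W" and leaves the fragment "eek" behind.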
A character vector of the same length as df_column, with unwanted characters removed. Only digits, letters from month names, and selected extra characters are kept.
Lukasz Andrzejewski