testCharDateTime: testCharDateTime


testCharDateTime    R Documentation

testCharDateTime

Description

Test Character Variables for Dates and Times

Usage

testCharDateTime(x, p = 0.5, m = 0, convert = FALSE, existing = FALSE)

Arguments

x

input vector of any type, but interesting cases are for character x

p

minimum proportion of non-missing, non-blank values of x that must match one of the formats described under Details before x is considered to be of that type

m

if greater than 0, a test is applied: the number of distinct illegal values of x (values containing a letter or underscore) must not exceed m, or type character will be returned. p is set to 1.0 when m > 0.

convert

set to TRUE to convert the variable under the dominant format. If all values are NA, type will be set to 'character'.

existing

set to TRUE to return a character string with the current type of variable without examining pattern matches
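
The check controlled by m can be pictured with a small base-R sketch. This is only an illustration of the rule as worded above, not the code used by testCharDateTime: the helper name n_illegal is hypothetical, and the sketch makes no special allowance for 3-letter month names such as those in DD-MMM-YYYY.

```r
# Hypothetical helper, not part of Hmisc: count the distinct values
# that contain a letter or an underscore, ignoring NAs and blanks.
n_illegal <- function(x) {
  x <- trimws(x)
  x <- x[!is.na(x) & x != ""]
  length(unique(x[grepl("[A-Za-z_]", x)]))
}

x <- c("2023-03-01", "2023-03-02", "a", "a", "b_1", "")
n_illegal(x)        # number of distinct illegal values
n_illegal(x) <= 2   # TRUE would mean the check passes for m = 2
```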

Details

If x is already a date-time, date, or time variable, its type is returned when convert=FALSE; when convert=TRUE, a list containing that type, the original vector, and numna=0 is returned. Otherwise, if x is not a character vector, the type 'notcharacter' is returned, or, when convert=TRUE, a list containing the original x and type='notcharacter'.

When x is character, the main logic is applied. Blank values of x (after trimming) are set to NA before proceeding. The default logic (when m=0) considers x a date-time variable when its format is YYYY-MM-DD HH:MM:SS (:SS is optional) in more than 1/2 of the non-missing observations. x is considered a date if its format is YYYY-MM-DD, MM/DD/YYYY, or DD-MMM-YYYY (MMM = 3-letter month) in more than 1/2 of the non-missing observations. A time variable has the format HH:MM:SS or HH:MM.
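
The matching rule for character input can be sketched in base R. This is a minimal illustration under stated assumptions, not the implementation in Hmisc: the function name guess_type and the regular expressions are hypothetical, the three date formats are pooled into one pattern, and a strict "greater than p" comparison is used to follow the "more than 1/2" wording.

```r
# Hypothetical helper, not part of Hmisc: classify a character vector by
# the share of non-missing, non-blank values that match each format.
guess_type <- function(x, p = 0.5) {
  x <- trimws(x)
  x[!is.na(x) & x == ""] <- NA          # blanks are treated as missing
  x <- x[!is.na(x)]
  if (length(x) == 0) return("character")

  # Assumed patterns for the formats named in the text
  patterns <- c(
    datetime = "^[0-9]{4}-[0-9]{1,2}-[0-9]{1,2} [0-9]{1,2}:[0-9]{2}(:[0-9]{2})?$",
    date     = paste0("^([0-9]{4}-[0-9]{1,2}-[0-9]{1,2}",
                      "|[0-9]{1,2}/[0-9]{1,2}/[0-9]{4}",
                      "|[0-9]{1,2}-[A-Za-z]{3}-[0-9]{4})$"),
    time     = "^[0-9]{1,2}:[0-9]{2}(:[0-9]{2})?$"
  )

  # Proportion of values matching each pattern
  share <- vapply(patterns, function(re) mean(grepl(re, x)), numeric(1))
  best  <- which.max(share)
  if (share[best] > p) names(share)[best] else "character"
}

guess_type(c("2023-03-11", "2023-04-11", "2023-05-01", "a"))
guess_type(c("2023-03-11 11:12", "2023-04-11 11:13:14", "x"))
guess_type(c("a", "b", "2023-03-11"))
```

Pooling the date formats keeps the sketch short; a version that tracks each format separately would be needed to pick a single dominant format for conversion.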

Value

If convert=FALSE, a single character string with the type of x: "character", "datetime", "date", or "time". If convert=TRUE, a list with components named type, x (converted to POSIXct, Date, or chron times format), and numna, the number of originally non-NA values of x that could not be converted to the predominant format. If there were any non-convertible dates/times, the returned vector is given an additional class special.miss and an attribute special.miss, which is a list with the original character values (codes) and observation numbers (obs). These are summarized by describe().
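
To make the shape of the convert=TRUE result more concrete, the following base-R sketch builds comparable pieces by hand for the YYYY-MM-DD format only. It does not call testCharDateTime and is not its implementation; the variable names are illustrative, and it assumes that obs refers to positions in the original vector.

```r
x <- c("2023-03-01", "2023-03-02", "a", NA, "b")

# Values that match the assumed predominant format
ok <- grepl("^[0-9]{4}-[0-9]{2}-[0-9]{2}$", x)

# Convert matching values; everything else becomes NA
converted <- as.Date(ifelse(ok, x, NA_character_), format = "%Y-%m-%d")

# Originally non-NA values that could not be converted
bad     <- which(!is.na(x) & !ok)
numna   <- length(bad)
special <- list(codes = x[bad], obs = bad)

result <- list(type = "date", x = converted, numna = numna)
str(result)
str(special)
```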

Author(s)

Frank Harrell

Examples

library(Hmisc)  # provides testCharDateTime() and describe()
for(conv in c(FALSE, TRUE)) {
  print(testCharDateTime(c('2023-03-11', '2023-04-11', 'a', 'b', 'c'), convert=conv))
  print(testCharDateTime(c('2023-03-11', '2023-04-11', 'a', 'b'), convert=conv))
  print(testCharDateTime(c('2023-03-11 11:12:13', '2023-04-11 11:13:14', 'a', 'b'), convert=conv))
  print(testCharDateTime(c('2023-03-11 11:12', '2023-04-11 11:13', 'a', 'b'), convert=conv))
  print(testCharDateTime(c('3/11/2023', '4/11/2023', 'a', 'b'), convert=conv))
}
x <- c(paste0('2023-03-0', 1:9), 'a', 'a', 'a', 'b')
y <- testCharDateTime(x, convert=TRUE)$x
describe(y)  # note counts of special missing values a, b

harrelfe/Hmisc documentation built on Nov. 21, 2024, 3:47 p.m.