validators | R Documentation |
data validators
All validators except is.na_validator ignore NA entries.
is.na_validator(x, reason = "mandatory")

POSIXct_validator(
  x,
  ago = 7,
  reason = "date-time wrong, in the future or older than a week"
)

hhmm_validator(x, reason = "invalid time")

date_validator(x, reason = "invalid date - should be: yyyy-mm-dd")

datetime_validator(
  x,
  reason = "invalid datetime_ - should be: yyyy-mm-dd hh:mm"
)

datetime_validatorSS(
  x,
  reason = "invalid datetime_ - should be: yyyy-mm-dd hh:mm:ss"
)

time_order_validator(
  x,
  time1,
  time2,
  units = "mins",
  reason = "invalid time order or time difference larger than expected",
  time_max = 60
)

datetime_order_validator(
  x,
  time1,
  time2,
  units = "days",
  reason = "invalid datetime order or datetime difference larger than expected",
  time_max = 30
)

interval_validator(x, v, reason = "unusually small or large measure")

nchar_validator(x, v, reason = "incorrect number of characters")

is.element_validator(x, v, reason = "invalid entry")

is.duplicate_validator(x, v, reason = "duplicate entry")

is.identical_validator(x, v, reason = "invalid entry")

is.regexp_validator(x, regexp, reason = "invalid pattern")
x | a data.table whose entries need to be validated.
reason | text explaining why an entry did not pass validation.
ago | number of days after which a data entry counts as old (default 7, i.e. one week).
time1 | start time or datetime to compare.
time2 | end time or datetime to compare.
units | character string giving the units of the time difference (defaults: "mins" for time_order_validator, "days" for datetime_order_validator).
time_max | maximum time difference, in units, that still passes validation.
v | a data.table containing the validation rules. See notes.
regexp | for is.regexp_validator: a regular expression.
A data.table with two columns: variable (the names of the columns in x) and rowid (the positions of the offending entries, i.e. those that did not pass validation).
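To make the shape of the return value concrete, here is a minimal base-R sketch that produces the same variable/rowid layout for a mandatory-field (NA) check. The helper name `sketch_na_check` is hypothetical and is not part of the documented package; it uses a plain data.frame rather than a data.table so that it runs without extra dependencies.

```r
# Hypothetical helper (an illustration only, not the package's implementation):
# reports the positions of NA entries, one row per offending entry.
sketch_na_check <- function(x) {
  out <- lapply(names(x), function(nm) {
    bad <- which(is.na(x[[nm]]))            # row positions of NA entries
    data.frame(variable = rep(nm, length(bad)),
               rowid    = bad,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, out)                        # stack the per-column results
}

x <- data.frame(v1 = c(1, 2, NA, NA), v2 = c(1, NA, 3, 4))
res <- sketch_na_check(x)
print(res)
```

Each row of the result names one column of x and one row position, so an entry form can point the user at the exact cell that failed.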
`v` for interval_validator: a data.table with variable, lq and uq columns.
`v` for nchar_validator: a data.table with variable and n (number of characters).
`v` for is.element_validator: a data.table with variable and set (a vector of lists containing the valid elements for each variable).
`v` for is.duplicate_validator: a data.table with variable and set (a vector of lists containing the already existing values for each variable).
`v` for is.identical_validator: a data.table with variable and x (the value to test against).
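The way a rules table drives a check can be sketched in base R for the interval case. The function name `sketch_interval_check` is hypothetical, treating lq and uq as inclusive bounds is an assumption, and the code is an illustration of the idea rather than the package's implementation.

```r
# Hypothetical sketch: flag values outside [lq, uq] for each variable in v.
# Assumption: both bounds are inclusive.
sketch_interval_check <- function(x, v) {
  out <- lapply(seq_len(nrow(v)), function(i) {
    nm  <- v$variable[i]
    val <- x[[nm]]
    # NA entries are ignored: the comparison yields NA, which which() drops
    bad <- which(val < v$lq[i] | val > v$uq[i])
    data.frame(variable = rep(nm, length(bad)),
               rowid    = bad,
               stringsAsFactors = FALSE)
  })
  do.call(rbind, out)
}

x <- data.frame(box = c(0, 1, 100, 300, NA))
v <- data.frame(variable = "box", lq = 1, uq = 277, stringsAsFactors = FALSE)
res <- sketch_interval_check(x, v)
print(res)
```

Keeping the limits in a separate table, one row per variable, means new columns can be validated by adding a rule row instead of writing new code.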
require(data.table)

#----------------------------------------------------#
x = data.table(v1 = c(1, 2, NA, NA), v2 = c(1, 2, NA, NA))
is.na_validator(x)

#----------------------------------------------------#
t = Sys.time(); d = Sys.Date()
x = data.table(
  v1 = c(NA, as.character(d - 1), as.character(t - 3600*24*10)),
  v2 = c('2016-11-23 25:23', as.character(t - 100), as.character(t + 100))
)
POSIXct_validator(x)

x = data.table(zz = c(as.character(d - 1), as.character(d)))
POSIXct_validator(x)

#----------------------------------------------------#
x = data.table(
  v1 = c('02:04', '16:56', '23:59'),
  v2 = c('24:04', NA, '23:59')
)
hhmm_validator(x)

#----------------------------------------------------#
x = data.table(
  v1 = c('2017-01-21', '2012-04-21', '2017-05-21'),
  v2 = c('2017', '2017-01-xx', '2015-01-09')
)
print(date_validator(x))

#----------------------------------------------------#
x = data.table(
  v1 = c('2017-01-21 02:04', '2012-04-21 16:56', '2017-05-21 23:59'),
  v2 = c('2017-07-27 00:00', '2017-01-21', '2015-01-09 23:59')
)
datetime_validator(x)

#----------------------------------------------------#
x = data.table(
  v1 = c('2017-01-21 02:04:55', '2012-04-21 16:56:01', '2017-05-21 23:59:00'),
  v2 = c('2017-07-27 00:00', '2017-01-21', '2015-01-09 23:59:01')
)
datetime_validatorSS(x)

#----------------------------------------------------#
x = data.table(
  cap_time = c('10:04', '16:40', '01:55'),
  bleeding_time = c('10:10', '16:30', '04:08'),
  rowid = 1:3
)
t = time_order_validator(x, time1 = 'cap_time', time2 = 'bleeding_time')

#----------------------------------------------------#
x = data.table(
  cap_time = c('2019-06-03 16:04:47', '2019-04-05 16:40', '2019-04-05 01:55'),
  bleeding_time = c('2019-06-03 16:00:54', '2019-04-05 16:30', '2019-04-05 04:08'),
  rowid = 1:3
)
t = time_order_validator(x, time1 = 'cap_time', time2 = 'bleeding_time')

#----------------------------------------------------#
x = data.table(v1 = runif(5), v2 = runif(5))
v = data.table(variable = c('v1', 'v2'), lq = c(-1, 0.2), uq = c(.7, 0.5))
interval_validator(x, v)

#-----------------------#
x = data.table(box = c(0, 1, 100, 300))
v = data.table(variable = 'box', lq = 1, uq = 277)
interval_validator(x, v)

#----------------------------------------------------#
x = data.table(v1 = c('x', 'xy', 'x'), v2 = c('xx', 'x', 'xxx'))
v = data.table(variable = c('v1', 'v2'), n = c(1, 2))
nchar_validator(x, v)

#----------------------------------------------------#
x = data.table(v1 = c('A', 'B', 'C'), v2 = c('ZZ', 'YY', 'QQ'))
v = data.table(
  variable = c('v1', 'v2'),
  set = c(list(c('A', 'C')), list(c('YY')))
)
is.element_validator(x, v)

#----------------------------------------------------#
x = data.table(v1 = c('A', 'B', 'C'), v2 = c('ZZ', 'YY', 'QQ'))
v = data.table(
  variable = c('v1', 'v2'),
  set = c(list(c('A', 'C')), list(c('YY')))
)
is.duplicate_validator(x, v)

#----------------------------------------------------#
x = data.table(v1 = 1:3, v2 = c('a', 'b', 'c'))
v = data.table(variable = c('v1', 'v2'), x = c(1, 'd'))
is.identical_validator(x, v)

#----------------------------------------------------#
x = data.table(
  id = c("x2-011-05-19", "x2-011-05-2019", "x2-011-5-2019", "x2-011- 5-2019")
)
is.regexp_validator(x, regexp = "^x[1-9]-\\d{3}-\\b(?:05|09|11)\\b-19$")