Given a set of keys or key combinations, check whether all those combinations occur, or check that they do not occur. Supports globbing and regular expressions.
contains_exactly(keys, by = NULL, allow_duplicates = FALSE)
contains_at_least(keys, by = NULL)
contains_at_most(keys, by = NULL)
does_not_contain(keys)
keys: A data frame or bare (unquoted) name of a data frame passed as a reference to confront (see examples).
by: A bare (unquoted) variable or list of variable names that occur in the data under scrutiny. The data will be split into groups according to these variables and the check is performed on each group.
|contains_exactly|The dataset contains exactly the key set, no more, no less.|
|contains_at_least|The dataset contains at least the given keys.|
|contains_at_most|All keys in the dataset are contained in the given keys.|
|does_not_contain|The keys are interpreted as forbidden key combinations.|
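The four checks can be read as set relations between the key combinations found in the data and those in keys. The base R sketch below illustrates that reading for a whole dataset. It is not the package's implementation: the helper row_keys and the toy data are made up for illustration, and the package functions themselves return one logical value per record.

```r
# Hypothetical helper: collapse the key columns of each row into one string.
row_keys <- function(df, cols) do.call(paste, c(df[cols], sep = "|"))

# Toy data: three records, two key columns.
dat <- data.frame(
    year    = c("2018", "2018", "2019")
  , quarter = c("Q1", "Q2", "Q1")
  , stringsAsFactors = FALSE
)
# Key set: both listed quarters of 2018.
keys <- data.frame(
    year    = c("2018", "2018")
  , quarter = c("Q1", "Q2")
  , stringsAsFactors = FALSE
)

cols <- names(keys)
d <- unique(row_keys(dat, cols))   # key combinations present in the data
k <- unique(row_keys(keys, cols))  # key combinations in the key set

exactly  <- setequal(d, k)   # same set of combinations, no more, no less
at_least <- all(k %in% d)    # every given key occurs in the data
at_most  <- all(d %in% k)    # every data key occurs in the key set

# Forbidden-key reading, per record: FALSE where a record matches a key.
not_forbidden <- !(row_keys(dat, cols) %in% k)
```

Here the data contain one extra combination (2019, Q1), so the "at least" relation holds while the "exactly" and "at most" relations do not.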
A logical vector with one entry for each record in the dataset. Any group not conforming to the test keys will have FALSE assigned to each record in the group (see examples).
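To make the group-wise result concrete, here is a minimal base R sketch of the behaviour described above: each group is tested once, and the outcome is copied to every record of that group. This only mimics the documented result shape and is not how the package computes it; the toy data and the names group_ok and result are illustrative.

```r
# Toy data in long format: 'import' has both quarters, 'export' misses Q2.
dat <- data.frame(
    variable = c("import", "import", "export")
  , quarter  = c("Q1", "Q2", "Q1")
  , stringsAsFactors = FALSE
)
keys <- c("Q1", "Q2")

# One test per group: does the group contain exactly the key set?
group_ok <- vapply(
  split(dat$quarter, dat$variable),
  function(q) setequal(q, keys),
  logical(1)
)

# Spread the group result back over the records of each group.
result <- unname(group_ok[dat$variable])
```

Both import records receive TRUE, while the single export record receives FALSE because its group is incomplete.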
A logical vector with length equal to the number of records under scrutiny. It is FALSE where key combinations do not match any value in keys.
A logical vector with size equal to the number of records under scrutiny. It is FALSE where key combinations do not match any value in keys.
Globbing is a simple method of defining string patterns where the asterisk (*) is used as a wildcard. For example, the globbing pattern "abc*" stands for any string starting with "abc".
library(validate)

## Check that data is present for all quarters in 2018-2019
dat <- data.frame(
    year    = rep(c("2018","2019"), each=4)
  , quarter = rep(sprintf("Q%d",1:4), 2)
  , value   = sample(20:50,8)
)

# Method 1: creating a data frame in-place (only for simple cases)
rule <- validator(contains_exactly(
  expand.grid(year=c("2018","2019"), quarter=c("Q1","Q2","Q3","Q4"))
  )
)
out <- confront(dat, rule)
values(out)

# Method 2: pass the keyset to 'confront', and reference it in the rule.
# this scales to larger key sets but it needs a 'contract' between the
# rule definition and how 'confront' is called.
keyset <- expand.grid(year=c("2018","2019"), quarter=c("Q1","Q2","Q3","Q4"))
rule <- validator(contains_exactly(all_keys))
out <- confront(dat, rule, ref=list(all_keys = keyset))
values(out)

## Globbing (use * as a wildcard)

# transaction data
transactions <- data.frame(
    sender   = c("S21", "X34", "S45","Z22")
  , receiver = c("FG0", "FG2", "DF1","KK2")
  , value    = sample(70:100,4)
)

# forbidden combinations: if the sender starts with "S",
# the receiver can not start with "FG"
forbidden <- data.frame(sender="S*", receiver = "FG*")
rule <- validator(does_not_contain(glob(forbidden_keys)))
out <- confront(transactions, rule, ref=list(forbidden_keys=forbidden))
values(out)

## Quick interactive testing
# use 'with':
with(transactions, does_not_contain(forbidden))

## Grouping

# data in 'long' format
dat <- expand.grid(
    year     = c("2018","2019")
  , quarter  = c("Q1","Q2","Q3","Q4")
  , variable = c("import","export")
)
dat$value <- sample(50:100,nrow(dat))

periods <- expand.grid(
    year    = c("2018","2019")
  , quarter = c("Q1","Q2","Q3","Q4")
)

rule <- validator(contains_exactly(all_periods, by=variable))
out <- confront(dat, rule, ref=list(all_periods=periods))
values(out)

# remove one export record
dat1 <- dat[-15,]
out1 <- confront(dat1, rule, ref=list(all_periods=periods))
values(out1)