aggregate.tbl_validation: Count the number of invalid rules or records.

View source: R/aggregate.R

aggregate.tbl_validation    R Documentation

Description

Counts the number of valid and invalid checks, either per rule or per record.

Usage

## S3 method for class 'tbl_validation'
aggregate(x, by = c("rule", "record", "key"), ...)

Arguments

x

tbl_validation() object

by

how to aggregate: by "rule" or by "record" (the usage above also lists "key" as an accepted value)

...

not used
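
As a conceptual illustration of the difference between by = "rule" and by = "record", the following base R sketch tallies the outcomes of two rules over two records. It does not call validatedb and does not reproduce the exact columns that aggregate() returns; the helper tally() and the column names npass, nfail and nNA are assumptions made for this sketch only.

```r
# Two records and two rules; NA means a rule could not be evaluated
# (here: salary is missing for record 2).
income <- data.frame(id = 1:2, age = c(12, 35), salary = c(1000, NA))

# One row per record, one column per rule.
checks <- cbind(is_adult = income$age >= 18, has_income = income$salary > 0)
rownames(checks) <- income$id

# Hypothetical helper (not part of validatedb): count passes, fails and NAs.
tally <- function(v) {
  c(npass = sum(v %in% TRUE), nfail = sum(v %in% FALSE), nNA = sum(is.na(v)))
}

by_rule   <- t(apply(checks, 2, tally))  # one row per rule
by_record <- t(apply(checks, 1, tally))  # one row per record

print(by_rule)
print(by_record)
```

Aggregating by rule answers "how often does each rule fail?", while aggregating by record answers "how many rules does each record violate?".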

Details

Calling confront() on a database tbl yields a lazy query: the query is built but not executed. To store the result in the database, use compute() or values().
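
The lazy behaviour described here is the general dbplyr mechanism, which can be sketched without validatedb. The example below assumes that dplyr, dbplyr and RSQLite are installed; the table and the derived column is_adult are illustrative only and are not produced by confront().

```r
library(dplyr)

# Small in-memory SQLite table (needs the dbplyr and RSQLite packages).
tbl_income <- dbplyr::memdb_frame(
  id = 1:2, age = c(12, 35), salary = c(1000, NA)
)

# Building the pipeline only constructs a query; nothing is executed yet.
lazy <- tbl_income %>%
  mutate(is_adult = age >= 18) %>%
  count(is_adult)

show_query(lazy)          # print the generated SQL without running it

stored <- compute(lazy)   # run the query and keep the result in a temporary table
local  <- collect(lazy)   # run the query and return the result as a local tibble
```

compute() keeps the result on the database side, which avoids re-running the query on later use, whereas collect() brings the rows into R.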

Value

A dbplyr::tbl_dbi() object representing the aggregation query, which has yet to be executed on the database.

Examples

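# validator() and confront() come from the validate package
library(validate)
library(validatedb)

income <- data.frame(id = 1:2, age = c(12, 35), salary = c(1000, NA))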
con <- dbplyr::src_memdb()
tbl_income <- dplyr::copy_to(con, income, overwrite=TRUE)
print(tbl_income)

# Let's define a rule set and confront the table with it:
rules <- validator(
  is_adult   = age >= 18,
  has_income = salary > 0
)

# Confront the table with the rules.
# With a database table it is generally handy to supply a key.
cf <- confront(tbl_income, rules, key="id")
aggregate(cf, by = "rule")
aggregate(cf, by = "record")

# To tune the performance of the database query, the following options are available:
# 1) Store the validation result in the database
cf <- confront(tbl_income, rules, key="id", compute = TRUE)
# or, equivalently:
cf <- confront(tbl_income, rules, key="id")
cf <- compute(cf)

# 2) Store the validation result sparsely
cf_sparse <- confront(tbl_income, rules, key="id", sparse=TRUE)

show_query(cf_sparse)
values(cf_sparse, type="tbl")

data-cleaning/validatedb documentation built on June 11, 2022, 4:33 p.m.