locate_errors: Locate errors in data


Description

Locate erroneous fields in rows of data using validation rules or a specific errorlocalizer object. This method returns the errors found, as determined by the rules or localizer given in x. To remove these errors automatically, use replace_errors.

Usage

locate_errors(data, x, ..., timeout = 60)

## S4 method for signature 'data.frame,validator'
locate_errors(data, x, weight = NULL,
  ref = NULL, ..., timeout = 60)

## S4 method for signature 'data.frame,ErrorLocalizer'
locate_errors(data, x,
  weight = NULL, ref = NULL, ..., timeout = 60)

Arguments

data

data to be checked

x

validation rules or errorlocalizer object to be used for finding possible errors.

...

optional parameter to be used by a specific method

timeout

maximum number of seconds that the localizer should use per record.

weight

numeric optional weight vector to be used in the error localization.

ref

data.frame, optional reference data to be used when checking the rules.

Value

errorlocation-class object describing the errors found.

Examples

library(validate)     # provides validator()
library(errorlocate)  # provides locate_errors()

rules <- validator( profit + cost == turnover
              , cost - 0.6*turnover >= 0
              , cost>= 0
              , turnover >= 0
)
data <- data.frame(profit=755, cost=125, turnover=200)
le <- locate_errors(data, rules)

print(le)
summary(le)

v_categorical <- validator( A %in% c("a1", "a2")
                          , B %in% c("b1", "b2")
                          , if (A == "a1") B == "b1"
)

data <- data.frame(A = c("a1", "a2"), B = c("b2", "b2"))
locate_errors(data, v_categorical)

v_logical <- validator( A %in% c(TRUE, FALSE)
                      , B %in% c(TRUE, FALSE)
                      ,  if (A == TRUE) B == TRUE
                      )

data <- data.frame(A = TRUE, B = FALSE)
locate_errors(data, v_logical, weight=c(2,1))

# try a conditional rule
v <- validator( married %in% c(TRUE, FALSE), if (married==TRUE) age >= 17 )
data <- data.frame( married = TRUE, age = 16)
locate_errors(data, v, weight=c(married=1, age=2))
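
# A sketch beyond the original examples: the Description points to
# replace_errors for removing located errors automatically.
# Assumptions: replace_errors() is called as replace_errors(data, x) and
# replaces the located fields with NA by default; the name `cleaned` is
# only illustrative.
library(validate)
library(errorlocate)

rules <- validator( profit + cost == turnover
              , cost - 0.6*turnover >= 0
              , cost>= 0
              , turnover >= 0
)
data <- data.frame(profit=755, cost=125, turnover=200)

# fields located as erroneous are replaced, all other values are kept
cleaned <- replace_errors(data, rules)
print(cleaned)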

data-cleaning/errorlocate documentation built on June 4, 2019, 9:34 p.m.