replace_errors: Replace erroneous fields with NA or a suggested value


Description

Find erroneous fields using locate_errors and replace them automatically with NA or with a suggestion provided by the error detection algorithm.

Usage

replace_errors(data, x, ref = NULL, ..., value = c("NA", "suggestion"))

## S4 method for signature 'data.frame,validator'
replace_errors(data, x, ref = NULL, ...,
  value = c("NA", "suggestion"))

## S4 method for signature 'data.frame,ErrorLocalizer'
replace_errors(data, x, ref = NULL, ...,
  value = c("NA", "suggestion"))

## S4 method for signature 'data.frame,errorlocation'
replace_errors(data, x, ref = NULL, ...,
  value = c("NA", "suggestion"))

Arguments

data

data to be checked

x

validator object; the methods listed under Usage also accept an ErrorLocalizer or errorlocation object

ref

optional reference data set

...

these parameters are handed over to locate_errors

value

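what to put in place of an erroneous field: "NA" (the default) or "suggestion", the value suggested by the error detection algorithm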
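
The default value = c("NA", "suggestion") shown under Usage follows a common R idiom in which the first element of the vector acts as the default. The sketch below illustrates that idiom with base R only. It assumes the argument is resolved with match.arg, which this page does not confirm, and pick_value is a hypothetical helper that is not part of errorlocate.

```r
# Hypothetical helper (not part of errorlocate) showing how a choice
# argument with a vector default is typically resolved in R.
pick_value <- function(value = c("NA", "suggestion")) {
  # match.arg() returns the first choice when the argument is left at its
  # default, and otherwise validates the supplied string against the choices.
  match.arg(value)
}

default_choice  <- pick_value()              # argument left at its default
explicit_choice <- pick_value("suggestion")  # explicit choice

default_choice
explicit_choice
```

Under this idiom, a string that matches none of the choices raises an error instead of being silently ignored.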

Value

data with erroneous values removed.

Note

In general, it is better to replace the erroneous fields with NA and then apply a proper imputation method. Suggested values from the error localization method may introduce unwanted bias.

The errors that were removed from the data.frame can be retrieved with the function errors_removed. For more control over error localization see locate_errors.
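
To illustrate the "replace with NA, then impute" workflow recommended above, here is a minimal base R sketch. It starts from a hand-written data frame that stands in for the output of replace_errors, assuming profit was the field set to NA for the record used in the Examples. The imputation step is a simple deductive fill based on the rule profit + cost == turnover, chosen only for illustration; it is not a method provided by errorlocate.

```r
# Stand-in for the output of replace_errors(): an assumption for this
# sketch, with profit as the field that was replaced by NA.
data_no_error <- data.frame(profit = NA_real_, cost = 125, turnover = 200)

# Deductive imputation: the rule profit + cost == turnover determines
# profit once cost and turnover are known.
data_imputed <- data_no_error
missing_profit <- is.na(data_imputed$profit)
data_imputed$profit[missing_profit] <-
  data_imputed$turnover[missing_profit] - data_imputed$cost[missing_profit]

data_imputed
```

A deductive fill like this only works when the remaining fields pin down the missing one exactly. Otherwise a statistical imputation method is needed.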

See Also

errorlocation-class

Examples

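library(magrittr)     # provides %>%
library(validate)     # provides validator()
library(errorlocate)  # provides replace_errors(), locate_errors(), errors_removed()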

rules <- validator( profit + cost == turnover
              , cost - 0.6*turnover >= 0
              , cost>= 0
              , turnover >= 0
)
data <- data.frame(profit=755, cost=125, turnover=200)

data_no_error <-
  data %>%
  replace_errors(rules)

# faulty data was replaced with NA
data_no_error

errors_removed(data_no_error)

# a bit more control
error_locations <- locate_errors(data, rules)
data %>%
  replace_errors(error_locations)

data-cleaning/errorlocate documentation built on Sept. 22, 2018, 9:37 p.m.