| locate_errors | R Documentation | 
Find out which fields in a data.frame are "faulty" using validation rules
This method returns the errors found according to the validation rules or error localizer given in x.
Use replace_errors() to automatically remove these errors.
locate_errors(
  data,
  x,
  ...,
  cl = NULL,
  Ncpus = getOption("Ncpus", 1),
  timeout = 60
)
## S4 method for signature 'data.frame,validator'
locate_errors(
  data,
  x,
  weight = NULL,
  ref = NULL,
  ...,
  cl = NULL,
  Ncpus = getOption("Ncpus", 1),
  timeout = 60
)
## S4 method for signature 'data.frame,ErrorLocalizer'
locate_errors(
  data,
  x,
  weight = NULL,
  ref = NULL,
  ...,
  cl = NULL,
  Ncpus = getOption("Ncpus", 1),
  timeout = 60
)
| data | data to be checked | 
| x | validation rules or errorlocalizer object to be used for finding possible errors. | 
| ... | optional parameters that are passed to  | 
| cl | optional cluster object, or an integer, for parallel processing. See details. | 
| Ncpus | number of nodes to use. See details. | 
| timeout | maximum number of seconds that the localizer should use per record. | 
| weight | optional weight specification used in the error localization. See expand_weights(). | 
| ref |  | 
Use an Inf weight to fix variables that must not be changed.
See expand_weights() for more details.
locate_errors uses lpSolveAPI to formulate and solve a mixed integer problem.
For details see the vignettes.
This solver has many options; see lpSolveAPI::lp.control.options. Noteworthy
options are:
timeout: restricts the time the solver spends on a record (seconds).
break.at.value: set this to minimum weight + 1 to improve speed.
presolve: the default in errorlocate is "rows". Set it to "none" when you get
solutions in which all variables are deemed wrong.
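As a minimal sketch of how these options might be supplied: timeout is a named argument of locate_errors, while presolve is assumed here to be forwarded to the solver through `...`. The rules and data are borrowed from the examples on this page.

```r
library(validate)
library(errorlocate)

rules <- validator( profit + cost == turnover
                  , cost >= 0.6 * turnover
                  , turnover >= 0
                  )
data <- data.frame(profit = 755, cost = 125, turnover = 200)

# timeout: at most 10 seconds per record (named argument of locate_errors).
# presolve: assumption, passed on to the lpSolveAPI solver via `...`.
le <- locate_errors(data, rules, timeout = 10, presolve = "none")
le$errors
```

If the solver flags every variable of a record, switching presolve off as above is the first thing to try according to the note on presolve.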
locate_errors can be run on multiple cores using the R package parallel.
The easiest way to use the parallel option is to set Ncpus to the number of
desired cores; see also parallel::detectCores().
Alternatively, one can create a cluster object with parallel::makeCluster()
and pass it through cl.
Or set cl to an integer, which results in a call to parallel::mclapply(); this only works
on non-Windows systems.
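The first two parallel options could look roughly like this. It is a sketch only: the two-worker setting is an arbitrary choice, and with only two records there is no speed gain to expect.

```r
library(validate)
library(errorlocate)

v <- validator( married %in% c(TRUE, FALSE)
              , if (married == TRUE) age >= 17
              )
data <- data.frame(married = c(TRUE, TRUE), age = c(16, 14))

# Option 1: let locate_errors set up the workers via Ncpus.
le1 <- locate_errors(data, v, Ncpus = 2)

# Option 2: create a cluster explicitly and pass it through cl.
cl <- parallel::makeCluster(2)
le2 <- locate_errors(data, v, cl = cl)
parallel::stopCluster(cl)  # always release the workers afterwards

dim(le1$errors)
dim(le2$errors)
```

Creating the cluster yourself is mainly useful when the same workers are reused across several calls.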
An errorlocation-class object describing the errors found.
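A short sketch of how the returned object might be used together with replace_errors(). It assumes that the errors field holds one row per record and one column per variable, and that replace_errors() sets the flagged fields to NA by default.

```r
library(validate)
library(errorlocate)

rules <- validator( profit + cost == turnover
                  , cost >= 0.6 * turnover
                  , turnover >= 0
                  )
data <- data.frame(profit = 755, cost = 125, turnover = 200)

# Inspect which fields were flagged as erroneous.
le <- locate_errors(data, rules)
le$errors

# Remove the flagged values from the data in one step.
fixed <- replace_errors(data, rules)
fixed
```

Here only profit can be changed on its own to satisfy all three rules, so it is the field expected to be flagged and blanked.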
Other error finding: 
errorlocation-class,
errors_removed(),
expand_weights(),
replace_errors()
rules <- validator( profit + cost == turnover
                  , cost >= 0.6 * turnover # cost should be at least 60% of turnover
                  , turnover >= 0 # can not be negative.
                  )
data <- data.frame( profit   = 755
                  , cost     = 125
                  , turnover = 200
                  )
le <- locate_errors(data, rules)
print(le)
summary(le)
v_categorical <- validator( branch %in% c("government", "industry")
                          , tax %in% c("none", "VAT")
                          , if (tax == "VAT") branch == "industry"
)
data <- read.csv(text=
"   branch, tax
government, VAT
industry  , VAT
", strip.white = TRUE)
locate_errors(data, v_categorical)$errors
v_logical <- validator( citizen %in% c(TRUE, FALSE)
                      , voted %in% c(TRUE, FALSE)
                      ,  if (voted == TRUE) citizen == TRUE
                      )
data <- data.frame(voted = TRUE, citizen = FALSE)
locate_errors(data, v_logical, weight=c(2,1))$errors
# try a conditional rule
v <- validator( married %in% c(TRUE, FALSE)
              , if (married==TRUE) age >= 17
              )
data <- data.frame( married = TRUE, age = 16)
locate_errors(data, v, weight=c(married=1, age=2))$errors
# different weights per row
data <- read.csv(text=
"married, age
    TRUE,  16
    TRUE,  14
", strip.white = TRUE)
weight <- read.csv(text=
"married, age
       1,   2
       2,   1
", strip.white = TRUE)
locate_errors(data, v, weight = weight)$errors
# fix / exclude a variable from error localization
# using an Inf weight
weight <- c(age = Inf)
locate_errors(data, v, weight = weight)$errors