Correct sign errors and value interchanges in data records
Description
Correct sign errors and value interchanges in data records.
Usage
correctSigns(E, dat, ...)
## S3 method for class 'editset'
correctSigns(E, dat, ...)
## S3 method for class 'editmatrix'
correctSigns(E, dat, flip = getVars(E), swap = list(),
maxActions = length(flip) + length(swap), maxCombinations = 1e+05,
eps = sqrt(.Machine$double.eps), weight = rep(1, length(flip) +
length(swap)), fixate = NA, ...)

Arguments
E 
An object of class editset or editmatrix. 
dat 
A data.frame containing the records to correct. 
... 
arguments to be passed to other methods. 
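flip 
A character vector with the names of variables whose signs may be flipped. 
swap 
A list of character vectors of length 2, each naming a pair of variables whose values may be interchanged. 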
maxActions 
The maximum number of flips and swaps that may be performed 
maxCombinations 
An upper limit on the number of flip/swap combinations that may be tested in each step of the algorithm. 
eps 
Tolerance to check equalities against. Use this to account for sign errors masked by rounding errors. 
weight 
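Weight vector. Weights can be assigned either to actions (flips and swaps) or to variables.
A vector with one weight for every flip and every swap assigns weights to actions; a named vector with one weight for every variable assigns weights to variables. 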
fixate 
A character vector with the names of variables whose values may not be changed. 
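The role of eps can be illustrated with a small base R sketch. It assumes a single equality z == x - y and a check of the form abs(residual) <= eps; the helper name holds is hypothetical, and the exact comparison used inside correctSigns is not shown in this page.

```r
# Sketch only: how a tolerance decides whether an equality "holds".
# Assumption: the equality is z == x - y and the check is abs(residual) <= eps.
eps_default <- sqrt(.Machine$double.eps)

# A record whose equality is off by a rounding-sized amount
rec <- c(x = 12.3, y = 2.1, z = 10)
residual <- rec[["z"]] - (rec[["x"]] - rec[["y"]])

# Hypothetical helper: TRUE when the residual is within tolerance
holds <- function(residual, eps) abs(residual) <= eps

holds(residual, eps_default)  # the record counts as violating the equality
holds(residual, 2)            # with a wide tolerance the record is accepted
```

A larger eps makes the check more forgiving, so records with rounded values are not treated as equality violations.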
Details
This algorithm tries to correct records violating linear equalities by sign flipping and/or value interchanges.
Linear inequalities are taken into account when judging possible solutions: if one or more inequality restrictions are violated, the solution is rejected. It is important to note that the status of a record has the following meaning:
valid  The record obeys all equality constraints on entry. No error correction is performed. It may therefore still contain inequality errors. 
corrected  Equality errors were found, and all of them were solved without violating inequalities. 
partial  Does not occur. 
invalid  The record contains equality violations that could not be solved with this algorithm. 
NA  The record could not be checked because it contains missing values. 
The algorithm applies all combinations of user-allowed flips and swaps to find a solution, and minimizes the number of actions (flips + swaps) needed to correct a record. When multiple solutions are found, the solution of minimal weight is chosen. The user may provide a weight vector with a weight for every flip and every swap, or a named weight vector with a weight for every variable. If the weights do not single out a solution, the first one found is chosen.
If arguments flip or swap contain a variable not in E, these variables will be ignored by the algorithm.
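The search described above can be sketched in base R for a single record. This is an illustration under stated assumptions, not the deducorrect implementation: it hard-codes one equality z == x - y, ignores inequalities, maxActions and maxCombinations, accepts action weights only, and the function name try_actions is hypothetical.

```r
# Sketch only: try flips and swaps on one record until z == x - y holds.
# Fewest actions first; among equally small solutions, the lowest weight wins,
# and ties go to the first solution found.
try_actions <- function(rec, flip = character(0), swap = list(),
                        weight = rep(1, length(flip) + length(swap)),
                        eps = sqrt(.Machine$double.eps)) {
  # One action per flippable variable, followed by one per swappable pair
  actions <- c(
    lapply(flip, function(v) list(type = "flip", vars = v)),
    lapply(swap, function(p) list(type = "swap", vars = p))
  )
  violated <- function(r) abs(r[["z"]] - (r[["x"]] - r[["y"]])) > eps
  apply_action <- function(r, a) {
    if (a$type == "flip") {
      r[[a$vars]] <- -r[[a$vars]]   # sign flip
    } else {
      r[a$vars] <- r[rev(a$vars)]   # value interchange
    }
    r
  }
  if (anyNA(rec)) return(list(status = NA, record = rec, nactions = NA))
  if (!violated(rec)) return(list(status = "valid", record = rec, nactions = 0))
  for (k in seq_along(actions)) {
    combos <- combn(length(actions), k, simplify = FALSE)
    solved <- Filter(
      function(idx) !violated(Reduce(apply_action, actions[idx], rec)),
      combos
    )
    if (length(solved) > 0) {
      w <- vapply(solved, function(idx) sum(weight[idx]), numeric(1))
      best <- solved[[which.min(w)]]  # which.min keeps the first on ties
      fixed <- Reduce(apply_action, actions[best], rec)
      return(list(status = "corrected", record = fixed, nactions = k))
    }
  }
  list(status = "invalid", record = rec, nactions = NA)
}

rec <- c(x = 3, y = 13, z = 10)
# Equal weights: the flip of z is found first and is kept
try_actions(rec, flip = "z", swap = list(c("x", "y")))
# A cheaper swap is preferred over the flip
try_actions(rec, flip = "z", swap = list(c("x", "y")), weight = c(2, 1))
```

Searching by increasing number of actions keeps the correction minimal, and the weights only break ties among solutions of the same size, which mirrors the order of preference described in the text.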
Value
A deducorrect-object. The status slot has the following columns for every record in dat.
status  a status factor, showing the status of the treated record. 
degeneracy  the number of solutions found, after applying the weight 
weight  the weight of the chosen solution 
nflip  the number of applied sign flips 
nswap  the number of applied value interchanges 
References
Scholtus S (2008). Algorithms for correcting some obvious inconsistencies and rounding errors in business survey data. Technical Report 08015, Netherlands.
See Also
deducorrect-object
Examples
# some data
dat <- data.frame(
x = c( 3,14,15, 1, 17,12.3),
y = c(13,4, 5, 2, 7, 2.1),
z = c(10,10,10, NA,10,10 ))
# ... which has to obey
E <- editmatrix(c("z == x-y"))
# All signs may be flipped, no swaps.
correctSigns(E, dat)
# Allow for rounding errors
correctSigns(E, dat, eps=2)
# Limit the number of combinations that may be tested
correctSigns(E, dat, maxCombinations=2)
# fix z, flip everything else
correctSigns(E, dat,fixate="z")
# the same result is achieved with
correctSigns(E, dat, flip=c("x","y"))
# make x and y swappable, allow no flips
correctSigns(E, dat, flip=c(), swap=list(c("x","y")))
# make x and y swappable, a swap counts as one flip
correctSigns(E, dat, flip="z", swap=list(c("x","y")))
# same, but now, swapping is preferred (has lower weight)
correctSigns(E, dat, flip="z", swap=list(c("x","y")), weight=c(2,1))
# same, but now because x and y carry lower weight. Also allow for rounding errors
correctSigns(E, dat, flip="z", swap=list(c("x","y")), eps=2, weight=c(x=1, y=1, z=3))
# demand that solution has y>0
E <- editmatrix(c("z==x-y", "y>0"))
correctSigns(E,dat)
# demand that solution has y>0, taking account of rounding in equalities
correctSigns(E,dat,eps=2)
# example with editset
E <- editset(expression(
x + y == z,
x >= 0,
y > 0,
y < 2,
z > 1,
z < 3,
A %in% c('a','b'),
B %in% c('c','d'),
if ( A == 'a' ) B == 'b',
if ( B == 'b' ) x < 1
))
x <- data.frame(
x = 1,
y = 1,
z = 2,
A = 'a',
B = 'b'
)
correctSigns(E,x)
