Correct records under linear restrictions for rounding errors

Description

This algorithm tries to detect and repair records that violate linear (in)equality constraints by correcting possible rounding errors, as described by Scholtus (2008). Typically, data are constrained by Rx = a and Qx ≥ b.

Usage

correctRounding(E, dat, ...)

## S3 method for class 'editset'
correctRounding(E, dat, ...)

## S3 method for class 'editmatrix'
correctRounding(E, dat, fixate = NULL, delta = 2,
  K = 10, round = TRUE, assumeUnimodularity = FALSE, ...)

Arguments

E

editmatrix or editset as generated by the editrules package.

dat

data.frame with the data to be corrected

...

arguments to be passed to other methods.

fixate

character with variable names that should not be changed.

delta

tolerance used when checking for rounding errors

K

number of trials per record. See Details.

round

should the solution be rounded, default TRUE

assumeUnimodularity

If FALSE, a (computationally expensive) unimodularity test is performed before corrections are computed.

Details

The algorithm first finds violated constraints |r'_{i}x-a_i| > 0, and selects edits that may be due to a rounding error, 0 < |r'_{i}x-a_i| ≤ δ. The algorithm then makes a correction suggestion in which the errors are attributed to randomly selected variables under the linear equality constraints. It then checks that the suggested correction does not violate the inequality constraints Qx ≥ b. If it does, the algorithm tries to generate a different solution, up to K times.
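
The selection step can be illustrated in base R. This is only a sketch of the residual check described above, not the package's implementation: the matrix R, the vector a, and the row labels e1 to e5 are written out by hand here (assumed names), using the five equalities and the record from the first example in the Examples section.

```r
# Hand-built coefficient matrix for the five equalities R x = a
# (row labels e1..e5 are illustrative, not produced by editrules)
R <- matrix(0, nrow = 5, ncol = 11,
            dimnames = list(paste0("e", 1:5), paste0("x", 1:11)))
R["e1", c("x1", "x2", "x3")]       <- c(1, 1, -1)     # x1 + x2 == x3
R["e2", c("x2", "x4")]             <- c(1, -1)        # x2 == x4
R["e3", c("x5", "x6", "x7", "x8")] <- c(1, 1, 1, -1)  # x5 + x6 + x7 == x8
R["e4", c("x3", "x8", "x9")]       <- c(1, 1, -1)     # x3 + x8 == x9
R["e5", c("x9", "x10", "x11")]     <- c(1, -1, -1)    # x9 - x10 == x11
a <- rep(0, 5)

# The record from the first example
x <- c(x1 = 12, x2 = 4, x3 = 15, x4 = 4, x5 = 3, x6 = 1,
       x7 = 8, x8 = 11, x9 = 27, x10 = 41, x11 = -13)

delta <- 2                                # default tolerance

resid     <- abs(drop(R %*% x) - a)       # |r'_i x - a_i| per edit
violated  <- resid > 0                    # edits that do not hold
candidate <- violated & resid <= delta    # possible rounding errors

resid      # e1 = 1, e2 = 0, e3 = 1, e4 = 1, e5 = 1
candidate  # TRUE for e1, e3, e4, e5; FALSE for e2
```

For this record, four of the five equalities are off by exactly 1, so all four fall within the default tolerance and are treated as possible rounding errors.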

Value

A deducorrrect object.

References

Scholtus S (2008). Algorithms for correcting some obvious inconsistencies and rounding errors in business survey data. Technical Report 08015, Statistics Netherlands.

See Also

deducorrect-object, status

Examples

E <- editmatrix(expression( 
    x1 + x2 == x3,
    x2 == x4,
    x5 + x6  + x7 == x8,
    x3 + x8 == x9,
    x9 - x10 == x11
    )
)

dat <- data.frame( x1=12
                 , x2=4
                 , x3=15
                 , x4=4
                 , x5=3
                 , x6=1
                 , x7=8
                 , x8=11
                 , x9=27
                 , x10=41
                 , x11=-13
                 )

sol <- correctRounding(E, dat)


# example with editset: conditional edits require an editset, not an editmatrix
E <- editset(expression(
    x + y == z,
    x >= 0,
    y >= 0,
    z >= 0,
    if ( x > 0 ) y > 0
    ))
dat <- data.frame(
    x = 1,
    y = 0,
    z = 1)
# solutions causing new violations of conditional rules are rejected 
sol <- correctRounding(E,dat)

# An example with editset
E <- editset(expression(
    x + y == z,
    x >= 0,
    y > 0,
    y < 2,
    z > 1,
    z < 3,
    A %in% c('a','b'),
    B %in% c('c','d'),
    if ( A == 'a' ) B == 'b',
    if ( B == 'b' ) x < 1
))
dat <- data.frame(
    x = 0,
    y = 1,
    z = 2,
    A = 'a',
    B = 'b'
)

correctRounding(E,dat)    
