Given a record x with observerd x_{obs} and missing values x_{miss} under
linear equality constraints Ax=b. The function solSpace
returns
the solution space which can be written as x_{miss} = x_0 + Cz, where x_0 is
are a constant vector (of dimension d=length
(x_{miss})) and C a constant
matrix of dimension d\times d.
1 
x 
(named) numerical vector to be imputed 
x0 

C 

z 
real vector of dimension 
tol 
tolerance used to check which rows of 
If C has rows equal to zero, then those missing values may be imputed deductively. For the other missing values, some z must be chosen or another imputation method used.
The function imputess
imputes missing values in a vector x, based on the
solution space and some chosen vector z. If no z is passed as argument, only
deductive imputations are performend (i.e. some missings may be left).
If C is a named matrix (as returned by solSpace
), rows of x0 and C
are matched by name to x. Otherwise it is assumed that the missings in x occur in the order
of the rows in C (which is also the case when x0 and C are computed by solSpace
).
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91  #############################################
# IMPUTATION OF NUMERIC DATA
#############################################
# These examples are taken from De Waal et al (2011) (Examples 9.19.2)
E < editmatrix(c(
"x1 + x2 == x3",
"x2 == x4",
"x5 + x6 + x7 == x8",
"x3 + x8 == x9",
"x9  x10 == x11",
"x6 >= 0",
"x7 >= 0"
))
dat < data.frame(
x1=c(145,145),
x2=c(NA,NA),
x3=c(155,155),
x4=c(NA,NA),
x5=c(NA, 86),
x6=c(NA,NA),
x7=c(NA,NA),
x8=c(86,86),
x9=c(NA,NA),
x10=c(217,217),
x11=c(NA,NA)
)
dat
d < deduImpute(E,dat)
d$corrected
d$status
d$corrections
#############################################
# IMPUTATION OF CATEGORICAL DATA
#############################################
# Here's an example from Katrika (2001) [but see De Waal et al (2011), ex. 9.3)]
E < editarray(c(
"x1 \%in\% letters[1:4]",
"x2 \%in\% letters[1:3]",
"x3 \%in\% letters[1:3]",
"x4 \%in\% letters[1:2]",
"if (x2 == 'c' & x3 != 'c' & x4 == 'a' ) FALSE",
"if (x2 != 'a' & x4 == 'b') FALSE",
"if (x1 != 'c' & x2 != 'b' & x3 != 'a') FALSE",
"if (x1 == 'c' & x3 != 'a' & x4 == 'a' ) FALSE"
))
dat < data.frame(
x1 = c('c', NA ),
x2 = c('b', NA ),
x3 = c(NA , NA ),
x4 = c(NA , 'b'),
stringsAsFactors=FALSE)
s < deduImpute(E,dat)
s$corrected
s$status
s$corrections
E < editset(expression(
x + y == z,
x >= 0,
A %in% c('a','b'),
B %in% c('c','d'),
if ( A == 'a' ) B == 'b',
if ( B == 'b' ) x > 0
))
x < data.frame(
x = NA,
y = 1,
z = 1,
A = 'a',
B = NA
)
# deduImpute will impute x=0 and B='b',which violates the
# last edit. Hence, imputation will be reverted.
deduImpute(E,x)

Questions? Problems? Suggestions? Tweet to @rdrrHQ or email at ian@mutexlabs.com.
Please suggest features or report bugs with the GitHub issue tracker.
All documentation is copyright its authors; we didn't write any of that.