Solution space for missing values under equality constraints

solSpace method for editmatrix

This function finds the space of solutions for a numerical record *x* with missing values under
the linear constraints *Ax=b*. Write *x=(x_{obs},x_{miss})*.
Then the solution space for *x_{miss}* is given by *x_0 + Cz*, where *x_0* is
a constant vector, *C* is a constant matrix, and *z* is any real vector of dimension
`ncol(C)`. This function computes *x_0* and *C*.
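One way to see where *x_0* and *C* come from is the standard generalized-inverse argument sketched here. It assumes the columns of *A* are split as *A = (A_{obs}, A_{miss})* to match *x = (x_{obs}, x_{miss})*, and writes *A_{miss}^{+}* for the Moore-Penrose generalized inverse. These symbols are introduced for illustration only; the sign convention the package uses for *C* may differ, which does not change the solution space.

$$
\begin{aligned}
A_{obs}x_{obs} + A_{miss}x_{miss} &= b \\
A_{miss}x_{miss} &= b - A_{obs}x_{obs} \\
x_{miss} &= \underbrace{A_{miss}^{+}\,(b - A_{obs}x_{obs})}_{x_0} + \underbrace{(I - A_{miss}^{+}A_{miss})}_{C}\, z
\end{aligned}
$$

The last line describes all solutions whenever the system is consistent, that is, when *A_{miss} A_{miss}^{+} (b - A_{obs}x_{obs}) = b - A_{obs}x_{obs}*. Otherwise the solution space is empty.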


`E` |
An `editmatrix` object |

`x` |
a named numeric vector. |

`...` |
Extra parameters to pass to |

`adapt` |
A named logical vector with variables in the same order as in x |

`checkFeasibility` |
Check if the observed values can lead to a consistent record |

`b` |
Equality constraint constant vector |

`tol` |
Tolerance used to decide which singular values count as zero when computing the
generalized inverse, and to round coefficients of C to zero. See |

The user can include extra fields in *x_{miss}* by specifying `adapt`.
Also note that the method rests on the assumption that all nonmissing values of *x* are
correct.
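As a small illustration of the `adapt` argument, the sketch below builds a named logical vector in the same order as the record and shows which variables would then be treated as unknown. It uses base R only. The combined mask `is.na(y) | adapt` is an assumed reading of "extra fields to include in *x_{miss}*", not a statement about the package internals.

```r
y <- c(yt = 10, y1 = NA, y2 = 3, y3 = 7, y4 = 12)

# Named logical vector in the same order as y; only y4 is flagged as adaptable
adapt <- setNames(names(y) == "y4", names(y))

# Assumed reading: a variable is treated as unknown if it is missing OR flagged
treated_as_missing <- is.na(y) | adapt
names(y)[treated_as_missing]
# "y1" "y4"
```

Building `adapt` from `names(y)` keeps the order aligned with the record even if the variables are later rearranged.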

The most time-consuming step is computing the generalized inverse of *A_{miss}*
using `MASS::ginv` (code copied from MASS to avoid a dependency). See the package
vignette and De Waal et al. (2011) for more details.
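The generalized-inverse step can be sketched in base R as follows. This is a minimal illustration under stated assumptions, not the package's code: `pinv` is a hypothetical helper that mimics an SVD-based pseudoinverse, the constraint matrix is written out by hand for the edits `x1 + x2 == x3` and `x2 == x4`, and *C* is taken as *I - A_{miss}^{+} A_{miss}* (the package may use a different sign).

```r
# Moore-Penrose inverse via the singular value decomposition.
# 'pinv' is a hypothetical helper for illustration, not an editrules function.
pinv <- function(A, tol = sqrt(.Machine$double.eps)) {
  s <- svd(A)
  keep <- s$d > max(tol * s$d[1], 0)      # singular values treated as nonzero
  if (!any(keep)) return(matrix(0, ncol(A), nrow(A)))
  V <- s$v[, keep, drop = FALSE]
  U <- s$u[, keep, drop = FALSE]
  V %*% (t(U) / s$d[keep])                # V diag(1/d) U'
}

# Edits x1 + x2 == x3 and x2 == x4, written as A x = b
A <- rbind(c(1, 1, -1,  0),
           c(0, 1,  0, -1))
colnames(A) <- c("x1", "x2", "x3", "x4")
b <- c(0, 0)
x <- c(x1 = 145, x2 = NA, x3 = NA, x4 = NA)

miss   <- is.na(x)
A_miss <- A[, miss, drop = FALSE]
rhs    <- b - A[, !miss, drop = FALSE] %*% x[!miss]

A_inv <- pinv(A_miss)
x0 <- A_inv %*% rhs                        # particular solution
C  <- diag(sum(miss)) - A_inv %*% A_miss   # spans the free directions

# Any z gives a completion that satisfies the edits (up to rounding)
z <- c(1, -2, 3)
x_miss <- x0 + C %*% z
max(abs(A_miss %*% x_miss - rhs))
```

The final expression should be zero up to machine rounding for any choice of `z`, because `A_miss %*% C` vanishes and `x0` already solves the reduced system.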

A `list` with elements *x0* and *C*, or `NULL` if the solution space is empty.

De Waal, T., Pannekoek, J. & Scholtus, S. (2011) Handbook of statistical data editing. Chapter 9.2.1

Venables, W. N. & Ripley, B. D. (2002) Modern Applied Statistics with S. Fourth Edition. Springer, New York. ISBN 0-387-95457-0

```
# This example is taken from De Waal et al. (2011), Examples 9.1-9.2
library(editrules)

E <- editmatrix(c(
    "x1 + x2 == x3",
    "x2 == x4",
    "x5 + x6 + x7 == x8",
    "x3 + x8 == x9",
    "x9 - x10 == x11",
    "x6 >= 0",
    "x7 >= 0"
))
dat <- data.frame(
    x1=c(145,145),
    x2=c(NA,NA),
    x3=c(155,155),
    x4=c(NA,NA),
    x5=c(NA, 86),
    x6=c(NA,NA),
    x7=c(NA,NA),
    x8=c(86,86),
    x9=c(NA,NA),
    x10=c(217,217),
    x11=c(NA,NA)
)
# Example with the solSpace method for editmatrix:
# Example 9.1 of De Waal et al. (2011), using the first record of dat.
x <- t(dat)[,1]
s <- solSpace(E,x)
s
# Some values are uniquely determined and may be imputed directly:
imputess(x,s$x0,s$C)
# To impute everything, we choose z=1 (arbitrary)
z <- rep(1,sum(is.na(x)))
(y <- imputess(x,s$x0,s$C,z))
# Did it work? (Use a tolerance in checking to account for machine rounding.)
# FALSE means that no edit is violated.
any(violatedEdits(E,y,tol=1e-8))
# Here's an example showing that solSpace only looks at missing values unless
# told otherwise.
Ey <- editmatrix(c(
    "yt == y1 + y2 + y3",
    "y4 == 0"))
y <- c(yt=10, y1=NA, y2=3, y3=7,y4=12)
# Since solSpace checks feasibility by default, we get no solution (because
# y4 violates the second edit).
solSpace(Ey,y)
# If we ask solSpace not to check for feasibility, y4 is left alone (although
# the imputed answer is clearly wrong).
(s <- solSpace(Ey,y,checkFeasibility=FALSE))
imputess(y, s$x0, s$C)
# By setting 'adapt' we can include y4 in the imputation. Since we know that
# with this adapt vector imputation can be done consistently, we save some
# time by switching the feasibility check off.
(s <- solSpace(Ey,y,adapt=c(FALSE,FALSE,FALSE,FALSE,TRUE),
    checkFeasibility=FALSE))
imputess(y,s$x0,s$C)
```
