Impute values from solution space

Share:

Description

Given a record x with observerd x_{obs} and missing values x_{miss} under linear equality constraints Ax=b. The function solSpace returns the solution space which can be written as x_{miss} = x_0 + Cz, where x_0 is are a constant vector (of dimension d=length(x_{miss})) and C a constant matrix of dimension d\times d.

Usage

1
imputess(x, x0, C, z = NULL, tol = sqrt(.Machine$double.eps))

Arguments

x

(named) numerical vector to be imputed

x0

x0 outcome of solSpace

C

C outcome of solSpace

z

real vector of dimension ncol(C).

tol

tolerance used to check which rows of C equal zero.

Details

If C has rows equal to zero, then those missing values may be imputed deductively. For the other missing values, some z must be chosen or another imputation method used.

The function imputess imputes missing values in a vector x, based on the solution space and some chosen vector z. If no z is passed as argument, only deductive imputations are performend (i.e. some missings may be left).

If C is a named matrix (as returned by solSpace), rows of x0 and C are matched by name to x. Otherwise it is assumed that the missings in x occur in the order of the rows in C (which is also the case when x0 and C are computed by solSpace).

Examples

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
#############################################
# IMPUTATION OF NUMERIC DATA
#############################################

# These examples are taken from De Waal et al (2011) (Examples 9.1-9.2)
E <- editmatrix(c(
    "x1 + x2      == x3",
    "x2           == x4",
    "x5 + x6 + x7 == x8",
    "x3 + x8      == x9",
    "x9 - x10     == x11",
    "x6 >= 0",
    "x7 >= 0"
))


dat <- data.frame(
    x1=c(145,145),
    x2=c(NA,NA),
    x3=c(155,155),
    x4=c(NA,NA),
    x5=c(NA, 86),
    x6=c(NA,NA),
    x7=c(NA,NA),
    x8=c(86,86),
    x9=c(NA,NA),
    x10=c(217,217),
    x11=c(NA,NA)
)

dat

d <- deduImpute(E,dat)
d$corrected
d$status
d$corrections




#############################################
# IMPUTATION OF CATEGORICAL DATA
#############################################


# Here's an example from Katrika (2001) [but see De Waal et al (2011), ex. 9.3)]
E <- editarray(c(
    "x1 \%in\% letters[1:4]",
    "x2 \%in\% letters[1:3]",
    "x3 \%in\% letters[1:3]",
    "x4 \%in\% letters[1:2]",
    "if (x2 == 'c'  & x3 != 'c' & x4 == 'a' ) FALSE",
    "if (x2 != 'a'  & x4 == 'b') FALSE",
    "if (x1 != 'c'  & x2 != 'b' & x3 != 'a') FALSE",
    "if (x1 == 'c'  & x3 != 'a' & x4 == 'a' ) FALSE"
))


dat <- data.frame(
    x1 = c('c', NA ),
    x2 = c('b', NA ),
    x3 = c(NA , NA ),
    x4 = c(NA , 'b'),
    stringsAsFactors=FALSE)


s <- deduImpute(E,dat)
s$corrected
s$status
s$corrections


E <- editset(expression(
    x + y == z,
    x >= 0,
    A %in% c('a','b'),
    B %in% c('c','d'),
    if ( A == 'a' ) B == 'b',
    if ( B == 'b' ) x > 0
))

x <- data.frame(
    x = NA,
    y = 1,
    z = 1,
    A = 'a',
    B = NA
)
# deduImpute will impute x=0 and B='b',which violates the 
# last edit. Hence, imputation will be reverted.
deduImpute(E,x) 

Want to suggest features or report bugs for rdrr.io? Use the GitHub issue tracker.