expand_data: Complete incomplete data
In icdGLM: EM by the Method of Weights for Incomplete Categorical Data in Generalized Linear Models


View source: R/expand_data.R

This function replaces every missing value in the data with each value from a set of possible values and weights the resulting candidate observations equally. The expanded data can then be passed to icdglm.

expand_data(data, y, missing.x, value.set, weights = rep.int(1, NROW(data)),
  indicator = rep.int(0, NROW(data)))

`data`	a vector, matrix, list or data frame containing numerics. It holds the independent variables for a subsequent regression with n observations and k regressors and is checked for missing values. Each gap is filled with every value from `value.set`, adding one new observation per possible value.
`y`	a vector of integers or numerics. This vector must be complete; it is the dependent variable for the subsequent regression.
`missing.x`	a vector of integers giving the positions of the independent variables that are checked for incompleteness, e.g. for a matrix, the indices of the corresponding columns.
`value.set`	a vector of numerics containing all possible values the missing data can take. This set has to be finite.
`weights`	a vector of numerics giving the initial weight of each observation. The default is 1 for each observation.
`indicator`	a vector of integers that indicates which observations belong together. If some columns with incomplete data have already been completed, the indicator vector from that step has to be passed here. For raw incomplete data, the function itself links the observations that belong together. The default is 0 for every observation, indicating no connection.
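
The expansion described by these arguments can be illustrated for a single observation. The following is a minimal base-R sketch, not the package's implementation: `expand_row` is a hypothetical helper, and it assumes that missing entries are coded as `NA` and that the observation's weight is split evenly across all candidate rows.

```r
# Hypothetical helper, NOT part of icdGLM: expand one observation.
# Assumes missing entries are NA and the weight is split evenly.
expand_row <- function(x, value.set, weight = 1) {
  miss <- unname(which(is.na(x)))
  base <- as.data.frame(as.list(x))
  if (length(miss) == 0) {
    # A complete row is kept unchanged with its full weight.
    return(list(data = base, weights = weight))
  }
  # Every combination of candidate values for the missing positions.
  combos <- expand.grid(rep(list(value.set), length(miss)))
  filled <- base[rep(1, nrow(combos)), , drop = FALSE]
  filled[, miss] <- combos
  rownames(filled) <- NULL
  list(data = filled,
       weights = rep(weight / nrow(combos), nrow(combos)))
}

row <- c(x1 = NA, x2 = 0, x3 = NA)
res <- expand_row(row, value.set = 0:1)
res$data     # four candidate rows, one per combination of x1 and x3
res$weights  # each equal to 1 / 2^2
```

With two missing entries and two possible values, the row becomes four candidate rows whose weights sum to the original weight.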

expand_data returns a list with the following elements:

`data`	a data frame of the expanded data with all possible observations (independent variables). The dependent variable is included in the last column.
`weights`	the weights for each possible observation.
`indicator`	a vector that indicates which observations belong together. Observations that belong together share the same integer as their indicator.
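
To show how the three returned elements relate, the sketch below builds a small stand-in list by hand and sums the weights within each indicator group. The values are illustrative only and are not output of `expand_data`; the check assumes that an indicator of 0 marks an observation without any connection, as described for the `indicator` argument.

```r
# Hand-built stand-in for a return value; values are illustrative only.
res <- list(
  data      = data.frame(x1 = c(0, 1, 0, 1), y = c(0, 0, 1, 1)),
  weights   = c(0.5, 0.5, 1, 1),
  indicator = c(1, 1, 0, 0)
)

# Rows with a nonzero indicator are candidates for the same original
# observation; sum their weights group by group.
linked <- res$indicator != 0
group_weight <- tapply(res$weights[linked], res$indicator[linked], sum)
group_weight
```

Here the two candidate rows with indicator 1 together carry the weight of one observation.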

library(icdGLM)  # provides expand_data() and the TLI.data example data
data(TLI.data)
expand_data(data = TLI.data[, 1:3],
            y = TLI.data[, 4],
            missing.x = 1:3,
            value.set = 0:1)

$data
    x1 x2 x3 y
1    0  0  0 0
2    1  0  0 0
3    0  0  0 1
4    0  1  0 1
5    0  1  0 0
6    1  1  0 0
7    0  0  0 1
8    0  0  0 1
9    1  0  0 1
10   1  1  0 1
11   1  1  0 1
12   1  1  0 1
13   1  1  0 1
14   0  0  0 1
15   1  0  0 1
16   0  0  0 1
17   1  0  0 1
18   0  1  0 1
19   1  1  0 1
20   0  1  1 0
21   1  1  1 0
22   0  0  1 1
23   0  0  1 1
24   1  0  1 1
25   1  1  1 1
26   1  1  1 1
27   1  1  1 1
28   1  1  1 1
29   0  0  1 1
30   1  0  1 1
31   0  0  1 1
32   1  0  1 1
33   0  1  1 1
34   1  1  1 1
35   0  0  0 0
36   0  0  0 0
37   0  0  0 0
38   0  0  0 0
39   0  0  0 0
40   0  0  0 0
41   0  0  0 0
42   0  0  0 0
43   0  0  0 0
44   0  0  0 0
45   0  0  1 0
46   0  0  1 0
47   0  0  1 0
48   0  0  1 0
49   0  0  1 0
50   0  0  1 0
51   0  0  1 0
52   0  0  1 0
53   0  0  1 0
54   0  0  1 0
55   0  1  0 0
56   0  1  0 0
57   0  1  0 0
58   0  1  0 0
59   0  1  0 0
60   0  1  0 0
61   0  1  1 0
62   0  1  1 0
63   1  0  1 0
64   1  0  1 0
65   1  0  1 0
66   1  0  1 0
67   1  0  1 0
68   1  0  1 0
69   1  0  1 0
70   1  0  0 0
71   0  0  0 1
72   0  0  0 1
73   0  0  0 1
74   0  0  0 1
75   0  0  0 1
76   0  0  0 1
77   0  0  0 1
78   0  0  0 1
79   0  0  0 1
80   0  0  1 1
81   0  0  1 1
82   0  0  1 1
83   0  0  1 1
84   0  0  1 1
85   0  0  1 1
86   0  0  1 1
87   0  0  1 1
88   0  0  1 1
89   0  0  1 1
90   0  1  0 1
91   0  1  0 1
92   0  1  1 1
93   0  1  1 1
94   0  1  1 1
95   0  1  1 1
96   0  1  1 1
97   0  1  1 1
98   0  1  1 1
99   1  0  0 1
100  1  0  0 1
101  1  0  0 1
102  1  0  1 1
103  1  1  0 1

$weights
  [1] 0.500 0.500 0.500 0.500 0.500 0.500 0.500 0.500 0.500 0.500 0.500 0.500
 [13] 0.500 0.250 0.250 0.125 0.125 0.125 0.125 0.500 0.500 0.500 0.500 0.500
 [25] 0.500 0.500 0.500 0.500 0.250 0.250 0.125 0.125 0.125 0.125 1.000 1.000
 [37] 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 [49] 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 [61] 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 [73] 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 [85] 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000 1.000
 [97] 1.000 1.000 1.000 1.000 1.000 1.000 1.000

$indicator
  [1]  1  1 42 42 28 39 62 63 76 78 80 81 82 41 41 40 40 40 40 28 39 62 63 76 78
 [26] 80 81 82 41 41 40 40 40 40  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 [51]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
 [76]  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
[101]  0  0  0
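
The weights printed above follow a simple pattern: rows with one filled-in value carry 0.5, rows with two carry 0.25, and rows with three carry 0.125, i.e. 1/2^m for m filled-in values with the two-element `value.set`. As a quick sanity check, the sketch below transcribes the printed weights by hand (it does not call the package) and totals them. Under the default initial weight of 1 per observation, this total would be expected to match the number of original observations.

```r
# Weights transcribed from the printed $weights output, in order.
w <- c(rep(0.5, 13), rep(0.25, 2), rep(0.125, 4),
       rep(0.5, 9),  rep(0.25, 2), rep(0.125, 4),
       rep(1, 69))

length(w)  # 103 expanded rows, matching the printed data frame
sum(w)     # 82
```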

icdGLM documentation built on May 2, 2019, 9:16 a.m.
