fill_NA: 'fill_NA' function for imputation purposes.

View source: R/RcppExports.R

Description

Regular imputations to fill in missing data. Non-missing independent variables are used to approximate the missing observations of a dependent variable. The quantitative models are built with the Rcpp package and the C++ library Armadillo.

Usage

fill_NA(x, model, posit_y, posit_x, w = 0L)

Arguments

x

a numeric matrix - the matrix of variables

model

a character - possible options ("lda","lm_pred","lm_bayes","lm_noise")

posit_y

an integer - the position of the dependent variable

posit_x

an integer vector - the positions of the independent variables

w

a numeric vector - a weighting variable - only positive values

Value

the variable at position posit_y, with its missing values filled by imputations, returned as a numeric vector

Note

The lda model is assessed only if there are more than 15 complete observations, and the lm models only if the number of independent variables is smaller than the number of observations.
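
The conditions in this note can be checked before calling fill_NA. The following is a minimal sketch in base R, not part of miceFast: the helper name can_fit is hypothetical, and it assumes that "observations" means rows with no missing value in the dependent and independent columns. How fill_NA counts observations internally is not stated on this page.

```r
# Hypothetical helper (not part of miceFast). It mirrors the two conditions
# from the Note, assuming observations are counted as rows without NA in the
# dependent (posit_y) and independent (posit_x) columns.
can_fit <- function(x, model, posit_y, posit_x) {
  cols <- c(posit_y, posit_x)
  n_complete <- sum(complete.cases(x[, cols, drop = FALSE]))
  if (model == "lda") {
    # lda: more than 15 complete observations
    n_complete > 15
  } else {
    # lm models: fewer independent variables than observations
    length(posit_x) < n_complete
  }
}

# Same kind of matrix as in the Examples: airquality without Month, plus an intercept
x <- cbind(as.matrix(airquality[, -5]), intercept = 1)

# Ozone (column 1) on Wind, Temp and the intercept (columns 3, 4, 6)
can_fit(x, "lm_pred", posit_y = 1, posit_x = c(3, 4, 6))

# With only the first 10 rows there cannot be more than 15 complete observations
can_fit(x[1:10, ], "lda", posit_y = 1, posit_x = c(3, 4, 6))
```

A check like this is only a guard on the caller's side; it does not change what fill_NA does when the conditions are not met.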

See Also

fill_NA_N, VIF

Examples

## Not run: 
library(miceFast)
library(data.table)
library(magrittr)

data = cbind(as.matrix(airquality[,-5]),intercept=1,index=1:nrow(airquality),
             # a numeric vector - positive values
             weights = round(rgamma(nrow(airquality),3,3),1),
             # as.numeric is needed only for OOP miceFast - see on next pages
             groups = airquality[,5])
data_DT = data.table(data)

# simple mean imputation - intercept at position 6
data_DT[,Ozone_imp:=fill_NA(x=as.matrix(.SD),
                           model="lm_pred",
                           posit_y=1,
                           posit_x=c(6),w=.SD[['weights']]),by=.(groups)] %>%
# avg of 10 multiple imputations - last posit_x equal to 9 not 10
# because the groups variable is not included in .SD
.[,Solar_R_imp:=fill_NA_N(as.matrix(.SD),
                         model="lm_bayes",
                         posit_y=2,
                         posit_x=c(3,4,5,6,9),w=.SD[['weights']],times=10),by=.(groups)]

head(data_DT,10)

######################
#OR using OOP miceFast
######################

data = cbind(as.matrix(airquality[,-5]),intercept=1,index=1:nrow(airquality))
weights = rgamma(nrow(data),3,3) # a numeric vector - positive values
# a numeric vector (not integers) - positive values - sorted increasingly
groups = as.numeric(airquality[,5])
# a numeric vector (not integers) - positive values - not sorted
#groups = as.numeric(sample(1:8,nrow(data),replace=TRUE))

model = new(miceFast)
model$set_data(data) # providing data by a reference
model$set_w(weights) # providing by a reference
model$set_g(groups)  # providing by a reference

# impute adapts to the provided parameters like w or g
# Simple mean - permanent imputation in the object and the data
# the variable will be replaced by its imputations
model$update_var(1,model$impute("lm_pred",1,c(6))$imputations)

model$update_var(2,model$impute_N("lm_bayes",2,c(1,3,4,5,6),10)$imputations)

# Printing the data and restoring the original order if the data was sorted by the grouping variable
head(cbind(model$get_data(),model$get_g(),model$get_w())[order(model$get_index()),],3)
#the same
head(cbind(data,groups,weights)[order(model$get_index()),],3)


## End(Not run)

miceFast documentation built on May 7, 2018, 1:03 a.m.