sample_impute: Impute missing data using sample from complete cases


View source: R/sample_impute.R

Description

Imputes missing data in a data.frame by sampling with replacement from the complete cases in each column.

Usage

sample_impute(X, indicator = lapply(X, is.na))

Arguments

X

data.frame; an incomplete data set whose columns may be any of the numeric, logical, integer, factor and ordered data types.

indicator

named list; for each column in X, a logical vector marking each entry as missing (TRUE) or not missing (FALSE).

Details

This initial guess is similar to the one employed by Multivariate Imputation by Chained Equations (van Buuren and Groothuis-Oudshoorn, 2011). It has a higher entropy than the initial state used by missForest (Stekhoven and Buehlmann, 2012).

Value

data.frame; the same as X, except that the missing values in each column are replaced by a random sample, drawn with replacement, from the complete cases of that column.
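
The behaviour described above can be sketched in a few lines of base R. This is an illustrative re-implementation under stated assumptions, not the package source: the name `sample_impute_sketch` is hypothetical, and it assumes that `indicator` is a named list of logical vectors, one per column of X, as in the Usage section.

```r
# Hypothetical sketch of column-wise sampling imputation; not the miForang source.
sample_impute_sketch <- function(X, indicator = lapply(X, is.na)) {
    # Evaluate the default indicator before X is modified below.
    force(indicator)
    for (j in names(X)) {
        miss <- indicator[[j]]
        observed <- X[[j]][!miss]
        if (any(miss) && length(observed) > 0) {
            # Sample positions rather than values, so that a single observed
            # number is not interpreted by sample() as the range 1:n.
            idx <- sample.int(length(observed), size = sum(miss), replace = TRUE)
            X[[j]][miss] <- observed[idx]
        }
    }
    X
}

set.seed(1)
df <- data.frame(x = c(0, 1, NA), f = factor(c("a", NA, "b")))
filled <- sample_impute_sketch(df)
```

In this sketch each column is filled independently from its own observed values, so observed entries are left untouched and factor levels are preserved.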

References

Stekhoven, D.J. and Buehlmann, P., 2012. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics, 28(1), pp. 112-118. doi:10.1093/bioinformatics/btr597

Van Buuren, S. and Groothuis-Oudshoorn, K., 2011. mice: Multivariate Imputation by Chained Equations in R. Journal of Statistical Software, 45(3), pp. 1-67. doi:10.18637/jss.v045.i03

See Also

smirf missForest

Examples

## Not run: 
# simply pass to smirf
smirf(iris, X.init.fn=sample_impute)

## End(Not run)
sample_impute(data.frame(x=c(0,1,NA)))

stephematician/miForang documentation built on July 23, 2019, 5:11 p.m.