no_information_impute: Impute missing data using mean or mode of complete cases


View source: R/no_information_impute.R

Description

Imputes missing data in a data.frame using the complete cases' mean for non-integer numeric columns and the complete cases' most frequent value for factor columns.

Usage

no_information_impute(X, indicator = lapply(X, is.na))

Arguments

X

data.frame; an incomplete data set including any of numeric, logical, integer, factor and ordered data types.

indicator

named list; for each column in X, a logical indicator of missing (TRUE) and non-missing (FALSE) status.
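To illustrate the shape of the default indicator argument, the snippet below evaluates the default expression from the Usage section on a toy data.frame. The data values are made up for illustration; only lapply and is.na from base R are used.

```r
# Toy data frame (hypothetical values, for illustration only)
X <- data.frame(x = c(0, 1, NA), f = factor(c("a", NA, "a")))

# Default from the Usage section: one logical vector per column,
# named after the columns of X
indicator <- lapply(X, is.na)

str(indicator)
```

Each list element has one entry per row of X, so a custom indicator would presumably need the same names and lengths.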

Details

This is the same imputation procedure used to determine the initial state of the missForest procedure (Stekhoven and Buehlmann, 2012). In the case of tied most frequent values in a (factor) column, a single value is selected at random from the tied values.
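The procedure described above can be sketched in base R as follows. This is a minimal illustration, not the package's implementation: impute_sketch is a hypothetical name, and the sketch assumes that only double and factor columns are imputed, leaving any other column type untouched.

```r
# Hypothetical helper for illustration; NOT the package's implementation.
impute_sketch <- function(X, indicator = lapply(X, is.na)) {
  for (j in names(X)) {
    miss <- indicator[[j]]
    if (!any(miss)) next                    # nothing to impute in this column
    observed <- X[[j]][!miss]               # complete cases for this column
    if (is.factor(X[[j]])) {
      counts <- table(observed)
      tied <- names(counts)[counts == max(counts)]
      # Pick one of the tied most frequent levels at random; indexing with
      # sample.int avoids sample()'s special handling of length-one input
      X[[j]][miss] <- tied[sample.int(length(tied), 1)]
    } else if (is.double(X[[j]])) {
      X[[j]][miss] <- mean(observed)        # mean of the complete cases
    }
    # Assumption: other column types are left as they are in this sketch
  }
  X
}

# The missing entry is replaced by the mean of 0 and 1
impute_sketch(data.frame(x = c(0, 1, NA)))
```

Because ties are broken at random, calling set.seed beforehand is one way to make results on factor columns reproducible.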

Value

data.frame; the same as X, except that missing values in each column are replaced by either the mean or the most frequent value of the complete cases.

References

Stekhoven, D.J. and Buehlmann, P., 2012. MissForest–non-parametric missing value imputation for mixed-type data. Bioinformatics, 28(1), pp. 112-118. doi:10.1093/bioinformatics/btr597

See Also

smirf, missForest

Examples

## Not run: 
# simply pass to smirf
smirf(iris, X.init.fn=no_information_impute)

## End(Not run)
no_information_impute(data.frame(x=c(0,1,NA)))

stephematician/miForang documentation built on July 23, 2019, 5:11 p.m.