miss_tune: Imputation of Missing Values by Automatic Tuned Chained Tree...


Description

Uses the randomForest package to impute missing values by automatically tuned chained tree ensembles; see [1, 2]. The optimal mtry parameter is found with the tuneRF function. Chaining stops as soon as max_iter is reached or the average out-of-bag (OOB) estimate of performance stops improving. In the latter case, the imputed data with the best OOB estimate is returned.
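The stopping rule described above can be sketched in base R. This is an illustration only, not the package's implementation: run_chain and step_fun are hypothetical names, and the per-iteration errors come from a stand-in function rather than from fitted forests.

```r
# Hypothetical sketch of the stopping rule; not the missTune source code.
# step_fun(i) stands in for one chaining iteration and must return
# list(data = <imputed data>, oob = <numeric vector of per-variable OOB errors>).
run_chain <- function(step_fun, max_iter = 10L) {
  best <- NULL
  best_err <- Inf
  history <- numeric(0)
  for (i in seq_len(max_iter)) {
    res <- step_fun(i)
    err <- mean(res$oob)          # average OOB error over variables
    history <- c(history, err)
    if (err >= best_err) break    # no improvement: stop and keep the previous best
    best <- res$data
    best_err <- err
  }
  list(best_data = best, best_error = best_err, oob_history = history)
}

# Toy usage: the error improves for three iterations, then gets worse.
toy_errors <- c(0.40, 0.30, 0.25, 0.27, 0.20)
toy_step <- function(i) list(data = paste("imputation", i), oob = toy_errors[i])
out <- run_chain(toy_step, max_iter = 10L)
```

In this toy run the loop stops at the fourth iteration and keeps the result of the third, which mirrors the "best imputed data is returned" behaviour described above.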

Usage

miss_tune(x_miss, max_iter = 10L, seed = NULL, num_trees = 200,
  verbose = TRUE)

Arguments

x_miss

A data.frame or tibble with missing values to impute.

max_iter

Maximum number of chaining iterations.

seed

Integer seed to initialize the random generator.

num_trees

Number of trees passed to the train function of the caret package.

verbose

Logical. TRUE (default) to print the OOB prediction error per iteration and variable (1 minus R-squared for regression), FALSE to print nothing.
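For a numeric variable, the "1 minus R-squared" error can be read as the squared prediction error relative to the total variation of the observed values. A minimal base R sketch, assuming the usual definition R-squared = 1 - SSE/SST; one_minus_r2 is a hypothetical helper, and the predictions are made-up numbers rather than true out-of-bag predictions.

```r
# Hypothetical helper: 1 - R^2 = SSE / SST for a numeric variable.
one_minus_r2 <- function(observed, predicted) {
  sse <- sum((observed - predicted)^2)        # squared prediction error
  sst <- sum((observed - mean(observed))^2)   # total variation around the mean
  sse / sst
}

observed  <- c(1, 2, 3, 4, 5)
predicted <- c(1.5, 2, 2.5, 4, 5)             # made-up predictions
err <- one_minus_r2(observed, predicted)      # 0.5 / 10
```

A value near 0 means the predictions explain almost all of the variation; a value of 1 is what predicting the mean for every observation would give.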

Value

An object containing the imputed data with the smallest OOB error, together with the OOB errors from all iterations of the algorithm.

References

[1] Liaw, Andy, and Matthew Wiener. "Classification and regression by randomForest." R news 2.3 (2002): 18-22.

[2] Stekhoven, Daniel J., and Peter Buehlmann. "MissForest - nonparametric missing value imputation for mixed-type data." Bioinformatics 28.1 (2012): 112-118. doi: 10.1093/bioinformatics/btr597

Examples

## Not run: 
iris_na <- generate_na(iris)
iris_imp <- miss_tune(iris_na)
head(iris_imp)
head(iris_na)

## End(Not run)

kvantas/missTune documentation built on May 12, 2019, 10:51 a.m.