rfImpute: Missing Value Imputations by randomForest
In randomForest: Breiman and Cutler's random forests for classification and regression


Impute missing values in predictor data using proximity from randomForest.

## Default S3 method:
rfImpute(x, y, iter=5, ntree=300, ...)
## S3 method for class 'formula'
rfImpute(x, data, ..., subset)

`x`	A data frame or matrix of predictors, some containing `NA`s, or a formula.
`y`	Response vector (`NA`s not allowed).
`data`	A data frame containing the predictors and response.
`iter`	Number of iterations to run the imputation.
`ntree`	Number of trees to grow in each iteration of randomForest.
`...`	Other arguments to be passed to `randomForest`.
`subset`	A logical vector indicating which observations to use.

The algorithm starts by imputing NAs using na.roughfix. Then randomForest is called with the completed data. The proximity matrix from the randomForest is used to update the imputation of the NAs. For continuous predictors, the imputed value is the weighted average of the non-missing observations, where the weights are the proximities. For categorical predictors, the imputed value is the category with the largest average proximity. This process is iterated `iter` times.
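
The update step described above can be sketched in base R on a toy example. This is only an illustration of the weighting rule, not the package's internal code: the proximity matrix `prox` and the vectors `x` and `f` are made-up values, whereas in rfImpute the proximities come from the fitted forest.

```r
# Hypothetical 4 x 4 proximity matrix (symmetric, ones on the diagonal).
prox <- matrix(c(1.0, 0.8, 0.2, 0.1,
                 0.8, 1.0, 0.3, 0.2,
                 0.2, 0.3, 1.0, 0.7,
                 0.1, 0.2, 0.7, 1.0), nrow = 4, byrow = TRUE)

# Continuous predictor: proximity-weighted average of the non-missing values.
x <- c(5.0, NA, 7.0, 9.0)
miss <- is.na(x)
x.imputed <- x
for (i in which(miss)) {
  w <- prox[i, !miss]                       # proximities to observed cases
  x.imputed[i] <- sum(w * x[!miss]) / sum(w)
}

# Categorical predictor: category with the largest average proximity.
f <- factor(c("a", NA, "b", "b"))
miss.f <- is.na(f)
f.imputed <- f
for (i in which(miss.f)) {
  avg <- tapply(prox[i, !miss.f], f[!miss.f], mean)  # mean proximity per level
  f.imputed[i] <- names(which.max(avg))
}
```

Here the second observation is closest to the first, so its imputed `x` is pulled toward 5 and its imputed `f` is level "a".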

Note: Imputation has not (yet) been implemented for the unsupervised case. Also, Breiman (2003) notes that the OOB estimate of error from randomForest tends to be optimistic when run on the data matrix with imputed values.

A data frame or matrix containing the completed data matrix, where NAs are imputed using proximity from randomForest. The first column contains the response.
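
As a hedged illustration of the default method and of the returned layout, the sketch below passes predictors and response separately. The object names (`iris.na`, `imp`), the seeds, and the choices `iter = 3` and `ntree = 100` are arbitrary and only for demonstration.

```r
library(randomForest)

data(iris)
iris.na <- iris
set.seed(1)
## drop ten values from one predictor (the response must stay complete)
iris.na[sample(150, 10), "Sepal.Length"] <- NA

set.seed(2)
## default method: x holds the predictors, y the response
imp <- rfImpute(x = iris.na[, 1:4], y = iris.na$Species,
                iter = 3, ntree = 100)

## the response comes first, followed by the completed predictors
str(imp)
anyNA(imp)
```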

Andy Liaw

Leo Breiman (2003). Manual for Setting Up, Using, and Understanding Random Forest V4.0. http://oz.berkeley.edu/users/breiman/Using_random_forests_v4.0.pdf

na.roughfix.

library(randomForest)
data(iris)
iris.na <- iris
set.seed(111)
## artificially drop some data values.
for (i in 1:4) iris.na[sample(150, sample(20)), i] <- NA
set.seed(222)
iris.imputed <- rfImpute(Species ~ ., iris.na)
set.seed(333)
iris.rf <- randomForest(Species ~ ., iris.imputed)
print(iris.rf)

randomForest 4.6-14
Type rfNews() to see new features/changes/bug fixes.
ntree      OOB      1      2      3
  300:   4.67%  0.00%  8.00%  6.00%
ntree      OOB      1      2      3
  300:   5.33%  0.00%  8.00%  8.00%
ntree      OOB      1      2      3
  300:   5.33%  0.00%  8.00%  8.00%
ntree      OOB      1      2      3
  300:   5.33%  0.00%  8.00%  8.00%
ntree      OOB      1      2      3
  300:   5.33%  0.00%  8.00%  8.00%

Call:
 randomForest(formula = Species ~ ., data = iris.imputed) 
               Type of random forest: classification
                     Number of trees: 500
No. of variables tried at each split: 2

        OOB estimate of  error rate: 5.33%
Confusion matrix:
           setosa versicolor virginica class.error
setosa         50          0         0        0.00
versicolor      0         46         4        0.08
virginica       0          4        46        0.08

randomForest documentation built on May 2, 2019, 5:54 p.m.
