imp.rfemp  R Documentation 
The RfEmp multiple imputation method handles mixed types of variables and calls the corresponding functions based on variable type. Categorical variables should be of type factor or logical, etc. RfPred.Emp is used for continuous variables, and RfPred.Cate is used for categorical variables.
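The type-based dispatch described above can be pictured with a small base R sketch. This is an illustration under stated assumptions, not the package's internal code: the data frame `toy`, the helper vector `is_categorical`, and the labels in `method` are all hypothetical names introduced here.

```r
# Toy data frame with mixed variable types (hypothetical example data).
toy <- data.frame(
  bmi = c(22.1, NA, 27.5, 30.2),
  hyp = factor(c("no", "yes", NA, "no")),
  smoker = c(TRUE, FALSE, NA, TRUE),
  stringsAsFactors = FALSE
)

# Treat factor and logical columns as categorical, everything else as continuous.
is_categorical <- vapply(
  toy,
  function(x) is.factor(x) || is.logical(x),
  logical(1)
)

# Hypothetical labels naming which kind of method each column would receive.
method <- ifelse(is_categorical, "categorical", "continuous")
print(method)
```

A numeric column that really encodes categories (such as an age group stored as 1, 2, 3) would be labelled continuous by a check like this, which is why such columns should be converted to factors first.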
imp.rfemp( data, num.imp = 5, max.iter = 5, num.trees = 10, alpha.emp = 0, sym.dist = TRUE, pre.boot = TRUE, num.trees.cont = NULL, num.trees.cate = NULL, num.threads = NULL, print.flag = FALSE, ... )
data 
A data frame or a matrix containing the incomplete data. Missing
values should be coded as NA.
num.imp 
Number of multiple imputations. The default is 5.
max.iter 
Number of iterations. The default is 5.
num.trees 
Number of trees to build. The default is 10.
alpha.emp 
The "significance level" for the empirical distribution of
out-of-bag prediction errors. It can be used to guard against outliers
(helpful for highly skewed variables).
For example, set alpha.emp = 0.05 to use a 95% confidence level.
The default is 0.
sym.dist 
Logical. The default is TRUE.
pre.boot 
Logical. The default is TRUE.
num.trees.cont 
Number of trees to build for continuous variables.
The default is NULL.
num.trees.cate 
Number of trees to build for categorical variables.
The default is NULL.
num.threads 
Number of threads for parallel computing. The default is NULL.
print.flag 
Logical. The default is FALSE.
... 
Other arguments to pass down. 
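To make the role of alpha.emp more concrete, here is a minimal base R sketch. It assumes that the "significance level" trims the empirical error distribution to its central 1 - alpha share; that reading, the simulated vector `oob_err`, and the names `alpha`, `bounds`, and `trimmed` are assumptions made for illustration, not the package's actual implementation.

```r
set.seed(1)

# Simulated stand-in for out-of-bag prediction errors, with one extreme outlier.
oob_err <- c(rnorm(200), 50)

# Plays the role of alpha.emp in this sketch.
alpha <- 0.05

# Keep only errors between the alpha/2 and 1 - alpha/2 empirical quantiles.
bounds <- quantile(oob_err, probs = c(alpha / 2, 1 - alpha / 2))
trimmed <- oob_err[oob_err >= bounds[1] & oob_err <= bounds[2]]

# Compare the spread before and after trimming.
print(range(oob_err))
print(range(trimmed))
```

With alpha set to 0 nothing is trimmed, so the outlier at 50 would remain available as a possible error draw.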
For continuous variables, mice.impute.rfpred.emp is called, performing
imputation based on the empirical distribution of out-of-bag
prediction errors of random forests.
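The idea for continuous variables can be sketched in base R as "point prediction plus a randomly drawn error". In this sketch simulated vectors stand in for the random forest output, so no forest is actually fitted; `oob_err`, `pred_missing`, and `imputed` are hypothetical names, and the real function may differ in its details.

```r
set.seed(42)

# Simulated stand-ins: in practice these would come from a fitted random forest.
oob_err <- rnorm(100, mean = 0, sd = 2)  # out-of-bag errors on observed cases
pred_missing <- c(180.5, 195.0, 210.3)   # point predictions for missing cases

# One imputation: add an independently sampled error to each prediction.
imputed <- pred_missing +
  sample(oob_err, size = length(pred_missing), replace = TRUE)

# Repeating the draw yields another set of plausible values, which is what
# makes the imputations "multiple".
imputed_again <- pred_missing +
  sample(oob_err, size = length(pred_missing), replace = TRUE)

print(imputed)
print(imputed_again)
```

Because the errors are resampled from observed prediction errors rather than from a fitted normal distribution, the imputed values can reflect skewness in the data.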
For categorical variables, mice.impute.rfpred.cate is called,
performing imputation based on predicted probabilities.
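Imputation from predicted probabilities can be illustrated with a short base R sketch. The probability matrix `prob` is made up for the example (in practice it would come from a fitted model), and `classes` and `imputed` are hypothetical names; this is not the package's own code.

```r
set.seed(7)

classes <- c("no", "yes")

# Hypothetical predicted class probabilities for three missing cases.
# Each row sums to 1.
prob <- matrix(
  c(0.9, 0.1,
    0.4, 0.6,
    0.0, 1.0),
  ncol = 2, byrow = TRUE,
  dimnames = list(NULL, classes)
)

# Draw one class per row, using that row's probabilities as sampling weights.
imputed <- apply(prob, 1, function(p) sample(classes, size = 1, prob = p))
imputed <- factor(imputed, levels = classes)

print(imputed)
```

Sampling from the probabilities, rather than always taking the most likely class, keeps the uncertainty of the prediction in the imputed values.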
An object of S3 class mids.
Shangzhi Hong
Hong, Shangzhi, et al. "Multiple imputation using chained random forests." Preprint, submitted April 30, 2020. https://arxiv.org/abs/2004.14823.
Zhang, Haozhe, et al. "Random Forest Prediction Intervals." The American Statistician (2019): 1-20.
Shah, Anoop D., et al. "Comparison of random forest and parametric imputation models for imputing missing data using MICE: a CALIBER study." American Journal of Epidemiology 179.6 (2014): 764-774.
Malley, James D., et al. "Probability machines." Methods of Information in Medicine 51.01 (2012): 74-81.
# Prepare data: convert categorical variables to factors
nhanes.fix <- nhanes
nhanes.fix[, c("age", "hyp")] <- lapply(nhanes[, c("age", "hyp")], as.factor)
# Perform imputation using imp.rfemp
imp <- imp.rfemp(nhanes.fix)
# Do repeated analyses
anl <- with(imp, lm(chl ~ bmi + hyp))
# Pool the results
pool <- pool(anl)
# Get pooled estimates
reg.ests(pool)